CN117111834A - Memory and computing system including memory - Google Patents

Memory and computing system including memory

Info

Publication number
CN117111834A
CN117111834A
Authority
CN
China
Prior art keywords: page, metadata, physical, written, memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310651848.XA
Other languages
Chinese (zh)
Inventor
孙鹏昊
郑圣安
张婉茹
马若岩
黄林鹏
邓开江
杨杰
王贯中
朱峰
李舒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202310651848.XA priority Critical patent/CN117111834A/en
Publication of CN117111834A publication Critical patent/CN117111834A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0614 Improving the reliability of storage systems
    • G06F 3/0616 Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0638 Organizing or formatting or addressing of data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F 3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/0671 In-line storage system
    • G06F 3/0673 Single storage device
    • G06F 3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure relates to a memory and a computing system including the memory. The computing system includes a host and a memory. The memory includes a storage medium module and a page classification module. The storage medium module includes a plurality of physical blocks, each physical block includes a plurality of physical pages, read and write operations are performed in units of physical pages, and erase operations are performed in units of physical blocks. The page classification module uses a machine learning model to predict the hotness of a logical page to be written to the storage medium module, and logical pages predicted to be hot and logical pages predicted to be cold are stored in physical pages of different physical blocks. Because hot/cold data classification is performed by a machine learning model inside the memory, model inference does not consume host computing power, so higher classification accuracy can be achieved with a lower computational burden on the host.

Description

Memory and computing system including memory
Technical Field
The present disclosure relates to a memory and a computing system including the memory.
Background
In recent years, solid state disk (SSD, Solid State Drive) technology has developed rapidly, and the SSD has become a very important nonvolatile storage device.
A solid state disk is a disk that uses flash memory chips as its main storage medium. A flash memory chip includes a number of flash blocks (hereinafter also referred to as "physical blocks"), and each flash block includes a number of flash pages (as opposed to logical pages, these are also referred to as "physical pages"). Flash read and write operations are performed in units of flash pages, which are typically 4 KB to 64 KB in size. A flash page cannot be directly overwritten after being written; it must be erased before it can be written again, and the erase operation is performed in units of flash blocks.
Because each block of a conventional block storage device, such as a magnetic disk, can be rewritten directly, a solid state disk usually carries a controller running flash translation layer (FTL) logic in order to hide from the host the fact that flash memory must be erased before it can be rewritten.
The flash translation layer (FTL) is a software layer run by the solid state disk controller. It maintains the mapping between logical addresses and the physical addresses of flash pages, so that the correct physical address can be found when the host requests to read data by logical address.
The FTL maps the logical address in every host write request to a new physical address. If the logical address already has a corresponding physical address (that is, it has been written before, so the logical page is being overwritten), the old physical flash page is marked as invalid. Overwrites are thus converted into append writes, avoiding an erase operation on every overwrite.
Because the FTL converts flash page overwrites into append writes, flash pages may be exhausted before the logical capacity of the solid state disk is full. When flash pages are about to be exhausted, the FTL performs a garbage collection (GC) operation (also referred to as "data reclamation" in the context of the present disclosure), erasing flash blocks that contain invalid flash pages so that they can be used for subsequent writes.
During garbage collection, the valid flash pages in a flash block to be erased must be copied elsewhere. This causes the same data to be written multiple times, producing the write amplification problem; severe write amplification degrades both the performance and the lifetime of the solid state disk.
An effective way to reduce the write amplification caused by garbage collection is to store cold and hot data separately in the solid state disk. Here, hot data is data that is overwritten frequently, and cold data is data that is overwritten infrequently. When hot data is concentrated in the same flash block, the flash pages in that block are overwritten and invalidated within a short time, which reduces the amount of data copied during garbage collection and thus reduces write amplification.
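As a rough numeric illustration (a sketch written for this description, not figures from the patent), write amplification can be estimated as total flash writes divided by host writes:

```python
def write_amplification(host_pages_written: int, gc_pages_copied: int) -> float:
    """WA = (host writes + GC copies) / host writes."""
    return (host_pages_written + gc_pages_copied) / host_pages_written

# Mixing hot and cold data in one block forces GC to copy the surviving cold pages:
print(write_amplification(host_pages_written=1000, gc_pages_copied=600))  # 1.6
# Segregating hot data lets whole blocks become invalid before GC, so few copies:
print(write_amplification(host_pages_written=1000, gc_pages_copied=100))  # 1.1
```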
Existing schemes that reduce solid state disk write amplification through hot/cold data classification can be divided into rule-based schemes and machine-learning-based schemes.
In a rule-based scheme, the system judges the hotness of the data the host writes to the solid state disk using fixed logical rules, and stores data of different hotness in different flash pages of the solid state disk to achieve hot/cold data separation.
One rule-based scheme classifies data into two categories: data written by the host is regarded as hot data, and data copied during garbage collection is regarded as cold data. Another rule-based scheme, when the host writes data, takes the interval between the previous write of the data and the current time as a prediction of the interval from now until the data is written next; data whose predicted interval is below a threshold is treated as hot data, and data whose predicted interval is above the threshold as cold data.
Rule-based hot/cold data classification requires little computation, but a single logical rule cannot adapt to varied application workloads, so its classification accuracy is low.
In a machine-learning-based scheme, the system uses a machine learning model to predict the hotness of data from various features collected at runtime, and stores hot data and cold data in different flash pages.
In existing schemes, machine learning model prediction is performed on the host side, and model computation is required for every application write request; this greatly increases the computational burden on the host and occupies service resources.
Therefore, an improved scheme for predicting the hotness of data to be written to a storage device is still needed, one that improves classification accuracy while reducing the computational burden on the host.
Disclosure of Invention
The present disclosure provides a memory and a computing system including the memory, which can achieve higher classification accuracy with a lower host computational burden when predicting the hotness of data to be written to a storage device.
According to a first aspect of the present disclosure, there is provided a memory including: a storage medium module including a plurality of physical blocks, each physical block including a plurality of physical pages, where read and write operations are performed in units of physical pages and erase operations are performed in units of physical blocks; and a page classification module configured to predict, using a machine learning model, the hotness of a logical page to be written to the storage medium module, where logical pages predicted to be hot and logical pages predicted to be cold are respectively stored in physical pages of different physical blocks.
Optionally, the machine learning model is a gated recurrent unit (GRU) model.
Optionally, the machine learning model obtains the hotness prediction result based on features of the logical page to be written and on metadata recorded when the logical page to be written was previously written.
Optionally, the metadata includes the current write time and the current intermediate result of the logical page, which serve as the previous write time and the previous intermediate result the next time that logical page is predicted; the machine learning model obtains the current intermediate result based on the features of the logical page to be written, its previous write time, and the previous intermediate result obtained when it was last predicted, and obtains the hotness prediction result from the current intermediate result.
Optionally, a physical block includes data physical pages and metadata physical pages, a metadata physical page storing the metadata of the logical pages corresponding to the data physical pages in that physical block.
Optionally, the metadata of a data physical page is also stored in the out-of-band area of that data physical page, and when a data reclamation operation is performed on the physical block where the data physical page is located, the metadata stored in the out-of-band area is written to a metadata physical page of the new physical block.
Optionally, the memory may further include a buffer comprising a data buffer area and a metadata cache area, where the data buffer area temporarily stores the data of logical pages to be written by the host, the data being written to physical pages of the corresponding physical blocks after the page classification module completes its prediction, and the metadata cache area holds the metadata of some logical pages for the page classification module to retrieve for prediction.
Optionally, the physical page number of the metadata physical page holding the metadata of the logical page to be written is determined from the offset, within the corresponding physical block, of the physical page number corresponding to the logical page number of the logical page to be written; the data of metadata physical pages held in the metadata cache area is organized in a red-black tree structure, and the metadata of the current logical page to be written is looked up in the red-black tree based on the physical page number of the metadata physical page holding it.
Optionally, when the metadata of the current logical page to be written is not present in the metadata cache area, the corresponding metadata physical page is looked up in the physical block where the physical page corresponding to the logical page is located, and the data of the found metadata physical page is cached in the metadata cache area; when the metadata cache area is full, the data of other cached metadata physical pages is evicted according to the least-recently-used principle.
Optionally, the features of the logical page to be written include at least one of: the interval between the previous write time and the current write time of the logical page to be written; the size of the current write request containing the logical page to be written; whether the current write request containing the logical page to be written is a sequential write; the number of times the logical block to which the logical page to be written belongs was hit in the last several write requests; the number of times the logical block to which the logical page to be written belongs was hit in the last several read requests; and the read/write ratio of the last several requests.
Optionally, the page classification module obtains the trained model parameters of the machine learning model from the host.
Optionally, the page classification module obtains the features of the logical page to be written from the write request sent by the host.
Optionally, the machine learning model of the page classification module predicts the hotness of the logical page to be written while the data buffer area receives the data of that logical page from the host.
According to a second aspect of the present disclosure, there is provided a computing system including: a memory according to the first aspect of the present disclosure; and a host that writes to and reads from the memory, the host including a model training module configured to train the machine learning model to obtain trained model parameters and to send the trained model parameters to the page classification module of the memory, so that the trained parameters are deployed to the machine learning model.
Optionally, the model training module records, for each logical page involved in the write requests the host sends to the memory, the features required for prediction, records the write time of each logical page, and, when write requests involving the same logical page occur multiple times, obtains the write interval between the two adjacent writes; a training sample set is obtained based on the features and write intervals collected for the logical pages involved in a plurality of write requests, with each logical page labeled cold or hot according to whether its write interval is greater than a threshold; the machine learning model is trained with the labeled training sample set to obtain model parameters, and the model parameters obtained by training are transmitted to the page classification module of the memory to deploy the machine learning model.
Optionally, after a round of training of the machine learning model based on the logical pages involved in one set of write requests, and after the model parameters obtained by that training have been transmitted to the page classification module of the memory, the next round of training of the machine learning model is started based on the logical pages involved in the next set of write requests.
Optionally, the inflection point of the cumulative distribution curve of the write intervals of the logical pages involved in one set of write requests is used as the initial value of the threshold; the ranking percentile of the threshold used in the previous round of training among the write intervals of the logical pages involved in the current round is determined; the write intervals corresponding to that percentile, to one or more percentiles above it, and to one or more percentiles below it are used as candidate thresholds to train a plurality of machine learning models; the accuracy of the plurality of machine learning models is tested, the parameters of the machine learning model with the highest accuracy are transmitted to the page classification module of the memory, and the candidate threshold with the highest accuracy is used as the threshold for the current round.
Therefore, the present disclosure proposes a framework for classifying hot and cold data with a machine learning model inside a memory such as a solid state disk (SSD), which avoids consuming host computing power for the inference of the machine learning model.
In embodiments, the machine learning model predicts the hotness of a logical page to be written while the data buffer area is still receiving the data of that logical page from the host, so that the model inference time is hidden under the data transfer time and the performance overhead caused by model prediction is reduced.
In addition, in embodiments, previously obtained metadata usable for hot/cold prediction of the logical page to be written is also stored, further improving the computational performance of the system.
Moreover, classifying hot and cold data with the machine learning model on the memory side can effectively reduce the write amplification caused by data reclamation operations in a memory such as a solid state disk (SSD), improving both the lifetime and the read/write performance of the memory.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following more particular descriptions of exemplary embodiments of the disclosure as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout exemplary embodiments of the disclosure.
FIG. 1 is a system architecture schematic diagram of a computing system according to the present disclosure.
FIG. 2 schematically illustrates a GRU model architecture that may be used by embodiments of the present disclosure.
Fig. 3 schematically illustrates how metadata is managed by the buffer and the storage medium module.
Fig. 4 is a schematic block diagram of a memory controller.
Fig. 5 is a schematic flow chart of a write operation to a memory.
Detailed Description
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The present disclosure provides a hot/cold data classification method based on a machine learning model, for reducing the write amplification produced by garbage collection operations in a memory. The machine learning model is embedded in the memory so that hot and cold data are stored separately at runtime.
Hereinafter, a solid state disk (SSD) is mainly used as an example. Those skilled in the art will appreciate that the aspects of the present disclosure apply equally to other types of memory.
FIG. 1 is a system architecture schematic diagram of a computing system according to the present disclosure.
As shown in fig. 1, a computing system may include a host and memory such as a Solid State Disk (SSD).
The host may perform write operations and read operations on the memory. The upper layer application of the host sends an input/output (I/O) request to the device driver layer. The device driver sends a read/write request to the SSD to perform the read/write operation.
A memory such as an SSD may include a storage medium module such as NAND flash memory. The storage medium module may include a plurality of physical blocks, each physical block including a plurality of physical pages. As described above, read and write operations may be performed in units of physical pages and erase operations in units of physical blocks. A physical block may be, for example, a superblock (SPB); a GC operation is performed on a fully written, closed superblock to write the valid data in it into a new open superblock. When the storage medium module is NAND flash, a physical block may be a flash superblock and a physical page a flash page.
The memory side also includes a page classification module, located, for example, inside the solid state disk. The page classification module may include a machine learning model and may collect at runtime the features the machine learning model needs for inference.
The page classification module may use the machine learning model to predict the hotness of a logical page to be written to the storage medium module, and logical pages predicted to be hot and logical pages predicted to be cold are stored in physical pages of different physical blocks.
The machine learning model of the page classification module may predict the hotness of the logical page to be written while the data buffer area receives the data of that logical page from the host. In this way, the model inference time is hidden under the data transfer time, reducing the performance overhead caused by machine learning prediction.
In an embodiment, for a logical page written by the host, the page classification module may collect the features required for model prediction from the host's write request and use the machine learning model to predict how hot the logical page is. For example, the model may output a binary result indicating whether the logical page is cold or hot. The output of the model need not be binary; it may also be divided into more levels of hotness, e.g., very cold, cold, normal, hot, very hot. Logical pages of different hotness can then be stored in different physical blocks according to the prediction result of the page classification module.
In one embodiment, the machine learning model may be a temporal convolutional network (TCN) model that predicts the next time each logical page written to the solid state disk will be overwritten, in order to determine its hotness.
In another embodiment, the machine learning model may be a long short-term memory (LSTM) model that predicts the hotness of each logical page within a time window of a set length.
In another embodiment, the machine learning model may be a gated recurrent unit (GRU) model.
Hereinafter, the GRU model is used as an example for a more detailed description.
FIG. 2 schematically illustrates a GRU model architecture that may be used by embodiments of the present disclosure.
As shown in fig. 2, the GRU model takes as input a feature sequence consisting of the sets of features collected each time the logical page is written.
For example, if a logical page has been written five times in total, the feature sequence contains five sets of features X_{t-4}, X_{t-3}, X_{t-2}, X_{t-1}, X_t. The first set X_{t-4} holds the features collected when the page was written for the first time (time t-4), the second set X_{t-3} holds the features collected at the second write (time t-3), and so on.
At each prediction, the GRU unit combines the previous intermediate result obtained at the previous prediction (in turn H_{t-4}, H_{t-3}, H_{t-2}, H_{t-1}) with the features collected this time (in turn X_{t-3}, X_{t-2}, X_{t-1}, X_t) to obtain the current intermediate result (in turn H_{t-3}, H_{t-2}, H_{t-1}, H_t), and from the current intermediate result, for example through a fully connected layer, obtains the current hotness prediction result (in turn Y_{t-3}, Y_{t-2}, Y_{t-1}, Y_t).
The intermediate result may be, for example, the intermediate output (hidden state) of the hidden layer; the previous intermediate result may be denoted "prev_hidden" and the current intermediate result "hidden".
Fig. 2 also illustrates the model input/output structure in more detail, taking the prediction at the current time t as an example.
As shown in fig. 2, for each logical page to be written, each set of features in the feature sequence (e.g., X_t) may include at least one of the following features:
- the interval (prev_life) between the previous write time t-1 and the current write time t of the logical page to be written;
- the size (io_len) of the current write request containing the logical page to be written;
- whether the current write request containing the logical page to be written is a sequential write (is_seq);
- the number of times (chunk_write) the logical block to which the logical page to be written belongs was hit in the last several write requests;
- the number of times (chunk_read) the logical block to which the logical page to be written belongs was hit in the last several read requests; and
- the read/write ratio (rw_rate) of the last several requests.
The "latest plural times" herein may mean the latest predetermined number of times, may mean plural times (an indefinite number) within the latest predetermined time, and may mean other meanings.
The time interval (prev_life) between the previous writing time t-1 and the current writing time t is the most direct characteristic for representing the cold and hot degree of the logic page, and can be used as an input characteristic in model prediction.
The other features described above may also be related to some extent to the degree of coldness of the logical pages, and thus may also be provided as input features to the machine learning model for prediction.
For example, write requests such as log writes tend to be very small random writes, which are typically relatively hot. While some relatively large writes, such as multimedia files, tend to be very large sequential writes, typically one write will not be followed by the next long after such a write, which is typically relatively cool. Therefore, the size (io_len) of the current write request including the logical page to be written, continuous write or random write (is_seq), the locality feature (chunk_write, chunk_read) of the write request, and the like may also be provided as an input feature to the machine learning model for prediction.
Regarding locality characteristics, when predicting a certain logical page (assuming a 16KB size), the number of times chunk_write, chunk_read of a logical page in a larger logical block (e.g., a range of 1MB before and after) adjacent to the logical address of this logical page may be counted among the last number of write requests or the last number of read requests (e.g., the last 4096 times, the number of times configurable).
In addition, the read-write ratio rw_rate is a global feature, and in the case of a relatively large number of requests, the write request may reflect the cold or hot degree of the current logical page to be written, so that the read-write ratio rw_rate may be provided as an input feature to the machine learning model for prediction.
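As an illustration of feature collection (a minimal Python sketch written for this description, not code from the patent; the window length, logical page size, and locality-block size are assumed values), the features above could be derived from a sliding window of recent requests as follows:

```python
from collections import deque

WINDOW = 4096          # assumed window length (the text notes this is configurable)
CHUNK = 1 << 20        # assumed 1 MB locality block
PAGE_BYTES = 16 * 1024 # assumed 16 KB logical pages

class FeatureCollector:
    def __init__(self):
        self.window = deque(maxlen=WINDOW)   # recent (is_write, chunk_id) records
        self.last_write_time = {}            # LPN -> timestamp of its last write

    def on_request(self, is_write, lpn, length_bytes, is_seq, now):
        chunk_id = (lpn * PAGE_BYTES) // CHUNK
        feats = None
        if is_write:
            prev = self.last_write_time.get(lpn)
            feats = {
                "prev_life": now - prev if prev is not None else None,
                "io_len": length_bytes,
                "is_seq": is_seq,
                "chunk_write": sum(1 for w, c in self.window if w and c == chunk_id),
                "chunk_read": sum(1 for w, c in self.window if not w and c == chunk_id),
                # one possible definition of rw_rate: read share of recent requests
                "rw_rate": (sum(1 for w, _ in self.window if not w)
                            / max(1, len(self.window))),
            }
            self.last_write_time[lpn] = now
        self.window.append((is_write, chunk_id))
        return feats
```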
As shown in the enlarged lower part of fig. 2, the feature set X_t and the previous intermediate result H_{t-1}, which the GRU unit output at the previous prediction, are input to the GRU unit to obtain the current intermediate result H_t.
Based on the current intermediate result H_t, the fully connected layer is used to obtain the current hotness prediction result Y_t.
Most of the features X_t required by the machine learning model can be obtained by the page classification module from the write request sent by the host.
For example, the size io_len of the current write request and whether it is a sequential write (is_seq) can be computed directly from the information of the current write request. chunk_write, chunk_read, and rw_rate can be computed by maintaining a read/write request information window of a certain length and recording the requests within that window. The window length may be fixed or may vary with circumstances; the window may cover the read/write requests of a certain time span or a certain number of read/write requests.
The interval prev_life, however, requires recording the last write time of each logical page. To this end, the current write time of each logical page can be stored as metadata, to serve as the previous write time the next time that logical page is predicted.
In addition, to reduce the time overhead of a single prediction, the intermediate result of the last step of the recurrent computation over a logical page's feature sequence (H_t in fig. 2) can also be stored as metadata, to serve as the previous intermediate result the next time that page is predicted. The next prediction then only needs the stored previous intermediate result H_{t-1} and the newly collected feature set X_t as input, and a single step of computation yields the prediction result.
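This single-step inference could look as follows (a minimal PyTorch sketch under assumed feature and hidden-state dimensions; it illustrates the caching of the hidden state, not the patent's actual implementation):

```python
import torch
import torch.nn as nn

FEATURES = 6   # prev_life, io_len, is_seq, chunk_write, chunk_read, rw_rate
HIDDEN = 16    # assumed hidden-state size

class PageHotnessGRU(nn.Module):
    def __init__(self):
        super().__init__()
        self.cell = nn.GRUCell(FEATURES, HIDDEN)   # one recurrent step per write
        self.fc = nn.Linear(HIDDEN, 1)             # fully connected output layer

    def step(self, x_t, prev_hidden):
        # prev_hidden is the cached intermediate result H_{t-1} from metadata;
        # a zero vector is used the first time a logical page is seen.
        hidden = self.cell(x_t, prev_hidden)       # current intermediate result H_t
        y_t = torch.sigmoid(self.fc(hidden))       # hotness score Y_t in (0, 1)
        return y_t, hidden                         # hidden is stored back as metadata

model = PageHotnessGRU()
x_t = torch.randn(1, FEATURES)                     # features collected for this write
prev_hidden = torch.zeros(1, HIDDEN)               # first write of this logical page
y, h = model.step(x_t, prev_hidden)
print("predicted hot" if y.item() > 0.5 else "predicted cold")
```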
Thus, the metadata recorded when a logical page is written may include the current write time and the current intermediate result of that logical page.
Storing, for every logical page, the last write time t and the intermediate result H_t of the model computation would incur a high space overhead, so it is not desirable to keep the metadata of all logical pages in a buffer such as RAM (random access memory).
For this reason, these two items of metadata may be stored in metadata physical pages of a storage medium module such as flash memory (when the storage medium module is flash, these may also be referred to as "metadata flash pages").
On the other hand, to avoid reading a metadata physical page every time metadata is needed, and so to improve the prediction efficiency of the machine learning model, the metadata of some logical pages can be kept in a buffer such as RAM.
In this way, the machine learning model of the page classification module can obtain its hotness prediction result from the features of the logical page to be written and the metadata recorded when that page was previously written. The page classification module obtains the prediction features of the logical page to be written from the host's write request, obtains the metadata from the flash memory or the RAM, and then performs inference.
As shown in fig. 1, each physical block (for example, a superblock) of a storage medium module such as NAND flash may include data physical pages and metadata physical pages. The metadata physical pages may be located at the end of the physical block.
A metadata physical page stores the metadata of the logical pages corresponding to the data physical pages in the same physical block; the metadata includes the last write time (T) of each data physical page and the intermediate result (H) of the model computation. The metadata entries in one metadata physical page correspond to consecutive data physical pages in that physical block.
On the other hand, the buffer of a memory such as an SSD may include a data buffer area and a metadata cache area.
The data buffer area may temporarily store the data of logical pages to be written by the host, the data being written to physical pages of the corresponding physical blocks after the page classification module completes its prediction. The data buffer area may also temporarily hold logical page data that the host reads from physical pages of the storage medium module, such as NAND flash.
The metadata cache area may hold the metadata of a portion of the logical pages for the page classification module to retrieve for prediction.
In this way, it is not necessary to read a metadata physical page from the storage medium module, such as NAND flash, every time metadata is needed.
Fig. 3 schematically illustrates how metadata is managed by the buffer and the storage medium module.
As shown in fig. 3, in a storage medium module such as NAND flash, a superblock (physical block) contains a number of data physical pages (data pages) and a number of metadata physical pages (metadata pages), which may be arranged at the end of the superblock.
A metadata page, for example the metadata page with physical page number (PPN, also called "physical address") 408, may store the metadata of a number of preceding data pages of the superblock, e.g., the data pages with physical page numbers "…, 365, 366, …". The metadata may include the last write time T and the previous intermediate result H corresponding to each physical page number PPN.
In addition, to avoid having to read metadata physical pages in order to migrate metadata during garbage collection, the metadata T and H of each data physical page may likewise be saved in the out-of-band (OOB) area of that data physical page.
For example, as shown in fig. 3, user data is stored in the data area (e.g., 16 KB) of the data page with physical page number 365, and the metadata corresponding to that data page is stored in the OOB area (e.g., 256 B) of the page, following the data area.
In this way, when a data reclamation operation is performed on the physical block containing a data physical page, the metadata stored in the page's out-of-band area can be written directly into a metadata physical page of the new physical block, without having to search the metadata pages for the metadata to migrate.
In the metadata cache area, a red-black tree structure may be used, for example, with the PPN of a metadata physical page as the index key, to hold the data of a portion of the metadata physical pages, each of which contains the metadata corresponding to the logical pages of a plurality of data physical pages. A red-black tree is a self-balancing binary search tree data structure, commonly used to implement associative arrays.
The process of acquiring metadata of a logical page to be predicted is further described below with reference to fig. 3.
First, the physical page number (MPPN) of the metadata physical page holding the metadata of the logical page to be written can be determined from the offset, within the corresponding physical block, of the physical page number (PPN) corresponding to the logical page number (LPN) of the logical page to be written.
Here, the PPN of the corresponding physical page is obtained by querying the mapping table with the LPN of the logical page to be written. For example, the logical page with LPN 80 corresponds to the physical page with PPN 126, and the logical page with LPN 81 corresponds to the physical page with PPN 365.
The physical page number MPPN of the metadata physical page containing the required metadata is then calculated from the offset of the PPN within its physical block. For example, the metadata of the physical page with PPN 126 is stored in the metadata physical page with MPPN 160, and the metadata of the physical page with PPN 365 is stored in the metadata physical page with MPPN 408.
Then, the metadata of the current logical page to be written can be looked up in the red-black tree structure held in the metadata cache area of a buffer such as RAM, using the MPPN of the metadata physical page holding that metadata as the index key.
If the lookup hits in the red-black tree, that is, the required metadata physical page is present in the metadata cache area, the required metadata can be read directly from the metadata cache area of the buffer such as RAM.
For example, in fig. 3, for the case LPN 80, PPN 126, MPPN 160, the lookup hits in the red-black tree, so the metadata of the current logical page to be written with LPN 80 is found in the metadata cache area.
On the other hand, if the red-black tree lookup misses, that is, the metadata of the current logical page to be written is not in the metadata cache area, the corresponding metadata physical page is looked up in the physical block containing the physical page of the logical page, and the data of the found metadata physical page is cached into the metadata cache area. The physical page number (MPPN) of the found metadata physical page may also be inserted into the red-black tree.
For example, in fig. 3, for the case LPN 81, PPN 365, MPPN 408, the lookup misses in the red-black tree. The data of the metadata physical page with MPPN 408 is then cached into the metadata cache area for later predictions, and the MPPN value "408" may be inserted into the red-black tree for later lookups.
If the metadata cache area is full, replacement may be performed according to the least-recently-used (LRU) principle: the data of the least recently used metadata physical page cached in the metadata cache area is deleted, and the physical page number (MPPN) of the deleted metadata physical page is removed from the red-black tree accordingly.
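The lookup path can be sketched in Python as follows (an illustration under assumed block geometry; Python's standard library has no red-black tree, so an ordinary dict with LRU ordering stands in for the tree index, and flash.read_meta_page is a hypothetical helper):

```python
from collections import OrderedDict

PAGES_PER_BLOCK = 512        # assumed physical pages per superblock
DATA_PAGES = 408             # assumed: pages 0..407 hold data, the rest metadata
ENTRIES_PER_META_PAGE = 128  # assumed metadata entries per metadata page

class MetadataCache:
    def __init__(self, capacity, flash):
        self.capacity = capacity
        self.flash = flash                   # reads a whole metadata page by MPPN
        self.cache = OrderedDict()           # MPPN -> {offset: (T, H)}; stands in
                                             # for the red-black tree keyed by MPPN

    def mppn_for(self, ppn):
        block = ppn // PAGES_PER_BLOCK
        offset = ppn % PAGES_PER_BLOCK       # offset of the PPN within its block
        meta_index = offset // ENTRIES_PER_META_PAGE
        return block * PAGES_PER_BLOCK + DATA_PAGES + meta_index

    def get(self, ppn):
        mppn = self.mppn_for(ppn)
        if mppn not in self.cache:           # miss: fetch the metadata page
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)        # evict least recently used
            self.cache[mppn] = self.flash.read_meta_page(mppn)
        self.cache.move_to_end(mppn)         # mark as most recently used
        return self.cache[mppn][ppn % ENTRIES_PER_META_PAGE]  # (T, H) for this page
```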
Fig. 4 is a schematic block diagram of a memory controller.
In the case of a solid state disk, the controller may run Flash Translation Layer (FTL) logic, for example.
As shown in fig. 4, the controller 400 may include a cache management module 420 and a flash management module 430 in addition to the page classification module 410 that has been described previously.
The cache management module 420 manages the buffer space, such as the RAM. The flash management module 430 manages the flash space in units of physical pages.
The cache management module 420 and the flash management module 430 may manage the buffer and the storage medium module, such as NAND flash, in the manner described above.
The flow of writing data to memory by the host is described below with reference to fig. 5.
Fig. 5 is a schematic flow chart of a write operation to a memory.
After a memory such as a solid state disk (SSD) receives, in step S510, a write request with write data sent by the host, the logical page data to be written is transferred from the host to the data buffer area of the memory's buffer, for example by DMA (direct memory access), in step S520.
At the same time, the page classification module uses the machine learning model to classify each logical page involved in the write request as cold or hot. The specific operation is as follows.
In step S530, for each logical page involved in the write request, it is first checked whether the metadata cache area contains the metadata the machine learning model needs to predict the logical page to be written.
If the required metadata is present, the flow proceeds directly to step S550.
Otherwise, in step S540, the metadata is first read from the corresponding metadata physical page of the storage medium module, such as the flash memory, into the metadata cache area, and the flow then proceeds to step S550.
In step S550, the machine learning model performs the hot/cold prediction, classifying the logical page to be written as cold or hot.
Then, in step S560, the data is written from the data buffer area of the buffer to the corresponding physical block of the storage medium module, such as NAND flash, according to the model's hot/cold prediction result, with logical pages predicted to be hot and logical pages predicted to be cold stored in physical pages of different physical blocks.
It should be noted that, since the features required for model prediction are contained in the write command sent by the host and the metadata is in the buffer or the storage medium module, the model's prediction for a logical page and the transfer of the page's data from the host to the data buffer area can proceed in parallel. This allows the model inference time to be masked by the data transfer time, reducing the performance overhead caused by machine learning model prediction.
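Putting steps S520 through S560 together, the write path could be sketched as follows (illustrative Python pseudocode reusing the hypothetical helpers sketched above; the patent does not prescribe this exact structure):

```python
def handle_write(request, data_buffer, meta_cache, model, ftl, flash):
    # S520: start DMA of the page data into the data buffer (asynchronous),
    # so that the model inference below overlaps with the transfer.
    dma = data_buffer.start_dma(request)

    placements = []
    for lpn, features in request.pages():             # S530: per-page features
        ppn_old = ftl.lookup(lpn)                     # current mapping, if any
        if ppn_old is not None:
            t_prev, h_prev = meta_cache.get(ppn_old)  # S530/S540: cached or fetched
        else:
            t_prev, h_prev = None, model.zero_hidden()  # first write of this page
        y, h = model.step(features.vector(t_prev), h_prev)  # S550: one GRU step
        placements.append((lpn, "hot" if y > 0.5 else "cold", h))

    dma.wait()                                        # page data must have arrived
    for lpn, temperature, hidden in placements:       # S560: segregated placement
        ppn_new = flash.alloc_page(temperature)       # hot and cold use distinct blocks
        flash.program(ppn_new, data_buffer.data(lpn),
                      oob=(request.now, hidden))      # metadata also kept in the OOB area
        ftl.remap(lpn, ppn_new)
```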
The process of reading data from the memory, such as the solid state disk, by the host may be the same as that of the conventional scheme, and will not be described herein.
The training process for the machine learning model is further described below.
Training of the machine learning model may be performed by the host, and the page classification module may obtain the trained model parameters of the machine learning model from the host.
As shown in fig. 1, the host may further include a model training module configured to train the machine learning model to obtain trained model parameters and to send the trained model parameters to the page classification module of the memory, so that the trained parameters are deployed to the machine learning model.
The model training module may collect, at the device driver layer of the host operating system, the read/write requests sent by upper-layer applications to a memory such as a solid state disk, extract the features required by the model, dynamically label the logical pages written by the host as hot or cold, and construct a training data set for training the machine learning model.
After model training is complete, the model parameters are transmitted to the page classification module of the memory, such as a solid state disk, completing the deployment of the machine learning model in the page classification module.
While the system runs, the model training module continuously collects training data and periodically performs model training to obtain new model parameters, which are transmitted to the page classification module to redeploy the machine learning model, adapting it to continuously changing application workload characteristics.
The specific training process is described below.
The model training module records, for each logical page involved in the write requests the host sends to the memory, the features required for prediction and the write time of the page; when write requests involving the same logical page occur multiple times, it obtains the write interval between the two adjacent writes.
A training sample set is then obtained from the features and write intervals collected for the logical pages involved in a plurality of write requests, and each logical page is labeled cold or hot according to whether its write interval is greater than a threshold.
Specifically, the model training module may divide the write requests issued by the host into windows; the write requests of one window can be regarded as one set of write requests, corresponding to one round of training of the machine learning model. The window size may be fixed or may vary with circumstances; a window may cover the write requests of a certain time span or a certain number of write requests.
For each write request within the window, the model training module may record the features required for model prediction and the time each logical page is written. If a logical page is written again within the same window, the interval between the two writes of that page can be calculated.
Thus, when the window ends, the model training module holds a data set of model prediction features and write intervals for the logical pages, and it then labels each page cold or hot according to whether the page's write interval is above the threshold.
The machine learning model is trained with the labeled training sample set, for example using the collected features as input and the hot/cold page labels as output. The trained model parameters are thus obtained and transmitted to the page classification module of the memory to deploy the machine learning model.
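The labeling step could be sketched as follows (an illustration written for this description; treating pages written only once in a window as unlabeled is an assumption):

```python
def build_training_set(window_requests, threshold):
    """window_requests: list of (time, lpn, features) for one training window."""
    last_seen = {}           # lpn -> (time, features) of the page's previous write
    samples = []
    for now, lpn, feats in window_requests:
        if lpn in last_seen:
            prev_time, prev_feats = last_seen[lpn]
            interval = now - prev_time
            # Label the *previous* write: a short reuse interval means the page is hot.
            label = 1 if interval <= threshold else 0
            samples.append((prev_feats, label))
        last_seen[lpn] = (now, feats)
    return samples           # (feature set, hot/cold label) pairs for supervised training
```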
The write-interval threshold used to label pages cold or hot may be initialized to the inflection point of the cumulative distribution curve of all page write intervals in the first window.
In other words, the inflection point of the cumulative distribution curve of the write intervals of the logical pages involved in one set (for example, the first set) of write requests may be used as the initial value of the threshold.
In subsequent rounds of training (subsequent windows), the threshold for the current round may be determined from the threshold used in the previous round of training (the previous window).
For example, the ranking percentile P of the threshold used in the previous round (the previous window) among the write intervals of the logical pages involved in the current round can be determined. The write intervals corresponding to the ranking percentile P, to one or more percentiles above it (e.g., P+5), and to one or more percentiles below it (e.g., P-5) are then used as candidate thresholds, and the machine learning model is trained once per candidate threshold, yielding a plurality of machine learning models.
The accuracy of the resulting machine learning models is tested, and the machine learning model with the highest accuracy and its parameters are selected.
Thus, the parameters of the machine learning model with the highest accuracy can be transmitted to the page classification module of the memory, and the candidate threshold with the highest accuracy is used as the threshold for the current round.
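The per-round threshold search could be sketched like this (train_fn and eval_fn are hypothetical helpers standing for one training run and one accuracy test; the +/-5 percentile offsets follow the example above):

```python
import numpy as np

def next_round_threshold(intervals, prev_threshold, train_fn, eval_fn):
    """intervals: write intervals observed in the current round (window)."""
    intervals = np.sort(np.asarray(intervals))
    # Ranking percentile P of the previous threshold among this round's intervals.
    p = 100.0 * np.searchsorted(intervals, prev_threshold) / len(intervals)
    candidates = [np.percentile(intervals, q)
                  for q in (p - 5, p, p + 5) if 0 <= q <= 100]
    results = []
    for thr in candidates:
        model = train_fn(thr)               # one training run per candidate threshold
        results.append((eval_fn(model), thr, model))
    accuracy, threshold, best_model = max(results, key=lambda r: r[0])
    return threshold, best_model            # deploy best_model; reuse threshold next round
```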
After one round of training the machine learning model on the logical pages involved in one set of write requests (one training window) and transmitting the resulting model parameters to the page classification module of the memory, the next round of training starts on the logical pages involved in the next set of write requests.
The memory and computing system according to the present disclosure, which classify logical pages as hot or cold with machine learning on the memory side, have been described in detail above.
In embodiments, the prediction of the machine learning model and the transfer of data from the host to the data buffer area can proceed in parallel, and the previous intermediate results required by the model computation can be cached. Both reduce the computational overhead of the machine learning model.
The metadata required by the machine learning model can be stored in a storage medium module such as NAND flash while a metadata cache area is maintained in a buffer such as RAM, which reduces the space overhead of storing metadata while avoiding frequent reads of metadata physical pages.
The main drawback of existing rule-based hot/cold data classification schemes is that a single logical rule can hardly adapt to the wide variety of read/write patterns that host-side application services apply to the solid state disk, so hot/cold classification accuracy is low.
In contrast, the present disclosure uses a machine-learning-based method, and the machine learning model can be a neural network, which can approximate arbitrary functions; this offers better accuracy in predicting hot and cold data.
The main drawback of existing machine-learning-based hot/cold data classification schemes is that the prediction of the machine learning model is performed on the host side. To achieve accurate hot/cold classification, a model prediction must be made for every logical page the host writes to the solid state disk, so host-side prediction consumes considerable host computing power and occupies other service resources. In addition, after host-side prediction, the prediction result must be transmitted to the solid state disk, which requires changing the communication protocol between the host and the solid state disk (such as the NVMe protocol) and thus affects the compatibility of the scheme.
In contrast, the machine learning model prediction of the present disclosure is performed inside the memory, such as a solid state disk, consuming no host computing power and requiring no change to the communication protocol.
Memory and computing systems including the memory according to the present disclosure have been described in detail above with reference to the accompanying drawings.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.
Furthermore, the method according to the present disclosure may also be implemented as a computer program or computer program product comprising computer program code instructions for performing the steps defined in the above-described method of the present disclosure.
Alternatively, the present disclosure may also be implemented as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having stored thereon executable code (or computer program, or computer instruction code) that, when executed by a processor of an electronic device (or computing device, server, etc.), causes the processor to perform the steps of the above-described method according to the present disclosure.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (14)

1. A memory, comprising:
a storage medium module comprising a plurality of physical blocks, each physical block comprising a plurality of physical pages, wherein read and write operations are performed in units of physical pages and erase operations are performed in units of physical blocks; and
a page classification module for predicting, with a machine learning model, the hotness of a logical page to be written to the storage medium module,
wherein logical pages predicted to be hot and logical pages predicted to be cold are stored in physical pages of different physical blocks.
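A sketch of the placement policy of claim 1, keeping one open block per temperature stream so that hot pages and cold pages never share a physical block; the block geometry and class names are assumptions.

```python
# Toy allocator: pages predicted hot and pages predicted cold are appended
# to physical pages of *different* open blocks. Sizes are assumed.
PAGES_PER_BLOCK = 4

class Block:
    def __init__(self, block_id):
        self.block_id = block_id
        self.pages = []

    def full(self):
        return len(self.pages) >= PAGES_PER_BLOCK

class HotColdAllocator:
    def __init__(self):
        self.next_id = 0
        self.open = {"hot": self._new_block(), "cold": self._new_block()}

    def _new_block(self):
        blk = Block(self.next_id)
        self.next_id += 1
        return blk

    def write(self, data, predicted_hot):
        stream = "hot" if predicted_hot else "cold"
        if self.open[stream].full():          # open a fresh block per stream
            self.open[stream] = self._new_block()
        self.open[stream].pages.append(data)  # program one physical page
        return self.open[stream].block_id

alloc = HotColdAllocator()
hot_blk = alloc.write(b"log-entry", predicted_hot=True)
cold_blk = alloc.write(b"archive", predicted_hot=False)
assert hot_blk != cold_blk                    # separate physical blocks
```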
2. The memory of claim 1, wherein,
the machine learning model is a gated recurrent unit (GRU) model; and/or
the machine learning model derives a hotness prediction from the features of the logical page to be written and from metadata recorded when that logical page was previously written.
3. The memory of claim 2, wherein,
the metadata includes the logical page's current write time and current intermediate result, which serve as the previous write time and previous intermediate result the next time that page is predicted, and
the machine learning model computes a current intermediate result from the features of the logical page to be written, its previous write time, and the previous intermediate result obtained when that page was last predicted, and derives the hotness prediction from the current intermediate result.
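The recurrence in claim 3 can be sketched with a single numpy GRU cell: the previous intermediate result h_prev and the previous write time enter as inputs, the new intermediate result h is returned for storage as metadata, and a linear readout (an assumption here, as the disclosure does not fix one) turns h into a hot/cold decision. Weight shapes are likewise assumptions.

```python
# Minimal GRU-cell sketch of claim 3; dimensions and readout are assumed.
import numpy as np

rng = np.random.default_rng(0)
F, H = 6, 8                                   # feature dim, hidden dim (assumed)
Wz, Wr, Wh = (rng.normal(size=(H, F + 1 + H)) for _ in range(3))
w_out = rng.normal(size=H)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_predict(features, prev_write_time, h_prev):
    x = np.concatenate([features, [prev_write_time]])
    xi = np.concatenate([x, h_prev])
    z = sigmoid(Wz @ xi)                      # update gate
    r = sigmoid(Wr @ xi)                      # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([x, r * h_prev]))
    h = (1 - z) * h_prev + z * h_tilde        # new intermediate result
    score = sigmoid(w_out @ h)                # hot/cold score derived from h
    return score > 0.5, h                     # h is stored back as metadata

is_hot, h_new = gru_predict(np.zeros(F), 0.3, np.zeros(H))
```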
4. The memory of claim 2, wherein,
each physical block comprises data physical pages and metadata physical pages, a metadata physical page storing the metadata of the logical pages that correspond to the data physical pages in that physical block.
5. The memory of claim 4, wherein,
the metadata of a data physical page is also saved in the out-of-band area of that data physical page, and
when a data reclamation (garbage collection) operation is performed on the physical block containing the data physical page, the metadata stored in the page's out-of-band area is written into the metadata physical page of the new physical block.
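A toy sketch of this reclamation path, with blocks modeled as dicts: valid data pages migrate together with their out-of-band (OOB) records, which are then consolidated into the destination block's metadata page. All structures here are assumptions.

```python
# Hypothetical garbage-collection sketch for claim 5; structures are assumed.
def garbage_collect(victim_block, new_block):
    collected_metadata = []
    for page in victim_block["data_pages"]:
        if page["valid"]:                               # only valid pages migrate
            new_block["data_pages"].append({"data": page["data"],
                                            "oob": page["oob"],
                                            "valid": True})
            collected_metadata.append(page["oob"])      # OOB metadata follows the data
    new_block["metadata_page"] = collected_metadata     # one metadata page per block
    victim_block["data_pages"].clear()                  # victim can now be erased

victim = {"data_pages": [{"data": b"a", "oob": {"t": 1}, "valid": True},
                         {"data": b"b", "oob": {"t": 2}, "valid": False}]}
fresh = {"data_pages": [], "metadata_page": None}
garbage_collect(victim, fresh)
```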
6. The memory of claim 4, further comprising:
a buffer comprising a data cache area and a metadata cache area,
wherein the data cache area temporarily stores the data of a logical page to be written by the host, the data being written to a physical page of the corresponding physical block once the page classification module has completed its prediction, and
the metadata cache area stores the metadata of some of the logical pages, from which the page classification module fetches metadata for prediction.
7. The memory of claim 6, wherein,
the physical page number of the metadata physical page holding the metadata of a logical page to be written is determined from the offset, within the corresponding physical block, of the physical page number that corresponds to the logical page number of that logical page; and
the data of metadata physical pages is kept in the metadata cache area in a red-black tree structure, in which the metadata of the logical page currently to be written is looked up by the physical page number of the metadata physical page holding it.
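The addressing rule of claim 7 might be sketched as follows; a plain dict stands in for the red-black tree, and the layout constants (pages per block, records per metadata page, metadata pages at the block tail) are assumptions.

```python
# Sketch: derive the metadata page number from a page's offset in its block,
# then look it up in the cache. A dict replaces the red-black tree here.
PAGES_PER_BLOCK = 256          # physical pages per block (assumed)
ENTRIES_PER_META_PAGE = 64     # metadata records per metadata page (assumed)

def metadata_ppn(ppn):
    block = ppn // PAGES_PER_BLOCK
    offset = ppn % PAGES_PER_BLOCK                 # offset inside its block
    meta_index = offset // ENTRIES_PER_META_PAGE
    # metadata pages are assumed to sit at the tail of the block
    return block * PAGES_PER_BLOCK + PAGES_PER_BLOCK - 1 - meta_index

meta_cache = {}                # metadata_ppn -> cached metadata page contents

def lookup_metadata(lpn, l2p):
    ppn = l2p[lpn]                                 # logical-to-physical mapping
    page = meta_cache.get(metadata_ppn(ppn))       # search the (stand-in) tree
    return page.get(lpn) if page else None

l2p = {7: 7}                                       # toy mapping table
meta_cache[metadata_ppn(7)] = {7: {"prev_write_time": 12.5}}
print(lookup_metadata(7, l2p))                     # -> {'prev_write_time': 12.5}
```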
8. The memory of claim 6, wherein,
when the metadata of the logical page currently to be written is absent from the metadata cache area, the corresponding metadata physical page is located in the physical block containing the physical page that corresponds to the logical page, and the data of the located metadata physical page is cached into the metadata cache area; when the metadata cache area is full, the cached data of other metadata physical pages is evicted according to a least-recently-used policy.
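Claim 8's miss-and-evict behavior maps naturally onto an ordered map; the sketch below uses Python's OrderedDict as the LRU structure, with a stub standing in for the flash read. The capacity and names are assumptions.

```python
# LRU metadata cache sketch for claim 8; capacity and stubs are assumed.
from collections import OrderedDict

CACHE_CAPACITY = 4             # metadata pages held in RAM (assumed)
meta_cache = OrderedDict()     # metadata_ppn -> page contents, in LRU order

def read_metadata_page_from_flash(meta_ppn):
    return {"loaded_from": meta_ppn}              # placeholder flash read

def get_metadata_page(meta_ppn):
    if meta_ppn in meta_cache:
        meta_cache.move_to_end(meta_ppn)          # mark as most recently used
        return meta_cache[meta_ppn]
    page = read_metadata_page_from_flash(meta_ppn)  # cache miss: read flash
    if len(meta_cache) >= CACHE_CAPACITY:
        meta_cache.popitem(last=False)            # evict the least recently used
    meta_cache[meta_ppn] = page
    return page
```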
9. The memory of claim 2, wherein the features of the logical page to be written include at least one of:
the time interval between the previous write time and the current write time of the logical page to be written;
the size of the current write request containing the logical page to be written;
whether the current write request containing the logical page to be written is a sequential write;
the number of recent write requests that hit the logical block to which the logical page to be written belongs;
the number of recent read requests that hit the logical block to which the logical page to be written belongs; and
the read/write ratio of the most recent requests.
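As a sketch of how the features enumerated in claim 9 might be assembled into a model input, assuming a sliding window of recent requests (the window length, field names, and lbn_of mapping are illustrative, not from the disclosure):

```python
# Hypothetical feature extraction over a sliding window of recent requests.
from collections import deque

RECENT = deque(maxlen=64)      # most recent requests (assumed window length)

def features_for(lpn, now, prev_write_time, req_size, sequential, lbn_of):
    lbn = lbn_of(lpn)          # logical block the page belongs to
    writes = [r for r in RECENT if r["op"] == "w"]
    reads = [r for r in RECENT if r["op"] == "r"]
    return [
        now - prev_write_time,                              # write time interval
        req_size,                                           # size of current write request
        1.0 if sequential else 0.0,                         # sequential write?
        sum(1 for r in writes if lbn_of(r["lpn"]) == lbn),  # recent write hits on this block
        sum(1 for r in reads if lbn_of(r["lpn"]) == lbn),   # recent read hits on this block
        len(reads) / max(1, len(writes)),                   # read/write ratio
    ]

RECENT.extend([{"op": "w", "lpn": 8}, {"op": "r", "lpn": 9}])
vec = features_for(lpn=8, now=20.0, prev_write_time=12.0,
                   req_size=16, sequential=True, lbn_of=lambda p: p // 4)
```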
10. The memory of claim 2, wherein,
the page classification module obtains trained model parameters of the machine learning model from the host; and/or
the page classification module obtains the features of the logical page to be written from a write request issued by the host; and/or
the machine learning model of the page classification module predicts the hotness of the logical page to be written while the data cache area is receiving that page's data from the host.
11. A computing system, comprising:
the memory according to any one of claims 1 to 10; and
a host that performs write operations and read operations on the memory,
wherein the host comprises a model training module for training the machine learning model to obtain trained model parameters and for sending the trained model parameters to the page classification module of the memory so that they are deployed to the machine learning model.
12. The computing system of claim 11 wherein,
the model training module records, for each logical page involved in a write request sent by the host to the memory, the features required for prediction and the write time of the page, and, when write requests involving the same logical page occur multiple times, obtains the write time interval between two adjacent write requests;
a training sample set is built from the features and write time intervals collected for the logical pages involved in a plurality of write requests, each logical page being labeled cold or hot according to whether its write time interval exceeds a threshold; and
the machine learning model is trained on the labeled training sample set to obtain model parameters, which are transmitted to the page classification module of the memory to deploy the machine learning model.
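The host-side pipeline in claim 12 might look as follows in outline; a logistic-regression-style stand-in replaces the GRU so the sketch stays short, and the label encoding (cold=0, hot=1) is an assumption.

```python
# Hypothetical host-side labeling and training round; names are assumed.
import numpy as np

def build_training_set(write_log, threshold):
    """write_log: list of (lpn, write_time, feature_vector), in time order."""
    last_seen = {}
    samples, labels = [], []
    for lpn, t, feats in write_log:
        if lpn in last_seen:                      # page written more than once
            prev_t, prev_feats = last_seen[lpn]
            samples.append(prev_feats)
            labels.append(0 if t - prev_t > threshold else 1)  # cold=0, hot=1
        last_seen[lpn] = (t, feats)
    return np.array(samples, float), np.array(labels, float)

def train_round(X, y, epochs=200, lr=0.1):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):                       # plain gradient descent
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w                                      # parameters shipped to the SSD

log = [(1, 0.0, [1.0, 0.0]), (1, 5.0, [1.0, 1.0]),
       (2, 1.0, [0.0, 1.0]), (2, 2.0, [1.0, 1.0])]
X, y = build_training_set(log, threshold=3.0)
params = train_round(X, y)
```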
13. The computing system of claim 12 wherein,
after one round of training the machine learning model on the logical pages involved in a group of multiple write requests and transmitting the resulting model parameters to the page classification module of the memory, the next round of training begins on the logical pages involved in the next group of multiple write requests.
14. The computing system of claim 13 wherein,
the inflection point of the cumulative distribution curve of the write time intervals of the logical pages involved in a group of multiple write requests is taken as the initial value of the threshold; and
the percentile rank, among the write time intervals of the logical pages involved in the current training round, of the threshold used in the previous round is determined; the write intervals corresponding to that percentile and to one or more percentiles above it and one or more percentiles below it are taken as candidate thresholds; the machine learning model is trained with each candidate threshold to obtain a plurality of machine learning models; the accuracy of these models is tested; the parameters of the model with the highest accuracy are transmitted to the page classification module of the memory; and that model's candidate threshold is used as the threshold for the current round.
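A sketch of the threshold search in claim 14: the knee of the empirical interval CDF seeds the threshold (the farthest-from-the-chord heuristic used here is an assumed implementation of "inflection point"), and later rounds probe neighbouring percentiles of the previous threshold as candidates.

```python
# Hypothetical threshold-search helpers; the knee heuristic and the
# percentile step are assumptions for illustration only.
import numpy as np

def cdf_knee(intervals):
    x = np.sort(np.asarray(intervals, float))
    y = np.arange(1, len(x) + 1) / len(x)         # empirical CDF
    # distance of each CDF point from the straight line between endpoints
    x0, y0, x1, y1 = x[0], y[0], x[-1], y[-1]
    d = np.abs((y1 - y0) * x - (x1 - x0) * y + x1 * y0 - y1 * x0)
    return x[int(np.argmax(d))]                   # "inflection" (knee) point

def candidate_thresholds(intervals, prev_threshold, step=5.0):
    pct = float(np.mean(np.asarray(intervals) <= prev_threshold)) * 100
    pcts = [max(0.0, pct - step), pct, min(100.0, pct + step)]
    return [float(np.percentile(intervals, p)) for p in pcts]

# Each candidate labels a training set; the model with the highest test
# accuracy wins, and its candidate becomes this round's threshold.
```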
CN202310651848.XA 2023-06-02 2023-06-02 Memory and computing system including memory Pending CN117111834A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310651848.XA CN117111834A (en) 2023-06-02 2023-06-02 Memory and computing system including memory

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310651848.XA CN117111834A (en) 2023-06-02 2023-06-02 Memory and computing system including memory

Publications (1)

Publication Number Publication Date
CN117111834A 2023-11-24

Family

ID=88809899

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310651848.XA Pending CN117111834A (en) 2023-06-02 2023-06-02 Memory and computing system including memory

Country Status (1)

Country Link
CN (1) CN117111834A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117493400A (en) * 2024-01-02 2024-02-02 中移(苏州)软件技术有限公司 Data processing method and device and electronic equipment
CN117493400B (en) * 2024-01-02 2024-04-09 中移(苏州)软件技术有限公司 Data processing method and device and electronic equipment

Similar Documents

Publication Publication Date Title
US10838859B2 (en) Recency based victim block selection for garbage collection in a solid state device (SSD)
US9898200B2 (en) Memory device having a translation layer with multiple associative sectors
US8024545B2 (en) Efficient prefetching and asynchronous writing for flash memory
CN107391774B (en) The rubbish recovering method of log file system based on data de-duplication
US9645922B2 (en) Garbage collection in SSD drives
CN107943719B (en) Flash translation layer control method based on request classification
CN104298610A (en) Data storage system and managing method thereof
CN109710541B (en) Optimization method for Greedy garbage collection of NAND Flash main control chip
US11847058B2 (en) Using a second content-addressable memory to manage memory burst accesses in memory sub-systems
WO2011049051A1 (en) Cache memory and control method thereof
CN109671458A (en) The method of management flash memory module and relevant flash controller
US20190303019A1 (en) Memory device and computer system for improving read performance and reliability
CN117111834A (en) Memory and computing system including memory
CN111722797B (en) SSD and HA-SMR hybrid storage system oriented data management method, storage medium and device
KR20210024189A (en) Biased sampling method for wear leveling
CN109783019B (en) Intelligent data storage management method and device
CN112799590B (en) Differentiated caching method for online main storage deduplication
CN113253926A (en) Memory internal index construction method for improving query and memory performance of novel memory
KR101936364B1 (en) Memory management system using flash memory and method thereof
CN113010091B (en) Method for writing data into solid state disk, method and device for recycling garbage
CN107122124B (en) Data processing method and device
CN115827651A (en) Low-energy-consumption onboard embedded database memory transaction management method and system
CN113268201A (en) Cache management method and device based on file attributes
CN117235088B (en) Cache updating method, device, equipment, medium and platform of storage system
CN117331514B (en) Solid-state disk data compression system and method based on region division

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination