CN117632021A - Cache data management method, device and equipment - Google Patents


Publication number: CN117632021A
Authority: CN (China)
Prior art keywords: flash memory, data block, data, read, memory data
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202311630274.4A
Other languages: Chinese (zh)
Inventor: 向春艳
Current assignee: New H3C Cloud Technologies Co Ltd (the listed assignee may be inaccurate; Google has not performed a legal analysis)
Original assignee: New H3C Cloud Technologies Co Ltd
Application filed by New H3C Cloud Technologies Co Ltd
Priority to CN202311630274.4A; publication of CN117632021A

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a cache data management method, apparatus, and device for solving the technical problem that the index of a cache disk occupies excessive memory. The invention improves the index data structure used in memory to manage flash memory data blocks: one management unit manages a group of flash memory data blocks through bitmaps, and the read/write flows are improved accordingly, so as to reduce the memory space occupied by the data index created by the cache acceleration instance (scache).

Description

Cache data management method, device and equipment
Technical Field
The present invention relates to the field of computing and storage technologies, and in particular, to a method, an apparatus, and a device for managing cache data.
Background
In an environment where flash memory and disk storage are used together (referred to as a hybrid flash environment), a Solid State Disk (SSD) built on a flash memory medium serves as a cache for a Hard Disk Drive (HDD) built on a magnetic medium: it provides a block read/write interface upwards and exchanges data with the HDD downwards to improve random input/output (I/O) performance.
Typically, an SSD is divided into multiple partitions, each of which is paired with an HDD to create a cache acceleration instance, i.e., a scache instance, on which an Object Storage Device (OSD) instance is then created.
An SSD disk typically includes a header for recording global information, a metadata area, and a flash data block area. The metadata area records flash metadata and comprises a plurality of metadata blocks (meta blocks); the flash data block area caches disk data and comprises a plurality of flash data blocks (data blocks). One metadata block includes a plurality of flash metadata items (flash_meta), and each flash_meta corresponds to one data block in the flash memory.
In the prior art, the cache acceleration instance creates a cache data index in memory to accelerate data access. Management units (units) in the index correspond one-to-one with flash memory data blocks (datablks) on the SSD, so the index must be sized in proportion to the SSD. When the configured SSD space is large, the index formed by the management units occupies a correspondingly large amount of memory, making memory resources tight or insufficient.
For example, assuming each management unit is 24 bytes (24 B) and each flash memory data block is 4 KB, the ratio of management-unit space to flash-data-block space is about 6/1000 (sizeof(unit)/sizeof(data block) = 24 B / 4 KB), i.e., if 1 TB of flash memory space is allocated, the management units in memory occupy 6 GB. As the capacity of the cache disk increases, more memory is allocated to management units, resulting in significant consumption of memory resources.
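The arithmetic in this example can be sketched in C (a minimal illustration; the constant and function names are ours, not from the patent):

```c
#include <stdint.h>

#define UNIT_SIZE  24UL    /* bytes per management unit (sizeof(unit))  */
#define BLOCK_SIZE 4096UL  /* bytes per flash data block (4 KB)         */

/* Memory consumed by a one-to-one index for a given flash capacity:
   one 24-byte unit per 4 KB block, i.e. about 6/1000 of the capacity. */
static uint64_t index_overhead(uint64_t flash_bytes) {
    return flash_bytes / BLOCK_SIZE * UNIT_SIZE;
}
```

For 1 TB of flash this yields 6 GiB of index memory, matching the figure above.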
Disclosure of Invention
In view of the above, the present invention provides a method, apparatus and device for managing cache data, which are used for solving the technical problem that the index of the cache disk occupies too much memory.
According to one aspect of the embodiments of the present invention, a method for managing cache data is provided, where the method includes:
an index for accelerating data access is created in memory by a cache acceleration instance; the index associates, through management units, flash memory data blocks located in a cache disk, and the flash memory data blocks cache data blocks of the disk; one management unit associates a plurality of consecutive flash memory data blocks, identifies through a first identification field in the management unit whether each flash memory data block has cached disk data, and identifies through a second identification field whether each flash memory data block is being used by another I/O.
Further, the method further comprises: when a read request is processed, judging whether the data block to be read is cached in a flash memory data block according to the address range of the data to be read and the first identification field; in the case that it is cached, judging according to the second identification field whether the corresponding flash memory data block is being used by another I/O; and if it is not, reading the data block to be read directly from the corresponding flash memory data block, and setting the flag bit corresponding to the flash memory data block in the second identification field during reading.
Further, the method further comprises: when a write request is processed, judging whether the data to be written is cached in a flash memory data block according to the address range of the data to be written and the first identification field; in the case that it is cached, judging according to the second identification field whether the corresponding flash memory data block is being used by another I/O; and if it is not, double-writing the data to be written into the corresponding flash memory data block and the disk, and setting the flag bit corresponding to the flash memory data block in the second identification field during writing.
Further, when the read request or the write request is processed, if the corresponding flash memory data block is judged to be used by other I/Os according to the second identification field, the read or write operation is performed again when the corresponding flash memory data block is not used by the other I/Os.
Further, if it is determined that the data block to be written is not cached in the flash memory data block according to the address range of the data to be written and the first identification field, a new management unit is created in the index, the data to be written is written into the flash memory data block and the disk associated with the newly created management unit, the flag bit corresponding to the flash memory data block in the second identification field is set during writing, and the flag bit corresponding to the flash memory data block in the first identification field is set after writing is completed.
Further, when processing the write request, if the data volume to be written is determined to be greater than a preset write-through threshold, the data block to be written is written into the disk, and if the covered data blocks are determined to be cached into the flash memory data blocks according to the address range to be written, the corresponding flag bits of the covered flash memory data blocks in the first identification field and the second identification field are cleared.
Further, when processing the read request, before executing the determining whether the data block to be read is cached in the flash memory data block according to the address range of the data to be read and the first identification field, the method further includes:
judging, according to the address range of the data to be read, whether the amount of data to be read exceeds a read-through watermark; if so, reading the data to be read directly from the disk; if not, executing the step of judging whether the data block to be read is cached in the flash memory data block according to the address range of the data to be read and the first identification field.
Further, the first identification field is an effective cache bitmap field, the second identification field is a mutually exclusive bitmap field, and each binary bit of the bitmap field corresponds to one flash memory data block.
According to another aspect of the embodiment of the present invention, there is also provided a cache data management apparatus, where the apparatus is applied to a device running a cache acceleration instance, the apparatus including:
the system comprises an index creation module, a management unit and a storage module, wherein the index creation module is used for creating an index for accelerating data access in a memory by caching an acceleration instance, the index is used for associating flash memory data blocks positioned in a cache disk through the management unit, and the flash memory data blocks are used for caching data blocks in the disk; wherein a management unit associates a plurality of consecutive blocks of flash data, and identifies whether the blocks of flash data are cached with disk data via a first identification field in the management unit, and identifies whether the blocks of flash data are being used by other I/Os via a second identification field.
Further, the apparatus further comprises:
the read request processing module is used for: when a read request is processed, judging whether the data block to be read is cached in a flash memory data block according to the address range of the data to be read and the first identification field; in the case that it is cached, judging according to the second identification field whether the corresponding flash memory data block is being used by another I/O; and if it is not, reading the data block to be read directly from the corresponding flash memory data block, and setting the flag bit corresponding to the flash memory data block in the second identification field during reading;
the write request processing module is used for: when a write request is processed, judging whether the data to be written is cached in a flash memory data block according to the address range of the data to be written and the first identification field; in the case that it is cached, judging according to the second identification field whether the corresponding flash memory data block is being used by another I/O; and if it is not, double-writing the data to be written into the corresponding flash memory data block and the disk, and setting the flag bit corresponding to the flash memory data block in the second identification field during writing.
The device provided by the invention can be realized in a mode of software, hardware or a combination of software and hardware. When implemented as a software module, the program code of the software module is loaded into a storage medium of the device, and the program code in the storage medium is read and executed by a processor.
Drawings
To more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them.
FIG. 1 is a schematic diagram of the data structure of a flash data index according to an embodiment of the present invention;
FIG. 2 is a schematic diagram showing a memory management unit managing a plurality of flash memory blocks according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating state switching of a flash memory data block according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating steps of a method for managing cache data according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating steps for processing a read request according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating steps for processing a write request according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device for implementing the method for managing cached data according to an embodiment of the present invention.
Detailed Description
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to limit the embodiments of the invention. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be understood that although the terms first, second, third, etc. may be used in embodiments of the present invention to describe various information, the information should not be limited by these terms; they are only used to distinguish information, entities, or steps of the same or similar kind from one another, not to describe a particular sequence or order. For example, first information may also be referred to as second information and, similarly, second information as first information, without departing from the scope of the embodiments. Furthermore, the word "if" as used herein may be interpreted as "when," "upon," or "in response to determining." The term "and/or" in the present invention merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. In the description of the present invention, unless otherwise indicated, "a plurality" means two or more. "At least one of" and similar expressions mean any combination of the listed items, including any combination of single or plural items; for example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
The invention aims to provide a cache data management scheme for solving the technical problem that the index managing the flash memory data blocks on a cache disk occupies too much memory. The basic idea of the invention is: improve the index data structure used in memory to manage flash memory data blocks, let one management unit manage a group of flash memory data blocks through bitmaps, and improve the read/write flows accordingly, so as to reduce the memory space occupied by the management units in the data index created by the cache acceleration instance.
Based on this basic idea, it should be noted that the steps shown in the flowcharts of the figures may be performed in a computer system as, for example, a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
Fig. 1 is a schematic diagram of the data structure of a flash memory data index according to an embodiment of the present invention. As shown in the figure, to accelerate data lookup while the storage system reads, writes, flushes, or evicts stored data, the cache acceleration (scache) instance creates a cache data index (simply, the index) in memory. The index may be implemented on a hash table, with multiple management units (such as unit-1, unit-2, unit-3) linked into a bucket (such as bucket-1) of the hash table. Instead of a one-to-one correspondence between management units and flash memory data blocks, the invention creatively adopts a one-to-many management structure: without increasing the size of the management unit's data structure, one management unit manages multiple flash memory data blocks (datablk) on the SSD, reducing the memory occupied by the cache data index severalfold.
FIG. 2 is a schematic diagram illustrating one memory management unit managing a plurality of flash memory data blocks according to an embodiment of the invention. As illustrated, one management unit manages n flash data blocks. In this embodiment the number n of flash data blocks a management unit can manage is configurable, and the maximum value of n may be determined by testing in a specific application scenario; for example, it may be set to 8 in some system environments. Take a management unit data structure of 24 bytes and a flash data block of 4 KB as an example: with one-to-one management, the ratio of management-unit space to flash-data-block space is about 6/1000; with one-to-many management, the flash space each management unit can manage is multiplied, greatly reducing the memory consumed by the cache data index and improving system performance.
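As a sketch of the saving (the function name is an assumption), grouping n consecutive blocks under one management unit divides the one-to-one index overhead by n:

```c
#include <stdint.h>

/* Index memory when one 24-byte unit manages n consecutive 4 KB blocks. */
static uint64_t grouped_overhead(uint64_t flash_bytes, unsigned n) {
    const uint64_t unit_size = 24, block_size = 4096;
    return flash_bytes / (block_size * n) * unit_size;
}
```

With n = 8, the 6 GiB of index memory from the earlier 1 TB example drops to 768 MiB.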
TABLE 1
Management unit data structure
  Attribute member                            Attribute type   Default value
  address field (addr)                        uint64:39        -
  valid cache bitmap field (valid)            uint64:8         0
  mutually exclusive bitmap field (in_use)    uint64:8         0
Table 1 is an example of the attribute fields of the management unit data structure according to an embodiment of the present invention; the management unit includes at least three fields:
Address field (addr): represents the address offset, on the HDD disk, of the storage position of the content of the first of the n flash memory data blocks (datablk) managed by this management unit; the storage positions of the contents of the n flash memory data blocks are consecutive both on the flash disk and on the disk. The association between the management unit and its flash memory data blocks may be achieved by address mapping.
Valid cache bitmap field (valid): indicates in bitmap form whether the content of a data block to be accessed is cached in the SSD cache disk. Each bit of the bitmap corresponds to one flash memory data block: a binary 1 indicates that the content of the corresponding data block is cached in the flash memory data block managed by this unit, and a binary 0 indicates that it is not. For example, this field takes 8 bits by default (maximum value 0xff); each bit corresponds to one flash data block in the SSD disk, so 8 bits correspond to 8 consecutive flash data blocks in the SSD disk.
Mutually exclusive use bitmap field (in_use): indicates in bitmap form whether each of the managed n flash data blocks (datablk) is being used by another I/O. Each bit of the bitmap corresponds to one flash memory data block: a binary 1 indicates that the block is occupied by another I/O request, and a binary 0 indicates that it is not. The bitmap is used to control mutually exclusive read/write access to the managed flash data blocks. For example, the field occupies 8 bits by default (maximum value 0xff), each bit corresponding to one flash data block in the SSD disk, so 8 bits correspond to 8 consecutive flash data blocks. The purpose of exclusive access is to ensure data consistency: for example, while one I/O request is writing data into a flash memory data block, no other I/O request may read or write it, and the storage system enforces this exclusivity according to the bitmap.
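A hypothetical C rendering of the Table 1 layout and of the bitmap operations described above; the patent gives only the field names and widths, so the struct and helper names here are assumptions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout matching Table 1: address offset plus the two
   8-bit bitmaps, packed into bitfields of one 64-bit word. */
struct mgmt_unit {
    uint64_t addr   : 39;  /* HDD offset of the first managed block  */
    uint64_t valid  : 8;   /* bit i = 1: block i holds cached data   */
    uint64_t in_use : 8;   /* bit i = 1: block i is held by some I/O */
};

static bool block_cached(const struct mgmt_unit *u, unsigned i) {
    return (u->valid >> i) & 1u;
}
static bool block_busy(const struct mgmt_unit *u, unsigned i) {
    return (u->in_use >> i) & 1u;
}
static void block_acquire(struct mgmt_unit *u, unsigned i) {
    u->in_use |= 1u << i;      /* set flag bit while the I/O runs  */
}
static void block_release(struct mgmt_unit *u, unsigned i) {
    u->in_use &= ~(1u << i);   /* clear flag bit after completion  */
}
```

The three bitfields occupy 55 bits in total, so all of them still fit within a single uint64_t word.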
FIG. 3 is an example of state switching of a flash memory data block according to an embodiment of the present invention. A management unit in memory serves both as the management unit for a group of flash memory data blocks and as the control unit for I/O access. The bits of the mutually exclusive bitmap (in_use) field change as the state of the corresponding flash data block on the SSD disk changes, but only the "use" state in FIG. 3 affects the in_use field: its bits are modified only when the corresponding flash data block is in the "use" state. As in the example of FIG. 3, as the storage system's I/O processing progresses, a flash data block normally switches among a create state, an idle waiting state (free chain), a used/access state (bucket chain), an evicted state (LRU eviction chain), a waiting state (wait chain), and a release state.
Fig. 4 is a flowchart of a method for managing cache data according to an embodiment of the present invention. The method is applied to a device running a cache acceleration instance and includes:
step 401, creating an index for accelerating data access in a memory by the cache acceleration instance;
the index is used for associating flash memory data blocks in a cache disk through a management unit, wherein the flash memory data blocks are used for caching data blocks in the disk; wherein, a management unit associates a plurality of continuous flash memory data blocks, and identifies whether the flash memory data blocks are cached with disk data or not through a first identification field in the management unit, and identifies whether the flash memory data blocks are being used by other I/Os or not through a second identification field;
In the embodiment of the invention, the cache disk may be a solid-state disk using a flash storage medium, or a disk of another storage medium whose access rate is higher than that of magnetic media. In some application scenarios, the cache disk may be divided into multiple partitions, each of which may be paired with a disk, or a volume formed by multiple disks, to create a cache acceleration instance, which in turn creates the OSD object storage device instance.
To increase the data access rate, the scache instance creates an index in memory for fast access to the cached data blocks in the cache disk. A management unit (unit) in the index corresponds to n flash data blocks (datablk) in the cache disk (a 1-to-n correspondence), where n is configurable as a system configuration item.
In one embodiment of the present invention, a management unit may be 24 bytes and may manage 8 flash data blocks of 4 KB each; that is, one management unit manages 32 KB of flash space as a group.
In an embodiment of the present invention, the first identification field is a valid buffer bitmap field, the second identification field is a mutually exclusive bitmap field, and each binary bit of the bitmap field corresponds to a flash memory data block.
Step 402, when processing the read request, judging whether the data block to be read is cached in the flash memory data block according to the address range of the data to be read and the first identification field, and judging whether the corresponding flash memory data block is used by other I/O according to the second identification field when judging that the data block to be read is cached in the flash memory data block, if judging that the data block to be read is not used by other I/O, directly reading the data block to be read from the corresponding flash memory data block, and setting the flag bit corresponding to the corresponding flash memory data block in the second identification field during reading.
When a read request is processed, the address range of the data to be read is first matched in the index against the address fields of the management units to judge whether the request hits. If a management unit is hit, the first identification field is used to judge whether the data block to be read is cached in a flash memory data block; if it is not cached, the data block to be read is read from the disk and cached into the corresponding flash memory data block.
If the data block to be read is judged not to be cached in the flash memory data block according to the address range of the data to be read and the first identification field, a new management unit is created in the index, the data block to be read is cached in the flash memory data block associated with the new management unit from the disk, and the corresponding flag bit of the flash memory data block in the first identification field is set.
Step 403, when processing the write request, judging whether the data to be written is cached in a flash memory data block according to the address range of the data to be written and the first identification field; in the case that it is cached, judging according to the second identification field whether the corresponding flash memory data block is being used by another I/O; and if it is not, writing the data to be written into both the corresponding flash memory data block and the disk, and setting the flag bit corresponding to the flash memory data block in the second identification field during writing.
Based on the above steps, when processing the read request or the write request, if it is determined that the corresponding flash memory data block is used by other I/O according to the second identification field, the read or write operation is performed again when the corresponding flash memory data block is not used by other I/O.
If it is determined that the data block to be written is not cached in the flash memory data block according to the address range of the data to be written and the first identification field, a new management unit is created in the index, the data to be written is written into the flash memory data block and the disk associated with the newly created management unit, the flag bit corresponding to the flash memory data block in the second identification field is set during writing, and after writing is completed, the flag bit corresponding to the flash memory data block in the first identification field is set to identify that the data to be written is cached in the corresponding flash memory data block of the cache disk.
When processing a write request, if the data quantity to be written is determined to be larger than a preset write-through threshold value, writing the data block to be written into a disk, and if the covered data blocks are determined to be cached into the flash memory data blocks according to the address range to be written, clearing corresponding zone bits of the covered flash memory data blocks in a first identification field and a second identification field.
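The invalidation step described here, clearing the flag bits of the blocks covered by a large write-through, might be sketched as follows (the function names and the 8-block unit span are assumptions):

```c
#include <stdint.h>

#define BLK 4096u  /* flash data block size */

/* Bitmask of the unit's blocks whose 4 KB extent overlaps the write
   range [w_off, w_off + w_len) on the disk. */
static uint8_t covered_mask(uint64_t unit_addr, uint64_t w_off, uint64_t w_len) {
    uint8_t mask = 0;
    for (unsigned i = 0; i < 8; i++) {
        uint64_t b = unit_addr + (uint64_t)i * BLK;
        if (b < w_off + w_len && b + BLK > w_off)  /* block overlaps write */
            mask |= 1u << i;
    }
    return mask;
}

/* Clear the covered blocks' flag bits in both bitmap fields so stale
   cache contents are never read back after the write-through. */
static void invalidate(uint8_t *valid, uint8_t *in_use, uint8_t mask) {
    *valid  &= ~mask;
    *in_use &= ~mask;
}
```

covered_mask marks every block overlapped by the written range; invalidate then clears those bits in both the first and second identification fields, as the paragraph above requires.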
FIG. 5 is a flowchart illustrating the steps of processing a read request according to an embodiment of the present invention; in the cache data management method, processing a read request includes:
step 501, an src cache acceleration instance acquires an address range of data to be read in a read request, and applies for I/O related context resources;
After receiving the read request, the storage engine issues it to the corresponding scache cache acceleration instance for processing. The scache instance obtains the address range of the data to be read passed down by the upper layer, consisting of a starting read offset and a data amount, and then applies for read-I/O-related context resources through a preparation_io function.
Step 502, judging whether the data to be read hits in the index according to its address range;
The scache instance queries the index for a hit according to the address range of the data to be read. Taking the hash table implementation as an example: if, by address-range comparison, the addr field of one or more management units linked in a bucket falls within the read address range, the flag bit corresponding to each flash memory data block in the valid cache bitmap (valid) field of each such management unit is further checked, where 0 represents a miss and 1 represents a hit. A hit in the index represents a hit in the cache disk.
In a specific application scenario of the present invention, the minimum unit of disk reading and writing is 4 KB, so flash memory data blocks are also allocated in 4 KB units. Each binary bit in the valid cache bitmap field therefore corresponds to one 4 KB flash memory data block in the cache disk: a bit of 1 indicates that the 4 KB data block to be read has been cached from the disk into the corresponding flash memory data block, and a bit of 0 indicates that it has not.
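Since blocks are 4 KB, mapping a data address to its bit in the valid bitmap is a simple division from the unit's addr field; a hedged sketch (all names are ours):

```c
#include <stdbool.h>
#include <stdint.h>

#define BLK 4096u  /* minimum disk I/O unit, matching the block size above */

/* Which of the unit's bits covers data_addr (0..7 for an 8-block unit). */
static unsigned bit_index(uint64_t unit_addr, uint64_t data_addr) {
    return (unsigned)((data_addr - unit_addr) / BLK);
}

/* Cache hit iff the covering bit is set in the unit's valid bitmap. */
static bool read_hit(uint8_t valid, uint64_t unit_addr, uint64_t data_addr) {
    return (valid >> bit_index(unit_addr, data_addr)) & 1u;
}
```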
Step 503, in case of hit, judging whether the corresponding flash memory data block is used by other I/O based on the mutually exclusive use bitmap field;
judging whether the flash memory data block for caching the data block to be read is used by other I/Os according to the mutually exclusive use bitmap in_use field, if the binary bit corresponding to the flash memory data block in the in_use is 1, the flash memory data block is currently used by the other I/Os (for example, other I/Os are writing data into the flash memory data block), and if the binary bit is 0, the flash memory data block is not used by the other I/Os.
Step 504, switching the read request to a waiting state in case it is determined that the corresponding flash memory data block is used by other I/O;
if the binary bit corresponding to the flash memory data block in in_use is 1, the current read request is added to the waiting linked list and rechecked later; if the bit is still 1 at the next check, waiting continues, and if it is 0, the read can proceed.
Step 505. In case it is determined that the block is not used by other I/Os, reading the hit flash memory data block from the cache disk and setting the corresponding bit in the in_use field to 1 during the read;
if the binary bit corresponding to the flash memory data block in in_use is 0, the data block to be read is read from that flash memory data block, i.e., the content of the hit flash memory data block is read from the cache disk and returned to the upper layer, with the flag bit corresponding to the block in the in_use field set to 1 during the read. After the read of each data block completes, its flag bit in the in_use field is cleared to 0; after all data blocks have been read, step 510 is executed to release the control authority of the management unit.
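A minimal sketch of this hit path, with the management unit reduced to a plain dict and the cache disk to a dict keyed by offset (all names are illustrative assumptions):

```python
BLOCK = 4096  # 4K flash data block, per the text

def read_block(unit, offset, cache_disk, wait_list, request):
    """Steps 503-505: check in_use, hold it for the duration of the read, clear it after."""
    i = (offset - unit["addr"]) // BLOCK
    if unit["in_use"] >> i & 1:        # step 504: busy -> queue the request and wait
        wait_list.append(request)
        return None
    unit["in_use"] |= 1 << i           # step 505: mark the block busy during the read
    try:
        data = cache_disk[offset]      # return the hit block's content to the upper layer
    finally:
        unit["in_use"] &= ~(1 << i)    # clear the flag once the read completes
    return data
```

A queued request returning None stands in for parking the I/O on the waiting linked list until the next check.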
Step 506, in case of a miss, applying for a new management unit, reading the data block to be read from the disk, and caching it into a flash memory data block associated with the new management unit;
in the case of a miss, the data to be read must be cached from the disk into the cache disk, so a new management unit is applied for the data block to be read. While the data block is being cached from the disk into a flash memory data block associated with the new management unit, the binary bit corresponding to that block in the in_use field is set to 1 to prevent use by other I/Os; after caching completes, the corresponding binary bit in the valid field is set to 1 to identify that the data block to be read is cached in the corresponding flash memory data block.
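The miss path can be sketched in the same dict-based style; the 16-block unit size and the ordering (in_use held during the copy, valid set only after it completes) follow the text, while the structures themselves are assumptions:

```python
BLOCK = 4096
BLOCKS_PER_UNIT = 16   # illustrative unit size

def cache_on_miss(offset, disk, cache_disk, index):
    """Step 506: apply for a new management unit and populate the flash block from disk."""
    addr = offset - offset % (BLOCKS_PER_UNIT * BLOCK)
    unit = {"addr": addr, "valid": 0, "in_use": 0}
    index.append(unit)                  # link the new unit into the in-memory index
    i = (offset - addr) // BLOCK
    unit["in_use"] |= 1 << i            # keep other I/O off the block while caching
    cache_disk[offset] = disk[offset]   # cache the data block from disk to cache disk
    unit["valid"] |= 1 << i             # only now is the cached copy valid
    unit["in_use"] &= ~(1 << i)         # done; release the block
    return unit
```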
Step 510, releasing the control authority of the management unit, and releasing the I/O related context resource.
In an embodiment of the present invention, before executing step 502, the method further includes a step of judging whether a read-through condition is satisfied; if the read-through condition is satisfied, the data is read through directly without executing step 502.
Read-through means that when the amount of data to be read exceeds the read-through watermark, the data is read directly from the disk. For example, with the read-through watermark set to 256K, when it is determined from the address range of the data to be read that the amount of data to be read exceeds 256K, the data is read directly from the disk.
If it is determined from the address range of the data to be read that the amount of data does not exceed the read-through watermark, step 502 is performed to judge whether the data hits in the index according to the address range; hit data is read directly from the cache disk, while missed data must be cached into the cache disk while being read.
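The read-through decision amounts to a single threshold check; the 256K watermark below is the example value from the text, and the routing labels are illustrative:

```python
READ_THROUGH_WATERMARK = 256 * 1024  # 256K example watermark from the text

def route_read(length):
    """Pre-step of 502: reads larger than the watermark bypass the cache entirely."""
    return "disk" if length > READ_THROUGH_WATERMARK else "index_lookup"
```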
FIG. 6 is a flowchart illustrating the processing of a write request according to an embodiment of the present invention. In the method for managing cache data, the processing of a write request includes:
step 601. The srcache cache acceleration instance obtains the address range of the data to be written in the write request, and applies for the I/O related context resource;
after receiving the write request, the storage engine issues it to the corresponding srcache cache acceleration instance for processing. The srcache instance obtains the address range of the data to be written, passed down by the upper layer and composed of the write start offset and the data amount, and then applies for write I/O related context resources through the preparation_io function.
Step 602, judging whether the data volume to be written is larger than or equal to a write-through threshold value;
in an embodiment of the present invention, before data is written, it is judged whether the amount of data to be written reaches a preset write-through threshold (e.g., 4 Mbytes); if so, write-through is performed. Write-through means that the data to be written is written directly to the disk and not to the cache disk.
Step 603, judging whether the data to be written hits in the index according to the address range of the data to be written;
the srcache instance queries the index for a hit according to the address range of the data to be written. Taking the hash table organization as an example, if address range comparison shows that the addr address field of one or more management units linked to the hash bucket falls within the write address range, the flag bit corresponding to each flash memory data block in that range in the valid cache bitmap (valid) field of each such management unit is further checked, where 0 indicates a miss and 1 indicates a hit. A hit in the index represents that the corresponding data block to be written is in the cache disk.
Step 604, in case of hit, determining whether the corresponding flash memory data block is used by other I/O based on the mutually exclusive use bitmap field;
whether the flash memory data block caching the data block to be written is being used by other I/Os is judged according to the mutually exclusive use bitmap (in_use) field: if the binary bit corresponding to the flash memory data block in in_use is 1, the block is currently used by another I/O (for example, another I/O is writing data into it or reading data from it); if the bit is 0, the block is not in use by other I/Os.
Step 605. In case it is determined that it is used by other I/O, switching the write request to a wait state;
if the binary bit corresponding to the flash memory data block in in_use is 1, the current write request is added to the waiting linked list and rechecked later; if the bit is still 1 at the next check, waiting continues, and if it is 0, the write can proceed.
Step 606, in case of miss, applying for a new management unit;
step 607, writing the data block to be written into both the corresponding flash memory data block and the disk, and setting the flag bit corresponding to that flash memory data block in the in_use field to 1 during the write.
Step 608, after the double writing is completed, releasing the control authority of the management unit, and then executing step 610;
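The double write of steps 607-608 can be sketched as follows, with the same illustrative dict-based structures as above:

```python
BLOCK = 4096

def double_write(unit, offset, data, cache_disk, disk):
    """Steps 607-608: write the block to both the flash data block and the disk,
    holding the in_use bit for the duration of the double write."""
    i = (offset - unit["addr"]) // BLOCK
    unit["in_use"] |= 1 << i        # guard the block against other I/O while writing
    cache_disk[offset] = data       # write to the flash data block in the cache disk
    disk[offset] = data             # and to the backing disk
    unit["valid"] |= 1 << i         # the cached copy now reflects the disk
    unit["in_use"] &= ~(1 << i)     # double write complete; release the block
```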
step 609, in the case that the write-through threshold is exceeded, writing the data to be written through directly to the disk, and releasing the flash memory data blocks occupied by the overwritten data blocks in the cache disk together with the corresponding management units.
In the write-through case, if it is determined from the address range to be written that the overwritten data blocks in the index are already cached in flash memory data blocks, whether each corresponding flash memory data block is used by other I/O is first judged as in steps 604 and 605; in the unused case, the valid and in_use flag bits of the flash memory data blocks to be overwritten in the management unit are cleared to 0.
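Step 609's invalidation of an overwritten cached block can be sketched as follows (assuming the busy check of steps 604-605 has already passed; names are illustrative):

```python
BLOCK = 4096

def write_through(unit, offset, data, disk):
    """Step 609: write straight to disk and clear the overwritten block's
    valid and in_use flag bits so the stale cached copy is discarded."""
    disk[offset] = data                 # bypass the cache disk entirely
    i = (offset - unit["addr"]) // BLOCK
    unit["valid"] &= ~(1 << i)          # cached copy no longer matches the disk
    unit["in_use"] &= ~(1 << i)         # clear the mutual-exclusion bit as well
```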
Step 610. Release I/O related context resources.
Fig. 7 is a schematic structural diagram of an electronic device for implementing the method for managing cache data according to an embodiment of the present invention, where the device 700 includes: a processor 710 such as a Central Processing Unit (CPU), a communication bus 720, a communication interface 740, and a memory 730. Wherein processor 710 and memory 730 may communicate with each other via communication bus 720. The memory 730 stores a computer program that, when executed by the processor 710, performs the functions of the steps of the cache data management method provided by the present invention.
Memory refers to a device that stores computer programs and/or data on some storage medium, and may be a Volatile Memory (VM) or a Non-Volatile Memory (NVM). Volatile memory is internal memory that exchanges data directly with the processor; it can be read and written at any time, is fast, and serves as the temporary data store for the operating system and other running programs. It may be synchronous dynamic random access memory (Synchronous Dynamic Random Access Memory, SDRAM), dynamic random access memory (Dynamic Random Access Memory, DRAM), or the like. Non-volatile memory uses a persistent storage medium, has large capacity, and can store data permanently; it may be a storage class memory (Storage Class Memory, SCM), a Solid State Disk (SSD), NAND flash memory, a magnetic disk, or the like. SCM is the general name for a class of new storage media between memory and flash memory, a composite storage technology combining the characteristics of persistent storage with those of memory, whose access speed is slower than that of DRAM but faster than that of an SSD.
The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processing, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It should be appreciated that embodiments of the invention may be implemented or realized by computer hardware, a combination of hardware and software, or by computer instructions stored in non-transitory (or referred to as non-persistent) memory. The method may be implemented in a computer program using standard programming techniques, including a non-transitory storage medium configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose. Furthermore, the operations of the processes described in the present invention may be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes (or variations and/or combinations thereof) described herein may be performed under control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications), by hardware, or combinations thereof, collectively executing on one or more processors. The computer program includes a plurality of instructions executable by one or more processors.
Further, the method may be implemented in any type of computing platform operatively connected to a suitable computing platform, including, but not limited to, a personal computer, mini-computer, mainframe, workstation, network or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and so forth. Aspects of the invention may be implemented in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optical read and/or write storage medium, RAM, ROM, etc., such that it is readable by a programmable computer, which when read by a computer, is operable to configure and operate the computer to perform the processes described herein. Further, the machine readable code, or portions thereof, may be transmitted over a wired or wireless network. When such media includes instructions or programs that, in conjunction with a microprocessor or other data processor, implement the steps described above, the invention described herein includes these and other different types of non-transitory computer-readable storage media. The invention also includes the computer itself when programmed according to the methods and techniques of the present invention.
The foregoing is merely exemplary of the present invention and is not intended to limit the present invention. Various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A method for managing cache data, the method comprising:
the method comprises the steps that an index for accelerating data access is created in a memory by a cache acceleration instance, the index is used for associating flash memory data blocks in a cache disk through a management unit, and the flash memory data blocks are used for caching data blocks in the disk; wherein a management unit associates a plurality of consecutive blocks of flash data, and identifies whether the blocks of flash data are cached with disk data via a first identification field in the management unit, and identifies whether the blocks of flash data are being used by other I/Os via a second identification field.
2. The method according to claim 1, wherein the method further comprises:
when a read request is processed, judging whether the data block to be read is cached in the flash memory data block according to the address range of the data to be read and the first identification field, judging whether the corresponding flash memory data block is used by other I/O according to the second identification field under the condition that the data block to be read is cached in the flash memory data block, if the data block to be read is judged not to be used by other I/O, directly reading the data block to be read from the corresponding flash memory data block, and setting a flag bit corresponding to the corresponding flash memory data block in the second identification field during reading.
3. The method according to claim 1, wherein the method further comprises:
when a write request is processed, judging whether the data to be written is cached in a flash memory data block according to the address range of the data to be written and the first identification field, judging whether the corresponding flash memory data block is used by other I/Os according to the second identification field in the case that the data to be written is cached in the flash memory data block, and if the data to be written is judged not to be used by other I/Os, double-writing the data to be written into the corresponding flash memory data block and the disk, and setting the corresponding flag bit of the corresponding flash memory data block in the second identification field during writing.
4. A method according to claim 2 or 3, characterized in that,
when processing the read request or the write request, if the corresponding flash memory data block is judged to be used by other I/O according to the second identification field, waiting for the other I/O to not use the corresponding flash memory data block, and then performing read or write operation.
5. The method of claim 3, wherein,
if it is determined that the data block to be written is not cached in the flash memory data block according to the address range of the data to be written and the first identification field, a new management unit is created in the index, the data to be written is written into the flash memory data block and the disk associated with the newly created management unit, the flag bit corresponding to the flash memory data block in the second identification field is set during writing, and the flag bit corresponding to the flash memory data block in the first identification field is set after writing is completed.
6. The method of claim 3, wherein,
when processing a write request, if the data quantity to be written is determined to be larger than a preset write-through threshold value, writing the data block to be written into a disk, and if the covered data blocks are determined to be cached into the flash memory data blocks according to the address range to be written, clearing the corresponding flag bits of the covered flash memory data blocks in the first identification field and the second identification field.
7. The method of claim 2, wherein,
when processing the read request, before executing the judging whether the data block to be read is cached in the flash memory data block according to the address range of the data to be read and the first identification field, the method further comprises:
judging whether the amount of data to be read exceeds the read-through watermark according to the address range of the data to be read; if so, reading the data to be read directly from the disk, and if not, executing the step of judging whether the data block to be read is cached in the flash memory data block according to the address range of the data to be read and the first identification field.
8. The method of claim 1, wherein,
the first identification field is an effective cache bitmap field, the second identification field is a mutually exclusive bitmap field, and each binary bit of the bitmap field corresponds to one flash memory data block.
9. A cache data management apparatus, the apparatus being applied to a device running a cache acceleration instance, the apparatus comprising:
the system comprises an index creation module, a management unit and a storage module, wherein the index creation module is used for creating an index for accelerating data access in a memory by caching an acceleration instance, the index is used for associating flash memory data blocks positioned in a cache disk through the management unit, and the flash memory data blocks are used for caching data blocks in the disk; wherein a management unit associates a plurality of consecutive blocks of flash data, and identifies whether the blocks of flash data are cached with disk data via a first identification field in the management unit, and identifies whether the blocks of flash data are being used by other I/Os via a second identification field.
10. The apparatus of claim 9, wherein the apparatus further comprises:
the read request processing module is used for judging whether the data block to be read is cached in the flash memory data block according to the address range of the data to be read and the first identification field when the read request is processed, judging whether the corresponding flash memory data block is used by other I/Os according to the second identification field when the data block to be read is judged to be cached in the flash memory data block, and directly reading the data block to be read from the corresponding flash memory data block if the data block to be read is judged not to be used by the other I/Os, and setting a flag bit corresponding to the corresponding flash memory data block in the second identification field during reading;
the write request processing module is used for judging whether the data to be written is cached in the flash memory data block according to the address range of the data to be written and the first identification field when the write request is processed, judging whether the corresponding flash memory data block is used by other I/O according to the second identification field when the data to be written is judged to be cached in the flash memory data block, and double-writing the data to be written into the corresponding flash memory data block and the disk if the data to be written is judged not to be used by other I/O, and setting the flag bit corresponding to the corresponding flash memory data block in the second identification field during writing.
11. An electronic device is characterized by comprising a processor, a communication interface, a storage medium and a communication bus, wherein the processor, the communication interface and the storage medium are communicated with each other through the communication bus;
a storage medium storing a computer program;
a processor for implementing the method of any of claims 1-8 when executing a computer program stored on a storage medium.
CN202311630274.4A 2023-11-30 2023-11-30 Cache data management method, device and equipment Pending CN117632021A (en)

Publications (1)

Publication Number Publication Date
CN117632021A true CN117632021A (en) 2024-03-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination