CN116820342A - Data processing method and device of disk array and disk array - Google Patents

Data processing method and device of disk array and disk array

Info

Publication number
CN116820342A
CN116820342A
Authority
CN
China
Prior art keywords
cache
array
request
disk
disk array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310797036.6A
Other languages
Chinese (zh)
Inventor
邸忠辉
梁欣玲
仇锋利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310797036.6A priority Critical patent/CN116820342A/en
Publication of CN116820342A publication Critical patent/CN116820342A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0689Disk arrays, e.g. RAID, JBOD
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a data processing method and device of a disk array, and the disk array itself, belonging to the technical field of storage. The method comprises the following steps: receiving an input/output (IO) request; decomposing the IO request based on a first cache array and a second cache array of the disk array; and executing the decomposed IO request. The first cache array is arranged at the IO request receiving end of the disk array, is used for caching read data of the disk array, and comprises a first preset number of independent cache modules. The second cache array is arranged at the disk access end of the disk array, is used for caching the stripe check data of the disk array, and comprises a second preset number of independent cache modules. The invention can effectively reduce contention for the IO access cache, reduce the number of IO accesses and disk accesses of the RAID, and thereby improve the performance of the RAID and the overall performance of the storage system.

Description

Data processing method and device of disk array and disk array
Technical Field
The present invention relates to the field of storage technologies, and in particular, to a method and an apparatus for processing data in a disk array, and a disk array.
Background
A cache is generally disposed in a redundant array of independent disks (Redundant Array of Independent Disks, RAID, simply referred to as a disk array) to reduce the number of disk accesses of the RAID. However, one RAID IO request task is decomposed into multiple IO tasks for the corresponding stripes, and each stripe IO task is in turn converted into multiple IO tasks for the corresponding disk blocks. The number of IO tasks inside the RAID therefore increases dramatically and cache accesses become very frequent, so contention for the cache is severe. As a result, the performance of the cache inside the RAID is very poor, the cache becomes a bottleneck of RAID performance, and system performance is reduced.
Disclosure of Invention
The invention provides a data processing method and device of a disk array, and the disk array, which are used to overcome the defects in the prior art that cache contention inside the RAID is severe and the cache limits RAID performance.
The invention provides a data processing method of a disk array, which comprises the following steps:
receiving an input/output (IO) request;
decomposing the IO request based on a first cache array and a second cache array of a disk array;
executing the decomposed IO request;
the first cache array is arranged at an input/output (IO) request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules;
The second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules;
wherein the first preset number is different from a second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
According to the data processing method of the disk array, the range of IO requests processed by each cache module in the first cache array is determined according to the number of stripes of the disk array and the first preset number.
According to the data processing method of a disk array provided by the invention, the first cache array and the second cache array based on the disk array decompose the IO request, and the method comprises the following steps:
determining the IO request as a read operation;
and if the target cache module in the first cache array hits the data requested by the IO request, reading the data requested by the IO request from the target cache module.
According to the data processing method of a disk array provided by the invention, the first cache array and the second cache array based on the disk array decompose the IO request, and the method further comprises the following steps:
if no cache module in the first cache array hits the data requested by the IO request, further decomposing the IO request;
wherein said further decomposing of said IO request comprises:
decomposing the IO request into a plurality of stripe IOs according to stripes;
each stripe IO is decomposed into a plurality of data blocks IO and check blocks IO.
According to the data processing method of a disk array provided by the invention, the first cache array and the second cache array based on the disk array decompose the IO request, and the method comprises the following steps:
determining the IO request as a write operation;
decomposing the IO request into a plurality of stripe IOs according to stripes;
decomposing each stripe IO into a plurality of data blocks IO and check blocks IO;
and caching the check block IO through the second cache array.
According to the data processing method of the disk array, the read data is cached by taking a stripe as a unit;
The caching of the read data is realized by the following modes:
calculating the stripe number accessed by the IO request according to the logical block address of the IO request;
judging a cache module accessed by the IO request according to the stripe number;
carrying out hash calculation by using the stripe number accessed by the IO request to obtain a hash value corresponding to the IO request;
the hash value is stored in a first hash table of the corresponding cache module.
According to the data processing method of the disk array provided by the invention, the second cache array is constructed in the following way:
grouping the stripes of the disk array, wherein a third preset number of stripes with adjacent stripe numbers are grouped into a group;
and forming the second cache array according to a thread corresponding to each group, wherein each cache module of the second cache array corresponds to one thread.
According to the data processing method of the disk array, the stripe verification data is cached by taking the stripe as a unit;
the caching of the stripe verification data is realized by the following steps:
carrying out hash calculation by using the stripe numbers of the stripe verification data to obtain hash values corresponding to the stripe verification data;
The hash value is stored in a second hash table of the corresponding thread cache.
The invention also provides a data processing device of the disk array, comprising:
the receiving unit is used for receiving an input/output (IO) request;
the decomposing unit is used for decomposing the IO request based on a first cache array and a second cache array of the disk array;
the execution unit is used for executing the decomposed IO request;
the first cache array is arranged at an input/output (IO) request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules;
the second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules;
wherein the first preset number is different from a second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
The invention also provides a disk array, which comprises a cache matrix, wherein the cache matrix comprises a first cache array and a second cache array;
the first cache array is arranged at an input/output (IO) request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules;
the second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules;
wherein the first preset number is different from a second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a data processing method of a disk array as described in any of the above.
The invention also provides a computer program product comprising a computer program which when executed by a processor implements a data processing method for a disk array as described in any one of the above.
According to the data processing method and device of the disk array, and the disk array provided by the invention, read data of the disk array is cached through the first cache array arranged at the top end of the disk array, and stripe check data is cached through the second cache array arranged at the bottom of the disk array. The two cache arrays cooperate with and complement each other, which can effectively reduce contention for the IO access cache, reduce the number of IO accesses and disk accesses of the RAID, effectively solve the problem that severe cache contention at the RAID layer seriously affects performance, and improve the performance of the RAID and the overall performance of the storage system.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or of the prior art, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. It is apparent that the drawings in the following description show some embodiments of the invention, and that other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic diagram of an internal cache of a RAID in the related art;
FIG. 2 is a flowchart illustrating a method for processing data of a disk array according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a RAID cache matrix according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a first cache array according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a hash table corresponding to a cache of a first cache array according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an LRU linked list corresponding to a cache of a first cache array according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a second cache array according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a hash table of a thread cache of a second cache array according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an LRU linked list of a thread cache according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of a data processing apparatus of a disk array according to an embodiment of the present invention;
fig. 11 is a schematic entity structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In a cloud computing data center, data storage performance is a core concern of users. In a storage system, multiple disks form a disk array to improve the data access performance of the whole disk system, and data security is improved through the redundant data of the disk array. To improve the performance of disk-array-based storage systems, a RAID is typically provided with a cache to cache all input/output (IO) data for the stripes and blocks inside the RAID. By providing the cache, the number of disk accesses of the RAID can be effectively reduced and RAID performance can be greatly improved, thereby improving the performance of the storage system.
Caching is a software mechanism that allows a system to keep in random access memory (Random Access Memory, RAM) some of the data normally stored on disk, so that further accesses to that data can be satisfied quickly without additional disk accesses. Because the same disk data is accessed frequently, how a storage system uses its cache can greatly affect the performance of the storage system.
FIG. 1 is a diagram of an internal cache of a RAID in the related art. As shown in FIG. 1, a RAID is divided into stripes, and each RAID stripe is divided into disk blocks. One RAID IO request task is decomposed into multiple IO tasks for the corresponding stripes, and each stripe IO task is converted into multiple IO tasks for the corresponding disk blocks. A stripe IO task is completed only after all of its disk-block IO tasks are completed, and the IO request task is completed only after all of the decomposed stripe IO tasks are completed. This mechanism of RAID caching causes the number of IO tasks inside the RAID to increase dramatically and cache accesses to become very frequent, so cache contention is severe, the performance of the cache inside the RAID becomes very poor, and the cache becomes a bottleneck of RAID performance. For random IO, the cache module reduces system performance even further.
In order to overcome the defects in the prior art that cache contention inside the RAID is severe and the cache limits RAID performance, the invention provides a data processing method and device of a disk array, and the disk array.
The data processing method and apparatus for a disk array and the disk array of the present invention are described below with reference to fig. 2 to 11.
Fig. 2 is a flow chart of a data processing method of a disk array according to an embodiment of the present invention. As shown in fig. 2, the data processing method of the disk array includes:
step 200, receiving an input/output IO request;
optionally, the IO request includes a read operation or a write operation.
Step 201, decomposing the IO request based on a first cache array and a second cache array of a disk array;
the first cache array is arranged at an input/output (IO) request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules;
the second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules;
Wherein the first preset number is different from a second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
It will be appreciated that the present invention improves upon the caching within a RAID. The invention sets a cache matrix in the RAID, and the cache matrix is divided into two cache arrays, namely a first cache array and a second cache array.
The first cache array is arranged at the top end (also called the upper layer) of the RAID, namely the IO request receiving end, and comprises a first preset number of independent cache modules. The first preset number is denoted Y, that is, the first cache array comprises Y independent cache modules, which can effectively reduce the contention of IOs on the cache.
The first cache array is used for caching read data of the RAID, wherein the read data can be understood as data which needs to be read when the IO request is a read operation. The first cache array can cache hot spot data, and the IO access times of RAID are reduced through caching the hot spot data, so that the number of sub-IOs in RAID is reduced, the disk access times are reduced, and the performance of the RAID system and the overall performance of the storage system are improved.
The second cache array is disposed at the bottom (also called the lower layer) of the RAID, that is, the disk access end, and includes a second preset number of independent cache modules. The second preset number is denoted Z, that is, the second cache array includes Z independent cache modules. The second cache array is a cache array formed by threads, and IOs are bound to threads. Therefore, a stripe's accesses to its thread cache involve no thread contention, and lock-free operation can truly be realized.
It should be noted that the second cache array differs from both the original RAID internal cache and the first cache array: it does not cache data blocks. Because the first cache array already provides IO data caching, the second cache array is transparent to the data-block IOs decomposed by the RAID, which greatly reduces the number of IOs decomposed inside the RAID and reduces contention on the second cache array.
The second cache array is used for caching RAID stripe check data. This is because, for RAID write IOs, especially small-block writes, each IO operation requires read-and-write modification of the check blocks. In particular, for sequential small-block writes, assuming one RAID stripe has U data blocks and V check blocks, the IO for each data block corresponds to 2V check read/write operations, and the entire stripe will incur at least 2VU check operations. In the invention, the second cache array does not cache data blocks and only caches check blocks, so compared with the original RAID internal cache it can cache more check blocks. Caching the check blocks is very important for reducing the write amplification of RAID IOs, reducing the number of disk accesses, reducing contention for the IO access cache, and improving system performance.
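The check-operation arithmetic above can be made concrete with a small sketch; the values of U and V below are example assumptions, not taken from the patent:

```python
# Count check-block operations for sequential small-block writes on one
# stripe with U data blocks and V check blocks, as discussed above.
# U and V here are illustrative example values.

U = 8  # data blocks per stripe (assumed)
V = 2  # check blocks per stripe (assumed)

# each data-block IO must read and then write each of the V check
# blocks: 2*V check operations per data-block write
per_data_block_ops = 2 * V

# writing the whole stripe block by block therefore costs at least 2*V*U
whole_stripe_ops = per_data_block_ops * U
```

With these values a single sequential pass over the stripe already incurs 32 check operations, which is exactly the traffic the check-block cache is meant to absorb.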
The first preset number and the second preset number are different, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
FIG. 3 is a schematic diagram of a RAID cache matrix according to an embodiment of the present invention. As shown in FIG. 3, the first cache array includes Y independent cache modules, namely up cache 0, up cache 1, ..., up cache Y-1. After an input/output (IO) request is received, the IO request is decomposed through the first cache array and the second cache array of the disk array, where the first cache array caches read data and the second cache array caches stripe check data. If the IO request is a read operation, it is first determined whether the first cache array can hit the read data. If it hits, the data is returned directly and no further decomposition is needed. If it misses, the request must be decomposed further: first by stripe, then by disk block. If the IO request is a write operation, the check-block IOs are cached in the second cache array. The decomposition of the IO request is thereby realized.
Step 202, executing the decomposed IO request;
The disk is accessed by executing the decomposed IO requests, completing the reading or writing of the data.
According to the data processing method of the disk array, the first cache array arranged at the top end of the disk array caches read data of the disk array, and the second cache array arranged at the bottom of the disk array caches stripe check data. The two cache arrays cooperate with and complement each other, which can effectively reduce contention for the IO access cache, reduce the number of IO accesses and disk accesses of the RAID, effectively solve the problem that severe cache contention at the RAID layer seriously affects performance, and improve RAID performance and the overall performance of the storage system.
In some embodiments, the range of the IO requests processed by each cache module in the first cache array is determined according to the number of stripes of the disk array and the first preset number.
Specifically, if the number of stripes of the disk array is MAX, the range of IO requests processed by each cache module in the first cache array is MAX/Y.
Fig. 4 is a schematic diagram of a first cache array according to an embodiment of the present invention. As shown in FIG. 4, assuming the RAID consists of MAX stripes, when user IOs access stripe 0 and stripe (MAX/Y) simultaneously, the two IOs access the top-level cache modules up cache 0 and up cache 1 of the RAID, respectively. The two do not need to compete for a lock, which reduces the contention of IOs on the cache.
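The MAX/Y partitioning above can be sketched as follows; MAX_STRIPES and Y are illustrative values, not taken from the patent:

```python
# Hypothetical sketch of how a stripe maps to an upper-layer cache module
# when MAX stripes are split evenly across Y modules (MAX/Y stripes each).

MAX_STRIPES = 1024                      # total stripes in the RAID (assumed)
Y = 8                                   # modules in the first cache array
STRIPES_PER_MODULE = MAX_STRIPES // Y   # IO range handled by each module

def up_cache_index(stripe_no: int) -> int:
    """Index of the first-cache-array module that owns this stripe."""
    return stripe_no // STRIPES_PER_MODULE

# Stripe 0 and stripe MAX/Y fall into different modules (0 and 1),
# so concurrent IOs to them contend for no common lock.
```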
In some embodiments, the decomposing the IO request by the first cache array and the second cache array based on the disk array includes:
determining the IO request as a read operation;
and if the target cache module in the first cache array hits the data requested by the IO request, reading the data requested by the IO request from the target cache module.
It will be appreciated that when the IO request is a read operation, the user wishes to read data from the hard disk. If a target cache module in the first cache array hits the data requested by the IO request, the data is read directly from the target cache module without accessing a disk, so the IO request need not be decomposed further, which reduces the number of decomposed IOs inside the RAID.
In the embodiment of the invention, the first cache array caches the read data, so that the number of IO inside the RAID can be reduced, the number of disk accesses is reduced, and the performance of the RAID system and the overall performance of the storage system are improved.
Based on the foregoing embodiment, the decomposing the IO request by the first cache array and the second cache array based on the disk array further includes:
If the cache module does not hit the data requested by the IO request in the first cache array, further decomposing the IO request;
wherein said further decomposing of said IO request comprises:
decomposing the IO request into a plurality of stripe IOs according to stripes;
each stripe IO is decomposed into a plurality of data blocks IO and check blocks IO.
It will be appreciated that when the IO request is a read operation, the user wishes to read data from the hard disk. If no cache module in the first cache array hits the data requested by the IO request, the IO request needs to be decomposed normally. The specific method is to decompose the IO request into a plurality of stripe IOs according to stripes, and then decompose each stripe IO into a plurality of data-block IOs and a check-block IO.
For example, as shown in FIG. 3, if the request scope of a user IO spans both up cache 0 and up cache 1 and there is no cache hit, the IO is decomposed into two Rd IOs, one per cache scope. Further, each Rd IO is decomposed into a plurality of stripe IOs, such as stripe 0 IO through stripe M-1 IO in FIG. 3, and then each stripe IO is decomposed into a plurality of data-block IOs and check-block IOs.
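The two-level decomposition on a cache miss can be sketched as below; the stripe width and block counts are illustrative assumptions:

```python
# Hedged sketch of the decomposition described above: a user IO (in
# logical blocks) is split by stripe, then each stripe IO yields
# data-block IOs and check-block IOs.

STRIPE_DATA_WIDTH = 64        # logical blocks of data per stripe (assumed)
DATA_BLOCKS_PER_STRIPE = 4    # U (assumed)
CHECK_BLOCKS_PER_STRIPE = 1   # V (assumed)

def decompose(lba: int, length: int):
    """Return (stripe_ios, block_ios) for the request [lba, lba+length)."""
    stripe_ios = []
    pos, end = lba, lba + length
    while pos < end:
        stripe = pos // STRIPE_DATA_WIDTH
        seg_end = min(end, (stripe + 1) * STRIPE_DATA_WIDTH)
        stripe_ios.append((stripe, pos, seg_end - pos))  # one stripe IO
        pos = seg_end
    block_ios = []
    for stripe, _, _ in stripe_ios:       # each stripe IO -> block IOs
        block_ios += [(stripe, "data", d) for d in range(DATA_BLOCKS_PER_STRIPE)]
        block_ios += [(stripe, "check", c) for c in range(CHECK_BLOCKS_PER_STRIPE)]
    return stripe_ios, block_ios
```

A 10-block request starting at LBA 60 crosses a stripe boundary, so it decomposes into two stripe IOs and ten block IOs under these assumptions.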
In the embodiment of the invention, the first cache array caches the read data, so that the number of IO inside the RAID can be reduced, the number of disk accesses is reduced, and the performance of the RAID system and the overall performance of the storage system are improved.
In some embodiments, the decomposing the IO request by the first cache array and the second cache array based on the disk array includes:
determining the IO request as a write operation;
decomposing the IO request into a plurality of stripe IOs according to stripes;
decomposing each stripe IO into a plurality of data blocks IO and check blocks IO;
and caching the check block IO through the second cache array.
It will be appreciated that if the IO request is determined to be a write operation, it indicates that the user wishes to access the hard disk and write data to the hard disk. Decomposing the IO request into a plurality of strip IOs according to the strips; and decomposing each stripe IO into a plurality of data blocks IO and check blocks IO, and caching the check blocks IO through the second cache array.
It should be noted that data-block IOs are transparent to the second cache array and, when executed, directly access the hard disk without competing for the cache.
In the embodiment of the invention, the second cache array caches the stripe verification data, so that the write amplification of RAID IO can be reduced, the number of disk accesses is reduced, and the competition intensity of IO access caches is reduced.
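The write-path benefit of caching only check blocks can be sketched as follows; the names and the `read_from_disk` callback are illustrative assumptions, not the patent's implementation:

```python
# Assumed sketch of the write path: check-block IOs are served through a
# thread-private check cache, so repeated check reads hit memory instead
# of disk, while data-block IOs bypass this cache entirely.

check_cache = {}   # thread-private: stripe_no -> check-block contents

def read_check_block(stripe_no, read_from_disk):
    if stripe_no in check_cache:          # hit: no disk access needed
        return check_cache[stripe_no]
    value = read_from_disk(stripe_no)     # miss: one disk read, then cache
    check_cache[stripe_no] = value
    return value

def write_check_block(stripe_no, new_value):
    check_cache[stripe_no] = new_value    # updated in cache, flushed later
```

Under this sketch, sequential small writes to one stripe read each check block from disk at most once instead of once per data-block IO.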
In some embodiments, the read data is buffered in units of stripes;
the caching of the read data is realized by the following modes:
calculating the stripe number accessed by the IO request according to the logical block address of the IO request;
judging a cache module accessed by the IO request according to the strip number;
carrying out hash calculation by using the stripe number accessed by the IO request to obtain a hash value corresponding to the IO request;
the hash value is stored in a first hash table of the corresponding cache module.
It will be appreciated that the first cache array caches in units of stripes. First, the stripe number accessed by the IO request is calculated from its logical block address (Logical Block Address, LBA), i.e. stripe number = LBA / stripe data width. Which cache module in the first cache array needs to be accessed is then determined from the IO's start and end stripe numbers. A hash calculation is then performed on the stripe number accessed by the IO request, and the hash value is stored in the hash table (namely the first hash table) of the corresponding cache module; the location of the cached data can later be found through the hash value.
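A minimal model of this lookup path, with all sizes assumed for illustration:

```python
# Sketch of the first cache array's stripe-keyed lookup: compute the
# stripe number from the LBA, pick the owning cache module, and key that
# module's hash table (a plain dict here) by the stripe number.

STRIPE_DATA_WIDTH = 64    # logical blocks per stripe (assumed)
STRIPES_PER_MODULE = 128  # MAX/Y for an assumed configuration

# each dict stands in for one module's "first hash table"
modules = [dict() for _ in range(8)]

def cache_read(lba: int):
    stripe = lba // STRIPE_DATA_WIDTH             # stripe number from LBA
    table = modules[stripe // STRIPES_PER_MODULE]
    return table.get(stripe)                      # hit -> data, miss -> None

def cache_fill(lba: int, data):
    stripe = lba // STRIPE_DATA_WIDTH
    modules[stripe // STRIPES_PER_MODULE][stripe] = data
```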
It should be noted that after the IO processing is completed, the data buffered in the first cache array is placed into the least recently used (LRU) linked list of the first cache array, and when memory is insufficient, data is reclaimed starting from the cache entries that were placed into the LRU linked list earliest. Fig. 5 is a schematic diagram of a hash table corresponding to a cache of a first cache array according to an embodiment of the present invention. Fig. 6 is a schematic diagram of an LRU linked list corresponding to a cache of a first cache array according to an embodiment of the present invention.
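The reclamation rule can be sketched with Python's `OrderedDict` standing in for the cache's LRU linked list; the capacity is an assumed value:

```python
# Hedged sketch of LRU reclamation as described above: completed-IO data
# is appended to the list, and when space runs out the oldest entry
# (the one placed in first) is reclaimed.
from collections import OrderedDict

class StripeLru:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()   # oldest entry sits at the front

    def touch(self, stripe_no, data):
        """Record data for a completed IO, evicting when over capacity."""
        if stripe_no in self.entries:
            self.entries.move_to_end(stripe_no)   # refresh recency
        self.entries[stripe_no] = data
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)      # reclaim least recent
```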
In the embodiment of the invention, the read data of the disk array is cached by taking the stripe as a unit, so that the number of IO inside RAID can be reduced, lock competition among different cache modules is not needed, and the competition intensity of IO on the cache is reduced.
In some embodiments, the second cache array is constructed in the following manner:
grouping the stripes of the disk array, wherein a third preset number of stripes with adjacent stripe numbers are grouped into a group;
and forming the second cache array according to a thread corresponding to each group, wherein each cache module of the second cache array corresponds to one thread.
Fig. 7 is a schematic diagram of a second cache array according to an embodiment of the present invention. As shown in Fig. 7, the RAID stripes are grouped, with B stripes having adjacent stripe numbers forming one group; meanwhile, a corresponding number of cache modules is built according to the number of threads, each cache module corresponding to one thread, and the IOs are bound to threads. Because the stripes handled by a thread only ever access that thread's own cache module, there is no thread contention for the thread cache, and truly lock-free access can be realized.
In the embodiment of the invention, the second cache array is organized per thread, and the IOs are bound to threads, so that stripe accesses to the thread caches involve no thread contention, which can effectively improve RAID performance.
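The grouping and thread-binding rule can be sketched as follows. The modulo mapping of groups onto threads is an illustrative assumption; the patent only requires that each group correspond to exactly one thread:

```python
def stripe_to_thread(stripe, group_size, num_threads):
    """Bind a stripe to the thread (and thread cache) that owns its group (sketch).

    group_size is the 'third preset number' B of adjacent stripes per group;
    the group -> thread modulo mapping is an illustrative assumption.
    """
    group = stripe // group_size     # adjacent stripe numbers fall into the same group
    return group % num_threads       # each group is served by exactly one thread
```

Since every stripe of a group always resolves to the same thread, that thread's cache module is never accessed concurrently, which is why no lock is needed on it.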
In some embodiments, the stripe verification data is cached in stripe units;
the caching of the stripe verification data is realized by the following steps:
carrying out hash calculation by using the stripe numbers of the stripe verification data to obtain hash values corresponding to the stripe verification data;
the hash value is stored in a second hash table of the corresponding thread cache.
It will be appreciated that the check data cached by the second cache array is cached in units of stripes. A hash calculation is performed on all check data according to the stripe number to which it belongs, and the result is stored in the hash table (namely the second hash table) of the corresponding thread cache.
None of the data in the second cache array is released immediately after the IO processing is completed; instead, it is placed into the LRU linked list of the thread cache. When the memory of the RAID system is insufficient and needs to be reclaimed, memory is reclaimed starting from the cache entries placed into the LRU linked list earliest. Fig. 8 is a schematic diagram of a hash table of a thread cache of the second cache array according to an embodiment of the present invention. Fig. 9 is a schematic diagram of an LRU linked list of a thread cache according to an embodiment of the present invention.
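A sketch of how a stripe's check (parity) data might land in the second hash table of its owning thread's cache. The list-of-dicts representation and the group-to-thread mapping mirror the grouping described for Fig. 7 but are illustrative assumptions:

```python
def cache_parity(thread_caches, stripe, parity_block, group_size):
    """Store a stripe's check data in its owning thread's second hash table (sketch).

    thread_caches is assumed to be one dict per thread, standing in for the
    per-thread second hash table; no lock is taken because only the owning
    thread ever touches its own table.
    """
    thread_id = (stripe // group_size) % len(thread_caches)  # same grouping rule as Fig. 7
    bucket = hash(stripe)                 # hashed by the stripe number the parity belongs to
    thread_caches[thread_id][bucket] = parity_block
    return thread_id
```

On a later partial-stripe write, the owning thread can look the parity up by the same stripe-number hash instead of re-reading it from disk, which is where the reduction in write amplification comes from.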
In the embodiment of the invention, the second cache array caches the stripe verification data, so that the write amplification of RAID IO can be reduced, the number of disk accesses is reduced, the competition intensity of IO access cache is reduced, and the performance of RAID and the performance of a storage system are improved.
The data processing apparatus of a disk array provided by the present invention is described below, and the data processing apparatus of a disk array described below and the data processing method of a disk array described above may be referred to correspondingly.
Fig. 10 is a schematic structural diagram of a data processing apparatus of a disk array according to an embodiment of the present invention. As shown in fig. 10, a data processing apparatus 1000 of a disk array includes:
a receiving unit 1010, configured to receive an input/output (IO) request;
a decomposition unit 1020, configured to decompose the IO request based on a first cache array and a second cache array of a disk array;
an execution unit 1030 configured to execute the decomposed IO request;
the first cache array is arranged at an input/output (IO) request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules;
the second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules;
wherein the first preset number is different from a second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
Optionally, the range of the IO requests processed by each cache module in the first cache array is determined according to the number of stripes of the disk array and the first preset number.
Optionally, the decomposing unit 1020 is specifically configured to:
determining the IO request as a read operation;
and if the target cache module in the first cache array hits the data requested by the IO request, reading the data requested by the IO request from the target cache module.
Optionally, the decomposing unit 1020 is further configured to:
if no cache module in the first cache array hits the data requested by the IO request, further decomposing the IO request;
wherein said further decomposing of said IO request comprises:
decomposing the IO request into a plurality of stripe IOs according to stripes;
each stripe IO is decomposed into a plurality of data blocks IO and check blocks IO.
Optionally, the decomposing unit 1020 is further configured to:
determining the IO request as a write operation;
decomposing the IO request into a plurality of stripe IOs according to stripes;
decomposing each stripe IO into a plurality of data blocks IO and check blocks IO;
and caching the check block IO through the second cache array.
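The write-path decomposition above can be sketched as follows; the tuple representation of block IOs and the single check ("P") block per stripe are illustrative assumptions (a RAID 5 style layout), not details fixed by the patent:

```python
def decompose_write(lba, length, stripe_data_width, data_disks):
    """Decompose a write IO into stripe IOs, each split into data block IOs
    plus one check block IO (illustrative sketch)."""
    start = lba // stripe_data_width
    end = (lba + length - 1) // stripe_data_width
    stripe_ios = []
    for stripe in range(start, end + 1):
        data_ios = [(stripe, d) for d in range(data_disks)]  # data block IOs
        check_io = (stripe, "P")   # check block IO, to be cached via the second cache array
        stripe_ios.append((stripe, data_ios, check_io))
    return stripe_ios
```

Only the check block IOs are routed through the second cache array; the data block IOs go to their disks directly.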
Optionally, the read data is cached in units of stripes;
the caching of the read data is realized by the following modes:
calculating the stripe number accessed by the IO request according to the logical block address of the IO request;
judging a cache module accessed by the IO request according to the stripe number;
carrying out hash calculation by using the stripe number accessed by the IO request to obtain a hash value corresponding to the IO request;
the hash value is stored in a first hash table of the corresponding cache module.
Optionally, the second cache array is constructed in the following manner:
grouping the stripes of the disk array, wherein a third preset number of stripes with adjacent stripe numbers are grouped into a group;
and forming the second cache array according to a thread corresponding to each group, wherein each cache module of the second cache array corresponds to one thread.
Optionally, the stripe verification data is cached in a stripe unit;
the caching of the stripe verification data is realized by the following steps:
carrying out hash calculation by using the stripe numbers of the stripe verification data to obtain hash values corresponding to the stripe verification data;
the hash value is stored in a second hash table of the corresponding thread cache.
It should be noted that the data processing apparatus for a disk array provided by the embodiment of the present invention can implement all the method steps of the foregoing data processing method embodiment for a disk array and achieve the same technical effects; the parts and beneficial effects identical to those of the method embodiment are not described in detail here.
The present invention also provides a disk array, comprising a cache matrix, wherein the cache matrix comprises a first cache array and a second cache array;
the first cache array is arranged at an input/output (IO) request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules;
the second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules;
wherein the first preset number is different from a second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
The disk array provided in the embodiment of the present invention may refer to the description in the foregoing embodiment, and will not be repeated here.
Fig. 11 illustrates a schematic diagram of the physical structure of an electronic device. As shown in Fig. 11, the electronic device may include: a processor 1110, a communication interface (Communications Interface) 1120, a memory 1130 and a communication bus 1140, wherein the processor 1110, the communication interface 1120 and the memory 1130 communicate with each other via the communication bus 1140. The processor 1110 may invoke logic instructions in the memory 1130 to perform the data processing method for a disk array, the method comprising: receiving an input/output (IO) request; decomposing the IO request based on a first cache array and a second cache array of a disk array; executing the decomposed IO request; wherein the first cache array is arranged at an IO request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules; the second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules; and the first preset number is different from the second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
Further, the logic instructions in the memory 1130 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored on a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, the software product comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program which can be stored on a non-transitory computer-readable storage medium and which, when executed by a processor, performs the data processing method for a disk array provided by the above methods, the method comprising: receiving an input/output (IO) request; decomposing the IO request based on a first cache array and a second cache array of a disk array; executing the decomposed IO request; wherein the first cache array is arranged at an IO request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules; the second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules; and the first preset number is different from the second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
In yet another aspect, the present invention further provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the data processing method for a disk array provided by the above methods, the method comprising: receiving an input/output (IO) request; decomposing the IO request based on a first cache array and a second cache array of a disk array; executing the decomposed IO request; wherein the first cache array is arranged at an IO request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules; the second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules; and the first preset number is different from the second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
The apparatus embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solution, in essence, or the part contributing to the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the respective embodiments or in some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for processing data of a disk array, comprising:
receiving an input-output IO request;
decomposing the IO request based on a first cache array and a second cache array of a disk array;
executing the decomposed IO request;
the first cache array is arranged at an input/output (IO) request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules;
the second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules;
wherein the first preset number is different from a second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
2. The method for processing data of a disk array according to claim 1, wherein the range of IO requests processed by each cache module in the first cache array is determined according to the number of stripes of the disk array and the first preset number.
3. The method for processing data of a disk array according to claim 1, wherein the decomposing the IO request by the first cache array and the second cache array based on the disk array comprises:
determining the IO request as a read operation;
and if the target cache module in the first cache array hits the data requested by the IO request, reading the data requested by the IO request from the target cache module.
4. The method for processing data of a disk array according to claim 3, wherein the first cache array and the second cache array based on the disk array decompose the IO request, further comprising:
if the cache module does not hit the data requested by the IO request in the first cache array, further decomposing the IO request;
wherein said further decomposing of said IO request comprises:
decomposing the IO request into a plurality of stripe IOs according to stripes;
each stripe IO is decomposed into a plurality of data blocks IO and check blocks IO.
5. The method for processing data of a disk array according to claim 1, wherein the decomposing the IO request by the first cache array and the second cache array based on the disk array comprises:
determining the IO request as a write operation;
decomposing the IO request into a plurality of stripe IOs according to stripes;
decomposing each stripe IO into a plurality of data blocks IO and check blocks IO;
and caching the check block IO through the second cache array.
6. The method for processing data of a disk array according to claim 1, wherein the read data is buffered in units of stripes;
the caching of the read data is realized by the following modes:
calculating the stripe number accessed by the IO request according to the logical block address of the IO request;
judging a cache module accessed by the IO request according to the stripe number;
carrying out hash calculation by using the stripe number accessed by the IO request to obtain a hash value corresponding to the IO request;
the hash value is stored in a first hash table of the corresponding cache module.
7. The method for processing data of a disk array according to claim 1, wherein the second cache array is constructed in the following manner:
grouping the stripes of the disk array, wherein a third preset number of stripes with adjacent stripe numbers are grouped into a group;
and forming the second cache array according to a thread corresponding to each group, wherein each cache module of the second cache array corresponds to one thread.
8. The method for processing data of a disk array according to claim 7, wherein the stripe verification data is buffered in stripe units;
the caching of the stripe verification data is realized by the following steps:
carrying out hash calculation by using the stripe numbers of the stripe verification data to obtain hash values corresponding to the stripe verification data;
the hash value is stored in a second hash table of the corresponding thread cache.
9. A data processing apparatus for a disk array, comprising:
the receiving unit is used for receiving the IO input and output request;
The decomposing unit is used for decomposing the IO request based on a first cache array and a second cache array of the disk array;
the execution unit is used for executing the decomposed IO request;
the first cache array is arranged at an input/output (IO) request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules;
the second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules;
wherein the first preset number is different from a second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
10. A disk array, comprising: the cache matrix comprises a first cache array and a second cache array;
the first cache array is arranged at an input/output (IO) request receiving end of the disk array and is used for caching read data of the disk array, and the first cache array comprises a first preset number of independent cache modules;
the second cache array is arranged at a disk access end of the disk array and is used for caching stripe verification data of the disk array, and the second cache array comprises a second preset number of independent cache modules;
wherein the first preset number is different from a second preset number, the first preset number is related to the number of threads allocated to the first cache array, and the second preset number is related to the number of threads allocated to the second cache array.
CN202310797036.6A 2023-06-30 2023-06-30 Data processing method and device of disk array and disk array Pending CN116820342A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310797036.6A CN116820342A (en) 2023-06-30 2023-06-30 Data processing method and device of disk array and disk array


Publications (1)

Publication Number Publication Date
CN116820342A true CN116820342A (en) 2023-09-29

Family

ID=88114262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310797036.6A Pending CN116820342A (en) 2023-06-30 2023-06-30 Data processing method and device of disk array and disk array

Country Status (1)

Country Link
CN (1) CN116820342A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117234430A (en) * 2023-11-13 2023-12-15 苏州元脑智能科技有限公司 Cache frame, data processing method, device, equipment and storage medium
CN117234430B (en) * 2023-11-13 2024-02-23 苏州元脑智能科技有限公司 Cache frame, data processing method, device, equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination