CN109344012B - Data reconstruction control method, device and equipment

Publication number
CN109344012B
Authority
CN
China
Prior art keywords
data block
disk
data
persistence level
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811072330.6A
Other languages
Chinese (zh)
Other versions
CN109344012A (en
Inventor
张天洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Technologies Co Ltd Chengdu Branch
Original Assignee
New H3C Technologies Co Ltd Chengdu Branch
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Technologies Co Ltd Chengdu Branch
Priority to CN201811072330.6A
Publication of CN109344012A
Application granted
Publication of CN109344012B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a data reconstruction control method, apparatus and device, which are applied to a disk in a distributed storage cluster or a controller in a centralized storage cluster, and the method includes: determining the persistence level of the data block according to the redundancy degree of the data block in the current cluster; and controlling the I/O resources of the disk for reconstructing the data block according to a preset I/O resource allocation principle and the persistence level. By the data reconstruction control method, the data reconstruction control device and the data reconstruction control equipment, data persistence and system availability of a cluster storage system can be balanced, and overall performance of the system is improved.

Description

Data reconstruction control method, device and equipment
Technical Field
The present disclosure relates to the field of cluster storage technologies, and in particular, to a data reconstruction control method, apparatus, and device.
Background
Cluster storage technology aggregates the storage space of a plurality of storage nodes (also called storage devices or hosts) into a storage pool that provides a uniform access interface and management interface to a service host. The storage pool and the service host form a cluster storage system (or storage cluster). Applications on the service host can transparently access and use the disks on all storage nodes through the access interface, so that the performance of the storage nodes and the utilization of the disks can be fully exploited. Data is written to and read from multiple storage nodes according to a certain rule to obtain higher concurrent access performance.
Data persistence and system availability are important indexes for evaluating a cluster storage system. Data persistence generally refers to the ability to keep data from being lost within one year, and availability refers to the ability of the storage system to continuously provide storage service to the outside. On the one hand, the higher the values of the two indexes, the better the reliability and stability of the system; on the other hand, the two indexes constrain each other, so strengthening one weakens the other.
Take a disk failure as an example; the annual failure rate of a disk is about 2-4%. When a disk fails, the associated disks that hold data related to the data stored on the failed disk need to perform a data reconstruction operation. To optimize data persistence, all I/O resources of the associated disks would be used for data reconstruction. However, if an application on the service host then needs the I/O resources of an associated disk for read or write operations, the associated disk cannot respond, which reduces the availability of the cluster storage system. Conversely, to optimize system availability when a disk fails, all I/O resources of the associated disks would preferentially serve the read and write requests of the service host, but the reconstruction of data then slows down, which reduces data persistence.
Disclosure of Invention
In view of this, an object of the present disclosure is to provide a data reconstruction control method, apparatus and device, which can balance the data persistence and system availability of a cluster storage system and improve the overall performance of the system.
In order to achieve the above purpose, the technical scheme adopted by the disclosure is as follows:
In a first aspect, the present disclosure provides a data reconstruction control method, which is applied to a disk in a distributed storage cluster or a controller in a centralized storage cluster, and the method includes: determining the persistence level of the data block according to the redundancy degree of the data block in the current cluster; controlling the I/O resources of the disk for reconstructing the data block according to a preset I/O resource allocation principle and the persistence level; in the preset I/O resource allocation principle, the persistence level and the I/O resources allocated by the disk are in an inverse relation.
In a second aspect, the present disclosure provides a data reconstruction control apparatus, applied to a disk in a distributed storage cluster or a controller in a centralized storage cluster, including: a persistence level determining module, used for determining the persistence level of the data block according to the redundancy degree of the data block in the current cluster; and an I/O resource control module, used for controlling the I/O resources of the disk for reconstructing the data block according to a preset I/O resource allocation principle and the persistence level; in the preset I/O resource allocation principle, the persistence level and the I/O resources allocated by the disk are in an inverse relation.
In a third aspect, the disclosed embodiments provide a data reconstruction control device, comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the processor executing the machine-executable instructions to implement the above method.
In a fourth aspect, the disclosed embodiments provide a machine-readable storage medium having stored thereon machine-executable instructions that, when invoked and executed by a processor, cause the processor to implement the above-described method.
According to the data reconstruction control method, apparatus, device and machine-readable storage medium described above, the persistence level of a data block is determined according to the redundancy degree of the data block in the current cluster, and the I/O resources of the disk for reconstructing the data block are controlled according to the persistence level. The I/O resources used to reconstruct the data block are thereby matched to its persistence level, which balances the data persistence and system availability of the cluster storage system and improves the overall performance of the system.
Additional features and advantages of the disclosure will be set forth in the description which follows, or in part may be learned by the practice of the above-described techniques of the disclosure, or may be learned by practice of the disclosure.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic diagram of a distributed cluster storage system using a replication mechanism according to an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of data block reconstruction when a disk fails according to an embodiment of the present disclosure;
Fig. 3 is a flowchart of a data reconstruction control method according to an embodiment of the present disclosure;
Fig. 4 is a schematic diagram of a centralized cluster storage system according to an embodiment of the present disclosure;
Fig. 5 is a schematic structural diagram of a data reconstruction control device according to an embodiment of the present disclosure;
Fig. 6 is a schematic structural diagram of another data reconstruction control device according to an embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of a data reconstruction control device according to an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the embodiments of the present disclosure will be described clearly and completely with reference to the accompanying drawings, and it is to be understood that the described embodiments are some, but not all embodiments of the present disclosure. All other embodiments, which can be derived by one of ordinary skill in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
It should be noted that the above method embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other.
In a cluster storage system, system availability and data persistence have different priorities under different conditions. For example, Fig. 1 is a schematic diagram of a distributed cluster storage system employing a replication mechanism. The system comprises 4 storage nodes, each of which comprises four disks, denoted disk 1, disk 2, disk 3 and disk 4 in the figure. The service host is connected to the storage cluster system through a front-end network, and the storage cluster system can also be connected to other hosts through a back-end network.
The service host writes the data to be stored into the storage cluster system through the front-end network. Generally, data in a storage cluster system is divided into equal-sized data blocks that are stored on different disks. The labels 1, 2, 3, 4 and 5 on each disk in Fig. 1 are the identifiers of the divided data blocks. Taking storage node 1 as an example, data block 1 and data block 5 are stored on disk 1, and disk 2 stores data blocks 2 and 3. To ensure high persistence of a data block, Fig. 1 adopts a 3-copy policy, that is, there are 3 identical instances of each data block, referred to as the 3 copies of the data block. These copies are stored on disks of different storage nodes, and these disks are also referred to as the associated disks of the data block because they all store copies of it. For example, the 3 copies of data block 1 are stored on disk 1 of storage node 1, disk 2 of storage node 2 and disk 2 of storage node 3, respectively; in this scenario, these three disks are the associated disks of data block 1.
In practical applications of the storage cluster system, a disk or a storage node may fail, or the network connected to a storage node may fail. Such failures can make one or several data blocks stored in the storage cluster system unavailable, so that the number of copies of a data block is no longer 3. As the number of unavailable copies grows, the number of copies of a certain data block may drop to 1, or sometimes even 0. To restore the number of copies in the storage cluster system in time, a reconstruction operation is usually performed on a data block when one of its copies is lost or unavailable.
Referring to the schematic diagram of data block reconstruction in the case of a disk failure shown in Fig. 2, assume that disk 2 of storage node 4 fails, so that data blocks 2 and 4 on it are lost. To ensure high data persistence of the storage system, the lost data blocks need to be regenerated on other nodes (which disk of which node is used is determined by the storage software algorithm); this process is called data reconstruction. As shown in Fig. 2, data block 2 is reconstructed on disk 3 of node 3, and data block 4 is reconstructed on disk 4 of node 1. After reconstruction is completed, all data again has 3 copies, and the same data persistence as before is achieved.
The system described above takes a copy mechanism as an example. The copy mechanism is a technology adopted to ensure data persistence, and specifically refers to keeping multiple identical copies of data in a cluster storage system; 2 copies and 3 copies are common. Alternatively, an Erasure Code (EC) mechanism may be adopted, in which data is divided into N fragments, M redundant fragments are generated according to a certain algorithm, and lost data can be reconstructed from the remaining data as long as no more than M fragments are lost. This fragmentation scheme is generally denoted (N, M). Common configurations are (4,2), (6,2), (12,3), etc.
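As a rough illustration of the (N, M) notation, the following sketch checks whether source data is still recoverable after some fragments are lost and computes the raw-capacity overhead of a configuration. The function names are hypothetical and are not part of the disclosure; the overhead formula (N + M) / N is the usual accounting for erasure-coded storage and is included only for orientation.

```python
def ec_recoverable(m: int, lost: int) -> bool:
    """Under an (N, M) scheme, data is recoverable while at most M fragments are lost."""
    if lost < 0:
        raise ValueError("lost must be non-negative")
    return lost <= m


def ec_storage_overhead(n: int, m: int) -> float:
    """Raw capacity consumed per unit of user data: (N + M) / N."""
    if n <= 0 or m < 0:
        raise ValueError("expected N > 0 and M >= 0")
    return (n + m) / n


# The common configurations named in the description.
for n, m in [(4, 2), (6, 2), (12, 3)]:
    print((n, m), ec_storage_overhead(n, m), ec_recoverable(m, lost=m))
```

By the same accounting, a 3-copy policy stores three units of raw data per unit of user data, which is why erasure coding is often chosen when capacity matters.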
Whether in a copy mechanism or in an EC mechanism, the shorter the reconstruction time, the faster data integrity is recovered, and the higher the data persistence of the storage cluster. During data reconstruction, the disks participating in reconstruction perform a large amount of data reading and writing, and the read/write I/O capability of each disk is limited (the random IOPS capability of a hard disk is generally about 100-). If most of the I/O capability is occupied by data reconstruction, the service level of the I/O requests from the front-end host is necessarily reduced (e.g., IOPS is reduced and latency is increased), or the front-end I/O may even receive no response, thereby reducing system availability.
For a centralized cluster storage system, the contradiction between the data persistence and the system availability also exists, and based on the contradiction, the embodiment of the disclosure provides a data reconstruction control method and device, which can balance the data persistence and the system availability of the cluster storage system and improve the overall performance of the system. The following is a description by way of specific embodiments.
Referring to the flowchart of the data reconstruction control method shown in fig. 3, the method may be applied to a disk in a distributed storage cluster or a controller in a centralized storage cluster, and the method includes the following steps:
Step S302, determining the persistence level of the data block according to the redundancy degree of the data block in the current cluster;
The redundancy degree of a data block can be measured by the amount of backup data that exists for the data block. As one implementation, the redundancy degree of a data block may be determined based on the storage mechanism of the data. If the storage mechanism is a copy mechanism, the redundancy degree of the data block can be measured by the number of copies of the data block in the current cluster: the more copies, the greater the redundancy degree, and vice versa. If the storage mechanism is an EC mechanism, the redundancy degree of the data block can be measured by the number of fragments lost from the data corresponding to the data block: the more fragments are lost, the smaller the redundancy degree, and vice versa. Alternatively, the redundancy degree of the data block can be measured by subtracting the number of lost fragments from the total number of check blocks corresponding to the data block: the larger the difference, the larger the redundancy degree, and vice versa.
The disk or the controller may obtain the redundancy degree of the data block in an event-triggered manner; for example, when a failure event (e.g., a disk failure) occurs, the redundancy degree of each data block is obtained, and the specific way of obtaining it depends on the data storage mechanism. In addition to the event-triggered manner, the redundancy degree of each data block can also be obtained periodically; for example, if no such event occurs within a set period, the process of obtaining the redundancy degree of the data blocks may be performed once.
The redundancy degree is related to the persistence level of the data block. For example, when the number of copies of the data block is 1, the redundancy degree is low and so is the persistence level, because once that only copy is lost, the data block is completely lost and cannot be persisted. When the number of copies of the data block is large, the redundancy degree is high, and the persistence level is high. Thus, in one possible implementation, the redundancy degree of a data block is directly proportional to the persistence level of the data block.
Step S304, controlling the I/O resources of the disk for reconstructing the data block according to a preset I/O resource allocation principle and the persistence level; in the preset I/O resource allocation principle, the persistence level and the I/O resources allocated by the disk are in an inverse relation, namely, the higher the persistence level, the fewer the I/O resources allocated by the disk, and the lower the persistence level, the more the I/O resources allocated by the disk.
Here, the I/O resources of the disk for reconstructing the data block refer to the I/O resources in the disk that are used for reconstructing the data block. When the persistence level is very low, the data is at risk of being completely lost, so the I/O resources of the disk for reconstructing the data block can be set to a larger value; more I/O resources are then put into reconstructing the data block, and the persistence level of the data block is raised as soon as possible.
When the persistence level is higher, the redundancy degree of the data is higher and the possibility that the data is completely lost in a short time is very low, so the I/O resources of the disk for reconstructing the data block can be reduced to some extent, and more I/O resources are used for the front-end service to keep the availability of the system high.
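The inverse relation between persistence level and reconstruction I/O can be sketched as a simple lookup. The table below is an assumption made for illustration: only the 50% share for a medium level follows an example given in this description, while the other percentages, the names `RECONSTRUCTION_SHARE_PERCENT`, `reconstruction_share_percent` and `split_iops`, and the idea of budgeting in IOPS are hypothetical.

```python
# Hypothetical allocation table: persistence level -> percent of the disk's
# I/O budget reserved for reconstruction. Lower level means a larger share.
RECONSTRUCTION_SHARE_PERCENT = {1: 90, 2: 50, 3: 10}


def reconstruction_share_percent(persistence_level: int) -> int:
    """Return the reconstruction share for a level; levels above the table use the smallest share."""
    if persistence_level < 1:
        raise ValueError("persistence level must be at least 1")
    highest = max(RECONSTRUCTION_SHARE_PERCENT)
    return RECONSTRUCTION_SHARE_PERCENT[min(persistence_level, highest)]


def split_iops(total_iops: int, persistence_level: int):
    """Split a disk's IOPS budget into (reconstruction, front-end service)."""
    rebuild = total_iops * reconstruction_share_percent(persistence_level) // 100
    return rebuild, total_iops - rebuild


for level in (1, 2, 3):
    print(level, split_iops(200, level))
```

A table keeps the policy easy to tune per deployment; any monotonically decreasing function of the persistence level would satisfy the same principle.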
The method determines the persistence level of the data block according to the redundancy degree of the data block in the current cluster and controls the I/O resources of the disk for reconstructing the data block according to the persistence level. The I/O resources used to reconstruct the data block are thereby matched to its persistence level, which balances the data persistence and system availability of the cluster storage system and improves the overall performance of the system.
Take the above method applied to a disk in the distributed storage cluster system shown in Fig. 1 as an example. The associated disks of data block 1 (i.e., disk 1 of storage node 1, disk 2 of storage node 2, and disk 2 of storage node 3) send heartbeat messages to each other to determine whether any of them has failed; if a disk does not receive the heartbeat response message from a peer disk within a preset time period, it determines that the peer disk has failed. For example, when disk 1 of storage node 1 sends a heartbeat message, under normal conditions disk 2 of storage node 2 and disk 2 of storage node 3 each receive it and reply with a heartbeat response message to disk 1 of storage node 1. If disk 2 of storage node 2 has failed, disk 1 of storage node 1 receives only the heartbeat response message from disk 2 of storage node 3. Disk 1 of storage node 1 can therefore determine that data block 1 currently has 2 copies in the system, that the redundancy degree of data block 1 is the one corresponding to 2 copies, and that the persistence level of the data block is a medium level, which is neither the highest persistence level corresponding to 3 copies nor the lowest persistence level corresponding to 1 copy. Disk 1 of storage node 1 may accordingly allocate relatively large I/O resources for reconstructing data block 1; for example, 50% of the I/O resources of disk 1 of storage node 1 are allocated to reconstructing data block 1, and the remaining 50% are used for responding to service requests of the front-end application.
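The heartbeat check in this example can be sketched as follows. This is a minimal single-process model under stated assumptions: the class name `PeerTracker`, the 5-second timeout, and the peer identifiers are hypothetical, and real disks would exchange heartbeat messages over the network rather than call methods on a shared object.

```python
import time

HEARTBEAT_TIMEOUT_S = 5.0  # assumed value for the "preset time period"


class PeerTracker:
    """Tracks heartbeat responses from the other associated disks of one data block."""

    def __init__(self, peers, timeout_s=HEARTBEAT_TIMEOUT_S, now=None):
        self.timeout_s = timeout_s
        start = time.monotonic() if now is None else now
        self.last_response = {peer: start for peer in peers}

    def record_response(self, peer, now=None):
        """Note that a heartbeat response arrived from the given peer."""
        self.last_response[peer] = time.monotonic() if now is None else now

    def failed_peers(self, now=None):
        """Peers whose last response is older than the timeout are treated as failed."""
        now = time.monotonic() if now is None else now
        return sorted(p for p, t in self.last_response.items()
                      if now - t > self.timeout_s)

    def live_copies(self, now=None):
        """The local copy counts as one; each responsive peer holds another copy."""
        return 1 + len(self.last_response) - len(self.failed_peers(now))


# Disk 1 of storage node 1 tracks the two other associated disks of data block 1.
tracker = PeerTracker(["node2/disk2", "node3/disk2"], now=0.0)
tracker.record_response("node3/disk2", now=4.0)  # only node 3 replies
print(tracker.failed_peers(now=6.0))
print(tracker.live_copies(now=6.0))
```

Passing `now` explicitly keeps the example deterministic; in a running system the default monotonic clock would be used instead.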
The method is also applicable to a centralized cluster storage system. Referring to the schematic diagram of the centralized cluster storage system shown in Fig. 4, a controller manages all disks on a host in a unified manner; data block 1 and data block 2 are stored on disk 1 and disk 4, respectively, and data block 2 and data block 3 are stored on disk 2 and disk 3, respectively. In the centralized storage cluster system, the disks no longer communicate directly with each other; all disks are managed by an external controller, which knows the data blocks stored on each disk and which disks are associated with each other. Once the controller finds that a disk has failed, it sends a data block reconstruction instruction to the associated disks of the data blocks stored on the failed disk; the instruction tells each associated disk which data blocks to copy to which disk. When the method is implemented on the controller, the controller can also determine, according to the currently failed disk, which data blocks have a changed redundancy degree. For each such data block, the controller determines the persistence level based on the current redundancy degree of the data block, and then controls how many I/O resources the disks associated with the data block use for its reconstruction. Therefore, the data reconstruction control method and device provided by the embodiment of the disclosure can be used in a distributed storage cluster and can also be used in a centralized storage cluster.
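The controller-side bookkeeping can be sketched as below. The placement map, the disk names, and the function `plan_reconstruction` are hypothetical and do not reproduce the layout of Fig. 4; in particular, picking the first eligible disk as the target is a stand-in for whatever placement algorithm the storage software actually uses.

```python
def plan_reconstruction(placement, all_disks, failed_disk):
    """Build reconstruction instructions for every block that had a copy on the failed disk.

    placement maps a block id to the set of disks that hold a copy of it.
    """
    plan = []
    for block, disks in sorted(placement.items()):
        if failed_disk not in disks:
            continue  # redundancy degree of this block is unchanged
        survivors = sorted(disks - {failed_disk})
        if not survivors:
            continue  # no copy left to rebuild from
        candidates = sorted(set(all_disks) - disks)
        if not candidates:
            continue  # no healthy disk without a copy is available
        plan.append({
            "block": block,
            "source": survivors[0],              # associated disk that receives the instruction
            "target": candidates[0],             # assumed placement choice
            "remaining_copies": len(survivors),  # new redundancy degree
        })
    return plan


# Hypothetical 2-copy layout on four disks.
placement = {"b1": {"d1", "d4"}, "b2": {"d1", "d2"}, "b3": {"d2", "d3"}}
plan = plan_reconstruction(placement, ["d1", "d2", "d3", "d4"], failed_disk="d1")
for instruction in plan:
    print(instruction)
```

The `remaining_copies` field is what the controller would feed into the persistence level determination before deciding how much I/O each associated disk may spend on the rebuild.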
In a possible implementation manner, the determining manner of the persistence level of the data block in step S302 may include the following steps:
(1) If the current cluster is a cluster adopting a copy mechanism, determining the persistence level of the data block according to the current number of copies of the data block. For example, for a cluster with a 3-copy mechanism, if the number of copies of a data block is 3, the redundancy degree of the data block may be 3, and the persistence level of the data block is a slight level, which may also be represented by the numeral 3.
(2) If the current cluster is a cluster adopting an erasure code mechanism, calculating the difference obtained by subtracting the number of lost data blocks from the total number of check blocks of the source data of the current data block, and determining the persistence level of the data block according to the obtained difference. Here, the source data of the current data block refers to the data before it is divided. In the (N, M) configuration, N represents the number of fragments (data blocks) into which the source data is divided, and M is the number of redundant fragments generated according to a certain algorithm. The disks where the fragments of the same source data are located are associated disks; if several disks do not respond to the heartbeat message, it can be determined how many fragments are currently lost and which associated disks have failed. If K fragments are lost (0 ≤ K ≤ M), the data persistence level is defined as (M-K+1). For example, for the (9, 2) configuration, if no fragment is lost, the data persistence level is a slight level, e.g., 3; if 1 fragment is lost, the data persistence level is a general level, e.g., 2; if 2 fragments are lost, the data persistence level is a critical level, e.g., 1.
In one possible embodiment, for a cluster employing a replica mechanism, if there are N copies of a data block, its data persistence level is defined as N. Referring to Fig. 2, which is a schematic diagram of data block reconstruction when a disk fails, data blocks 1, 2, 3, 4, and 5 all have three copies in the normal state, and their data persistence level is 3. Assuming that disk 2 on storage node 4 is damaged, data blocks 2 and 4 are affected by the disk failure, and their data persistence levels change to 2. If another disk is damaged during the data reconstruction process, for example, disk 1 of storage node 3 also fails, the data persistence level of data block 4 becomes 1.
For a cluster adopting an erasure code mechanism, the data persistence level is defined as (M-K+1) in the above method so that its definition stays consistent with the definition used for a storage cluster adopting a replica mechanism. Here N represents the number of fragments (data blocks) into which the source data is divided, M is the number of redundant fragments generated according to a certain algorithm, and K is the number of lost fragments.
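The two definitions can be written down directly. The function names below are hypothetical; the formulas are the ones stated above (level N for N copies, and M-K+1 for K lost fragments in an (N, M) erasure code).

```python
def replica_persistence_level(copies: int) -> int:
    """Replica mechanism: a data block with N copies has persistence level N."""
    if copies < 0:
        raise ValueError("copies must be non-negative")
    return copies


def ec_persistence_level(m: int, lost: int) -> int:
    """Erasure code mechanism: with K lost fragments (0 <= K <= M) the level is M - K + 1."""
    if not 0 <= lost <= m:
        raise ValueError("lost fragments must satisfy 0 <= K <= M")
    return m - lost + 1


# Data block 4 in the Fig. 2 scenario: 3 copies, then 2, then 1.
for copies in (3, 2, 1):
    print("replica copies", copies, "level", replica_persistence_level(copies))

# The (9, 2) erasure code example: 0, 1, or 2 fragments lost.
for lost in (0, 1, 2):
    print("EC (9, 2) lost", lost, "level", ec_persistence_level(2, lost))
```

With these definitions, level 1 means the same thing under both mechanisms: one more loss makes the data unrecoverable.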
It can also be seen from the above definition of the data persistence level that when the data persistence level is 1, the data persistence becomes worst, and data reconstruction must be performed as quickly as possible to improve the data persistence level; when the data persistence level is greater than 1, the possibility of data loss is relatively small, and data reconstruction can be performed at a relatively gentle speed.
The method can be started when a failure event occurs, and can also be started in the process of reconstructing data. For a scenario in which a failure event occurs, the foregoing step S302 may further include: if a failure event is monitored in the current cluster, the redundancy degree of each data block in the cluster is obtained; wherein the failure event comprises at least one of: disk failures in the cluster, node failures in the cluster, network failures in the cluster, and the like. After the redundancy degree of the data block is obtained, the persistence level of each data block can be determined according to the redundancy degree of each data block. Whether related disks or related nodes survive or not can be confirmed between the disks and between the storage nodes through a heartbeat mechanism, if one party cannot receive heartbeat response sent by the other party, the other party can be judged to be invalid, the related data blocks stored by the other party can be judged to be lost, the redundancy degree of the related data blocks is further determined to be reduced, if N copies exist originally and one copy is lost, the redundancy degree is changed into N-1, and the persistence level is changed into N-1.
By starting the method under the condition that a failure event occurs, the excessive resource of the system can be avoided being consumed, and the system performance can be maintained.
For the application scenario of the data reconstruction process, the redundancy degree of the data blocks may change dynamically. To achieve better resource allocation, the method may further include: (1) monitoring the redundancy degree of each data block in the current cluster; (2) judging, from the monitored redundancy degree, whether the redundancy degree of the data block has changed; (3) if the redundancy degree of the data block has changed, adjusting the persistence level of the data block by using the changed redundancy degree, and controlling the I/O resources of the disk for reconstructing the data block according to the preset I/O resource allocation principle and the adjusted persistence level; (4) if the redundancy degree of the data block has not changed, keeping the I/O resources that the disk allocates to the data block unchanged.
If the monitored redundancy degree of a data block differs from the redundancy degree monitored last time, the redundancy degree is judged to have changed; if it is the same as the redundancy degree monitored last time, the redundancy degree of the corresponding data block has not changed. Taking the replica mechanism as an example, suppose data block 1 has 2 copies before reconstruction, so its redundancy degree is 2. If, after data reconstruction is performed on data block 1, monitoring shows that data block 1 has 3 copies and a redundancy degree of 3, the redundancy degree of data block 1 has changed.
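One monitoring round, as described in steps (1) to (4), might look like the following sketch. It assumes the replica mechanism, where the persistence level equals the redundancy degree; the function names and the dictionary-based state are illustrative choices, not part of the source.

```python
from typing import Dict


def changed_blocks(previous: Dict[str, int], observed: Dict[str, int]) -> Dict[str, int]:
    """Blocks whose redundancy degree differs from the last monitored value."""
    return {
        block: redundancy
        for block, redundancy in observed.items()
        if previous.get(block) != redundancy
    }


def monitoring_round(
    previous: Dict[str, int],
    observed: Dict[str, int],
    levels: Dict[str, int],
) -> Dict[str, int]:
    """Return updated persistence levels; unchanged blocks keep their level.

    Assumption: replica mechanism, so the new level is the new redundancy.
    """
    updated = dict(levels)
    updated.update(changed_blocks(previous, observed))
    return updated
```

Only the blocks returned by `changed_blocks` would need their I/O allocation recomputed; every other block keeps its current allocation.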
The redundancy degree may be monitored periodically or in real time. By dynamically adjusting the persistence level of the data block based on the monitored redundancy degree, the I/O resources of the disk for reconstructing the data block can be controlled dynamically. This takes into account that, during data reconstruction, the redundancy degree of the data block keeps improving (that is, there is more and more redundant data), so the persistence level of the data block keeps rising and the required I/O resources decrease accordingly, which realizes a dynamic and reasonable allocation of I/O resources. For example, if only one copy of a data block exists in the system, the persistence level of the data block is the critical level, and more I/O resources are allocated to reconstruct the data. As reconstruction proceeds, the number of copies of the data block becomes 2; the persistence level of the data block is then no longer so urgent and can be regarded as the general level, so the level is updated to the general level and fewer I/O resources are allocated to the subsequent reconstruction of the data block, which improves the availability of the system.
Besides the case in which reconstruction increases the number of copies of a data block, a failure event may also occur during reconstruction and decrease the number of copies. Refer to fig. 2, a schematic diagram of data block reconstruction when a disk fails: because a failure event has occurred, data block 4 needs to be reconstructed, and at this time data block 4 has two copies and a redundancy degree of 2. If another disk fails during the reconstruction, the redundancy degree of data block 4 changes from 2 to 1, so data block 4 can no longer be reconstructed according to the reconstruction strategy for a redundancy degree of 2 and must instead be reconstructed according to the reconstruction strategy for a redundancy degree of 1.
To simplify the implementation, a reconstruction policy correspondence may be established and stored in advance, which records the correspondence between persistence levels and I/O resource ratios. Based on this, step S304 may include: searching the correspondence by using the persistence level to obtain the I/O resource ratio of the disk for reconstructing the data block, and controlling the disk to reconstruct the data block according to the I/O resource ratio found. The I/O resource ratio here is the ratio of the I/O resources of the disk used for data block reconstruction to the total I/O resources of the disk. In the reconstruction policy correspondence, the persistence level is inversely related to the I/O resource ratio of the disk: the higher the persistence level, the greater the redundancy degree of the data block, the lower the urgency of reconstructing it, and therefore the smaller the I/O resource ratio.
In a possible implementation, the reconstruction policy correspondence may include the correspondence between three persistence levels and the I/O resource ratios of the disk that reconstructs the data block, wherein: when the persistence level is the critical level, the I/O resource ratio of the corresponding disk belongs to a first interval; when the persistence level is the general level, the I/O resource ratio of the corresponding disk belongs to a second interval; and when the persistence level is the slight level, the I/O resource ratio of the corresponding disk belongs to a third interval. The first interval, the second interval and the third interval are preset intervals within the range from 0 to 1. For a more intuitive illustration, Table 1 shows one form of the reconstruction policy correspondence.
TABLE 1
Persistence level | I/O resource ratio of the disk for reconstructing the data block
1 (critical) | (80%, 90%)
2 (general) | (10%, 20%)
3 (slight) | (5%, 10%)
The I/O resource ratios in Table 1 are only examples; each interval may be set flexibly as needed, and the intervals may partially overlap or not overlap at all. In Table 1, when the persistence level is 1, which is the critical level, the I/O resource ratio interval found for the disk that reconstructs the data block is (0.8, 0.9). When the persistence level is 2 or 3, it is the general or slight level, so ensuring the availability of the system should be taken into account, and the I/O resource ratio of the disk for reconstructing the data block is adjusted to 20% or less, for example 20% or 15%.
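The lookup against Table 1 can be sketched as a small dictionary search. The interval values come from Table 1; the names are hypothetical, and treating any level above 3 as the slight level is an assumption made here, since the table itself lists only three levels.

```python
from typing import Dict, Tuple

# Table 1: persistence level -> (lower, upper) I/O resource ratio interval.
RECONSTRUCTION_POLICY: Dict[int, Tuple[float, float]] = {
    1: (0.80, 0.90),  # critical
    2: (0.10, 0.20),  # general
    3: (0.05, 0.10),  # slight
}


def io_ratio_interval(level: int) -> Tuple[float, float]:
    """Look up the I/O resource ratio interval for a persistence level.

    Assumption: levels above 3 are handled like level 3 (slight).
    """
    if level < 1:
        raise ValueError("no reconstruction policy for a level below 1")
    return RECONSTRUCTION_POLICY[min(level, 3)]
```

Keeping the correspondence in a table rather than in code means the intervals can be retuned without touching the control logic, which is the simplification the description aims for.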
Table 1 is defined in this way mainly for the following reasons. First, when the persistence level is 1, the data block is in a critical state, and the data must be reconstructed with as much capacity as possible in order to raise the persistence level. Even then, a basic share of I/O resources must be reserved for front-end traffic, so that requests initiated by the front-end host do not go unanswered and system availability is not lost completely. Second, the probability of the different persistence levels varies greatly. In a typical distributed system, data is in the "critical" state about 2 orders of magnitude less often than in the "general" state, which in turn occurs about 2 orders of magnitude less often than the "slight" state. In this sense, the "critical" state is a low-probability event. In most data reconstruction events the data is in the "general" or "slight" state, the I/O resource ratio of the disk for reconstructing the data block is not higher than 20%, and the availability of the system is well guaranteed.
In a specific implementation, the correspondence between the persistence level and the I/O resource ratio of the disk for reconstructing the data block can be defined by the user, based on the probability of the different data states and on the balance between data persistence and system availability. The three intervals in Table 1 are not contiguous, but they may also be set to be contiguous; see the correspondence between the persistence level and the I/O resource ratio of the disk for reconstructing the data block shown in Table 2.
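The availability argument can be checked with a short calculation. The sketch below takes "2 orders of magnitude" literally as a factor of exactly 100 between adjacent states and uses the upper bound of each Table 1 interval; both choices are assumptions for illustration, not figures stated in the source.

```python
from typing import Dict

# Assumed relative frequencies: each state is 100x rarer than the next one.
RELATIVE_FREQUENCY: Dict[str, int] = {"critical": 1, "general": 100, "slight": 10_000}

# Upper bound of each Table 1 interval.
UPPER_RATIO: Dict[str, float] = {"critical": 0.90, "general": 0.20, "slight": 0.10}


def expected_upper_ratio(freq: Dict[str, int], ratio: Dict[str, float]) -> float:
    """Frequency-weighted average of the reconstruction I/O ratio."""
    total = sum(freq.values())
    return sum(freq[state] * ratio[state] for state in freq) / total
```

Under these assumptions the weighted average works out to roughly 10% of disk I/O, because the critical state contributes almost nothing to the average despite its 90% ceiling.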
TABLE 2
Persistence level | I/O resource ratio of the disk for reconstructing the data block
1 | Greater than or equal to 80% and less than or equal to 90%
2 | Greater than or equal to 20% and less than 80%
3 | Less than 20%
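Because the intervals in Table 2 are contiguous, every ratio from 0 up to 90% belongs to exactly one level. The following sketch makes that partition explicit by mapping a ratio back to its level; the function name is hypothetical, and rejecting ratios above 90% is an assumption, since Table 2 does not assign them to any level.

```python
def level_for_ratio(ratio: float) -> int:
    """Return the Table 2 persistence level whose interval contains `ratio`."""
    if ratio < 0.0 or ratio > 0.90:
        raise ValueError("Table 2 covers ratios from 0 to 0.90 only")
    if ratio >= 0.80:
        return 1  # 80% <= ratio <= 90%
    if ratio >= 0.20:
        return 2  # 20% <= ratio < 80%
    return 3      # ratio < 20%
```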
In a possible implementation, a restrictive index parameter of the disk is set so that the ratio of the I/O resources of the disk used for reconstructing the data block to the total I/O resources of the disk equals the I/O resource ratio found; wherein the restrictive index parameter includes at least one of: IOPS (Input/Output Operations Per Second, the number of read/write I/O operations performed per second), the disk bandwidth, the number of disks participating in the data block reconstruction, the number of processes participating in the data block reconstruction, and the like. Setting the restrictive index parameter makes it easier to keep the I/O resource ratio of the disk used for data block reconstruction matched to the persistence level of the data block, and to maintain the balance between data persistence and system availability.
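Turning a ratio into concrete limits might look like the sketch below, which covers only two of the listed parameters (IOPS and bandwidth). The `DiskCapacity` type, its field names, and the use of an integer percentage are all assumptions made for illustration; the source does not say how a disk's total capacity is measured or how the limits are enforced.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class DiskCapacity:
    """Hypothetical description of a disk's total I/O capacity."""
    max_iops: int
    max_bandwidth_mbps: float


def reconstruction_limits(capacity: DiskCapacity, ratio_percent: int) -> Dict[str, Union[int, float]]:
    """Cap reconstruction I/O at `ratio_percent` of the disk's capacity."""
    if not 0 <= ratio_percent <= 100:
        raise ValueError("ratio_percent must be between 0 and 100")
    return {
        "iops": capacity.max_iops * ratio_percent // 100,
        "bandwidth_mbps": capacity.max_bandwidth_mbps * ratio_percent / 100,
    }
```

For a disk rated at 10,000 IOPS and 200 MB/s, a 20% ratio would cap reconstruction at 2,000 IOPS and 40 MB/s, leaving the rest for front-end traffic.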
In correspondence to the foregoing method, the present disclosure provides a data reconstruction control apparatus, where the apparatus is applied to a disk in a distributed storage cluster or a controller in a centralized storage cluster, and referring to fig. 5, the apparatus includes: a persistence level determining module 501, configured to determine a persistence level of a data block in a current cluster according to a redundancy level of the data block; an I/O resource control module 502, configured to control the I/O resources of the disk for reconstructing the data block according to a preset I/O resource allocation principle and the above persistence level; in the preset I/O resource allocation principle, the persistence level and the I/O resource allocated by the disk are in an inverse relation.
The device determines the persistence level of the data block according to the redundancy degree of the data block in the current cluster, controls the I/O resource of the disk of the reconstructed data block according to the persistence level, can match the I/O resource of the disk of the reconstructed data block with the persistence level of the data block, further balances the data persistence and the system availability of the cluster storage system, and improves the overall performance of the system.
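The cooperation of the two modules of fig. 5 can be sketched as two small classes composed by a controller. The class and method names are hypothetical, the replica mechanism is assumed (level equals redundancy degree), and levels above the highest one in the policy are treated like that highest level, which is an assumption not stated in the source.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]


class PersistenceLevelModule:
    """Stands in for module 501; assumes a replica mechanism."""

    def determine(self, redundancy: int) -> int:
        return redundancy


class IOResourceControlModule:
    """Stands in for module 502; looks up a ratio interval per level."""

    def __init__(self, policy: Dict[int, Interval]) -> None:
        self._policy = policy

    def ratio_interval(self, level: int) -> Interval:
        # Assumption: levels beyond the table use the last table entry.
        return self._policy[min(level, max(self._policy))]


class ReconstructionControlApparatus:
    def __init__(self, levels: PersistenceLevelModule, io: IOResourceControlModule) -> None:
        self._levels = levels
        self._io = io

    def control(self, redundancy: int) -> Tuple[int, Interval]:
        level = self._levels.determine(redundancy)
        return level, self._io.ratio_interval(level)
```

Separating the two responsibilities means an erasure-code cluster could swap in a different level module without changing the I/O control side.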
The persistence level determining module 501 is configured to determine, if the current cluster is a cluster using a copy mechanism, a persistence level of the data block according to the number of copies of the current data block; if the current cluster is the cluster adopting an erasure code mechanism, calculating the difference value of the total number of the check blocks of the source data of the current data block minus the number of the lost data blocks, and determining the persistence level of the data block according to the obtained difference value.
Corresponding to the foregoing method, an embodiment of the present disclosure further provides a data reconstruction control apparatus, which is applied to a disk in a distributed storage cluster or a controller in a centralized storage cluster. Referring to fig. 6, on the basis of the apparatus shown in fig. 5, the apparatus further includes a monitoring module 601 and a judging module 602, wherein the monitoring module 601 is configured to monitor the redundancy degree of each data block in the current cluster, and the judging module 602 is configured to judge, from the monitored redundancy degree, whether the redundancy degree of the data block has changed.
Correspondingly, the persistence level determining module 501 is further configured to adjust the persistence level of the data block by using the changed redundancy degree of the data block when the judgment result of the judging module 602 is yes; and the I/O resource control module 502 is further configured to, when the persistence level determining module 501 adjusts the persistence level of the data block, control the I/O resources of the disk for reconstructing the data block according to the preset I/O resource allocation principle and the adjusted persistence level.
The persistence level determining module 501 is configured to, if a failure event is monitored in a current cluster, obtain a redundancy level of each data block in the cluster; wherein the failure event comprises at least one of: failure of a disk in the cluster, failure of a node in the cluster, and failure of a network in the cluster; and determining the persistence level of each data block according to the redundancy degree of each data block.
The I/O resource control module 502 is configured to search a reconstruction policy corresponding relationship in a preset I/O resource allocation principle by using the persistence level to obtain an I/O resource ratio of the disk for reconstructing the data block, where the persistence level in the reconstruction policy corresponding relationship is in an inverse relationship with the I/O resource ratio of the disk; and controlling the disk to reconstruct the data block according to the searched I/O resource ratio of the disk.
The I/O resource control module 502 is further configured to set a restrictive index parameter of the disk, so that a ratio of an I/O resource of the disk used for reconstructing the data block to an I/O total resource of the disk is equal to the found I/O resource ratio; wherein the restrictive index parameter includes at least one of: the number of times IOPS of read-write I/O operation per second, the disk bandwidth, the number of disks participating in the data block reconstruction and the number of processes participating in the data block reconstruction.
The reconstruction policy correspondence comprises the correspondence between three persistence levels and the I/O resource ratios of the disk, wherein: when the persistence level is the critical level, the I/O resource ratio of the corresponding disk belongs to a first interval; when the persistence level is the general level, the I/O resource ratio of the corresponding disk belongs to a second interval; and when the persistence level is the slight level, the I/O resource ratio of the corresponding disk belongs to a third interval. The first interval, the second interval and the third interval are preset intervals within the range from 0 to 1.
The present embodiment provides a data reconstruction control device corresponding to the above method embodiment. Fig. 7 is a schematic structural diagram of the data reconstruction control device, which may be a controller in a centralized storage cluster. As shown in fig. 7, the apparatus includes a processor 701 and a memory 702; the memory 702 is used for storing one or more computer instructions, which are executed by the processor to implement the above-mentioned data reconstruction control method.
The device shown in fig. 7 further comprises a bus 703, and the processor 701 and the memory 702 are connected via the bus 703.
The memory 702 may include a high-speed Random Access Memory (RAM) and may also include a non-volatile memory, such as at least one disk memory. The bus 703 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one double-headed arrow is shown in FIG. 7, but this does not mean that there is only one bus or one type of bus.
The processor 701 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 701 or by instructions in the form of software. The processor 701 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor can implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 702; the processor 701 reads the information in the memory 702 and completes the steps of the method of the foregoing embodiment in combination with its hardware.
The embodiment of the present invention further provides a machine-readable storage medium, where the machine-readable storage medium stores machine-executable instructions, and when the machine-executable instructions are called and executed by a processor, the machine-executable instructions cause the processor to implement the above data reconstruction control method, and specific implementation may refer to method implementation embodiments, and will not be described herein again.
The data reconstruction control apparatus and the data reconstruction control device provided by the embodiments of the present invention have the same implementation principle and technical effect as the method embodiment. For brevity, where the apparatus embodiments do not mention a detail, refer to the corresponding content in the method embodiment.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative, and the flowcharts and block diagrams in the figures, for example, illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Finally, it should be noted that: the above-mentioned embodiments are merely specific embodiments of the present disclosure, which are used for illustrating the technical solutions of the present disclosure and not for limiting the same, and the scope of the present disclosure is not limited thereto, and although the present disclosure is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive of the technical solutions described in the foregoing embodiments or equivalent technical features thereof within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present disclosure, and should be construed as being included therein. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (12)

1. A data reconstruction control method, applied to a disk in a distributed storage cluster or a controller in a centralized storage cluster, the method comprising:
determining the persistence level of the data block according to the redundancy degree of the data block in the current cluster; the redundancy degree of the data block is in direct proportion to the persistence level of the data block;
controlling the I/O resources of the disk for reconstructing the data block according to a preset I/O resource allocation principle and the persistence level; in the preset I/O resource allocation principle, the persistence level and the I/O resource allocated by the disk are in an inverse relation.
2. The method of claim 1, wherein the step of determining the persistence level of the data block according to the redundancy level of the data block in the current cluster comprises:
if the current cluster is a cluster adopting a copy mechanism, determining the persistence level of the data block according to the copy number of the current data block;
if the current cluster is the cluster adopting an erasure code mechanism, calculating the difference value of the total number of the check blocks of the source data of the current data block minus the number of the lost data blocks, and determining the persistence level of the data block according to the obtained difference value.
3. The method of claim 1, wherein prior to the step of determining the persistence level of the data block based on the degree of redundancy of the data block in the current cluster, the method further comprises:
if a failure event is monitored in the current cluster, the redundancy degree of each data block in the cluster is obtained; wherein the failure event comprises at least one of: disk failures in the cluster, node failures in the cluster, and network failures in the cluster.
4. The method of claim 1, further comprising:
monitoring the redundancy degree of each data block in the current cluster;
judging whether the redundancy degree of the data block changes or not according to the monitored redundancy degree of the data block;
if so, adjusting the persistence level of the data block by using the changed redundancy degree of the data block, and controlling the I/O resources of the disk for reconstructing the data block according to the preset I/O resource allocation principle and the adjusted persistence level.
5. The method of claim 1, wherein the step of controlling the I/O resources of the disk for reconstructing the data block according to a preset I/O resource allocation rule and the persistence level comprises:
searching a reconstruction strategy corresponding relation in a preset I/O resource allocation principle by using the persistence level to obtain an I/O resource ratio of a disk for reconstructing the data block, wherein the persistence level in the reconstruction strategy corresponding relation is in an inverse relation with the I/O resource ratio of the disk;
and controlling the disk to reconstruct the data block according to the searched I/O resource ratio of the disk.
6. The method according to claim 5, wherein the step of controlling the disk to reconstruct the data block according to the found I/O resource ratio of the disk comprises:
setting a restrictive index parameter of the disk so that the proportion of the I/O resources of the disk, which are used for reconstructing the data block, to the total I/O resources of the disk is equal to the searched I/O resource proportion; wherein the restrictive indicator parameter comprises at least one of: the number of times IOPS of read-write I/O operation per second, the disk bandwidth, the number of disks participating in the data block reconstruction and the number of processes participating in the data block reconstruction.
7. The method of claim 5, wherein the reconstruction policy correspondence comprises the correspondence between three persistence levels and the I/O resource ratios of the disk; wherein, in the correspondence: when the persistence level is a critical level, the I/O resource ratio of the corresponding disk belongs to a first interval; when the persistence level is a general level, the I/O resource ratio of the corresponding disk belongs to a second interval; and when the persistence level is a slight level, the I/O resource ratio of the corresponding disk belongs to a third interval, wherein the first interval, the second interval and the third interval are preset intervals within the range from 0 to 1.
8. A data reconstruction control device is applied to a disk in a distributed storage cluster or a controller in a centralized storage cluster, and comprises:
the persistence level determining module is used for determining the persistence level of the data block according to the redundancy degree of the data block in the current cluster; the redundancy degree of the data block is in direct proportion to the persistence level of the data block;
the I/O resource control module is used for controlling the I/O resources of the disk for reconstructing the data block according to a preset I/O resource allocation principle and the persistence level; in the preset I/O resource allocation principle, the persistence level and the I/O resource allocated by the disk are in an inverse relation.
9. The apparatus of claim 8, wherein the persistence level determining module is configured to determine the persistence level of the data block according to the number of copies of the current data block if the current cluster is a cluster using a copy mechanism;
if the current cluster is the cluster adopting an erasure code mechanism, calculating the difference value of the total number of the check blocks of the source data of the current data block minus the number of the lost data blocks, and determining the persistence level of the data block according to the obtained difference value.
10. The apparatus of claim 8, further comprising:
the monitoring module is used for monitoring the redundancy degree of each data block in the current cluster;
the judging module is used for judging whether the redundancy degree of the data block changes or not according to the monitored redundancy degree of the data block;
the persistence level determining module is further configured to adjust the persistence level of the data block by using the changed redundancy degree of the data block when the determination result of the determining module is yes;
the I/O resource control module is further configured to, when the persistence level determining module adjusts the persistence level of the data block, control the I/O resources of the disk that reconstructs the data block according to the preset I/O resource allocation principle and the adjusted persistence level.
11. The apparatus according to claim 8, wherein the I/O resource control module is configured to use the persistence level to search a reconstruction policy correspondence in a preset I/O resource allocation principle, and obtain an I/O resource ratio of a disk that reconstructs the data block, where the persistence level in the reconstruction policy correspondence is in an inverse relationship with the I/O resource ratio of the disk; and to control the disk to reconstruct the data block according to the I/O resource ratio of the disk that is found.
12. A data reconstruction control device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor to perform the method of any one of claims 1 to 7.
CN201811072330.6A 2018-09-14 2018-09-14 Data reconstruction control method, device and equipment Active CN109344012B (en)


Publications (2)

Publication Number Publication Date
CN109344012A CN109344012A (en) 2019-02-15
CN109344012B true CN109344012B (en) 2022-04-12

Family

ID=65305559






Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant