WO2014101144A1

WO2014101144A1 - Data storage method and device

Info

Publication number: WO2014101144A1
Application number: PCT/CN2012/087922
Authority: WO
Inventors: 田晓波
Original assignee: 华为技术有限公司
Priority date: 2012-12-28
Filing date: 2012-12-28
Publication date: 2014-07-03
Also published as: CN103384550A; CN103384550B

Abstract

The present invention relates to the technical field of electronic information. An embodiment of the present invention discloses a data storage method and device capable of storing data in a plurality of resource pools to reduce the chance of complete data corruption, thus improving data security. The method is used in a storage system whose resource pool is logically divided into at least two sub-resource pools, each comprising N neighboring storage devices, N≥2. The method comprises: dividing the to-be-stored data into R slices, R≥2, and distributing the slices into the divided sub-resource pools, the number of copies of the to-be-stored data expected to be generated being less than N; and generating a replica slice for each slice distributed in each sub-resource pool, and storing the generated replica slice in the sub-resource pool to which the slice belongs, the replica slice containing the same data as its corresponding slice.

Description

Method and device for storing data

The present invention relates to the field of electronic information technology, and in particular, to a method and apparatus for storing data.

Background technique

Distributed storage is a commonly used data storage technology. In a distributed storage scenario, all hard disks in the system can be managed as a single storage resource pool. When the system reads, writes, and stores data, the entire storage resource pool serves as the storage area for data interaction. For example:

A distributed storage system can map a virtual volume onto all the hard disks in the storage resource pool and divide the space on the volume into a number of 1 MB data blocks, each of which corresponds to one block storage logical partition unit (Partition) on the hard disks in the resource pool. When multiple copies are kept, all the data in a Partition is saved in multiple copies. For example, in a dual-copy scenario, each data block corresponds to two Partitions on the hard disks in the resource pool, and the contents of these two Partitions are identical. Under commonly used allocation algorithms, two identical Partitions such as P1 and P1' are assigned to two different hard disks, for example P1 on hard disk 1 and P1' on hard disk 2, so that Partitions are evenly distributed throughout the resource pool. This allocation method is widely used because it maximizes the read and write speed of the hard disks in the resource pool.

When any hard disk in the distributed storage system is damaged, data reconstruction can be performed. For example, suppose hard disk 1 is damaged. Because P1, P4, P5, and P8 on hard disk 1 have corresponding copies on hard disks 2, 3, and 4, the data of hard disk 1 can be repaired from those copies.

In existing distributed storage systems, data reconstruction is performed whenever a hard disk is damaged. If another hard disk in the storage resource pool fails during reconstruction, the data of the entire virtual volume is corrupted. For example, as shown in Figure 1, if hard disk 2 is damaged after hard disk 1 has been damaged and data repair has started, the copies on hard disk 2 that correspond to the data of hard disk 1 are also damaged. The data therefore cannot be repaired, which corrupts the entire virtual volume and reduces data security.

Summary of the invention

An embodiment of the present invention provides a method and an apparatus for storing data, which can divide all storage devices in a storage system into at least two resource pools and store the data in the divided resource pools. Because the data in each resource pool is independent, the data is completely damaged only when two storage devices in the same resource pool are damaged, which reduces the probability of complete data corruption and improves data security.

In order to achieve the above object, the embodiment of the present invention adopts the following technical solutions:

In a first aspect, an embodiment of the present invention provides a method for storing data, used in a storage system, where the storage system includes a resource pool for storing data, the resource pool is logically divided into at least two sub-resource pools, and each sub-resource pool includes N adjacent storage devices, N ≥ 2. The method includes:

The data to be stored is divided into R slices, R ≥ 2, and the R slices are distributed into the divided sub-resource pools, where the number of copies expected to be generated for the data to be stored is less than N; a replica slice is generated for each slice distributed into each sub-resource pool, and the replica slice is stored in the sub-resource pool to which the slice belongs, the replica slice containing the same data as its corresponding slice.

With reference to the first aspect, in a first possible implementation manner of the first aspect, storing the replica slice in the sub-resource pool to which the slice belongs includes: storing the replica slice on a storage device in the sub-resource pool to which the corresponding slice belongs, with each replica slice stored on a different storage device.

With reference to the first aspect and the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, the method further includes: dividing the divided sub-resource pools into sub-resource pool sets of different levels, where each sub-resource pool set includes at least one sub-resource pool, and the number of replica slices expected to be generated for each slice differs between sub-resource pool sets of different levels. Generating a replica slice for each slice distributed into each sub-resource pool then includes: generating, for each slice distributed into a sub-resource pool, the number of replica slices expected for slices at the level of the sub-resource pool set to which that sub-resource pool belongs.

With reference to the first aspect and the first possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, after the replica slice is stored in the sub-resource pool to which the slice belongs, the method further includes:

For any one of the sub-resource pools, when the number of storage devices in the sub-resource pool is reduced, allocating the slices on the removed storage devices to other storage devices in the sub-resource pool, where the number of storage devices remaining in the sub-resource pool is greater than the number of copies expected to be generated.

With reference to the first aspect and the first possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, after the replica slice is stored in the sub-resource pool to which the slice belongs, the method further includes:

For any one of the sub-resource pools, when the number of storage devices in the sub-resource pool is increased, reallocating the replica slices of the slices in the sub-resource pool across the storage devices in the sub-resource pool.

With reference to the second possible implementation manner of the first aspect, in a fifth possible implementation manner, distributing the R slices into the divided sub-resource pools includes:

Distributing the R slices to sub-resource pools of different levels according to the importance of the data in the R slices.
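The level-based variant described above can be sketched as follows. This is a minimal illustration only: the level names, the class `PoolLevel`, the function `place_by_importance`, the round-robin choice of sub-resource pool, and the value N = 4 are all assumptions introduced here, and `replica_count` is taken to mean the number of replica slices generated in addition to the original slice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolLevel:
    replica_count: int   # replica slices expected per slice at this level
    sub_pool_ids: tuple  # sub-resource pools that belong to this level


# Hypothetical layout: every sub-resource pool is assumed to hold N = 4 devices.
DEVICES_PER_SUB_POOL = 4
LEVELS = {
    "high": PoolLevel(replica_count=3, sub_pool_ids=(1, 2)),
    "normal": PoolLevel(replica_count=2, sub_pool_ids=(3,)),
}


def place_by_importance(slice_index: int, importance: str):
    """Pick a sub-resource pool and a replica count for one slice."""
    level = LEVELS[importance]
    # The number of copies must stay below the number of devices N.
    if level.replica_count >= DEVICES_PER_SUB_POOL:
        raise ValueError("replica count must be less than N")
    # Round-robin over the sub-resource pools of the chosen level.
    sub_pool_id = level.sub_pool_ids[slice_index % len(level.sub_pool_ids)]
    return sub_pool_id, level.replica_count
```

With this layout, slices marked "high" alternate between sub-resource pools 1 and 2 and receive three replica slices each, while "normal" slices go to sub-resource pool 3 with two replica slices.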

In a second aspect, an embodiment of the present invention provides an apparatus for storing data, used in a storage system, where the storage system includes a resource pool for storing data, the resource pool is logically divided into at least two sub-resource pools, and each sub-resource pool includes N adjacent storage devices, N ≥ 2. The apparatus includes:

a distribution module, configured to divide the data to be stored into R slices, R ≥ 2, and distribute the R slices into the divided sub-resource pools, where the number of copies expected to be generated for the data to be stored is less than N;

a copy generation module, configured to generate a replica slice for each of the slices distributed into each of the sub-resource pools;

and a storage module, configured to store the replica slice in the sub-resource pool to which the slice belongs, the replica slice containing the same data as its corresponding slice.

With reference to the second aspect, in a first possible implementation manner of the second aspect, the storage module is further configured to store the replica slice on a storage device in the sub-resource pool to which the slice belongs, where the storage device storing each replica slice is different from the storage device storing the corresponding slice, and replica slices containing the same data are stored on different storage devices.

With reference to the second aspect and the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the apparatus further includes:

a partitioning module, configured to divide the divided sub-resource pools into sub-resource pool sets of different levels, where each sub-resource pool set includes at least one sub-resource pool, and the number of replica slices expected to be generated for each slice differs between sub-resource pool sets of different levels;

The copy generation module is further configured to generate, for each slice distributed into a sub-resource pool, the number of replica slices expected for slices at the level of the sub-resource pool set to which that sub-resource pool belongs.

With reference to the second aspect and the first possible implementation manner of the second aspect, in a third possible implementation manner of the second aspect, the apparatus further includes:

a negative adjustment module, configured to reduce, for any one of the sub-resource pools, the number of storage devices in the sub-resource pool;

a data distribution module, configured to allocate the slices on the removed storage devices to other storage devices in the sub-resource pool, where the number of storage devices remaining in the sub-resource pool is greater than the number of copies expected to be generated.

With reference to the second aspect and the first possible implementation manner of the second aspect, in a fourth possible implementation manner of the second aspect, the apparatus further includes:

a positive adjustment module, configured to increase, for any one of the sub-resource pools, the number of storage devices in the sub-resource pool;

The data distribution module is further configured to, when the number of storage devices in the sub-resource pool is increased, reallocate the replica slices of the slices in the sub-resource pool across the storage devices in the sub-resource pool.

With reference to the second possible implementation manner of the second aspect, in a fifth possible implementation manner:

The distribution module is further configured to distribute the R slices to sub-resource pools of different levels according to the importance of the data in the R slices.

In a third aspect, an embodiment of the present invention provides a storage system for storing data, where the storage system includes a resource pool for storing data, the resource pool is logically divided into at least two sub-resource pools, and each sub-resource pool includes N adjacent storage devices, N ≥ 2. The storage system includes a processor, a communication interface, a bus, and storage devices, where the processor, the communication interface, and all the storage devices communicate with each other through the bus, wherein:

The processor is configured to divide the data to be stored into R slices, R ≥ 2, and distribute the R slices into the divided sub-resource pools through the communication interface, where the number of copies expected to be generated for the data to be stored is less than N;

The processor is further configured to generate a replica slice for each of the slices distributed into each sub-resource pool, and store the replica slice in the sub-resource pool to which the slice belongs, the replica slice containing the same data as its corresponding slice.

With reference to the third aspect, in a first possible implementation manner of the third aspect, the processor is further configured to store the replica slice on a storage device in the sub-resource pool to which the slice belongs, where the storage device storing each replica slice is different from the storage device storing the corresponding slice, and replica slices containing the same data are stored on different storage devices.

With reference to the third aspect and the first possible implementation manner of the third aspect, in a second possible implementation manner of the third aspect, the processor is further configured to divide the divided sub-resource pools into sub-resource pool sets of different levels, where each sub-resource pool set includes at least one sub-resource pool, and the number of replica slices expected to be generated for each slice differs between sub-resource pool sets of different levels;

The processor is further configured to generate, for each slice distributed into a sub-resource pool, the number of replica slices expected for slices at the level of the sub-resource pool set to which that sub-resource pool belongs.

With reference to the third aspect and the first possible implementation manner of the third aspect, in a third possible implementation manner of the third aspect, the processor is further configured to, after the replica slice is stored in the sub-resource pool to which the slice belongs and for any one of the sub-resource pools, when the number of storage devices in the sub-resource pool is reduced, allocate the slices on the removed storage devices to other storage devices in the sub-resource pool through the communication interface, where the number of storage devices remaining in the sub-resource pool is greater than the number of copies expected to be generated.

With reference to the third aspect and the first possible implementation manner of the third aspect, in a fourth possible implementation manner of the third aspect, the processor is further configured to, after the replica slice is stored in the sub-resource pool to which the slice belongs and for any one of the sub-resource pools, when the number of storage devices in the sub-resource pool is increased, reallocate the replica slices of the slices in the sub-resource pool across the storage devices in the sub-resource pool.

With reference to the second possible implementation manner of the third aspect, in a fifth possible implementation manner, the processor is further configured to distribute the R slices to sub-resource pools of different levels according to the importance of the data in the R slices.

In a fourth aspect, an embodiment of the present invention provides a computer program product for storing data, used in a storage system, where the storage system includes a resource pool for storing data, the resource pool is logically divided into at least two sub-resource pools, and each sub-resource pool includes N adjacent storage devices, N ≥ 2. The computer program product includes a computer-readable storage medium storing program code, and the program code includes instructions for:

dividing the data to be stored into R slices, R ≥ 2, and distributing the R slices into the divided sub-resource pools, where the number of copies expected to be generated for the data to be stored is less than N; and generating a replica slice for each of the slices distributed into each of the sub-resource pools, and storing the replica slice in the sub-resource pool to which the slice belongs, the replica slice containing the same data as its corresponding slice.

The method, device, storage system and computer program product for storing data provided by the embodiments of the present invention can divide all storage devices in the storage system into at least two sub-resource pools and store the data in the divided sub-resource pools. Because the data in each sub-resource pool is independent, the data is completely damaged only when two storage devices in the same sub-resource pool are damaged. In the prior art, where there is only one resource pool, damage to any two storage devices may result in complete data corruption. In the embodiment of the present invention, the data is stored in multiple divided sub-resource pools, and the probability that two storage devices in the same sub-resource pool are damaged is smaller than the probability that two storage devices are damaged in the single resource pool of the prior art. The embodiment of the present invention can therefore reduce the probability of complete data corruption relative to the prior art, thereby improving data security.

DRAWINGS

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention; those of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a flowchart of a method for storing data according to an embodiment of the present invention;

FIG. 1a is a schematic diagram of a specific example of a method for storing data according to an embodiment of the present invention;

FIG. 1b is a schematic diagram of another specific example of a method for storing data according to an embodiment of the present invention;

FIG. 2 is a flowchart of another method for storing data according to an embodiment of the present invention;

FIG. 3 is a flowchart of still another method for storing data according to an embodiment of the present invention;

FIG. 3a is a schematic diagram of still another specific example of a method for storing data according to an embodiment of the present invention;

FIG. 3 is a flowchart of still another method for storing data according to an embodiment of the present invention; FIG. 3 is a schematic diagram of another specific example of a method for storing data according to an embodiment of the present disclosure;

FIG. 4 is a flowchart of still another method for storing data according to an embodiment of the present invention;

FIG. 4b is another flowchart of a method for storing data according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an apparatus for storing data according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of another apparatus for storing data according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of the network architecture of a storage system for storing data according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are clearly and completely described in conjunction with the drawings in the embodiments of the present invention. It is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts are within the scope of the present invention.

In one aspect, an embodiment of the present invention provides a method for storing data, as shown in FIG. 1, including:

It should be noted that the embodiment of the present invention may be used in a storage system, where the storage system includes a resource pool for storing data, the resource pool is logically divided into at least two sub-resource pools, and each sub-resource pool includes N adjacent storage devices, N ≥ 2. The embodiments of the present invention may be implemented by a device having data transmission, processing, and storage functions, for example a management server or a mobile workstation that manages a hard disk array in a distributed storage system.

101. Divide the data to be stored into R slices, and distribute the R slices into the divided sub-resource pools.

The number of copies expected to be generated for the data to be stored is less than N, that is, smaller than the number of storage devices in a sub-resource pool, and R ≥ 2. For example, if the slices allocated to sub-resource pool 1 are P1, P2, and P3 and the number of copies expected to be generated is 2, the management server can generate, from the slices P1, P2, and P3 allocated to sub-resource pool 1, the replica slices P1', P2', P3' of the first copy and the replica slices P1'', P2'', P3'' of the second copy. P1, P2, P3, P1', P2', P3', P1'', P2'', and P3'' are then stored on the storage devices in sub-resource pool 1 by a distributed scattering technique of the kind used to spread data slices in a distributed storage architecture.

In this embodiment, the management server may divide the data to be stored into multiple data slices by common technical means. For example, the management server may use the data slicing method of RAID technology, well known to those skilled in the art, to divide the data to be stored into at least two data slices, and then distribute the data slices across the sub-resource pools. For example, the management server may distribute the data slices across the sub-resource pools by the scattered distribution method of RAID technology. Alternatively, the data slices may be distributed among the divided sub-resource pools by other means.
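Step 101 can be sketched as follows. This is only an illustration under stated assumptions: the source does not prescribe a slicing rule or a distribution rule, so equal-sized slicing and round-robin distribution are chosen here for simplicity, and the function names `split_into_slices` and `distribute` are hypothetical.

```python
def split_into_slices(data: bytes, r: int) -> list[bytes]:
    """Divide the data to be stored into R contiguous slices (R >= 2)."""
    if r < 2:
        raise ValueError("R must be at least 2")
    size = -(-len(data) // r)  # ceiling division: bytes per slice
    return [data[i * size:(i + 1) * size] for i in range(r)]


def distribute(slices: list[bytes], sub_pool_ids: list[int]) -> dict:
    """Assign slices to sub-resource pools in round-robin order.

    Returns {sub_pool_id: [(slice_index, slice_bytes), ...]}.
    """
    placement = {pool_id: [] for pool_id in sub_pool_ids}
    for index, chunk in enumerate(slices):
        pool_id = sub_pool_ids[index % len(sub_pool_ids)]
        placement[pool_id].append((index, chunk))
    return placement
```

Keeping the slice index next to each slice lets the original data be reassembled later by concatenating the slices in index order.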

Further, in this embodiment, the management server may divide the storage devices in the system into multiple sub-resource pools at the logical level, each sub-resource pool being composed of multiple storage devices. For example:

Suppose there are 24 hard disks in total in the distributed storage system. The management server can divide the 24 hard disks into 3 sub-resource pools through the MDC (Metadata Controller), each containing 8 hard disks, forming the multi-sub-resource-pool architecture shown in Figure 1a.
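The 24-disk example can be reproduced with a small sketch. The function name `divide_into_sub_pools` and the use of list order to model adjacency are assumptions made for illustration; the source does not describe how the MDC performs the division.

```python
def divide_into_sub_pools(disk_ids: list[int], num_sub_pools: int) -> dict:
    """Logically divide an ordered list of disks into sub-resource pools.

    Adjacent disks in the list end up in the same sub-resource pool.
    Returns {sub_pool_id: [disk_id, ...]} with sub_pool_id starting at 1.
    """
    if num_sub_pools < 2:
        raise ValueError("at least two sub-resource pools are required")
    if len(disk_ids) % num_sub_pools != 0:
        raise ValueError("disks must divide evenly among sub-resource pools")
    n = len(disk_ids) // num_sub_pools  # N devices per sub-resource pool
    if n < 2:
        raise ValueError("each sub-resource pool needs N >= 2 devices")
    return {
        pool + 1: disk_ids[pool * n:(pool + 1) * n]
        for pool in range(num_sub_pools)
    }
```

Calling `divide_into_sub_pools(list(range(1, 25)), 3)` yields three sub-resource pools holding disks 1 to 8, 9 to 16, and 17 to 24.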

In practical applications, the LBAs (Logical Block Addresses) commonly used when storing data can be mapped to different sub-resource pools, and the sub-resource pools and the block storage logical partition units (Partitions) can be allocated through the MDC cluster management node. For example, there are n Partitions in each sub-resource pool, and each hard disk in a sub-resource pool can hold multiple Partitions. The MDC can record the PartitionID (the ID of the block storage logical partition unit) and the sub-resource pool ID for each sub-resource pool, and identify each Partition in the entire disk array by character information of the form "[sub-resource pool ID.PartitionID]", so that each Partition has unique identification information. Other devices in the system can then access a Partition in a sub-resource pool by common technical means according to that Partition's identification information.
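The combined identifier can be modeled with a small value type. The class name `PartitionKey`, the integer IDs, and the dotted text form without brackets are assumptions for illustration; the source only states that the sub-resource pool ID and the PartitionID together identify a Partition uniquely.

```python
from typing import NamedTuple


class PartitionKey(NamedTuple):
    """Unique identifier of a Partition across the whole disk array."""
    sub_pool_id: int
    partition_id: int

    def __str__(self) -> str:
        # Text form "<sub-resource pool ID>.<PartitionID>"
        return f"{self.sub_pool_id}.{self.partition_id}"

    @classmethod
    def parse(cls, text: str) -> "PartitionKey":
        pool, partition = text.split(".")
        return cls(int(pool), int(partition))
```

Because the PartitionID only needs to be unique within its sub-resource pool, two sub-resource pools may both contain a Partition 17 and still be told apart as `1.17` and `2.17`.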

102. Generate a replica slice for each of the slices distributed into each sub-resource pool, and store the replica slice in a sub-resource pool to which the slice belongs.

In this embodiment, at least two identical copies are expected to be generated for the data to be stored, and the number of copies expected to be generated is less than N. A replica slice corresponds to a slice of the data to be stored and contains the same data. The replica slices stored in one sub-resource pool are copies of the other slices stored in that same sub-resource pool; that is, the replica slices in one sub-resource pool are different from the replica slices in any other sub-resource pool. For example, as shown in FIG. 1b, the management server can divide the first copy into slices P1-P8 and the second copy into slices P1'-P8'. The contents of the first copy and the second copy are the same: P1 is the same as P1', P2 is the same as P2', and so on up to P8 and P8'.

In sub-resource pool 1, hard disk 1 carries P1, P2, P5, P6 and hard disk 2 carries P1', P2', P5', P6'; that is, the content of the first copy in sub-resource pool 1 is the same as the content of the second copy in sub-resource pool 1. Similarly, the content of the first copy in sub-resource pool 2 is the same as the content of the second copy in sub-resource pool 2.

Further, for example, P1 and P1' are both in sub-resource pool 1; that is, the Partition identical to P1 exists only in sub-resource pool 1. Likewise, all identical Partitions are in the same sub-resource pool, so the data in one sub-resource pool is content-independent of the other sub-resource pools. When a hard disk is damaged, for example hard disk 1, the slices P1, P2, P5, and P6 on hard disk 1 can be repaired using only P1', P2', P5', and P6' on hard disk 2 in the same sub-resource pool 1. In general, when a hard disk in a sub-resource pool is damaged, the management server can repair the data of the damaged hard disk using only the data on the other hard disks in that sub-resource pool.
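Placement and repair inside a single sub-resource pool can be sketched as follows. The rotation rule, the function names `place_with_replicas` and `repair_sources`, and the convention of marking replica slices with trailing apostrophes are assumptions for illustration; `replica_count` is taken to be the number of replica slices per slice, excluding the original. Because of the rotation, the resulting layout is valid but not identical to the one drawn in FIG. 1b.

```python
def place_with_replicas(slice_names, devices, replica_count):
    """Place each slice and its replicas on distinct devices of one sub-pool.

    Returns {device: [item, ...]}, where replicas carry trailing apostrophes.
    """
    if replica_count >= len(devices):
        raise ValueError("number of copies must be less than N")
    layout = {device: [] for device in devices}
    for i, name in enumerate(slice_names):
        for k in range(replica_count + 1):
            # Rotating the start device spreads originals over the devices.
            device = devices[(i + k) % len(devices)]
            layout[device].append(name + "'" * k)
    return layout


def repair_sources(layout, failed_device):
    """Map every item on the failed device to a surviving copy in the pool."""
    sources = {}
    for item in layout[failed_device]:
        base = item.rstrip("'")
        for device, items in layout.items():
            if device == failed_device:
                continue
            match = next((x for x in items if x.rstrip("'") == base), None)
            if match is not None:
                sources[item] = (device, match)
                break
    return sources
```

Every item on the failed device finds its source on another device of the same sub-resource pool, so no data from other sub-resource pools is read during repair.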

It should be noted that, in the embodiment of the present invention, because the replica slices in one sub-resource pool are different from those in the other sub-resource pools, the data in each sub-resource pool is content-independent of the other sub-resource pools, and the copies generated from the data to be stored are spread over multiple sub-resource pools. The replica data is completely damaged only if two storage devices in the same sub-resource pool are damaged at the same time. For example, in the dual-copy scenario shown in Figure 1b, the replica data is completely damaged only if hard disk 1 and hard disk 2 in sub-resource pool 1 are damaged simultaneously, or hard disk 3 and hard disk 4 in sub-resource pool 2 are damaged simultaneously. If only one hard disk is damaged in sub-resource pool 1 and only one hard disk is damaged in sub-resource pool 2, the replica data can still be repaired.

Assume that every hard disk has basically the same probability of damage, denoted A. With the prior-art solution, in which hard disks 1, 2, 3, and 4 are all in one resource pool, the probability that the replica data is damaged (that is, the probability that two hard disks are damaged at the same time) is A²; this can also be understood as the probability that the distributed storage system is corrupted during the time window of data reconstruction. In the solution of this embodiment, hard disks 1 and 2 in FIG. 1b are in sub-resource pool 1 and hard disks 3 and 4 are in sub-resource pool 2. Because the data in sub-resource pool 1 is content-independent of sub-resource pool 2, the probability that the replica data is damaged (that is, the probability that two hard disks in the same sub-resource pool are damaged at the same time) is 0.5A², which can be understood as a probability of 0.5A that the distributed storage system is corrupted during the time window of data reconstruction.

It can be seen that, in the solution of the embodiment of the present invention, the more sub-resource pools are divided, the lower the probability of replica data corruption. For example, with 100 sub-resource pools, the probability that the replica data is corrupted within the data reconstruction time window is 0.01A. If exactly one hard disk is damaged in each sub-resource pool, the solution provided by the embodiment of the present invention can cope with the extreme case of up to 100 damaged hard disks and still repair the data, because the data in the sub-resource pools is mutually independent.
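The trend described above can be checked with a simple counting model. This sketch is not the calculation used in the source: it assumes that exactly two disks fail, that every pair of disks is equally likely to be the failing pair, and that a pair is fatal only when both disks lie in the same sub-resource pool. Under these assumptions the exact fraction differs somewhat from the rounded 1/k factor quoted in the text, but it falls in the same way as the number of sub-resource pools grows. The function name `fatal_pair_fraction` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def fatal_pair_fraction(num_disks: int, num_sub_pools: int) -> Fraction:
    """Fraction of two-disk failures that hit a single sub-resource pool."""
    if num_disks % num_sub_pools != 0:
        raise ValueError("disks must divide evenly among sub-resource pools")
    n = num_disks // num_sub_pools  # disks per sub-resource pool
    pool_of = {disk: disk // n for disk in range(num_disks)}
    pairs = list(combinations(range(num_disks), 2))
    fatal = sum(1 for a, b in pairs if pool_of[a] == pool_of[b])
    return Fraction(fatal, len(pairs))
```

With a single pool every pair is fatal, so the fraction is 1. Splitting four disks into two sub-resource pools leaves two fatal pairs out of six, and splitting 24 disks into three sub-resource pools of eight leaves 84 fatal pairs out of 276.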

For example, in practical applications the failure rate of a storage device such as a hard disk is generally 4%. Taking into account the effect of the performance parameters of some commonly used storage devices on the hard disk failure rate, the solution of the embodiment of the present invention yields the data shown in Table 1:

Table 1

It can be seen that, in the embodiment of the present invention, each sub-resource pool includes N adjacent storage devices, and the probability that adjacent storage devices are damaged at the same time is small, so the probability that damaged storage devices are not in the same sub-resource pool is very large. When the data on a storage device in a sub-resource pool is damaged, it can be recovered from the data on the other storage devices in that sub-resource pool, which increases the fault tolerance of the distributed storage system, reduces the chance of data corruption, and improves data security.

The method for storing data provided by the embodiment of the present invention can divide all storage devices into at least two sub-resource pools and store the data in the divided sub-resource pools, and the data in each sub-resource pool is independent. For example, as shown in Figure 1b, the data slices stored in sub-resource pool 1 are P1, P2, P5, P6, P1', P2', P5', and P6', and none of them is the same as any data slice stored in sub-resource pool 2. The data in sub-resource pool 2 therefore does not affect the data in sub-resource pool 1, and the data in sub-resource pool 1 does not need to be read in order to repair the data in sub-resource pool 2. Because the data in each sub-resource pool is independent, data corruption or data repair in one sub-resource pool does not affect the data in other sub-resource pools, and the data is completely damaged only when two storage devices in the same sub-resource pool are damaged. In the prior art, storage devices that hold mutually repairable data may be damaged at the same time, resulting in complete data corruption. In the embodiment of the present invention, multiple sub-resource pools are divided, each including N adjacent storage devices, and the data is stored in the divided sub-resource pools. The data is completely damaged only when the storage devices holding mutually repairable data in the same sub-resource pool are damaged, and the probability that adjacent storage devices in the same sub-resource pool are damaged together is small. Compared with the prior art, the embodiment of the present invention therefore reduces the chance of complete data corruption and improves data security.

Optionally, the embodiment of the present invention further provides a method for storing data which, as shown in FIG. 2, may include:

201. Divide the data to be stored into R slices, and distribute the R slices into sub-resource pools that contain storage devices of a specified type.

Each sub-resource pool may contain at least two adjacent storage devices, which are used to store data.

Moreover, the obtained sub-resource pool includes storage devices of the specified type. Types of storage device include, but are not limited to, hard disks and solid-state drives (SSDs).

202. Generate a replica slice for each of the slices distributed into each sub-resource pool, and store the replica slice in the sub-resource pool to which the slice belongs.

In this embodiment, the management server may determine the type of storage device to be used and store the slices of the data to be stored in a sub-resource pool containing that type of storage device. It then generates a replica slice for each slice distributed into each sub-resource pool and stores the replica slice in the sub-resource pool to which the slice belongs. For example:

A mapping between data types and storage device types, as shown in Table 2, can be pre-stored in the management server. The management server determines the type of storage device to use according to the type of the data to be stored, obtains a sub-resource pool that includes storage devices of that type, and finally stores the data in that sub-resource pool.

Data type    Storage device type     Sub-resource pool where the storage device is located
Picture      Mechanical hard disk    Sub-resource pool 1
Text         SSD                     Sub-resource pool 2

Table 2
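The lookup described for Table 2 can be sketched as follows. This is a minimal illustration, assuming the mapping is held as an in-memory dictionary; the names `TABLE_2` and `select_pool` are hypothetical and do not appear in the embodiment.

```python
# Hypothetical in-memory form of Table 2 (data type -> device type, pool).
TABLE_2 = {
    "picture": ("mechanical hard disk", "sub-resource pool 1"),
    "text": ("SSD", "sub-resource pool 2"),
}


def select_pool(data_type):
    """Return (storage device type, sub-resource pool) for a data type."""
    try:
        return TABLE_2[data_type]
    except KeyError:
        # The embodiment does not say what happens for unmapped types;
        # raising an error is an assumption of this sketch.
        raise ValueError(f"no mapping for data type {data_type!r}")


device_type, pool = select_pool("text")
```

A mapping keyed on data size, as in the next example, could follow the same pattern with size ranges in place of type names.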

Another example:

A mapping between data sizes and storage device types, as shown in Table 3, can be pre-stored in the management server. The management server determines the type of storage device to use according to the size of the data to be stored, obtains a sub-resource pool that includes storage devices of that type, and finally stores the data in that sub-resource pool.

Table 3

That is, in this embodiment, the management server may determine the type of storage device to use in various ways. In practical applications, those skilled in the art may also determine the type of storage device to use according to the embodiment of the present invention and the prior art. Different types of storage devices also differ in security; for example, many solid-state drives are currently less likely to be damaged than mechanical hard disks. The management server can therefore select the corresponding type of storage device according to the importance of the data, further improving the security of important data.

Further, the embodiment of the present invention provides a method for storing data which, as shown in FIG. 3a, includes:

301. Divide the data to be stored into R slices, and distribute the R slices into the divided sub-resource pools.

302. Generate a replica slice for each of the slices distributed to each sub-resource pool, and store the replica slice on a storage device in the sub-resource pool to which the slice belongs.

The replica slice corresponds to the slice of the data to be stored that has the same content, and each replica slice is stored on a different storage device. For example:

As shown in Figure 1b, the slices of the first copy are P1-P8 and the slices of the second copy are P1'-P8'; P1 is the same as P1', P2 is the same as P2', and so on through P8 and P8'. In sub-resource pool 1, hard disk 1 carries P1, P2, P5, P6 and hard disk 2 carries P1', P2', P5', P6'. That is, the content of the first copy in sub-resource pool 1 is the same as the content of the second copy in sub-resource pool 1. Similarly, the content of the first copy in sub-resource pool 2 is the same as the content of the second copy in sub-resource pool 2.

Further, both P1 and P1' are in sub-resource pool 1; that is, the Partition identical to P1 exists only in sub-resource pool 1. Likewise, all identical Partitions are in the same sub-resource pool, so the data in one sub-resource pool is independent of the content of the other sub-resource pools. For example, any slice in sub-resource pool 1 differs from every slice in sub-resource pool 2. Moreover, the hard disk on which each replica slice is stored differs from the hard disk on which the corresponding slice is stored, so slices holding the same data are on different storage devices; for example, P1 and P1' are on different hard disks.
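Steps 301 and 302 can be sketched as follows. This is an illustration under stated assumptions, not the allocation algorithm of the embodiment: the function `place_slices`, the pairwise grouping rule (`group=2`), the device rotation, and the disk names are all hypothetical, chosen so that the result matches the layout described for sub-resource pool 1 in Figure 1b. The contents of sub-resource pool 2 are inferred, since the text only states that they differ from those of pool 1. Here `copies` counts the slice itself plus its replicas.

```python
def place_slices(slices, pools, copies=2, group=2):
    """Place each slice and its replicas in one sub-resource pool.

    pools maps a pool name to its list of devices. Every copy of a slice
    goes to a different device of the same pool.
    """
    layout = {dev: [] for devs in pools.values() for dev in devs}
    names = list(pools)
    placed = {name: 0 for name in names}  # slices placed so far, per pool
    for i, label in enumerate(slices):
        # Assumed rule: hand out slices to the pools in groups of `group`.
        name = names[(i // group) % len(names)]
        devices = pools[name]
        if copies > len(devices):
            raise ValueError("a pool needs at least one device per copy")
        # Rotate the starting device so that larger pools are used evenly.
        start = (placed[name] * copies) % len(devices)
        for c in range(copies):
            dev = devices[(start + c) % len(devices)]
            layout[dev].append(label + "'" * c)  # P1, P1', P1'', ...
        placed[name] += 1
    return layout


pools = {"pool1": ["disk1", "disk2"], "pool2": ["disk3", "disk4"]}
layout = place_slices([f"P{i}" for i in range(1, 9)], pools)
# layout["disk1"] holds P1, P2, P5, P6; layout["disk2"] holds their replicas.
```

Because every copy of a slice stays inside one pool, losing devices in pool 2 never requires reading from pool 1, which is the independence property the paragraph above relies on.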

303a. For any one of the sub-resource pools, reduce the number of storage devices in the sub-resource pool.

Hot plugging and unplugging of storage devices makes it possible to increase or decrease the number of storage devices in a sub-resource pool while the sub-resource pool is running. For example, when a hard disk in a sub-resource pool fails and needs to be rebuilt by common technical means, the management server can rebuild and restore the data of the failed hard disk within the sub-resource pool. As shown in Figure 3a, after hard disk 1 in sub-resource pool X fails, the Partitions on hard disk 1 can be migrated to the other three hard disks in the sub-resource pool according to preset rules. This reduces the number of storage devices in the sub-resource pool while the sub-resource pool is running.

304a. Allocate the slices on the removed storage device to other storage devices in the sub-resource pool.

As a parallel alternative, as shown in Figure 3b, the method may also include:

303b. For any one of the sub-resource pools, increase the number of storage devices in the sub-resource pool.

For example, as shown in Figure 3b1, after the management server adds hard disk 4 to sub-resource pool X, the Partitions on hard disks 1-3 can be evenly redistributed across the four hard disks 1-4 according to a preset rule, so that part of the content of hard disks 1-3 is migrated to hard disk 4. The number of storage devices in the sub-resource pool is thus increased while the sub-resource pool is running.

304b. Re-allocate the replica slices of the slices in the sub-resource pool to the storage devices in the sub-resource pool.

In this embodiment, the management server may re-allocate the slices in the sub-resource pool after adding storage devices to it. For example, as shown in FIG. 3b, all the data slices on hard disks 1-3 are redistributed across hard disks 1-4.
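The two adjustments (steps 303a/304a and 303b/304b) can be sketched as follows. The "preset rules" are not specified in the embodiment, so this sketch assumes a simple one: move each slice to the least-loaded device that does not already hold a copy of the same slice. The function names and the `expected_copies` parameter are hypothetical.

```python
def base(label):
    """Strip replica marks: P1, P1' and P1'' all map to P1."""
    return label.rstrip("'")


def holds_copy(slices, label):
    """True if a device already holds some copy of the same slice."""
    return any(base(s) == base(label) for s in slices)


def remove_device(pool, failed, expected_copies):
    """Steps 303a/304a: drop `failed` and move its slices to other devices."""
    remaining = {d: list(s) for d, s in pool.items() if d != failed}
    # The remaining device count must exceed the expected number of copies.
    if len(remaining) <= expected_copies:
        raise ValueError("too few devices would remain in the sub-resource pool")
    for label in pool[failed]:
        candidates = [d for d, s in remaining.items() if not holds_copy(s, label)]
        if not candidates:
            raise ValueError(f"no device can take {label}")
        target = min(candidates, key=lambda d: len(remaining[d]))
        remaining[target].append(label)
    return remaining


def add_device(pool, new_device):
    """Steps 303b/304b: add an empty device and even out the slice counts."""
    grown = {d: list(s) for d, s in pool.items()}
    grown[new_device] = []
    target = sum(len(s) for s in grown.values()) // len(grown)
    while len(grown[new_device]) < target:
        donor = max((d for d in grown if d != new_device),
                    key=lambda d: len(grown[d]))
        movable = [s for s in grown[donor] if not holds_copy(grown[new_device], s)]
        if not movable:
            break  # sketch only: give up rather than search other donors
        grown[donor].remove(movable[-1])
        grown[new_device].append(movable[-1])
    return grown
```

Both functions return a new layout and leave the input untouched, and neither places two copies of one slice on the same device, so the recovery property of the sub-resource pool is preserved across the adjustment.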

The embodiment of the present invention can also implement hot plugging and unplugging of the storage devices in a sub-resource pool, so that the data stored in each sub-resource pool is not lost when abnormal conditions occur, such as damage to a storage device or shutdown of a storage device for maintenance. The distributed storage system can thus remove unusable storage devices at any time while ensuring data stability, further improving data security.

The method for storing data provided by the embodiment of the present invention can divide all storage devices into at least two sub-resource pools and store the data in the divided sub-resource pools, and the data in each sub-resource pool is independent. For example, if a sub-resource pool includes two storage devices, data is completely damaged only when both storage devices in the same sub-resource pool are damaged. In the prior art, where there is only one resource pool, damage to any two storage devices may cause complete data corruption. In the embodiment of the present invention, multiple sub-resource pools can be divided and the data is stored in the divided sub-resource pools, so data is completely damaged only when two storage devices in the same sub-resource pool are damaged. The probability that two adjacent storage devices in the same sub-resource pool are damaged is smaller than the probability, in the prior art with only one resource pool, that two storage devices holding mutually recoverable data are damaged. Compared with the prior art, the embodiment of the present invention can therefore reduce the probability of complete data corruption, thereby improving data security.
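The comparison above can be illustrated by counting how many pairs of storage devices would cause complete loss of some data if both failed. This is a rough sketch under assumptions that are not part of the embodiment: dual-copy storage, a total device count divisible by N, and, for the single-pool case, enough evenly spread Partitions that every pair of devices shares at least one. It does not model the separate claim that adjacent devices are less likely to fail together. The function names are hypothetical.

```python
from math import comb


def fatal_pairs_single_pool(total_devices):
    """Single resource pool: any two devices are assumed to share a Partition."""
    return comb(total_devices, 2)


def fatal_pairs_sub_pools(total_devices, n):
    """Sub-resource pools of n devices: only pairs inside one pool are fatal."""
    if total_devices % n:
        raise ValueError("total_devices must be a multiple of n")
    return (total_devices // n) * comb(n, 2)


single = fatal_pairs_single_pool(12)    # 66 device pairs
divided = fatal_pairs_sub_pools(12, 2)  # 6 device pairs
```

Under these assumptions, with 12 devices and N = 2, only 6 of the 66 possible two-device failures destroy both copies of any slice.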
Further, optionally, the embodiment of the present invention provides a method for storing data that may further include: dividing the divided sub-resource pools into sub-resource pool sets of different levels, where each sub-resource pool set includes at least one sub-resource pool and the number of replica slices expected to be generated for each slice differs between sub-resource pool sets of different levels. Generating a replica slice for each slice distributed to each sub-resource pool then includes: generating, for each slice distributed to a sub-resource pool, the number of replica slices that corresponds to the level of the sub-resource pool set to which that sub-resource pool belongs. In a specific implementation, as shown in FIG. 4a, the method includes:

401a. Divide the divided sub-resource pools into sub-resource pool sets of different levels. Each sub-resource pool set includes at least one sub-resource pool, and the number of replica slices expected to be generated for each slice differs between sub-resource pool sets of different levels. For example:

Determine Q primary sub-resource pools and Q' secondary sub-resource pools. The number of copies expected to be generated for a slice stored in a primary sub-resource pool is M, and the number of copies expected to be generated for a slice stored in a secondary sub-resource pool is O, where M > O ≥ 2, Q ≥ 2, and Q' ≥ 2. Each sub-resource pool thus has an expected number of copies for the slices it stores. For example, the contents of copies 1, 2, and 3 are the same, the number of copies expected to be generated in sub-resource pool x is 2, and the number of copies expected to be generated in sub-resource pool y is 3. Then, after sub-resource pool x stores the slices P1, P2, P3 from copy 1 and the slices P1', P2', P3' from copy 2, it can no longer store the slices P1'', P2'', P3'' from copy 3. To store all or part of the content of copies 1, 2, and 3, sub-resource pool y must be used.

Moreover, characters defining the number of copies may be added to the identification information that indicates the attributes of a sub-resource pool. For example, 010 indicates that the expected number of copies of the data stored in the sub-resource pool is 2, and 011 indicates that the expected number of copies is 3.

402a. For each slice distributed to a sub-resource pool, generate the number of replica slices that corresponds to the level of the sub-resource pool set to which the sub-resource pool belongs.
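Step 402a, together with the identification characters mentioned above, can be sketched as follows. The sketch assumes that the characters are a binary encoding of the copy count, which is consistent with the two examples given (010 for 2 and 011 for 3) but is not stated in the embodiment. The function names are hypothetical, and the returned list includes the slice itself along with its replicas.

```python
def copies_from_identifier(bits):
    """Decode the copy-count characters of a pool identifier.

    Assumption: the field is a binary number, so "010" -> 2 and "011" -> 3.
    """
    return int(bits, 2)


def generate_copies(slice_label, pool_identifier):
    """Return the slice and its replicas for a pool of the given level."""
    count = copies_from_identifier(pool_identifier)
    return [slice_label + "'" * i for i in range(count)]


in_pool_x = generate_copies("P1", "010")  # P1, P1'
in_pool_y = generate_copies("P1", "011")  # P1, P1', P1''
```

This mirrors the example of sub-resource pools x and y: a pool whose identifier decodes to 2 holds P1 and P1', and only a pool whose identifier decodes to 3 can also hold P1''.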

In this embodiment, the management server may determine the level of the sub-resource pool to use in various ways. For example, the management server can determine the level according to the type of the data to be stored: as shown in Table 4, the management server may pre-store a mapping between data types and sub-resource pool levels, and determine the level of the sub-resource pool required for the data to be stored according to the type of that data and the mapping shown in Table 4.

Table 4

Because different data differs in importance in practical applications, the embodiment of the present invention can use storage methods of different levels for data of different importance. Relatively important data can be stored with more copies to guarantee its security, while relatively unimportant data can be stored with fewer copies to increase the utilization of the storage devices in the system. This improves the operating efficiency of the distributed storage system and lets the management server devote more storage devices to the more important data, further enhancing the security of important data.

As a parallel alternative, as shown in Figure 4b, the method also includes:

401b. Divide the divided sub-resource pools into sub-resource pool sets of different levels. Each sub-resource pool set includes at least one sub-resource pool, and the number of replica slices expected to be generated for each slice differs between sub-resource pool sets of different levels.

402b. Divide the data to be stored into R slices, and distribute the R slices to sub-resource pools of different levels according to the importance of the data in the R slices.

Where R ≥ 2. For example:

When a virtual machine is in use, the data of its system volume and the data of its user volume both need to be stored in sub-resource pools. Because the security of the system volume data is more important, the management server may generate 3 copies of the slices corresponding to the system volume data and store them in a first-level sub-resource pool, and generate 2 copies of the slices corresponding to the user volume data and store them in a second-level sub-resource pool. Different parts of the same data to be stored thus generate different numbers of copies and are stored in sub-resource pools of the corresponding level.

Because different parts of the same data differ in importance in practical applications, the embodiment of the present invention can use storage methods of different levels for the different parts. The relatively important parts of the data can be stored with more copies to guarantee their security, while the relatively unimportant parts can be stored with fewer copies to increase the utilization of the storage devices in the system. This improves the operating efficiency of the distributed storage system and lets the management server devote more storage devices to the more important parts of the data, further enhancing their security.
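The virtual machine example can be sketched as follows. The copy counts per level (3 and 2) and the volume-to-level assignment come from the example above; the dictionary names, the function `distribute_by_importance`, and the slice labels are hypothetical.

```python
# Level -> number of copies, taken from the virtual machine example.
COPIES_PER_LEVEL = {1: 3, 2: 2}
# Volume kind -> level; a stand-in for whatever importance rule is used.
LEVEL_OF_VOLUME = {"system volume": 1, "user volume": 2}


def distribute_by_importance(slices):
    """Group slices by level and expand each into its copies.

    slices is a list of (label, volume kind) pairs; the result maps each
    level to the copies to be stored in sub-resource pools of that level.
    """
    plan = {level: [] for level in COPIES_PER_LEVEL}
    for label, kind in slices:
        level = LEVEL_OF_VOLUME[kind]
        plan[level].extend(label + "'" * i for i in range(COPIES_PER_LEVEL[level]))
    return plan


plan = distribute_by_importance([("S1", "system volume"), ("U1", "user volume")])
```

Keeping the importance rule in a separate table means it could be replaced, for example by a rule based on data type, without touching the distribution step.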

The method for storing data provided by the embodiment of the present invention can divide all storage devices into at least two sub-resource pools and store the data in the divided sub-resource pools. Because the data in each sub-resource pool is independent, data is completely damaged only when two storage devices in the same sub-resource pool are damaged. In the prior art, where there is only one resource pool, damage to two storage devices may result in complete data corruption. In the embodiment of the present invention, multiple sub-resource pools can be divided and the data is stored in the divided sub-resource pools, so data is completely damaged only when two storage devices in the same sub-resource pool are damaged. Because the probability that two storage devices in the same sub-resource pool are damaged is small, the embodiment of the present invention can reduce the probability of complete data corruption relative to the prior art, thereby improving data security.

On the other hand, an embodiment of the present invention provides an apparatus for storing data, used in a storage system that includes a resource pool for storing data, where the resource pool is logically divided into at least two sub-resource pools and each sub-resource pool includes N storage devices, N ≥ 2. As shown in Figure 5, the apparatus includes:

The distribution module 51 is configured to divide the data to be stored into R slices, and distribute the R slices into the divided sub-resource pools.

The number of copies expected to be generated for the data to be stored is less than N, and R ≥ 2.

The copy generation module 52 is configured to generate a replica slice for each of the slices distributed to each sub-resource pool.

The storage module 53 is configured to store the replica slice in the sub-resource pool corresponding to the slice, where the replica slice corresponds to the slice that has the same data.

The apparatus for storing data provided by the embodiment of the present invention can divide all storage devices into at least two sub-resource pools and store the data in the divided sub-resource pools. Because the data in each sub-resource pool is independent, data is completely damaged only when two storage devices in the same sub-resource pool are damaged. In the prior art, where there is only one resource pool, damage to two storage devices may result in complete data corruption. In the embodiment of the present invention, multiple sub-resource pools can be divided and the data is stored in the divided sub-resource pools, so data is completely damaged only when two storage devices in the same sub-resource pool are damaged. The probability that two storage devices in the same sub-resource pool are damaged is smaller than the probability, in the prior art with only one resource pool, that two storage devices are damaged, so the embodiment of the present invention can reduce the probability of complete data corruption relative to the prior art, thereby improving data security.

Optionally, as shown in FIG. 6, the apparatus for storing data provided by the embodiment of the present invention may include:

The distribution module 62 is configured to divide the data to be stored into R slices, and distribute the R slices into the divided sub-resource pools.

The number of copies expected to be generated for the data to be stored is less than N, and R ≥ 2.

The copy generation module 63 is configured to generate a replica slice for each of the slices distributed into each sub-resource pool.

The storage module 64 is configured to store the replica slice in the sub-resource pool corresponding to the slice, where the replica slice corresponds to the slice that has the same data.

Further, the storage module 64 is further configured to store the replica slice on a storage device in the sub-resource pool to which the slice belongs, with slices holding the same data stored on different storage devices.

Further, the apparatus for storing data provided by the embodiment of the present invention may further include a dividing module 61, configured to divide the divided sub-resource pools into sub-resource pool sets of different levels.

Each sub-resource pool set includes at least one sub-resource pool, and the number of replica slices expected to be generated for each slice differs between sub-resource pool sets of different levels.

The copy generation module 63 is further configured to generate, for each slice distributed to a sub-resource pool, the number of replica slices that corresponds to the level of the sub-resource pool set to which the sub-resource pool belongs.

Because different data differs in importance in practical applications, the apparatus provided by the embodiment of the present invention can use storage methods of different levels for data of different importance. Relatively important data can be stored with more copies to guarantee its security, while relatively unimportant data can be stored with fewer copies to increase the utilization of the storage devices in the system. This improves the operating efficiency of the distributed storage system and lets the management server devote more storage devices to the more important data, further enhancing the security of important data.

Further, the apparatus for storing data provided by the embodiment of the present invention may further include:

The negative adjustment module 65 is configured to reduce the number of storage devices in the sub-resource pool for any one of the sub-resource pools.

The positive adjustment module 66 is configured to increase the number of storage devices in the sub-resource pool for any one of the sub-resource pools.

The data distribution module 67 is configured to allocate the slices on the removed storage device to other storage devices in the sub-resource pool.

The number of remaining storage devices in the sub-resource pool is greater than the number of copies expected to be generated.

The data distribution module 67 is further configured to re-allocate the replica slices of the slices in the sub-resource pool to the storage devices in the sub-resource pool when the number of storage devices in the sub-resource pool is increased.

Further, the distribution module 62 is further configured to distribute R slices to different levels of sub-resource pools according to the importance of data in the R slices.

The embodiment of the present invention can also implement hot plugging and unplugging of the storage devices in a sub-resource pool, so that the data stored in each sub-resource pool is not lost when a storage device in the sub-resource pool is damaged or shut down for maintenance. The distributed storage system can thus remove unusable storage devices at any time while ensuring data stability, further improving data security.

The apparatus for storing data provided by the embodiment of the present invention can divide all storage devices into at least two sub-resource pools and store the data in the divided sub-resource pools. Because the data in each sub-resource pool is independent, data is completely damaged only when two storage devices in the same sub-resource pool are damaged. In the prior art, where there is only one resource pool, damage to two storage devices may result in complete data corruption. In the embodiment of the present invention, multiple sub-resource pools can be divided and the data is stored in the divided sub-resource pools, so data is completely damaged only when two storage devices in the same sub-resource pool are damaged. The probability that two storage devices in the same sub-resource pool are damaged is smaller than the probability, in the prior art with only one resource pool, that two storage devices are damaged, so the embodiment of the present invention can reduce the probability of complete data corruption relative to the prior art, thereby improving data security.

In another aspect, the embodiment of the present invention provides a storage system for storing data. As shown in FIG. 7, the storage system includes a storage area 74 for storing data; the storage area 74 is logically divided into at least 2 sub-resource pools, and each sub-resource pool includes N adjacent storage devices, N ≥ 2. The storage system includes a processor 71, a communication interface 72, and a bus 73; the processor 71, the communication interface 72, and all the storage devices in the storage area 74 communicate with each other through the bus 73, wherein:

The processor 71 is configured to divide data to be stored into R slices, and distribute the R slices to the divided sub-resource pools through the communication interface 72.

The number of copies expected to be generated for the data to be stored is less than N, and R ≥ 2.

The processor 71 is further configured to generate a replica slice for each of the slices distributed to each sub-resource pool, and store the replica slice in the sub-resource pool to which the slice belongs.

The replica slice corresponds to the slice of the data to be stored that has the same content.

Further, the processor 71 is further configured to store the replica slice on a storage device in the sub-resource pool to which the slice belongs, with slices holding the same data stored on different storage devices.

Optionally, the processor 71 is further configured to divide the divided sub-resource pools into sub-resource pool sets of different levels.

Each sub-resource pool set includes at least one sub-resource pool, and the number of replica slices expected to be generated for each slice differs between sub-resource pool sets of different levels.

The processor 71 is further configured to generate, for each slice distributed to a sub-resource pool, the number of replica slices that corresponds to the level of the sub-resource pool set to which the sub-resource pool belongs.

Further, the processor 71 is further configured to: after the replica slice is stored in the sub-resource pool to which the slice belongs, for any one of the sub-resource pools, when the number of storage devices in the sub-resource pool is reduced, allocate the slices on the removed storage device to other storage devices in the sub-resource pool through the communication interface 72.

Optionally, the processor 71 is further configured to: after the replica slice is stored in the sub-resource pool to which the slice belongs, for any one of the sub-resource pools, when the number of storage devices in the sub-resource pool is increased, re-allocate the replica slices of the slices in the sub-resource pool to the storage devices in the sub-resource pool through the communication interface 72.

Further, the processor 71 is further configured to distribute the R slices to sub-resource pools of different levels according to the importance of the data in the R slices.

The storage system for storing data provided by the embodiment of the present invention can divide all storage devices in the storage system into at least two resource pools and store the data in the divided resource pools. Because the data in each resource pool is independent, data is completely damaged only when two storage devices in the same resource pool are damaged. In the prior art, where there is only one resource pool, damage to two storage devices may result in complete data corruption. In the embodiment of the present invention, multiple resource pools can be divided and the data is stored in the divided resource pools, so data is completely damaged only when two storage devices in the same resource pool are damaged. The probability that two storage devices in the same resource pool are damaged is smaller than the probability, in the prior art with only one resource pool, that two storage devices are damaged, so the embodiment of the present invention can reduce the probability of complete data corruption relative to the prior art, thereby improving data security.

In another aspect, an embodiment of the present invention provides a computer program product for storing data, used in a storage system that includes a resource pool for storing data, where the resource pool is logically divided into at least two sub-resource pools and each sub-resource pool includes N adjacent storage devices, N ≥ 2. The computer program product includes a computer-readable storage medium storing program code, and the program code includes instructions for:

Dividing the data to be stored into R slices and distributing the R slices into the divided sub-resource pools, where the number of copies expected to be generated for the data to be stored is less than N and R ≥ 2; and generating a replica slice for each of the slices distributed to each sub-resource pool and storing the replica slice in the sub-resource pool to which the slice belongs, the replica slice corresponding to the slice that has the same data.

The computer program product for storing data provided by the embodiment of the present invention can divide all the storage devices in the storage system into at least two resource pools and store the data in the divided resource pools. Because the data in each resource pool is independent, data is completely damaged only when 2 storage devices in the same resource pool are damaged. In the prior art, where there is only one resource pool, damage to two storage devices may result in complete data corruption. In the embodiment of the present invention, multiple resource pools can be divided and the data is stored in the divided resource pools, so data is completely damaged only when two storage devices in the same resource pool are damaged. The probability that two storage devices in the same resource pool are damaged is smaller than the probability, in the prior art with only one resource pool, that two storage devices are damaged, so the embodiment of the present invention can reduce the chance of complete data corruption relative to the prior art, which increases data security.

The various embodiments in the specification are described in a progressive manner, and similar parts of the various embodiments may be referred to each other, and each embodiment focuses on differences from other embodiments. In particular, for the device embodiment, since it is basically similar to the method embodiment, it is described in a relatively simple manner, and the relevant parts can be referred to the description of the method embodiment.

A person of ordinary skill in the art can understand that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or replacement that can be easily conceived by those skilled in the art within the technical scope disclosed by the present invention shall be covered by the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of the claims.

Claims

1. A method of storing data, used in a storage system, the storage system including a resource pool for storing data, characterized in that the resource pool is logically divided into at least 2 sub-resource pools, each sub-resource pool includes N adjacent storage devices, N ≥ 2, and the method includes:

Dividing the data that needs to be stored into R slices, and distributing the R slices into the divided sub-resource pools, where the number of copies expected to be generated for the data that needs to be stored is less than N, and R ≥ 2;

Generating a replica slice for each slice distributed to each sub-resource pool, and storing the replica slice in the sub-resource pool to which the corresponding slice belongs, the replica slice corresponding to the slice with the same data.

2. The method of storing data according to claim 1, wherein storing the replica slice in the sub-resource pool to which the corresponding slice belongs includes: storing the replica slice on a storage device in the sub-resource pool to which the corresponding slice belongs, with each replica slice stored on a different storage device.

3. The method of storing data according to claim 1 or 2, characterized in that it also includes:

Dividing the divided sub-resource pools again into sub-resource pool sets of different levels, each sub-resource pool set including at least one sub-resource pool, where the number of replica slices expected to be generated for each slice differs between sub-resource pool sets of different levels;

Generating a replica slice for each slice distributed to each sub-resource pool includes: generating, for each slice distributed to a sub-resource pool, the number of replica slices expected for each slice at the level of the sub-resource pool set to which the sub-resource pool belongs.

4. The method of storing data according to claim 1 or 2, characterized in that, after storing the copy slice in the sub-resource pool to which the corresponding slice belongs, it further includes:

For any of the sub-resource pools, when the number of storage devices in the sub-resource pool is reduced, allocating the slices on the removed storage devices to other storage devices in the sub-resource pool, wherein the number of storage devices remaining in the sub-resource pool is greater than the number of copies expected to be generated.
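The reallocation in claim 4 can be sketched as follows. The dictionary model (device name mapped to the slice ids it holds, with originals and copies not distinguished) and the least-loaded target selection are assumptions, not details from the specification.

```python
def remove_device(pool: dict[str, list[int]], removed: str, copies: int) -> None:
    """Drop one device and re-home its slices inside the same sub-resource pool.

    `pool` maps device name -> slice ids held there (hypothetical model).
    """
    if len(pool) - 1 <= copies:
        raise ValueError("remaining devices must outnumber the expected copies")
    orphaned = pool.pop(removed)
    for slice_id in orphaned:
        # Only devices that do not already hold this slice are eligible,
        # so instances of the same slice stay on different devices.
        candidates = [d for d, ids in pool.items() if slice_id not in ids]
        target = min(candidates, key=lambda d: len(pool[d]))
        pool[target].append(slice_id)
```

The guard mirrors the claim's condition: with more remaining devices than expected copies, an eligible target exists for every orphaned slice.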

5. The method of storing data according to claim 1 or 2, characterized in that, after storing the copy slice in the sub-resource pool to which the corresponding slice belongs, it further includes:

For any of the sub-resource pools, when the number of storage devices in the sub-resource pool is increased, the copy slices of the slices in the sub-resource pool are reallocated to the storage devices in the sub-resource pool.
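One possible reading of the reallocation in claim 5 is a rebalancing step that moves slices onto the newly added device. The sketch below uses the same hypothetical device-to-slice-ids model and does not distinguish original slices from copy slices; the stop condition (load difference of at most one) is an assumed policy.

```python
def add_device(pool: dict[str, list[int]], new_device: str) -> None:
    """Add a device and shift slices onto it from the most loaded devices."""
    if new_device in pool:
        raise ValueError("device already in pool")
    pool[new_device] = []
    while True:
        donor = max(pool, key=lambda d: len(pool[d]))
        if len(pool[donor]) - len(pool[new_device]) <= 1:
            break  # loads are already even enough
        # Never place two instances of the same slice on one device.
        movable = [s for s in pool[donor] if s not in pool[new_device]]
        if not movable:
            break
        slice_id = movable[0]
        pool[donor].remove(slice_id)
        pool[new_device].append(slice_id)
```

Because slices only move between devices of the same sub-resource pool, every copy of a slice stays in the pool it was originally assigned to.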

6. The method of storing data according to claim 3, characterized in that distributing R slices into pre-divided sub-resource pools includes:

According to the importance of the data in the R slices, the R slices are distributed to different levels of sub-resource pools.
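The importance-based distribution of claim 6 can be sketched as a threshold lookup followed by placement within the chosen level. How importance is measured is not stated in the claims; the integer score, the thresholds, and the round-robin choice within a level are all assumptions.

```python
# Hypothetical levels, most protected first, with the minimum score each accepts.
LEVEL_THRESHOLDS = [("high", 8), ("normal", 0)]


def level_for(importance: int) -> str:
    """Return the first level whose minimum score the slice meets."""
    for level, minimum in LEVEL_THRESHOLDS:
        if importance >= minimum:
            return level
    raise ValueError("importance below every threshold")


def distribute_by_importance(
    slices: list[tuple[bytes, int]],
    pools_by_level: dict[str, list[str]],
) -> dict[str, list[bytes]]:
    """Map each (payload, importance) slice to a pool of the matching level."""
    placement: dict[str, list[bytes]] = {}
    counters = {level: 0 for level in pools_by_level}
    for payload, importance in slices:
        level = level_for(importance)
        pools = pools_by_level[level]
        pool_id = pools[counters[level] % len(pools)]
        counters[level] += 1
        placement.setdefault(pool_id, []).append(payload)
    return placement
```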

7. A device for storing data, used in a storage system, the storage system including a resource pool for storing data, characterized in that the resource pool is logically divided into at least two sub-resource pools, each sub-resource pool includes N adjacent storage devices, N ≥ 2, and the device includes:

A distribution module, configured to divide the data that needs to be stored into R slices, and distribute the R slices into the divided sub-resource pools, wherein the number of copies expected to be generated for the data that needs to be stored is less than N, and R ≥ 2;

A copy generation module, configured to generate a copy slice for each of the slices distributed to each sub-resource pool;

A storage module, configured to store the copy slice in the sub-resource pool to which the corresponding slice belongs, and the copy slice corresponds to the slice with the same data.

8. The device for storing data according to claim 7, wherein the storage module is further configured to store the copy slice on a storage device in the sub-resource pool to which the corresponding slice belongs, wherein the storage device on which each copy slice is stored is different from the storage device on which the corresponding slice is stored, and copy slices with the same data are stored on different storage devices.

9. The device for storing data according to claim 7 or 8, further comprising: a dividing module, configured to further divide the divided sub-resource pools into sub-resource pool sets of different levels, each sub-resource pool set including at least one sub-resource pool, wherein the number of copy slices expected to be generated for each slice differs between sub-resource pool sets of different levels;

The copy generation module is further configured to generate, for each slice distributed to a sub-resource pool, a corresponding number of copy slices according to the number of copy slices expected to be generated for each slice at the level of the sub-resource pool set to which the sub-resource pool belongs.

10. The device for storing data according to claim 7 or 8, further comprising: a negative adjustment module, configured to reduce, for any one of the sub-resource pools, the number of storage devices in the sub-resource pool;

A data allocation module, configured to allocate the slices on the removed storage devices to other storage devices in the sub-resource pool, wherein the number of storage devices remaining in the sub-resource pool is greater than the number of copies expected to be generated.

11. The device for storing data according to claim 10, further comprising: a positive adjustment module, configured to increase the number of storage devices in the sub-resource pool for any one of the sub-resource pools;

The data allocation module is also configured to reallocate the copy slices of the slices in the sub-resource pool to the storage devices in the sub-resource pool when the number of storage devices in the sub-resource pool is increased.

12. The device for storing data according to claim 9, wherein the distribution module is further configured to distribute the R slices to sub-resource pools of different levels according to the importance of the data in the R slices.

13. A storage system for storing data, the storage system including a resource pool for storing data, characterized in that the resource pool is logically divided into at least two sub-resource pools, each sub-resource pool includes N adjacent storage devices, N ≥ 2, and the storage system includes a processor, a communication interface, a bus, and storage devices, wherein the processor, the communication interface and all the storage devices communicate with one another through the bus, and wherein:

The processor is configured to divide the data that needs to be stored into R slices, and distribute the R slices into the divided sub-resource pools through the communication interface, wherein the number of copies expected to be generated for the data that needs to be stored is less than N, and R ≥ 2;

The processor is also configured to generate a copy slice for each of the slices distributed to each sub-resource pool, and store the copy slice in the sub-resource pool to which the corresponding slice belongs, wherein the copy slice corresponds to the slice with the same data.

14. The storage system for storing data according to claim 13, wherein the processor is further configured to store the copy slice on a storage device in the sub-resource pool to which the corresponding slice belongs, wherein the storage device on which each copy slice is stored is different from the storage device on which the corresponding slice is stored, and copy slices with the same data are stored on different storage devices.

15. The storage system for storing data according to claim 13 or 14, characterized in that the processor is further configured to further divide the divided sub-resource pools into sub-resource pool sets of different levels, each sub-resource pool set including at least one sub-resource pool, wherein the number of copy slices expected to be generated for each slice differs between sub-resource pool sets of different levels;

The processor is further configured to generate, for each slice distributed to a sub-resource pool, a corresponding number of copy slices according to the number of copy slices expected to be generated for each slice at the level of the sub-resource pool set to which the sub-resource pool belongs.

16. The storage system for storing data according to claim 13 or 14, wherein the processor is further configured to: after storing the copy slice in the sub-resource pool to which the corresponding slice belongs, for any of the sub-resource pools, when the number of storage devices in the sub-resource pool is reduced, allocate, through the communication interface, the slices on the removed storage devices to other storage devices in the sub-resource pool, wherein the number of storage devices remaining in the sub-resource pool is greater than the number of copies expected to be generated.

17. The storage system for storing data according to claim 13 or 14, wherein the processor is further configured to: after storing the copy slice in the sub-resource pool to which the corresponding slice belongs, for any of the sub-resource pools, when the number of storage devices in the sub-resource pool is increased, reallocate, through the communication interface, the copy slices of the slices in the sub-resource pool to the storage devices in the sub-resource pool.

18. The storage system for storing data according to claim 15, wherein the processor is further configured to distribute the R slices to sub-resource pools of different levels according to the importance of the data in the R slices.

19. A computer program product for repairing data, used in a storage system, the storage system including a resource pool for storing data, characterized in that the resource pool is logically divided into at least two sub-resource pools, each sub-resource pool includes N adjacent storage devices, N ≥ 2, and the computer program product includes a computer-readable storage medium storing program code, the program code including instructions for executing the method according to any one of claims 1-6.