CN112463050A

CN112463050A - Storage system capacity expansion method, device, equipment and machine-readable storage medium

Info

Publication number: CN112463050A
Application number: CN202011350171.9A
Authority: CN
Inventors: 张天洁
Original assignee: New H3C Technologies Co Ltd Chengdu Branch
Current assignee: New H3C Technologies Co Ltd Chengdu Branch
Priority date: 2020-11-26
Filing date: 2020-11-26
Publication date: 2021-03-09

Abstract

The present disclosure provides a storage system capacity expansion method, apparatus, device, and machine-readable storage medium. The method includes: calculating and generating migration information according to a preset rule, where the migration information includes the identifiers of the data units on each data hard disk that need to be migrated to each empty hard disk, and the mapping between each data unit to be migrated and its target empty hard disk; migrating each such data unit from its data hard disk to the corresponding empty hard disk according to the migration information; and updating the mapping between data units and hard disks in the mapping table according to the migration result. Some of the empty hard disks originally installed in the newly added storage devices are swapped with some of the data hard disks originally installed in the original storage devices, so that each storage device contains a plurality of empty hard disks and a plurality of data hard disks. This technical solution greatly improves the efficiency of completing data balancing after a storage cluster is expanded.

Description

Storage system capacity expansion method, device, equipment and machine-readable storage medium

Technical Field

The present disclosure relates to the field of communications technologies, and in particular, to a method, an apparatus, a device, and a machine-readable storage medium for capacity expansion of a storage system.

Background

Copy (Replication): a technique used to ensure data durability by maintaining multiple identical copies of the data within a system. Two-copy and three-copy configurations are common.

Erasure Code: a technique used to ensure data durability. It divides the data into N segments and generates M redundant segments according to a certain algorithm; if part of the data is lost (no more than M segments), the lost data can be reconstructed from the remaining segments. This scheme is generally denoted (N, M). Common configurations include (4,2), (6,2), and (12,3).

Data balance: the process of redistributing data that is triggered when storage nodes or hard disks are added to or removed from a storage system. In this document, it refers to redistributing data among the nodes and hard disks during cluster expansion so that the data is balanced.

CRUSH (Controlled Replication Under Scalable Hashing): the data distribution algorithm used by the open-source distributed storage software Ceph. It distributes data to storage nodes and hard disks pseudo-randomly according to a preset policy.

PG (Placement Group): a logical concept in Ceph. Each PG corresponds to a group of objects, and Ceph manages data in units of PGs.

Cluster Topology: describes which nodes make up a storage cluster, which hard disks each node contains, and the number, weight, state, and so on of the nodes and hard disks. The cluster topology information is generated by the control node in the cluster and synchronized to the entire storage cluster, so that every node holds the same cluster topology information.

The rapid increase in data volume is the primary driver for the expansion of distributed storage systems. For example, a client initially purchases 3 distributed storage devices to form a storage cluster. One year later, as data grows, the utilization of the storage cluster reaches 70%, and the client needs to buy 3 more devices to expand capacity. The new devices are added to the original storage cluster and data balancing is performed, so that data is distributed evenly across the new and old devices. When the storage cluster is large, there is currently no method that completes this data balancing efficiently.

Disclosure of Invention

In view of the above, the present disclosure provides a method and an apparatus for capacity expansion of a storage system, an electronic device, and a machine-readable storage medium, so as to solve the problem that data equalization cannot be efficiently performed.

The specific technical scheme is as follows:

The present disclosure provides a storage system capacity expansion method applied to a storage device in a storage cluster, where the storage device includes a plurality of empty hard disks and a plurality of data hard disks. The method includes: calculating and generating migration information according to a preset rule, where the migration information includes the identifiers of the data units on each data hard disk that need to be migrated to each empty hard disk, and the mapping between each data unit to be migrated and its target empty hard disk; migrating each such data unit from its data hard disk to the corresponding empty hard disk according to the migration information; and updating the mapping between data units and hard disks in the mapping table according to the migration result. The empty hard disks were originally installed in newly added storage devices in the storage cluster, and the data hard disks were originally installed in original storage devices in the storage cluster. Some of the empty hard disks originally installed in the newly added storage devices are swapped with some of the data hard disks originally installed in the original storage devices, so that the storage device includes a plurality of empty hard disks and a plurality of data hard disks.

In one technical solution, calculating and generating migration information according to a preset rule includes: calculating and generating the migration information according to the preset rule so that, after migration, data is balanced between the originally empty hard disks and the original data hard disks.

In one technical solution, every storage device in the storage cluster contains the same number of hard disks. Swapping some of the empty hard disks originally installed in the newly added storage devices with some of the data hard disks originally installed in the original storage devices, so that the storage device includes a plurality of empty hard disks and a plurality of data hard disks, includes: after the swap, the numbers of empty hard disks contained in any two storage devices in the storage cluster differ by at most one.

In one technical solution, updating the mapping between data units and hard disks in the mapping table according to the migration result includes: storing the mapping table of each storage device in the storage cluster.

The present disclosure also provides a storage system capacity expansion apparatus applied to a storage device in a storage cluster, where the storage device includes a plurality of empty hard disks and a plurality of data hard disks. The apparatus includes: a calculation module, configured to calculate and generate migration information according to a preset rule, where the migration information includes the identifiers of the data units on each data hard disk that need to be migrated to each empty hard disk, and the mapping between each data unit to be migrated and its target empty hard disk; a migration module, configured to migrate each such data unit from its data hard disk to the corresponding empty hard disk according to the migration information; and a table entry module, configured to update, according to the migration result, the mapping in the mapping table between each data unit and the hard disk on which it resides. The empty hard disks were originally installed in newly added storage devices in the storage cluster, and the data hard disks were originally installed in original storage devices in the storage cluster. Some of the empty hard disks originally installed in the newly added storage devices are swapped with some of the data hard disks originally installed in the original storage devices, so that the storage device includes a plurality of empty hard disks and a plurality of data hard disks.

The present disclosure also provides an electronic device, including a processor and a machine-readable storage medium, where the machine-readable storage medium stores machine-executable instructions capable of being executed by the processor, and the processor executes the machine-executable instructions to implement the aforementioned storage system capacity expansion method.

The present disclosure also provides a machine-readable storage medium storing machine-executable instructions, which when invoked and executed by a processor, cause the processor to implement the aforementioned storage system capacity expansion method.

The technical scheme provided by the disclosure at least brings the following beneficial effects:

The empty hard disks and the data hard disks are physically swapped in advance, manually or automatically, among the storage devices in the storage cluster, so that data balance among the storage devices is achieved quickly. Each storage device then completes data balancing between its own empty hard disks and data hard disks.

Drawings

To illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present disclosure, and those skilled in the art can obtain other drawings from them.

FIG. 1 is a flow chart of a method for capacity expansion of a storage system according to an embodiment of the present disclosure;

FIG. 2 is a flow chart of a method for capacity expansion of a storage system according to an embodiment of the present disclosure;

FIG. 3 is a flowchart illustrating a method for capacity expansion of a storage system according to an embodiment of the disclosure;

FIG. 4 is a block diagram of a storage system capacity expansion device according to an embodiment of the disclosure;

FIG. 5 is a hardware block diagram of an electronic device in one embodiment of the disclosure;

FIG. 6 is a networking diagram of storage system capacity expansion according to the present disclosure.

Detailed Description

The terminology used in the embodiments of the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein is meant to encompass any and all possible combinations of one or more of the associated listed items.

It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information in the embodiments of the present disclosure, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. Depending on the context, moreover, the word "if" as used may be interpreted as "at … …" or "when … …" or "in response to a determination".

The development of cloud computing, mobile computing, social media, and big data has led to explosive growth in data. Traditional storage systems struggle to meet this explosively growing demand, and new storage technology is urgently needed. Distributed storage based on the idea of Software Defined Storage (SDS) has emerged in response. Distributed storage generally runs on commodity general-purpose storage servers, uses storage management software to virtualize the hard disks on the servers into a storage resource pool for unified management, and can provide converged storage services including block storage, file storage, and object storage.

Thanks to its scalability, cost-effectiveness, high data reliability, and service flexibility, distributed storage is developing rapidly and is widely used across industries.

As shown in fig. 4, the rapid increase in data volume is the main driving force for the capacity expansion of distributed storage systems. For example, a client initially purchases 3 distributed storage devices to form a storage cluster. One year later, as data grows, the utilization of the storage cluster reaches 70%, and the client needs to buy 3 more devices to expand capacity. The new devices are added to the original storage cluster and data balancing is performed, so that data is distributed evenly across the new and old devices.

The data balancing process transfers data from the original cluster nodes to the newly added nodes over the network: data is read from the hard disks of the original nodes and then written to the hard disks of the new nodes. This process occupies a large amount of network bandwidth and is also extremely time-consuming. For example, consider an original 6-node cluster with 36 hard disks per node and 8 TB per hard disk. When 70% of the original cluster's capacity has been used, 6 new nodes are added, with the same hard disk configuration as the original nodes. Based on the typical data balancing capability of commercial storage in the industry, balancing 1 TB of data takes 1 hour. The amount of data to be balanced is then about 604 TB (6 × 36 × 8 × 70% × 0.5), so balancing takes about 604 hours (about 25 days).
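The arithmetic in this example can be checked with a short Python sketch. The function name and parameter names are hypothetical; the numbers are the ones given in the paragraph above, including the assumed 1 hour per TB.

```python
def network_rebalance_estimate(nodes, disks_per_node, disk_tb,
                               utilization, moved_fraction, hours_per_tb):
    """Return (data to move in TB, hours needed) for network-based balancing.

    moved_fraction is the share of stored data that must leave the old
    nodes; it is 0.5 when the node count doubles, as in the example.
    """
    data_tb = nodes * disks_per_node * disk_tb * utilization * moved_fraction
    return data_tb, data_tb * hours_per_tb


# Figures from the example: 6 nodes, 36 disks each, 8 TB per disk, 70% used,
# cluster doubled in size, 1 hour per TB.
data_tb, hours = network_rebalance_estimate(6, 36, 8, 0.70, 0.5, 1.0)
print(f"{data_tb:.1f} TB to move, {hours:.1f} hours, {hours / 24:.1f} days")
```

The result agrees with the roughly 604 TB and 25 days stated above.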

During such a long data balancing process, the frequent reading and writing of hard disks and the heavy use of network bandwidth have a large impact on the system, which cannot reach its optimal performance.

In a distributed system, data reliability and data balance are basic requirements.

To ensure data reliability, data is generally stored using copies or erasure codes. When a hard disk or a node fails, the redundant data stored in the cluster is used to regenerate the data of the failed hard disk or node on other hard disks or nodes in the cluster.

To ensure balance, a certain data layout method must be followed when writing data.

As shown in fig. 5, the original file is first cut into fixed-length "file blocks" (for example 4 MB, configurable), and each file block is given a unique id, called the "object block id", generated from the original file and the block's offset in the file. Each object block id is mapped to a "data management group" by a hash algorithm (for example, a modulo operation). A data management group is a logical concept, and different data management groups are distinguished by number. The data management group has different names in different distributed storage implementations (for example, PG in Ceph and Partition in Huawei FusionStorage), and its main role is to manage object blocks uniformly (depending on the storage scale, each data management group typically contains thousands to tens of thousands of object blocks). Each data management group is then mapped to a group of hard disks, either by a pseudo-random algorithm (common choices include consistent hashing and CRUSH) or by a mapping table. This mapping step uses the cluster topology information. For example, in a 2-copy configuration a group is mapped to 2 hard disks; in a (4,2) erasure code configuration it is mapped to 6 hard disks. FIG. 5 is a schematic diagram of each data management group being mapped to 2 hard disks in a 2-copy configuration.

In the process "object block id → data management group → (hard disk 1, hard disk 2, …, hard disk K)", the first step applies a pseudo-random algorithm and the second step applies a pseudo-random algorithm or a mapping table, which ensures that the data is ultimately distributed approximately uniformly across the storage cluster.
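The two-step mapping can be sketched in Python as follows. This is a minimal illustration, not the layout algorithm of Ceph or any other product: the function names, the SHA-256 hash, and the round-robin mapping table are assumptions chosen only to keep the example self-contained.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB file blocks, as in the example above


def object_block_id(file_name: str, offset: int) -> str:
    """Derive an object block id from the file and the block's offset."""
    return f"{file_name}:{offset // BLOCK_SIZE}"


def to_group(block_id: str, num_groups: int) -> int:
    """Step 1: hash the object block id, then take it modulo the group count."""
    digest = hashlib.sha256(block_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_groups


def build_group_table(num_groups: int, disks: list, k: int) -> dict:
    """Step 2 (mapping-table variant): give every group K distinct disks.

    Round-robin assignment is an assumption made for this sketch; it keeps
    the number of groups per disk within one of each other when k <= len(disks).
    """
    return {g: [disks[(g * k + i) % len(disks)] for i in range(k)]
            for g in range(num_groups)}


def locate(file_name: str, offset: int, num_groups: int, table: dict) -> list:
    """Resolve a byte offset in a file to the K disks that hold its block."""
    return table[to_group(object_block_id(file_name, offset), num_groups)]


disks = [f"disk{i}" for i in range(6)]
table = build_group_table(8, disks, k=2)   # 2-copy configuration
print(locate("a.bin", 5 * 1024 * 1024, 8, table))
```

A real system would also use the cluster topology so that the K disks of a group sit on different nodes; that constraint is left out here for brevity.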

From the perspective of the data management groups, the object blocks contained in each data management group are written into K hard disks; from the perspective of the hard disk, each hard disk actually carries several (in practical applications, several hundreds) data management groups (each data management group contains several object blocks).

During cluster capacity expansion, the mapping from object block id to data management group remains unchanged. However, because the cluster topology changes, the mapping from data management group to (hard disk 1, hard disk 2, …, hard disk K) changes: the same data management group selects a new group of K hard disks to carry its existing data, and this is what triggers data balancing.

In view of the above, the present disclosure provides a method and an apparatus for capacity expansion of a storage system, an electronic device, and a machine-readable storage medium to solve the above technical problem.

The specific technical scheme is as follows.

In an embodiment, the present disclosure provides a storage system capacity expansion method applied to a storage device in a storage cluster, where the storage device includes a plurality of empty hard disks and a plurality of data hard disks. The method includes: calculating and generating migration information according to a preset rule, where the migration information includes the identifiers of the data units on each data hard disk that need to be migrated to each empty hard disk, and the mapping between each data unit to be migrated and its target empty hard disk; migrating each such data unit from its data hard disk to the corresponding empty hard disk according to the migration information; and updating the mapping between data units and hard disks in the mapping table according to the migration result. The empty hard disks were originally installed in newly added storage devices in the storage cluster, and the data hard disks were originally installed in original storage devices in the storage cluster. Some of the empty hard disks originally installed in the newly added storage devices are swapped with some of the data hard disks originally installed in the original storage devices, so that the storage device includes a plurality of empty hard disks and a plurality of data hard disks.

Specifically, as shown in fig. 1, the method comprises the following steps:

Step S11: calculate and generate migration information according to a preset rule.

The migration information includes the identifiers of the data units on each data hard disk that need to be migrated to each empty hard disk, and the mapping between each data unit to be migrated and its target empty hard disk.

Step S12: migrate each data unit to be migrated from its data hard disk to the corresponding empty hard disk according to the migration information.

Step S13: update the mapping between data units and hard disks in the mapping table according to the migration result.

The empty hard disks were originally installed in newly added storage devices in the storage cluster, and the data hard disks were originally installed in original storage devices in the storage cluster. Some of the empty hard disks originally installed in the newly added storage devices are swapped with some of the data hard disks originally installed in the original storage devices, so that the storage device includes a plurality of empty hard disks and a plurality of data hard disks.
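Steps S12 and S13 can be pictured with a small Python sketch. The data structures are assumptions for illustration only: each disk is a set of data unit identifiers, the mapping table maps a data unit to the one disk in this storage device that holds it, and the migration information (the output of step S11) maps each data unit to be moved to its target empty disk.

```python
def apply_migration(disks: dict, mapping_table: dict, migration_info: dict) -> None:
    """Move data units to their target empty disks and update the mapping table.

    disks          : {disk name: set of data unit ids}        (hypothetical)
    mapping_table  : {data unit id: disk name holding it}     (hypothetical)
    migration_info : {data unit id: target empty disk name}   (from step S11)
    """
    for unit, target in migration_info.items():
        source = mapping_table[unit]
        disks[target].add(unit)        # S12: write the unit to the empty disk
        disks[source].remove(unit)     # S12: release it on the data disk
        mapping_table[unit] = target   # S13: record the new location


disks = {"data0": {"u1", "u2", "u3", "u4"}, "empty0": set()}
mapping_table = {unit: "data0" for unit in disks["data0"]}
migration_info = {"u3": "empty0", "u4": "empty0"}

apply_migration(disks, mapping_table, migration_info)
print(sorted(disks["data0"]), sorted(disks["empty0"]))
```

In this sketch the mapping table entry changes only after the unit has been written to its target, so a lookup made part-way through still points at a disk that holds the data.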

In an embodiment, calculating and generating migration information according to a preset rule includes: calculating and generating the migration information according to the preset rule so that, after migration, data is balanced between the originally empty hard disks and the original data hard disks.

In one embodiment, every storage device in the storage cluster contains the same number of hard disks. Swapping some of the empty hard disks originally installed in the newly added storage devices with some of the data hard disks originally installed in the original storage devices, so that the storage device includes a plurality of empty hard disks and a plurality of data hard disks, includes: after the swap, the numbers of empty hard disks contained in any two storage devices in the storage cluster differ by at most one.

In one embodiment, updating the mapping between data units and hard disks in the mapping table according to the migration result includes: storing the mapping table of each storage device in the storage cluster.

As shown in fig. 6, the first step is to determine how many hard disks each original node (original storage device) of the cluster needs to move, so that the hard disks containing data are evenly distributed among the new and old nodes. For example, the original cluster has 3 nodes, each with 36 hard disks (old disks, with data). Three nodes (newly added storage devices) are added, each also with 36 hard disks (new disks, no data). There are therefore 108 old disks in total (3 × 36) to be distributed over 6 nodes (3 + 3), so each node is allocated 18 old disks (108 / 6). Thus 18 (36 - 18) hard disks need to be unplugged from each of the original 3 nodes.

Second, based on the preceding calculation, the hard disks (containing data) are pulled out of the old nodes and inserted into the new nodes. New hard disks (no data) are then inserted into the vacated bays on each node. The layout before and after the hard disks are moved is illustrated in fig. 6.

It should be noted that if the calculated number of disks to move is not an integer, the integer part may be obtained by rounding or by truncation, so that the existing data disks are divided among the nodes as evenly as possible. If the newly added nodes and the original nodes have different numbers of hard disks or different hard disk capacities, the number of hard disks to move is determined according to a data balancing (or weighted uniformity) strategy. Once the number of hard disks to move from a node has been determined, it does not matter which specific disks are moved; this can be decided according to what is convenient during the capacity expansion operation.
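The disk-count calculation above can be sketched in Python. The function name is hypothetical, and the way a remainder is spread over the nodes (one extra data disk on the first few nodes) is only one possible reading of "as evenly as possible"; the sketch also assumes all nodes have the same number of equally sized disks.

```python
def plan_disk_swap(old_nodes: int, new_nodes: int, disks_per_node: int):
    """Return (data disks each node should end up with, disks to unplug per old node).

    Nodes 0 .. old_nodes-1 are the original nodes; the rest are new nodes.
    """
    total_data_disks = old_nodes * disks_per_node
    total_nodes = old_nodes + new_nodes
    base, extra = divmod(total_data_disks, total_nodes)
    # Spread any remainder so that targets differ by at most one disk.
    targets = [base + 1 if i < extra else base for i in range(total_nodes)]
    unplug = [disks_per_node - targets[i] for i in range(old_nodes)]
    return targets, unplug


# Example from the text: 3 old nodes + 3 new nodes, 36 disks each.
targets, unplug = plan_disk_swap(3, 3, 36)
print(targets, unplug)
```

For the example in the text this gives 18 data disks per node and 18 disks to unplug from each original node. The data disks unplugged from the old nodes always equal the data disks the new nodes receive, so every vacated bay can be filled with an empty disk taken from a new node.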

In addition, the hard disks should be moved one at a time (each hard disk is pulled out of the old node and inserted into the new node before the next one is moved). This ensures that front-end IO service is not affected. (While hard disks are being moved, the system can be placed in a temporary maintenance state, so that data written during that period does not cause IO to hang because of degraded data integrity.)

After the steps are completed, the new cluster completes data balance among the nodes. In this process, data equalization does not take up any network bandwidth. The cluster front end IO traffic is also unaffected.

To complete data equalization in a node, data remapping in the node needs to be completed, so that data is migrated in the node. The mapping relationship of "object block id is mapped to data management group, and data management group is mapped to (hard disk 1, hard disk 2, …, hard disk K)" needs to be recalculated. The change in cluster topology triggers this recalculation process.

First, in each storage node, the total number of data management groups carried on all of its hard disks is counted. Second, this total is divided by the number of hard disks in the storage node to obtain the average number of data management groups that each hard disk should carry. Third, for each hard disk carrying more data management groups than the average, the excess data management groups are moved to other hard disks.

Through the above operations, within each storage node the number of data management groups carried by each hard disk is essentially the same (differing by at most 1). The control node is only a logical function node; in an actual deployment it can be the same physical device as a storage node.

The control node distributes the mapping table to each storage node, which triggers data balancing within the storage nodes.
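The three counting steps can be sketched in Python as a planner that produces the list of moves. This is an illustrative reading of the steps, not the disclosed implementation: the function name and the choice of which groups to move are assumptions, and the sketch assumes each data management group appears on at most one disk of the node.

```python
def plan_intra_node_balance(groups_on_disk: dict) -> list:
    """Plan moves so that per-disk group counts within a node differ by at most 1.

    groups_on_disk: {disk name: list of data management group ids} (hypothetical)
    Returns a list of (group id, source disk, target disk) tuples.
    """
    total = sum(len(groups) for groups in groups_on_disk.values())   # step 1
    base, extra = divmod(total, len(groups_on_disk))                  # step 2
    # The fullest disks keep the 'base + 1' targets, so fewer groups move.
    order = sorted(groups_on_disk, key=lambda d: len(groups_on_disk[d]),
                   reverse=True)
    target = {d: base + (1 if i < extra else 0) for i, d in enumerate(order)}

    surplus = []                                                      # step 3
    for disk in order:
        excess = len(groups_on_disk[disk]) - target[disk]
        if excess > 0:
            surplus.extend((g, disk) for g in groups_on_disk[disk][-excess:])

    moves = []
    for disk in order:
        deficit = target[disk] - len(groups_on_disk[disk])
        for _ in range(max(deficit, 0)):
            group, source = surplus.pop()
            moves.append((group, source, disk))
    return moves


# A node after the swap: 18 data disks with 200 groups each, 18 empty disks.
layout = {f"old{i}": [f"g{i}-{j}" for j in range(200)] for i in range(18)}
layout.update({f"new{i}": [] for i in range(18)})
moves = plan_intra_node_balance(layout)
print(len(moves))
```

The returned moves correspond to the migration information: each entry names a data unit and the empty disk it should be migrated to.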

Similarly, take a 6-node storage cluster (36 hard disks per node, 8 TB per disk, each disk 70% full) as an example. When it is expanded by 6 nodes, 108 (18 × 6) hard disks are moved between nodes; the overall moving time does not exceed 2 hours, and hard disk deployment is completed within 3 hours. The intra-node balancing time depends on the data transfer speed of the hard disks in the node. Assume each hard disk balances data at a read-write rate of 50 MB/s (the single-disk read-write rate of an ordinary SATA hard disk is not less than 150 MB/s; the remaining bandwidth is reserved for front-end IO service). With the 36 disks reading and writing in parallel, data balancing takes about 31 hours (8 × 1,000,000 × 70% / 50 / 3600). Adding the 3 hours needed to move and deploy the hard disks, the total is 34 hours. Compared with the hundreds of hours required by the original method, the efficiency of completing data balancing is greatly improved.
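The time estimate can be reproduced with a few lines of Python. The function name is hypothetical; the formula and inputs are the ones stated in the paragraph above, and the 604.8-hour baseline is the network-based figure computed earlier for the same cluster.

```python
def intra_node_balance_hours(disk_tb: float, utilization: float,
                             rate_mb_per_s: float) -> float:
    """Hours for one disk to move its data at the given rate (disks run in parallel)."""
    data_mb = disk_tb * 1_000_000 * utilization   # 1 TB taken as 1,000,000 MB
    return data_mb / rate_mb_per_s / 3600


balance_hours = intra_node_balance_hours(8, 0.70, 50)
total_hours = 3 + balance_hours                   # plus 3 hours to move and deploy disks
baseline_hours = 6 * 36 * 8 * 0.70 * 0.5 * 1.0    # network-based balancing, 1 h per TB

print(f"intra-node balancing: {balance_hours:.1f} h, total: {total_hours:.1f} h")
print(f"network-based baseline: {baseline_hours:.1f} h")
```

Under these assumptions the disk-swap approach needs roughly 34 hours, against roughly 600 hours for balancing over the network.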

In an embodiment, as shown in fig. 2, the present disclosure also provides a storage system capacity expansion apparatus applied to a storage device in a storage cluster, where the storage device includes a plurality of empty hard disks and a plurality of data hard disks. The apparatus includes: a calculation module 21, configured to calculate and generate migration information according to a preset rule, where the migration information includes the identifiers of the data units on each data hard disk that need to be migrated to each empty hard disk, and the mapping between each data unit to be migrated and its target empty hard disk; a migration module 22, configured to migrate each such data unit from its data hard disk to the corresponding empty hard disk according to the migration information; and a table entry module 23, configured to update, according to the migration result, the mapping in the mapping table between each data unit and the hard disk on which it resides. The empty hard disks were originally installed in newly added storage devices in the storage cluster, and the data hard disks were originally installed in original storage devices in the storage cluster. Some of the empty hard disks originally installed in the newly added storage devices are swapped with some of the data hard disks originally installed in the original storage devices, so that the storage device includes a plurality of empty hard disks and a plurality of data hard disks.

The device implementation is the same or similar to the method implementation and is not described again.

In an embodiment, the present disclosure provides an electronic device including a processor and a machine-readable storage medium, where the machine-readable storage medium stores machine-executable instructions executable by the processor, and the processor executes the machine-executable instructions to implement the foregoing storage system capacity expansion method. At the hardware level, a schematic diagram of the hardware architecture may be as shown in fig. 3.

In one embodiment, the present disclosure provides a machine-readable storage medium having stored thereon machine-executable instructions that, when invoked and executed by a processor, cause the processor to implement the aforementioned storage system capacity expansion method.

Here, a machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions and data. For example, the machine-readable storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk or a DVD), a similar storage medium, or a combination thereof.

The systems, devices, modules or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.

For convenience of description, the above devices are described as being divided into various units by function, and the units are described separately. Of course, when practicing the disclosure, the functions of the various units may be implemented in one or more pieces of software and/or hardware.

As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.

The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only an embodiment of the present disclosure, and is not intended to limit the present disclosure. Various modifications and variations of this disclosure will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure should be included in the scope of the claims of the present disclosure.

Claims

1. A storage system capacity expansion method, applied to a storage device in a storage cluster, wherein the storage device comprises a plurality of empty hard disks and a plurality of data hard disks, the method comprising:

calculating and generating migration information according to a preset rule, wherein the migration information comprises the codes of the data units in each data hard disk that need to be migrated to each empty hard disk, and the mapping relationships between the data units that need to be migrated and the empty hard disks to which they are to be migrated;

migrating, according to the migration information, each data unit that needs to be migrated from its data hard disk into the corresponding empty hard disk;

updating, according to the migration result, the mapping relationship between the data units and the hard disks in the mapping table;

2. The method according to claim 1, wherein the calculating and generating migration information according to a preset rule comprises:

calculating and generating the migration information according to the preset rule such that, after migration, data is balanced between the originally empty hard disks and the original data hard disks.
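
By way of illustration only, the three steps of claim 1 together with the balancing goal of claim 2 can be sketched as follows. The claims do not specify the preset rule; this sketch assumes a simple one in which every hard disk ends up with roughly the same number of data units, and it assumes the data hard disks start out roughly balanced. The names `plan_migration` and `apply_migration`, and the representation of the mapping table as a dictionary from data unit code to hard disk, are hypothetical.

```python
from collections import defaultdict


def plan_migration(mapping_table, data_disks, empty_disks):
    """Return {unit_code: (source_disk, target_disk)} for the units to move.

    Assumed preset rule (not taken from the disclosure): after migration every
    disk holds either floor(total / n) or ceil(total / n) data units.
    """
    units_on = defaultdict(list)
    for unit, disk in sorted(mapping_table.items()):
        units_on[disk].append(unit)

    all_disks = list(data_disks) + list(empty_disks)
    base, extra = divmod(len(mapping_table), len(all_disks))
    # The first `extra` disks are allowed one more unit than the rest.
    quota = {d: base + (1 if i < extra else 0) for i, d in enumerate(all_disks)}

    # Units a data disk holds beyond its quota are candidates for migration.
    surplus = []
    for disk in data_disks:
        over = len(units_on[disk]) - quota[disk]
        if over > 0:
            surplus.extend((unit, disk) for unit in units_on[disk][-over:])

    # Fill each empty disk up to its quota from the surplus units.
    migration = {}
    candidates = iter(surplus)
    for target in empty_disks:
        for _ in range(quota[target]):
            unit, source = next(candidates)
            migration[unit] = (source, target)
    return migration


def apply_migration(mapping_table, migration):
    """Update the mapping table in place to reflect the migration result."""
    for unit, (source, target) in migration.items():
        if mapping_table[unit] != source:
            raise ValueError(f"unit {unit} is not on disk {source}")
        # A real system would copy the unit's data from source to target here.
        mapping_table[unit] = target
```

For example, with eight data units spread evenly over two data hard disks and two empty hard disks added, the plan moves two units from each data hard disk, and after `apply_migration` every hard disk holds two units.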

3. The method of claim 1, wherein each storage device in the storage cluster comprises the same number of hard disks;

the empty hard disks are originally installed in a newly added storage device in the storage cluster, the data hard disks are originally installed in an original storage device in the storage cluster, and some of the empty hard disks originally installed in the newly added storage device are swapped with some of the data hard disks originally installed in the original storage device, so that the storage device comprises a plurality of empty hard disks and a plurality of data hard disks, wherein:

after the swap, the numbers of empty hard disks included in any two storage devices in the storage cluster differ by at most one.
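
The constraint in claim 3 can be illustrated with a short sketch. It assumes, hypothetically, that every original storage device initially holds only data hard disks and every newly added storage device holds only empty hard disks; the function name `plan_disk_swaps` and its parameters are illustrative and do not come from the disclosure.

```python
def plan_disk_swaps(original_devices, new_devices, disks_per_device):
    """Return (target_empty_counts, swaps).

    Each swap is a pair (new_device, original_device): one empty hard disk
    from the new device trades places with one data hard disk from the
    original device.
    """
    # List new devices first so any remainder stays with them (fewer swaps).
    devices = list(new_devices) + list(original_devices)
    total_empty = len(new_devices) * disks_per_device
    base, extra = divmod(total_empty, len(devices))
    target = {d: base + (1 if i < extra else 0) for i, d in enumerate(devices)}

    # Each original device must receive target[d] empty disks; each new
    # device must give away everything above its own target.
    receivers = [d for d in original_devices for _ in range(target[d])]
    donors = [d for d in new_devices
              for _ in range(disks_per_device - target[d])]
    swaps = list(zip(donors, receivers))
    return target, swaps
```

Because the targets are computed with `divmod`, any two devices differ by at most one empty hard disk, and the number of disks given away by the new devices equals the number received by the original devices.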

4. The method of claim 1, wherein updating the mapping relationship between the data units and the hard disks in the mapping table according to the migration result comprises:

storing a mapping table of each storage device in the storage cluster.

5. A storage system capacity expansion apparatus, applied to a storage device in a storage cluster, wherein the storage device comprises a plurality of empty hard disks and a plurality of data hard disks, the apparatus comprising:

a calculation module, configured to calculate and generate migration information according to a preset rule, wherein the migration information comprises the codes of the data units in each data hard disk that need to be migrated to each empty hard disk, and the mapping relationships between the data units that need to be migrated and the empty hard disks to which they are to be migrated;

a migration module, configured to migrate, according to the migration information, each data unit that needs to be migrated from its data hard disk into the corresponding empty hard disk;

a table item module, configured to update, according to the migration result, the mapping relationship in the mapping table between each data unit and the hard disk in which the data unit is located;

6. The apparatus according to claim 5, wherein the calculating and generating migration information according to a preset rule comprises:

7. The apparatus of claim 5, wherein each storage device in the storage cluster comprises a same number of hard disks;

8. The apparatus of claim 5, wherein updating the mapping relationship between the data units and the hard disks in the mapping table according to the migration result comprises:

storing a mapping table of each storage device in the storage cluster.

9. An electronic device, comprising: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor to perform the method of any one of claims 1 to 4.

10. A machine-readable storage medium having stored thereon machine-executable instructions which, when invoked and executed by a processor, cause the processor to implement the method of any of claims 1-4.