CN112667160A - Rapid equalization method and device for mass storage system - Google Patents
- Publication number: CN112667160A
- Application number: CN202011566146.4A
- Authority: CN (China)
- Prior art keywords: disk, physical, node, nodes, data
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and a device for rapidly balancing a mass storage system. The method comprises: dividing the system into a first preset number of logical nodes based on an initial first preset number of physical nodes, where each physical node initially holds a second preset number of physical disks; during the balancing process after capacity expansion or capacity reduction, balancing data among the logical nodes first; and, after the data among the logical nodes is balanced, performing intra-node data balancing on any logical node whose internal data is unbalanced. The invention reduces the load that balancing places on the storage network and the storage nodes.
Description
Technical Field
The invention relates to the technical field of storage, in particular to a method and a device for quickly balancing a mass storage system.
Background
In recent years, as the internet has continued to grow, emerging applications such as edge computing, the Internet of Things, big data analysis, and real-time analysis have needed to store massive amounts of data every day. Data storage has become a central concern in software design, and the volume of stored data calls for high-performance storage. Existing storage systems are generally distributed, and keeping data evenly distributed across such a system is an effective way to deliver high performance. However, the amount of data usually cannot be estimated well when the software is first designed, so storage capacity generally has to be expanded later, and in individual cases it may have to be reduced.
Disclosure of Invention
The embodiment of the specification provides a method and a device for quickly balancing a mass storage system.
In one aspect, a method for fast equalization of a mass storage system provided in an embodiment of the present specification includes: dividing the system into a first preset number of logical nodes based on an initial first preset number of physical nodes, where each physical node initially holds a second preset number of physical disks; during the balancing process after capacity expansion or capacity reduction, balancing data among the logical nodes first; and, after the data among the logical nodes is balanced, performing intra-node data balancing on any logical node whose internal data is unbalanced.
In another aspect, an embodiment of the present specification provides a fast equalization apparatus for a mass storage system, including: a logical node dividing module, configured to divide the system into a first preset number of logical nodes based on an initial first preset number of physical nodes, where each physical node initially holds a second preset number of physical disks; an inter-logical-node data balancing module, configured to balance data among the logical nodes first during the balancing process after capacity expansion or capacity reduction; and an intra-logical-node data balancing module, configured to perform intra-node data balancing on logical nodes whose data is unbalanced.
The embodiment of the invention reduces the load placed on the storage network and the storage nodes when the storage system is rapidly balanced.
Drawings
Fig. 1 is a flow diagram of a method for fast equalization of a mass storage system according to some embodiments of the present disclosure.
Fig. 2 is a block diagram of a fast equalization apparatus for a mass storage system according to some embodiments of the present disclosure.
FIG. 3 is a distribution diagram of an initial storage system of some embodiments of the present description.
FIG. 4 is a distribution diagram of the initial storage system of FIG. 3 after three physical nodes have been expanded.
FIG. 5 is a distribution diagram of the storage system of FIG. 4 after partial disk migration.
FIG. 6 is a distribution diagram of the storage system of FIG. 5 with physical disks added.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all of the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step should fall within the scope of protection of the present specification.
As shown in fig. 1, some embodiments of the present specification provide a method for fast equalization of a mass storage system, including: dividing the system into a first preset number of logical nodes based on an initial first preset number of physical nodes, where each physical node initially holds a second preset number of physical disks; during the balancing process after capacity expansion or capacity reduction, balancing data among the logical nodes first; and, after the data among the logical nodes is balanced, performing intra-node data balancing on any logical node whose internal data is unbalanced.
Further, in some embodiments of the present specification, the step of balancing data among the logical nodes specifically comprises: obtaining the disk usage rate of each logical node and, when the disk usage rates of the logical nodes differ, migrating data from at least one logical node with a higher disk usage rate to at least one logical node with a lower disk usage rate until the disk usage rates of the logical nodes are the same.
Further, in some embodiments of the present specification, in a logical node with a higher disk usage rate, at least one physical disk with a higher usage rate is selected as the disk that data is migrated out of (the source disk); and in a logical node with a lower disk usage rate, at least one physical disk with a lower usage rate is selected as the disk that data is migrated into (the target disk).
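The selection rule above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the `Disk` class, the function names, the `tolerance` parameter, and the use of a simple average over equal-capacity disks are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    global_id: int
    logical_node_id: int
    usage: float  # fraction of capacity in use, 0.0 to 1.0 (hypothetical field)


def logical_node_usage(disks):
    """Average disk usage per logical node, keyed by logical node ID."""
    groups = {}
    for d in disks:
        groups.setdefault(d.logical_node_id, []).append(d.usage)
    return {node: sum(u) / len(u) for node, u in groups.items()}


def pick_migration_pair(disks, tolerance=0.01):
    """Return (source_disk, target_disk), or None if the logical nodes are balanced.

    The source is the fullest disk in the fullest logical node; the target is
    the emptiest disk in the emptiest logical node.
    """
    usage = logical_node_usage(disks)
    busiest = max(usage, key=usage.get)
    idlest = min(usage, key=usage.get)
    if usage[busiest] - usage[idlest] <= tolerance:
        return None
    source = max((d for d in disks if d.logical_node_id == busiest),
                 key=lambda d: d.usage)
    target = min((d for d in disks if d.logical_node_id == idlest),
                 key=lambda d: d.usage)
    return source, target
```

A caller would repeat this selection, moving some data after each call, until `pick_migration_pair` returns `None`.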
Further, in some embodiments of the present specification, the step of performing intra-node data balancing on the logical nodes with data imbalance specifically comprises: obtaining the usage rate of each physical disk of each physical node in each logical node and, within the same physical node, taking a physical disk with a higher usage rate as the source disk and migrating its data to a physical disk with a lower usage rate until the usage rates of all physical disks in that physical node are the same.
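As a rough sketch of this intra-node step, the Python function below plans migrations inside a single physical node. It assumes all disks in the node have equal capacity, so that "same usage rate" means each disk ends at the node average; the function name, the dictionary input, and the `tolerance` parameter are hypothetical.

```python
def balance_within_node(usages, tolerance=1e-9):
    """Plan data moves inside one physical node.

    usages: dict mapping a disk identifier to its usage fraction.
    Returns (moves, final_usages), where each move is (source, target, amount)
    and amount is expressed as a fraction of one disk's capacity.
    """
    usages = dict(usages)  # work on a copy so the caller's data is untouched
    target_level = sum(usages.values()) / len(usages)
    moves = []
    while True:
        src = max(usages, key=usages.get)  # fullest disk in the node
        dst = min(usages, key=usages.get)  # emptiest disk in the node
        if usages[src] - usages[dst] <= tolerance:
            break
        # Move only as much as brings one of the two disks to the node average.
        amount = min(usages[src] - target_level, target_level - usages[dst])
        usages[src] -= amount
        usages[dst] += amount
        moves.append((src, dst, amount))
    return moves, usages
```

For example, `balance_within_node({"a": 0.9, "b": 0.3, "c": 0.3})` plans two moves out of disk `a` and leaves every disk at about 0.5. Because every move stays inside one physical node, none of this traffic needs to cross the storage network.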
In some embodiments of the present description, the physical disks are each provided with a physical node ID, a logical node ID, and a global ID.
With reference to fig. 2, an embodiment of the present invention further provides a fast equalization apparatus for a mass storage system, including: a logical node dividing module, configured to divide the system into a first preset number of logical nodes based on an initial first preset number of physical nodes, where each physical node initially holds a second preset number of physical disks; an inter-logical-node data balancing module, configured to balance data among the logical nodes first during the balancing process after capacity expansion or capacity reduction; and an intra-logical-node data balancing module, configured to perform intra-node data balancing on logical nodes whose data is unbalanced.
In some embodiments of the present specification, the inter-logical-node data balancing module is specifically configured to obtain the disk usage rate of each logical node and, when the disk usage rates of the logical nodes differ, migrate data from at least one logical node with a higher disk usage rate to at least one logical node with a lower disk usage rate until the disk usage rates of the logical nodes are the same.
In some embodiments of the present specification, the inter-logical-node data balancing module is further configured to select, in a logical node with a higher disk usage rate, at least one physical disk with a higher usage rate as the disk that data is migrated out of, and to select, in a logical node with a lower disk usage rate, at least one physical disk with a lower usage rate as the disk that data is migrated into.
In some embodiments of the present specification, the intra-logical-node data balancing module is specifically configured to obtain the usage rate of each physical disk of each physical node in each logical node and, within the same physical node, take a physical disk with a higher usage rate as the source disk and migrate its data to a physical disk with a lower usage rate until the usage rates of all physical disks in that physical node are the same.
In some embodiments of the present specification, the physical disks are each provided with a physical node ID, a logical node ID, and a global ID.
Some embodiments of the present specification also provide an electronic device and a computer-readable storage medium. The electronic device comprises a memory for storing a computer software program, and a processor for implementing the steps of the fast equalization method for a mass storage system when running the computer software program. The computer-readable storage medium stores a computer software program that, when executed, implements the steps of the fast equalization method for a mass storage system.
The following describes the expansion of the storage system and the equalization after the expansion in detail with reference to fig. 3 to 6.
As shown in fig. 3, the storage system initially has three physical nodes (physical node 1, physical node 2, and physical node 3) and can be divided into three logical nodes (logical node 1, logical node 2, and logical node 3) according to those physical nodes. Each physical node contains two disks. As can be seen from fig. 3, each disk carries three identifiers: a physical node ID, a logical node ID, and a global ID. For example, for the two disks in physical node 1, the physical node ID and the logical node ID are both 1, and the global IDs are 1 and 2, respectively. Since the initial data of the storage system is allocated evenly, the current usage of each disk can be assumed to be 80%.
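The three-identifier scheme can be modelled with a small record type. The sketch below is illustrative only: the class and function names are hypothetical, and the sequential global IDs for physical nodes 2 and 3 (3 to 6) are an assumption, since the text only states the IDs for physical node 1.

```python
from collections import defaultdict
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DiskId:
    physical_node_id: int  # which chassis the disk currently sits in
    logical_node_id: int   # which logical node the disk belongs to
    global_id: int         # identifier that is unique across the whole system


def initial_layout(num_nodes=3, disks_per_node=2):
    """Build the starting layout: one logical node per physical node."""
    disks = []
    global_id = 1
    for node in range(1, num_nodes + 1):
        for _ in range(disks_per_node):
            disks.append(DiskId(node, node, global_id))
            global_id += 1  # assumed sequential numbering
    return disks


def group_by_logical_node(disks):
    """Map each logical node ID to the global IDs of its disks."""
    groups = defaultdict(list)
    for d in disks:
        groups[d.logical_node_id].append(d.global_id)
    return dict(groups)


disks = initial_layout()
# Moving a disk into a newly added physical node (here, hypothetically, disk 2
# into physical node 4) changes only its physical node ID.
moved = replace(disks[1], physical_node_id=4)
```

Keeping the logical node ID separate from the physical node ID is what lets a disk change chassis without changing which logical node its data counts toward.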
As shown in fig. 4, the initial storage system is expanded: three physical nodes (physical node 4, physical node 5, and physical node 6) are added to logical node 1, logical node 2, and logical node 3, respectively. Fig. 4 shows the state in which no disks have yet been inserted into the three newly added physical nodes.
As shown in fig. 5, some disks of the original physical nodes (physical node 1, physical node 2, and physical node 3) are moved into the three newly added physical nodes (physical node 4, physical node 5, and physical node 6). At this point every logical node and every physical node has the same usage rate, and the actual capacity has not increased.
As shown in fig. 6, new disks (global IDs 7 to 12) are inserted at the positions marked by the dashed boxes in fig. 5. The logical nodes still have the same usage rate, but disk usage within each physical node is no longer balanced, so intra-node balancing can be started according to the scheme of the embodiment of the present invention. After a period of time, balancing completes and the usage rate of every disk is 40%.
The equalization process is explained in detail below.
With reference to figs. 3 to 6, balancing is a data migration process: part of the data on the original physical disks (global IDs 1 to 6, 80% usage) is migrated to the newly added physical disks (global IDs 7 to 12, 0% usage). The usage rate of each logical node is calculated first, and balancing among the logical nodes takes priority. For example, the disk usage rate of logical node 1 is (80% + 80% + 0% + 0%) / 4 = 40%; the disk usage rates of logical node 2 and logical node 3 are likewise 40%, so the three logical nodes have the same usage rate. If, however, the disk usage rates of the logical nodes turn out to be unbalanced, a disk in a logical node with a higher disk usage rate is chosen as the disk to migrate data out of, and a disk in a logical node with a lower disk usage rate is chosen as the disk to migrate data into. There may be several such disks, and migration continues until every logical node has the same disk usage rate.
After data balancing among the logical nodes is complete, balance within each physical node is checked. For example, the disk usage rate of physical node 1 is (80% + 0%) / 2 = 40%, so the physical disk with global ID 1, at 80% usage, needs to migrate data amounting to 40% of its capacity to the physical disk with global ID 7, at 0% usage, leaving both at 40%. The other physical nodes (physical nodes 2 to 6) perform the corresponding operations.
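The two checks in this worked example can be reproduced with a short Python sketch. The nested `layout` dictionary is an assumed encoding of fig. 6 (logical node ID, then physical node ID, then per-disk usage), in which every physical node holds one original disk at 80% and one new disk at 0%; the function name and `tolerance` parameter are likewise hypothetical.

```python
# Assumed encoding of the state in fig. 6.
layout = {
    1: {1: [0.8, 0.0], 4: [0.8, 0.0]},
    2: {2: [0.8, 0.0], 5: [0.8, 0.0]},
    3: {3: [0.8, 0.0], 6: [0.8, 0.0]},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def plan_phases(layout, tolerance=1e-9):
    """Decide which balancing phases are needed for a given layout.

    Returns (logical_usage, inter_needed, unbalanced_physical_nodes).
    """
    # Phase 1 check: average usage over all disks of each logical node.
    logical_usage = {
        ln: mean(u for disk_usages in pnodes.values() for u in disk_usages)
        for ln, pnodes in layout.items()
    }
    spread = max(logical_usage.values()) - min(logical_usage.values())
    inter_needed = spread > tolerance
    # Phase 2 check: physical nodes whose own disks differ in usage.
    unbalanced = sorted(
        pn
        for pnodes in layout.values()
        for pn, disk_usages in pnodes.items()
        if max(disk_usages) - min(disk_usages) > tolerance
    )
    return logical_usage, inter_needed, unbalanced


logical_usage, inter_needed, unbalanced = plan_phases(layout)
```

With this layout every logical node averages 40%, so no inter-node migration is planned, while all six physical nodes are flagged for intra-node balancing, matching the description above.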
It should be noted that the expansion described above is only one of many possible expansion scenarios. If capacity is expanded in whole multiples, no balancing among the logical nodes is needed, and only balancing within the physical nodes is performed.
In summary, the embodiment of the present invention balances data among the logical nodes first and then balances data within the nodes, ultimately balancing disk data across the storage system. This minimizes use of the storage network; ideally, the storage network is not used at all and only balancing within the physical nodes is performed.
While the process flows described above include operations that occur in a particular order, it should be appreciated that the processes may include more or less operations that are performed sequentially or in parallel (e.g., using parallel processors or a multi-threaded environment). The present invention is described with reference to flowchart illustrations and/or block diagrams of methods according to embodiments of the invention.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method or device comprising the element.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the method embodiment, since it is substantially similar to the apparatus embodiment, the description is simple, and the relevant points can be referred to the partial description of the apparatus embodiment. The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.
Claims (12)
1. A method for fast equalization of a mass storage system, the method comprising:
dividing the system into a first preset number of logical nodes based on an initial first preset number of physical nodes, wherein each physical node initially holds a second preset number of physical disks;
during the balancing process after capacity expansion or capacity reduction, balancing data among the logical nodes first;
and after the data among the logical nodes is balanced, if the data within a logical node is unbalanced, performing intra-node data balancing on the logical node whose data is unbalanced.
2. The method for fast equalization of a mass storage system according to claim 1, wherein
the step of balancing data among the logical nodes specifically comprises:
obtaining the disk usage rate of each logical node and, when the disk usage rates of the logical nodes differ, migrating data from at least one logical node with a higher disk usage rate to at least one logical node with a lower disk usage rate until the disk usage rates of the logical nodes are the same.
3. The method for fast equalization of a mass storage system according to claim 2, wherein
in a logical node with a higher disk usage rate, at least one physical disk with a higher usage rate is selected as the disk that data is migrated out of;
and in a logical node with a lower disk usage rate, at least one physical disk with a lower usage rate is selected as the disk that data is migrated into.
4. The method for fast equalization of a mass storage system according to claim 1, wherein
the step of performing intra-node data balancing on the logical nodes with data imbalance specifically comprises:
obtaining the usage rate of each physical disk of each physical node in each logical node and, within the same physical node, taking a physical disk with a higher usage rate as the source disk and migrating its data to a physical disk with a lower usage rate until the usage rates of all physical disks in that physical node are the same.
5. The method for fast equalization of a mass storage system according to claim 1, wherein
each of the physical disks is provided with a physical node ID, a logical node ID, and a global ID.
6. An apparatus for fast equalization of a mass storage system, comprising:
a logical node dividing module, configured to divide the system into a first preset number of logical nodes based on an initial first preset number of physical nodes, wherein each physical node initially holds a second preset number of physical disks;
an inter-logical-node data balancing module, configured to balance data among the logical nodes first during the balancing process after capacity expansion or capacity reduction;
and an intra-logical-node data balancing module, configured to perform intra-node data balancing on logical nodes whose data is unbalanced.
7. The apparatus for fast equalization of a mass storage system according to claim 6, wherein
the inter-logical-node data balancing module is specifically configured to obtain the disk usage rate of each logical node and, when the disk usage rates of the logical nodes differ, migrate data from at least one logical node with a higher disk usage rate to at least one logical node with a lower disk usage rate until the disk usage rates of the logical nodes are the same.
8. The apparatus for fast equalization of a mass storage system according to claim 7, wherein
the inter-logical-node data balancing module is further configured to select, in a logical node with a higher disk usage rate, at least one physical disk with a higher usage rate as the disk that data is migrated out of, and to select, in a logical node with a lower disk usage rate, at least one physical disk with a lower usage rate as the disk that data is migrated into.
9. The apparatus for fast equalization of a mass storage system according to claim 6, wherein
the intra-logical-node data balancing module is specifically configured to obtain the usage rate of each physical disk of each physical node in each logical node and, within the same physical node, take a physical disk with a higher usage rate as the source disk and migrate its data to a physical disk with a lower usage rate until the usage rates of all physical disks in that physical node are the same.
10. The apparatus for fast equalization of a mass storage system according to claim 6, wherein
each of the physical disks is provided with a physical node ID, a logical node ID, and a global ID.
11. An electronic device, comprising:
a memory for storing a computer software program;
a processor for implementing the steps of the method for fast equalization of a mass storage system according to any one of claims 1 to 5 when running said computer software program.
12. A computer-readable storage medium, characterized in that
the computer-readable storage medium has stored thereon a computer software program which, when executed, performs the steps of the method for fast equalization of a mass storage system according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011566146.4A CN112667160A (en) | 2020-12-25 | 2020-12-25 | Rapid equalization method and device for mass storage system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112667160A | 2021-04-16 |
Family
ID=75409437
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011566146.4A Pending CN112667160A (en) | 2020-12-25 | 2020-12-25 | Rapid equalization method and device for mass storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112667160A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102523251A (en) * | 2011-11-25 | 2012-06-27 | 北京开拓天际科技有限公司 | Cloud storage architecture for processing mass data and cloud storage platform using the same |
CN103327094A (en) * | 2013-06-19 | 2013-09-25 | 成都市欧冠信息技术有限责任公司 | Data distributed type memory method and data distributed type memory system |
CN103761059A (en) * | 2014-01-24 | 2014-04-30 | 中国科学院信息工程研究所 | Multi-disk storage method and system for mass data management |
CN104702691A (en) * | 2015-03-13 | 2015-06-10 | 华为技术有限公司 | Distributed load balancing method and device |
CN104917784A (en) * | 2014-03-10 | 2015-09-16 | 华为技术有限公司 | Data migration method and device, and computer system |
CN109788006A (en) * | 2017-11-10 | 2019-05-21 | 阿里巴巴集团控股有限公司 | Data balancing method, device and computer equipment |
CN110515947A (en) * | 2019-08-23 | 2019-11-29 | 苏州浪潮智能科技有限公司 | A kind of storage system |
CN111913670A (en) * | 2020-08-07 | 2020-11-10 | 北京百度网讯科技有限公司 | Load balancing processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||