CN117850680A - Optimization method for data equalization in distributed storage system - Google Patents
- Publication number
- CN117850680A (application number CN202311722336.4A)
- Authority
- CN
- China
- Prior art keywords
- disk
- storage system
- capacity
- expected
- migration
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0604—Improving or facilitating administration, e.g. storage management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0647—Migration mechanisms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
- G06F3/0674—Disk device
- G06F3/0676—Magnetic disk device
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention belongs to the technical field of computer storage and discloses an optimization method for data equalization in a distributed storage system. The method comprises: calculating the total capacity of the storage system; calculating the number of PGs that each disk in the storage system is expected to carry; calculating the total used capacity of the storage system; calculating the average usage rate of the storage system and the expected used capacity of each disk; analyzing each disk in the storage system, judging whether it is marked as an insufficient disk, and calculating, for each insufficient disk, the insufficient PG number, the insufficient capacity, and the data amount of each PG expected to migrate in; and analyzing each disk in the storage system, judging whether it is marked as a migration disk, analyzing each PG on the migration disk, and judging whether a migration instruction is generated. When data equalization is carried out, disk capacity usage becomes more balanced and the effective utilization of disk capacity is improved.
Description
Technical Field
The invention relates to the technical field of computer storage, in particular to an optimization method for data equalization in a distributed storage system.
Background
The resources of a storage system consist of a large number of disks, and data is managed as follows: (1) data is organized into chunks; (2) each chunk is hash-mapped to a PG (placement group) in the logical layer; (3) each PG is mapped to a fixed disk through disk selection. While the storage system is running, disks are inevitably added and removed, so the number of PGs distributed on the disks becomes unbalanced, the amount of data on the disks becomes unbalanced, and a large amount of storage space is wasted;
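The three-stage mapping described above (chunk, then PG, then disk) can be illustrated with a minimal Python sketch. The hash function, the PG count and the `pg_to_disk` table are assumptions chosen for illustration only; the source does not say which hash or which disk-selection procedure the storage system uses.

```python
import hashlib

def chunk_to_pg(chunk_id: str, pg_count: int) -> int:
    """Hash a chunk identifier onto one of pg_count placement groups."""
    digest = hashlib.sha256(chunk_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % pg_count

# Hypothetical fixed PG -> disk table, standing in for the result of disk selection.
pg_to_disk = {0: "disk1", 1: "disk1", 2: "disk2", 3: "disk3"}

def locate(chunk_id: str) -> str:
    """Return the disk that stores a chunk, via its placement group."""
    return pg_to_disk[chunk_to_pg(chunk_id, len(pg_to_disk))]
```

Because the PG-to-disk table is fixed, adding or removing a disk changes how many PGs each disk holds, which is the imbalance the method targets.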
prior patent application publication No. CN114237520A discloses a Ceph cluster data equalization method and system, in which a primary placement group evenly divides the data blocks to be equalized to generate a plurality of equalization lists; the primary placement group keeps one equalization list and sends the others to the secondary placement groups; finally, the primary and secondary placement groups each send the data blocks they store to the newly added data storage device according to the data block names on their equalization lists. Because a plurality of placement groups cooperate on the equalization operation within a unit of time, data equalization efficiency is greatly improved;
however, when the above technology and other prior art perform data equalization, they do not consider the amount of data each PG actually carries, and do not use the data amount of each PG as a condition for selecting the target PG. As a result, the space of each disk is not used effectively, some disks carry more data and bear a heavier read-write load, and the performance of the whole storage system is reduced;
in view of the above, the present invention proposes an optimization method for data equalization in a distributed storage system to solve the above problems.
Disclosure of Invention
To overcome the defects of the prior art and achieve the above purpose, the invention provides the following technical scheme: an optimization method for data equalization in a distributed storage system, comprising:
step S1: calculating the total capacity XZR of the storage system;
step S2: calculating the number of PGs $CPS_i$ that each disk in the storage system is expected to carry;
step S3: calculating the total used capacity XZS of the storage system;
step S4: calculating the average usage rate PJL of the storage system and the expected used capacity $CZS_i$ of each disk;
step S5: analyzing each disk in the storage system, judging whether the disk is marked as an insufficient disk, and calculating, for each insufficient disk, the insufficient PG number $BPS_i$, the insufficient capacity $BZR_i$ and the data amount $RPS_i$ of each PG expected to migrate in;
step S6: analyzing each disk in the storage system, judging whether the disk is marked as a migration disk, analyzing each PG in the migration disk, and judging whether a migration instruction is generated; marking the PG corresponding to the migration instruction as a migration PG, and moving the migration PG into the insufficient disk.
Further, the method for calculating the total capacity XZR of the storage system comprises:
collecting the capacity information $CPR_i$ of each disk in the storage system through an API, and summing it over all disks:
$$XZR = \sum_{i=1}^{m} CPR_i$$
wherein i denotes the i-th disk, m is the total number of disks in the storage system, and $i \in \{1, 2, \dots, m\}$.
Further, the number of PGs $CPS_i$ that each disk in the storage system is expected to carry is calculated as:
$$CPS_i = PGZ \times \frac{CPR_i}{XZR}$$
wherein PGZ is the total number of PGs in the storage system.
Further, the method for calculating the total used capacity XZS of the storage system comprises:
collecting the used capacity information $SPR_i$ of each disk in the storage system through an API, and summing it over all disks:
$$XZS = \sum_{i=1}^{m} SPR_i$$
Further, the average usage rate PJL is calculated as:
$$PJL = \frac{XZS}{XZR} \times 100\%$$
Further, the expected used capacity of each disk is calculated as:
$$CZS_i = CPR_i \times PJL$$
Further, the method for judging whether a disk is marked as an insufficient disk comprises:
if the number of PGs carried by the disk is smaller than the expected PG number $CPS_i$ and the used capacity $SPR_i$ is smaller than the expected used capacity $CZS_i$, marking the disk as an insufficient disk;
if the number of PGs carried by the disk is greater than or equal to the expected PG number $CPS_i$, or the used capacity $SPR_i$ is greater than or equal to the expected used capacity $CZS_i$, not marking the disk as an insufficient disk.
Further, the insufficient PG number $BPS_i$ is calculated as:
$$BPS_i = CPS_i - PGC_i$$
wherein $PGC_i$ is the number of PGs carried by the disk;
the insufficient capacity $BZR_i$ is calculated as:
$$BZR_i = CZS_i - SPR_i$$
the data amount $RPS_i$ of each PG expected to migrate in is calculated as:
$$RPS_i = \frac{BZR_i}{BPS_i}$$
Further, the method for judging whether a disk is marked as a migration disk comprises:
if the number of PGs carried by the disk is larger than the expected PG number $CPS_i$ and the used capacity $SPR_i$ is larger than the expected used capacity $CZS_i$, marking the disk as a migration disk;
if the number of PGs carried by the disk is smaller than or equal to the expected PG number $CPS_i$, or the used capacity $SPR_i$ is smaller than or equal to the expected used capacity $CZS_i$, not marking the disk as a migration disk.
Further, the method for judging whether to generate a migration instruction comprises:
subtracting the data amount $RPS_i$ expected to migrate in from the data amount of each PG in the migration disk to obtain a difference;
if the absolute value of the difference is smaller than the difference threshold, generating a migration instruction;
if the absolute values of all the differences are greater than or equal to the difference threshold, generating no migration instruction.
The technical effects and advantages of the optimization method for data equalization in a distributed storage system according to the invention are as follows:
1. When data equalization is carried out, disk capacity usage is more balanced, and the effective utilization of disk capacity is improved.
2. The read-write load of each disk is more balanced, and the performance of the whole storage system is improved.
Drawings
FIG. 1 is a schematic diagram of an optimization method for data equalization in a distributed storage system according to embodiment 1 of the present invention;
fig. 2 is a schematic diagram of PG migration in embodiment 1 of the present invention;
fig. 3 is a schematic structural diagram of an electronic device according to embodiment 2 of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to fig. 1, the method for optimizing data equalization in a distributed storage system according to the present embodiment includes:
S1: calculating the total capacity XZR of the storage system;
the method for calculating the total capacity XZR of the storage system comprises:
collecting the capacity information $CPR_i$ of each disk in the storage system through an API, and summing it over all disks:
$$XZR = \sum_{i=1}^{m} CPR_i$$
wherein i denotes the i-th disk, m is the total number of disks in the storage system, and $i \in \{1, 2, \dots, m\}$;
an API (application programming interface) is a set of specifications that defines how software components interact; it allows different software systems or applications to communicate, exchange data, or access each other's functions and services in order to perform a specific operation;
S2: calculating the number of PGs $CPS_i$ that each disk in the storage system is expected to carry;
the number of PGs $CPS_i$ that each disk is expected to carry is calculated as:
$$CPS_i = PGZ \times \frac{CPR_i}{XZR}$$
wherein PGZ is the total number of PGs in the storage system;
the total number of PGs, PGZ, is queried with a command line tool of the storage system;
it should be noted that the PG is the basic unit of data distribution and data recovery in a storage system, and the total number of PGs is an important parameter for the performance and load distribution of the storage system. The number of PGs determines how data is distributed: more PGs mean finer data distribution and better load balancing, but also higher management and storage overhead, so a suitable number of PGs must be chosen to obtain good performance and reliability;
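Steps S1 and S2 can be sketched in a few lines of Python. This is a minimal illustration that assumes the expected PG count is the total PG count weighted by each disk's share of total capacity, as in the worked example's 12×(50/150)=4; the function name and the dictionary layout are hypothetical, and the source does not say how fractional results are rounded.

```python
def expected_pg_counts(capacities, total_pgs):
    """Return CPS_i for every disk: total_pgs weighted by capacity share."""
    xzr = sum(capacities.values())  # S1: total capacity XZR
    return {disk: total_pgs * cap / xzr for disk, cap in capacities.items()}

# Three 50G disks and 12 PGs, as in the worked example of this embodiment.
cps = expected_pg_counts({"disk1": 50, "disk2": 50, "disk3": 50}, 12)
```

With disks of unequal capacity the result is generally fractional, so a real implementation would need a rounding rule that the source leaves open.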
S3: calculating the total used capacity XZS of the storage system;
the method for calculating the total used capacity XZS of the storage system comprises:
collecting the used capacity information $SPR_i$ of each disk in the storage system through an API, and summing it over all disks:
$$XZS = \sum_{i=1}^{m} SPR_i$$
S4: calculating the average usage rate PJL of the storage system and the expected used capacity $CZS_i$ of each disk;
the average usage rate PJL is calculated as:
$$PJL = \frac{XZS}{XZR} \times 100\%$$
the expected used capacity of each disk is calculated as:
$$CZS_i = CPR_i \times PJL$$
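Steps S3 and S4 can be sketched as follows. The sketch assumes that the total used capacity XZS is the sum of the per-disk used capacities and that the average usage rate PJL is XZS divided by the total capacity XZR, which matches the 74/150 computation in the worked example; the function and variable names are hypothetical.

```python
def expected_usage(capacities, used):
    """Return (PJL, {disk: CZS_i}) from per-disk capacity and used capacity."""
    xzr = sum(capacities.values())   # total capacity XZR
    xzs = sum(used.values())         # S3: total used capacity XZS
    pjl = xzs / xzr                  # S4: average usage rate PJL (as a fraction)
    czs = {disk: cap * pjl for disk, cap in capacities.items()}
    return pjl, czs

pjl, czs = expected_usage(
    {"disk1": 50, "disk2": 50, "disk3": 50},
    {"disk1": 40, "disk2": 3, "disk3": 31},
)
```

The worked example rounds PJL to 49% before multiplying, which gives 24.5G per disk; without rounding the sketch yields roughly 24.67G.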
S5: analyzing each disk in the storage system, judging whether the disk is marked as an insufficient disk, and calculating, for each insufficient disk, the insufficient PG number $BPS_i$, the insufficient capacity $BZR_i$ and the data amount $RPS_i$ of each PG expected to migrate in;
the method for judging whether a disk is marked as an insufficient disk comprises:
if the number of PGs carried by the disk is smaller than the expected PG number $CPS_i$ and the used capacity $SPR_i$ is smaller than the expected used capacity $CZS_i$, the disk is marked as an insufficient disk;
if the number of PGs carried by the disk is greater than or equal to the expected PG number $CPS_i$, or the used capacity $SPR_i$ is greater than or equal to the expected used capacity $CZS_i$, the disk is not marked as an insufficient disk;
the insufficient PG number $BPS_i$ is calculated as:
$$BPS_i = CPS_i - PGC_i$$
wherein $PGC_i$ is the number of PGs carried by the disk;
the number of PGs $PGC_i$ carried by each disk in the storage system is queried with a command line tool of the storage system;
the insufficient capacity $BZR_i$ is calculated as:
$$BZR_i = CZS_i - SPR_i$$
the data amount $RPS_i$ of each PG expected to migrate in is calculated as:
$$RPS_i = \frac{BZR_i}{BPS_i}$$
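Step S5 can be sketched as below. Two assumptions are made: the capacity compared against the expected used capacity is the disk's used capacity, and the data amount expected per migrated-in PG is the insufficient capacity divided by the insufficient PG number, as in the worked example's (24.5-3)/(4-1). The `Disk` class and the function name are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Disk:
    name: str
    capacity: float  # CPR_i
    used: float      # SPR_i
    pg_count: int    # PGC_i

def insufficient_stats(disk, expected_pgs, expected_used):
    """Return (BPS_i, BZR_i, RPS_i) for an insufficient disk, else None."""
    if disk.pg_count < expected_pgs and disk.used < expected_used:
        bps = expected_pgs - disk.pg_count   # insufficient PG number
        bzr = expected_used - disk.used      # insufficient capacity
        return bps, bzr, bzr / bps           # bps > 0 is guaranteed here
    return None

stats = insufficient_stats(Disk("disk2", 50, 3, 1), expected_pgs=4, expected_used=24.5)
```

For disk2 of the worked example this gives 3 missing PGs, 21.5G of missing capacity, and about 7.17G per PG expected to migrate in.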
S6: analyzing each disk in the storage system, judging whether the disk is marked as a migration disk, analyzing each PG in the migration disk, and judging whether a migration instruction is generated; marking the PG corresponding to the migration instruction as a migration PG, and moving the migration PG into the insufficient disk;
the method for judging whether a disk is marked as a migration disk comprises:
if the number of PGs carried by the disk is larger than the expected PG number $CPS_i$ and the used capacity $SPR_i$ is larger than the expected used capacity $CZS_i$, the disk is marked as a migration disk;
if the number of PGs carried by the disk is smaller than or equal to the expected PG number $CPS_i$, or the used capacity $SPR_i$ is smaller than or equal to the expected used capacity $CZS_i$, the disk is not marked as a migration disk;
the method for judging whether to generate a migration instruction comprises:
subtracting the data amount $RPS_i$ expected to migrate in from the data amount of each PG in the migration disk to obtain a difference;
if the absolute value of the difference is smaller than the difference threshold, a migration instruction is generated;
if the absolute values of all the differences are greater than or equal to the difference threshold, no migration instruction is generated;
the data amount of each PG in the migration disk is acquired through an API of the storage system;
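The S6 decision can be sketched as a pair of small predicates. The PG names, PG sizes and the threshold of 1.0 below are hypothetical illustration values, and the sketch only selects candidate PGs; the source does not specify how many candidates are actually moved or in what order.

```python
def is_migration_disk(pg_count, used, expected_pgs, expected_used):
    """A disk is a migration disk when it exceeds both expectations."""
    return pg_count > expected_pgs and used > expected_used

def pick_migration_pgs(pg_sizes, rps, threshold):
    """Return PGs whose data amount is within threshold of the expected amount rps."""
    return [pg for pg, size in pg_sizes.items() if abs(size - rps) < threshold]

# Hypothetical PG sizes (in G) on one migration disk.
candidates = pick_migration_pgs(
    {"pgA": 2, "pgB": 6, "pgC": 10, "pgD": 8}, rps=7.17, threshold=1.0
)
```

Here only `pgD` qualifies, because |8-7.17| is the only difference below the threshold; an empty list corresponds to the case where no migration instruction is generated.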
it should be noted that the difference threshold is preset. In an experimental environment, a person skilled in the art moves PGs from migration disks into insufficient disks multiple times so that the space usage rates of the disks become similar; each time, the data amount $RPS_i$ expected to migrate in for the insufficient disk is subtracted from the data amount of the PG to be migrated to obtain a difference. The averages of the absolute differences are calculated and sorted, and the largest average is taken as the difference threshold;
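One possible reading of this calibration procedure is sketched below: each experimental run yields a list of (PG data amount, expected data amount) pairs, the mean absolute difference is taken per run, and the largest mean becomes the threshold. How the differences are grouped into runs is an assumption, since the source does not spell it out, and the sample numbers are made up.

```python
from statistics import mean

def difference_threshold(experiments):
    """experiments: list of runs, each a list of (pg_size, rps) pairs."""
    averages = [mean(abs(size - rps) for size, rps in run) for run in experiments]
    return max(averages)  # same result as sorting and taking the last element

# Two hypothetical calibration runs.
threshold = difference_threshold([[(6, 7), (8, 7)], [(7, 7), (10, 7)]])
```

The first run averages 1 and the second 1.5, so the threshold for this made-up data is 1.5.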
the PG migration is performed to improve the effective utilization rate of the disk space, and meanwhile, the read-write load of the disk is more balanced, so that the performance of the whole storage system is improved;
for example, referring to fig. 2, there are 3 disks on which the PGs are unevenly distributed, and the data amounts carried by the PGs also differ. The capacities of disk1, disk2 and disk3 are all 50G, so the total capacity of the storage system is 50+50+50=150G; the total used capacity of the storage system is 40+3+31=74G, and the average usage rate of the storage system is 74/150≈49%. The expected used capacities of disk1, disk2 and disk3 are all 50×49%=24.5G, and the numbers of PGs they are expected to carry are all 12×(50/150×100%)=4. Because the used capacities of disk1 and disk3 are larger than the expected used capacity and the numbers of PGs they carry are larger than the expected PG number, disk1 and disk3 are marked as migration disks; because the used capacity of disk2 is smaller than the expected used capacity and the number of PGs it carries is smaller than the expected PG number, disk2 is marked as an insufficient disk. The data amount of each PG expected to migrate into disk2 is (24.5-3)/(4-1)≈7.17G. disk1 and disk3 are then analyzed, and PG5 or PG2 and PG6 in disk1 and PG9 or PG11 in disk3 are marked as migration PGs, which are moved into disk2. After PG migration is completed, the usage rate of disk1 is (2+6+10+8)/50×100%=52%, that of disk2 is (3+6+8+7)/50×100%=48%, and that of disk3 is (6+5+7+6)/50×100%=48%, i.e. the space usage rates of the disks are similar;
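The arithmetic of this example can be re-checked with a short Python script. All input numbers come from the example itself; the variable names are hypothetical, and the average usage rate is rounded to two decimals to mirror the 49% used in the text.

```python
capacity = {"disk1": 50, "disk2": 50, "disk3": 50}  # G
used = {"disk1": 40, "disk2": 3, "disk3": 31}        # G
total_pgs = 12

xzr = sum(capacity.values())                 # 150
xzs = sum(used.values())                     # 74
pjl = round(xzs / xzr, 2)                    # 0.49, rounded as in the text
czs = {d: c * pjl for d, c in capacity.items()}              # about 24.5 each
cps = {d: total_pgs * c / xzr for d, c in capacity.items()}  # 4.0 each

# disk2 carries 1 PG, so it lacks cps - 1 PGs and czs - used capacity.
rps_disk2 = (czs["disk2"] - used["disk2"]) / (cps["disk2"] - 1)

# Per-PG data amounts after migration, as listed in the example.
after = {"disk1": [2, 6, 10, 8], "disk2": [3, 6, 8, 7], "disk3": [6, 5, 7, 6]}
usage_after = {d: sum(sizes) / capacity[d] for d, sizes in after.items()}
```

The post-migration per-disk totals are 26G, 24G and 24G, which again sum to 74G, so the migration only redistributes data and does not change the total used capacity.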
for data equalization scenarios, this embodiment provides an optimization scheme that balances the amount of data carried by each disk: data migration in the storage system is triggered according to an equalization strategy based on the uniformity of PG distribution, so that the space usage rates of the disks become consistent. The effective utilization of disk space is improved, the read-write load of the disks is balanced, and the performance of the whole storage system is improved.
Example 2
Referring to fig. 3, the disclosure provides an electronic device, including a memory, a processor, and a computer program stored on the memory and capable of running on the processor, where the processor implements an optimization method for data balancing in a distributed storage system according to any one of the methods provided above when executing the computer program.
Since the electronic device described in this embodiment is an electronic device for implementing an optimization method for data balancing in a distributed storage system in this embodiment, based on an optimization method for data balancing in a distributed storage system described in this embodiment, those skilled in the art can understand a specific implementation manner of the electronic device and various variations thereof, so how to implement the method in this embodiment of the present application for this electronic device will not be described in detail herein. Electronic devices used by those skilled in the art to implement an optimization method for data balancing in a distributed storage system according to the embodiments of the present application are all within the scope of protection intended by the present application.
Example 3
The embodiment discloses a computer readable storage medium, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor, when executing the computer program, implements the optimization method for data equalization in a distributed storage system according to any one of the methods provided above.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any other combination. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer program are loaded or executed on a computer, the processes or functions described in accordance with embodiments of the present invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from one website site, computer, server, or data center to another website site, computer, server, or data center over a wired network or a wireless network. The computer readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains one or more sets of available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely one, and there may be additional divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Finally: the foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (10)
1. A method for optimizing data equalization in a distributed storage system, comprising:
step S1: calculating the total capacity XZR of the storage system;
step S2: calculating the number of PGs $CPS_i$ that each disk in the storage system is expected to carry;
step S3: calculating the total used capacity XZS of the storage system;
step S4: calculating the average usage rate PJL of the storage system and the expected used capacity $CZS_i$ of each disk;
step S5: analyzing each disk in the storage system, judging whether the disk is marked as an insufficient disk, and calculating, for each insufficient disk, the insufficient PG number $BPS_i$, the insufficient capacity $BZR_i$ and the data amount $RPS_i$ of each PG expected to migrate in;
step S6: analyzing each disk in the storage system, judging whether the disk is marked as a migration disk, analyzing each PG in the migration disk, and judging whether a migration instruction is generated; marking the PG corresponding to the migration instruction as a migration PG, and moving the migration PG into the insufficient disk.
2. The method for optimizing data equalization in a distributed storage system of claim 1, wherein said method for computing a total storage system capacity XZR comprises:
capacity information CPR for each disk in storage system collected through API i ;
Wherein i is the ith disk, m is the total number of disks in the storage system, and i epsilon m.
3. The method for optimizing data equalization in a distributed storage system of claim 2, wherein the number of PGs $CPS_i$ that each disk in the storage system is expected to carry is calculated as:
wherein PGZ is the total number of PGs in the storage system.
4. The method for optimizing data equalization in a distributed storage system of claim 3, wherein calculating the total used capacity XZS of the storage system comprises:
collecting the used capacity information $SPR_i$ of each disk in the storage system through an API;
$XZS = \sum_{i=1}^{m} SPR_i$.
5. The method for optimizing data equalization in a distributed storage system of claim 4, wherein the average usage rate PJL is calculated as:
$PJL = XZS / XZR$.
6. The method for optimizing data equalization in a distributed storage system of claim 5, wherein the expected used capacity of each disk is calculated as:
$CZS_i = CPR_i \times PJL$.
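The quantities in claims 2 to 6 can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: it takes XZR and XZS as sums of the per-disk values and PJL as their ratio, and, because the formula for $CPS_i$ is not reproduced in this text, it assumes that the expected PG count is proportional to each disk's share of the total capacity. The `Disk` class, the function name, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    capacity: float  # CPR_i, total capacity reported for the disk
    used: float      # SPR_i, capacity currently in use


def expected_targets(disks, total_pgs):
    """Return XZR, XZS, PJL and the per-disk targets (CPS_i, CZS_i)."""
    xzr = sum(d.capacity for d in disks)  # total capacity of the system
    xzs = sum(d.used for d in disks)      # total used capacity of the system
    pjl = xzs / xzr                       # average usage rate
    targets = {}
    for d in disks:
        # ASSUMPTION: expected PG count proportional to capacity share.
        cps = total_pgs * d.capacity / xzr
        czs = d.capacity * pjl            # CZS_i = CPR_i x PJL (claim 6)
        targets[d.name] = (cps, czs)
    return xzr, xzs, pjl, targets


# Hypothetical three-disk system holding 16 PGs in total.
disks = [
    Disk("osd0", 100.0, 80.0),
    Disk("osd1", 100.0, 20.0),
    Disk("osd2", 200.0, 100.0),
]
xzr, xzs, pjl, targets = expected_targets(disks, total_pgs=16)
print(xzr, xzs, pjl, targets)
```

Under the proportionality assumption, a disk with twice the capacity is expected to carry twice as many PGs, which matches the way $CZS_i$ scales with $CPR_i$.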
7. The method for optimizing data equalization in a distributed storage system of claim 6, wherein determining whether a disk is marked as an insufficient disk comprises:
if the number of PGs carried by the disk is less than the expected number of PGs $CPS_i$ and the used capacity $SPR_i$ is less than the expected used capacity $CZS_i$, marking the disk as an insufficient disk;
if the number of PGs carried by the disk is greater than or equal to the expected number of PGs $CPS_i$, or the used capacity $SPR_i$ is greater than or equal to the expected used capacity $CZS_i$, not marking the disk as an insufficient disk.
8. The method for optimizing data equalization in a distributed storage system of claim 7, wherein the insufficient PG number $BPS_i$ is calculated as:
$BPS_i = CPS_i - PGC_i$;
wherein $PGC_i$ is the number of PGs carried by the disk;
the insufficient capacity $BZR_i$ is calculated as:
$BZR_i = CZS_i - SPR_i$;
the data amount $RPS_i$ of a PG expected to be migrated in is calculated as:
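A minimal Python sketch of the insufficient-disk test of claim 7 and the quantities of claim 8 follows. The formula for $RPS_i$ is not reproduced in this text, so the sketch assumes $RPS_i = BZR_i / BPS_i$, that is, the missing capacity spread evenly over the missing PGs; that choice, together with the class and function names, is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiskState:
    pg_count: int         # PGC_i, PGs currently carried by the disk
    used: float           # SPR_i, capacity currently in use
    expected_pgs: float   # CPS_i, PGs the disk is expected to carry
    expected_used: float  # CZS_i, capacity the disk is expected to use


@dataclass
class Deficit:
    missing_pgs: float       # BPS_i = CPS_i - PGC_i
    missing_capacity: float  # BZR_i = CZS_i - SPR_i
    per_pg_amount: float     # RPS_i (ASSUMED to be BZR_i / BPS_i)


def deficit_of(disk: DiskState) -> Optional[Deficit]:
    """Return the deficit of an insufficient disk, or None otherwise."""
    insufficient = (
        disk.pg_count < disk.expected_pgs and disk.used < disk.expected_used
    )
    if not insufficient:
        return None
    bps = disk.expected_pgs - disk.pg_count  # positive by the test above
    bzr = disk.expected_used - disk.used
    return Deficit(bps, bzr, bzr / bps)


# Hypothetical disks: one below both targets, one above both.
under = deficit_of(DiskState(pg_count=2, used=20.0, expected_pgs=4.0, expected_used=50.0))
over = deficit_of(DiskState(pg_count=6, used=80.0, expected_pgs=4.0, expected_used=50.0))
print(under, over)
```

Because a disk is marked insufficient only when its PG count is strictly below $CPS_i$, $BPS_i$ is always positive here and the assumed division is safe.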
9. The method for optimizing data equalization in a distributed storage system of claim 8, wherein determining whether a disk is marked as a migration disk comprises:
if the number of PGs carried by the disk is greater than the expected number of PGs $CPS_i$ and the used capacity $SPR_i$ is greater than the expected used capacity $CZS_i$, marking the disk as a migration disk;
if the number of PGs carried by the disk is less than or equal to the expected number of PGs $CPS_i$, or the used capacity $SPR_i$ is less than or equal to the expected used capacity $CZS_i$, not marking the disk as a migration disk.
10. The method for optimizing data equalization in a distributed storage system of claim 9, wherein determining whether a migration instruction is generated comprises:
subtracting the data amount $RPS_i$ of a PG expected to be migrated in from the data amount of each PG in the migration disk to obtain a difference;
if the absolute value of the difference is less than a difference threshold, generating a migration instruction;
if the absolute values of all the differences are greater than or equal to the difference threshold, generating no migration instruction.
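The migration-disk test of claim 9 and the threshold rule of claim 10 can be sketched as follows. This is a hedged illustration rather than the patented implementation: the function names, the dictionary of PG sizes, and the sample numbers are hypothetical, and the sketch only returns the PGs that would receive a migration instruction without moving any data.

```python
def is_migration_disk(pg_count, used, expected_pgs, expected_used):
    """Claim 9: a disk is a migration disk only if it exceeds both targets."""
    return pg_count > expected_pgs and used > expected_used


def select_migration_pgs(pg_sizes, expected_amount, threshold):
    """Claim 10: select PGs whose size is within `threshold` of RPS_i.

    pg_sizes maps a PG identifier to the data amount stored in that PG.
    An empty result means that no migration instruction is generated.
    """
    selected = []
    for pg_id, size in pg_sizes.items():
        difference = size - expected_amount
        if abs(difference) < threshold:
            selected.append(pg_id)  # this PG becomes a migration PG
    return selected


# Hypothetical overloaded disk carrying three PGs.
pg_sizes = {"1.a": 14.0, "1.b": 30.0, "1.c": 16.5}
if is_migration_disk(pg_count=6, used=80.0, expected_pgs=4.0, expected_used=50.0):
    chosen = select_migration_pgs(pg_sizes, expected_amount=15.0, threshold=2.0)
else:
    chosen = []
print(chosen)
```

Selecting PGs whose size is close to $RPS_i$ means each move fills roughly one missing PG's worth of capacity on the insufficient disk, so the target disk approaches both its PG-count and capacity targets together.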
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311722336.4A CN117850680A (en) | 2023-12-14 | 2023-12-14 | Optimization method for data equalization in distributed storage system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117850680A (en) | 2024-04-09 |
Family
ID=90530291
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311722336.4A Pending CN117850680A (en) | 2023-12-14 | 2023-12-14 | Optimization method for data equalization in distributed storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117850680A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111880747A (en) * | 2020-08-01 | 2020-11-03 | 广西大学 | Automatic balanced storage method of Ceph storage system based on hierarchical mapping |
WO2022028033A1 (en) * | 2020-08-01 | 2022-02-10 | 广西大学 | Hierarchical mapping-based automatic balancing storage method for ceph storage system |
CN112463050A (en) * | 2020-11-26 | 2021-03-09 | 新华三技术有限公司成都分公司 | Storage system capacity expansion method, device, equipment and machine-readable storage medium |
CN113268203A (en) * | 2021-05-18 | 2021-08-17 | 天津中科曙光存储科技有限公司 | Capacity balancing method and device of storage system, computer equipment and storage medium |
CN114047883A (en) * | 2021-11-19 | 2022-02-15 | 北京天融信网络安全技术有限公司 | Data equalization method and device based on distributed storage system |
CN116991334A (en) * | 2023-09-26 | 2023-11-03 | 苏州元脑智能科技有限公司 | Data storage method, system, device, electronic equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||