CN117850680A - Optimization method for data equalization in distributed storage system - Google Patents

Info

Publication number
CN117850680A
Authority
CN
China
Prior art keywords
disk
storage system
capacity
expected
migration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311722336.4A
Other languages
Chinese (zh)
Inventor
王达林
代怀刚
陈阳
刘啸滨
蒋波
王念秋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Cloud Technology Co Ltd
Original Assignee
Tianyi Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi Cloud Technology Co Ltd
Priority to CN202311722336.4A
Publication of CN117850680A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device
    • G06F3/0676 Magnetic disk device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of computer storage and discloses an optimization method for data balancing in a distributed storage system. The method comprises: calculating the total capacity of the storage system; calculating the number of PGs that each disk in the storage system is expected to carry; calculating the total used capacity of the storage system; calculating the average usage rate of the storage system and the expected used capacity of each disk; analyzing each disk in the storage system to determine whether it is marked as a deficient disk and, for each deficient disk, calculating the deficient PG count, the deficient capacity, and the data volume expected of a PG migrated in; and analyzing each disk in the storage system to determine whether it is marked as a migration disk, then analyzing each PG on a migration disk to determine whether a migration instruction is generated. When data balancing is performed in this way, disk capacity usage is more even and the effective utilization of capacity is improved.

Description

Optimization method for data equalization in distributed storage system
Technical Field
The invention relates to the technical field of computer storage, in particular to an optimization method for data equalization in a distributed storage system.
Background
The resources in a storage system consist of a large number of disks, and data is managed as follows: (1) data is organized into chunks; (2) each chunk is hash-mapped to a logical-layer PG (placement group); (3) each PG is mapped to fixed disks through disk selection. While the storage system runs, disks are inevitably added and removed, so the number of PGs per disk becomes unbalanced, the data volume per disk becomes unbalanced, and a large amount of storage space is wasted.
Patent application publication No. CN114237520A discloses a Ceph cluster data balancing method and system. A primary placement group divides the data blocks to be balanced evenly into several balancing lists, keeps one list, and sends the others to secondary placement groups. The primary and secondary placement groups then send the data blocks they store to the newly added data storage device according to the data block names on their lists. Because several placement groups cooperate on the balancing operation per unit time, data balancing efficiency is greatly improved.
However, neither that technique nor other prior art considers the data volume each PG actually carries when balancing data, and the data volume of a PG is not used as a criterion for selecting a target PG. As a result, disk space is not used as effectively as it could be: some disks carry more data and a heavier read-write load, which degrades the performance of the whole storage system.
In view of the above, the present invention proposes an optimization method for data balancing in a distributed storage system to solve these problems.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides the following technical solution: an optimization method for data equalization in a distributed storage system, comprising:
Step S1: calculating the total capacity $XZR$ of the storage system;
Step S2: calculating the number of PGs $CPS_i$ that each disk in the storage system is expected to carry;
Step S3: calculating the total used capacity $XZS$ of the storage system;
Step S4: calculating the average usage rate $PJL$ of the storage system and the expected used capacity $CZS_i$ of each disk;
Step S5: analyzing each disk in the storage system, determining whether the disk is marked as a deficient disk, and calculating, for each deficient disk, the deficient PG count $BPS_i$, the deficient capacity $BZR_i$, and the data volume $RPS_i$ expected of a PG migrated in;
Step S6: analyzing each disk in the storage system, determining whether the disk is marked as a migration disk, analyzing each PG on the migration disk, and determining whether a migration instruction is generated; marking the PG corresponding to a migration instruction as a migration PG, and moving the migration PG to the deficient disk.
Further, the total capacity $XZR$ of the storage system is calculated as follows:
the capacity $CPR_i$ of each disk in the storage system is collected through an API, and
$$XZR = \sum_{i=1}^{m} CPR_i$$
where $i$ denotes the $i$-th disk, $m$ is the total number of disks in the storage system, and $i \in \{1, \dots, m\}$.
Further, the number of PGs $CPS_i$ that each disk in the storage system is expected to carry is calculated as:
$$CPS_i = PGZ \times \frac{CPR_i}{XZR}$$
where $PGZ$ is the total number of PGs in the storage system.
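The first two steps can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: it assumes the proportional rule $CPS_i = PGZ \times CPR_i / XZR$, which is consistent with the worked example in Example 1 (12 × (50/150 × 100%) = 4), and the function names `total_capacity` and `expected_pg_counts` are hypothetical.

```python
from typing import List


def total_capacity(capacities: List[float]) -> float:
    """XZR: sum of the per-disk capacities CPR_i."""
    return sum(capacities)


def expected_pg_counts(capacities: List[float], pg_total: int) -> List[float]:
    """CPS_i = PGZ * CPR_i / XZR for every disk (assumed proportional rule)."""
    xzr = total_capacity(capacities)
    if xzr <= 0:
        raise ValueError("total capacity must be positive")
    return [pg_total * cpr / xzr for cpr in capacities]


# Three 50 G disks sharing 12 PGs, as in the worked example of Example 1.
cps_equal = expected_pg_counts([50.0, 50.0, 50.0], 12)
# Hypothetical unequal disks: the larger disk is expected to carry more PGs.
cps_unequal = expected_pg_counts([100.0, 50.0], 12)
```

With equal capacities every disk is expected to carry the same number of PGs; with unequal capacities the expectation scales with disk size.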
Further, the total used capacity $XZS$ of the storage system is calculated as follows:
the used capacity $SPR_i$ of each disk in the storage system is collected through the API, and
$$XZS = \sum_{i=1}^{m} SPR_i$$
Further, the average usage rate $PJL$ is calculated as:
$$PJL = \frac{XZS}{XZR} \times 100\%$$
Further, the expected used capacity $CZS_i$ of each disk is calculated as:
$$CZS_i = CPR_i \times PJL$$
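As a hedged sketch of steps S3 and S4, the snippet below computes the average usage rate and the expected used capacity per disk. It assumes $PJL = XZS / XZR$ and $CZS_i = CPR_i \times PJL$; the function name `usage_targets` is hypothetical, and the input figures are taken from the worked example in Example 1.

```python
from typing import List, Tuple


def usage_targets(capacities: List[float],
                  used: List[float]) -> Tuple[float, List[float]]:
    """Return (PJL, [CZS_i]) for disks with capacities CPR_i and usage SPR_i."""
    if len(capacities) != len(used):
        raise ValueError("capacities and used must have the same length")
    xzr = sum(capacities)          # total capacity XZR
    xzs = sum(used)                # total used capacity XZS
    if xzr <= 0:
        raise ValueError("total capacity must be positive")
    pjl = xzs / xzr                # average usage rate as a fraction
    czs = [cpr * pjl for cpr in capacities]
    return pjl, czs


pjl, czs = usage_targets([50.0, 50.0, 50.0], [40.0, 3.0, 31.0])
```

Without rounding, the average usage rate is 74/150 and each expected used capacity is about 24.67 G; the worked example rounds the rate to 49% first and therefore reports 24.5 G.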
Further, whether a disk is marked as a deficient disk is determined as follows:
if the number of PGs carried by the disk is less than the expected PG count $CPS_i$ and the used capacity $SPR_i$ is less than the expected used capacity $CZS_i$, the disk is marked as a deficient disk;
if the number of PGs carried by the disk is greater than or equal to the expected PG count $CPS_i$, or the used capacity $SPR_i$ is greater than or equal to the expected used capacity $CZS_i$, the disk is not marked as a deficient disk.
Further, the deficient PG count $BPS_i$ is calculated as:
$$BPS_i = CPS_i - PGC_i$$
where $PGC_i$ is the number of PGs carried by the disk;
the deficient capacity $BZR_i$ is calculated as:
$$BZR_i = CZS_i - SPR_i$$
the data volume $RPS_i$ expected of a PG migrated in is calculated as:
$$RPS_i = \frac{BZR_i}{BPS_i}$$
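The three quantities for a deficient disk can be sketched as follows. This is an illustration under stated assumptions: the deficient capacity is taken as $BZR_i = CZS_i - SPR_i$ and the per-PG target as $RPS_i = BZR_i / BPS_i$, which matches the worked example in Example 1 ((24.5 - 3)/(4 - 1) = 7.17 G). The function name `deficit_targets` is hypothetical.

```python
from typing import Tuple


def deficit_targets(expected_pgs: float, pg_count: int,
                    expected_used: float, used: float) -> Tuple[float, float, float]:
    """Return (BPS_i, BZR_i, RPS_i) for one deficient disk."""
    bps = expected_pgs - pg_count      # number of PGs the disk is short of
    bzr = expected_used - used         # capacity the disk is short of
    if bps <= 0 or bzr <= 0:
        raise ValueError("disk is not deficient")
    rps = bzr / bps                    # data volume each incoming PG should carry
    return bps, bzr, rps


# disk2 from the worked example: expects 4 PGs and 24.5 G, holds 1 PG and 3 G.
bps, bzr, rps = deficit_targets(4, 1, 24.5, 3.0)
```

The guard rejects disks that fail the deficient-disk test, so the division never sees a zero or negative PG shortfall.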
Further, whether a disk is marked as a migration disk is determined as follows:
if the number of PGs carried by the disk is greater than the expected PG count $CPS_i$ and the used capacity $SPR_i$ is greater than the expected used capacity $CZS_i$, the disk is marked as a migration disk;
if the number of PGs carried by the disk is less than or equal to the expected PG count $CPS_i$, or the used capacity $SPR_i$ is less than or equal to the expected used capacity $CZS_i$, the disk is not marked as a migration disk.
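The two marking rules can be expressed as one classification function. The sketch below is an assumption-laden illustration: it compares the used capacity (taken here to be $SPR_i$) against $CZS_i$, and the names `Role`, `DiskStats` and `classify` are hypothetical. The PG counts 6, 1 and 5 are assumed values chosen to be consistent with the totals in the worked example of Example 1; only the count for disk2 is stated there.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    DEFICIENT = "deficient"   # should receive PGs
    MIGRATION = "migration"   # should give up PGs
    NEUTRAL = "neutral"       # neither rule applies


@dataclass
class DiskStats:
    name: str
    capacity: float   # CPR_i
    used: float       # SPR_i
    pg_count: int     # PGC_i


def classify(disk: DiskStats, expected_pgs: float, expected_used: float) -> Role:
    """Apply the deficient-disk and migration-disk rules to one disk."""
    if disk.pg_count < expected_pgs and disk.used < expected_used:
        return Role.DEFICIENT
    if disk.pg_count > expected_pgs and disk.used > expected_used:
        return Role.MIGRATION
    return Role.NEUTRAL


disks = [
    DiskStats("disk1", 50, 40, 6),   # PG count assumed
    DiskStats("disk2", 50, 3, 1),
    DiskStats("disk3", 50, 31, 5),   # PG count assumed
]
roles = [classify(d, expected_pgs=4, expected_used=24.5) for d in disks]
```

Both conditions must hold for a disk to be marked, so a disk that is over on PG count but under on used capacity (or the reverse) is left alone.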
Further, whether a migration instruction is generated is determined as follows:
the data volume $RPS_i$ expected of a PG migrated in is subtracted from the data volume of each PG on the migration disk to obtain a difference;
if the absolute value of the difference is smaller than the difference threshold, a migration instruction is generated;
if the absolute values of all the differences are greater than or equal to the difference threshold, no migration instruction is generated.
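The selection rule reduces to a filter over the PGs of a migration disk. The following is a minimal sketch: the function name `select_migration_pgs`, the PG labels, the PG sizes and the threshold of 1.5 are all hypothetical values for illustration, not data from the source.

```python
from typing import Dict, List


def select_migration_pgs(pg_sizes: Dict[str, float], rps: float,
                         threshold: float) -> List[str]:
    """Return the PGs whose size is within `threshold` of the target RPS_i.

    An empty list means no migration instruction is generated for this disk.
    """
    return [pg for pg, size in pg_sizes.items() if abs(size - rps) < threshold]


# Hypothetical PG sizes (in G) on one migration disk.
sizes = {"pg_a": 2.0, "pg_b": 6.0, "pg_c": 10.0, "pg_d": 8.0}
candidates = select_migration_pgs(sizes, rps=7.17, threshold=1.5)
none_found = select_migration_pgs(sizes, rps=7.17, threshold=0.5)
```

Choosing PGs whose size is close to $RPS_i$ is what distinguishes this scheme from balancing on PG count alone: each incoming PG fills roughly its share of the capacity shortfall.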
The technical effects and advantages of the optimization method for data equalization in a distributed storage system according to the invention are as follows:
1. When data balancing is performed, disk capacity usage is more even, and the effective utilization of capacity is improved.
2. The read-write load of the disks is more balanced, and the performance of the whole storage system is improved.
Drawings
FIG. 1 is a schematic diagram of an optimization method for data equalization in a distributed storage system according to Embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of PG migration in Embodiment 1 of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device according to Embodiment 2 of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to fig. 1, the optimization method for data equalization in a distributed storage system according to this embodiment includes:
S1: calculating the total capacity $XZR$ of the storage system;
the total capacity $XZR$ of the storage system is calculated as follows:
the capacity $CPR_i$ of each disk in the storage system is collected through an API, and
$$XZR = \sum_{i=1}^{m} CPR_i$$
where $i$ denotes the $i$-th disk, $m$ is the total number of disks in the storage system, and $i \in \{1, \dots, m\}$;
an API (application programming interface) is a set of specifications that defines how software components interact; it allows different software systems or applications to communicate, exchange data, or access each other's functions and services in order to perform a specific operation;
S2: calculating the number of PGs $CPS_i$ that each disk in the storage system is expected to carry;
the number of PGs $CPS_i$ that each disk is expected to carry is calculated as:
$$CPS_i = PGZ \times \frac{CPR_i}{XZR}$$
where $PGZ$ is the total number of PGs in the storage system;
the total number of PGs $PGZ$ in the storage system is queried with a command-line tool of the storage system;
it should be noted that the PG is the basic unit of data distribution and data recovery in the storage system. The total number of PGs is an important parameter for the performance and load distribution of the storage system: it determines how data is distributed, and a larger number of PGs generally means more even data distribution and better load balancing, but also higher management and storage overhead. A suitable number of PGs is therefore needed to provide good performance and reliability;
S3: calculating the total used capacity $XZS$ of the storage system;
the total used capacity $XZS$ of the storage system is calculated as follows:
the used capacity $SPR_i$ of each disk in the storage system is collected through the API, and
$$XZS = \sum_{i=1}^{m} SPR_i$$
S4: calculating the average usage rate $PJL$ of the storage system and the expected used capacity $CZS_i$ of each disk;
the average usage rate $PJL$ is calculated as:
$$PJL = \frac{XZS}{XZR} \times 100\%$$
the expected used capacity $CZS_i$ of each disk is calculated as:
$$CZS_i = CPR_i \times PJL$$
S5: analyzing each disk in the storage system, determining whether the disk is marked as a deficient disk, and calculating, for each deficient disk, the deficient PG count $BPS_i$, the deficient capacity $BZR_i$, and the data volume $RPS_i$ expected of a PG migrated in;
whether a disk is marked as a deficient disk is determined as follows:
if the number of PGs carried by the disk is less than the expected PG count $CPS_i$ and the used capacity $SPR_i$ is less than the expected used capacity $CZS_i$, the disk is marked as a deficient disk;
if the number of PGs carried by the disk is greater than or equal to the expected PG count $CPS_i$, or the used capacity $SPR_i$ is greater than or equal to the expected used capacity $CZS_i$, the disk is not marked as a deficient disk;
the deficient PG count $BPS_i$ is calculated as:
$$BPS_i = CPS_i - PGC_i$$
where $PGC_i$ is the number of PGs carried by the disk;
the number of PGs $PGC_i$ carried by each disk in the storage system is queried with a command-line tool of the storage system;
the deficient capacity $BZR_i$ is calculated as:
$$BZR_i = CZS_i - SPR_i$$
the data volume $RPS_i$ expected of a PG migrated in is calculated as:
$$RPS_i = \frac{BZR_i}{BPS_i}$$
S6: analyzing each disk in the storage system, determining whether the disk is marked as a migration disk, analyzing each PG on the migration disk, and determining whether a migration instruction is generated; marking the PG corresponding to a migration instruction as a migration PG, and moving the migration PG to the deficient disk;
whether a disk is marked as a migration disk is determined as follows:
if the number of PGs carried by the disk is greater than the expected PG count $CPS_i$ and the used capacity $SPR_i$ is greater than the expected used capacity $CZS_i$, the disk is marked as a migration disk;
if the number of PGs carried by the disk is less than or equal to the expected PG count $CPS_i$, or the used capacity $SPR_i$ is less than or equal to the expected used capacity $CZS_i$, the disk is not marked as a migration disk;
whether a migration instruction is generated is determined as follows:
the data volume $RPS_i$ expected of a PG migrated in is subtracted from the data volume of each PG on the migration disk to obtain a difference;
if the absolute value of the difference is smaller than the difference threshold, a migration instruction is generated;
if the absolute values of all the differences are greater than or equal to the difference threshold, no migration instruction is generated;
the data volume of each PG on the migration disk is obtained through the API of the storage system;
it should be noted that the difference threshold is preset. In an experimental environment, a person skilled in the art repeatedly moves PGs from migration disks to deficient disks until the space usage rates of the disks are similar. For each move, the data volume $RPS_i$ expected of a PG migrated into the deficient disk is subtracted from the data volume of the PG being migrated to obtain a difference. The average of the absolute differences is then calculated, the calculated averages are sorted, and the largest average is taken as the difference threshold;
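One possible reading of this calibration procedure is sketched below. It assumes that each experimental run yields a list of (migrated PG size, $RPS_i$) pairs, that one average of absolute differences is computed per run, and that the threshold is the largest of those averages. The function name `calibrate_threshold` and the sample runs are hypothetical.

```python
from statistics import mean
from typing import List, Tuple

Run = List[Tuple[float, float]]   # (migrated PG size, target RPS_i) per move


def calibrate_threshold(runs: List[Run]) -> float:
    """Largest per-run average of |PG size - RPS_i| (assumed interpretation)."""
    if not runs or any(not run for run in runs):
        raise ValueError("every run needs at least one recorded move")
    averages = sorted(mean(abs(size - rps) for size, rps in run) for run in runs)
    return averages[-1]            # maximum after sorting


# Two hypothetical experimental runs, sizes in G.
runs = [
    [(6.0, 7.0), (8.0, 7.0)],      # average absolute difference 1.0
    [(5.0, 7.0), (7.5, 7.0)],      # average absolute difference 1.25
]
threshold = calibrate_threshold(runs)
```

Taking the maximum makes the threshold permissive enough to admit every move pattern that balanced the disks in the experiments.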
the PG migration is performed to improve the effective utilization rate of the disk space, and meanwhile, the read-write load of the disk is more balanced, so that the performance of the whole storage system is improved;
for example, referring to fig. 2, there are 3 disks; the PGs are unevenly distributed among them, and the data volumes carried by the PGs differ. The capacities of disk1, disk2 and disk3 are all 50 G, so the total capacity of the storage system is 50 + 50 + 50 = 150 G, the total used capacity is 40 + 3 + 31 = 74 G, and the average usage rate is 74/150 ≈ 49%. The expected used capacity of each of disk1, disk2 and disk3 is 50 × 49% = 24.5 G, and the number of PGs each is expected to carry is 12 × (50/150 × 100%) = 4. Because the used capacities of disk1 and disk3 exceed the expected used capacity and the numbers of PGs they carry exceed the expected PG count, disk1 and disk3 are marked as migration disks. Because the used capacity of disk2 is below the expected used capacity and the number of PGs it carries is below the expected PG count, disk2 is marked as a deficient disk. The data volume expected of a PG migrated into disk2 is (24.5 - 3)/(4 - 1) ≈ 7.17 G. disk1 and disk3 are analyzed, and PG5 or PG2 and PG6 in disk1 and PG9 or PG11 in disk3 are marked as migration PGs; the migration PGs are moved to disk2. After PG migration is completed, the usage rate of disk1 is (2 + 6 + 10 + 8)/50 × 100% = 52%, that of disk2 is (3 + 6 + 8 + 7)/50 × 100% = 48%, and that of disk3 is (6 + 5 + 7 + 6)/50 × 100% = 48%, i.e. the space usage rates of the disks are similar;
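The arithmetic of this example can be checked with a short script. The per-PG sizes after migration and the used capacities before migration are taken from the example text; the variable names are illustrative only.

```python
CAPACITY_G = 50   # every disk in the example has 50 G

# Used capacity per disk before migration, in G.
used_before = {"disk1": 40, "disk2": 3, "disk3": 31}

# Per-PG data volumes on each disk after migration, in G.
pgs_after = {
    "disk1": [2, 6, 10, 8],
    "disk2": [3, 6, 8, 7],
    "disk3": [6, 5, 7, 6],
}

usage_before = {d: u / CAPACITY_G for d, u in used_before.items()}
usage_after = {d: sum(sizes) / CAPACITY_G for d, sizes in pgs_after.items()}

# Migration moves data between disks, so the total must be unchanged.
total_before = sum(used_before.values())
total_after = sum(sum(sizes) for sizes in pgs_after.values())

spread_before = max(usage_before.values()) - min(usage_before.values())
spread_after = max(usage_after.values()) - min(usage_after.values())
```

The gap between the fullest and emptiest disk shrinks from 74 percentage points to 4, while the total stored data stays at 74 G and every disk ends up with 4 PGs.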
For the data balancing scenario, this embodiment provides an optimization scheme that balances the data volume carried by each disk: data migration in the storage system is triggered according to a balancing strategy based on PG distribution uniformity, so that the space usage rates of the disks converge. This improves the effective utilization of disk space, balances the read-write load across disks, and improves the performance of the whole storage system.
Example 2
Referring to fig. 3, the disclosure provides an electronic device, including a memory, a processor, and a computer program stored on the memory and capable of running on the processor, where the processor implements an optimization method for data balancing in a distributed storage system according to any one of the methods provided above when executing the computer program.
The electronic device described in this embodiment is the device used to implement the optimization method for data balancing in a distributed storage system described above. Based on that method, those skilled in the art can understand the specific implementation of the electronic device and its variations, so how the device implements the method is not described in detail here. Any electronic device used by those skilled in the art to implement the optimization method for data balancing in a distributed storage system according to the embodiments of the present application falls within the intended scope of protection of the present application.
Example 3
This embodiment discloses a computer-readable storage medium, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor, when executing the computer program, implements the optimization method for data equalization in a distributed storage system according to any one of the methods provided above.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any other combination. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer program are loaded or executed on a computer, the processes or functions described in accordance with embodiments of the present invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from one website site, computer, server, or data center to another website site, computer, server, or data center over a wired network or a wireless network. The computer readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains one or more sets of available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely one, and there may be additional divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Finally: the foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (10)

1. A method for optimizing data equalization in a distributed storage system, comprising:
Step S1: calculating the total capacity $XZR$ of the storage system;
Step S2: calculating the number of PGs $CPS_i$ that each disk in the storage system is expected to carry;
Step S3: calculating the total used capacity $XZS$ of the storage system;
Step S4: calculating the average usage rate $PJL$ of the storage system and the expected used capacity $CZS_i$ of each disk;
Step S5: analyzing each disk in the storage system, determining whether the disk is marked as a deficient disk, and calculating, for each deficient disk, the deficient PG count $BPS_i$, the deficient capacity $BZR_i$, and the data volume $RPS_i$ expected of a PG migrated in;
Step S6: analyzing each disk in the storage system, determining whether the disk is marked as a migration disk, analyzing each PG on the migration disk, and determining whether a migration instruction is generated; marking the PG corresponding to a migration instruction as a migration PG, and moving the migration PG to the deficient disk.
2. The method for optimizing data equalization in a distributed storage system of claim 1, wherein the total capacity $XZR$ of the storage system is calculated as follows:
the capacity $CPR_i$ of each disk in the storage system is collected through an API, and
$$XZR = \sum_{i=1}^{m} CPR_i$$
where $i$ denotes the $i$-th disk, $m$ is the total number of disks in the storage system, and $i \in \{1, \dots, m\}$.
3. The method for optimizing data equalization in a distributed storage system of claim 2, wherein the number of PGs $CPS_i$ that each disk in the storage system is expected to carry is calculated as:
$$CPS_i = PGZ \times \frac{CPR_i}{XZR}$$
where $PGZ$ is the total number of PGs in the storage system.
4. The method for optimizing data equalization in a distributed storage system of claim 3, wherein the total used capacity $XZS$ of the storage system is calculated as follows:
the used capacity $SPR_i$ of each disk in the storage system is collected through the API, and
$$XZS = \sum_{i=1}^{m} SPR_i$$
5. The method for optimizing data equalization in a distributed storage system of claim 4, wherein the average usage rate $PJL$ is calculated as:
$$PJL = \frac{XZS}{XZR} \times 100\%$$
6. The method for optimizing data equalization in a distributed storage system of claim 5, wherein the expected used capacity $CZS_i$ of each disk is calculated as:
$$CZS_i = CPR_i \times PJL$$
7. The method for optimizing data equalization in a distributed storage system of claim 6, wherein the judging whether to mark the disk as an insufficient disk comprises:
if the number of PGs carried by the disk is smaller than the expected PG number $CPS_i$ and the used capacity $SPR_i$ is smaller than the expected used capacity $CZS_i$, marking the disk as an insufficient disk;
if the number of PGs carried by the disk is greater than or equal to the expected PG number $CPS_i$, or the used capacity $SPR_i$ is greater than or equal to the expected used capacity $CZS_i$, not marking the disk as an insufficient disk.
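The marking rule of claim 7 reduces to a single predicate. The sketch below assumes that the used capacity of the disk is the quantity compared with the expected used capacity; the function and parameter names are hypothetical.

```python
def is_insufficient(pg_count, expected_pgs, used, expected_used):
    """True when the disk carries fewer PGs than expected AND is less full than expected."""
    return pg_count < expected_pgs and used < expected_used


# Hypothetical disk: 40 PGs carried against 64 expected, 1 TB used against 2 TB expected.
under_loaded = is_insufficient(40, 64, 1.0, 2.0)
# Same PG shortfall, but the disk already holds more data than expected.
already_full = is_insufficient(40, 64, 3.0, 2.0)
```

Both conditions have to hold, so a disk that is short of PGs but already holds more data than expected is not selected as a migration target.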
8. The method for optimizing data equalization in a distributed storage system of claim 7, wherein the method for calculating the insufficient PG number $BPS_i$ comprises:
$BPS_i = CPS_i - PGC_i$;
wherein $PGC_i$ is the number of PGs carried by the disk;
the method for calculating the insufficient capacity $BZR_i$ comprises:
$BZR_i = CZS_i - SPR_i$;
the method for calculating the data amount $RPS_i$ of a PG expected to be migrated in comprises:
$RPS_i = \frac{BZR_i}{BPS_i}$.
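The three quantities of claim 8 can be computed together for one insufficient disk. In this sketch the per-PG amount is assumed to be the capacity shortfall divided by the PG shortfall ($RPS_i = BZR_i / BPS_i$), which is an inference from the claim wording rather than a formula given in the text; the function name and sample values are hypothetical.

```python
def shortfall(expected_pgs, pg_count, expected_used, used):
    """Return (BPS_i, BZR_i, RPS_i) for a disk already marked as insufficient."""
    bps = expected_pgs - pg_count  # insufficient PG number BPS_i
    bzr = expected_used - used     # insufficient capacity BZR_i
    if bps <= 0:
        raise ValueError("disk is not short of PGs")
    rps = bzr / bps                # assumed per-PG data amount RPS_i
    return bps, bzr, rps


# Hypothetical disk: 64 PGs expected, 40 carried; 2000 GB expected, 1400 GB used.
bps, bzr, rps = shortfall(64, 40, 2000, 1400)
```

With these numbers the disk is 24 PGs and 600 GB short, so a PG of roughly 25 GB is the preferred candidate to migrate in.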
9. The method for optimizing data equalization in a distributed storage system of claim 8, wherein the judging whether to mark the disk as a migration disk comprises:
if the number of PGs carried by the disk is greater than the expected PG number $CPS_i$ and the used capacity $SPR_i$ is greater than the expected used capacity $CZS_i$, marking the disk as a migration disk;
if the number of PGs carried by the disk is smaller than or equal to the expected PG number $CPS_i$, or the used capacity $SPR_i$ is smaller than or equal to the expected used capacity $CZS_i$, not marking the disk as a migration disk.
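Taken together, the marking rules of claims 7 and 9 split the disks into three groups. The following sketch shows one way to express that; the `DiskRole` enum and the `BALANCED` label for disks that receive neither mark are hypothetical names, not terms from the claims.

```python
from enum import Enum


class DiskRole(Enum):
    INSUFFICIENT = "insufficient"  # migration target (claim 7)
    MIGRATION = "migration"        # migration source (claim 9)
    BALANCED = "balanced"          # hypothetical label: neither mark applies


def classify(pg_count, expected_pgs, used, expected_used):
    """Apply the claim 7 and claim 9 conditions to one disk."""
    if pg_count < expected_pgs and used < expected_used:
        return DiskRole.INSUFFICIENT
    if pg_count > expected_pgs and used > expected_used:
        return DiskRole.MIGRATION
    return DiskRole.BALANCED


# Hypothetical disks: (PGs carried, PGs expected, used TB, expected used TB).
roles = [classify(*disk) for disk in [(40, 64, 1.0, 2.0), (90, 64, 3.0, 2.0), (70, 64, 1.5, 2.0)]]
```

The third disk carries more PGs than expected but less data, so it is neither a source nor a target under these conditions.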
10. The method for optimizing data equalization in a distributed storage system of claim 9, wherein the judging whether to generate a migration instruction comprises:
subtracting the data amount $RPS_i$ of a PG expected to be migrated in from the data amount of each PG in the migration disk to obtain a difference;
if the absolute value of the difference is smaller than a difference threshold, generating a migration instruction;
if the absolute values of the differences are all greater than or equal to the difference threshold, not generating a migration instruction.
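The selection test of claim 10 can be sketched as a filter over the PGs of one migration disk. The function name, the PG identifiers, and the sample sizes and threshold are hypothetical; the claims do not say how the threshold is chosen or how many of the matching PGs are actually moved.

```python
def select_migration_pgs(pg_sizes, rps, threshold):
    """Return the IDs of PGs whose data amount is within `threshold` of RPS_i."""
    return [
        pg_id
        for pg_id, size in pg_sizes.items()
        if abs(size - rps) < threshold  # |PG data amount - RPS_i| < difference threshold
    ]


# Hypothetical migration disk: PG data amounts in GB, target size 25 GB, threshold 3 GB.
pgs = {"pg.1": 24, "pg.2": 40, "pg.3": 27}
selected = select_migration_pgs(pgs, rps=25, threshold=3)
```

An empty result corresponds to the case in which no migration instruction is generated for that disk.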
CN202311722336.4A 2023-12-14 2023-12-14 Optimization method for data equalization in distributed storage system Pending CN117850680A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311722336.4A CN117850680A (en) 2023-12-14 2023-12-14 Optimization method for data equalization in distributed storage system

Publications (1)

Publication Number Publication Date
CN117850680A true CN117850680A (en) 2024-04-09

Family

ID=90530291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311722336.4A Pending CN117850680A (en) 2023-12-14 2023-12-14 Optimization method for data equalization in distributed storage system

Country Status (1)

Country Link
CN (1) CN117850680A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111880747A (en) * 2020-08-01 2020-11-03 广西大学 Automatic balanced storage method of Ceph storage system based on hierarchical mapping
WO2022028033A1 (en) * 2020-08-01 2022-02-10 广西大学 Hierarchical mapping-based automatic balancing storage method for ceph storage system
CN112463050A (en) * 2020-11-26 2021-03-09 新华三技术有限公司成都分公司 Storage system capacity expansion method, device, equipment and machine-readable storage medium
CN113268203A (en) * 2021-05-18 2021-08-17 天津中科曙光存储科技有限公司 Capacity balancing method and device of storage system, computer equipment and storage medium
CN114047883A (en) * 2021-11-19 2022-02-15 北京天融信网络安全技术有限公司 Data equalization method and device based on distributed storage system
CN116991334A (en) * 2023-09-26 2023-11-03 苏州元脑智能科技有限公司 Data storage method, system, device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination