CN109828718B

CN109828718B - Disk storage load balancing method and device

Info

Publication number: CN109828718B
Application number: CN201811496766.8A
Authority: CN
Inventors: 余澈
Original assignee: China United Network Communications Group Co Ltd; Unicom Big Data Co Ltd
Current assignee: China United Network Communications Group Co Ltd; Unicom Big Data Co Ltd
Priority date: 2018-12-07
Filing date: 2018-12-07
Publication date: 2022-03-18
Anticipated expiration: 2038-12-07
Also published as: CN109828718A

Abstract

The application discloses a disk storage load balancing method and device. The method comprises the following steps: calculating the average value of the instantaneous storage utilization of the disks; calculating the average value of the historical storage utilization of each disk; generating a disk storage utilization list according to the average value of the instantaneous storage utilization of the disks; sorting the disks in the disk storage utilization list according to the instantaneous storage utilization average and the historical storage utilization average of each disk in the list, and determining a disk to be migrated and a target disk according to the sorted order; and migrating data from the disk to be migrated to the target disk. The method identifies the disk whose data should be migrated out first and the target disk that should receive data first, so that the data on each disk is effectively balanced, waste of cluster-node CPU and disk IO is avoided, cluster storage occupation is balanced, and cluster storage utilization is greatly improved.

Description

Disk storage load balancing method and device

Technical Field

The invention belongs to the technical field of data processing, and particularly relates to a disk storage load balancing method and device.

Background

Middleware provides applications with services beyond those provided by the operating system. As a component in the "middle layer", it bridges upper-layer applications and lower-layer services, and also bridges applications with one another (e.g., distributed service components). Distributed message middleware is the hardware or software infrastructure that supports sending and receiving messages in a distributed system; that is, the distributed message middleware is itself a distributed system.

Currently, high-throughput distributed message middleware relies on the operating system's file cache rather than on memory. It stores data offsets on each disk as the record on which data reads and writes are based. Data is partitioned by topic, the data volume of each topic differs, and data is placed on disks according to the number of topics already present on each disk.

When read/write concurrency and read/write speed increase, storage becomes unbalanced between nodes and between disks, and uneven disk IO (input/output) load becomes a major pain point of large-scale distributed message middleware. In practice, although the number of topics stored on each disk of a large distributed message middleware cluster does not differ much, the amount of data stored on each disk is severely unbalanced. Topics with large data volumes and high read/write throughput may land on some disks while topics with small data volumes land on the remaining disks, so the disk load is uneven, which in turn leads to uneven storage between nodes and uneven read/write efficiency. In addition, in big data scenarios the largest distributed message middleware clusters need more than 100 nodes, the number of nodes whose storage is wasted can reach half of the cluster, and the read/write concurrency achieved under cluster load is low.

Disclosure of Invention

To address the defects of the prior art, namely that data storage is unevenly distributed among the nodes and disks of distributed message middleware, and that in big data scenarios the middleware is prone to read/write peaks on individual disks that waste storage resources, the application provides a disk storage load balancing method and device.

The application provides a method for balancing disk storage load, which comprises the following steps:

calculating the average value of the instantaneous storage utilization rate of each disk;

calculating the average value of the historical storage utilization rate of each disk;

generating a disk storage utilization rate list according to the average value of the instantaneous storage utilization rates of the disks;

sorting the disks in the disk storage utilization ratio list according to the average value of the instantaneous storage utilization ratio and the average value of the historical storage utilization ratio of each disk in the disk storage utilization ratio list, and determining a disk to be migrated and a target disk according to the sorted sequence;

and migrating the data from the disk to be migrated to the target disk.

Optionally, the disk storage utilization list includes a first list and a second list, and the step of generating a disk storage utilization list according to the average value of the instantaneous storage utilization of each disk specifically includes:

respectively calculating the lower limit value and the upper limit value of the disk storage utilization rate of each disk according to the average value of the instantaneous storage utilization rate of each disk and a preset threshold value of a fluctuation parameter;

determining a disk with a disk storage utilization rate larger than an upper limit value of the disk storage utilization rate to generate the first list;

and determining the disk with the disk storage utilization rate smaller than the lower limit value of the disk storage utilization rate to generate the second list.

Optionally, the step of sorting the disks in the disk storage utilization ratio list according to the instantaneous storage utilization ratio average value and the historical storage utilization ratio average value of each disk in the disk storage utilization ratio list, and determining the disk to be migrated and the target disk according to a sorted sequence specifically includes:

calculating a first difference absolute value of the average value of the instantaneous storage utilization rate and the average value of the historical storage utilization rate of each disk in the first list, and calculating a second difference absolute value of the average value of the instantaneous storage utilization rate and the average value of the historical storage utilization rate of each disk in the second list;

sequencing the disks in the first list according to a descending order of the first difference absolute values to generate a first sequence, and sequencing the disks in the second list according to a descending order of the second difference absolute values to generate a second sequence;

and determining the disk to be migrated according to the priority order of the first sequence, and determining the target disk according to the priority order of the second sequence, wherein the disk ranked first in each of the first sequence and the second sequence has the highest priority.

Optionally, the method further includes:

judging whether the historical storage utilization rate average value is larger than the lower limit value of the disk storage utilization rate and smaller than the upper limit value of the disk storage utilization rate, and whether the historical storage utilization rate average value exists in the disk storage utilization rate list;

and if so, deleting the disk corresponding to the average value of the historical storage utilization rate in the disk storage utilization rate list.

Optionally, the step of calculating an average value of instantaneous storage utilization of each disk specifically includes:

by the formula

wherein i denotes the ith node, j denotes the jth disk of the ith node, X_ij is the storage utilization of the jth disk of the ith node, the node index i ranges from 0 to n, and the disk index j on a node ranges from 0 to m.

Optionally, the step of calculating the average value of the historical storage utilization rates of the disks specifically includes:

by the formula

wherein k is the label of a disk storage utilization state, n is the total number of distinct disk storage utilization labels, t_k is the length of time on the time axis occupied by label k, the utilization term is the disk storage utilization of the disk in the state with label k, and the result is the historical storage utilization average of the jth disk of the ith node in the period T.

Optionally, the step of migrating the data from the disk to be migrated to the target disk specifically includes:

by the formula

calculating the maximum migratable basic data of the disk to be migrated;

wherein, in BD_jp, j is the disk label information and p is the basic unit data label information, and the union term is the union of the basic unit data conforming to the migration standard;

and migrating the maximum transferable basic data in the disk to be migrated to the target disk.

The present application further provides a device for balancing disk storage load, including:

the first calculation module is used for calculating the average value of the instantaneous storage utilization rate of each disk;

the second calculation module is used for calculating the average value of the historical storage utilization rate of each disk;

the list generation module is used for generating a disk storage utilization list according to the average value of the instantaneous storage utilization of each disk;

the determining module is used for sequencing the disks in the disk storage utilization ratio list according to the average value of the instantaneous storage utilization ratio and the average value of the historical storage utilization ratio of each disk in the disk storage utilization ratio list, and determining the disk to be migrated and the target disk according to the sequencing sequence;

and the migration module is used for migrating the data from the disk to be migrated to the target disk.

Optionally, the disk storage utilization list includes a first list and a second list, and the list generation module specifically comprises:

the first calculation submodule is used for respectively calculating the lower limit value and the upper limit value of the disk storage utilization rate of each disk according to the average value of the instantaneous storage utilization rate of each disk and a preset threshold value of a fluctuation parameter;

the first list generation submodule is used for determining a disk with a disk storage utilization rate larger than an upper limit value of the disk storage utilization rate so as to generate the first list;

and the second list generation submodule is used for determining the disk with the disk storage utilization rate smaller than the lower limit value of the disk storage utilization rate so as to generate the second list.

Optionally, the determining module specifically includes:

the second calculation submodule is used for calculating a first difference absolute value of the average value of the instantaneous storage utilization rate and the average value of the historical storage utilization rate of each disk in the first list and calculating a second difference absolute value of the average value of the instantaneous storage utilization rate and the average value of the historical storage utilization rate of each disk in the second list;

the sorting submodule is used for sorting the disks in the first list according to the descending order of the first difference absolute values to generate a first sequence, and sorting the disks in the second list according to the descending order of the second difference absolute values to generate a second sequence;

the determining submodule is used for determining the disk to be migrated according to the priority order of the first sequence and determining the target disk according to the priority order of the second sequence, wherein the disk ranked first in each of the first sequence and the second sequence has the highest priority.

Optionally, the apparatus further comprises:

the judging module is used for judging whether the historical storage utilization rate average value is larger than the lower limit value of the disk storage utilization rate and smaller than the upper limit value of the disk storage utilization rate and whether the historical storage utilization rate average value exists in the disk storage utilization rate list;

Optionally, the first computing module specifically includes:

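a fifth calculation submodule, configured to calculate the average value of the instantaneous storage utilization of each disk by the formula

Optionally, the second calculating module specifically includes:

a sixth calculation submodule, configured to calculate the average value of the historical storage utilization of each disk by the formula

wherein the utilization term is the disk storage utilization of the disk in a given state, and the result is the historical storage utilization average of the jth disk of the ith node in the period T.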

Optionally, the migration module specifically includes:

a seventh calculation submodule, configured to calculate, by the formula

the maximum migratable basic data of the disk to be migrated;

wherein, in BD_jp, j is the disk label information and p is the basic unit data label information, and the union term is the union of the basic unit data conforming to the migration standard;

and the migration submodule is used for migrating the maximum migratable basic data in the disk to be migrated to the target disk.

According to the method and the device, the disk storage utilization list is generated from the instantaneous storage utilization average of the disks; the disks in the list are then sorted according to the instantaneous storage utilization average and the historical storage utilization average of each disk, the disk to be migrated and the target disk are determined from the sorted order, and data is migrated from the disk to be migrated to the target disk. The method is flexible, automatic and intelligent, and is embedded in the message middleware. It identifies the disk whose data should be migrated out first and the target disk that should receive data first, and it tracks and controls disk storage utilization over the long term, which effectively avoids frequent disk data migration. The data on each disk is effectively balanced while waste of cluster-node CPU and disk IO (input/output) is avoided, cluster storage occupation is balanced, and cluster storage utilization is greatly improved.

Drawings

Fig. 1 is a flowchart of a disk storage load balancing method according to a first embodiment of the present application;

Fig. 2 is a flowchart of a disk storage load balancing method according to a second embodiment of the present application;

Fig. 3 is a schematic structural diagram of a disk storage load balancing apparatus according to a third embodiment of the present application;

Fig. 4 is a schematic structural diagram of a disk storage load balancing apparatus according to a fourth embodiment of the present application.

Detailed Description

In order to make the technical solutions of the present invention better understood, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.

The application provides a disk storage load balancing method and device. The following detailed description is made with reference to the drawings of the embodiments provided in the present application, respectively.

A method for balancing disk storage load provided in a first embodiment of the present application is as follows:

The method of this embodiment is executed by a node coordinator. Fig. 1 shows a flowchart of the disk storage load balancing method provided by this embodiment, which includes the following steps.

Step S101, calculating the average value of the instantaneous storage utilization of each disk.

Step S102, calculating the average value of the historical storage utilization of each disk.

Step S103, generating a disk storage utilization list according to the average value of the instantaneous storage utilization of the disks.

Step S104, sorting the disks in the disk storage utilization list according to the instantaneous storage utilization average and the historical storage utilization average of each disk in the list, and determining the disk to be migrated and the target disk according to the sorted order.

Step S105, migrating data from the disk to be migrated to the target disk.

A method for balancing disk storage load provided in a second embodiment of the present application is as follows:

The method of this embodiment is executed by a node coordinator. Fig. 2 shows a flowchart of the disk storage load balancing method provided by this embodiment, which includes the following steps.

Step S201, calculating an average value of the instantaneous storage utilization of each disk.

In this embodiment, a Coordinator serves as the node coordinator of the distributed message middleware and periodically collects the multidimensional indexes of each node. Each node of the distributed cluster transmits its own state data and metadata to the node coordinator; the node coordinator collects the index data of every disk in the cluster, caches it in memory, and stores it in a distributed database or a time-series database.

The multidimensional indexes are basic stateless information collected from the cluster nodes, such as node information (node host name, node IP, label information of each disk, storage utilization information of each disk, and the like) and the basic unit data under each disk (basic unit data label information, storage utilization of the basic unit data, and the like). The collected multidimensional indexes can also be stored in a time-series database; calculating over this historical data makes it possible to avoid moving data in response to sudden increases or decreases, which avoids wasting node CPU and IO.
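The multidimensional indexes described above could be modelled, as a minimal sketch, with a few record types. All class names, field names, and sample values below are hypothetical and chosen only for illustration; the source does not specify a schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UnitMetric:
    """One basic unit of data on a disk (hypothetical field names)."""
    label: str          # basic unit data label information
    utilization: float  # storage utilization of the unit, 0.0-1.0


@dataclass
class DiskMetric:
    label: str          # disk label information, e.g. a mount point
    utilization: float  # storage utilization of the disk, 0.0-1.0
    units: List[UnitMetric] = field(default_factory=list)


@dataclass
class NodeMetric:
    hostname: str       # node host name
    ip: str             # node IP
    disks: List[DiskMetric] = field(default_factory=list)


# Illustrative sample only; host name, IP and labels are made up.
sample = NodeMetric(
    hostname="node-0",
    ip="10.0.0.1",
    disks=[DiskMetric("/mnt/sata01", 0.5, [UnitMetric("topic-a-0", 0.25)])],
)
```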

Preferably, the step of calculating an average value of instantaneous storage utilization rates of the respective disks specifically includes: by the formula

calculating the average value of the instantaneous storage utilization of each disk; wherein i denotes the ith node, j denotes the jth disk of the ith node, X_ij is the storage utilization of the jth disk of the ith node, the node index i ranges from 0 to n, and the disk index j on a node ranges from 0 to m.

In the step, the node coordinator calculates an instantaneous storage utilization average BU of each disk according to the collected node information and the storage utilization information of each disk in the multi-dimensional indexes of each node. BU means the average storage utilization per disk in each node, i.e., the average of the instantaneous storage utilization.
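A minimal sketch of this calculation, assuming BU is the plain arithmetic mean of X_ij over every disk of every node. The function name, the nested-dictionary layout, and the sample values are assumptions made for illustration.

```python
def instantaneous_average(utilization):
    """Mean storage utilization over every disk of every node.

    `utilization` maps node id -> {disk label -> X_ij}, with X_ij in [0.0, 1.0].
    Assumption: BU is the unweighted mean of all X_ij values.
    """
    values = [x for disks in utilization.values() for x in disks.values()]
    if not values:
        raise ValueError("no disks reported")
    return sum(values) / len(values)


# Hypothetical two-node cluster with two disks per node.
cluster = {
    "node-0": {"/mnt/sata01": 0.80, "/mnt/sata02": 0.40},
    "node-1": {"/mnt/sata01": 0.60, "/mnt/sata02": 0.20},
}
bu = instantaneous_average(cluster)  # approximately 0.5
```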

For example, let X_ij be the storage utilization of the jth disk of the ith node, where the node index i ranges from 0 to n and the disk index j on a node ranges from 0 to m. Then the formula for calculating the instantaneous storage utilization average BU of the disks is as follows:

Step S202, calculating the historical storage utilization average of each disk.

In this step, the node coordinator calculates the historical storage utilization average of each disk from the collected node information and the per-disk storage utilization information in the multidimensional indexes of each node, using the disk storage utilization information stored in the time-series database. This value is the historical storage utilization average of the jth disk of the ith node in the period T.

Preferably, the step of calculating the average value of the historical storage utilization rates of the respective disks specifically includes: by the formula

calculating the average value of the historical storage utilization of each disk; wherein k is the label of a disk storage utilization state, n is the total number of distinct disk storage utilization labels, t_k is the length of time on the time axis occupied by label k, and the utilization term is the disk storage utilization of the disk in the state with label k.

In this step, the period is T; the value of T is configurable and defaults to 24 hours. The historical storage utilization average of a given disk in the period T is calculated by the following formula:

where k denotes the label of a disk storage utilization state, n denotes the total number of distinct disk storage utilization labels, t_k denotes the length of time on the time axis occupied by label k, and the utilization term denotes the disk storage utilization of the disk in that state.

For example, assume that the list of disks whose storage utilization exceeds the upper storage limit contains a disk named /mnt/sata01, and that in the disk's history statistics its storage utilization was 50% for 22 hours, 60% for 1 hour, and 70% for 1 hour, with T being 24 hours. The historical storage utilization average can then be calculated according to the above formula.
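The figures in this example can be worked through with a short sketch. It assumes the historical average is the time-weighted mean, i.e. the sum of t_k times the utilization in state k, divided by T; that reading follows from the definitions of t_k and T but is an assumption, and the function name is hypothetical.

```python
def historical_average(segments, period_hours=24.0):
    """Time-weighted mean utilization over one period T.

    `segments` is a list of (t_k in hours, utilization in state k).
    Assumption: the average is sum(t_k * x_k) / T.
    """
    covered = sum(t for t, _ in segments)
    if abs(covered - period_hours) > 1e-9:
        raise ValueError("segments must cover the whole period T")
    return sum(t * x for t, x in segments) / period_hours


# 22 h at 50%, 1 h at 60%, 1 h at 70%, with T = 24 h.
avg = historical_average([(22, 0.50), (1, 0.60), (1, 0.70)])
print(f"{avg:.2%}")  # (11.0 + 0.6 + 0.7) / 24, about 51.25%
```

Under this reading, a disk that sat at 50% for most of the day and only briefly rose to 70% has a historical average close to 50%, which is what lets the later filtering step ignore short spikes.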

Step S203, generating a disk storage utilization rate list according to the average value of the instantaneous storage utilization rates of the disks.

In this step, the disk storage utilization list, i.e., the list of disks whose storage utilization exceeds a threshold, includes a first list and a second list. The first list contains the disks whose storage utilization is too high, and the second list contains the disks whose storage utilization is too low. The upper and lower limit thresholds of disk storage utilization are determined by setting the threshold of the effective fluctuation parameter.

Preferably, the disk storage utilization list includes a first list and a second list, and the step of generating a disk storage utilization list according to the average value of the instantaneous storage utilization of each disk specifically includes: respectively calculating the lower limit value and the upper limit value of the disk storage utilization according to the average value of the instantaneous storage utilization of each disk and a preset threshold value of a fluctuation parameter; determining the disks whose storage utilization is larger than the upper limit value to generate the first list; and determining the disks whose storage utilization is smaller than the lower limit value to generate the second list.

In this step, the threshold of the fluctuation parameter is denoted VPT and can be set flexibly as required. The lower limit value of disk storage utilization is the minimum disk storage utilization, calculated as BU_min = BU - VPT.

The upper limit value of disk storage utilization is the maximum disk storage utilization, calculated as BU_max = BU + VPT.

For each disk, if its storage utilization X_ij > BU_max, the node coordinator adds it to the first list; if its storage utilization X_ij < BU_min, the node coordinator adds it to the second list. In both the first list and the second list, each entry is stored in the format (disk name, disk storage utilization).
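The threshold test can be sketched as follows, using BU_min = BU - VPT and BU_max = BU + VPT and the (disk name, disk storage utilization) entry format. The function name, disk names, and numbers are illustrative assumptions.

```python
def build_lists(disk_utilization, bu, vpt):
    """Split disks into (first_list, second_list) around BU +/- VPT.

    first_list:  disks with X_ij > BU_max (utilization too high)
    second_list: disks with X_ij < BU_min (utilization too low)
    Entries use the (disk name, disk storage utilization) format.
    """
    bu_min, bu_max = bu - vpt, bu + vpt
    first_list = [(d, x) for d, x in disk_utilization.items() if x > bu_max]
    second_list = [(d, x) for d, x in disk_utilization.items() if x < bu_min]
    return first_list, second_list


# Hypothetical disks; with BU = 0.50 and VPT = 0.10 the band is (0.40, 0.60).
disks = {"d1": 0.80, "d2": 0.55, "d3": 0.20, "d4": 0.45}
first, second = build_lists(disks, bu=0.50, vpt=0.10)
```

Here d1 lands in the first list and d3 in the second, while d2 and d4 stay inside the band and appear in neither list.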

Step S204, judging whether the average value of the historical storage utilization rate is larger than the lower limit value of the disk storage utilization rate and smaller than the upper limit value of the disk storage utilization rate, and whether the average value of the historical storage utilization rate exists in the disk storage utilization rate list, if so, executing step S205; if not, the process ends.

Step S205, deleting the disk corresponding to the average value of the historical storage utilization in the disk storage utilization list.

In the above steps, after the disk storage utilization rate list is obtained, the disk storage utilization rate list is further filtered.

The specific filtering scheme is as follows: the disk storage utilization list is updated based on the pre-calculated historical storage utilization average of each disk in the list. If the historical storage utilization average of a disk lies within the range (BU_min, BU_max), the disk is removed from the first list or the second list. If it lies outside the range (BU_min, BU_max), the disk is left in the list and the subsequent processing is performed.

The critical suppression value, i.e., the historical storage utilization average, is calculated from historical data stored in the time-series database. Its advantage is that a stable value is calculated from historical data, and this value can serve as a reference for the disk's storage utilization within the period. Using the critical suppression value, disks whose storage has increased suddenly can be filtered out of the disk storage utilization list; data migration is performed on such a disk only after its critical suppression value meets the standard. This effectively avoids the waste of CPU, memory and disk IO (input/output) caused by frequent disk data migration, and avoids affecting the data service operation of the message middleware.
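The filtering rule can be sketched as a single pass over a list: an entry is dropped when the disk's historical average lies strictly inside (BU_min, BU_max). The function name and sample data are hypothetical.

```python
def filter_by_history(entries, history, bu_min, bu_max):
    """Drop disks whose historical average lies inside (bu_min, bu_max).

    `entries` is a first or second list of (disk name, utilization);
    `history` maps disk name -> historical storage utilization average.
    """
    return [
        (disk, x) for disk, x in entries
        if not (bu_min < history[disk] < bu_max)
    ]


first_list = [("d1", 0.80), ("d5", 0.75)]
history = {"d1": 0.78, "d5": 0.50}  # d5 only spiked recently
kept = filter_by_history(first_list, history, bu_min=0.40, bu_max=0.60)
```

In this example d5 is removed because its historical average is still inside the band, so its current high reading is treated as a sudden increase rather than a sustained imbalance.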

Step S206, sorting the disks in the disk storage utilization ratio list according to the average value of the instantaneous storage utilization ratio and the average value of the historical storage utilization ratio of each disk in the disk storage utilization ratio list, and determining the disk to be migrated and the target disk according to the sorted sequence.

In this step, the disk to be migrated with data migrated first and the target disk to which data is migrated first are determined based on the disk storage utilization lists updated in step S204 and step S205.

Preferably, the step of sorting the disks in the disk storage utilization list according to the instantaneous storage utilization average and the historical storage utilization average of each disk in the list, and determining the disk to be migrated and the target disk according to the sorted order, specifically includes: calculating a first absolute difference between the instantaneous storage utilization average and the historical storage utilization average of each disk in the first list, and calculating a second absolute difference between the instantaneous storage utilization average and the historical storage utilization average of each disk in the second list; sorting the disks in the first list in descending order of the first absolute differences to generate a first sequence, and sorting the disks in the second list in descending order of the second absolute differences to generate a second sequence; and determining the disk to be migrated according to the priority order of the first sequence, and determining the target disk according to the priority order of the second sequence, wherein the disk ranked first in each of the first sequence and the second sequence has the highest priority.

In this step, the absolute difference |X_ijG| = |X_ijGn - BU| of each disk in the updated disk storage utilization list is first calculated; the disks in the first list yield the first absolute differences, and the disks in the second list yield the second absolute differences.

Then, the disks in the first list are sorted in descending order of |X_ijG| to obtain the first sequence, and the disks in the second list are sorted in descending order of |X_ijG| to obtain the second sequence. That is, the list of disks with too high storage utilization yields a sequence ordered by the first absolute differences, and the list of disks with too low storage utilization yields a sequence ordered by the second absolute differences.

Finally, the disk to be migrated is determined according to the priority order of the first sequence: a disk with a larger absolute difference |X_ijG|, i.e., a disk ranked earlier, has its data migrated first and its load balanced first. The target disk is determined according to the priority order of the second sequence: a disk with a larger absolute difference |X_ijG|, i.e., a disk ranked earlier, receives migrated data first.
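A minimal sketch of the ordering step. It assumes the absolute difference is taken between each disk's historical storage utilization average and BU, which is one reading of the description; the function name and the sample values are hypothetical.

```python
def prioritize(entries, history, bu):
    """Sort (disk name, utilization) entries by |historical average - BU|,
    largest first. Assumption: this is the absolute difference intended."""
    return sorted(entries, key=lambda e: abs(history[e[0]] - bu), reverse=True)


history = {"d1": 0.78, "d6": 0.90, "d3": 0.22, "d7": 0.10}
first_seq = prioritize([("d1", 0.80), ("d6", 0.88)], history, bu=0.50)
second_seq = prioritize([("d3", 0.20), ("d7", 0.12)], history, bu=0.50)

# The head of each sequence has the highest priority.
source_disk, target_disk = first_seq[0][0], second_seq[0][0]
```

With these numbers d6 deviates most on the high side and d7 most on the low side, so d6 becomes the disk to be migrated and d7 the target disk.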

For example, the node coordinator collects the disk storage data of each data node. The data is first stored in memory, and BU, BU_min and BU_max are calculated. Meanwhile, the real-time data is stored in a time-series database, such as the mainstream LevelDB, InfluxDB and the like; LevelDB is the more appropriate choice for this scenario. The historical data in the time-series database is then read to calculate the historical storage utilization average.

According to this value, it is judged whether the disk storage utilization list contains a disk whose data has suddenly increased or decreased; such a disk needs to be removed from the list. The filtered disk storage utilization list is then sorted, and the disk whose data is to be migrated out first and the target disk that is to receive data first are selected according to the resulting first sequence and second sequence.

Step S207, migrating the data from the disk to be migrated to the target disk.

In this step, data is migrated from the disk to be migrated, to which data is preferentially migrated, to the target disk, to which data is preferentially migrated.

Preferably, the step of migrating the data from the disk to be migrated to the target disk specifically includes: by the formula

calculating the maximum migratable basic data of the disk to be migrated; wherein, in BD_jp, j is the disk label information and p is the basic unit data label information, and the union term is the union of the basic unit data conforming to the migration standard; and migrating the maximum migratable basic data in the disk to be migrated to the target disk.

In this step, the subject data to be migrated is first determined based on the acquired disk to be migrated. According to the information (node, disk name) of the disk to be migrated preferentially acquired by the node coordinator, and the basic unit data (basic unit data label information, basic unit data storage utilization rate and the like) under each disk, the method passes through a formula

And calculating subject data needing to be migrated under the disk to be migrated.

Wherein, BD_jpJ is the magnetic disk label information and has the same meaning as j in a formula for calculating the average value of the instantaneous storage utilization rate, and p is the basic unit data label information. Data is stored and utilized more than BU from the disk_maxThe storage utilization rate of the magnetic disk transferred to the magnetic disk is less than BU_minThe magnetic disk of (a) a (b),

it is the VPT that is calculated,

calculating the union of basic unit data that meets the migration standard, so that after migration BD_jp approaches BU as closely as possible and the disk load stays balanced. Then, through the max function

the maximum value is taken, which determines the maximum transferable basic data BD_max that the disk to be migrated needs to transfer; that is, the subject data to be migrated from that disk is determined.
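Because the formula itself is not reproduced in this text, the selection it describes can only be illustrated under assumptions. The sketch below assumes that the "migration standard" means a set of basic units may be moved only if the disk does not fall below the balanced level BU, and that the max function picks the largest such set. The function name, the percent units, and the brute-force search are all hypothetical.

```python
from itertools import combinations


def max_transferable_units(unit_util, disk_util, balanced_util):
    """Pick the union of basic units with the largest total utilization
    that can leave the disk without dropping it below balanced_util.

    unit_util     : {basic unit label p: BD_jp in percent}
    disk_util     : current utilization of the disk to be migrated, in percent
    balanced_util : assumed target level BU, in percent
    Returns (labels, total). Brute force: cost grows exponentially with the
    number of basic units, so this is only suitable for illustration.
    """
    surplus = disk_util - balanced_util
    best_labels, best_total = (), 0
    labels = sorted(unit_util)
    for size in range(1, len(labels) + 1):
        for combo in combinations(labels, size):
            total = sum(unit_util[p] for p in combo)
            # Keep the largest union that still fits within the surplus.
            if best_total < total <= surplus:
                best_labels, best_total = combo, total
    return best_labels, best_total
```

With units of 10, 15, and 20 percent on a disk at 90 percent and a balanced level of 60 percent, the sketch selects the 10 and 20 percent units, which together exactly consume the 30 percent surplus.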

After the subject data to be migrated from the disk to be migrated has been determined, data migration can be performed. The specific migration process is as follows:

The node coordinator commands the disk to be migrated to first copy the basic unit data BD_max obtained in the preceding step to the target disk. During copying, the node coordinator monitors the disk storage utilization rate in real time, and the migrating data node writes data update records into the source database (Metadata base) as the data changes. The source database compares these records to confirm that the data storage state of the nodes is normal, so that only after the data copy has completed normally does the node coordinator instruct the node holding the original, heavily used disk to delete the copied data.

For example: the node coordinator calculates, according to the algorithm above, that part of the unit data on disk 1 needs to be migrated to disk 2. The data is first copied to disk 2, and the copied portion is then removed from disk 1. Throughout the process, the node coordinator is responsible for management work such as coordination and scheduling, and the source database is responsible for metadata work such as change records.
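The copy-then-delete sequence in this example can be sketched in memory as follows. This is an illustrative sketch only: disks are modelled as dictionaries, the source database as a plain list of records, and the function name and record layout are hypothetical.

```python
def migrate(disks, source, target, unit_labels, metadata_log):
    """Copy the selected basic units to the target, verify, then delete from the source.

    disks        : {disk name: {basic unit label: payload}}
    metadata_log : list standing in for the source database; receives change records
    Returns True if the migration completed, False if verification failed.
    """
    # Phase 1: copy each unit and record the change.
    for label in unit_labels:
        disks[target][label] = disks[source][label]
        metadata_log.append(("copied", label, source, target))

    # Verification: compare the records and payloads before anything is deleted.
    recorded = {rec[1] for rec in metadata_log
                if rec[0] == "copied" and rec[2] == source and rec[3] == target}
    payloads_match = all(disks[target].get(label) == disks[source][label]
                         for label in unit_labels)
    if recorded != set(unit_labels) or not payloads_match:
        return False  # leave the source untouched if the copy is not confirmed

    # Phase 2: delete the copied units from the source disk.
    for label in unit_labels:
        del disks[source][label]
        metadata_log.append(("deleted", label, source, None))
    return True
```

Deleting only after verification means that a failed copy leaves the source disk intact, which matches the ordering described in the text.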

The embodiment of the application has the following beneficial effects:

1. Updating the disk storage utilization rate list according to the average value of the historical storage utilization rate of each disk over the period T allows long-term tracking and supervision of disks whose storage utilization rate is too high or too low. This effectively avoids frequent disk data migration, which would waste CPU, memory, and disk IO, and avoids affecting the data service operation of the message middleware.

2. The disks in the disk storage utilization rate list are sorted according to the absolute difference between the historical storage utilization rate average and the instantaneous storage utilization rate average, yielding the disk to be migrated (whose data is migrated out first) and the target disk (which receives migrated data first). This makes it possible to judge whether the basic unit data meets the migration standard, how urgent the migration is, and what priority it has, so that cluster-level load balance is guaranteed to the greatest extent.

3. The threshold of the fluctuation parameter can be set flexibly, and the disk-level average instantaneous storage utilization rate can be calculated quickly, accurately, and efficiently, providing a reliable basis for the subsequent confirmation, screening, and sorting of the disk storage utilization rate list.

A disk storage load balancing apparatus provided in a third embodiment of the present application is as follows:

The foregoing embodiments provide a disk storage load balancing method; correspondingly, the present application also provides a disk storage load balancing apparatus.

Fig. 3 is a schematic structural diagram illustrating a disk storage load balancing apparatus according to an embodiment of the present application. The apparatus includes the following modules.

The first calculation module 11 is configured to calculate an instantaneous storage utilization average of each disk;

the second calculating module 12 is configured to calculate an average value of historical storage utilization rates of the disks;

the list generating module 13 is configured to generate a disk storage utilization list according to the average value of the instantaneous storage utilization of each disk;

a determining module 14, configured to sort the disks in the disk storage utilization list according to the instantaneous storage utilization average value and the historical storage utilization average value of each disk in the list, and to determine a disk to be migrated and a target disk according to the sorted sequence;

and the migration module 15 is configured to migrate data from the disk to be migrated to the target disk.
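As a rough illustration of how the first calculation module and the list generating module might fit together, the sketch below averages utilization samples per disk and splits the disks into two lists. The formulas are not reproduced in this text, so the plain arithmetic mean is an assumption, as are the function name and the reading that the first list holds disks above the upper limit (the claims state only that the second list holds disks below the lower limit).

```python
from statistics import mean


def build_utilization_lists(samples_by_disk, lower_limit, upper_limit):
    """Return (first_list, second_list) of (disk name, mean utilization) pairs.

    samples_by_disk : {disk name: [instantaneous utilization samples]}
    first_list      : disks whose mean exceeds upper_limit (assumed meaning)
    second_list     : disks whose mean is below lower_limit
    """
    first_list, second_list = [], []
    for disk, samples in samples_by_disk.items():
        avg = mean(samples)  # assumed: simple arithmetic mean of the samples
        if avg > upper_limit:
            first_list.append((disk, avg))
        elif avg < lower_limit:
            second_list.append((disk, avg))
        # Disks between the limits are treated as balanced and are not listed.
    return first_list, second_list
```

The determining module would then sort each list and take the leading entry of each as the disk to be migrated and the target disk.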

A disk storage load balancing apparatus provided in a fourth embodiment of the present application is as follows:

Optionally, Fig. 4 shows a schematic structural diagram of a disk storage load balancing apparatus provided in this embodiment of the present application. Based on the third embodiment, in this embodiment the disk storage utilization list includes: the list generating module 13 specifically includes (not shown in the figure):

Optionally, as shown in fig. 4, the determining module 14 specifically includes (not shown in the figure):

optionally, as shown in fig. 4, the apparatus further includes:

a judging module 16, configured to judge whether the average historical storage utilization rate is greater than the lower limit of the disk storage utilization rate and less than the upper limit of the disk storage utilization rate, and whether the disk corresponding to that average exists in the disk storage utilization rate list;

and a deleting module 17, configured to delete, if yes, the disk corresponding to the average value of the historical storage utilization in the disk storage utilization list.
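The judging and deleting modules can be sketched together as a single pass over the historical averages. The function name and the data shapes are hypothetical; the condition follows the description above, in which a disk is dropped from the list when its historical average lies strictly between the lower and upper limits.

```python
def delete_balanced_disks(listed, history_means, lower_limit, upper_limit):
    """Return the utilization list with balanced disks removed.

    listed        : disk names currently in the disk storage utilization list
    history_means : {disk name: mean historical utilization over period T}
    """
    remaining = list(listed)  # work on a copy; the caller's list is unchanged
    for disk, hist in history_means.items():
        # Judging module: historical mean within limits AND disk present in the list.
        if lower_limit < hist < upper_limit and disk in remaining:
            # Deleting module: drop the disk from the list.
            remaining.remove(disk)
    return remaining
```

A disk that appears in `history_means` but not in the list is simply ignored, since the second half of the judging condition fails for it.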

Optionally, as shown in fig. 4, the first calculating module 11 specifically includes (not shown in the figure):

a fifth calculation submodule for passing through the formula

Optionally, as shown in fig. 4, the second calculating module 12 specifically includes:

a sixth calculation submodule for passing through the formula

the disk storage utilization for a state of the disk,

Optionally, as shown in fig. 4, the migration module 15 specifically includes (not shown in the figure):

a seventh calculation submodule for passing the formula

Calculating the maximum transferable basic data of the disk to be migrated;

a basic unit data union set conforming to the migration standard;

It will be understood that the above embodiments are merely exemplary embodiments taken to illustrate the principles of the present invention, which is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims

1. A method for balancing disk storage load, comprising:

migrating data from the disk to be migrated to the target disk;

the step of calculating the historical storage utilization average value of each disk specifically includes:

by the formula

the disk storage utilization for a state of the disk,

the historical storage utilization rate average value of the jth disk of the ith node in the T period;

the list of disk storage utilizations comprises: the step of generating a disk storage utilization list according to the average value of the instantaneous storage utilization of each disk specifically includes:

determining a disk with a disk storage utilization rate smaller than a lower limit value of the disk storage utilization rate to generate the second list;

the step of sorting the disks in the disk storage utilization ratio list according to the average value of the instantaneous storage utilization ratio and the average value of the historical storage utilization ratio of each disk in the disk storage utilization ratio list, and determining the disk to be migrated and the target disk according to the sorted sequence specifically comprises the steps of:

2. The method for load balancing of disk storage according to claim 1, further comprising:

3. The method for load balancing of disk storage according to any one of claims 1 to 2, wherein the step of calculating the average value of the instantaneous storage utilization of each disk specifically includes:

by the formula

4. The method for load balancing of disk storage according to claim 1, wherein the step of migrating data from the disk to be migrated to the target disk specifically includes:

by the formula

Calculating the maximum transferable basic data of the disk to be migrated;

wherein BD_jp is the basic unit data storage utilization rate, j is the disk label information, p is the basic unit data label information,

a basic unit data union set conforming to the migration standard;

5. An apparatus for load balancing disk storage, comprising:

the migration module is used for migrating data from the disk to be migrated to the target disk;

the second calculation module specifically includes:

a sixth calculation submodule for passing through the formula

disk storage utilization for a state of a disk,

the list of disk storage utilizations comprises: the list generation module specifically comprises:

the second list generation submodule is used for determining the disk with the disk storage utilization rate smaller than the lower limit value of the disk storage utilization rate so as to generate the second list;

the determining module specifically includes:

and the determining submodule is used for determining the disk to be migrated according to the priority of the first sequence and determining the target disk according to the priority of the second sequence, wherein the disk ranked first in each of the first sequence and the second sequence has the highest priority.

6. The apparatus for disk storage load balancing according to claim 5, further comprising:

7. The apparatus for load balancing of disk storage according to any one of claims 5 to 6, wherein the first computing module specifically includes:

a fifth calculation submodule for passing through the formula

8. The apparatus for load balancing of disk storage according to claim 5, wherein the migration module specifically includes:

a seventh calculation submodule for passing the formula

Calculating the maximum transferable basic data of the disk to be migrated;

a basic unit data union set conforming to the migration standard;