CN109521958B - Delay processing method and device for data distribution - Google Patents

Delay processing method and device for data distribution

Info

Publication number
CN109521958B
CN109521958B
Authority
CN
China
Prior art keywords
cluster
delay timer
data distribution
data
timing
Prior art date
Legal status
Active
Application number
CN201811232307.9A
Other languages
Chinese (zh)
Other versions
CN109521958A (en)
Inventor
甄天桥
Current Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd filed Critical Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201811232307.9A
Publication of CN109521958A
Application granted
Publication of CN109521958B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0626Reducing size or complexity of storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647Migration mechanisms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a delay processing method and apparatus for data distribution. The method comprises the following steps: when a first trigger event triggering a cluster to perform data distribution is detected, starting a delay timer; before the timing duration of the delay timer reaches a preset duration, detecting whether a second trigger event triggering the cluster to perform data distribution occurs; if such a second trigger event is detected before the timing duration reaches the preset duration, restarting the delay timer; and redistributing the data in the cluster only when the timing duration of the delay timer reaches the preset duration. By merging all trigger events that trigger the cluster to perform data distribution within a short time, the cluster needs to perform data distribution only once, which reduces the computing resources the cluster must consume.

Description

Delay processing method and device for data distribution
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a delay processing method and apparatus for data distribution.
Background
In each cluster of a distributed storage system, when a hard disk is pulled out, the data stored on it is usually marked as degraded, that is, placed in an abnormal state awaiting processing. If the disk remains absent longer than a certain threshold, it is considered to have permanently exited the cluster: the data that was stored on it is restored onto other hard disks through a data recovery process, and the data in the cluster is redistributed. Conversely, when a disk that was considered to have permanently exited rejoins the cluster, or a new disk is added, the data in the cluster is balanced onto the newly added disk through a data balancing process, which again redistributes the cluster data. In short, whenever a hard disk permanently exits the cluster or a new hard disk joins it, the data in the cluster is typically redistributed.
In practice, several hard disks are often pulled out of and/or inserted into a cluster within a short time, and these removals and insertions do not happen simultaneously; there is some interval between them. As a result, the cluster redistributes its data several times within a short period, consuming considerable computing resources, and part of the data is migrated pointlessly back and forth between different hard disks, which wastes cluster computing resources to a certain extent.
Disclosure of Invention
The embodiments of the present application provide a delay processing method and apparatus for data distribution, so as to avoid wasting cluster computing resources on migrating the same data multiple times.
In a first aspect, an embodiment of the present application provides a method for processing delay of data distribution, where the method includes:
when detecting that a first trigger event triggering a cluster to perform data distribution exists, starting a delay timer to time;
before the timing duration of the delay timer reaches a preset duration, detecting whether a second trigger event triggering the cluster to perform data distribution exists or not;
if it is determined that a second trigger event triggering the cluster to perform data distribution exists before the timing duration of the delay timer reaches a preset duration, restarting the delay timer to perform timing;
and when the timing duration of the delay timer reaches the preset duration, redistributing the data in the cluster.
In some possible embodiments, the detecting that there is a first trigger event triggering the cluster to perform data distribution includes:
detecting that there is a first external memory to exit or join the cluster;
the detecting whether there is a second trigger event triggering the cluster to perform data distribution includes:
detecting whether a second external memory exits or joins the cluster.
In some possible embodiments, the external memory is embodied as a hard disk.
In some possible embodiments, the method further comprises:
recording a first version identification of an external memory state recording file in the cluster when the delay timer is started for timing;
when the time length of the delay timer reaches the preset time length, acquiring a second version identifier of an external memory state recording file in the cluster;
the redistributing the data in the cluster when the timing duration of the delay timer reaches the preset duration includes:
and when the timing duration of the delay timer reaches the preset duration and the first version identification is inconsistent with the second version identification, redistributing the data in the cluster.
In some possible embodiments, the method further comprises:
and when the first version identification is consistent with the second version identification, restarting the delay timer for timing.
In some possible embodiments, the second version identification is larger than the first version identification, the method further comprising:
and after the data in the cluster are redistributed, clearing the version number of the external memory state recording file.
In some possible embodiments, the redistributing the data includes:
determining a third external memory added to the cluster within a target time period, wherein the target time period is a time period from a first time when the first trigger event is detected to a second time when the timing duration of the delay timer reaches the preset duration;
allocating a portion of the data in the cluster to the third external memory;
determining a fourth external memory that exits the cluster within the target time period;
restoring data stored on the fourth external memory to a fifth external memory in the cluster.
In a second aspect, an embodiment of the present application further provides a delay processing apparatus for data distribution, where the apparatus includes:
the starting unit is used for starting the delay timer to time when detecting that a first trigger event triggering the cluster to perform data distribution exists;
the detection unit is used for detecting whether a second trigger event triggering the cluster to perform data distribution exists or not before the timing duration of the delay timer reaches a preset duration;
the first restarting unit is used for restarting the delay timer to time if determining that a second triggering event for triggering the cluster to perform data distribution exists before the timing duration of the delay timer reaches a preset duration;
and the data distribution unit is used for redistributing the data in the cluster when the timing duration of the delay timer reaches the preset duration.
In some possible embodiments, the detecting that there is a first trigger event triggering the cluster to perform data distribution specifically includes detecting that a first external memory exits or joins the cluster;
the detecting unit is specifically configured to detect whether a second external memory exits or joins the cluster.
In some possible embodiments, the external memory is embodied as a hard disk.
In some possible embodiments, the apparatus further comprises:
the recording unit is used for recording a first version identifier which represents the state of an external memory in the cluster when the delay timer is started for timing;
the obtaining unit is used for obtaining a second version identifier representing the state of an external memory in the cluster when the time length of the delay timer reaches the preset time length;
the data distribution unit is specifically configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration and the first version identifier is inconsistent with the second version identifier.
In some possible embodiments, the apparatus further comprises:
and the second restarting unit is used for restarting the delay timer for timing when the first version identifier is consistent with the second version identifier.
In some possible embodiments, the second version identification is larger than the first version identification, and the apparatus further comprises:
and the zero clearing unit is used for clearing the version number of the external memory state recording file after the data in the cluster is redistributed.
In some possible embodiments, the data distribution unit includes:
a first determining subunit, configured to determine a third external memory to be added to the cluster within a target time period, where the target time period is a time period from a first time when the first trigger event is detected to a second time when it is determined that the timing duration of the delay timer reaches the preset duration;
an allocation subunit, configured to allocate a part of the data in the cluster to the third external storage;
a second determining subunit, configured to determine a fourth external memory exiting the cluster within the target time period;
a restoring subunit, configured to restore the data stored in the fourth external storage to a fifth external storage in the cluster.
In the implementation manner of the embodiments of the present application, the trigger events that trigger the cluster to perform data distribution within a short time are merged, so that the cluster performs data distribution only once. This reduces the computing resources the cluster consumes and avoids wasting them on pointless repeated migration of part of the data in the cluster. Specifically, when a first trigger event triggering the cluster to perform data distribution is detected, a delay timer is started; before the timing duration of the delay timer reaches a preset duration, it is detected whether a second trigger event triggering the cluster to perform data distribution occurs; if such a second trigger event is detected before the timing duration reaches the preset duration, the delay timer is restarted, and the data in the cluster is redistributed only when the timing duration of the delay timer finally reaches the preset duration. Because the trigger events occurring within a short time are merged, the cluster does not execute a separate data distribution for each of them but executes data distribution only once, so the data in the cluster is migrated between hard disks only once. This avoids the waste of cluster computing resources caused by repeated migration of the same data, and since one data distribution consumes fewer computing resources than several, the performance of the cluster is improved to a certain extent.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario in an embodiment of the present application;
fig. 2 is a schematic flow chart of a delay processing method for data distribution according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a delay processing apparatus for data distribution in an embodiment of the present application.
Detailed Description
For each cluster of a distributed storage system, if a hard disk permanently exits the cluster or a new hard disk joins the cluster, the data in the cluster usually needs to be redistributed. If several hard disks are pulled out of and/or inserted into the cluster within a short time, these removals and insertions do not happen at exactly the same moment; there is some interval between them. The cluster therefore performs data distribution several times within a short period: part of the data is migrated pointlessly back and forth between different hard disks across those repeated distributions, which wastes cluster computing resources, and the repeated distributions themselves consume additional computing resources and degrade the performance of the cluster.
To solve the foregoing technical problem, the embodiments of the present application provide a delay processing method for data distribution, in which the trigger events that trigger the cluster to perform data distribution within a short time are merged so that the cluster performs data distribution only once, thereby reducing the computing resources the cluster consumes and avoiding pointless migration of part of the data in the cluster. Specifically, when a first trigger event triggering the cluster to perform data distribution is detected, a delay timer is started; before the timing duration of the delay timer reaches a preset duration, it is detected whether a second trigger event triggering the cluster to perform data distribution occurs; if such a second trigger event is detected before the timing duration reaches the preset duration, the delay timer is restarted, and the data in the cluster is redistributed only when the timing duration of the delay timer finally reaches the preset duration.
It can be seen that, by merging the trigger events that trigger the cluster to perform data distribution within a short time, the cluster does not need to execute data distribution once per trigger event; it executes data distribution only once, so the data in the cluster is migrated between hard disks only once. This avoids the waste of cluster computing resources caused by repeated migration of the same data, and because one data distribution consumes fewer computing resources than several, the performance of the cluster is improved to a certain extent.
For example, the embodiments of the present application may be applied to the exemplary application scenario shown in fig. 1. In this scenario, a cluster includes a master node and a plurality of slave nodes. The master node continuously detects whether a first hard disk is added to or removed from any node in the cluster (the master node itself or a slave node); if so, it starts a delay timer. While the timer is running, if the master node detects that a second hard disk is added to or removed from a node in the cluster, it restarts the delay timer so that timing begins again. When the timing duration of the delay timer reaches the preset duration, the master node controls the redistribution of the data in the cluster. In this way, although several hard disks are added to or removed from the cluster within a short time, the master node only needs to trigger data distribution once, which reduces the computing resources the cluster consumes on data distribution within that period.
It is to be understood that the above scenario is only one example of a scenario provided in the embodiment of the present application, and the embodiment of the present application is not limited to this scenario.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, various non-limiting embodiments of the present application are described below with reference to the accompanying drawings. It is to be understood that the described embodiments are only some, not all, embodiments of the present application. All other embodiments derived by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating a delay processing method for data distribution in an embodiment of the present application, where the method specifically includes:
s201: and when detecting that a first trigger event triggering the cluster to perform data distribution exists, starting a delay timer to time.
In practical application, if an external memory in the cluster, such as a hard disk or a floppy disk, exits the cluster, or a user adds a new external memory to the cluster, the cluster is triggered to perform data distribution. If the master node in the cluster detects such a trigger event, for example an external memory joining or exiting the cluster, the master node may start a delay timer.
S202: and before the timing duration of the delay timer reaches a preset duration, detecting whether a second trigger event triggering the cluster to perform data distribution exists.
It should be noted that, in this embodiment, in order to merge all trigger events occurring in the cluster within a period of time, the master node does not immediately redistribute the data in the cluster when it detects the first trigger event; instead, it starts the delay timer. Before the timing duration of the delay timer reaches the preset duration, the master node continues to detect whether a second trigger event triggering the cluster to perform data distribution occurs in the cluster.
Similar to the first trigger event, the second trigger event may also be any event that triggers the cluster to perform data distribution, for example the master node detecting that an external memory joins or exits the cluster.
In most practical scenarios, the external memory that joins or exits the cluster may be a hard disk.
S203: and if determining that a second trigger event for triggering the cluster to perform data distribution exists before the timing duration of the delay timer reaches the preset duration, restarting the delay timer to perform timing.
S204: and when the timing duration of the delay timer reaches the preset duration, redistributing the data in the cluster.
It should be noted that, if the master node detects that a second trigger event occurs in the cluster before the timing duration of the delay timer reaches the preset duration, it does not immediately redistribute the data in the cluster; instead, it restarts the delay timer so that timing begins again. Thus, although the first trigger event and the second trigger event occur within a short time, neither of them immediately triggers the cluster to perform data distribution; the data is redistributed only after the timing duration of the delay timer reaches the preset duration. In effect, the master node merges the first trigger event and the second trigger event, so that a cluster that would otherwise have executed data distribution twice needs to execute it only once. This reduces the number of data distributions the cluster executes within a short time, and therefore the computing resources it must consume, and avoids the waste caused by unnecessary migration of part of the data in the cluster among several hard disks.
It is understood that this embodiment is only illustrated with two trigger events occurring in the cluster within a short time. In practical applications, after detecting the second trigger event and restarting the delay timer, if the master node detects a third trigger event triggering the cluster to perform data distribution before the restarted timer reaches the preset duration, it restarts the delay timer again based on the third trigger event. This process repeats until a full timing period of the delay timer elapses without the master node detecting any further trigger event, at which point the data in the cluster is redistributed.
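To make the timer-restart behaviour concrete, the following is a minimal Python sketch of the merge-and-delay flow. It is an illustration only, not the patented implementation; the names DistributionDelayer and on_trigger_event and the 30-second preset duration are assumptions, and the redistribute callback stands in for whatever routine actually redistributes the cluster data.

    import threading

    PRESET_DURATION = 30.0  # assumed delay window in seconds; the application leaves the value open


    class DistributionDelayer:
        """Merges trigger events: every new event restarts the timer, and the
        cluster data is redistributed only once, after a quiet period."""

        def __init__(self, redistribute, preset=PRESET_DURATION):
            self._redistribute = redistribute  # callback that redistributes the cluster data
            self._preset = preset
            self._timer = None
            self._lock = threading.Lock()

        def on_trigger_event(self):
            """Called whenever a trigger event (a disk joining or exiting) is detected."""
            with self._lock:
                if self._timer is not None:
                    self._timer.cancel()  # a new event arrived before expiry: restart timing
                self._timer = threading.Timer(self._preset, self._on_expire)
                self._timer.start()

        def _on_expire(self):
            with self._lock:
                self._timer = None
            self._redistribute()  # executed once for the whole burst of trigger events

Under these assumptions, calling on_trigger_event() once for each detected disk change leads to a single invocation of the redistribute callback once no new event has arrived for a whole preset duration.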
As a specific example of redistributing the data in the cluster, the master node may first determine the third external memory (there may be one or more) that joined the cluster within a target time period, where the target time period runs from a first time, at which the master node detected the first trigger event, to a second time, at which the timing duration of the delay timer reached the preset duration. The master node then allocates part of the data in the cluster to the newly added third external memory according to a preset data distribution policy, so that the data is evenly distributed over all hard disks in the cluster. Next, the master node determines the fourth external memory (again, one or more) that exited the cluster within the target time period. Finally, the master node starts the corresponding data recovery process to restore the data that was stored on the fourth external memory onto a fifth external memory in the cluster, so that the data on the fourth external memory is not lost after it exits. In this way the data in the cluster ends up evenly distributed over the external memories in the cluster, and the redistribution is complete.
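A sketch of the redistribution step itself might look as follows (Python; every helper on the cluster object, such as disks_joined_between, balance_onto and recover, is a hypothetical name used for illustration and not an API defined by this application).

    def redistribute(cluster, first_event_time, timer_expire_time):
        """Redistribute cluster data once, covering every disk change in the
        target time period [first_event_time, timer_expire_time]."""
        # Disks that joined the cluster during the target period ("third external memory").
        joined = cluster.disks_joined_between(first_event_time, timer_expire_time)
        if joined:
            # Balance part of the existing data onto the new disks so that data
            # ends up evenly spread across all disks in the cluster.
            cluster.balance_onto(joined)

        # Disks that exited the cluster during the target period ("fourth external memory").
        exited = cluster.disks_exited_between(first_event_time, timer_expire_time)
        for disk in exited:
            # Recover the data that was stored on the exited disk onto a surviving
            # disk ("fifth external memory") so that no data is lost.
            survivor = cluster.pick_recovery_target(exclude=exited)
            cluster.recover(source=disk, target=survivor)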
In practical application, when an external memory joins or exits the cluster, the record file describing the state of the external memories in the cluster changes accordingly. For example, when an external memory is added to the cluster, the original state record file is updated to a record file that also contains the state of the newly added external memory. In one example, a version identifier, such as a version number, may therefore be attached to the external memory state record file. When a trigger event occurs, the delay timer is started and the record file updated for that event is generated, so the first version identifier of the updated record file can be recorded. When the timing duration of the delay timer reaches the preset duration, the second version identifier of the current, latest state record file in the cluster is obtained. If the state change of the hard disk has been persisted, the second version identifier will differ from the first version identifier. Therefore, to prevent an unpersisted state change of the external memory from affecting the redistribution of the data in the cluster, the version identifiers of the state record file can be compared to determine whether the state change has been persisted. If the state change of the external memory on the master node has not yet been persisted, for example because the master node is busy, redistribution of the data in the cluster can be refused even though the timing duration of the delay timer has reached the preset duration. Accordingly, in some embodiments, the master node allows the data in the cluster to be redistributed only when the timing duration of the delay timer reaches the preset duration and the first version identifier of the record file is inconsistent with the second version identifier.
Further, when the master node determines that the first version identifier of the record file is consistent with the second version identifier, the state change of the external memory has not yet been persisted, so the master node may restart the delay timer and wait for the state change to be persisted.
In practical applications, the version identifier of the record file can be represented by a number; for example, the first version identifier may be "version 1" and the second version identifier "version 2". The number corresponding to the version identifier of the current, latest record file is then larger than the previously recorded one, so when deciding whether to allow redistribution of the data in the cluster, the master node may check whether the second version identifier of the record file is larger than the first version identifier and, if so, allow the redistribution. Further, to prevent this round of delay processing from affecting the next one, the version identifier of the record file may be cleared after the delay processing is completed.
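Under the assumption that the version identifier is a plain integer that is cleared to zero after each round, the persistence check described above could be sketched as follows (Python; read_state_file_version and the other cluster methods are hypothetical names used for illustration).

    def on_timer_expired(cluster, first_version):
        """Decide whether redistribution may run when the delay timer fires."""
        # Current version of the external memory state record file.
        second_version = cluster.read_state_file_version()

        if second_version == first_version:
            # The state change has not been persisted yet (e.g. the master node is
            # busy): refuse to redistribute and restart the delay timer instead.
            cluster.restart_delay_timer()
            return

        if second_version > first_version:
            cluster.redistribute_data()
            # Clear the version so it cannot interfere with the next round of
            # delayed data distribution.
            cluster.clear_state_file_version()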
In some possible application scenarios, the master node of the cluster may be switched, that is, the role of master node may move from the current node to another node in the cluster. If the master node has to be switched while the delayed data distribution is being processed, the version identifier of the record file can be used to instruct the new master node to continue the delay processing, so that it is not lost. Specifically, the new master node checks whether the version identifier of the record file is zero. If it is zero, no delayed data distribution is currently pending. If it is not zero, a delayed data distribution is in progress, and the new master node then looks for an event waiting for delay processing. If no such event exists, it creates one, starts the delay timer, and executes the delay processing flow; if such an event already exists, the delay processing of the data distribution in the cluster simply continues on the basis of the existing event, and no additional handling is needed.
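The handover described above could be sketched as follows (Python; on_become_master, find_pending_delay_event and the other names are hypothetical), with the new master node using the recorded version identifier to decide whether a delayed distribution is still pending.

    def on_become_master(cluster):
        """Run by the newly elected master node after a master switchover."""
        version = cluster.read_state_file_version()
        if version == 0:
            return  # no delayed data distribution is currently pending

        # A delayed distribution was in progress on the previous master.
        event = cluster.find_pending_delay_event()
        if event is None:
            # No pending event survived the switchover: create one and start the
            # delay timer so the delay processing flow resumes from scratch.
            cluster.create_delay_event()
            cluster.start_delay_timer()
        # Otherwise the existing pending event carries the flow forward and no
        # extra handling is needed.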
In this embodiment, the trigger events that trigger the cluster to perform data distribution within a short time are merged, so that the cluster performs data distribution only once, which reduces the computing resources the cluster consumes and avoids wasting them on pointless migration of part of the data in the cluster. Specifically, when a first trigger event triggering the cluster to perform data distribution is detected, a delay timer is started; before the timing duration of the delay timer reaches a preset duration, it is detected whether a second trigger event triggering the cluster to perform data distribution occurs; if such a second trigger event is detected before the timing duration reaches the preset duration, the delay timer is restarted, and the data in the cluster is redistributed only when the timing duration of the delay timer finally reaches the preset duration. It can be seen that, by merging the trigger events occurring within a short time, the cluster does not need to execute a data distribution for each trigger event but executes data distribution only once, so the data in the cluster is migrated between hard disks only once. This avoids the waste of cluster computing resources caused by repeated migration of the same data, and since one data distribution consumes fewer computing resources than several, the performance of the cluster is improved to a certain extent.
In addition, the embodiment of the application also provides a delay processing device for data distribution. Referring to fig. 3, fig. 3 is a schematic structural diagram illustrating a delay processing apparatus for data distribution according to an embodiment of the present application, where the apparatus 300 includes:
a starting unit 301, configured to start a delay timer to time when detecting that a first trigger event triggering a cluster to perform data distribution exists;
a detecting unit 302, configured to detect whether a second trigger event triggering the cluster to perform data distribution exists before a timing duration of the delay timer reaches a preset duration;
a first restarting unit 303, configured to restart the delay timer for timing if it is determined that a second trigger event that triggers the cluster to perform data distribution exists before the timing duration of the delay timer reaches a preset duration;
a data distribution unit 304, configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration.
In some possible embodiments, the detecting that there is a first trigger event triggering the cluster to perform data distribution specifically includes detecting that a first external memory exits or joins the cluster;
the detecting unit 302 is specifically configured to detect whether there is a second external memory exiting from or joining the cluster.
In some possible embodiments, the external memory is embodied as a hard disk.
In some possible embodiments, the apparatus 300 further comprises:
the recording unit is used for recording a first version identifier which represents the state of an external memory in the cluster when the delay timer is started for timing;
the obtaining unit is used for obtaining a second version identifier representing the state of an external memory in the cluster when the time length of the delay timer reaches the preset time length;
the data distribution unit is specifically configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration and the first version identifier is inconsistent with the second version identifier.
In some possible embodiments, the apparatus 300 further comprises:
and the second restarting unit is used for restarting the delay timer for timing when the first version identifier is consistent with the second version identifier.
In some possible embodiments, the second version identification is larger than the first version identification, and the apparatus further comprises:
and the zero clearing unit is used for clearing the version number of the external memory state recording file after the data in the cluster is redistributed.
In some possible embodiments, the data distribution unit 304 includes:
a first determining subunit, configured to determine a third external memory to be added to the cluster within a target time period, where the target time period is a time period from a first time when the first trigger event is detected to a second time when it is determined that the timing duration of the delay timer reaches the preset duration;
an allocation subunit, configured to allocate a part of the data in the cluster to the third external storage;
a second determining subunit, configured to determine a fourth external memory exiting the cluster within the target time period;
a restoring subunit, configured to restore the data stored in the fourth external storage to a fifth external storage in the cluster.
In this embodiment, by merging the trigger events that trigger the cluster to perform data distribution within a short time, the cluster does not need to execute data distribution multiple times based on multiple trigger events; it executes data distribution only once, so the data in the cluster is migrated between hard disks only once. This avoids the waste of cluster computing resources caused by repeated migration of the same data, and because one data distribution consumes fewer computing resources than several, the performance of the cluster is improved to a certain extent.
The term "first" in names such as "first external memory", "first trigger event", "first version identifier" and "first restarting unit" in the embodiments of the present application is used only as an identifier and does not indicate an order. The same applies to "second", "third", "fourth" and "fifth".
As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that all or part of the steps in the methods of the above embodiments can be implemented by software plus a general hardware platform. Based on this understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a storage medium such as a read-only memory (ROM)/RAM, a magnetic disk, or an optical disk, and which includes several instructions for enabling a computer device (a personal computer, a server, or a network communication device such as a router) to execute the methods described in the embodiments or in parts of the embodiments of the present application.
The embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus embodiment is described relatively briefly because it is substantially similar to the method embodiment, and reference may be made to the description of the method embodiment for the relevant points. The apparatus embodiments described above are merely illustrative: the modules described as separate parts may or may not be physically separate, and the parts shown as modules may or may not be physical modules; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The above description is only an exemplary embodiment of the present application, and is not intended to limit the scope of the present application.

Claims (8)

1. A method for processing delay of data distribution, the method comprising:
when detecting that a first trigger event triggering a cluster to perform data distribution exists, starting a delay timer to time;
before the timing duration of the delay timer reaches a preset duration, detecting whether a second trigger event triggering the cluster to perform data distribution exists or not;
if it is determined that a second trigger event triggering the cluster to perform data distribution exists before the timing duration of the delay timer reaches a preset duration, restarting the delay timer to perform timing;
when the timing duration of the delay timer reaches the preset duration, redistributing the data in the cluster;
the method further comprises the following steps:
recording a first version identification of an external memory state recording file in the cluster when the delay timer is started for timing;
when the timing duration of the delay timer reaches the preset duration, acquiring a second version identifier of an external memory state recording file in the cluster;
the redistributing the data in the cluster when the timing duration of the delay timer reaches the preset duration includes:
and when the timing duration of the delay timer reaches the preset duration and the first version identification is inconsistent with the second version identification, redistributing the data in the cluster.
2. The method of claim 1, wherein the detecting that there is a first trigger event for data distribution comprises:
detecting that there is a first external memory to exit or join the cluster;
the detecting whether there is a second trigger event triggering the cluster to perform data distribution includes:
detecting whether a second external memory exits or joins the cluster.
3. Method according to claim 2, characterized in that the external memory is embodied as a hard disk.
4. The method of claim 1, further comprising:
and when the first version identification is consistent with the second version identification, restarting the delay timer for timing.
5. The method of claim 1 or 4, wherein the second version identification is larger than the first version identification, the method further comprising:
and after the data in the cluster are redistributed, clearing the version number of the external memory state recording file.
6. The method of claim 1, wherein the redistributing data in the cluster comprises:
determining a third external memory added to the cluster within a target time period, wherein the target time period is a time period from a first time when the first trigger event is detected to a second time when the timing duration of the delay timer reaches the preset duration;
allocating a portion of the data in the cluster to the third external memory;
determining a fourth external memory that exits the cluster within the target time period;
restoring data stored on the fourth external memory to a fifth external memory in the cluster.
7. A data-distributed latency processing apparatus, the apparatus comprising:
the starting unit is used for starting the delay timer to time when detecting that a first trigger event triggering the cluster to perform data distribution exists;
the detection unit is used for detecting whether a second trigger event triggering the cluster to perform data distribution exists or not before the timing duration of the delay timer reaches a preset duration;
the first restarting unit is used for restarting the delay timer to time if determining that a second triggering event for triggering the cluster to perform data distribution exists before the timing duration of the delay timer reaches a preset duration;
the data distribution unit is used for redistributing the data in the cluster when the timing duration of the delay timer reaches the preset duration;
the device further comprises:
the recording unit is used for recording a first version identifier which represents the state of an external memory in the cluster when the delay timer is started for timing;
the obtaining unit is used for obtaining a second version identifier representing the state of an external memory in the cluster when the timing duration of the delay timer reaches the preset duration;
the data distribution unit is specifically configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration and the first version identifier is inconsistent with the second version identifier.
8. The apparatus of claim 7, further comprising:
and the second restarting unit is used for restarting the delay timer for timing when the first version identifier is consistent with the second version identifier.
CN201811232307.9A 2018-10-22 2018-10-22 Delay processing method and device for data distribution Active CN109521958B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811232307.9A CN109521958B (en) 2018-10-22 2018-10-22 Delay processing method and device for data distribution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811232307.9A CN109521958B (en) 2018-10-22 2018-10-22 Delay processing method and device for data distribution

Publications (2)

Publication Number Publication Date
CN109521958A (en) 2019-03-26
CN109521958B (en) 2022-02-18

Family

ID=65772996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811232307.9A Active CN109521958B (en) 2018-10-22 2018-10-22 Delay processing method and device for data distribution

Country Status (1)

Country Link
CN (1) CN109521958B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104035728A (en) * 2014-03-31 2014-09-10 深圳英飞拓科技股份有限公司 Hard disk hot plug handling method, device and node
CN104461389A (en) * 2014-12-03 2015-03-25 上海新储集成电路有限公司 Automatically learning method for data migration in mixing memory
CN107395721A (en) * 2017-07-20 2017-11-24 郑州云海信息技术有限公司 A kind of method and system of metadata cluster dilatation
CN107422977A (en) * 2017-07-31 2017-12-01 北京小米移动软件有限公司 Trigger action processing method, device and computer-readable recording medium
CN107562382A (en) * 2017-08-30 2018-01-09 郑州云海信息技术有限公司 A kind of disk automatic dynamic expansion method and system based on timed task

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10387415B2 (en) * 2016-06-28 2019-08-20 International Business Machines Corporation Data arrangement management in a distributed data cluster environment of a shared pool of configurable computing resources


Also Published As

Publication number Publication date
CN109521958A (en) 2019-03-26

Similar Documents

Publication Publication Date Title
US9960963B2 (en) Dynamic client fail-over during a rolling patch installation based on temporal server conditions
US20160036924A1 (en) Providing Higher Workload Resiliency in Clustered Systems Based on Health Heuristics
CN107656705B (en) Computer storage medium and data migration method, device and system
CN109445927B (en) Task management method and device for storage cluster
CN110825495A (en) Container cloud platform recovery method, device, equipment and readable storage medium
CN109173270B (en) Game service system and implementation method
CN105786539B (en) File downloading method and device
CN110858168B (en) Cluster node fault processing method and device and cluster node
CN111541762A (en) Data processing method, management server, device and storage medium
CN111342986B (en) Distributed node management method and device, distributed system and storage medium
CN114064217A (en) Node virtual machine migration method and device based on OpenStack
CN109521958B (en) Delay processing method and device for data distribution
WO2017080362A1 (en) Data managing method and device
CN109189487B (en) Restarting method, system and related components of Ceph distributed storage system
CN111221468B (en) Storage block data deleting method and device, electronic equipment and cloud storage system
US10789129B1 (en) Rolling restoration of enterprise business services following service disruption
CN111158956A (en) Data backup method and related device for cluster system
CN108121514B (en) Meta information updating method and device, computing equipment and computer storage medium
CN110908821B (en) Method, device, equipment and storage medium for task failure management
CN114116317A (en) Data processing method, device, equipment and medium
CN113542398A (en) Control method, device, medium and equipment of distributed cluster system
CN113553217A (en) Data recovery method and device, storage medium and computer equipment
JP2018538632A (en) Method and device for processing data after node restart
CN111405313A (en) Method and system for storing streaming media data
JP2017037539A (en) Server control program, server control method, and server control device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant