CN109521958B - Delay processing method and device for data distribution - Google Patents

Delay processing method and device for data distribution

Info

Publication number
CN109521958B
CN109521958B
Authority
CN
China
Prior art keywords
cluster
delay timer
data distribution
data
timing
Prior art date
Legal status
Active
Application number
CN201811232307.9A
Other languages
Chinese (zh)
Other versions
CN109521958A (en)
Inventor
甄天桥
Current Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd filed Critical Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201811232307.9A
Publication of CN109521958A
Application granted
Publication of CN109521958B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0626Reducing size or complexity of storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647Migration mechanisms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a delay processing method and apparatus for data distribution. The method comprises the following steps: when a first trigger event triggering a cluster to perform data distribution is detected, starting a delay timer; before the timing duration of the delay timer reaches a preset duration, detecting whether a second trigger event triggering the cluster to perform data distribution occurs; if such a second trigger event is detected before the timing duration reaches the preset duration, restarting the delay timer; and redistributing the data in the cluster only when the timing duration of the delay timer reaches the preset duration. By merging all trigger events that trigger the cluster to perform data distribution within a short time, the cluster needs to perform data distribution only once, which reduces the computing resources the cluster must consume.

Description

Delay processing method and device for data distribution
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a delay processing method and apparatus for data distribution.
Background
In each cluster of a distributed storage system, when a hard disk is pulled out, the data stored on it is usually marked as degraded, that is, placed in an abnormal state awaiting processing. If the disk remains absent longer than a certain threshold, it is considered to have permanently exited the cluster: the data that was stored on it is restored onto other hard disks through a data recovery process, and the data in the cluster is redistributed. Conversely, when a disk that was considered to have permanently exited rejoins the cluster, or a new disk is added, the data in the cluster is balanced onto the newly added disk through a data balancing process, which again redistributes the cluster data. In short, whenever a hard disk permanently exits the cluster or a new hard disk joins it, the data in the cluster is typically redistributed.
In practice, several hard disks are often pulled out of and/or inserted into a cluster within a short time, and these removals and insertions do not happen simultaneously; there is some interval between them. As a result, the cluster redistributes its data several times within a short period, consuming considerable computing resources, and part of the data is migrated pointlessly back and forth between different hard disks, which wastes cluster computing resources to a certain extent.
Disclosure of Invention
The embodiments of the present application provide a delay processing method and apparatus for data distribution, so as to avoid wasting cluster computing resources on migrating the same data multiple times.
In a first aspect, an embodiment of the present application provides a method for processing delay of data distribution, where the method includes:
when detecting that a first trigger event triggering a cluster to perform data distribution exists, starting a delay timer to time;
before the timing duration of the delay timer reaches a preset duration, detecting whether a second trigger event triggering the cluster to perform data distribution exists or not;
if it is determined that a second trigger event triggering the cluster to perform data distribution exists before the timing duration of the delay timer reaches a preset duration, restarting the delay timer to perform timing;
and when the timing duration of the delay timer reaches the preset duration, redistributing the data in the cluster.
In some possible embodiments, the detecting that there is a first trigger event triggering the cluster to perform data distribution includes:
detecting that there is a first external memory to exit or join the cluster;
the detecting whether there is a second trigger event triggering the cluster to perform data distribution includes:
detecting whether a second external memory exits or joins the cluster.
In some possible embodiments, the external memory is embodied as a hard disk.
In some possible embodiments, the method further comprises:
recording a first version identification of an external memory state recording file in the cluster when the delay timer is started for timing;
when the time length of the delay timer reaches the preset time length, acquiring a second version identifier of an external memory state recording file in the cluster;
the redistributing the data in the cluster when the timing duration of the delay timer reaches the preset duration includes:
and when the timing duration of the delay timer reaches the preset duration and the first version identification is inconsistent with the second version identification, redistributing the data in the cluster.
In some possible embodiments, the method further comprises:
and when the first version identification is consistent with the second version identification, restarting the delay timer for timing.
In some possible embodiments, the second version identification is larger than the first version identification, the method further comprising:
and after the data in the cluster are redistributed, clearing the version number of the external memory state recording file.
In some possible embodiments, the redistributing the data includes:
determining a third external memory added to the cluster within a target time period, wherein the target time period is a time period from a first time when the first trigger event is detected to a second time when the timing duration of the delay timer reaches the preset duration;
allocating a portion of the data in the cluster to the third external memory;
determining a fourth external memory that exits the cluster within the target time period;
restoring data stored on the fourth external memory to a fifth external memory in the cluster.
In a second aspect, an embodiment of the present application further provides a delay processing apparatus for data distribution, where the apparatus includes:
the starting unit is used for starting the delay timer to time when detecting that a first trigger event triggering the cluster to perform data distribution exists;
the detection unit is used for detecting whether a second trigger event triggering the cluster to perform data distribution exists or not before the timing duration of the delay timer reaches a preset duration;
the first restarting unit is used for restarting the delay timer to time if determining that a second triggering event for triggering the cluster to perform data distribution exists before the timing duration of the delay timer reaches a preset duration;
and the data distribution unit is used for redistributing the data in the cluster when the timing duration of the delay timer reaches the preset duration.
In some possible embodiments, the detecting that there is a first trigger event triggering the cluster to perform data distribution specifically includes detecting that a first external memory exits or joins the cluster;
the detecting unit is specifically configured to detect whether a second external memory exits or joins the cluster.
In some possible embodiments, the external memory is embodied as a hard disk.
In some possible embodiments, the apparatus further comprises:
the recording unit is used for recording a first version identifier which represents the state of an external memory in the cluster when the delay timer is started for timing;
the obtaining unit is used for obtaining a second version identifier representing the state of an external memory in the cluster when the time length of the delay timer reaches the preset time length;
the data distribution unit is specifically configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration and the first version identifier is inconsistent with the second version identifier.
In some possible embodiments, the apparatus further comprises:
and the second restarting unit is used for restarting the delay timer for timing when the first version identifier is consistent with the second version identifier.
In some possible embodiments, the second version identification is larger than the first version identification, and the apparatus further comprises:
and the zero clearing unit is used for clearing the version number of the external memory state recording file after the data in the cluster is redistributed.
In some possible embodiments, the data distribution unit includes:
a first determining subunit, configured to determine a third external memory to be added to the cluster within a target time period, where the target time period is a time period from a first time when the first trigger event is detected to a second time when it is determined that the timing duration of the delay timer reaches the preset duration;
an allocation subunit, configured to allocate a part of the data in the cluster to the third external storage;
a second determining subunit, configured to determine a fourth external memory exiting the cluster within the target time period;
a restoring subunit, configured to restore the data stored in the fourth external storage to a fifth external storage in the cluster.
In the implementation manner of the embodiments of the present application, the trigger events that trigger the cluster to perform data distribution within a short time are merged, so that the cluster performs data distribution only once. This reduces the computing resources the cluster consumes and avoids wasting them on pointless repeated migration of part of the data in the cluster. Specifically, when a first trigger event triggering the cluster to perform data distribution is detected, a delay timer is started; before the timing duration of the delay timer reaches a preset duration, it is detected whether a second trigger event triggering the cluster to perform data distribution occurs; if such a second trigger event is detected before the timing duration reaches the preset duration, the delay timer is restarted, and the data in the cluster is redistributed only when the timing duration of the delay timer finally reaches the preset duration. Because the trigger events occurring within a short time are merged, the cluster does not execute a separate data distribution for each of them but executes data distribution only once, so the data in the cluster is migrated between hard disks only once. This avoids the waste of cluster computing resources caused by repeated migration of the same data, and since one data distribution consumes fewer computing resources than several, the performance of the cluster is improved to a certain extent.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario in an embodiment of the present application;
fig. 2 is a schematic flow chart of a delay processing method for data distribution according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a delay processing apparatus for data distribution in an embodiment of the present application.
Detailed Description
For each cluster of a distributed storage system, if a hard disk permanently exits the cluster or a new hard disk joins the cluster, the data in the cluster usually needs to be redistributed. If several hard disks are pulled out of and/or inserted into the cluster within a short time, these removals and insertions do not happen at exactly the same moment; there is some interval between them. The cluster therefore performs data distribution several times within a short period: part of the data is migrated pointlessly back and forth between different hard disks across those repeated distributions, which wastes cluster computing resources, and the repeated distributions themselves consume additional computing resources and degrade the performance of the cluster.
To solve the foregoing technical problem, the embodiments of the present application provide a delay processing method for data distribution, in which the trigger events that trigger the cluster to perform data distribution within a short time are merged so that the cluster performs data distribution only once, thereby reducing the computing resources the cluster consumes and avoiding pointless migration of part of the data in the cluster. Specifically, when a first trigger event triggering the cluster to perform data distribution is detected, a delay timer is started; before the timing duration of the delay timer reaches a preset duration, it is detected whether a second trigger event triggering the cluster to perform data distribution occurs; if such a second trigger event is detected before the timing duration reaches the preset duration, the delay timer is restarted, and the data in the cluster is redistributed only when the timing duration of the delay timer finally reaches the preset duration.
It can be seen that, by merging the trigger events that trigger the cluster to perform data distribution within a short time, the cluster does not need to execute data distribution once per trigger event; it executes data distribution only once, so the data in the cluster is migrated between hard disks only once. This avoids the waste of cluster computing resources caused by repeated migration of the same data, and because one data distribution consumes fewer computing resources than several, the performance of the cluster is improved to a certain extent.
For example, the embodiments of the present application may be applied to the exemplary application scenario shown in fig. 1. In this scenario, a cluster includes a master node and a plurality of slave nodes. The master node continuously detects whether a first hard disk is added to or removed from any node in the cluster (the master node itself or a slave node); if so, it starts a delay timer. While the timer is running, if the master node detects that a second hard disk is added to or removed from a node in the cluster, it restarts the delay timer so that timing begins again. When the timing duration of the delay timer reaches the preset duration, the master node controls the redistribution of the data in the cluster. In this way, although several hard disks are added to or removed from the cluster within a short time, the master node only needs to trigger data distribution once, which reduces the computing resources the cluster consumes on data distribution within that period.
It is to be understood that the above scenario is only one example of a scenario provided in the embodiment of the present application, and the embodiment of the present application is not limited to this scenario.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, various non-limiting embodiments of the present application are described below with reference to the accompanying drawings. It is to be understood that the described embodiments are only some, not all, embodiments of the present application. All other embodiments derived by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating a delay processing method for data distribution in an embodiment of the present application, where the method specifically includes:
s201: and when detecting that a first trigger event triggering the cluster to perform data distribution exists, starting a delay timer to time.
In practical application, if an external memory in the cluster, such as a hard disk or a floppy disk, exits the cluster, or a user adds a new external memory to the cluster, the cluster is triggered to perform data distribution. If the master node in the cluster detects such a trigger event, for example an external memory joining or exiting the cluster, the master node may start a delay timer.
S202: and before the timing duration of the delay timer reaches a preset duration, detecting whether a second trigger event triggering the cluster to perform data distribution exists.
It should be noted that, in this embodiment, in order to merge all trigger events occurring in the cluster within a period of time, the master node does not immediately redistribute the data in the cluster when it detects the first trigger event; instead, it starts the delay timer. Before the timing duration of the delay timer reaches the preset duration, the master node continues to detect whether a second trigger event triggering the cluster to perform data distribution occurs in the cluster.
Similar to the first trigger event, the second trigger event may also be any event that triggers the cluster to perform data distribution, for example the master node detecting that an external memory joins or exits the cluster.
In most practical scenarios, the external memory that joins or exits the cluster may be a hard disk.
S203: and if determining that a second trigger event for triggering the cluster to perform data distribution exists before the timing duration of the delay timer reaches the preset duration, restarting the delay timer to perform timing.
S204: and when the timing duration of the delay timer reaches the preset duration, redistributing the data in the cluster.
It should be noted that, if the master node detects that a second trigger event occurs in the cluster before the timing duration of the delay timer reaches the preset duration, it does not immediately redistribute the data in the cluster; instead, it restarts the delay timer so that timing begins again. Thus, although the first trigger event and the second trigger event occur within a short time, neither of them immediately triggers the cluster to perform data distribution; the data is redistributed only after the timing duration of the delay timer reaches the preset duration. In effect, the master node merges the first trigger event and the second trigger event, so that a cluster that would otherwise have executed data distribution twice needs to execute it only once. This reduces the number of data distributions the cluster executes within a short time, and therefore the computing resources it must consume, and avoids the waste caused by unnecessary migration of part of the data in the cluster among several hard disks.
It is understood that this embodiment is only illustrated with two trigger events occurring in the cluster within a short time. In practical applications, after detecting the second trigger event and restarting the delay timer, if the master node detects a third trigger event triggering the cluster to perform data distribution before the restarted timer reaches the preset duration, it restarts the delay timer again based on the third trigger event. This process repeats until a full timing period of the delay timer elapses without the master node detecting any further trigger event, at which point the data in the cluster is redistributed.
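To make the timer-restart behaviour concrete, the following is a minimal Python sketch of the merge-and-delay flow. It is an illustration only, not the patented implementation; the names DistributionDelayer and on_trigger_event and the 30-second preset duration are assumptions, and the redistribute callback stands in for whatever routine actually redistributes the cluster data.

    import threading

    PRESET_DURATION = 30.0  # assumed delay window in seconds; the application leaves the value open


    class DistributionDelayer:
        """Merges trigger events: every new event restarts the timer, and the
        cluster data is redistributed only once, after a quiet period."""

        def __init__(self, redistribute, preset=PRESET_DURATION):
            self._redistribute = redistribute  # callback that redistributes the cluster data
            self._preset = preset
            self._timer = None
            self._lock = threading.Lock()

        def on_trigger_event(self):
            """Called whenever a trigger event (a disk joining or exiting) is detected."""
            with self._lock:
                if self._timer is not None:
                    self._timer.cancel()  # a new event arrived before expiry: restart timing
                self._timer = threading.Timer(self._preset, self._on_expire)
                self._timer.start()

        def _on_expire(self):
            with self._lock:
                self._timer = None
            self._redistribute()  # executed once for the whole burst of trigger events

Under these assumptions, calling on_trigger_event() once for each detected disk change leads to a single invocation of the redistribute callback once no new event has arrived for a whole preset duration.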
As a specific example of redistributing the data in the cluster, the master node may first determine the third external memory (there may be one or more) that joined the cluster within a target time period, where the target time period runs from a first time, at which the master node detected the first trigger event, to a second time, at which the timing duration of the delay timer reached the preset duration. The master node then allocates part of the data in the cluster to the newly added third external memory according to a preset data distribution policy, so that the data is evenly distributed over all hard disks in the cluster. Next, the master node determines the fourth external memory (again, one or more) that exited the cluster within the target time period. Finally, the master node starts the corresponding data recovery process to restore the data that was stored on the fourth external memory onto a fifth external memory in the cluster, so that the data on the fourth external memory is not lost after it exits. In this way the data in the cluster ends up evenly distributed over the external memories in the cluster, and the redistribution is complete.
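A sketch of the redistribution step itself might look as follows (Python; every helper on the cluster object, such as disks_joined_between, balance_onto and recover, is a hypothetical name used for illustration and not an API defined by this application).

    def redistribute(cluster, first_event_time, timer_expire_time):
        """Redistribute cluster data once, covering every disk change in the
        target time period [first_event_time, timer_expire_time]."""
        # Disks that joined the cluster during the target period ("third external memory").
        joined = cluster.disks_joined_between(first_event_time, timer_expire_time)
        if joined:
            # Balance part of the existing data onto the new disks so that data
            # ends up evenly spread across all disks in the cluster.
            cluster.balance_onto(joined)

        # Disks that exited the cluster during the target period ("fourth external memory").
        exited = cluster.disks_exited_between(first_event_time, timer_expire_time)
        for disk in exited:
            # Recover the data that was stored on the exited disk onto a surviving
            # disk ("fifth external memory") so that no data is lost.
            survivor = cluster.pick_recovery_target(exclude=exited)
            cluster.recover(source=disk, target=survivor)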
In practical application, when an external memory joins or exits the cluster, the record file describing the state of the external memories in the cluster changes accordingly. For example, when an external memory is added to the cluster, the original state record file is updated to a record file that also contains the state of the newly added external memory. In one example, a version identifier, such as a version number, may therefore be attached to the external memory state record file. When a trigger event occurs, the delay timer is started and the record file updated for that event is generated, so the first version identifier of the updated record file can be recorded. When the timing duration of the delay timer reaches the preset duration, the second version identifier of the current, latest state record file in the cluster is obtained. If the state change of the hard disk has been persisted, the second version identifier will differ from the first version identifier. Therefore, to prevent an unpersisted state change of the external memory from affecting the redistribution of the data in the cluster, the version identifiers of the state record file can be compared to determine whether the state change has been persisted. If the state change of the external memory on the master node has not yet been persisted, for example because the master node is busy, redistribution of the data in the cluster can be refused even though the timing duration of the delay timer has reached the preset duration. Accordingly, in some embodiments, the master node allows the data in the cluster to be redistributed only when the timing duration of the delay timer reaches the preset duration and the first version identifier of the record file is inconsistent with the second version identifier.
Further, when the master node determines that the first version identifier of the record file is consistent with the second version identifier, the state change of the external memory has not yet been persisted, so the master node may restart the delay timer and wait for the state change to be persisted.
In practical applications, the version identifier of the record file can be represented by a number; for example, the first version identifier may be "version 1" and the second version identifier "version 2". The number corresponding to the version identifier of the current, latest record file is then larger than the previously recorded one, so when deciding whether to allow redistribution of the data in the cluster, the master node may check whether the second version identifier of the record file is larger than the first version identifier and, if so, allow the redistribution. Further, to prevent this round of delay processing from affecting the next one, the version identifier of the record file may be cleared after the delay processing is completed.
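Under the assumption that the version identifier is a plain integer that is cleared to zero after each round, the persistence check described above could be sketched as follows (Python; read_state_file_version and the other cluster methods are hypothetical names used for illustration).

    def on_timer_expired(cluster, first_version):
        """Decide whether redistribution may run when the delay timer fires."""
        # Current version of the external memory state record file.
        second_version = cluster.read_state_file_version()

        if second_version == first_version:
            # The state change has not been persisted yet (e.g. the master node is
            # busy): refuse to redistribute and restart the delay timer instead.
            cluster.restart_delay_timer()
            return

        if second_version > first_version:
            cluster.redistribute_data()
            # Clear the version so it cannot interfere with the next round of
            # delayed data distribution.
            cluster.clear_state_file_version()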
In some possible application scenarios, the master node of the cluster may be switched, that is, the role of master node may move from the current node to another node in the cluster. If the master node has to be switched while the delayed data distribution is being processed, the version identifier of the record file can be used to instruct the new master node to continue the delay processing, so that it is not lost. Specifically, the new master node checks whether the version identifier of the record file is zero. If it is zero, no delayed data distribution is currently pending. If it is not zero, a delayed data distribution is in progress, and the new master node then looks for an event waiting for delay processing. If no such event exists, it creates one, starts the delay timer, and executes the delay processing flow; if such an event already exists, the delay processing of the data distribution in the cluster simply continues on the basis of the existing event, and no additional handling is needed.
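The handover described above could be sketched as follows (Python; on_become_master, find_pending_delay_event and the other names are hypothetical), with the new master node using the recorded version identifier to decide whether a delayed distribution is still pending.

    def on_become_master(cluster):
        """Run by the newly elected master node after a master switchover."""
        version = cluster.read_state_file_version()
        if version == 0:
            return  # no delayed data distribution is currently pending

        # A delayed distribution was in progress on the previous master.
        event = cluster.find_pending_delay_event()
        if event is None:
            # No pending event survived the switchover: create one and start the
            # delay timer so the delay processing flow resumes from scratch.
            cluster.create_delay_event()
            cluster.start_delay_timer()
        # Otherwise the existing pending event carries the flow forward and no
        # extra handling is needed.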
In this embodiment, the trigger events that trigger the cluster to perform data distribution within a short time are merged, so that the cluster performs data distribution only once, which reduces the computing resources the cluster consumes and avoids wasting them on pointless migration of part of the data in the cluster. Specifically, when a first trigger event triggering the cluster to perform data distribution is detected, a delay timer is started; before the timing duration of the delay timer reaches a preset duration, it is detected whether a second trigger event triggering the cluster to perform data distribution occurs; if such a second trigger event is detected before the timing duration reaches the preset duration, the delay timer is restarted, and the data in the cluster is redistributed only when the timing duration of the delay timer finally reaches the preset duration. It can be seen that, by merging the trigger events occurring within a short time, the cluster does not need to execute a data distribution for each trigger event but executes data distribution only once, so the data in the cluster is migrated between hard disks only once. This avoids the waste of cluster computing resources caused by repeated migration of the same data, and since one data distribution consumes fewer computing resources than several, the performance of the cluster is improved to a certain extent.
In addition, the embodiment of the application also provides a delay processing device for data distribution. Referring to fig. 3, fig. 3 is a schematic structural diagram illustrating a delay processing apparatus for data distribution according to an embodiment of the present application, where the apparatus 300 includes:
a starting unit 301, configured to start a delay timer to time when detecting that a first trigger event triggering a cluster to perform data distribution exists;
a detecting unit 302, configured to detect whether a second trigger event triggering the cluster to perform data distribution exists before a timing duration of the delay timer reaches a preset duration;
a first restarting unit 303, configured to restart the delay timer for timing if it is determined that a second trigger event that triggers the cluster to perform data distribution exists before the timing duration of the delay timer reaches a preset duration;
a data distribution unit 304, configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration.
In some possible embodiments, the detecting that there is a first trigger event triggering the cluster to perform data distribution specifically includes detecting that a first external memory exits or joins the cluster;
the detecting unit 302 is specifically configured to detect whether there is a second external memory exiting from or joining the cluster.
In some possible embodiments, the external memory is embodied as a hard disk.
In some possible embodiments, the apparatus 300 further comprises:
the recording unit is used for recording a first version identifier which represents the state of an external memory in the cluster when the delay timer is started for timing;
the obtaining unit is used for obtaining a second version identifier representing the state of an external memory in the cluster when the time length of the delay timer reaches the preset time length;
the data distribution unit is specifically configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration and the first version identifier is inconsistent with the second version identifier.
In some possible embodiments, the apparatus 300 further comprises:
and the second restarting unit is used for restarting the delay timer for timing when the first version identifier is consistent with the second version identifier.
In some possible embodiments, the second version identification is larger than the first version identification, and the apparatus further comprises:
and the zero clearing unit is used for clearing the version number of the external memory state recording file after the data in the cluster is redistributed.
In some possible embodiments, the data distribution unit 304 includes:
a first determining subunit, configured to determine a third external memory to be added to the cluster within a target time period, where the target time period is a time period from a first time when the first trigger event is detected to a second time when it is determined that the timing duration of the delay timer reaches the preset duration;
an allocation subunit, configured to allocate a part of the data in the cluster to the third external storage;
a second determining subunit, configured to determine a fourth external memory exiting the cluster within the target time period;
a restoring subunit, configured to restore the data stored in the fourth external storage to a fifth external storage in the cluster.
In this embodiment, by merging the trigger events that trigger the cluster to perform data distribution within a short time, the cluster does not need to execute data distribution multiple times based on multiple trigger events; it executes data distribution only once, so the data in the cluster is migrated between hard disks only once. This avoids the waste of cluster computing resources caused by repeated migration of the same data, and because one data distribution consumes fewer computing resources than several, the performance of the cluster is improved to a certain extent.
The term "first" in names such as "first external memory", "first trigger event", "first version identifier" and "first restarting unit" in the embodiments of the present application is used only as an identifier and does not indicate an order. The same applies to "second", "third", "fourth" and "fifth".
As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that all or part of the steps in the methods of the above embodiments can be implemented by software plus a general hardware platform. Based on this understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a storage medium such as a read-only memory (ROM)/RAM, a magnetic disk, or an optical disk, and which includes several instructions for enabling a computer device (a personal computer, a server, or a network communication device such as a router) to execute the methods described in the embodiments or in parts of the embodiments of the present application.
The embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus embodiment is described relatively briefly because it is substantially similar to the method embodiment, and reference may be made to the description of the method embodiment for the relevant points. The apparatus embodiments described above are merely illustrative: the modules described as separate parts may or may not be physically separate, and the parts shown as modules may or may not be physical modules; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The above description is only an exemplary embodiment of the present application, and is not intended to limit the scope of the present application.

Claims (8)

1. A method for processing delay of data distribution, the method comprising:
when detecting that a first trigger event triggering a cluster to perform data distribution exists, starting a delay timer to time;
before the timing duration of the delay timer reaches a preset duration, detecting whether a second trigger event triggering the cluster to perform data distribution exists or not;
if it is determined that a second trigger event triggering the cluster to perform data distribution exists before the timing duration of the delay timer reaches a preset duration, restarting the delay timer to perform timing;
when the timing duration of the delay timer reaches the preset duration, redistributing the data in the cluster;
the method further comprises the following steps:
recording a first version identification of an external memory state recording file in the cluster when the delay timer is started for timing;
when the timing duration of the delay timer reaches the preset duration, acquiring a second version identifier of an external memory state recording file in the cluster;
the redistributing the data in the cluster when the timing duration of the delay timer reaches the preset duration includes:
and when the timing duration of the delay timer reaches the preset duration and the first version identification is inconsistent with the second version identification, redistributing the data in the cluster.
2. The method of claim 1, wherein the detecting that there is a first trigger event for data distribution comprises:
detecting that there is a first external memory to exit or join the cluster;
the detecting whether there is a second trigger event triggering the cluster to perform data distribution includes:
detecting whether a second external memory exits or joins the cluster.
3. Method according to claim 2, characterized in that the external memory is embodied as a hard disk.
4. The method of claim 1, further comprising:
and when the first version identification is consistent with the second version identification, restarting the delay timer for timing.
5. The method of claim 1 or 4, wherein the second version identification is larger than the first version identification, the method further comprising:
and after the data in the cluster are redistributed, clearing the version number of the external memory state recording file.
6. The method of claim 1, wherein the redistributing data in the cluster comprises:
determining a third external memory added to the cluster within a target time period, wherein the target time period is a time period from a first time when the first trigger event is detected to a second time when the timing duration of the delay timer reaches the preset duration;
allocating a portion of the data in the cluster to the third external memory;
determining a fourth external memory that exits the cluster within the target time period;
restoring data stored on the fourth external memory to a fifth external memory in the cluster.
7. A data-distributed latency processing apparatus, the apparatus comprising:
the starting unit is used for starting the delay timer to time when detecting that a first trigger event triggering the cluster to perform data distribution exists;
the detection unit is used for detecting whether a second trigger event triggering the cluster to perform data distribution exists or not before the timing duration of the delay timer reaches a preset duration;
the first restarting unit is used for restarting the delay timer to time if determining that a second triggering event for triggering the cluster to perform data distribution exists before the timing duration of the delay timer reaches a preset duration;
the data distribution unit is used for redistributing the data in the cluster when the timing duration of the delay timer reaches the preset duration;
the device further comprises:
the recording unit is used for recording a first version identifier which represents the state of an external memory in the cluster when the delay timer is started for timing;
the obtaining unit is used for obtaining a second version identifier representing the state of an external memory in the cluster when the timing duration of the delay timer reaches the preset duration;
the data distribution unit is specifically configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration and the first version identifier is inconsistent with the second version identifier.
8. The apparatus of claim 7, further comprising:
and the second restarting unit is used for restarting the delay timer for timing when the first version identifier is consistent with the second version identifier.
CN201811232307.9A 2018-10-22 2018-10-22 Delay processing method and device for data distribution Active CN109521958B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811232307.9A CN109521958B (en) 2018-10-22 2018-10-22 Delay processing method and device for data distribution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811232307.9A CN109521958B (en) 2018-10-22 2018-10-22 Delay processing method and device for data distribution

Publications (2)

Publication Number Publication Date
CN109521958A (en) 2019-03-26
CN109521958B (en) 2022-02-18

Family

ID=65772996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811232307.9A Active CN109521958B (en) 2018-10-22 2018-10-22 Delay processing method and device for data distribution

Country Status (1)

Country Link
CN (1) CN109521958B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104035728A (en) * 2014-03-31 2014-09-10 深圳英飞拓科技股份有限公司 Hard disk hot plug handling method, device and node
CN104461389A (en) * 2014-12-03 2015-03-25 上海新储集成电路有限公司 Automatically learning method for data migration in mixing memory
CN107395721A (en) * 2017-07-20 2017-11-24 郑州云海信息技术有限公司 A kind of method and system of metadata cluster dilatation
CN107422977A (en) * 2017-07-31 2017-12-01 北京小米移动软件有限公司 Trigger action processing method, device and computer-readable recording medium
CN107562382A (en) * 2017-08-30 2018-01-09 郑州云海信息技术有限公司 A kind of disk automatic dynamic expansion method and system based on timed task

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10387415B2 (en) * 2016-06-28 2019-08-20 International Business Machines Corporation Data arrangement management in a distributed data cluster environment of a shared pool of configurable computing resources


Also Published As

Publication number Publication date
CN109521958A (en) 2019-03-26

Similar Documents

Publication Publication Date Title
US9960963B2 (en) Dynamic client fail-over during a rolling patch installation based on temporal server conditions
US20160036924A1 (en) Providing Higher Workload Resiliency in Clustered Systems Based on Health Heuristics
CN107656705B (en) Computer storage medium and data migration method, device and system
CN109445927B (en) Task management method and device for storage cluster
CN110825495A (en) Container cloud platform recovery method, device, equipment and readable storage medium
CN109173270B (en) Game service system and implementation method
CN105786539B (en) File downloading method and device
CN110858168B (en) Cluster node fault processing method and device and cluster node
CN111541762A (en) Data processing method, management server, device and storage medium
CN111342986B (en) Distributed node management method and device, distributed system and storage medium
CN114064217A (en) Node virtual machine migration method and device based on OpenStack
CN109521958B (en) Delay processing method and device for data distribution
WO2017080362A1 (en) Data managing method and device
CN109189487B (en) Restarting method, system and related components of Ceph distributed storage system
CN111221468B (en) Storage block data deleting method and device, electronic equipment and cloud storage system
US10789129B1 (en) Rolling restoration of enterprise business services following service disruption
CN111158956A (en) Data backup method and related device for cluster system
CN108121514B (en) Meta information updating method and device, computing equipment and computer storage medium
CN110908821B (en) Method, device, equipment and storage medium for task failure management
CN114116317A (en) Data processing method, device, equipment and medium
CN113542398A (en) Control method, device, medium and equipment of distributed cluster system
CN113553217A (en) Data recovery method and device, storage medium and computer equipment
JP2018538632A (en) Method and device for processing data after node restart
CN111405313A (en) Method and system for storing streaming media data
JP2017037539A (en) Server control program, server control method, and server control device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant