WO2017167070A1

WO2017167070A1 - Method for copying clustered data, and method and device for determining priority

Info

Publication number: WO2017167070A1
Application number: PCT/CN2017/077500
Authority: WO
Inventors: 史英杰; 何乐; 黄俨; 张�杰; 张辰
Original assignee: 阿里巴巴集团控股有限公司; 史英杰; 何乐; 黄俨; 张�杰; 张辰
Priority date: 2016-03-30
Filing date: 2017-03-21
Publication date: 2017-10-05
Also published as: CN107291724A; TW201737108A

Abstract

A method and device for copying clustered data. The method for copying clustered data comprises: determining at least one copying task requiring inter-cluster data copying (201); calculating a priority for each of the at least one copying task (202); and implementing the copying task according to the priority of the copying task (203). The method and device provided in the invention can achieve reasonable scheduling of data to be copied under a limited inter-domain bandwidth, thereby achieving rapid copying of the data.

Description

Cluster data replication method, priority determination method and device

Technical field

The present application relates to communication technologies, and in particular, to a cluster data replication method, a priority determination method, and an apparatus.

Background art

Clustering greatly increases the storage limits and processing limits of a single unit. However, with the continuous development of the Internet, especially the mobile Internet, the data generated by many companies has reached the PB or even EB level, and the amount of new data added every day is also growing rapidly. When the data exceeds the storage of a single cluster, or a single cluster cannot meet the needs of data processing, it needs to be split and stored in multiple clusters according to business units.

There are dependencies between business units. Suppose the data of service A is stored in a first cluster. If service B in the first cluster needs to access the data of service A, the data can be read directly from the first cluster. If service C, located in a second cluster, needs to access the data of service A, the data must be read across clusters. If multiple services in the second cluster need to access the data of service A, the same data is read across clusters multiple times, which wastes inter-cluster bandwidth. In particular, as the number of services increases, more services read the same data across clusters, wasting even more bandwidth.

To save bandwidth, the common approach in the industry is to keep a copy of the data of service A in the other clusters, so that services in those clusters can read the data locally when they need it. This requires data to be replicated between clusters. The prior art generally adopts offline replication: the database operations of each cluster are stopped, and all data is copied from one cluster to another in a single pass, which requires a large amount of bandwidth. However, cross-domain bandwidth is limited and valuable, and as the number of services grows, the amount of data to be replicated gradually increases. Under limited cross-domain bandwidth, offline replication clearly cannot meet the demand, so a new data replication method is urgently needed.

Summary of the invention

The present invention provides a cluster data replication method and device for reasonably scheduling data to be replicated under the condition of limited cross-domain bandwidth, thereby realizing rapid data replication.

To achieve the above objective, the embodiment of the present application adopts the following technical solutions:

In a first aspect, a cluster data replication method is provided, including:

determining at least one replication task that needs to replicate data across clusters;

calculating a priority of each of the at least one replication task; and

executing each replication task according to its priority.

In a second aspect, a cluster data replication apparatus is provided, including:

a determining module, configured to determine at least one replication task that needs to replicate data across clusters;

a calculation module, configured to calculate a priority of each of the at least one replication task; and

an execution module, configured to execute each replication task according to its priority.

In a third aspect, a method for determining a priority is provided, including:

acquiring at least one of the following factors: a triggering manner of a replication task that needs to replicate data across clusters, a generation time of the data to be replicated by the replication task, and an importance of the source service corresponding to the replication task, where the source service corresponding to the replication task is the service that generates the data to be replicated by the replication task; and

calculating the priority of the replication task according to the at least one factor.

In a fourth aspect, a priority determining apparatus is provided, including:

an information obtaining module, configured to acquire at least one of the following factors: a triggering manner of a replication task that needs to replicate data across clusters, a generation time of the data to be replicated by the replication task, and an importance of the source service corresponding to the replication task, where the source service corresponding to the replication task is the service that generates the data to be replicated by the replication task; and

a priority calculation module, configured to calculate a priority of the replication task according to the at least one factor.

In the present application, after the replication tasks that need to replicate data across clusters are determined, the priority of each replication task is calculated, and each replication task is then executed according to its priority. Thus, under limited cross-domain bandwidth, replication tasks are scheduled by priority, with higher-priority tasks scheduled first. Scheduling replication tasks in this reasonable manner helps achieve rapid data replication.

The above description is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly, and so that the above and other objects, features, and advantages of the present application are more apparent, specific embodiments of the present application are set out below.

Brief description of the drawings

Various other advantages and benefits will become apparent to those skilled in the art from reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not intended to be limiting. Throughout the drawings, the same reference numerals refer to the same parts. In the drawings:

FIG. 1 is a schematic structural diagram of a cluster system according to an embodiment of the present application;

FIG. 2 is a schematic flowchart of a cluster data replication method according to another embodiment of the present application;

FIG. 3 is a schematic structural diagram of a replication system according to another embodiment of the present disclosure;

FIG. 4 is a schematic flowchart of packaging replication tasks according to another embodiment of the present disclosure;

FIG. 5 is a schematic structural diagram of a cluster data replication apparatus according to another embodiment of the present disclosure;

FIG. 6 is a schematic flowchart of a priority determining method according to another embodiment of the present disclosure;

FIG. 7 is a schematic structural diagram of a priority determining apparatus according to another embodiment of the present disclosure.

Detailed description

Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.

FIG. 1 is a schematic structural diagram of a cluster system according to Embodiment 1 of the present application. For illustration, FIG. 1 shows only three clusters, namely cluster A, cluster B, and cluster C. In practice, the cluster system can contain any number of clusters.

The process of cross-cluster data replication is illustrated in conjunction with the cluster system shown in Figure 1.

Assume that service a exists in cluster A, service b exists in cluster B, and service c exists in cluster C. Service a generates data, and services b and c need that data, so the data generated by service a needs to be copied from cluster A to clusters B and C. Likewise, if cluster A also includes a service d whose data services b and c also need, the data generated by service d needs to be copied from cluster A to clusters B and C. Conversely, if service a needs the data generated by services b and c, that data needs to be copied from clusters B and C, respectively, to cluster A.

Because inter-cluster bandwidth is limited and the number of services keeps increasing, the amount of data to be replicated gradually increases. Offline replication copies all data from one cluster to another in a single pass, which requires a large amount of bandwidth; with limited inter-cluster bandwidth, that method is no longer applicable, and the data must instead be replicated in parts. This raises the problem of scheduling the data to be replicated: if that data can be scheduled reasonably, replication efficiency improves.

In view of the above problems, the present application provides a solution whose main idea is as follows: after the replication tasks that need to replicate data across clusters are determined, the priority of each replication task is calculated, and the replication tasks are then executed according to their priorities. Under limited cross-domain bandwidth, the present application schedules replication tasks by priority, with higher-priority tasks scheduled first, so that replication tasks are scheduled reasonably, which helps achieve rapid data replication.

The technical solution of the present application will be described in detail below with reference to the accompanying drawings.

FIG. 2 is a schematic flowchart of a cluster data replication method according to another embodiment of the present disclosure. As shown in FIG. 2, the method includes:

201. Determine at least one replication task that needs to replicate data across the cluster.

202. Calculate a priority of each replication task in at least one replication task.

203. Perform each replication task according to the priority of each replication task.

The embodiment provides a cluster data replication method, which can be executed by a cluster data replication device (hereinafter referred to as a replication device) to properly schedule replication tasks, implement data replication across the cluster, and improve data replication efficiency.

In a specific implementation, the replication device may be deployed within the cluster system, for example in one of its clusters, such as a cluster that generates data (referred to as a source cluster) or a cluster to which data needs to be replicated (referred to as a destination cluster); or it may be a device that is independent of the cluster system but capable of communicating with it.

Specifically, the replication device first needs to determine at least one replication task that needs to replicate data across the cluster. A replication task is a task that requires data to be replicated from one cluster to another.

The replication device may determine replication tasks in, but not limited to, the following ways:

The first way is to poll the data version management server periodically; when the version of some data is found to have changed, it is determined that a cross-cluster replication task is needed for the data whose version changed. The first way may also be called the scan trigger mode.

In the first way, the version information of all data in the entire cluster system is managed by the data version management server. For example, the replication device may start a thread that periodically polls the data version management server; when that thread finds that the version of some data has changed, it determines that a cross-cluster replication task is needed for that data.
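The scan trigger mode can be illustrated with a minimal Python sketch of a single polling pass. It is only an illustration under stated assumptions: the names `ReplicationTask` and `scan_once` are hypothetical, the version management server is represented by a plain mapping from data identifier to version, and data that has never been seen before is treated as changed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationTask:
    """Hypothetical record describing one cross-cluster replication task."""
    data_id: str
    version: int
    trigger: str  # "scan", "event", or "demand"


def scan_once(current_versions, last_seen):
    """Run one polling pass of the scan trigger mode.

    current_versions: mapping data_id -> version, standing in for the reply
        of the data version management server (assumed interface).
    last_seen: mapping data_id -> version from the previous pass; it is
        updated in place.
    Returns one scan-triggered task per data item whose version changed.
    """
    tasks = []
    for data_id, version in current_versions.items():
        # Assumption: unseen data counts as a version change.
        if last_seen.get(data_id) != version:
            tasks.append(ReplicationTask(data_id, version, "scan"))
            last_seen[data_id] = version
    return tasks
```

A polling thread would call `scan_once` at a fixed interval and hand the returned tasks to the priority calculation.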

The second way is: obtaining a replication task that the control server issues according to a received replication task notification message, where the replication task notification message is actively reported to the control server by a first service when the first service generates a new version of data. The second way may also be called the event trigger mode.

In the second way, a control server needs to be deployed. A system architecture for implementing the method of this embodiment on that basis is shown in FIG. 3. The system includes a control server 31 and at least one controlled server 32. The replication device is implemented in the controlled server 32, but is not limited thereto.

In terms of deployment, the system architecture can run on a cluster in the cluster system, for example a source cluster or a destination cluster, or it can run outside the cluster system.

Specifically, the control server 31 is responsible for distributing and managing replication tasks in the cluster and for monitoring the state of the controlled servers 32 and of the replication devices running on them. The control server 31 is also responsible for communicating with the outside, for example receiving replication task notification messages and handling system status queries. The control server 31 and the controlled server 32 communicate by heartbeat messages. The controlled server 32 reports its running status to the control server 31, including CPU usage, memory usage, and the operating state of the replication device. The control server 31 sends to the controlled server 32 the information about replication devices that need to be started or stopped. For example, operation and maintenance personnel send a message to the control server 31 through a management interface to start or stop a replication device. When the control server 31 receives such a message, it randomly selects a controlled server 32 and adds the start or stop message to the heartbeat message for that controlled server 32. After receiving the heartbeat message, the controlled server 32 starts or stops the corresponding replication device, thereby starting or ending execution of the replication task.

The control server 31 may adopt a message subscription mode: a service that generates data in the cluster (referred to as the first service) actively sends a replication task notification message to the control server 31 when it generates a new version of data, to notify the control server 31 that there is a replication task. Generally, the replication task notification message carries information about the data that needs to be replicated across clusters, information about the source cluster where the data is located, information about the destination cluster to which the data needs to be copied, and the like. The control server 31 issues a replication task to the replication device according to the replication task notification message.

The third way is: receiving a replication task that the control server issues upon receiving a replication task notification message, where the replication task notification message is reported to the control server by a service that needs the data (referred to as a second service) when it determines that the version of the data in the cluster where the second service is located is inconsistent with the version in the source cluster where the data resides. The third way may also be called the on-demand trigger mode.

In the third way, a control server also needs to be deployed. This way can likewise be implemented with the system architecture shown in FIG. 3; see the description of that system above.

Specifically, when a user of the second service wants to use, in the cluster where the second service is located (for example, cluster C), data generated by a service in another cluster (for example, cluster A), the second service determines whether the version of the required data in its own cluster (cluster C) is consistent with the version in the source cluster (cluster A) where the data is located. For example, the second service can query the data version management server to obtain the version in the source cluster and compare it with the version in its own cluster. If the versions are inconsistent, the second service sends a replication task notification message to the control server 31, and the control server 31 issues a replication task to the replication device according to that notification message.
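The version check behind the on-demand trigger mode can be sketched as follows. This is an illustrative sketch only: the function name `check_on_demand`, the message fields, and the `notify` callback that stands in for sending a replication task notification message to the control server are all assumptions, not details given in the source.

```python
def check_on_demand(data_id, local_version, source_version,
                    source_cluster, destination_cluster, notify):
    """Report a replication need when the local copy is out of date.

    local_version: version held in the cluster of the second service.
    source_version: version obtained from the data version management
        server for the source cluster.
    notify: callable standing in for the message sent to the control
        server (hypothetical interface).
    Returns True if a notification was sent, False otherwise.
    """
    if local_version == source_version:
        return False  # Versions are consistent, nothing to replicate.
    notify({
        "data_id": data_id,
        "source_cluster": source_cluster,
        "destination_cluster": destination_cluster,
        "source_version": source_version,
        "trigger": "demand",
    })
    return True
```

The message fields mirror the kinds of information the description says a notification generally carries: the data, its source cluster, and the destination cluster.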

The above three methods can be used alone or in combination.

In any of the ways described above, the replication device can determine the replication tasks that need to replicate data across the cluster. As the number of services continues to increase, the number of replication tasks that need to replicate data across clusters is increasing. However, the bandwidth between clusters is limited. This requires proper scheduling of replication tasks to complete replication tasks more efficiently and quickly. To solve this problem, the copying device calculates the priority of each copy task, and performs scheduling execution on each copy task according to the priority of each copy task. This ensures that high-priority replication tasks are processed first, and that replication tasks can be scheduled reasonably, which facilitates fast data replication.

In an optional implementation, the foregoing step 202, that is, calculating the priority of each replication task, includes:

The priority of each replication task is calculated according to at least one of the triggering manner of each replication task, the generation time of the data to be copied by each replication task, and the importance of the source service corresponding to each replication task.

The triggering manner of a replication task refers to which of the foregoing three ways was used to determine it, namely the scan trigger mode, the event trigger mode, or the on-demand trigger mode. Replication tasks with different triggering manners have different priorities. For example, replication tasks triggered in the on-demand trigger mode have the highest priority, those triggered in the event trigger mode have the second highest, and those triggered in the scan trigger mode have the lowest, but the ordering is not limited thereto.

The source service corresponding to a replication task is the service that generates the data to be replicated by that task. For example, if service a in cluster A generates data that needs to be copied from cluster A to clusters B and C, the data to be replicated is the data generated by service a, and service a is the source service corresponding to the replication task. In general, different services differ in importance. The more important a service is, the more important the data it generates, and therefore the higher the priority of the replication task that replicates that data.

The generation time of the data to be replicated indicates how fresh that data is. Generally, the later the data was generated, the fresher it is, and the higher the priority of the replication task that replicates it.

In a specific embodiment, one of the above three factors may be used to determine the priority of each replication task. For example, the priority of each replication task is determined only according to the trigger mode of each replication task. For another example, the priority of each replication task is determined only according to the importance of the source service corresponding to each replication task. For another example, the priority of each copy task is determined based only on the generation time of the data to be copied for each copy task.

In another embodiment, the priority of each replication task may be determined by combining any two of the above three factors. Specifically, a priority value can be calculated for each factor, and the weighted average of the two priority values is then taken as the final priority of the replication task. Different weights can be configured in advance for the different factors.

For example, the importance of the source service and the triggering manner may be combined: the priority of the replication task is determined to be P1 according to the importance of its source service and P2 according to its triggering manner. With weights w1 and w2 configured in advance for the importance of the source service and the triggering manner, respectively, the final priority of the replication task is w1*P1+w2*P2.

Likewise, the triggering manner may be combined with the generation time of the data to be replicated, or the importance of the source service may be combined with the generation time of the data to be replicated. The calculation is the same as above and is not repeated here.

In another embodiment, the priority of each replication task may be calculated by combining the triggering manner of each replication task, the generation time of the data to be copied by each replication task, and the importance of the source service corresponding to each replication task.

Specifically, the replication device may determine a first priority value of each replication task according to its triggering manner, a second priority value according to the importance of its source service, and a third priority value according to the generation time of the data it is to replicate, and then generate the priority of each replication task from its first, second, and third priority values.

Optionally, the replication device may preset weights, for example w1, w2, and w3, for the triggering manner of the replication task, the importance of its source service, and the generation time of the data to be replicated, respectively. On this basis, the replication device may take a weighted average of the first, second, and third priority values of each replication task, using the weight of the corresponding factor, to generate the priority of that task. For example, the priority of the replication task = w1 * the first priority value + w2 * the second priority value + w3 * the third priority value.

Optionally, the replication device may instead concatenate the first priority value, the second priority value, and the third priority value of each replication task, from the highest digit to the lowest, to form the priority of that replication task.
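The weighted combination of per-factor priority values, for two factors or for all three, can be sketched in Python as follows. The function name `combine_priority` is hypothetical, and the weights in the usage note are arbitrary illustrative numbers; the source does not specify any weight values.

```python
def combine_priority(values, weights):
    """Combine per-factor priority values into one priority.

    values: priority values, one per factor (for example P1 and P2, or the
        first, second, and third priority values).
    weights: the preconfigured weight of each factor, in the same order.
    Returns w1*P1 + w2*P2 (+ w3*P3 ...). If the weights sum to 1, the
    result is a weighted average of the values.
    """
    if len(values) != len(weights):
        raise ValueError("each priority value needs exactly one weight")
    return sum(w * p for w, p in zip(weights, values))
```

For example, `combine_priority([2, 1], [0.5, 0.5])` yields 1.5 for a two-factor combination, and passing three values with three weights covers the three-factor case.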

For example, the value range of the priority value corresponding to the triggering manner is set to [0, 2], where the on-demand trigger mode corresponds to priority value 0, the event trigger mode to priority value 1, and the scan trigger mode to priority value 2. Accordingly, if a replication task was triggered in the on-demand trigger mode, its first priority value is 0; if it was triggered in the event trigger mode, its first priority value is 1; and if it was triggered in the scan trigger mode, its first priority value is 2. For convenience of subsequent description, the first priority value is denoted P_t.

For another example, the value range of the priority value corresponding to the importance of the source service is defined as [0, 9], and the value is set according to the importance of the source service. Generally, the more important the source service, the lower the priority value. The second priority value of a replication task is thus any value from 0 to 9. For convenience of subsequent description, the second priority value is denoted P_p.

For another example, the priority value corresponding to the generation time of the data to be copied is defined as [0, 9]. Optionally, the copying device may determine, according to formula (1), a third priority value of each copy task.

P_d = 9*t/T    (1)

In the above formula (1), t represents the generation time of the data to be replicated by each replication task; T represents the life cycle of the data to be replicated by each replication task, with 0 < t < T; and P_d represents the third priority value of each replication task. It is worth noting that the life cycles of the data to be replicated by different replication tasks may be the same or different. The length of the life cycle can generally be determined by factors such as business needs and importance.

Based on the above priority values, the replication device may form a three-digit number whose hundreds, tens, and units digits are P_t, P_p, and P_d, respectively. This three-digit number is the priority of the replication task, and its value range is [000, 299]. The smaller the three-digit number, the higher the priority of the corresponding replication task.
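The three-digit priority can be illustrated with the short Python sketch below. Several details are assumptions rather than statements from the source: t is taken to be the time elapsed since the data was generated, so that fresher data gets a smaller value; P_d is truncated to an integer so that it fits in a single digit; and the names `TRIGGER_VALUE`, `freshness_value`, and `three_digit_priority` are hypothetical.

```python
# First priority value P_t for each triggering manner, as in the example above.
TRIGGER_VALUE = {"demand": 0, "event": 1, "scan": 2}


def freshness_value(t, T):
    """Third priority value P_d = 9*t/T from formula (1).

    Assumption: the result is truncated to an integer digit in 0..8.
    """
    if not 0 < t < T:
        raise ValueError("formula (1) requires 0 < t < T")
    return int(9 * t / T)


def three_digit_priority(trigger, importance_value, t, T):
    """Concatenate P_t, P_p, and P_d into one number in [000, 299].

    importance_value: second priority value P_p, an integer in 0..9.
    A smaller result means a higher priority.
    """
    if not 0 <= importance_value <= 9:
        raise ValueError("P_p must lie in [0, 9]")
    p_t = TRIGGER_VALUE[trigger]
    p_d = freshness_value(t, T)
    return 100 * p_t + 10 * importance_value + p_d
```

With these assumptions, an on-demand task whose source service has P_p = 3 and whose data is halfway through its life cycle (t = 5, T = 10) gets priority 034, which ranks ahead of any event-triggered or scan-triggered task.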

After the priorities are calculated, the replication tasks become comparable. The replication device can therefore apply for bandwidth resources for each replication task according to its priority and then execute each replication task using the bandwidth obtained. High-priority replication tasks apply for bandwidth first and are therefore executed first, which improves the efficiency of cross-domain data replication.

Specifically, the replication device can maintain multiple priority queues, for example 300 queues, and add each replication task to the queue that matches its priority. The replication device polls the priority queues in order of priority from high to low and requests bandwidth resources for each replication task it polls. In the system architecture shown in FIG. 3, the control server 31 is also responsible for allocating bandwidth for the entire replication system. The replication device sends a bandwidth request to the control server 31, and the control server 31 returns to the replication device, through the heartbeat message, the amount of bandwidth allocated together with the priority of the replication task for which the bandwidth was requested. The replication device receives the heartbeat message sent by the control server 31 and obtains the amount of bandwidth granted.
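The multi-priority queue can be sketched in Python as one FIFO queue per priority level. This is only an illustrative sketch: the class name `MultiPriorityQueue` and its methods are hypothetical, and it assumes the convention from the three-digit example that a smaller number means a higher priority.

```python
from collections import deque


class MultiPriorityQueue:
    """One FIFO queue per priority level; level 0 is the highest priority."""

    def __init__(self, levels=300):
        self._queues = [deque() for _ in range(levels)]

    def put(self, priority, task):
        """Add a task to the queue that matches its priority."""
        if not 0 <= priority < len(self._queues):
            raise ValueError("priority out of range")
        self._queues[priority].append(task)

    def poll(self):
        """Return the next task in priority order, or None if all are empty."""
        for queue in self._queues:  # scanned from highest to lowest priority
            if queue:
                return queue.popleft()
        return None
```

Tasks with equal priority leave the queue in arrival order, and the replication device would request bandwidth for each task returned by `poll`.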

It is worth noting that when multiple replication tasks request bandwidth resources at the same time, the control server 31 allocates bandwidth according to priority. A high-priority replication task can obtain not only a quota within the rated bandwidth; when the bandwidth resources have been allocated, the control server 31 can also satisfy the request of the high-priority replication task by overselling.

Further, the replication device can continuously poll the priority queues, pack multiple replication tasks into one replication job, and apply to the control server 31 for bandwidth resources in units of replication jobs. Executing multiple replication tasks in one replication job can effectively improve efficiency; however, if one replication job includes too many replication tasks, the job takes too long to execute and cannot meet the needs of replication tasks with high real-time requirements. The size of each replication job therefore needs to be limited. For example, the total number of files in a replication job may be limited, or the total file size, or both. For ease of description, the condition that limits the size of a replication job is called the job submission limit. The job submission limit may include at least one of the following: an upper limit on the total number of files and an upper limit on the total file size. The values of these upper limits may be set according to application requirements and are not limited in this embodiment.

The process by which the replication device applies for bandwidth resources under the job submission limit includes: packaging the replication tasks, according to the priority of each replication task and the preset job submission limit, to form at least one replication job; determining the priority of each replication job according to the priorities of the replication tasks it includes; and requesting bandwidth resources for the replication tasks included in each replication job according to the priority of that replication job.
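One possible reading of priority-based allocation with overselling is sketched below in Python. The source does not say how much bandwidth may be oversold or which priorities count as high, so the `oversell` budget, the `high_threshold` cutoff, and the function name `allocate_bandwidth` are all assumptions made for illustration.

```python
def allocate_bandwidth(requests, rated, oversell=0, high_threshold=100):
    """Grant bandwidth to simultaneous requests in priority order.

    requests: list of (priority, amount); a smaller number is a higher
        priority.
    rated: total rated bandwidth available.
    oversell: extra bandwidth that may be granted beyond the rated amount
        (assumed parameter).
    high_threshold: priorities below this value may draw on the oversell
        budget (assumed parameter).
    Returns the granted amounts, aligned with the input order.
    """
    grants = [0] * len(requests)
    remaining, extra = rated, oversell
    # Serve requests from the highest priority (smallest number) down.
    for i in sorted(range(len(requests)), key=lambda k: requests[k][0]):
        priority, amount = requests[i]
        granted = min(amount, remaining)
        remaining -= granted
        shortfall = amount - granted
        if shortfall > 0 and priority < high_threshold:
            # Rated bandwidth is used up: oversell for high-priority tasks.
            from_extra = min(shortfall, extra)
            extra -= from_extra
            granted += from_extra
        grants[i] = granted
    return grants
```

In this sketch a low-priority request simply receives whatever rated bandwidth is left, possibly none, while a high-priority request can exceed the rated total up to the oversell budget.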

Further, the replication device may package the replication tasks in the following manner:

The replication device obtains replication tasks one at a time, in order of priority, as the current replication task, and determines whether the current replication task reaches the job submission limit, for example whether the total number of files it includes reaches the upper limit of the total number of files and whether the total size of the files it includes reaches the upper limit of the total file size. If any of the results is yes, the current replication task reaches the job submission limit, which indicates that the task is too large or contains too many files; it does not need to be packaged with other replication tasks and can be used directly as a replication job. If the results are all no, the current replication task is eligible for packaging, and the replication device continues to obtain other replication tasks that do not reach the job submission limit until the sum of these replication tasks reaches the job submission limit, at which point they are packaged together to generate one replication job.
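The packaging rule described in this paragraph can be sketched in Python as follows. The sketch follows the paragraph above rather than the individual steps of FIG. 4. The names `Task`, `reaches_limit`, and `package` are hypothetical, and flushing the leftover small tasks into a final job when the input is exhausted is an assumption that the text does not state.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical replication task with the two sizes the limit checks."""
    name: str
    file_count: int
    total_bytes: int


def reaches_limit(file_count, total_bytes, max_files, max_bytes):
    """True if either upper limit of the job submission limit is reached."""
    return file_count >= max_files or total_bytes >= max_bytes


def package(tasks, max_files, max_bytes):
    """Group tasks, already sorted by priority, into replication jobs."""
    jobs, current = [], []
    files = size = 0
    for task in tasks:
        if reaches_limit(task.file_count, task.total_bytes, max_files, max_bytes):
            jobs.append([task])  # large task becomes a job by itself
            continue
        current.append(task)
        files += task.file_count
        size += task.total_bytes
        if reaches_limit(files, size, max_files, max_bytes):
            jobs.append(current)  # accumulated tasks reach the limit
            current, files, size = [], 0, 0
    if current:
        jobs.append(current)  # assumption: leftover tasks form a last job
    return jobs
```

For instance, with an upper limit of 5 files, a 10-file task becomes its own job, while a 2-file task and a 3-file task are packaged together.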

Combined with the multi-priority queue, the process of packaging the replication task by the replication device is as shown in FIG. 4, and includes the following steps:

401. Determine whether a replication task can be read from the current priority queue. If so, that is, a replication task is read, perform step 402; if not, that is, no replication task is read, perform step 408.

402. Determine whether the read replication task reaches the job submission limit. If the determination result is yes, perform step 403; if the determination result is no, perform step 404.

You can perform at least one of the following judgments:

Determine whether the total number of files included in the read replication task reaches the upper limit on the total number of files;

Determine whether the total size of the files included in the read replication task reaches the upper limit on the total file size;

If the result of every judgment performed is no, it is determined that the read replication task does not reach the job submission limit; if the result of at least one judgment is yes, it is determined that the read replication task reaches the job submission limit.

403, the read copy task is placed in the current priority queue, and the read pointer is incremented by 1, returning to step 401;

404, the read replication task is added to the current job queue, and step 405 is performed;

405. Determine whether the sum of the replication tasks in the current job queue reaches the job submission limit. If the determination result is yes, perform step 406; if the determination result is no, perform step 407.

You can perform at least one of the following judgments:

Determine whether the total number of files included in all replication tasks in the current job queue reaches the upper limit of the total number of files;

Determine whether the total size of the files included in all replication tasks in the current job queue reaches the upper limit on the total file size;

If the result of every judgment performed is no, it is determined that the sum of the replication tasks in the current job queue does not reach the job submission limit; if the result of at least one judgment is yes, it is determined that the sum of the replication tasks in the current job queue reaches the job submission limit.

406. Package the replication tasks in the current job queue as a replication job, determine the priority of the replication job, request bandwidth resources for the replication job, submit the replication job to start replication, and perform step 411.

407, the read pointer is incremented by 1, returning to step 401;

408. Determine whether the current job queue is empty. If the determination result is no, go to step 409. If the judgment result is yes, go to step 411.

409. Determine whether the number of waits exceeds the preset upper limit. If the determination result is no, perform step 410; if the determination result is yes, perform step 411.

410. Increment the wait count by 1, wait 300 ms, and return to step 407.

411. End this operation.

It can be seen from the above that the present embodiment can improve the execution efficiency and meet the timeliness requirement of the replication task by packaging a plurality of replication tasks into one replication job.

FIG. 5 is a schematic structural diagram of a cluster data replication apparatus according to another embodiment of the present disclosure. As shown in FIG. 5, the apparatus includes a determination module 51, a calculation module 52, and an execution module 53.

The determining module 51 is configured to determine at least one copy task that needs to replicate data across the cluster.

The calculating module 52 is configured to calculate a priority of each copy task in the at least one copy task.

The executing module 53 is configured to execute each copy task according to the priority of each copy task.

In an optional implementation, the determining module 51 is specifically configured to perform at least one of the following operations:

Periodically polling the data version management server and, when a version of the data is found to have changed, determining that a replication task for cross-cluster replication of the data whose version changed is needed;

Receiving a replication task that the control server sends upon receiving a replication task notification message;

The replication task notification message is reported to the control server by a first service when the first service generates a new version of the data, or by a second service that needs the data when the second service determines that the version of the data in the cluster where the second service is located is inconsistent with the version in the source cluster where the data is located.

In an optional implementation, the calculation module 52 is specifically configured to:

Calculating the priority of each replication task according to at least one of a triggering manner of each replication task, a generation time of data to be copied by each replication task, and an importance of source services corresponding to each replication task;

The source service corresponding to the replication task refers to the service that generates data that needs to be copied by the replication task.

Optionally, the calculating module 52 is further specifically configured to:

Determining, according to the triggering manner of each replication task, the first priority value of each replication task;

Determining a second priority value of each replication task according to the importance of the source service corresponding to each replication task;

Determining the third priority value of each replication task according to the generation time of the data to be copied by each replication task;

The priority of each replication task is generated according to the first priority value, the second priority value, and the third priority value of each replication task.

Further, when determining the third priority value of each replication task according to the generation time of the data to be copied by each replication task, the calculation module 52 is specifically configured to determine the third priority value of each replication task according to formula (1). For a description of formula (1), refer to the foregoing embodiment; details are not described herein again.

Further, when the calculation module 52 generates the priority of each replication task according to the first priority value, the second priority value, and the third priority value of each replication task, the calculation module 52 is specifically configured to:

The first priority value, the second priority value, and the third priority value of each replication task are spliced together in order from the high to the low to generate the priority of each replication task.

In an optional implementation, the executing module 53 is specifically configured to:

Request bandwidth resources for each replication task according to the priority of each replication task;

Each copy task is executed based on the requested bandwidth resource.

Further, when the execution module 53 requests bandwidth resources for each replication task according to the priority of each replication task, the execution module 53 is specifically configured to:

Each copy task is packaged according to a priority of each copy task and a preset job submission limit to form at least one copy job;

Determining the priority of each copy job based on the priority of the copy tasks included in each copy job in at least one copy job;

Request bandwidth resources for the replication tasks included in each replication job based on the priority of each replication job.

Further, when packaging the copy tasks according to the priority of each copy task and the preset job submission limit to form at least one copy job, the execution module 53 is specifically configured to:

The replication task is sequentially acquired as the current replication task according to the order of priority from high to low;

If the current replication task does not reach the job submission limit, continue to obtain other replication tasks that do not reach the job submission limit until the sum of the replication tasks that do not reach the job submission limit reaches the job submission limit, and package those replication tasks to generate a replication job;

If the current replication task reaches the job submission limit, the current replication task is directly treated as a replication job.

Optionally, the job submission restriction includes at least one of the following:

An upper limit on the total number of files;

An upper limit on the total file size.

The cluster data replication apparatus provided in this embodiment determines the replication tasks that need to replicate data across the cluster, calculates the priority of each replication task, and then executes each replication task according to its priority. Thus, when cross-domain bandwidth is limited, replication tasks are scheduled according to their priorities, with higher-priority replication tasks scheduled first; this allows the replication tasks to be scheduled reasonably and helps achieve rapid data replication.

In addition to the above technical solutions, the present application also provides a priority determination method for cross-cluster replication tasks to determine the priority of cross-cluster replication tasks. The process of the priority determining method is as shown in FIG. 6, and includes:

601. Acquire at least one of the following factors: the triggering manner of a replication task that needs to replicate data across the cluster, the generation time of the data to be copied by the replication task, and the importance of the source service corresponding to the replication task, where the source service corresponding to the replication task is the service that generates the data the replication task needs to replicate.

602. Calculate a priority of the replication task according to at least one of the foregoing factors.

In another embodiment, any two of the above three factors may be used in combination to determine the priority of each replication task. Specifically, a priority value can be calculated according to each factor, and the two priority values are then weighted and averaged to obtain the final priority of the replication task. Different weights can be configured in advance for different factors.

In another embodiment, the priority of the replication task may be calculated by combining the triggering manner of each replication task, the generation time of the data to be copied by each replication task, and the importance of the source service corresponding to each replication task.

Specifically, the copying device may determine the first priority value of the replication task according to the triggering manner of the replication task; determine the second priority value of the replication task according to the importance of the source service corresponding to the replication task; determine the third priority value of the replication task according to the generation time of the data to be copied by the replication task; and generate the priority of the replication task according to the first priority value, the second priority value, and the third priority value of the replication task.

Optionally, the copying device may pre-set the weights for the importance of the source service corresponding to the replication task, the triggering manner of the replication task, and the generation time of the data to be copied by the replication task, for example, w1, w2, and w3, respectively. Based on this, the copying apparatus may perform weighted averaging on the first priority value, the second priority value, and the third priority value of the copy task according to the weight of the corresponding factor to generate a priority of the copy task. For example, the priority of the replication task = w1 * the first priority value + w2 * the second priority value + w3 * the third priority value.
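As a minimal sketch of the weighted combination, the following assumes that each weight is paired with the priority value of its own factor and that the factors are identified by name. The factor names, the sample weights, and the function name `weighted_priority` are illustrative assumptions only; the embodiment does not fix concrete weights.

```python
from typing import Dict


def weighted_priority(values: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-factor priority values into one priority as a weighted sum.

    `values` maps a factor name to its priority value; `weights` maps the
    same factor names to preconfigured weights (hypothetical names).
    """
    missing = set(values) - set(weights)
    if missing:
        raise KeyError(f"no weight configured for: {sorted(missing)}")
    return sum(weights[name] * value for name, value in values.items())


# Hypothetical weights for the three factors.
WEIGHTS = {"trigger": 0.5, "importance": 0.3, "generation_time": 0.2}

# Three-factor case: first, second, and third priority values.
p3 = weighted_priority(
    {"trigger": 1, "importance": 3, "generation_time": 6}, WEIGHTS
)

# Two-factor case: only the factors that were acquired are combined.
p2 = weighted_priority({"trigger": 2, "importance": 4}, WEIGHTS)

print(p3, p2)
```

The same helper covers the two-factor embodiment, because only the factors present in `values` contribute to the sum.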

Optionally, the copying device may also splice the first priority value, the second priority value, and the third priority value of each replication task together in order from high to low to generate the priority of the replication task.

For example, the priority value range of the triggering mode is set to [0, 2], where the demand triggering mode corresponds to priority value 0, the event triggering mode corresponds to priority value 1, and the scan triggering mode corresponds to priority value 2. Based on this, if the triggering mode of the replication task is the demand triggering mode, the first priority value of the replication task is determined to be 0; if it is the event triggering mode, the first priority value is determined to be 1; and if it is the scan triggering mode, the first priority value is determined to be 2. For the convenience of subsequent description, the first priority value is recorded as Pt.

For example, the priority value range of the importance of the source service is defined as [0, 9], and the value can be set according to the importance of the source service. Generally, the higher the importance of the source service, the lower the priority value. The second priority value of the replication task is any value from 0 to 9. For the convenience of subsequent description, the second priority value is recorded as Pp.

For another example, the priority value corresponding to the generation time of the data to be copied is defined as [0, 9]. Optionally, the copying device may determine, according to formula (1), a third priority value of each copy task. For the formula (1), refer to the foregoing description, and details are not described herein again.

Based on the above, the copying device may combine the above priority values into a three-digit number whose hundreds, tens, and ones digits are Pt, Pp, and Pd, respectively. This three-digit number is the priority of the copy task; its value range is [000, 299], and the smaller the number, the higher the priority of the corresponding copy task.
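The splicing scheme can be illustrated with a short sketch. The mapping of trigger modes to 0, 1, and 2 follows the example above, and Pd uses the formula Pd = 9*t/T stated in the claims. Truncating Pd to a single digit, the string keys for the trigger modes, and the function names are assumptions made for illustration.

```python
# Trigger-mode values from the example above; the string keys are hypothetical.
TRIGGER_PRIORITY = {"demand": 0, "event": 1, "scan": 2}


def generation_time_priority(t: float, life_cycle: float) -> int:
    """Pd = 9*t/T with 0 < t < T; truncation to one digit is an assumption."""
    if not 0 < t < life_cycle:
        raise ValueError("expected 0 < t < T")
    return int(9 * t / life_cycle)


def spliced_priority(trigger: str, importance: int, t: float, life_cycle: float) -> int:
    """Build the three-digit priority: hundreds = Pt, tens = Pp, ones = Pd."""
    pt = TRIGGER_PRIORITY[trigger]
    if not 0 <= importance <= 9:
        raise ValueError("Pp must be in [0, 9]")
    pd = generation_time_priority(t, life_cycle)
    return pt * 100 + importance * 10 + pd


# Smaller number = higher priority, so sorting ascending yields the schedule order.
examples = [
    spliced_priority("scan", 9, 99, 100),
    spliced_priority("event", 3, 30, 60),
    spliced_priority("demand", 0, 1, 100),
]
ordered = [f"{p:03d}" for p in sorted(examples)]
print(ordered)
```

Because the trigger mode occupies the most significant digit, it dominates the comparison; the importance and generation-time values only break ties among tasks with the same trigger mode.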

In this embodiment, the priority of a cross-cluster replication task can be calculated, which provides the basis for subsequent processing that depends on the priority of the replication task (for example, replication task scheduling).

FIG. 7 is a schematic structural diagram of a priority determining apparatus according to another embodiment of the present disclosure. As shown in FIG. 7, the apparatus includes an information acquisition module 71 and a priority calculation module 72.

The information obtaining module 71 is configured to acquire at least one of a triggering manner of a replication task that needs to replicate data across the cluster, a generation time of the data to be copied by the replication task, and an importance of the source service corresponding to the replication task, where the replication task corresponds The source service refers to the service that generates the data that the replication task needs to replicate.

The priority calculation module 72 is configured to calculate a priority of the replication task according to at least one factor acquired by the information acquisition module 71.

In an embodiment, the information obtaining module 71 may obtain one of the above three factors, and the priority calculation module 72 determines the priority of the replication task according to the factor acquired by the information obtaining module 71. For example, the priority calculation module 72 determines the priority of the replication task based only on the triggering manner of the replication task. For another example, the priority calculation module 72 determines the priority of the replication task based only on the importance of the source service corresponding to the replication task. For yet another example, the priority calculation module 72 determines the priority of the replication task based only on the generation time of the data that the replication task needs to replicate.

In another embodiment, the information obtaining module 71 may obtain any two of the above three factors, and the priority calculating module 72 determines the priority of each copy task according to two factors acquired by the information acquiring module 71. The priority calculation module 72 can calculate a priority value according to each factor, and then perform weighted averaging on the two priority values to obtain the final priority of the replication task. Among them, different weights can be configured in advance for different factors.

For example, the priority calculation module 72 may combine the importance of the source service corresponding to the replication task and the triggering mode of the replication task: it determines a priority value P1 according to the importance of the source service corresponding to the replication task and a priority value P2 according to the triggering mode of the replication task. If the weights preset for the importance of the source service and the triggering mode are w1 and w2, respectively, the final priority of the replication task is w1*P1+w2*P2.

In another embodiment, the information obtaining module 71 can obtain all three of the foregoing factors, and the priority calculation module 72 can calculate the priority of the replication task by combining the triggering manner of the replication task, the generation time of the data that the replication task needs to replicate, and the importance of the source service corresponding to the replication task.

Specifically, the priority calculation module 72 may determine the first priority value of the replication task according to the triggering manner of the replication task; determine the second priority value of the replication task according to the importance of the source service corresponding to the replication task; determine the third priority value of the replication task according to the generation time of the data to be copied by the replication task; and generate the priority of the replication task according to the first priority value, the second priority value, and the third priority value of the replication task.

Optionally, weights, for example w1, w2, and w3, may be preset for the three factors: the importance of the source service corresponding to the replication task, the triggering manner of the replication task, and the generation time of the data to be copied by the replication task. Based on this, the priority calculation module 72 may perform weighted averaging on the first priority value, the second priority value, and the third priority value of the replication task according to the weight of the corresponding factor to generate the priority of the replication task. For example, the priority of the replication task = w1 * the first priority value + w2 * the second priority value + w3 * the third priority value.

Optionally, the priority calculation module 72 may also splice the first priority value, the second priority value, and the third priority value of the replication task together in order from high to low to generate the priority of the replication task.

Based on the above, the priority calculation module 72 may combine the priority values into a three-digit number whose hundreds, tens, and ones digits are Pt, Pp, and Pd, respectively. This three-digit number is the priority of the replication task; its value range is [000, 299], and the smaller the number, the higher the priority of the corresponding replication task.

The priority determining apparatus provided in this embodiment can calculate the priority of the replication task across the cluster, and provide conditions for the subsequent replication task processing process based on the priority of the replication task (for example, the replication task scheduling process).

One of ordinary skill in the art will appreciate that all or part of the steps to implement the various method embodiments described above may be accomplished by hardware associated with the program instructions. The aforementioned program can be stored in a computer readable storage medium. The program, when executed, performs the steps including the foregoing method embodiments; and the foregoing storage medium includes various media that can store program codes, such as a ROM, a RAM, a magnetic disk, or an optical disk.

Finally, it should be noted that the above embodiments are only intended to explain the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced, and that such modifications or substitutions do not cause the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims

A cluster data replication method, comprising:

Determining at least one replication task that requires data to be replicated across the cluster;

Calculating a priority of each of the at least one replication task;

Executing each of the replication tasks according to the priority of each of the replication tasks.
The method of claim 1, wherein the determining at least one replication task that requires data replication across the cluster comprises at least one of the following operations:

Periodically polling the data version management server and, when a version of the data is found to have changed, determining that a replication task for cross-cluster replication of the data whose version changed is needed;

Obtaining a replication task sent by the control server according to a received replication task notification message;

The replication task notification message is reported to the control server by a first service when the first service generates a new version of the data, or by a second service that needs the data when the second service determines that the version of the data in the cluster where the second service is located is inconsistent with the version in the source cluster where the data is located.
The method according to claim 1, wherein the calculating the priority of each of the at least one copy task comprises:

Calculating the priority of each of the replication tasks according to at least one of a triggering manner of each of the replication tasks, a generation time of the data to be copied by each of the replication tasks, and an importance of the source service corresponding to each of the replication tasks;

The source service corresponding to the replication task refers to a service that generates data that needs to be copied by the replication task.
The method according to claim 3, wherein the calculating the priority of each of the replication tasks according to the triggering manner of each of the replication tasks, the generation time of the data to be copied by each of the replication tasks, and the importance of the source service corresponding to each of the replication tasks comprises:

Determining, according to the triggering manner of each of the replication tasks, a first priority value of each of the replication tasks;

Determining, according to the importance of the source service corresponding to each of the replication tasks, a second priority value of each of the replication tasks;

Determining a third priority value of each of the replication tasks according to a generation time of the data to be copied by each of the replication tasks;

The priority of each of the replication tasks is generated according to the first priority value, the second priority value, and the third priority value of each of the replication tasks.
The method according to claim 4, wherein the determining the third priority value of each of the replication tasks according to the generation time of the data to be copied by the replication tasks comprises:

Determining, according to the formula Pd = 9*t/T, the third priority value of each copy task;

Where t represents the time at which the data to be copied by each of the replication tasks is generated;

T represents the life cycle of the data that each copy task needs to copy, and 0<t<T;

Pd represents the third priority value of each copy task.
The method according to claim 4, wherein the generating the priority of each of the replication tasks according to the first priority value, the second priority value, and the third priority value of each of the replication tasks comprises:

The first priority value, the second priority value, and the third priority value of the respective replication tasks are spliced together in order from the high to the low to generate the priorities of the replication tasks.
The method according to any one of claims 1-6, wherein the executing each of the replication tasks according to the priority of each of the replication tasks comprises:

Requesting bandwidth resources for each of the replication tasks according to the priority of each of the replication tasks;

Executing each of the replication tasks based on the requested bandwidth resources.
The method of claim 7, wherein the requesting bandwidth resources for each of the replication tasks according to the priority of each of the replication tasks comprises:

Packaging the copy tasks according to the priority of each of the copy tasks and a preset job submission limit to form at least one copy job;

Determining a priority of each copy job according to the priorities of the copy tasks included in each copy job in the at least one copy job;

Requesting bandwidth resources for the copy tasks included in each copy job according to the priority of each copy job.
The method according to claim 8, wherein the packaging the copy tasks according to the priority of each of the copy tasks and the preset job submission limit to form at least one copy job comprises:

The replication task is sequentially acquired as the current replication task according to the order of priority from high to low;

If the current copy task does not reach the job submission limit, continuing to obtain other copy tasks that do not reach the job submission limit until the sum of the plurality of copy tasks that do not reach the job submission limit reaches the job submission limit, and packaging the plurality of copy tasks that do not reach the job submission limit to generate a copy job;

If the current copy task reaches the job submission limit, the current copy task is directly used as a copy job.
The method of claim 8 wherein said job submission limit comprises at least one of:

An upper limit on the total number of files;

An upper limit on the total file size.
A cluster data replication device, comprising:

a determining module, configured to determine at least one replication task that requires data to be replicated across the cluster;

a calculation module, configured to calculate a priority of each of the at least one replication task;

an execution module, configured to execute each of the replication tasks according to the priority of each of the replication tasks.
The apparatus according to claim 11, wherein the determining module is specifically configured to perform at least one of the following operations:

Periodically polling the data version management server and, when a version of the data is found to have changed, determining that a replication task for cross-cluster replication of the data whose version changed is needed;

Obtaining a replication task sent by the control server according to a received replication task notification message;

The replication task notification message is reported to the control server by a first service when the first service generates a new version of the data, or by a second service that needs the data when the second service determines that the version of the data in the cluster where the second service is located is inconsistent with the version in the source cluster where the data is located.
The device according to claim 11, wherein the calculation module is specifically configured to:

Calculating the priority of each of the replication tasks according to at least one of a triggering manner of each of the replication tasks, a generation time of the data to be copied by each of the replication tasks, and an importance of the source service corresponding to each of the replication tasks;

The source service corresponding to the replication task refers to a service that generates data that needs to be copied by the replication task.
The device according to claim 13, wherein the calculation module is specifically configured to:

Determining, according to the triggering manner of each of the replication tasks, a first priority value of each of the replication tasks;

Determining, according to the importance of the source service corresponding to each of the replication tasks, a second priority value of each of the replication tasks;

Determining a third priority value of each of the replication tasks according to a generation time of the data to be copied by each of the replication tasks;

Generating the priority of each of the replication tasks according to the first priority value, the second priority value, and the third priority value of each of the replication tasks.
The device according to claim 14, wherein the calculation module is specifically configured to:

Determining, according to the formula Pd = 9*t/T, the third priority value of each copy task;

Where t represents the time at which the data to be copied by each of the replication tasks is generated;

T represents the life cycle of the data that each copy task needs to copy, and 0<t<T;

Pd represents the third priority value of each copy task.
The device according to claim 14, wherein the calculation module is specifically configured to:

The first priority value, the second priority value, and the third priority value of the respective replication tasks are spliced together in order from the high to the low to generate the priorities of the replication tasks.
The device according to any one of claims 11 to 16, wherein the execution module is specifically configured to:

Requesting bandwidth resources for each of the replication tasks according to the priority of each of the replication tasks;

Executing each of the replication tasks based on the requested bandwidth resources.
The apparatus according to claim 17, wherein the execution module is specifically configured to:

Packaging the copy tasks according to the priority of each of the copy tasks and a preset job submission limit to form at least one copy job;

Determining a priority of each copy job according to the priorities of the copy tasks included in each copy job in the at least one copy job;

Requesting bandwidth resources for the copy tasks included in each copy job according to the priority of each copy job.
The apparatus according to claim 18, wherein the execution module is specifically configured to:

The replication task is sequentially acquired as the current replication task according to the order of priority from high to low;

If the current copy task does not reach the job submission limit, continuing to obtain other copy tasks that do not reach the job submission limit until the sum of the plurality of copy tasks that do not reach the job submission limit reaches the job submission limit, and packaging the plurality of copy tasks that do not reach the job submission limit to generate a copy job;

If the current copy task reaches the job submission limit, the current copy task is directly used as a copy job.
The apparatus of claim 18, wherein the job submission limit comprises at least one of the following:

An upper limit on the total number of files;

An upper limit on the total file size.
A priority determining method, comprising:

Acquiring at least one of the following factors: a triggering manner of a replication task that needs to replicate data across the cluster, a generation time of the data that the replication task needs to replicate, and an importance of the source service corresponding to the replication task, where the source service corresponding to the replication task refers to the service that generates the data that the replication task needs to replicate;

Calculating the priority of the replication task according to the at least one factor.
A priority determining apparatus, comprising:

an information obtaining module, configured to acquire at least one of the following factors: a triggering manner of a replication task that needs to replicate data across the cluster, a generation time of the data to be copied by the replication task, and an importance of the source service corresponding to the replication task, where the source service corresponding to the replication task refers to a service that generates the data that needs to be copied by the replication task;

a priority calculation module, configured to calculate a priority of the replication task according to the at least one factor.