WO2019223283A1

WO2019223283A1 - Combinatorial optimization scheduling method for predicting task execution time

Info

Publication number: WO2019223283A1
Application number: PCT/CN2018/118871
Authority: WO
Inventors: 郭乃网; 田英杰; 苏运; 陈睿; 宋岩; 沈泉江; 庞天宇; 方炯; 杨洪山
Original assignee: 国网上海市电力公司; 华东电力试验研究院有限公司; 星环信息科技（上海）有限公司
Priority date: 2018-05-24
Filing date: 2018-12-03
Publication date: 2019-11-28
Also published as: CN108769182A; JP2020524313A

Abstract

Disclosed is a combinatorial optimization scheduling method for predicting task execution time, comprising: combinatorially optimizing a predictive execution task scheduling model on the basis of combination re-execution scheduling technology (CREST); on the basis of the predictive execution task scheduling model, collecting fine-grained resource information of inter-node bandwidth and node processing capacity; obtaining an expected completion time of a rescheduling task according to the fine-grained resource information of inter-node bandwidth and node processing capacity; designing a scheme for re-executing a slow task combination, and obtaining an expected completion time of slow task combination re-execution; designing a slow task optimal combination re-execution scheme, and obtaining a target equation for combination re-execution optimization according to the expected completion time of the re-scheduling task and the expected completion time of the slow task combination re-execution; and setting weights, and obtaining an effectively shortened predictive execution task running time using the target equation for combination re-execution optimization.

Description

Combinatorial optimization scheduling method for predicting task execution time

This application claims priority from a Chinese patent application filed with the Chinese Patent Office on May 24, 2018 with application number 201810510271.X, the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to an offline job scheduling method, for example, to a combined optimized scheduling method for predicting the execution time of a task.

Background

Big data applications for power supply and distribution need to process massive data in real time. They rely on parallel processing technologies and also emphasize the flexibility, reliability, manageability, and economy of the related computing and storage capabilities. When scheduling offline jobs such as MapReduce, the running time of a MapReduce job is determined by the sum of the running time of the longest-running Map task and that of the longest-running Reduce task. Therefore, to minimize the running time of the job, it is necessary to minimize the maximum running time of the Map tasks and the maximum running time of the Reduce tasks, which makes minimizing the running time of a MapReduce job a Min-Max optimization problem. The related technology can basically minimize the running time of the job, but even in the optimal case it cannot effectively increase the performance gain, and it is not well suited to the heterogeneity and dynamics of resources in a distributed computing environment.
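The Min-Max structure described above can be written compactly. This is only an illustrative sketch: the symbols $T_{job}$, $t^{map}_i$, and $t^{reduce}_j$ (running times of the job, the i-th Map task, and the j-th Reduce task) are introduced here as assumptions and do not appear in the application.

```latex
T_{job} = \max_{i} t^{map}_{i} + \max_{j} t^{reduce}_{j},
\qquad
\text{objective:}\quad
\min \Bigl( \max_{i} t^{map}_{i} + \max_{j} t^{reduce}_{j} \Bigr)
```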

Summary of the Invention

In order to overcome the shortcomings of the foregoing related technologies, the present application provides a combined optimal scheduling method for predicting the execution time of a task.

Because the number of Reduce tasks is relatively small, the data set each Reduce task has to transmit is small, and its input data must be obtained from all Map tasks, Reduce tasks have no data locality problem. CREST is therefore used to reduce the running time of the slow tasks among all Map tasks, and the problem can be further transformed into how to minimize the execution time $t_{spec}$ of the predictive execution tasks for slow Map tasks.

In a network that supports offline scheduling, the bandwidth between task nodes is limited and the processing capabilities of task nodes are heterogeneous, so data locality needs to be considered when scheduling predictive execution tasks. To address the heterogeneity and dynamics of resources in a distributed computing environment, a series of resource monitoring methods and systems have been developed. In addition, methods such as execution time models for the tasks of a MapReduce job, and running sampled tasks to predict the processing capability of a node for those tasks, have achieved good results in practice. Therefore, fine-grained resource information such as the bandwidth between nodes and the task processing capability of nodes in a distributed computing environment can be used to optimize predictive execution task scheduling.

Based on the foregoing, the present application relates to a combined optimization scheduling method for predicting the execution time of an execution task. The method includes the following steps S1 to S6.

In step S1: a predictive execution task scheduling model is combinatorially optimized on the basis of the combined re-execution speculative technology (Combination Re-Execution Speculative Technology, CREST).

CREST includes: (i) the CREST combined optimized predictive execution task scheduling model and (ii) the CREST combined re-execution scheduling algorithm and its implementation. Because every pair of computing nodes has a direct network connection, that is, communication between any two nodes does not need to be forwarded by a third node, a complete graph is used to represent both the network topology and the task migration graph of the nodes. The CREST combined optimized predictive execution task scheduling model is a directed complete graph. Each edge represents a possible migration path: the task running at the starting point of the edge is ended, migrated to the end point of the edge, and re-executed there.
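The directed complete graph can be sketched in a few lines of Python. This is a minimal illustration, not the application's implementation: the function name `build_migration_graph` and the node names are hypothetical.

```python
from itertools import permutations


def build_migration_graph(nodes):
    """Return every ordered pair (u, v) with u != v as a candidate migration edge.

    In a directed complete graph each pair of distinct nodes is connected
    in both directions, so n nodes yield n * (n - 1) edges.
    """
    return list(permutations(nodes, 2))


# Hypothetical cluster of three computing nodes.
nodes = ["n1", "n2", "n3"]
edges = build_migration_graph(nodes)
print(len(edges))  # 6 directed edges for 3 nodes
```

Each returned pair `(u, v)` stands for "end the task on u, migrate it to v, and re-execute it there".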

There are two mechanisms for predictive execution task scheduling: LATE and CREST.

1) LATE (Longest Approximate Time to End): a scheduling mechanism based on the predicted longest time to completion of a task; it runs the predictive execution task directly on idle resources.

2) CREST: a combined re-execution mechanism. A combined re-execution scheme can be regarded as a loop-free path from a slow task node to an idle node. For each edge on the path, the task at its starting point is migrated to its end point and re-executed there, cascading in order along the path, while the slow task on the slow task node continues to execute. It is assumed that nodes do not exchange jobs in the combined re-execution mechanism, that is, the path contains no loop. This is reasonable because the data of a job that has already started running has been transmitted to the local node, the loads of the Map tasks are approximately the same, and data locality has already been taken into consideration, so exchanging jobs between nodes brings no performance gain.

In step S2: fine-grained resource information of bandwidth between nodes and processing capabilities of the nodes is introduced.

For a directed edge $(u, v)$ of the complete graph, let $T_u$ denote the Map task running on node $u$, $d(T_u)$ the data processed by $T_u$, $|d(T_u)|$ the size of $d(T_u)$, $PR_v$ the progress rate at which node $v$ completes this type of Map task, and $bw(u, v)$ the bandwidth of the network connection between nodes $u$ and $v$.

In step S3: obtaining the estimated completion time of the rescheduled task.

If the Map task executing on a given node $u$ is rescheduled for execution on node $v$, its estimated completion time (Expected Time to Finish, ETF) is denoted $t'_{etf}(u, v)$ and defined as:

$$t'_{etf}(u, v) = t_c + t_{data\_movement} + \frac{1}{PR_v}$$

where $t_c$ is the current time. The data transmission time $t_{data\_movement}$ is given by:

$$t_{data\_movement} = \frac{|d(T_u)|}{bw(u, v)}$$

Note that $d(T_u)$ can also be transmitted to $v$ from nodes other than $u$. In the algorithm implementation, a copy (replica) optimization selection strategy is used to speed up the transmission.
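The estimate can be sketched as follows. Several points are assumptions made for illustration only: the estimated completion time is taken as the current time plus the transfer time plus `1 / PR_v` (with the progress rate expressed as the fraction of a task completed per second), and the copy selection strategy is modeled simply as pulling the data from whichever copy holder has the highest bandwidth to `v`. The application does not spell out the selection strategy, and all function names and numbers are hypothetical.

```python
def data_movement_time(data_size, bandwidth_to_v, copy_holders):
    """Transfer time of d(T_u) to v, pulling from the fastest copy holder.

    bandwidth_to_v maps a source node name to bw(source, v).
    Picking the highest-bandwidth source is an assumed strategy.
    """
    best_bw = max(bandwidth_to_v[src] for src in copy_holders)
    return data_size / best_bw


def expected_time_to_finish(t_c, data_size, bandwidth_to_v, copy_holders, pr_v):
    """Assumed form: t'_etf(u, v) = t_c + t_data_movement + 1 / PR_v."""
    t_move = data_movement_time(data_size, bandwidth_to_v, copy_holders)
    return t_c + t_move + 1.0 / pr_v


# Hypothetical numbers: a 640 MB input with copies on u and r,
# and a node v that completes 2% of such a task per second.
bandwidth_to_v = {"u": 40.0, "r": 80.0}  # MB/s
etf = expected_time_to_finish(
    t_c=100.0,
    data_size=640.0,
    bandwidth_to_v=bandwidth_to_v,
    copy_holders=["u", "r"],
    pr_v=0.02,
)
print(etf)  # 100 + 640/80 + 1/0.02 = 158.0
```

Pulling from `r` instead of `u` halves the transfer time in this example (8 s instead of 16 s), which is the kind of saving a copy selection strategy aims for.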

In step S4: obtaining a re-execution slow task combination scheme.

Let $s$ denote a node running a slow task, $f$ an idle node, and $PATH(s, f)$ a path from $s$ to $f$. Migrating, in cascade along $PATH(s, f)$, the task on the starting node of every edge to the terminating node of that edge for execution is defined as a combined re-execution scheme for the slow task along $PATH(s, f)$.

The estimated completion time of a slow task combined re-execution along a given path is defined as the maximum estimated completion time over all migrated jobs:

$$t_{spec}(PATH(s, f)) = \max_{(u, v) \in PATH(s, f)} t'_{etf}(u, v)$$
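Because all migrations along the path proceed in parallel, the slowest migrated job determines when the combined re-execution finishes. A minimal sketch, assuming the per-edge estimates are already available in a dictionary (the name `combined_reexecution_time` and the values are hypothetical):

```python
def combined_reexecution_time(path, etf):
    """Estimated completion time of a combined re-execution along `path`.

    `path` is a list of nodes from the slow node s to the idle node f;
    `etf` maps each directed edge (u, v) to its estimated completion time.
    The result is the maximum over the consecutive edges of the path.
    """
    return max(etf[(u, v)] for u, v in zip(path, path[1:]))


# Hypothetical estimates in seconds: s -> a -> f.
etf = {("s", "a"): 120.0, ("a", "f"): 150.0}
print(combined_reexecution_time(["s", "a", "f"], etf))  # 150.0
```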

In step S5: obtaining an optimal combination re-execution scheme for slow tasks.

The optimal slow task combined re-execution scheme is defined as the combined re-execution scheme with the smallest estimated completion time and is denoted CRES. Its estimated completion time is:

$$t_{spec}(CRES) = \min_{\forall\, PATH(s, f)} t_{spec}(PATH(s, f))$$

where the minimum is taken over all paths connecting $s$ and $f$. The estimated completion time of the slow task predictive execution obtained along the optimal combined re-execution scheme is then:

$$t_{cres} = \min_{\forall\, PATH(s, f)} t_{spec}(PATH(s, f))$$

Substituting the estimated completion time from step S3 gives the objective equation of the combined re-execution optimization:

$$t_{cres} = \min_{\forall\, PATH(s, f)} \; \max_{(u, v) \in PATH(s, f)} t'_{etf}(u, v)$$

In step S6: the weight is adjusted to obtain a shortened running time of the predicted execution task.

Set the weight of each directed edge $(u, v)$ in the task scheduling model graph to $t'_{etf}(u, v)$, which is greater than zero. The optimal slow task combined re-execution scheme then corresponds to an optimal path (CRES) in the task scheduling model graph, where the weight of a path is not the arithmetic sum of the weights of its edges but the maximum of those weights. After this optimization, the running time of the predictive execution task can be effectively shortened.
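A path whose weight is the maximum of its edge weights can be minimized with a Dijkstra-style search in which path costs are combined with `max` instead of addition. The sketch below is one possible way to do this, not necessarily the CREST scheduling algorithm itself; the function name and edge weights are hypothetical.

```python
import heapq


def optimal_combined_reexecution(nodes, etf, s, f):
    """Return (t_cres, path): the s-to-f path minimizing the largest edge weight.

    Dijkstra variant: the cost of reaching a node is the largest edge
    weight seen so far on the path (the bottleneck), not the sum.
    """
    best = {n: float("inf") for n in nodes}
    best[s] = 0.0
    parent = {}
    heap = [(0.0, s)]
    while heap:
        bottleneck, u = heapq.heappop(heap)
        if bottleneck > best[u]:
            continue  # stale heap entry
        if u == f:
            break
        for v in nodes:
            if v == u or (u, v) not in etf:
                continue
            candidate = max(bottleneck, etf[(u, v)])
            if candidate < best[v]:
                best[v] = candidate
                parent[v] = u
                heapq.heappush(heap, (candidate, v))
    if best[f] == float("inf"):
        return best[f], []  # f is unreachable from s
    path = [f]
    while path[-1] != s:
        path.append(parent[path[-1]])
    return best[f], path[::-1]


# Hypothetical edge weights t'_etf(u, v) in seconds.
nodes = ["s", "a", "f"]
etf = {
    ("s", "a"): 120.0, ("a", "f"): 150.0, ("s", "f"): 200.0,
    ("a", "s"): 300.0, ("f", "s"): 300.0, ("f", "a"): 300.0,
}
t_cres, path = optimal_combined_reexecution(nodes, etf, "s", "f")
print(t_cres, path)  # 150.0 ['s', 'a', 'f']
```

Replacing addition with `max` keeps the search correct because extending a path can never lower its bottleneck, which is the same monotonicity property Dijkstra's algorithm relies on for sums of positive weights.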

Compared with related technologies, this application has the following advantages:

(1) By using the CREST technology, this application reduces the average execution time of predictive execution tasks by more than 50%, and in the best case the performance gain can reach 70%. In addition, with the CREST technology there is a probability of more than 50% of obtaining a performance gain of more than 40%, and the performance improvement increases further as the number of data copies increases.

(2) This application introduces fine-grained resource information on inter-node bandwidth and node processing capability to design a combinatorial optimization mechanism that meets the data locality requirements of predictive execution tasks and is well suited to the heterogeneity and dynamics of resources in distributed computing environments.

Brief description of the drawings

FIG. 1 is a flowchart of a combined optimal scheduling method for predicting execution time of an execution task according to an embodiment of the present application.

Detailed description

The following describes the present application in detail with reference to the drawings and specific embodiments.

Examples

As shown in FIG. 1, the present application relates to a combined optimization scheduling method for predicting the execution time of a task, including the following steps 110 to 160.

In step 110, a predictive execution task scheduling model is combinatorially optimized on the basis of the CREST technology.

In step 120, fine-grained resource information of bandwidth between nodes and processing capabilities of the nodes is collected.

The fine-grained resource information mainly includes the Map tasks running on the nodes, the data processed by the Map tasks, the size of the data processed by the Map tasks, the progress rate at which a node completes Map tasks of this type, and the bandwidth of the network connection between nodes.
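One way to hold this information in memory is sketched below. The class and field names are hypothetical and chosen only for illustration; the `transfer_time` helper assumes the transfer time is the data size divided by the link bandwidth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapTaskInfo:
    task_id: str
    node: str         # node u on which the Map task T_u runs
    input_bytes: int  # |d(T_u)|, size of the data processed by the task


@dataclass
class ResourceSnapshot:
    tasks: dict          # node name -> MapTaskInfo
    progress_rate: dict  # node name -> PR for this type of Map task
    bandwidth: dict      # (u, v) -> bandwidth of the link in bytes per second

    def transfer_time(self, u, v):
        """Time to move the input of the task on u to v (assumed size / bandwidth)."""
        return self.tasks[u].input_bytes / self.bandwidth[(u, v)]


# Hypothetical snapshot with one running Map task.
snapshot = ResourceSnapshot(
    tasks={"u": MapTaskInfo("map_0007", "u", 128_000_000)},
    progress_rate={"u": 0.004, "v": 0.02},
    bandwidth={("u", "v"): 64_000_000},
)
print(snapshot.transfer_time("u", "v"))  # 2.0 seconds
```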

In step 130, the estimated completion time of the rescheduled task is obtained according to the fine-grained resource information of the inter-node bandwidth and the processing capability of the node.

In step 140, a re-execution slow task combination scheme is designed, and an estimated completion time of the slow task combination re-execution is obtained.

In step 150, an optimal combined re-execution plan for slow tasks is designed, and an objective equation of combined re-execution optimization is obtained according to the estimated completion time of the re-scheduled task and the estimated completion time of the combined re-execution of the slow tasks.

In step 160, weights are set, and the objective equation of the combined re-execution optimization is used to obtain an effectively shortened running time of the predictive execution task.

In a network that supports offline scheduling, the bandwidth between task nodes is limited and the processing capabilities of the task nodes are heterogeneous, so data locality needs to be considered when scheduling predictive execution tasks. To address the heterogeneity and dynamics of resources in a distributed computing environment, a series of resource monitoring methods and systems have been developed. In addition, methods such as execution time models for the tasks of a MapReduce job, and running sampled tasks to predict the processing capability of a node for those tasks, have achieved good results in practice. Therefore, fine-grained resource information such as the bandwidth between nodes and the task processing capability of nodes in a distributed computing environment can be used to optimize predictive execution task scheduling.

CREST is an optimization technology that uses fine-grained resource information and a combinatorial optimization mechanism to meet the data locality requirements of predictive execution tasks, eliminate the time overhead of data transmission, and significantly reduce the execution time of the Map phase of the entire job. CREST technology includes two parts: the CREST combined optimized predictive execution task scheduling model, and the CREST combined re-execution scheduling algorithm and its implementation. Because every pair of computing nodes has a direct network connection, that is, communication between any two nodes does not need to be forwarded by a third node, we can use a complete graph to represent both the network topology and the task migration graph of the nodes. The CREST combined optimized predictive execution task scheduling model is a directed complete graph. Each edge represents a possible migration path: the task running at the starting point of the edge is ended, migrated to the end point of the edge, and re-executed there. Generally, there are two mechanisms for scheduling predictive execution tasks: LATE and CREST.

1) LATE: runs the predictive execution task directly on idle resources.

2) CREST: a combined re-execution mechanism. A combined re-execution scheme can be regarded as a loop-free path from a slow task node to an idle node in the above graph (the directed complete graph). For each edge on the path, the task at its starting point is migrated to its end point and re-executed there, cascading in order along the path, while the slow task on the slow task node continues to execute. We assume that nodes do not exchange jobs in the combined re-execution mechanism, that is, the path contains no loop. This is reasonable in practice because the data of a job that has started running has already been transmitted to the local node, the loads of the Map tasks are roughly the same, and the data locality requirements were already taken into account in the initial assignment of the jobs, so swapping jobs between nodes brings no performance gain.

Compared with LATE, CREST introduces fine-grained resource information on inter-node bandwidth and node processing capability, and designs a combinatorial optimization mechanism to meet the data locality requirements of predictive execution tasks. For an edge $(u, v)$ of the above graph, let $T_u$ denote the Map task running on node $u$, $d(T_u)$ the data processed by $T_u$, $|d(T_u)|$ the size of $d(T_u)$, $PR_v$ the progress rate (ProgressRate) at which node $v$ completes this type of Map task, and $bw(u, v)$ the bandwidth of the network connection between nodes $u$ and $v$.

Estimated completion time of rescheduled tasks: if the Map task executing on a given node $u$ is rescheduled for execution on node $v$, its estimated completion time $t'_{etf}(u, v)$ is defined as follows:

$$t'_{etf}(u, v) = t_c + t_{data\_movement} + \frac{1}{PR_v}$$

The data transmission time $t_{data\_movement}$ is given by:

$$t_{data\_movement} = \frac{|d(T_u)|}{bw(u, v)}$$

where $t_c$ represents the current time. Note that $d(T_u)$ can also be transmitted to $v$ from nodes other than $u$. We use the copy (replica) optimization selection strategy in the algorithm implementation to speed up the transmission.
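As a worked illustration, assume the estimated completion time is the sum of the current time, the transfer time $|d(T_u)|/bw(u, v)$, and the from-scratch execution time $1/PR_v$, with $PR_v$ measured as the fraction of a task completed per second. All numbers below are hypothetical.

```latex
\begin{aligned}
t_c &= 60\ \mathrm{s}, \quad |d(T_u)| = 128\ \mathrm{MB}, \quad
bw(u, v) = 16\ \mathrm{MB/s}, \quad PR_v = 0.025\ \mathrm{s^{-1}} \\
t_{data\_movement} &= \frac{128}{16} = 8\ \mathrm{s} \\
t'_{etf}(u, v) &= 60 + 8 + \frac{1}{0.025} = 60 + 8 + 40 = 108\ \mathrm{s}
\end{aligned}
```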

Slow task combined re-execution scheme: let $s$ denote a node running a slow task, $f$ an idle node, and $PATH(s, f)$ a path from $s$ to $f$. Migrating, in cascade along $PATH(s, f)$, the task on the starting node of every edge to the terminating node of that edge for execution is defined as a combined re-execution scheme for the slow task along $PATH(s, f)$. Estimated completion time of slow task combined re-execution: for the combined re-execution of a slow task along a given path, the estimated completion time is defined as the maximum estimated completion time over all migrated jobs, as follows:

$$t_{spec}(PATH(s, f)) = \max_{(u, v) \in PATH(s, f)} t'_{etf}(u, v)$$

Here, $PATH(s, f)$ is a path from $s$ to $f$, and $t'_{etf}(u, v)$ is the estimated completion time of the rescheduled task. The formula means that the estimated completion time of the slow task combined re-execution is the maximum estimated completion time of the rescheduled tasks over all directed edges in the path from $s$ to $f$, that is, the maximum estimated completion time of all migrated jobs.

Optimal slow task combined re-execution scheme: let $s$ denote a node running a slow task, $f$ an idle node, and $PATH(s, f)$ a path from $s$ to $f$. The optimal slow task combined re-execution scheme is defined as the combined re-execution scheme with the smallest estimated completion time and is denoted CRES, as follows:

$$t_{spec}(CRES) = \min_{\forall\, PATH(s, f)} t_{spec}(PATH(s, f))$$

Let $t_{cres}$ be the estimated completion time of the slow task predictive execution obtained along the optimal combined re-execution scheme:

$$t_{cres} = \min_{\forall\, PATH(s, f)} t_{spec}(PATH(s, f))$$

Substituting the estimated completion time of the slow task combined re-execution then gives the objective equation of the combined re-execution optimization:

$$t_{cres} = \min_{\forall\, PATH(s, f)} \; \max_{(u, v) \in PATH(s, f)} t'_{etf}(u, v)$$

Here, the minimum is taken over all paths connecting $s$ and $f$.
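For a small cluster, the minimum over all paths connecting s and f can be checked by brute force. The sketch below enumerates every loop-free path in a directed complete graph and compares the best combined scheme with re-executing the slow task directly on the idle node, which is assumed here to correspond to the single edge (s, f). Function names and edge weights are hypothetical.

```python
from itertools import permutations


def all_loop_free_paths(nodes, s, f):
    """Yield every loop-free path from s to f in a directed complete graph."""
    middle = [n for n in nodes if n not in (s, f)]
    for k in range(len(middle) + 1):
        for inner in permutations(middle, k):
            yield [s, *inner, f]


def t_spec(path, etf):
    """Maximum estimated completion time over the edges of the path."""
    return max(etf[(u, v)] for u, v in zip(path, path[1:]))


def best_scheme(nodes, etf, s, f):
    """Return (t_cres, path) for the path with the smallest t_spec."""
    candidates = ((t_spec(p, etf), p) for p in all_loop_free_paths(nodes, s, f))
    return min(candidates, key=lambda pair: pair[0])


# Hypothetical estimated completion times in seconds.
nodes = ["s", "a", "b", "f"]
etf = {
    ("s", "f"): 200.0, ("s", "a"): 120.0, ("s", "b"): 110.0,
    ("a", "f"): 150.0, ("b", "f"): 180.0,
    ("a", "b"): 130.0, ("b", "a"): 160.0,
}
best = best_scheme(nodes, etf, "s", "f")
direct = etf[("s", "f")]
gain = (direct - best[0]) / direct
print(best, direct, gain)  # (150.0, ['s', 'a', 'f']) 200.0 0.25
```

Enumeration grows factorially with the number of nodes, so it is useful only as a cross-check on small examples.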

A large number of experiments have shown that using the CREST technology can shorten the average execution time of predictive execution tasks by more than 50%, and in the best case this performance gain can reach 70%. In addition, when using the CREST technology there is a probability of more than 50% of obtaining a performance gain of more than 40%. As the number of data copies increases, the performance improvement from using the CREST technology increases further.

Claims

1. A combinatorial optimization scheduling method for predicting task execution time, comprising:

combinatorially optimizing a predictive execution task scheduling model on the basis of the CREST technology;

on the basis of the predictive execution task scheduling model, collecting fine-grained resource information on inter-node bandwidth and node processing capability;

obtaining an estimated completion time of a rescheduled task according to the fine-grained resource information on inter-node bandwidth and node processing capability;

designing a slow task combined re-execution scheme, and obtaining an estimated completion time of the slow task combined re-execution;

designing an optimal slow task combined re-execution scheme, and obtaining an objective equation of combined re-execution optimization according to the estimated completion time of the rescheduled task and the estimated completion time of the slow task combined re-execution; and

setting weights, and using the objective equation of the combined re-execution optimization to obtain an effectively shortened running time of the predictive execution task.

2. The method according to claim 1, wherein the CREST comprises: (i) a CREST combined optimized predictive execution task scheduling model and (ii) a CREST combined re-execution scheduling algorithm and its implementation, wherein the CREST combined optimized predictive execution task scheduling model is a directed complete graph.

3. The method according to claim 2, wherein combinatorially optimizing the predictive execution task scheduling model on the basis of the CREST technology comprises:

using a complete graph to represent the network topology and the task migration graph of the nodes, wherein each edge of the directed complete graph of the CREST combined optimized predictive execution task scheduling model represents a possible migration path along which the task running at the starting point of the edge is ended, migrated to the end point of the edge, and re-executed.

4. The method according to claim 3, wherein the fine-grained resource information on inter-node bandwidth and node processing capability comprises: the Map task running on a node, the data processed by the Map task, the size of the data processed by the Map task, the progress rate at which a node completes this type of Map task, and the bandwidth of the network connection between nodes.

5. The method according to claim 4, wherein obtaining the estimated completion time of the rescheduled task according to the fine-grained resource information on inter-node bandwidth and node processing capability comprises: obtaining the estimated completion time $t'_{etf}(u, v)$ of the rescheduled task from the following expression:

$$t'_{etf}(u, v) = t_c + t_{data\_movement} + \frac{1}{PR_v}$$

where $t_c$ is the current time, $t_{data\_movement}$ is the data transmission time, $(u, v)$ is an edge of the directed complete graph, $u$ and $v$ are two nodes, and $PR_v$ is the progress rate at which node $v$ completes this type of Map task.

6. The method according to claim 5, wherein designing the slow task combined re-execution scheme and obtaining the estimated completion time of the slow task combined re-execution comprises:

letting $s$ be a node running a slow task, $f$ be an idle node, and $PATH(s, f)$ be a path from $s$ to $f$, and defining the migration, in cascade along $PATH(s, f)$, of the task on the starting node of every edge to the terminating node of that edge for execution as a slow task combined re-execution scheme along $PATH(s, f)$;

wherein the estimated completion time of the slow task combined re-execution along a given path is defined as the maximum estimated completion time of all migrated jobs, expressed as:

$$t_{spec}(PATH(s, f)) = \max_{(u, v) \in PATH(s, f)} t'_{etf}(u, v)$$

7. The method according to claim 6, wherein designing the optimal slow task combined re-execution scheme and obtaining the objective equation of the combined re-execution optimization according to the estimated completion time of the rescheduled task and the estimated completion time of the slow task combined re-execution comprises:

defining the optimal slow task combined re-execution scheme as the combined re-execution scheme with the smallest estimated completion time, denoted CRES, so that the estimated completion time of the optimal slow task combined re-execution is:

$$t_{spec}(CRES) = \min_{\forall\, PATH(s, f)} t_{spec}(PATH(s, f))$$

the estimated completion time of the slow task predictive execution obtained along the optimal combined re-execution scheme then being:

$$t_{cres} = \min_{\forall\, PATH(s, f)} t_{spec}(PATH(s, f))$$

and, by substituting the estimated completion time of the rescheduled task, obtaining the objective equation of the combined re-execution optimization as:

$$t_{cres} = \min_{\forall\, PATH(s, f)} \; \max_{(u, v) \in PATH(s, f)} t'_{etf}(u, v)$$

where the minimum is taken over all paths connecting $s$ and $f$.

8. The method according to claim 7, wherein setting weights and using the objective equation of the combined re-execution optimization to obtain the effectively shortened running time of the predictive execution task comprises:

setting the weight of each directed edge $(u, v)$ in the predictive execution task scheduling model graph to $t'_{etf}(u, v)$, and obtaining the effectively shortened running time of the predictive execution task according to the objective equation of the combined re-execution optimization.

9. The method according to claim 8, wherein the weight $t'_{etf}(u, v)$ is greater than zero.

10. The method according to claim 5, wherein a copy optimization selection strategy is used to accelerate data transmission, and the data transmission time is expressed as:

$$t_{data\_movement} = \frac{|d(T_u)|}{bw(u, v)}$$

where $d(T_u)$ can also be transmitted to $v$ from a node other than $u$; $T_u$ is the Map task running on node $u$, $d(T_u)$ is the data processed by $T_u$, $|d(T_u)|$ is the size of $d(T_u)$, and $bw(u, v)$ is the bandwidth of the network connection between nodes $u$ and $v$.