CN114116173A

CN114116173A - Method, device and system for dynamically adjusting task allocation

Info

Publication number: CN114116173A
Application number: CN202111455195.5A
Authority: CN
Inventors: 姚思雨; 刘涛; 贺思远
Original assignee: Beijing Jingdong Zhenshi Information Technology Co Ltd
Current assignee: Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority date: 2021-12-01
Filing date: 2021-12-01
Publication date: 2022-03-01

Abstract

The invention discloses a method, a device and a system for dynamically adjusting task allocation, and relates to the technical field of computers. One embodiment of the method comprises: for each host included in a cluster, periodically acquiring the performance parameters of the host; judging, according to a preset rule and the performance parameters, whether the host is a host to be scheduled; and, if the host is a host to be scheduled, determining reassigned tasks from among the tasks being executed by the host, terminating the reassigned tasks, and adding them to a task queue so that they are allocated again. This embodiment allocates cluster resources according to actual task requirements, so that the utilization of resources such as the CPU, memory and disk is balanced across the hosts of the whole cluster, the streaming data tasks in the cluster are distributed more evenly and run stably, and the utilization of the cluster's CPU resources is improved.

Description

Method, device and system for dynamically adjusting task allocation

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, and a system for dynamically adjusting task allocation.

Background

Since the advent of Apache Hadoop, distributed clusters built from multiple separate Worker hosts have become the mainstream architecture for big data processing. In such an architecture, one Master node is generally responsible for scheduling the computing tasks, and a plurality of Worker hosts are responsible for executing the actual tasks.

At present, the mainstream task scheduling scheme in such a cluster uses a fixed slot as the basic unit of task processing resources. Taking the distributed stream-processing engine Apache Flink as an example: when the task management module starts, a fixed number of slots is set according to the performance of the Worker node; each slot can start one task, and each slot in the cluster holds a fixed-size portion of the memory resources. After receiving a task to be deployed from the task manager, each slot establishes a connection with its upstream, receives data, and processes the data.

Most task scheduling schemes in existing clusters are slot-based static scheduling schemes: the number of slots in the cluster and the resources each slot holds are fixed when the cluster starts and do not change afterwards. A task in the cluster is scheduled only once, when it is initialized; after it has been allocated to a slot on a Worker node, it is not allocated a second time unless its execution fails. However, when an actual cluster runs, different tasks place different demands on computing resources such as CPU and memory. To meet the running demands of most tasks under a static scheduling scheme, the resources of each slot must be set according to the maximum resource demand of all tasks over the whole task life cycle, which wastes a great deal of computing resources.

Disclosure of Invention

In view of this, embodiments of the present invention provide a method, an apparatus, and a system for dynamically adjusting task allocation, which can dynamically schedule a task that is being executed during the task execution stage. Cluster resources are thereby allocated according to the actual requirements of the tasks, so that the utilization of resources such as the CPU, memory, and disk is balanced across the hosts of the whole cluster, the streaming data tasks in the cluster are distributed more evenly and run stably, and the utilization of the cluster's CPU resources is improved.

To achieve the above object, according to an aspect of an embodiment of the present invention, a method for dynamically adjusting task allocation is provided.

A method of dynamically adjusting task allocation, comprising: for each host included in a cluster, periodically acquiring the performance parameters of the host; judging, according to a preset rule and the performance parameters, whether the host is a host to be scheduled; and, if the host is a host to be scheduled, determining a reassigned task from among the tasks being executed by the host, terminating the reassigned task, and adding the reassigned task to a task queue so that task allocation is performed again.

Optionally, the task allocation comprises: acquiring a task to be allocated from the task queue, selecting a host for the task to be allocated according to the task information of the task to be allocated, and allocating the task to be allocated to the host.

Optionally, selecting a host for the task to be allocated according to the task information of the task to be allocated includes: judging, according to the task information, whether the task to be allocated is a CPU reallocation task; if so, sorting the hosts included in the cluster according to the amount of available CPU resources; otherwise, sorting the hosts included in the cluster according to the proportion of available processes; and selecting the host from the sorted hosts.

Optionally, selecting the host from the sorted hosts includes: traversing the sorted hosts in order, and selecting the host according to the number of available processes, the available disk space, the CPU load and the remaining memory of each host.

Optionally, selecting the host according to the number of available processes, the available disk space, the CPU load, and the remaining memory of each host includes: for each host, judging whether the number of available processes of the host is greater than the number of processes required by the task to be allocated; if so, judging whether the task to be allocated is a disk-consuming task; if it is a disk-consuming task, judging whether the CPU load of the host is less than a second threshold only when the available disk space of the host is greater than a first threshold; otherwise, directly judging whether the CPU load of the host is less than the second threshold; if the CPU load is less than the second threshold, taking the host as the selected host when the remaining memory of the host is greater than the memory required by the task to be allocated; otherwise, adding the host to a list of hosts to be scheduled; and if no host has been selected after the traversal is completed, taking the host with the lowest CPU load in the list of hosts to be scheduled as the selected host.

Optionally, the performance parameters include CPU load and available disk space; judging whether the host is a host to be scheduled according to a preset rule and the performance parameters comprises: judging whether the host is a host to be scheduled according to the CPU load and the available disk space of the host, wherein if the CPU load of the host is greater than a third threshold for a duration that reaches a first time threshold, the host is taken as a host to be scheduled; and if the available disk space of the host is less than a fourth threshold for a duration that reaches a second time threshold, the host is taken as a host to be scheduled.
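The rule above (a threshold that must be violated continuously for a minimum duration) can be sketched as follows. This is only an illustrative sketch: the `Sample` record, the function names, the state labels, and the choice to test CPU load before disk space when both conditions hold are assumptions, not details given in the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Sample:
    """One periodic measurement of a host (hypothetical record layout)."""
    timestamp: float     # seconds
    cpu_load: float      # fraction in [0, 1]
    free_disk_gb: float  # available disk space


def sustained(samples: List[Sample],
              violates: Callable[[Sample], bool],
              min_duration: float) -> bool:
    """True if the newest samples form an unbroken run of violations
    that spans at least min_duration seconds."""
    run_start = None
    for s in samples:  # samples are ordered oldest to newest
        if violates(s):
            if run_start is None:
                run_start = s.timestamp
        else:
            run_start = None  # the run is broken; start over
    return run_start is not None and samples[-1].timestamp - run_start >= min_duration


def host_state(samples: List[Sample],
               third_threshold: float, first_time_threshold: float,
               fourth_threshold: float, second_time_threshold: float) -> Optional[str]:
    """Return a state label if the host is a host to be scheduled, else None.
    Assumption: the CPU rule is tested first, following the order in the text."""
    if sustained(samples, lambda s: s.cpu_load > third_threshold, first_time_threshold):
        return "high_cpu_load"
    if sustained(samples, lambda s: s.free_disk_gb < fourth_threshold, second_time_threshold):
        return "high_disk_load"
    return None
```

A single sample that dips back under the threshold resets the timer, so short spikes do not trigger rescheduling under this reading of "duration".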

Optionally, the state type of the host to be scheduled includes high disk load and high CPU load; determining a reassigned task from among the tasks being executed by the host comprises: judging whether the state type of the host is high disk load; if so, determining the task with the largest disk usage among the tasks being executed by the host as the reassigned task; otherwise, determining the reassigned task according to the CPU utilization of the tasks being executed by the host.

Optionally, determining the reassigned task according to the CPU utilization of the tasks being executed by the host includes: acquiring the task with the highest CPU utilization, and judging whether that CPU utilization exceeds a fifth threshold; if so, determining every task being executed by the host other than the task with the highest CPU utilization as a reassigned task, and adding a CPU reallocation task identifier to each such task; otherwise, traversing the tasks being executed by the host in order and performing the following operation on each task until the CPU utilization of the host is less than the fifth threshold: if the task has not been scheduled in the most recent period, determining the task as a reassigned task and adding a CPU reallocation task identifier to it.
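One possible reading of the selection rules in the two preceding paragraphs is sketched below. The `RunningTask` fields and the function name are hypothetical. In particular, the sketch assumes that removing a task lowers the host's CPU utilization by exactly that task's utilization, which the text does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunningTask:
    """A task currently executing on a host (hypothetical record layout)."""
    name: str
    cpu_util: float                   # share of host CPU used by this task
    disk_usage_gb: float
    recently_scheduled: bool = False  # scheduled within the most recent period
    cpu_reassigned: bool = False      # the "CPU reallocation task identifier"


def pick_reassigned(tasks: List[RunningTask], state: str,
                    host_cpu_util: float, fifth_threshold: float) -> List[RunningTask]:
    """Choose which tasks to terminate and put back on the task queue."""
    if not tasks:
        return []
    if state == "high_disk_load":
        # Disk pressure: move only the task with the largest disk usage.
        return [max(tasks, key=lambda t: t.disk_usage_gb)]

    top = max(tasks, key=lambda t: t.cpu_util)
    picked: List[RunningTask] = []
    if top.cpu_util > fifth_threshold:
        # One task dominates the CPU: keep it here and move all the others.
        picked = [t for t in tasks if t is not top]
    else:
        remaining = host_cpu_util
        for t in tasks:
            if remaining < fifth_threshold:
                break
            if not t.recently_scheduled:
                picked.append(t)
                remaining -= t.cpu_util  # assumption: utilization is additive
    for t in picked:
        t.cpu_reassigned = True  # mark so the allocator sorts hosts by CPU
    return picked
```

Skipping recently scheduled tasks keeps a task from bouncing between hosts on consecutive scheduling rounds.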

According to another aspect of the embodiments of the present invention, an apparatus for dynamically adjusting task allocation is provided.

An apparatus for dynamically adjusting task allocation, comprising: a host parameter acquisition module, configured to periodically acquire, for each host included in a cluster, the performance parameters of the host; a host state judging module, configured to judge, according to a preset rule and the performance parameters, whether the host is a host to be scheduled; and a task allocation adjusting module, configured to, if the host is a host to be scheduled, determine a reassigned task from among the tasks being executed by the host, terminate the reassigned task, and add the reassigned task to a task queue so that task allocation is performed again.

Optionally, the task allocation adjusting module is further configured to, when performing task allocation: acquire a task to be allocated from the task queue, select a host for the task to be allocated according to the task information of the task to be allocated, and allocate the task to be allocated to the host.

Optionally, the task allocation adjusting module is further configured to: judge, according to the task information, whether the task to be allocated is a CPU reallocation task; if so, sort the hosts included in the cluster according to the amount of available CPU resources; otherwise, sort the hosts included in the cluster according to the proportion of available processes; and select the host from the sorted hosts.

Optionally, when selecting a host from the sorted hosts, the task allocation adjusting module is further configured to: traverse the sorted hosts in order, and select the host according to the number of available processes, the available disk space, the CPU load and the remaining memory of each host.

Optionally, when selecting a host from the sorted hosts, the task allocation adjusting module is further configured to: for each host, judge whether the number of available processes of the host is greater than the number of processes required by the task to be allocated; if so, judge whether the task to be allocated is a disk-consuming task; if it is a disk-consuming task, judge whether the CPU load of the host is less than a second threshold only when the available disk space of the host is greater than a first threshold; otherwise, directly judge whether the CPU load of the host is less than the second threshold; if the CPU load is less than the second threshold, take the host as the selected host when the remaining memory of the host is greater than the memory required by the task to be allocated; otherwise, add the host to a list of hosts to be scheduled; and if no host has been selected after the traversal is completed, take the host with the lowest CPU load in the list of hosts to be scheduled as the selected host.

Optionally, the performance parameters include CPU load and available disk space; the host state judging module is further configured to: judge whether the host is a host to be scheduled according to the CPU load and the available disk space of the host, wherein if the CPU load of the host is greater than a third threshold for a duration that reaches a first time threshold, the host is taken as a host to be scheduled; and if the available disk space of the host is less than a fourth threshold for a duration that reaches a second time threshold, the host is taken as a host to be scheduled.

Optionally, the state type of the host to be scheduled includes high disk load and high CPU load; the task allocation adjusting module is further configured to: judge whether the state type of the host is high disk load; if so, determine the task with the largest disk usage among the tasks being executed by the host as the reassigned task; otherwise, determine the reassigned task according to the CPU utilization of the tasks being executed by the host.

Optionally, the task allocation adjusting module is further configured to: acquire the task with the highest CPU utilization, and judge whether that CPU utilization exceeds a fifth threshold; if so, determine every task being executed by the host other than the task with the highest CPU utilization as a reassigned task, and add a CPU reallocation task identifier to each such task; otherwise, traverse the tasks being executed by the host in order and perform the following operation on each task until the CPU utilization of the host is less than the fifth threshold: if the task has not been scheduled in the most recent period, determine the task as a reassigned task and add a CPU reallocation task identifier to it.

According to yet another aspect of an embodiment of the present invention, a system for dynamically adjusting task allocation is provided.

A system for dynamically adjusting task allocation, comprising: a task scheduling node, configured to periodically acquire, for each host included in a cluster, the performance parameters of the host; judge, according to a preset rule and the performance parameters, whether the host is a host to be scheduled; and, if the host is a host to be scheduled, determine a reassigned task from among the tasks being executed by the host, terminate the reassigned task, and add the reassigned task to a task queue so that task allocation is performed again; and a host, configured to execute the allocated tasks.

Optionally, the host is further configured to: periodically acquire the current resource usage data of the node and report the data to the task scheduling node; and the task scheduling node is further configured to: determine the type of the host from the resource usage data.

Optionally, the host is further configured to: periodically calculate the resource occupation data of the tasks being executed and report the data to the task scheduling node; and the task scheduling node is further configured to: determine the type of each task from the resource occupation data.

According to another aspect of the embodiments of the present invention, an electronic device for dynamically adjusting task allocation is provided.

An electronic device that dynamically adjusts task allocation, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for dynamically adjusting task allocation provided by the embodiments of the invention.

According to yet another aspect of embodiments of the present invention, a computer-readable medium is provided.

A computer readable medium, on which a computer program is stored, which when executed by a processor implements the method for dynamically adjusting task allocation provided by embodiments of the present invention.

One embodiment of the above invention has the following advantages or benefits. For each host included in a cluster, the performance parameters of the host are acquired periodically; whether the host is a host to be scheduled is judged according to a preset rule and the performance parameters; and, if the host is a host to be scheduled, a reassigned task is determined from among the tasks being executed by the host, terminated, and added to a task queue so that task allocation is performed again. Because this task scheduling algorithm continues to monitor the running condition of the hosts in the cluster in real time during task execution and dynamically adjusts the allocation of tasks, cluster resources are allocated according to the actual requirements of the tasks. The utilization of resources such as the CPU, memory and disk is balanced across the hosts of the whole cluster, the streaming data tasks in the cluster are distributed more evenly, the cluster runs stably, and the utilization of the cluster's CPU resources is improved.

Further effects of the above non-conventional optional implementations are described below in connection with the specific embodiments.

Drawings

The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. In the drawings:

FIG. 1 is a schematic diagram of the main steps of a method for dynamically adjusting task allocation according to an embodiment of the present invention;

FIG. 2 is a system architecture diagram of a task distribution system of an embodiment of the present invention;

FIG. 3 is a flow chart illustrating the implementation of the task assignment phase according to an embodiment of the present invention;

FIG. 4 is a flowchart illustrating an implementation of searching for a host to be scheduled in a task execution phase according to an embodiment of the present invention;

FIG. 5 is a flow chart illustrating an implementation of task reassignment determination during a task execution phase according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of the main modules of an apparatus for dynamically adjusting task allocation according to an embodiment of the present invention;

FIG. 7 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;

FIG. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

In the prior art, most task scheduling schemes in a cluster are slot-based static scheduling schemes. The static nature of such a scheme shows mainly in two respects: the number of slots in the cluster and the resources each slot holds are fixed when the cluster starts and do not change afterwards; and a task in the cluster is scheduled only once, when it is initialized, so that after it has been allocated to a slot on a Worker node it is not allocated a second time unless its execution fails.

However, when an actual cluster runs, different tasks place different demands on computing resources such as the CPU (central processing unit), disk, and memory. To meet the running demands of most tasks under a static scheduling scheme, the resources of each slot must be set according to the maximum resource demand of all tasks over the whole task life cycle, which wastes a great deal of computing resources. For example, suppose there are 10 tasks in the cluster and each task runs for 10 minutes. Nine of the tasks need 1G of memory throughout their life cycle; the remaining task, task A, needs 4G of memory during 1 minute of its 10-minute life cycle and only 1G during the rest. To meet the requirements of task A, every slot of the cluster must be set to 4G of memory, so the cluster ends up wasting 10 × 4 - (1 × 4 + 9 × 1) = 27G of memory, even during the minute when task A is at its peak.
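The arithmetic of this example can be checked with a few lines of Python. The variable names are illustrative only; the off-peak figure is not stated in the text and is derived here from the same numbers.

```python
# Static slot sizing: every one of the 10 slots is sized for task A's 4G peak.
NUM_TASKS = 10
SLOT_MEMORY_GB = 4

reserved_gb = NUM_TASKS * SLOT_MEMORY_GB   # memory held by the slots

# Memory actually needed during the one minute when task A is at its peak:
# task A needs 4G, the other nine tasks need 1G each.
needed_at_peak_gb = 1 * 4 + 9 * 1

# Memory actually needed during the other nine minutes: every task needs 1G.
needed_off_peak_gb = NUM_TASKS * 1

wasted_at_peak_gb = reserved_gb - needed_at_peak_gb
wasted_off_peak_gb = reserved_gb - needed_off_peak_gb

print(f"reserved: {reserved_gb}G")
print(f"wasted at peak: {wasted_at_peak_gb}G, wasted off peak: {wasted_off_peak_gb}G")
```

The 27G figure in the text corresponds to `wasted_at_peak_gb`; for the remaining nine minutes the unused memory is larger still.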

In order to solve the technical problems in the prior art, the invention provides a method, a device and a system for dynamically adjusting task allocation. Tasks are scheduled dynamically: after allocation is completed, the running condition of the hosts in the cluster is monitored in real time and the allocation of tasks is adjusted dynamically, so that the host resources of the whole cluster reach a balanced state and no task is starved because it is never allocated.

Fig. 1 is a schematic diagram of main steps of a method for dynamically adjusting task allocation according to an embodiment of the present invention. As shown in fig. 1, the method for dynamically adjusting task allocation according to the embodiment of the present invention mainly includes the following steps S101 to S103.

Step S101: for each host included in the cluster, periodically acquiring the performance parameters of the host;

Step S102: judging, according to a preset rule and the performance parameters, whether the host is a host to be scheduled;

Step S103: if the host is a host to be scheduled, determining the reassigned tasks from among the tasks being executed by the host, terminating the reassigned tasks, and adding them to the task queue so that task allocation is performed again.

Steps S101 to S103 address the problems of conventional cluster task scheduling with a task scheduling algorithm that continues to monitor the running condition of the hosts in the cluster in real time during task execution and dynamically adjusts the allocation of tasks. Cluster resources are thus allocated according to actual task requirements, so that the utilization of resources such as the CPU, memory and disk is balanced across the hosts of the whole cluster.
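As a rough illustration, one monitoring round covering steps S101 to S103 might look like the sketch below. The function name and the four callbacks (`collect`, `needs_scheduling`, `choose_tasks`, `terminate`) are hypothetical placeholders for the preset rule and the cluster operations, which the text leaves open at this point.

```python
from collections import deque
from typing import Any, Callable, Deque, Iterable, List


def rebalance_once(hosts: Iterable[str],
                   collect: Callable[[str], Any],
                   needs_scheduling: Callable[[str, Any], bool],
                   choose_tasks: Callable[[str, Any], List[str]],
                   terminate: Callable[[str, str], None],
                   task_queue: Deque[str]) -> List[str]:
    """Run one monitoring round and return the tasks that were re-queued."""
    requeued: List[str] = []
    for host in hosts:
        metrics = collect(host)                   # S101: acquire performance parameters
        if not needs_scheduling(host, metrics):   # S102: apply the preset rule
            continue
        for task in choose_tasks(host, metrics):  # S103: determine reassigned tasks
            terminate(host, task)                 # end the task on this host
            task_queue.append(task)               # queue it for allocation again
            requeued.append(task)
    return requeued
```

A scheduler would call such a function on a timer; re-queued tasks then pass through the same allocation path as newly submitted tasks.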

To implement the method for dynamically adjusting task allocation, the invention provides a system for dynamically adjusting task allocation, which comprises a task scheduling node and a host. The task scheduling node is configured to periodically acquire, for each host included in a cluster, the performance parameters of the host; judge, according to a preset rule and the performance parameters, whether the host is a host to be scheduled; and, if the host is a host to be scheduled, determine a reassigned task from among the tasks being executed by the host, terminate the reassigned task, and add it to a task queue so that task allocation is performed again. The host is configured to execute the allocated tasks.

According to an embodiment of the present invention, in a specific application of the system for dynamically adjusting task allocation, the host may further be configured to periodically acquire the current resource usage data of the node and report the data to the task scheduling node, and the task scheduling node may further be configured to determine the type of the host from the resource usage data. In addition, the host may further be configured to periodically calculate the resource occupation data of the tasks being executed and report the data to the task scheduling node, and the task scheduling node may further be configured to determine the type of each task from the resource occupation data.

The system for dynamically adjusting task allocation according to the embodiments of the present invention is described below with reference to the accompanying drawings. FIG. 2 is a system architecture diagram of a task allocation system of an embodiment of the present invention. In this embodiment, the task scheduling node is embodied as a Master node, and the host is embodied as a Worker node. As shown in FIG. 2, the task allocation system according to the embodiment of the present invention mainly includes a Master node and a plurality of Worker nodes. Each Worker node can run on a separate host. The Master node mainly comprises a cluster resource management module, a cluster task scheduling module and a cluster task management module, and is mainly responsible for allocating and scheduling cluster tasks. The Worker node mainly comprises a node resource acquisition module, a task resource calculation module and a task execution module, and is responsible for executing tasks and for calculating and reporting resource conditions. According to the technical scheme of the invention, the main flow of dynamic task scheduling is as follows:

1. After the cluster is started, the node resource acquisition module of each Worker node periodically acquires the current resource usage data of the Worker node (CPU, memory, disk and the like) and reports the data to the cluster resource management module of the Master node. The cluster resource management module of the Master node stores the latest resource usage data of all Worker nodes of the cluster, and can divide the Worker nodes, according to their available resources, into a CPU type (many CPU cores), a memory type (large memory), a disk type (large disk) and a general type. A general-type Worker node is, for example, a host whose resource configuration is the most common one among the hosts included in the cluster. The CPU type, the memory type and the disk type are defined relative to the general type: for example, a CPU-type host has more CPU cores than a general-type host, and the memory type and the disk type are defined on the same principle;

2. The task resource calculation module of each Worker node periodically (for example, every 10 seconds; the interval can be set flexibly according to application requirements) calculates the resource occupation data of the tasks running on the node and reports the data to the cluster task management module of the Master node. The cluster task management module of the Master node maintains the latest resource occupation data of the tasks running on the Worker nodes, and divides the tasks, according to the resource occupation data, into a CPU-consuming type, a memory-consuming type, a disk-consuming type and a balanced type. Tasks can be divided into types according to preset thresholds; for example, a CPU-consuming task can be defined as a task whose CPU utilization on its executing host exceeds 10%;

3. After the Master node acquires a task from the task queue, the cluster task scheduling module of the Master node uses the first-stage (task allocation stage) task scheduling algorithm to select the most suitable Worker node for the task and allocates the task to that Worker node for execution;

4. The cluster task scheduling module of the Master node periodically (for example, every 3 minutes; the interval can be set flexibly as needed) uses the second-stage (task execution stage) task scheduling algorithm to find the set of Worker nodes that are short of resources and reallocates tasks of those Worker nodes, so that resource usage across the whole cluster reaches a relatively balanced state.
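The threshold-based division of tasks into types described in step 2 could be sketched as follows. Only the 10% CPU figure comes from the text; the memory and disk thresholds, the type labels, and the tie-breaking rule used when a task exceeds several thresholds at once are assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical thresholds, expressed as a fraction of the executing host's
# resources. Only the 10% CPU example appears in the text.
DEFAULT_THRESHOLDS: Dict[str, float] = {
    "cpu_consuming": 0.10,
    "memory_consuming": 0.10,
    "disk_consuming": 0.10,
}


def classify_task(cpu: float, memory: float, disk: float,
                  thresholds: Optional[Dict[str, float]] = None) -> str:
    """Classify a task from its reported resource occupation data."""
    limits = thresholds or DEFAULT_THRESHOLDS
    usage = {"cpu_consuming": cpu, "memory_consuming": memory, "disk_consuming": disk}
    # Ratio of usage to threshold, kept only for the thresholds that are exceeded.
    over = {kind: usage[kind] / limits[kind]
            for kind in usage if usage[kind] > limits[kind]}
    if not over:
        return "balanced"
    # Assumption: when several thresholds are exceeded, the resource that
    # exceeds its threshold by the largest factor decides the type.
    return max(over, key=over.get)
```

For instance, a task reported at 25% CPU, 2% memory and 1% disk would be labelled `cpu_consuming` under these assumed thresholds.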

When the task allocation system of the embodiment of the invention allocates tasks dynamically, its two-stage task scheduling algorithm (a task allocation stage and a task execution stage) allocates cluster resources according to the actual requirements of the tasks, so that the utilization of resources such as the CPU (central processing unit), memory and disk is balanced across the hosts of the whole cluster. In addition, the most suitable running node can be selected according to the task type (such as CPU-consuming, memory-consuming or disk-consuming), so that tasks run efficiently, cluster resource utilization is higher, and the cluster runs more stably.

According to one embodiment of the invention, task allocation is performed in the following manner both for a task that has never been allocated and for a task that was allocated earlier but has been determined to be a reassigned task. Specifically, a task to be allocated is acquired from the task queue, a host is selected for the task to be allocated according to its task information, and the task to be allocated is allocated to that host. Selecting a host for the task to be allocated according to its task information may specifically include the following steps:

judging, according to the task information, whether the task to be allocated is a CPU reallocation task;

if so, sorting the hosts included in the cluster according to the amount of available CPU resources; otherwise, sorting the hosts included in the cluster according to the proportion of available processes;

and selecting the host from the sorted hosts.
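The sorting step above can be sketched as a choice of sort key. The `Host` fields and the function name are hypothetical, and sorting in descending order (most available resources first) is an assumption, since the text does not state the sort direction.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    """Scheduler-side view of a Worker node (hypothetical record layout)."""
    name: str
    available_cpu: float   # e.g. idle CPU cores
    available_slots: int   # free processes (slots)
    total_slots: int       # assumed to be greater than zero


def sort_hosts(hosts: List[Host], is_cpu_reallocation: bool) -> List[Host]:
    """Order candidate hosts for a task to be allocated."""
    if is_cpu_reallocation:
        # Task was moved off an overloaded CPU: prefer the most idle CPU.
        key = lambda h: h.available_cpu
    else:
        # Otherwise prefer the host with the largest share of free slots.
        key = lambda h: h.available_slots / h.total_slots
    # Assumption: descending order, so the best candidate is visited first.
    return sorted(hosts, key=key, reverse=True)
```

Using a ratio rather than an absolute slot count keeps small hosts from being ranked last merely because they have fewer slots in total.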

In another embodiment of the present invention, when selecting the hosts from the sorted hosts, the hosts may be selected according to the number of available processes, the available disk space, the CPU load, and the remaining memory of each host by traversing the sorted hosts in sequence.

In another embodiment of the present invention, selecting the host according to the number of available processes, the available disk space, the CPU load, and the remaining memory of each host may specifically include the following steps:

for each host, judging whether the number of available processes of the host is greater than the number of processes required by the task to be allocated;

if so, judging whether the task to be allocated is a disk-consuming task;

if it is a disk-consuming task, judging whether the CPU load of the host is less than the second threshold only when the available disk space of the host is greater than the first threshold; otherwise, directly judging whether the CPU load of the host is less than the second threshold;

if the CPU load is less than the second threshold, taking the host as the selected host when the remaining memory of the host is greater than the memory required by the task to be allocated; otherwise, adding the host to a list of hosts to be scheduled;

and if no host has been selected after the traversal is completed, taking the host with the lowest CPU load in the list of hosts to be scheduled as the selected host.

In the above embodiment, both the first threshold and the second threshold may be flexibly set according to application requirements, which has no influence on implementation of the technical solution of the present invention.
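The traversal described above can be read in more than one way; the sketch below shows one reading. The record types and names are hypothetical. It assumes that a host whose CPU load is not below the second threshold goes into the fallback list of hosts to be scheduled, while a host that fails the process-count, disk-space, or memory check is simply skipped.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HostStatus:
    name: str
    free_slots: int        # available processes
    free_disk_gb: float
    cpu_load: float        # fraction in [0, 1]
    free_memory_gb: float


@dataclass
class PendingTask:
    slots_needed: int
    memory_gb_needed: float
    disk_heavy: bool       # True for a disk-consuming task


def select_host(sorted_hosts: List[HostStatus], task: PendingTask,
                first_threshold_gb: float,
                second_threshold: float) -> Optional[HostStatus]:
    """Walk the pre-sorted hosts and return the first suitable one."""
    fallback: List[HostStatus] = []  # the "list of hosts to be scheduled"
    for host in sorted_hosts:
        # The text requires strictly more free processes than the task needs.
        if host.free_slots <= task.slots_needed:
            continue
        # The disk check applies only to disk-consuming tasks.
        if task.disk_heavy and host.free_disk_gb <= first_threshold_gb:
            continue
        if host.cpu_load < second_threshold:
            if host.free_memory_gb > task.memory_gb_needed:
                return host
            continue  # assumption: too little memory means skip the host
        fallback.append(host)  # assumption: busy CPU, kept as a last resort
    # No ideal host found: take the least loaded of the busy candidates.
    return min(fallback, key=lambda h: h.cpu_load) if fallback else None
```

Returning `None` when the fallback list is also empty is a choice made in this sketch; the text does not say what happens in that case.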

The following describes an implementation flow of the task allocation phase according to an embodiment of the present invention with reference to fig. 3. Fig. 3 is a schematic flow chart of the implementation of the task allocation phase according to the embodiment of the present invention. As shown in fig. 3, after a task to be allocated is acquired from the task queue, the task allocation is performed according to the following process:

1. acquiring resource use data corresponding to hosts of the cluster according to the cluster where the task is located, wherein each host is a Worker node;

2. sorting the hosts according to the task information. Different task information leads to a different sort order. Specifically:

(1) if the task information contains a CPU reallocation task identifier (namely, the task is a CPU reallocation task), the task is one that was reallocated by the cluster scheduling mechanism for dynamically adjusting task allocation because the CPU load of its host was too high; in this case, the hosts are sorted according to the size of their available CPU resources;

(2) if the task information does not contain the CPU reallocation task identifier (namely, the task is not a CPU reallocation task), the task is a task submitted for the first time, a task automatically retried after failing for other reasons, or a disk consumption type task with large disk occupation (the task information indicates this); in this case, the hosts are sorted according to the proportion of available processes. The proportion of available processes is the proportion of available slots: each slot can be understood as a process, and each slot can execute a task, but one task may need several slots, that is, a task requires one or more processes to execute;

3. traversing the sorted hosts;

4. taking out the next host and executing step 5; if no host remains, all hosts have been traversed without finding one that meets the requirements, and step 12 is executed;

5. judging whether the available process number of the host is larger than the process number required by the task, and if so, executing the step 6; otherwise, executing step 4;

6. judging whether the task is a disk consumption type task, if so, executing a step 7, otherwise, executing a step 8;

7. checking whether the available disk space of the host is greater than a first threshold. Normally, when the available disk space of a host falls below a certain threshold (e.g., 20%), no further task should be allocated to it; otherwise the operation of the host will be affected. If yes, executing step 8; otherwise executing step 4;

8. checking whether the CPU load of the host is less than a second threshold, where the second threshold can be expressed as: number of cores × f₁, and f₁ is a configurable coefficient. Usually, tasks should not be allocated to a host once its CPU load reaches a certain threshold (e.g., 0.6 × the number of cores); otherwise the operation of the host will be affected. If yes, executing step 9; otherwise executing step 11;

9. checking whether the residual memory of the host is larger than the memory required by the task, if so, executing the step 10, otherwise, executing the step 4;

10. distributing the tasks and ending;

11. adding the host into a host list to be scheduled, and then executing the step 4;

12. sorting the host list to be scheduled and selecting the host with the lowest load from it for task allocation, which prevents tasks from starving because they are never allocated while the cluster is under heavy pressure.

According to the embodiment shown in fig. 3, a more appropriate host can be selected for each type of task, so that tasks run efficiently while the cluster resource utilization rate is higher and the cluster running state is more stable. In addition, because the memory of a cluster host cannot be partitioned in advance, checking the remaining memory of the host allows the memory allocated to a task to be adjusted dynamically between the minimum and maximum memory requirements configured for the process, which improves the utilization rate of cluster memory resources.

According to the technical scheme of the invention, after the task allocation is completed, in the task execution process, the host set to be scheduled still needs to be searched from the cluster at regular time, the reallocation task is determined from the tasks being executed by each host to be scheduled, the reallocation task is finished, and the reallocation task is added into the task queue to perform the task allocation again.

In one embodiment of the present invention, for each host included in a cluster, performance parameters of the host are periodically obtained, where the performance parameters include, for example, CPU load and available disk space of the host. And then, judging whether the host is a host to be scheduled or not according to a preset rule and the performance parameters. When determining whether the host is a host to be scheduled, the method may specifically be performed according to the following steps:

judging whether the host is a host to be scheduled or not according to the CPU load of the host and the available disk space, wherein if the CPU load of the host is greater than a third threshold and the duration time reaches a first time threshold, the host is taken as the host to be scheduled; and if the available disk space of the host is smaller than a fourth threshold and the duration time reaches a second time threshold, taking the host as a host to be scheduled.
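One way to evaluate "the duration reaches a time threshold" from periodic samples is to remember when each condition first became true. The sketch below is an illustration only: the class name `SustainedCondition`, the function `classify`, the state labels, the concrete thresholds, and testing the CPU condition before the disk condition are all assumptions.

```python
from typing import Dict, Optional


class SustainedCondition:
    """Tracks how long a per-host condition has held across periodic samples."""

    def __init__(self, min_duration_s: float) -> None:
        self.min_duration_s = min_duration_s
        self._since: Dict[str, float] = {}   # host -> time the condition first held

    def update(self, host: str, condition_holds: bool, now_s: float) -> bool:
        if not condition_holds:
            self._since.pop(host, None)      # condition broken: restart the clock
            return False
        start = self._since.setdefault(host, now_s)
        return now_s - start >= self.min_duration_s


def classify(host: str, cpu_load: float, free_disk: float, now_s: float,
             cpu_rule: SustainedCondition, disk_rule: SustainedCondition,
             cpu_threshold: float, disk_threshold: float) -> Optional[str]:
    # Update both trackers on every sample so neither duration is lost.
    cpu_hit = cpu_rule.update(host, cpu_load > cpu_threshold, now_s)
    disk_hit = disk_rule.update(host, free_disk < disk_threshold, now_s)
    if cpu_hit:
        return "cpu_high_load"
    if disk_hit:
        return "disk_high_load"
    return None   # not a host to be scheduled


cpu_rule = SustainedCondition(min_duration_s=180)    # first time threshold (assumed 3 min)
disk_rule = SustainedCondition(min_duration_s=180)   # second time threshold (assumed 3 min)
samples = [(0, 12.0), (60, 12.5), (180, 11.0), (240, 3.0), (300, 12.0)]
states = [classify("w1", load, 0.5, t, cpu_rule, disk_rule,
                   cpu_threshold=9.6, disk_threshold=0.2)
          for t, load in samples]
```

In this hypothetical trace the host is flagged only at the 180-second sample; the dip at 240 seconds resets the clock, so the spike at 300 seconds does not flag it again.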

According to one embodiment of the invention, the state types of the host to be scheduled comprise high disk load and high CPU load. When determining to reallocate tasks from among the tasks being executed by the host, the method may specifically include:

judging whether the state type of the host is a high disk load or not; the state type of the host refers to a specific corresponding overload state type when the current host is in an overload state;

if so, determining the task with the largest disk occupation amount which is executed by the host as a redistribution task;

otherwise, determining the reallocation task according to the CPU utilization rate of the tasks being executed by the host.

According to another embodiment of the present invention, determining the reallocation task according to the CPU utilization rate of the tasks being executed by the host may specifically include:

acquiring a task with the maximum CPU utilization rate, and judging whether the CPU utilization rate exceeds a fifth threshold value;

if yes, determining each task which is executed by the host except the task with the maximum CPU utilization rate as a reassigned task, and adding a CPU reassignment task identifier for the task;

otherwise, traversing each task being executed by the host in sequence, and executing the following operations on each task until the CPU utilization rate of the host is less than the fifth threshold: and if the task is not scheduled in the latest period, determining the task as a reassigned task, and adding a CPU reassigned task identifier for the task.
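The CPU-based choice above can be sketched as follows. Several points are assumptions made for illustration: CPU utilization is expressed in percent, tasks are visited in descending CPU utilization, the host utilization is estimated by subtracting each ended task's share (a real scheduler would re-measure it), and the "scheduled in the latest period" check is applied in both branches, as in the more detailed flow of fig. 5.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunningTask:
    name: str
    cpu_usage: float                  # percent of host CPU (assumed unit)
    recently_scheduled: bool = False  # already moved between hosts in the latest period
    cpu_reallocation: bool = False    # the CPU reallocation task identifier


def pick_cpu_reassignments(tasks: List[RunningTask], host_cpu_usage: float,
                           fifth_threshold: float) -> List[RunningTask]:
    ordered = sorted(tasks, key=lambda t: t.cpu_usage, reverse=True)
    if not ordered:
        return []
    picked: List[RunningTask] = []
    if ordered[0].cpu_usage > fifth_threshold:
        # The heaviest task would overload any host: keep it, move the others.
        for task in ordered[1:]:
            if not task.recently_scheduled:
                task.cpu_reallocation = True
                picked.append(task)
        return picked
    # No dominant task: move tasks one by one until the host has cooled down.
    for task in ordered:
        if task.recently_scheduled:
            continue
        task.cpu_reallocation = True
        picked.append(task)
        host_cpu_usage -= task.cpu_usage   # estimate only (assumption)
        if host_cpu_usage < fifth_threshold:
            break
    return picked


dominant = [RunningTask("A", 60), RunningTask("B", 20), RunningTask("C", 10, True)]
moved_1 = [t.name for t in pick_cpu_reassignments(dominant, 90, 50)]
balanced = [RunningTask("A", 40), RunningTask("B", 30), RunningTask("C", 20)]
moved_2 = [t.name for t in pick_cpu_reassignments(balanced, 95, 50)]
```

In the first hypothetical host only `B` is moved (`A` dominates and `C` was moved recently); in the second, `A` and `B` are moved before the estimated utilization drops below the threshold.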

The following describes the implementation flow of the task execution phase according to the embodiment of the present invention with reference to fig. 4 and 5. FIG. 4 is a flowchart illustrating an implementation of searching for a host to be scheduled in a task execution phase according to an embodiment of the present invention; fig. 5 is a flowchart illustrating an implementation of determining reassignment tasks during a task execution phase according to an embodiment of the present invention. And the cluster resource management module of the Master node performs load check at regular time and determines the redistribution task.

As shown in fig. 4, the main process for the cluster resource management module of the Master node to periodically search the host set to be scheduled from the cluster is as follows:

1. traversing hosts included in the cluster;

2. taking out a host computer, executing the step 3, and executing the step 7 if no host computer to be traversed exists;

3. checking whether the CPU load of the host is greater than a third threshold (the third threshold may be the same as or different from the second threshold) and the duration reaches the first time threshold (for example, 3 minutes), where the third threshold may be expressed as: number of cores × f₂, and f₂ is a configurable coefficient. If yes, executing step 5; otherwise executing step 4;

4. checking whether the available disk space of the host is smaller than a fourth threshold (the fourth threshold may be the same as or different from the first threshold), and the duration reaches a second time threshold (for example: 3 minutes), if so, executing step 6, otherwise, executing step 2;

5. adding the host to the host list to be scheduled and setting its state type to CPU high load, because the host is already in a CPU overload state;

6. adding the host to the host list to be scheduled and setting its state type to disk high load, because the host is already in a disk overload state;

7. ending the check.

As shown in fig. 5, after acquiring the host to be scheduled, the cluster task scheduling module of the Master node may analyze the task being executed by the host to be scheduled to determine the task that needs to be reallocated, and the main process is as follows:

1. traversing a host to be scheduled;

2. taking out a host to be scheduled, executing the step 3, and executing the step 14 if no host to be traversed to be scheduled exists;

3. judging whether the state type of the host to be scheduled is a high disk load, if so, executing a step 4, otherwise, executing a step 5;

4. finding the task with the largest disk occupation amount on the host to be scheduled, determining the task as a redistribution task, ending the task (disk space can only be reclaimed by ending the process), putting the task into a task queue, and then executing the step 2;

5. sequencing the tasks executed by the host to be scheduled according to the CPU utilization rate;

6. acquiring the task with the highest CPU utilization rate and judging whether its CPU utilization rate is greater than a fifth threshold (for example, 50%). It should be noted that if the CPU utilization rate of a task is greater than 50%, the load of whichever host executes it will be high, so this task is not killed and the other tasks are processed instead. If yes, executing step 7; otherwise executing step 10;

7. traversing all the remaining tasks except the task with the maximum CPU utilization rate, and if no task to be traversed exists, executing the step 2;

8. determining whether the task has been scheduled within the latest period (for example, the last 10 minutes), where scheduled means that the task was originally executed on a host A and has been reassigned to be executed on a host B. A task that has been scheduled within the latest period is not scheduled again, which avoids frequently rescheduling the same tasks, reducing execution efficiency, and wasting system resources. If yes, executing step 7; otherwise executing step 9;

9. determining the task as a redistribution task, adding a CPU redistribution task identifier for the task, ending the task, putting the task into a task queue, and then executing the step 7;

10. traversing all the tasks executed by the host to be scheduled in sequence, and executing the step 2 if no task to be traversed exists;

11. judging whether the taken task is scheduled in the latest time period, if so, executing the step 10, otherwise, executing the step 12;

12. determining the task as a reassigned task, adding a CPU reassigned task identifier for the task, ending the task, putting the task into a task queue, and then executing the step 13;

13. judging whether the current CPU utilization rate of the host to be scheduled is smaller than a fifth threshold, if so, executing a step 14, otherwise, executing a step 10;

14. ending the scheduling.
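Two pieces of the flow of fig. 5 are easy to sketch in isolation: the disk high load branch (step 4) and the "scheduled in the latest period" check (steps 8 and 11). The sketch below is hypothetical throughout: the field names, the 600-second window, recording the timestamp at the moment the task is ended, and removing the task from a list as a stand-in for ending its processes are all assumptions.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class RunningTask:
    name: str
    disk_used_gb: float
    last_rescheduled_s: Optional[float] = None   # None: never moved between hosts


def recently_scheduled(task: RunningTask, now_s: float, window_s: float = 600.0) -> bool:
    """True if the task was moved between hosts within the latest period."""
    return (task.last_rescheduled_s is not None
            and now_s - task.last_rescheduled_s < window_s)


def evict_largest_disk_task(tasks: List[RunningTask], queue: Deque[RunningTask],
                            now_s: float) -> Optional[RunningTask]:
    """Disk high load branch: end the largest disk consumer and requeue it."""
    if not tasks:
        return None
    victim = max(tasks, key=lambda t: t.disk_used_gb)
    tasks.remove(victim)                 # stands in for ending the task's processes
    victim.last_rescheduled_s = now_s    # remembered for the later cooldown check
    queue.append(victim)                 # the task waits for a new allocation
    return victim


tasks = [RunningTask("t1", 10), RunningTask("t2", 80), RunningTask("t3", 35)]
queue: Deque[RunningTask] = deque()
victim = evict_largest_disk_task(tasks, queue, now_s=1000.0)
```

Keeping a timestamp rather than a boolean lets the same record answer the cooldown question for any window length.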

According to the embodiments shown in fig. 4 and fig. 5, in the task execution stage, the running data of the cluster host can be dynamically acquired and the tasks can be dynamically scheduled, so that the streaming data tasks in the cluster are distributed more evenly and can run stably, and meanwhile, the utilization rate of the CPU resources of the cluster is improved.

In all the embodiments of the present invention, the threshold value can be flexibly set according to the application requirement, which does not affect the implementation of the technical solution of the present invention.

According to another aspect of the present invention, an apparatus for dynamically adjusting task allocation is provided. FIG. 6 is a schematic diagram of the main modules of an apparatus for dynamically adjusting task allocation according to an embodiment of the present invention. As shown in fig. 6, the apparatus 600 for dynamically adjusting task allocation according to the embodiment of the present invention mainly includes a host parameter obtaining module 601, a host status determining module 602, and a task allocation adjusting module 603.

A host parameter obtaining module 601, configured to obtain, at regular time, a performance parameter of each host included in the cluster;

a host status determining module 602, configured to determine whether the host is a host to be scheduled according to a preset rule and the performance parameter;

the task allocation adjusting module 603 is configured to, when the host is a host to be scheduled, determine a reallocation task from tasks being executed by the host, end the reallocation task, and add the reallocation task to a task queue to re-allocate the task.

According to an embodiment of the present invention, when performing task allocation, the task allocation adjusting module 603 may further be configured to: and acquiring the tasks to be distributed from the task queue, selecting a host for the tasks to be distributed according to the task information of the tasks to be distributed, and distributing the tasks to be distributed to the host.

According to an embodiment of the present invention, the task allocation adjustment module 603 may further be configured to:

and selecting the host from the sorted hosts.

According to another embodiment of the present invention, the task allocation adjustment module 603, when selecting a host from the sorted hosts, is further configured to:

and traversing the ordered hosts in sequence, and selecting the hosts according to the available process number, the available disk space, the CPU load and the residual memory of each host.

According to yet another embodiment of the invention, the performance parameters include CPU load and available disk space; the host status determination module 602 may also be configured to:

According to another embodiment of the invention, the status types of the host to be scheduled include disk high load and CPU high load; the task assignment adjustment module 603 may also be configured to:

judging whether the state type of the host is a high disk load or not;

According to yet another embodiment of the present invention, the task allocation adjustment module 603 may further be configured to:

According to the technical solution of the embodiment of the present invention, for each host included in the cluster, the performance parameters of the host are acquired periodically; whether the host is a host to be scheduled is judged according to a preset rule and the performance parameters; and, when the host is a host to be scheduled, a reassigned task is determined from the tasks being executed by the host, ended, and added to a task queue for task allocation again. This task scheduling algorithm continues to monitor the running condition of the hosts in the cluster in real time during task execution and dynamically adjusts task allocation. It thereby allocates cluster resources optimally according to the actual requirements of the tasks, keeps the utilization rates of resources such as the CPU, memory, and disk of the hosts across the cluster at a balanced level, distributes streaming data tasks in the cluster more evenly so that they run stably, and improves the utilization rate of cluster CPU resources.

Fig. 7 illustrates an exemplary system architecture 700 of a device for dynamically adjusting task assignments or a method for dynamically adjusting task assignments to which embodiments of the invention may be applied.

As shown in fig. 7, the system architecture 700 may include terminal devices 701, 702, 703, a network 704, and a server 705. The network 704 serves to provide a medium for communication links between the terminal devices 701, 702, 703 and the server 705. The network 704 may include various connection types, such as wired or wireless communication links, or fiber optic cables, to name a few.

A user may use the terminal devices 701, 702, 703 to interact with the server 705 over the network 704, to receive or send messages or the like. The terminal devices 701, 702, 703 may have installed thereon various communication client applications, such as data processing applications, search applications, streaming data processing tools, task scheduling tools, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).

The terminal devices 701, 702, 703 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.

The server 705 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 701, 702, 703. The background management server can process received data such as a task allocation request as follows: for each host included in the cluster, periodically acquiring the performance parameters of the host; judging whether the host is a host to be scheduled or not according to a preset rule and the performance parameters; and, under the condition that the host is the host to be scheduled, determining a reassigned task from the tasks being executed by the host, ending the reassigned task, and adding the reassigned task to a task queue for task allocation again. The server can then feed back a processing result (such as a task allocation result, by way of example only) to the terminal device.

It should be noted that the method for dynamically adjusting task allocation provided by the embodiment of the present invention is generally executed by the server 705, and accordingly, the device for dynamically adjusting task allocation is generally disposed in the server 705.

It should be understood that the number of terminal devices, networks, and servers in fig. 7 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Referring now to FIG. 8, shown is a block diagram of a computer system 800 suitable for use with a terminal device or server implementing an embodiment of the present invention. The terminal device or the server shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.

As shown in fig. 8, the computer system 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the system 800 are also stored. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to bus 804.

The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 810 as necessary, so that a computer program read out therefrom is installed into the storage section 808 as necessary.

In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. The computer program executes the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 801.

It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units or modules described in the embodiments of the present invention may be implemented by software, or may be implemented by hardware. The described units or modules may also be provided in a processor, and may be described as: a processor comprises a host parameter acquisition module, a host state judgment module and a task allocation adjustment module. The names of these units or modules do not in some cases constitute a limitation on the units or modules themselves, and for example, the host parameter obtaining module may also be described as a "module for periodically obtaining the performance parameters of each host included in the cluster".

As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments; or may be separate and not incorporated into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform: for each host included in the cluster, acquiring the performance parameters of the host at regular time; judging whether the host is a host to be scheduled or not according to a preset rule and the performance parameters; and under the condition that the host is the host to be scheduled, determining a reassigned task from the tasks being executed by the host, ending the reassigned task and adding the reassigned task to a task queue to perform task assignment again.

The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method for dynamically adjusting task allocation, comprising:

for each host included in the cluster, acquiring the performance parameters of the host at regular time;

judging whether the host is a host to be scheduled or not according to a preset rule and the performance parameters;

and under the condition that the host is the host to be scheduled, determining a reassigned task from the tasks executed by the host, finishing the reassigned task and adding the reassigned task to a task queue to perform task assignment again.

2. The method of claim 1, wherein performing task assignment comprises:

and acquiring the tasks to be distributed from the task queue, selecting a host for the tasks to be distributed according to the task information of the tasks to be distributed, and distributing the tasks to be distributed to the host.

3. The method of claim 2, wherein selecting a host for the task to be allocated according to the task information of the task to be allocated comprises:

and selecting the host from the sorted hosts.

4. The method of claim 3, wherein selecting the host from the sorted hosts comprises:

5. The method of claim 4, wherein selecting the hosts based on the number of processes available, the disk space available, the CPU load, and the remaining memory for each host comprises:

6. The method of claim 1, wherein the performance parameters include CPU load and available disk space;

judging whether the host is a host to be scheduled according to a preset rule and the performance parameter comprises the following steps:

7. The method according to claim 1 or 6, wherein the status types of the hosts to be scheduled comprise disk high load and CPU high load;

determining to reassign tasks from among the tasks being performed by the host comprises:

judging whether the state type of the host is a high disk load or not;

8. The method of claim 7, wherein determining reassignment tasks based on CPU usage of tasks being executed by the host comprises:

9. An apparatus for dynamically adjusting task allocation, comprising:

a host parameter acquisition module, used for acquiring, at regular time, the performance parameters of each host included in a cluster;

the host state judging module is used for judging whether the host is a host to be scheduled or not according to a preset rule and the performance parameters;

and the task allocation adjusting module is used for determining the re-allocation tasks from the tasks executed by the host under the condition that the host is the host to be scheduled, finishing the re-allocation tasks and adding the re-allocation tasks into the task queue to re-allocate the tasks.

10. A system for dynamically adjusting task allocation, comprising:

the task scheduling node is used for acquiring the performance parameters of each host in the cluster at regular time; judging whether the host is a host to be scheduled or not according to a preset rule and the performance parameters; under the condition that the host is a host to be scheduled, determining a reassigned task from tasks being executed by the host, finishing the reassigned task and adding the reassigned task to a task queue to perform task assignment again;

and the host is used for executing the distributed tasks.

11. The system of claim 10,

the host is further configured to: acquiring current resource use data of the node at regular time and reporting the data to the task scheduling node;

the task scheduling node is further configured to: determining a type of the host from the resource usage data.

12. The system of claim 10,

the host is further configured to: calculating the resource occupation data of the executing task at regular time and reporting the data to the task scheduling node;

the task scheduling node is further configured to: and determining the type of each task according to the resource occupation data.

13. An electronic device that dynamically adjusts task allocation, comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.

14. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-8.