CN108427602B - Distributed computing task cooperative scheduling method and device - Google Patents

Publication number
CN108427602B
Authority
CN
China
Prior art keywords
task
tasks
resource
completion time
population
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710078384.2A
Other languages
Chinese (zh)
Other versions
CN108427602A (en)
Inventor
朱力鹏
胡斌
饶玮
黄太贵
李端超
王松
靳丹
马志程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Gansu Electric Power Co Ltd
State Grid Anhui Electric Power Co Ltd
Global Energy Interconnection Research Institute
Original Assignee
State Grid Corp of China SGCC
State Grid Gansu Electric Power Co Ltd
State Grid Anhui Electric Power Co Ltd
Global Energy Interconnection Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Gansu Electric Power Co Ltd, State Grid Anhui Electric Power Co Ltd, Global Energy Interconnection Research Institute
Priority to CN201710078384.2A
Publication of CN108427602A
Application granted
Publication of CN108427602B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5021 Priority

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a distributed computing task cooperative scheduling method and device. The method comprises the following steps: determining the expected completion time of each task on each resource and establishing an expected completion time matrix; determining the number of tasks to be processed by each resource using a gene expression programming algorithm; determining the evaluation value of each task according to its urgency and importance, and sorting the tasks by evaluation value from largest to smallest to obtain a task sequence; and allocating the tasks in the task sequence to the resources in order using a Min-Min algorithm, according to the expected completion time matrix and the number of tasks to be processed by each resource. The technical solution provides an effective task scheduling method that makes full use of available resources to complete the submitted tasks in the shortest time.

Description

Distributed computing task cooperative scheduling method and device
Technical Field
The invention relates to the field of distributed computing software, in particular to a distributed computing task cooperative scheduling method and device.
Background
Cooperative scheduling is an important technique for resource allocation in a distributed computing environment. It allocates multiple tasks submitted by users to multiple resources for simultaneous processing so as to meet specific performance requirements. Efficient processing of user tasks reflects the current trend toward integrated technology development and intelligent applications. In a distributed system, task scheduling, the balance of computing capacity across resources, and the efficiency of computing nodes are the main indicators of an algorithm's quality. How to share resources and solve problems collaboratively among multiple dynamically changing virtual organizations is currently a major problem in cooperative task scheduling. Cooperative scheduling can improve task performance in a distributed computing environment and is widely applied in fields such as virtual reality, virtual instruments, and large-scale scientific computing.
Current research on task scheduling strategies, in China and abroad, falls mainly into two types: application-level task scheduling and job-level task scheduling. Application-level task scheduling evolved from the scheduling problem based on task graphs (directed acyclic graphs, DAGs) in traditional resource environments. It improves application performance by abstracting compute-intensive applications into coarse-grained constrained task graphs and mapping them onto network computing resources using economic models and mathematical programming strategies. Because of the high latency and low bandwidth of current network environments, research in this area has been limited to parameter-sweep and loosely coupled iterative applications. For the cooperative task scheduling problem, job-level approaches proposed in China include heuristic task scheduling based on fuzzy clustering and cooperative scheduling based on the particle swarm algorithm. These study performance optimization during cooperative operation and extend research on high-performance multi-resource scheduling in network computing environments. Cooperative scheduling of tasks across multiple resources in a network environment is an NP-hard problem, so an optimal schedule is difficult to obtain in polynomial time. The main factors affecting task execution time include the diversity of tasks and the differences among resources.
Disclosure of Invention
The invention provides a distributed computing task cooperative scheduling method and device, with the aim of providing an effective task scheduling method that makes full use of available resources to complete the submitted tasks in the shortest time.
The purpose of the invention is realized by adopting the following technical scheme:
the improvement of a method for collaborative scheduling of distributed computing tasks, comprising:
determining expected completion time of each task on each resource, and establishing an expected completion time matrix;
determining the number of tasks to be processed of each resource by using a gene expression programming algorithm;
determining the evaluation value of each task according to the urgency and the importance of each task, and sequencing the evaluation values of each task from large to small to obtain a task sequence;
and sequentially distributing the tasks in the task sequence to each resource by utilizing a Min-Min algorithm according to the expected completion time matrix and the number of the tasks to be processed of each resource.
Preferably, the determining the expected completion time of each task on each resource and establishing an expected completion time matrix includes:
denoting the number of tasks as n and the number of resources as m, and constructing an m × n expected completion time matrix $E_{m \times n}$ according to the following formula:
Figure BDA0001225138970000021
In the above formula, $e_{ij}$ is the expected completion time of the ith task on the jth resource.
Preferably, the determining the number of tasks required to be processed by each resource by using the gene expression programming algorithm includes:
a. initializing a population, wherein the population consists of the m resources and the number of tasks to be processed by each resource, the head length of a chromosome in the population is p, and the tail length is $d = p(l-1)+1$, where l is the maximum number of operands;
b. selecting the optimal individual in the population according to the fitness function values of the individuals, retaining the optimal individual, and performing gene crossover, gene mutation, and recombination on the individuals of the current sub-population to obtain a new population, wherein the fitness function of an individual is determined according to the following formula:
Figure BDA0001225138970000022
in the above formula, M is the range of the number of tasks a resource may select, $C_{(i,j)}$ is the fitness return value of task i on resource j, $T_j$ is the target value for the number of tasks selected by resource j, and $f_i$ is the fitness value of scheduling task i onto resource j;
c. if the generation count t satisfies t > T, outputting the new population and decoding it to obtain the number of tasks to be processed by each resource when the objective function value is minimal; otherwise, setting t = t + 1 and returning to step b, wherein the objective function is:
Figure BDA0001225138970000023
in the above formula, $h_j$ is the number of tasks to be processed by resource j, $e_{ij}$ is the expected completion time of the ith task on the jth resource, n is the total number of tasks, and m is the total number of resources.
Preferably, the determining the evaluation value of each task according to the urgency and the importance of each task, and sorting the evaluation values of each task from large to small to obtain the task sequence includes:
the evaluation value $G_i(t)$ of task i at the current time t is determined according to the following formula:
$G_i(t) = p_1 U_i(t) + p_2 I_i$
In the above formula, $U_i(t)$ is the urgency of task i at the current time t, $I_i$ is the importance of task i, $p_1$ is the urgency weight, $p_2$ is the importance weight, and $p_1 + p_2 = 1$.
Further, the urgency $U_i(t)$ of task i at the current time t is determined as follows:
$U_i(t) = t_{i1} / (t_{i2} + t_{i3} - t)$
In the above formula, $t_{i1}$ is the estimated completion time of the task, $t_{i2}$ is the allowed completion time of the task, and $t_{i3}$ is the time-out allowed for the task.
Further, the importance $I_i$ of task i is determined as follows:
$I_i = m_1 H_i + m_2 N_i$
In the above formula, $H_i$ is the relationship importance of task i, $N_i$ is the time importance of task i, $m_1$ is the relationship importance weight, $m_2$ is the time importance weight, and $m_1 + m_2 = 1$;
wherein the relationship importance $H_i$ of task i is determined according to the following formula:
Figure BDA0001225138970000031
In the above formula, $M_{ik}$ is the dependency between task i and task k: if $M_{ik} = 0$, task i is independent of task k; if $M_{ik} = 1$, the execution of task i and the execution of task k are interdependent; n is the total number of tasks;
the time importance $N_i$ of task i is determined as follows:
Figure BDA0001225138970000032
In the above formula, $t_{i1}$ is the estimated completion time of task i, and $t_{k1}$ is the estimated completion time of task k.
Preferably, the sequentially allocating the tasks in the task sequence to the resources by using a Min-Min algorithm according to the expected completion time matrix and the number of the tasks to be processed by the resources includes:
a. deleting the already-allocated tasks from the task sequence, and deleting from the resource set every resource whose number of allocated tasks has reached the number of tasks it is to process;
b. selecting the top-ranked task in the task sequence, allocating it to the resource in the resource set with the minimum expected completion time for that task, and returning to step a until the task sequence is empty.
In an apparatus for collaborative scheduling of distributed computing tasks, the improvement comprising:
the first determining module is used for determining the expected completion time of each task on each resource and establishing an expected completion time matrix;
the second determining module is used for determining the number of tasks to be processed of each resource by using a gene expression programming algorithm;
the evaluation module is used for determining the evaluation value of each task according to the urgency and the importance of each task and sequencing the evaluation values of each task from large to small to obtain a task sequence;
and the distribution module is used for sequentially distributing the tasks in the task sequence to each resource by utilizing a Min-Min algorithm according to the expected completion time matrix and the number of the tasks needing to be processed by each resource.
Preferably, the first determining module includes:
denoting the number of tasks as n and the number of resources as m, and constructing an m × n expected completion time matrix $E_{m \times n}$ according to the following formula:
Figure BDA0001225138970000041
In the above formula, $e_{ij}$ is the expected completion time of the ith task on the jth resource.
Preferably, the second determining module includes:
a. initializing a population, wherein the population consists of the m resources and the number of tasks to be processed by each resource, the head length of a chromosome in the population is p, and the tail length is $d = p(l-1)+1$, where l is the maximum number of operands;
b. selecting the optimal individual in the population according to the fitness function values of the individuals, retaining the optimal individual, and performing gene crossover, gene mutation, and recombination on the individuals of the current sub-population to obtain a new population, wherein the fitness function of an individual is determined according to the following formula:
Figure BDA0001225138970000042
in the above formula, M is the range of the number of tasks a resource may select, $C_{(i,j)}$ is the fitness return value of task i on resource j, $T_j$ is the target value for the number of tasks selected by resource j, and $f_i$ is the fitness value of scheduling task i onto resource j;
c. if the generation count t satisfies t > T, outputting the new population and decoding it to obtain the number of tasks to be processed by each resource when the objective function value is minimal; otherwise, setting t = t + 1 and returning to step b, wherein the objective function is:
Figure BDA0001225138970000051
in the above formula, $h_j$ is the number of tasks to be processed by resource j, $e_{ij}$ is the expected completion time of the ith task on the jth resource, n is the total number of tasks, and m is the total number of resources.
Preferably, the evaluation module includes:
the evaluation value $G_i(t)$ of task i at the current time t is determined according to the following formula:
$G_i(t) = p_1 U_i(t) + p_2 I_i$
In the above formula, $U_i(t)$ is the urgency of task i at the current time t, $I_i$ is the importance of task i, $p_1$ is the urgency weight, $p_2$ is the importance weight, and $p_1 + p_2 = 1$.
Preferably, the distribution module includes:
a. deleting the already-allocated tasks from the task sequence, and deleting from the resource set every resource whose number of allocated tasks has reached the number of tasks it is to process;
b. selecting the top-ranked task in the task sequence, allocating it to the resource in the resource set with the minimum expected completion time for that task, and returning to step a until the task sequence is empty.
The invention has the beneficial effects that:
The technical solution provided by the invention takes the total task processing time as its objective. Tasks submitted by users are sorted by importance and urgency, and tasks with high urgency and importance are processed first and allocated, according to the currently available resources, to the grid resource with the shortest processing time, so that such tasks go to high-quality resources. Within a limited number of selection rounds, the method meets user requirements as far as possible. It has good scalability and flexibility, makes full use of the currently available resources, improves quality of service, and addresses the collaborative optimization of big-data-oriented distributed computing tasks in a resource environment.
Drawings
FIG. 1 is a flow chart of a method for collaborative scheduling of distributed computing tasks in accordance with the present invention;
fig. 2 is a schematic structural diagram of a cooperative scheduling apparatus for distributed computing tasks according to the present invention.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The cooperative scheduling method of the distributed computing task provided by the invention needs to consider two problems:
(1) How to sort the tasks submitted by users according to their importance and urgency, so that tasks with high urgency and importance are allocated to the best resources for processing, data transmission and task execution proceed concurrently, and the time a task spends waiting for data is reduced.
(2) Because the available resources differ, how to assign tasks to resources in proportion to resource attributes, setting the number of tasks for each resource reasonably according to its computing power, storage space, and the like.
The cooperative scheduling method for distributed computing tasks provided by the invention sorts the tasks by their evaluation function values, processes tasks with high evaluation function values first, and uses a gene expression programming algorithm to determine the number of tasks each resource processes. As shown in FIG. 1, the method comprises the following steps:
101. determining expected completion time of each task on each resource, and establishing an expected completion time matrix;
102. determining the number of tasks to be processed of each resource by using a gene expression programming algorithm;
103. determining the evaluation value of each task according to the urgency and the importance of each task, and sequencing the evaluation values of each task from large to small to obtain a task sequence;
104. and sequentially distributing the tasks in the task sequence to each resource by utilizing a Min-Min algorithm according to the expected completion time matrix and the number of the tasks to be processed of each resource.
Specifically, the step 101 includes:
the number of tasks is denoted n and the number of resources m, and an m × n expected completion time matrix $E_{m \times n}$ is constructed as follows:
Figure BDA0001225138970000061
In the above formula, $e_{ij}$ is the expected completion time of the ith task on the jth resource.
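Step 101 can be sketched in Python as follows. The text does not say how the expected completion times are estimated, so the workload-over-speed model, the function name `build_ect_matrix`, and its parameters are assumptions made purely for illustration. The sketch stores one row per task and one column per resource so that `E[i][j]` lines up with $e_{ij}$ as worded above.

```python
def build_ect_matrix(task_workloads, resource_speeds):
    """Return E where E[i][j] is the expected completion time of task i on resource j.

    Assumption (not from the patent text): time = workload / speed.
    """
    if any(speed <= 0 for speed in resource_speeds):
        raise ValueError("resource speeds must be positive")
    # One row per task, one column per resource.
    return [[work / speed for speed in resource_speeds] for work in task_workloads]
```

For example, `build_ect_matrix([10.0, 20.0], [1.0, 2.0])` gives one row per task with the slower resource in the first column.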
After the expected completion time matrix is established, the number of tasks to be processed by each resource must be determined. Therefore, in step 102, a gene expression programming algorithm is used to approximate the optimal value of the objective function
Figure BDA0001225138970000062
so as to obtain the number of tasks processed by each resource, which specifically comprises the following steps:
a. initializing a population, wherein the population consists of the m resources and the number of tasks to be processed by each resource, the head length of a chromosome in the population is p, and the tail length is $d = p(l-1)+1$, where l is the maximum number of operands;
where p = 12 in this application;
b. selecting the optimal individual in the population according to the fitness function values of the individuals, retaining the optimal individual, and performing gene crossover, gene mutation, and recombination on the individuals of the current sub-population to obtain a new population, wherein the fitness function of an individual is determined according to the following formula:
Figure BDA0001225138970000071
in the above formula, M is the range of the number of tasks a resource may select, $C_{(i,j)}$ is the fitness return value of task i on resource j, $T_j$ is the target value for the number of tasks selected by resource j, and $f_i$ is the fitness value of scheduling task i onto resource j;
The application uses roulette-wheel selection to choose the next generation of individuals: individuals are kept or eliminated according to their fitness values, and individuals with higher fitness values are more likely to be selected.
c. If the generation count t satisfies t > T, outputting the new population and decoding it to obtain the number of tasks to be processed by each resource when the objective function value is minimal; otherwise, setting t = t + 1 and returning to step b, wherein the objective function is:
Figure BDA0001225138970000072
in the above formula, $h_j$ is the number of tasks to be processed by resource j, $e_{ij}$ is the expected completion time of the ith task on the jth resource, n is the total number of tasks, and m is the total number of resources.
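Two pieces of step 102 are concrete enough to sketch: the tail-length rule $d = p(l-1)+1$ and the roulette-wheel selection with the best individual retained. The Python below is only an illustration under assumptions: the function names are hypothetical, fitness values are taken as given non-negative numbers (the fitness and objective formulas appear only as images here), and crossover, mutation, and recombination are left out.

```python
import random


def tail_length(head_length, max_operands):
    """Tail length d = p * (l - 1) + 1 for head length p and maximum operand count l."""
    return head_length * (max_operands - 1) + 1


def roulette_select(population, fitness_values, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Assumes all fitness values are non-negative and their sum is positive.
    """
    total = sum(fitness_values)
    if total <= 0:
        raise ValueError("total fitness must be positive")
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fitness in zip(population, fitness_values):
        running += fitness
        if pick <= running:
            return individual
    return population[-1]  # guard against floating-point round-off


def select_next_generation(population, fitness_values, rng=random):
    """Keep the best individual, then fill the remaining slots by roulette selection."""
    best = max(range(len(population)), key=lambda k: fitness_values[k])
    chosen = [population[best]]
    while len(chosen) < len(population):
        chosen.append(roulette_select(population, fitness_values, rng))
    return chosen
```

With the head length p = 12 stated above and, as an assumed example, operators taking at most two operands, `tail_length(12, 2)` evaluates to 13.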
After determining the number of tasks to be processed by each resource, the tasks need to be sorted according to the evaluation value of each task, so step 103 includes:
the evaluation value $G_i(t)$ of task i at the current time t is determined according to the following formula:
$G_i(t) = p_1 U_i(t) + p_2 I_i$
In the above formula, $U_i(t)$ is the urgency of task i at the current time t, $I_i$ is the importance of task i, $p_1$ is the urgency weight, $p_2$ is the importance weight, and $p_1 + p_2 = 1$.
Because a task is more urgent the longer its estimated completion time and the less time remains before its deadline, the urgency $U_i(t)$ of task i at the current time t is determined by the following equation:
$U_i(t) = t_{i1} / (t_{i2} + t_{i3} - t)$
In the above formula, $t_{i1}$ is the estimated completion time of the task, $t_{i2}$ is the allowed completion time of the task, and $t_{i3}$ is the time-out allowed for the task.
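The evaluation value and urgency formulas can be sketched in Python as below. The function names, the dictionary field names, the equal default weights, and the decision to raise an error once the allowed time plus time-out has elapsed are all assumptions for illustration; the importance value is simply taken as an input here.

```python
def urgency(t_est, t_allowed, t_timeout, now):
    """U_i(t) = t_i1 / (t_i2 + t_i3 - t)."""
    remaining = t_allowed + t_timeout - now
    if remaining <= 0:
        # Assumption: the text does not say what happens past this point.
        raise ValueError("allowed completion time plus time-out has already elapsed")
    return t_est / remaining


def evaluation_value(urgency_value, importance_value, p1=0.5, p2=0.5):
    """G_i(t) = p1 * U_i(t) + p2 * I_i, with p1 + p2 = 1."""
    if abs(p1 + p2 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return p1 * urgency_value + p2 * importance_value


def rank_tasks(tasks, now, p1=0.5, p2=0.5):
    """Return task ids sorted by evaluation value, largest first.

    `tasks` maps a task id to a dict with the (hypothetical) keys
    t_est, t_allowed, t_timeout and importance.
    """
    scores = {
        task_id: evaluation_value(
            urgency(f["t_est"], f["t_allowed"], f["t_timeout"], now),
            f["importance"], p1, p2)
        for task_id, f in tasks.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Because the remaining time sits in the denominator, a task's urgency grows as the current time approaches its allowed completion time plus time-out, which moves it toward the front of the sequence.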
The importance of a task has two dimensions, relationship and time, and the relationship importance reflects the influence of the execution time of a single task on the whole system. If a task is related to other tasks, i.e. the execution of task $T_j$ depends on $T_i$, then task $T_i$ is considered to have higher relationship importance; if a task requires a longer execution time, it is considered to have higher time importance for the whole system. Therefore, the importance $I_i$ of task i is determined as follows:
$I_i = m_1 H_i + m_2 N_i$
In the above formula, $H_i$ is the relationship importance of task i, $N_i$ is the time importance of task i, $m_1$ is the relationship importance weight, $m_2$ is the time importance weight, and $m_1 + m_2 = 1$;
wherein the relationship importance $H_i$ of task i is determined according to the following formula:
Figure BDA0001225138970000081
In the above formula, $M_{ik}$ is the dependency between task i and task k: if $M_{ik} = 0$, task i is independent of task k; if $M_{ik} = 1$, the execution of task i and the execution of task k are interdependent; n is the total number of tasks;
the time importance $N_i$ of task i is determined as follows:
Figure BDA0001225138970000082
In the above formula, $t_{i1}$ is the estimated completion time of task i, and $t_{k1}$ is the estimated completion time of task k.
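The weighted combination $I_i = m_1 H_i + m_2 N_i$ can be sketched in Python as below. The formulas for $H_i$ and $N_i$ are available here only as image placeholders, so the two normalisations used in this sketch (the share of tasks that task i is interdependent with, and task i's share of the total estimated completion time) are assumptions chosen to fit the symbol definitions, not the patent's confirmed formulas. All function names and the equal default weights are likewise hypothetical.

```python
def relation_importance(dependency, i):
    """ASSUMED form of H_i: sum over k of M_ik, divided by n.

    dependency[i][k] holds M_ik (0 = independent, 1 = interdependent).
    """
    n = len(dependency)
    return sum(dependency[i]) / n


def time_importance(est_times, i):
    """ASSUMED form of N_i: t_i1 divided by the sum of t_k1 over all tasks k."""
    total = sum(est_times)
    if total <= 0:
        raise ValueError("estimated completion times must sum to a positive value")
    return est_times[i] / total


def importance(dependency, est_times, i, m1=0.5, m2=0.5):
    """I_i = m1 * H_i + m2 * N_i, with m1 + m2 = 1 (this line follows the text)."""
    if abs(m1 + m2 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return m1 * relation_importance(dependency, i) + m2 * time_importance(est_times, i)
```

Under these assumed normalisations both components lie between 0 and 1, so the importance stays on a bounded scale when it is combined with urgency in the evaluation value.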
Finally, in step 104, sequentially allocating the tasks in the task sequence to the resources by using a Min-Min algorithm according to the expected completion time matrix and the number of the tasks to be processed by the resources, including:
a. deleting the already-allocated tasks from the task sequence, and deleting from the resource set every resource whose number of allocated tasks has reached the number of tasks it is to process;
b. selecting the top-ranked task in the task sequence, allocating it to the resource in the resource set with the minimum expected completion time for that task, and returning to step a until the task sequence is empty.
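Steps a and b can be sketched in Python as a single loop. This follows the two steps literally as described; the function name `allocate`, the task-by-resource layout of `ect`, the upfront quota check, and the tie-break by lowest resource index are assumptions added for illustration.

```python
def allocate(task_sequence, ect, quotas):
    """Assign tasks in priority order to the open resource with the smallest expected time.

    task_sequence: task indices, highest evaluation value first.
    ect: ect[i][j] is the expected completion time of task i on resource j.
    quotas: quotas[j] is the number of tasks resource j is to process (h_j).
    Returns a dict mapping task index to resource index.
    """
    if sum(quotas) < len(task_sequence):
        raise ValueError("quotas do not cover all tasks")
    remaining = list(quotas)
    open_resources = {j for j, quota in enumerate(remaining) if quota > 0}
    assignment = {}
    for task in task_sequence:  # step b: take the top-ranked unassigned task
        # Ties on expected time are broken by the lower resource index (assumption).
        best = min(open_resources, key=lambda j: (ect[task][j], j))
        assignment[task] = best
        remaining[best] -= 1
        if remaining[best] == 0:  # step a: drop a resource once its quota is met
            open_resources.remove(best)
    return assignment
```

The textbook Min-Min heuristic also updates each resource's ready time after every assignment; the steps above do not mention that, so this sketch compares the raw matrix entries only.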
The present invention also provides a cooperative scheduling apparatus for distributed computing tasks, as shown in fig. 2, the apparatus includes:
the first determining module is used for determining the expected completion time of each task on each resource and establishing an expected completion time matrix;
the second determining module is used for determining the number of tasks to be processed of each resource by using a gene expression programming algorithm;
the evaluation module is used for determining the evaluation value of each task according to the urgency and the importance of each task and sequencing the evaluation values of each task from large to small to obtain a task sequence;
and the distribution module is used for sequentially distributing the tasks in the task sequence to each resource by utilizing a Min-Min algorithm according to the expected completion time matrix and the number of the tasks needing to be processed by each resource.
The first determining module includes:
denoting the number of tasks as n and the number of resources as m, and constructing an m × n expected completion time matrix $E_{m \times n}$ according to the following formula:
Figure BDA0001225138970000091
In the above formula, $e_{ij}$ is the expected completion time of the ith task on the jth resource.
The second determining module includes:
a. initializing a population, wherein the population consists of the m resources and the number of tasks to be processed by each resource, the head length of a chromosome in the population is p, and the tail length is $d = p(l-1)+1$, where l is the maximum number of operands;
b. selecting the optimal individual in the population according to the fitness function values of the individuals, retaining the optimal individual, and performing gene crossover, gene mutation, and recombination on the individuals of the current sub-population to obtain a new population, wherein the fitness function of an individual is determined according to the following formula:
Figure BDA0001225138970000092
in the above formula, M is the range of the number of tasks a resource may select, $C_{(i,j)}$ is the fitness return value of task i on resource j, $T_j$ is the target value for the number of tasks selected by resource j, and $f_i$ is the fitness value of scheduling task i onto resource j;
c. if the generation count t satisfies t > T, outputting the new population and decoding it to obtain the number of tasks to be processed by each resource when the objective function value is minimal; otherwise, setting t = t + 1 and returning to step b, wherein the objective function is:
Figure BDA0001225138970000093
in the above formula, $h_j$ is the number of tasks to be processed by resource j, $e_{ij}$ is the expected completion time of the ith task on the jth resource, n is the total number of tasks, and m is the total number of resources.
The evaluation module comprises:
the evaluation value $G_i(t)$ of task i at the current time t is determined according to the following formula:
$G_i(t) = p_1 U_i(t) + p_2 I_i$
In the above formula, $U_i(t)$ is the urgency of task i at the current time t, $I_i$ is the importance of task i, $p_1$ is the urgency weight, $p_2$ is the importance weight, and $p_1 + p_2 = 1$.
The distribution module includes:
a. deleting the already-allocated tasks from the task sequence, and deleting from the resource set every resource whose number of allocated tasks has reached the number of tasks it is to process;
b. selecting the top-ranked task in the task sequence, allocating it to the resource in the resource set with the minimum expected completion time for that task, and returning to step a until the task sequence is empty.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (7)

1. A method for collaborative scheduling of distributed computing tasks, the method comprising:
determining expected completion time of each task on each resource, and establishing an expected completion time matrix;
determining the number of tasks to be processed of each resource by using a gene expression programming algorithm;
determining the evaluation value of each task according to the urgency and the importance of each task, and sequencing the evaluation values of each task from large to small to obtain a task sequence;
allocating tasks in the task sequence to each resource in sequence by utilizing a Min-Min algorithm according to the expected completion time matrix and the number of tasks to be processed of each resource;
wherein determining the evaluation value of each task according to the urgency and the importance of each task, and sorting the evaluation values from large to small to obtain the task sequence, comprises:
determining the evaluation value $G_i(t)$ of task i at the current time t according to the following formula:
$G_i(t) = p_1 U_i(t) + p_2 I_i$
In the above formula, $U_i(t)$ is the urgency of task i at the current time t, $I_i$ is the importance of task i, $p_1$ is the urgency weight, $p_2$ is the importance weight, and $p_1 + p_2 = 1$;
determining the urgency $U_i(t)$ of task i at the current time t according to the following formula:
$U_i(t) = t_{i1} / (t_{i2} + t_{i3} - t)$
In the above formula, $t_{i1}$ is the estimated completion time of task i, $t_{i2}$ is the allowed completion time of the task, and $t_{i3}$ is the allowed timeout of the task;
determining the importance $I_i$ of task i according to the following formula:
$I_i = m_1 H_i + m_2 N_i$
In the above formula, $H_i$ is the relationship importance of task i, $N_i$ is the time importance of task i, $m_1$ is the relationship importance weight, $m_2$ is the time importance weight, and $m_1 + m_2 = 1$;
wherein the relationship importance $H_i$ of task i is determined according to the following formula:
Figure FDA0003189742510000011
In the above formula, $M_{ik}$ is the dependency relationship between task i and task k: if $M_{ik} = 0$, task i is independent of task k; if $M_{ik} = 1$, the execution of task i and the execution of task k are interdependent; n is the total number of tasks;
determining the time importance $N_i$ of task i according to the following formula:
Figure FDA0003189742510000021
In the above formula, $t_{i1}$ is the estimated completion time of task i, and $t_{k1}$ is the estimated completion time of task k;
wherein determining the number of tasks to be processed by each resource by using the gene expression programming algorithm comprises:
a. initializing a population, wherein the population consists of m resources and the number of tasks to be processed by each resource, the head length of a chromosome in the population is p, the tail length is $d = p(l-1) + 1$, and l is the maximum number of operands;
b. selecting the optimal individual in the population according to the fitness function values of the population individuals, retaining the optimal individual, and performing gene crossover, gene mutation and recombination on the individuals of the current sub-population to obtain a new population, wherein the fitness function of the population individuals is determined according to the following formula:
Figure FDA0003189742510000022
In the above formula, M is the range of the number of tasks a resource can select, $C_{(i,j)}$ is the fitness return value of task i on resource j, $T_j$ is the target value of the number of tasks selected by resource j, and $f_i$ is the fitness value of scheduling task i onto resource j;
c. if the generation count t satisfies t > T, outputting the new population, decoding the new population, and obtaining the number of tasks to be processed by each resource at which the objective function value is minimal; if the generation count t does not satisfy t > T, setting t = t + 1 and returning to step b, wherein the objective function is as follows:
Figure FDA0003189742510000023
In the above formula, $h_j$ is the number of tasks to be processed by resource j, $e_{ij}$ is the expected completion time of task i on resource j, n is the total number of tasks, and m is the total number of resources.
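The evaluation-value step of claim 1 can be illustrated with a minimal Python sketch. It implements only the formulas given in text form: $G_i(t) = p_1 U_i(t) + p_2 I_i$, $U_i(t) = t_{i1}/(t_{i2} + t_{i3} - t)$ and $I_i = m_1 H_i + m_2 N_i$. The formulas for $H_i$ and $N_i$ appear only as figures, so the sketch takes both values as inputs supplied by the caller. The `Task` class, the function names, the default weights of 0.5, and the choice to raise an error when $t_{i2} + t_{i3} - t \le 0$ are all assumptions made for illustration and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    est_completion: float       # t_i1, estimated completion time
    allowed_completion: float   # t_i2, allowed completion time
    allowed_timeout: float      # t_i3, allowed timeout
    relation_importance: float  # H_i, supplied by the caller (formula not reproduced here)
    time_importance: float      # N_i, supplied by the caller (formula not reproduced here)


def urgency(task, now):
    """U_i(t) = t_i1 / (t_i2 + t_i3 - t)."""
    slack = task.allowed_completion + task.allowed_timeout - now
    if slack <= 0:
        # Assumption: the claim does not say what happens once the slack is used up.
        raise ValueError("task is past its allowed completion time plus timeout")
    return task.est_completion / slack


def importance(task, m1=0.5, m2=0.5):
    """I_i = m1 * H_i + m2 * N_i, with m1 + m2 = 1."""
    return m1 * task.relation_importance + m2 * task.time_importance


def evaluation(task, now, p1=0.5, p2=0.5, m1=0.5, m2=0.5):
    """G_i(t) = p1 * U_i(t) + p2 * I_i, with p1 + p2 = 1."""
    return p1 * urgency(task, now) + p2 * importance(task, m1, m2)


def task_sequence(tasks, now, **weights):
    """Sort tasks by evaluation value, largest first."""
    return sorted(tasks, key=lambda task: evaluation(task, now, **weights), reverse=True)


tasks = [
    Task("a", 2.0, 10.0, 2.0, 0.5, 0.2),
    Task("b", 4.0, 6.0, 2.0, 0.1, 0.4),
]
print([task.name for task in task_sequence(tasks, now=0.0)])
```

Because the urgency denominator shrinks as t approaches $t_{i2} + t_{i3}$, the same task moves forward in the sequence the longer it waits, which is why the sketch recomputes the ordering from the current time on every call.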
2. The method of claim 1, wherein determining expected completion times for tasks on resources and building an expected completion time matrix comprises:
recording the number of tasks as n and the number of resources as m, and constructing an n × m expected completion time matrix $E_{n \times m}$ according to the following formula:
Figure FDA0003189742510000024
In the above formula, $e_{ij}$ is the expected completion time of the i-th task on the j-th resource.
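As a small illustration of the n × m layout described in claim 2, the sketch below builds the matrix as a list of n rows (tasks) and m columns (resources). The claim does not state how each $e_{ij}$ is estimated, so the sketch assumes, purely for illustration, that $e_{ij}$ equals a task length divided by a resource speed. The function name, the parameter names and that estimate are hypothetical.

```python
def expected_completion_matrix(task_lengths, resource_speeds):
    """Return E as n rows by m columns, where E[i][j] plays the role of e_ij.

    Assumption: e_ij = task length / resource speed. The patent claim only
    requires that e_ij be the expected completion time of task i on resource j.
    """
    if any(speed <= 0 for speed in resource_speeds):
        raise ValueError("resource speeds must be positive")
    return [[length / speed for speed in resource_speeds] for length in task_lengths]


# Three tasks (rows) and two resources (columns).
E = expected_completion_matrix([100.0, 40.0, 60.0], [10.0, 20.0])
for row in E:
    print(row)
```

Any other estimator, for example one based on measured run times, could replace the division as long as the result keeps the same row and column meaning.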
3. The method of claim 1, wherein said sequentially assigning tasks in the task sequence to each resource using a Min-Min algorithm based on the expected completion time matrix and the number of tasks required to be processed by each resource comprises:
a. deleting the allocated tasks from the task sequence, and deleting from the resource set every resource whose number of allocated tasks has reached the number of tasks it needs to process;
b. selecting the top-ranked task in the task sequence, allocating it to the resource in the resource set with the minimum expected completion time for that task, and returning to step a until the task sequence is empty.
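Steps a and b of claim 3 can be sketched in Python as a single loop. The sketch follows the steps as worded in the claim: it takes the tasks in sequence order, gives each one to the still-available resource with the smallest $e_{ij}$, and removes a resource once its task count $h_j$ is reached. It treats $e_{ij}$ as a fixed matrix entry and does not add the time a resource is already busy, which is a simplifying assumption. The function name, the argument names and the tie-breaking rule (lowest resource index) are hypothetical.

```python
def allocate(task_order, E, quotas):
    """Assign tasks to resources under per-resource task counts.

    task_order: task indices, already sorted by descending evaluation value.
    E[i][j]:    expected completion time of task i on resource j.
    quotas[j]:  number of tasks resource j needs to process (h_j).
    Returns a dict mapping task index to resource index.
    """
    if sum(quotas) < len(task_order):
        raise ValueError("quotas do not cover all tasks")
    remaining = list(quotas)
    available = {j for j, quota in enumerate(quotas) if quota > 0}
    assignment = {}
    for i in task_order:
        # Step b: pick the available resource with the minimum expected
        # completion time for this task (ties go to the lowest index).
        j = min(available, key=lambda r: (E[i][r], r))
        assignment[i] = j
        # Step a: the task leaves the sequence, and a resource that has
        # reached its task count leaves the resource set.
        remaining[j] -= 1
        if remaining[j] == 0:
            available.remove(j)
    return assignment


E = [[10.0, 5.0], [4.0, 2.0], [6.0, 3.0]]
print(allocate(task_order=[1, 0, 2], E=E, quotas=[1, 2]))
```

In this example resource 1 is faster for every task, but its count of two is exhausted after the first two tasks in the sequence, so the last task falls to resource 0.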
4. An apparatus for implementing a method of co-scheduling of distributed computing tasks according to any of claims 1-3, the apparatus comprising:
the first determining module is used for determining the expected completion time of each task on each resource and establishing an expected completion time matrix;
the second determining module is used for determining the number of tasks to be processed of each resource by using a gene expression programming algorithm;
the evaluation module is used for determining the evaluation value of each task according to the urgency and the importance of each task and sequencing the evaluation values of each task from large to small to obtain a task sequence;
the allocation module is used for sequentially allocating the tasks in the task sequence to the resources by utilizing a Min-Min algorithm according to the expected completion time matrix and the number of the tasks to be processed of the resources;
the second determining module includes:
a. initializing a population, wherein the population consists of m resources and the number of tasks to be processed by each resource, the head length of a chromosome in the population is p, the tail length is $d = p(l-1) + 1$, and l is the maximum number of operands;
b. selecting the optimal individual in the population according to the fitness function values of the population individuals, retaining the optimal individual, and performing gene crossover, gene mutation and recombination on the individuals of the current sub-population to obtain a new population, wherein the fitness function of the population individuals is determined according to the following formula:
Figure FDA0003189742510000031
In the above formula, M is the range of the number of tasks a resource can select, $C_{(i,j)}$ is the fitness return value of task i on resource j, $T_j$ is the target value of the number of tasks selected by resource j, and $f_i$ is the fitness value of scheduling task i onto resource j;
c. if the generation count t satisfies t > T, outputting the new population, decoding the new population, and obtaining the number of tasks to be processed by each resource at which the objective function value is minimal; if the generation count t does not satisfy t > T, setting t = t + 1 and returning to step b, wherein the objective function is as follows:
Figure FDA0003189742510000032
In the above formula, $h_j$ is the number of tasks to be processed by resource j, $e_{ij}$ is the expected completion time of task i on resource j, n is the total number of tasks, and m is the total number of resources.
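The control flow of steps a to c (initialize, keep the best individual, vary the rest, stop once t > T) can be sketched as follows. This is a deliberately simplified stand-in and not gene expression programming itself: individuals are plain vectors of task counts $h_j$ rather than head-and-tail chromosomes, only mutation is applied (crossover and recombination are omitted), and the objective is a placeholder, because the fitness and objective formulas of the claim are given only as figures. All function names, the placeholder objective, the population size, the generation limit and the seed are assumptions. The `tail_length` helper only evaluates $d = p(l-1) + 1$.

```python
import random


def tail_length(head_length, max_operands):
    """d = p * (l - 1) + 1, as stated in step a."""
    return head_length * (max_operands - 1) + 1


def placeholder_objective(quotas, E):
    """Stand-in objective (assumption, not the patent's formula): the largest
    estimated load, where resource j's load is h_j times the mean of e_ij."""
    n = len(E)
    return max(h * sum(row[j] for row in E) / n for j, h in enumerate(quotas))


def random_quotas(n, m, rng):
    """Random vector of m non-negative task counts that sums to n."""
    quotas = [0] * m
    for _ in range(n):
        quotas[rng.randrange(m)] += 1
    return quotas


def mutate(quotas, rng):
    """Move one task from a non-empty resource to a randomly chosen resource
    (possibly the same one), so the total stays equal to n."""
    child = list(quotas)
    src = rng.choice([j for j, h in enumerate(child) if h > 0])
    dst = rng.randrange(len(child))
    child[src] -= 1
    child[dst] += 1
    return child


def evolve_quotas(E, pop_size=20, max_generations=50, seed=0):
    rng = random.Random(seed)
    n, m = len(E), len(E[0])
    population = [random_quotas(n, m, rng) for _ in range(pop_size)]
    t = 1
    while t <= max_generations:  # the loop ends once t > T
        best = min(population, key=lambda q: placeholder_objective(q, E))
        # Keep the best individual unchanged and refill the rest of the
        # population with mutated copies of randomly chosen individuals.
        population = [best] + [
            mutate(rng.choice(population), rng) for _ in range(pop_size - 1)
        ]
        t += 1
    return min(population, key=lambda q: placeholder_objective(q, E))


E = [[10.0, 5.0], [4.0, 2.0], [6.0, 3.0]]
print(tail_length(5, 2), evolve_quotas(E))
```

Carrying the best individual over unchanged means the best objective value in the population never gets worse from one generation to the next, which is the property that retaining the optimal individual in step b provides.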
5. The apparatus of claim 4, wherein the first determining module comprises:
recording the number of tasks as n and the number of resources as m, and constructing an n × m expected completion time matrix $E_{n \times m}$ according to the following formula:
Figure FDA0003189742510000041
In the above formula, $e_{ij}$ is the expected completion time of the i-th task on the j-th resource.
6. The apparatus of claim 4, wherein the evaluation module comprises:
determining the evaluation value $G_i(t)$ of task i at the current time t according to the following formula:
$G_i(t) = p_1 U_i(t) + p_2 I_i$
In the above formula, $U_i(t)$ is the urgency of task i at the current time t, $I_i$ is the importance of task i, $p_1$ is the urgency weight, $p_2$ is the importance weight, and $p_1 + p_2 = 1$.
7. The apparatus of claim 4, wherein the assignment module comprises:
a. deleting the allocated tasks from the task sequence, and deleting from the resource set every resource whose number of allocated tasks has reached the number of tasks it needs to process;
b. selecting the top-ranked task in the task sequence, allocating it to the resource in the resource set with the minimum expected completion time for that task, and returning to step a until the task sequence is empty.
CN201710078384.2A 2017-02-14 2017-02-14 Distributed computing task cooperative scheduling method and device Active CN108427602B (en)

Publications (2)

Publication Number Publication Date
CN108427602A CN108427602A (en) 2018-08-21
CN108427602B true CN108427602B (en) 2021-10-29

Family

ID=63154946

Country Status (1)

Country Link
CN (1) CN108427602B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109451010B (en) * 2018-10-31 2021-08-17 邵榆涵 Information interaction method between computers in local area network
CN109684076B (en) * 2018-11-28 2020-07-10 华中科技大学 Multitasking method suitable for cloud computing system
CN109872049B (en) * 2019-01-23 2021-06-25 北京航空航天大学 Resource allocation optimization method and device
CN110659137B (en) * 2019-09-24 2022-02-08 支付宝(杭州)信息技术有限公司 Processing resource allocation method and system for offline tasks
CN111353696A (en) * 2020-02-26 2020-06-30 中国工商银行股份有限公司 Resource pool scheduling method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101271405A (en) * 2008-05-13 2008-09-24 武汉理工大学 Bidirectional grade gridding resource scheduling method based on QoS restriction
CN101692208A (en) * 2009-10-15 2010-04-07 北京交通大学 Task scheduling method and task scheduling system for processing real-time traffic information
CN102063339A (en) * 2010-12-21 2011-05-18 北京高森明晨信息科技有限公司 Resource load balancing method and equipment based on cloud computing system
CN103019857A (en) * 2012-11-23 2013-04-03 浙江工业大学 Multi-task priority scheduling method for attendance machine of internet of things
CN103902375A (en) * 2014-04-11 2014-07-02 北京工业大学 Cloud task scheduling method based on improved genetic algorithm
CN104932938A (en) * 2015-06-16 2015-09-23 中电科软件信息服务有限公司 Cloud resource scheduling method based on genetic algorithm
CN105117286A (en) * 2015-09-22 2015-12-02 北京大学 Task scheduling and pipelining executing method in MapReduce

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012094030A (en) * 2010-10-28 2012-05-17 Hitachi Ltd Computer system and processing control method
CN102799945B (en) * 2012-06-21 2016-02-10 浙江工商大学 The minimized integrated resource quantity configuration of service-oriented process executory cost and task distribution optimization method
US20140142998A1 (en) * 2012-11-19 2014-05-22 Fmr Llc Method and System for Optimized Task Assignment
CN104376407A (en) * 2014-11-06 2015-02-25 河南智业科技发展有限公司 Task allocation application system and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
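Research on resource scheduling with an optimized hybrid genetic algorithm in cloud computing environments; Wang Jing; Software Guide (软件导刊); 2016-11-30 (No. 11); pp. 53-55 *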

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 102209 18 Riverside Avenue, Changping District science and Technology City, Beijing

Applicant after: GLOBAL ENERGY INTERCONNECTION RESEARCH INSTITUTE Co.,Ltd.

Applicant after: STATE GRID ANHUI ELECTRIC POWER Co.,Ltd.

Applicant after: STATE GRID GANSU ELECTRIC POWER Co.

Applicant after: STATE GRID CORPORATION OF CHINA

Address before: 102209 Beijing Changping District future science and Technology North District Smart Grid Research Institute

Applicant before: GLOBAL ENERGY INTERCONNECTION RESEARCH INSTITUTE Co.,Ltd.

Applicant before: STATE GRID ANHUI ELECTRIC POWER Co.

Applicant before: STATE GRID GANSU ELECTRIC POWER Co.

Applicant before: State Grid Corporation of China

GR01 Patent grant