CN103970613B - Multi-copy task fault tolerance scheduling method of heterogeneous distributed system - Google Patents
- Publication number
- CN103970613B (related identifiers as listed: CN103970613A, CN201410216137.0A)
- Authority
- CN
- China
- Prior art keywords
- task
- node
- copy
- reliability
- scheduling
- Legal status (as listed by Google Patents; an assumption, not a legal conclusion)
- Active
Abstract
The invention belongs to the field of computing and relates in particular to a multi-copy task fault-tolerant scheduling method for heterogeneous distributed systems. From the load of each task and the execution speed of each node in the system, the method computes the average execution time of every task over all processor nodes and the average communication time of every message over all links; computes the bottom-level priority of each task in the task set with the bottom-level priority method; adds the schedulable tasks to a scheduling queue in non-increasing order of priority; and selects the highest-priority task among the schedulable tasks in the queue. The method can further shorten the start time of the copies of the task being scheduled and can thereby further reduce the schedule Makespan.
Description
Technical field
The invention belongs to the field of computing and relates in particular to a multi-copy task fault-tolerant scheduling method for heterogeneous distributed systems.
Background technology
With the advent of high-speed networks, it has become feasible to connect distributed, low-cost and possibly heterogeneous resources into a single computing environment. The heterogeneity of the computers in distributed systems (such as cloud computing, grid computing and distributed mobile computing) will therefore gradually increase, giving rise to a computing platform known as a heterogeneous distributed computing (Heterogeneous Distributed Computing, HDC) system. HDC systems have become popular platforms for high-performance computing and information processing and are increasingly adopted by critical systems. They often offer high throughput and availability and can efficiently access large-scale distributed network information. HDC systems are, however, more complex than homogeneous and centrally controlled systems, and the extra complexity is likely to cause more system failures. In an HDC system, a safety-critical application must not only tolerate faults when failures occur but also meet its timing constraints.
In large-scale heterogeneous distributed computing systems, an effective task scheduling algorithm plays a key role in meeting user or system requirements and in achieving high performance. Task scheduling maps tasks to processors and sets their start times so that the execution order satisfies the dependencies between tasks, while maximizing schedule reliability and minimizing the schedule Makespan. Most existing work on task scheduling addresses independent tasks and ignores the data dependencies and precedence constraints between tasks. Moreover, minimizing the schedule Makespan and minimizing the application failure probability are often conflicting goals, so a scheduling algorithm that considers both Makespan and reliability is needed.
Fault-tolerant scheduling has been studied extensively. Fault-tolerant scheduling methods can improve reliability through passive replication (Primary/Backup, PB) or active replication. The PB mechanism assigns two versions of each task to different processors; when the processor of the primary version fails, the task is executed on the processor of the backup version. The PB mechanism tolerates only a single failure, and when a failure does occur the task takes longer to execute, so real-time deadlines may well be missed. Active replication is based on spatial redundancy: multiple copies of a task are scheduled on different processors, and fault tolerance is achieved by executing the copies in parallel.
Multi-copy task scheduling takes two forms: strict scheduling and general scheduling. Under strict scheduling, a task may start only after all copies of all its direct predecessor tasks have finished and their dependency messages have reached the node to which the task is mapped. Most copy-based fault-tolerant scheduling methods use this mode. Under general scheduling, a task may start as soon as, for each direct predecessor task, at least one copy has finished and the message of that copy has been delivered successfully to the node to which the task is mapped. Strict scheduling is clearly a special case of general scheduling. Under strict scheduling, the start time of the copy being scheduled must be the maximum, over all predecessor copies, of finish time plus communication time. Under general scheduling, the start time can be the maximum over only a subset of the predecessor copies. Not all inter-task messages need to be sent; it suffices to send the dependency messages of the required subset of copies. General scheduling can therefore further reduce the start time of the copy being scheduled, so its Makespan is likely to be smaller than that of strict scheduling. Reliability is also calculated differently in the two modes: under strict scheduling, the reliability of a task copy must account for all copies of all predecessor tasks, whereas under general scheduling it accounts only for those predecessor copies whose finish time plus message communication time is less than the start time of the copy being scheduled.
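The difference between the two modes can be illustrated with a minimal Python sketch. It assumes the arrival time (finish time plus communication time) of every predecessor copy is already known; the function names and the numbers are hypothetical and not taken from the patent. For general scheduling the sketch computes only the earliest admissible start, whereas the method described later may choose any valid position at or after it.

```python
from typing import Dict, List


def strict_start(arrivals: Dict[str, List[float]], node_ready: float = 0.0) -> float:
    """Strict scheduling: wait for the message of every copy of every predecessor."""
    latest = max(max(times) for times in arrivals.values())
    return max(node_ready, latest)


def general_start(arrivals: Dict[str, List[float]], node_ready: float = 0.0) -> float:
    """General scheduling: one delivered message per predecessor is enough,
    so the earliest-arriving copy of each predecessor decides."""
    latest = max(min(times) for times in arrivals.values())
    return max(node_ready, latest)


# Hypothetical arrival times of the copies of two predecessor tasks.
arrivals = {"v1": [4.0, 9.0], "v2": [6.0, 7.0]}
strict = strict_start(arrivals)    # waits for the slowest copy of v1
general = general_start(arrivals)  # waits only for the fastest copy of each predecessor
```

With these made-up numbers the strict start is 9.0 and the earliest general start is 6.0, which is the source of the Makespan advantage claimed for general scheduling.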
Most active-replication fault-tolerant scheduling mechanisms blindly replicate each task a fixed number of times in order to tolerate a specified number of processor failures. A. Girault et al., in "An algorithm for automatically obtaining distributed and fault-tolerant static schedules" (Dependable Systems and Networks, 2003), and Anne Benoit et al., in "Fault tolerant scheduling of precedence task graphs on heterogeneous platforms" (Parallel and Distributed Processing, 2008), proposed the FTBAR and FTSA algorithms respectively. To tolerate ε processor failures, the two algorithms schedule each task on the ε+1 processors with the minimum schedule pressure and with the earliest finish time, respectively. Neither algorithm provides a reliability analysis of the schedule. Laiping Zhao et al., in "Reliable workflow scheduling with less resource redundancy" (Parallel Computing, 2013), proposed an active-replication fault-tolerant scheduling algorithm that minimizes resource overhead. The algorithm selects the most reliable nodes to execute task copies and does not consider copy finish times, so a task may be scheduled on a highly reliable node on which it takes a long time to execute, which also harms system load balancing. The method is based on strict scheduling, so its schedule Makespan is longer than that of general scheduling. Antonios Litke et al., in "Efficient task replication and management for adaptive fault tolerance in Mobile Grid environments" (Future Generation Computer Systems, 2007), proposed a fault-tolerant scheduling mechanism for mobile grid computing environments, but it targets independent tasks and does not consider the schedule Makespan. Alain Girault et al., in "Reliability versus performance for critical applications" (Journal of Parallel and Distributed Computing, 2009), proposed a two-phase method that optimizes reliability and schedule length simultaneously. However, the algorithm uses strict scheduling, so its Makespan is long; it does not consider link failures; and the nodes to which task copies are mapped are chosen at random. Laiping Zhao et al., in "A Resource Minimizing Scheduling Algorithm with Ensuring the Deadline and Reliability in Heterogeneous Systems" (Advanced Information Networking and Applications, 2011), studied fault-tolerant task scheduling with active replication. Their algorithm considers both node and link failures but does not use general scheduling.
Random search algorithms combine information from existing search results with some randomness to produce new results. The genetic algorithm (GA) applies the ideas of natural selection and evolution to optimization in high-dimensional spaces; it is simple, fast and robust, and tends to yield good solutions. It uses a guided, non-exhaustive random search and can converge quickly to a near-optimal global solution. The running cost of a GA is usually high, but this is acceptable for long-running tasks, and parallel GA techniques can speed up the computation. SungHo Chin et al., in "Genetic Algorithm based Scheduling Method for Efficiency and Reliability in Mobile Grid" (Ubiquitous Information Technologies & Applications, 2009), used a GA to schedule task copies in a heterogeneous mobile grid environment in order to improve task reliability, but only for independent tasks. Atakan Dogan et al., in "Biobjective scheduling algorithms for execution time-reliability trade-off in heterogeneous computing systems" (The Computer Journal, 2005), proposed a bi-objective genetic algorithm (BGA) that optimizes schedule Makespan and reliability simultaneously, but during evolution BGA may produce invalid solutions that violate the dependencies between tasks. Xiaofeng Wang et al., in "Optimizing the makespan and reliability for workflow applications with Reputation and a look-ahead genetic algorithm" (Future Generation Computer Systems, 2011), used a GA with a two-stage strategy to optimize schedule Makespan and reliability simultaneously while satisfying task dependencies, but the algorithm does not use task replication to raise reliability, so its reliability gains are limited.
Copy scheduling that optimizes reliability on a heterogeneous computing system is an NP-complete problem; that is, no polynomial-time algorithm is known that maximizes reliability. The present method therefore takes an alternative approach: the schedule only has to satisfy the task reliability requirement and need not maximize the reliability of the scheduled tasks. For both transient and permanent faults, evaluating the reliability of a general schedule has been proven to be #P-complete, which is at least as hard as an NP-complete problem. Hence, even when a schedule for the task set is available, the reliability of the task set still cannot be computed in polynomial time. The present method therefore computes a reliability requirement for each task and, subject to the task reliability requirements and the dependency constraints between tasks, further optimizes the schedule Makespan. Reliability is calculated differently under strict scheduling and under general scheduling. The reliability of a task is closely tied to its start time, because the start time determines how many predecessor messages the task can receive. Most existing scheduling approaches do not consider which start-time position on a node a task copy should occupy and are therefore deficient in finish-time optimization. The present method accordingly uses a multi-copy fault-tolerant scheduling mechanism based on general scheduling and, provided the reliability requirement is met, uses a genetic algorithm to further optimize the schedule Makespan.
Summary of the invention
The object of the invention is to provide a multi-copy task fault-tolerant scheduling method for heterogeneous distributed systems based on active replication.
The object of the invention is achieved as follows:
(1) From the load of each task and the execution speed of each node in the system, compute the execution time ET(v_j, p_k) of each task v_j of the application on each node p_k of the system. The application with dependency constraints is G = <V, E>, where V = {v_1, v_2, ..., v_N} is the task set, N = |V| is the number of tasks, and E is the set of directed, communication-weighted edges between the tasks in V. The system model is the undirected graph GS = <P, L>, where P = {p_1, p_2, ..., p_M} is the set of M heterogeneous nodes, M = |P|, and L is the set of |L| communication links. The reliability requirement of the task set is R;
(2) Compute the average execution time of each task over all processor nodes and the average communication time of each message over all links;
(3) Compute the bottom-level priority bl(v_j) of every task v_j in the task set using the bottom-level priority method:
[equation not preserved in the source text]
where succ(v_j) is the set of direct successors of task v_j, the execution term is the average execution time of v_j over all nodes in the node set P, and the communication term is the average transmission time of message e_{j,i} over all links in the link set L;
(4) Add the schedulable tasks to the scheduling queue in non-increasing order of priority;
(5) Select the highest-priority task v_j from the schedulable tasks in the scheduling queue and compute its reliability requirement r_x, where x is the position of the task in the priority queue:
[equation not preserved in the source text]
where 1 ≤ x ≤ n and the positions follow the priority ordering of the tasks; R is the reliability requirement of the task set; r'_i is the reliability actually achieved by the task at position i of the priority queue, with r'_0 = 1. If the task is the highest-priority task, i.e. an entry task, its reliability requirement is [equation not preserved in the source text];
(6) If the reliability requirement is invalid, i.e. the reliability requirement of task v_j satisfies r_x ≥ 1, reject the task and return; otherwise call the multi-copy general scheduling method to determine the nodes and start times of the copies of the task;
(7) Remove the scheduled task from the scheduling queue and add the newly schedulable tasks to the queue by priority; then select the next highest-priority task in the queue and repeat steps (5)-(7) until all tasks have been scheduled.
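Steps (3), (4) and (7) can be sketched in Python as follows. The formula for bl(v_j) is not preserved in this text, so the sketch assumes the standard bottom-level recurrence that matches the symbol definitions given above: bl(v) is the average execution time of v plus the maximum, over its direct successors s, of the average communication time of the message to s plus bl(s). The four-task DAG, all numbers and the helper names are hypothetical.

```python
import heapq
from functools import lru_cache

# Hypothetical 4-task DAG; every number is made up for illustration.
avg_exec = {"v1": 3.0, "v2": 2.0, "v3": 4.0, "v4": 1.0}   # mean execution time over all nodes
avg_comm = {("v1", "v2"): 1.0, ("v1", "v3"): 2.0,
            ("v2", "v4"): 1.0, ("v3", "v4"): 3.0}           # mean transfer time over all links
succ = {"v1": ["v2", "v3"], "v2": ["v4"], "v3": ["v4"], "v4": []}


@lru_cache(maxsize=None)
def bottom_level(task: str) -> float:
    """Assumed recurrence: own average cost plus the most expensive path to an exit task."""
    tail = max((avg_comm[(task, s)] + bottom_level(s) for s in succ[task]), default=0.0)
    return avg_exec[task] + tail


def dispatch_order() -> list:
    """Repeatedly pop the highest-priority schedulable task and release its successors."""
    waiting = {t: 0 for t in succ}            # number of unscheduled predecessors per task
    for children in succ.values():
        for child in children:
            waiting[child] += 1
    queue = [(-bottom_level(t), t) for t, n in waiting.items() if n == 0]
    heapq.heapify(queue)                      # negated key turns the min-heap into a max-queue
    order = []
    while queue:
        _, task = heapq.heappop(queue)
        order.append(task)                    # copy placement (step 6) would happen here
        for child in succ[task]:
            waiting[child] -= 1
            if waiting[child] == 0:           # all predecessors scheduled: now schedulable
                heapq.heappush(queue, (-bottom_level(child), child))
    return order


priorities = {t: bottom_level(t) for t in succ}
order = dispatch_order()
```

A heap keeps the queue in non-increasing priority order without re-sorting after each insertion; a sorted list would serve equally well for small task sets.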
The multi-copy general scheduling method for a task is:
(6.1) Initialize the relevant information: set the number of copies of task v_j to 0 and its set of mapped nodes to empty, and set the idle node set to the node set P;
(6.2) If task v_j is an entry task, select the node with the earliest finish time from the idle node queue to execute a copy of the task and compute the reliability of v_j:
[equation not preserved in the source text]
where proc(v_j) is the set of nodes to which task v_j is mapped, λ_{p_n} is the permanent failure probability of processor node p_n, w(v_j) is the load of task v_j, and w(p_n) is the amount of computation node p_n can perform per unit time. If the reliability requirement of the task is not met, select the node with the next earliest finish time from the idle queue to execute another copy and recompute the reliability of the task, until the requirement is met. If the idle node queue becomes empty and the requirement is still not met, the reliability shortfall is made up through the reliability calculation formula when the copies of subsequent tasks are scheduled;
(6.3) If task v_j has predecessor tasks, call the genetic-algorithm-based multi-copy general scheduling method to schedule its copies.
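The entry-task branch (6.2) can be sketched as follows. The reliability formula is not preserved in this text, so the sketch assumes the common exponential model in which a copy of load w(v_j) on a node of speed w(p_n) with failure rate λ survives with probability exp(-λ · w(v_j) / w(p_n)), and the task succeeds if at least one copy survives. That model, the function names and the node parameters are assumptions for illustration only.

```python
import math


def copy_reliability(load: float, speed: float, failure_rate: float) -> float:
    """Assumed exponential model: the node must not fail while it runs the copy."""
    return math.exp(-failure_rate * load / speed)


def task_reliability(load: float, nodes) -> float:
    """The task succeeds if at least one copy survives; nodes holds (speed, failure_rate) pairs."""
    all_fail = 1.0
    for speed, failure_rate in nodes:
        all_fail *= 1.0 - copy_reliability(load, speed, failure_rate)
    return 1.0 - all_fail


def place_entry_task(load: float, idle_nodes: dict, required: float):
    """idle_nodes maps a node name to (speed, failure_rate, ready_time).
    Adds copies in order of earliest finish time until the requirement is met
    or the idle nodes run out."""
    by_finish = sorted(idle_nodes, key=lambda n: idle_nodes[n][2] + load / idle_nodes[n][0])
    chosen, reliability = [], 0.0
    for name in by_finish:
        chosen.append(name)
        reliability = task_reliability(load, [idle_nodes[n][:2] for n in chosen])
        if reliability >= required:
            break
    return chosen, reliability


# Hypothetical nodes: (speed, failure rate, time at which the node becomes free).
nodes = {"p1": (2.0, 0.01, 0.0), "p2": (1.0, 0.02, 0.0), "p3": (4.0, 0.05, 10.0)}
chosen, reliability = place_entry_task(load=10.0, idle_nodes=nodes, required=0.99)
```

In this example one copy on p1 is not reliable enough, so a second copy is placed on p2; if the loop ends without reaching the requirement, the shortfall is left for later tasks to compensate, as the text describes.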
The genetic-algorithm-based multi-copy general scheduling method for a task is:
(6.3.1) Initialize the crossover probability p_c, the mutation probability p_m, the population size GN and the number of generations EN;
(6.3.2) Generate the initial population:
Compute the time at which the message of the copy of predecessor task v_i mapped to node p_k reaches node p_n:
[equation not preserved in the source text]
where FT(v_i, p_k) is the finish time of task v_i on node p_k; rdy(l_{k,n}) is the time at which link l_{k,n} is ready to communicate, i.e. the completion time of the last message on the link; w(e_{i,j}) is the size of the message e_{i,j} between task v_i and task v_j; and w(l_{k,n}) is the amount of data that link l_{k,n} between node p_k and node p_n can transmit per unit time. If the mapped nodes are the same, i.e. p_k = p_n, then rdy(l_{k,n}) is 0 and the communication overhead is 0.
The task encoding scheme encodes, for each node, every position between the minimum valid start-time position and the maximum valid start-time position as a gene of the individual. The minimum valid start-time position EST(v_j, p_n) of task v_j on processor p_n is computed as:
[equation not preserved in the source text]
where pred(v_i) is the set of direct predecessors of task v_i; rep(v_i) is the set of copies of task v_i; and rdy(p_n) is, under the current schedule, the finish time PFT(p_n) of the last task mapped to node p_n:
[equation not preserved in the source text]
where proc(v_i) is the set of processors to which task v_i is mapped.
The maximum valid start-time position of task v_j on processor p_n is LST(v_j, p_n):
[equation not preserved in the source text]
Select a processor node from the idle node queue, select a valid start-time position on that node, map a copy of the task being scheduled to it, and compute the reliability of the task copy. If the reliability of the task does not meet the requirement, continue selecting processor nodes from the idle node queue and valid start-time positions on them until the requirement is met, and take the resulting copy mapping scheme as an individual of the population. Repeat to generate individuals until the population size is reached. If the number of copies reaches M and the reliability of the task still falls short of the requirement, the copy mapping scheme is nevertheless taken as an individual of the population, because the reliability loss of the task can be compensated to some extent when subsequent tasks are scheduled:
[equation not preserved in the source text]
In the formula, the left-hand side is the reliability of the copy of task v_j mapped to node p_n; the formula ranges over the task copies executed on node p_n before the copy being scheduled; Pre_{p_n} is the set of task copies executed on node p_n; ST(v_j, p_n) is the start time of task v_j on node p_n; et_{p,q} is the time at which the message between tasks v_p and v_q starts to be communicated; ON(l_{k,n}) is the set of all communications that take place on link l_{k,n}; et_{p,q} ≤ et_{l,j} (v_p, v_q ∈ V) means that message e_{p,q} on link l_{k,n} starts to be communicated no later than message e_{l,j}; and λ_{l_{k,n}} is the failure probability of the link l_{k,n} between node p_k and node p_n. If the two task copies are mapped to the same node, their link communication time is 0 and the reliability of the message is 1;
The gene corresponding to the valid start-time position to which the task is mapped has value 1, and positions to which the task is not mapped have value 0. In a mapping, at most one position among the genes of each node has value 1, and all other positions are 0;
The encoding also records the number of valid mapping positions of each node, held in the array s. If task v_j is assigned to the k-th valid start-time position of node p_n, then the l-th gene of individual g_j is g_{j,l} = 1, with l given by [equation not preserved in the source text], where |s_i| is the number of valid mapping positions of the node p_i represented by array element s_i, and |s_0| = 0. The length of an encoded individual is [equation not preserved in the source text]. The set of genes of individual g_j that corresponds to array element s_i is [equation not preserved in the source text].
(6.3.3) Apply crossover to the individuals of the population with crossover probability p_c:
For two selected individuals, if a random number is less than the crossover probability p_c, choose from array s the nodes whose encoded gene values differ between the two individuals, exchange the genes of all the chosen nodes between the two individuals, and add the newly generated individuals to the population;
(6.3.4) Apply mutation to the individuals of the population with mutation probability p_m:
Add the newly generated individuals to the population;
(6.3.5) Compute the fitness of each individual g_i in the population with the finish-time evaluation function F_Tim and the reliability evaluation function F_Rel, and sort all individuals in descending order of F_Tim and of F_Rel to obtain two sorted queues of individuals;
(6.3.6) Select individuals from the two queues by a round-robin (RR) mechanism to form the new population, until the population size is reached;
(6.3.7) If the stop condition is not met, repeat steps (6.3.3)-(6.3.6); stop when neither the reliability nor the Makespan has improved within the specified number of generations.
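Steps (6.3.5) and (6.3.6) can be sketched as below. The text does not define F_Tim and F_Rel in this section, so the sketch assumes F_Tim is the reciprocal of the finish time (shorter is fitter) and F_Rel is the reliability itself. It also assumes that an individual already taken from one queue is skipped in the other. These choices, the function name and the sample population are hypothetical.

```python
def round_robin_select(individuals, f_tim, f_rel, size):
    """Alternate between the F_Tim-sorted queue and the F_Rel-sorted queue,
    taking the best not-yet-selected individual from each in turn."""
    queues = [sorted(individuals, key=f_tim, reverse=True),
              sorted(individuals, key=f_rel, reverse=True)]
    positions = [0, 0]
    selected = []
    turn = 0
    while len(selected) < size and any(positions[q] < len(queues[q]) for q in (0, 1)):
        q = turn % 2
        turn += 1
        # Skip individuals that were already picked from the other queue.
        while positions[q] < len(queues[q]) and queues[q][positions[q]] in selected:
            positions[q] += 1
        if positions[q] < len(queues[q]):
            selected.append(queues[q][positions[q]])
            positions[q] += 1
    return selected


# Hypothetical population: name -> (finish time, reliability).
population = {"g1": (10.0, 0.95), "g2": (8.0, 0.90), "g3": (12.0, 0.99), "g4": (9.0, 0.92)}
f_tim = lambda g: 1.0 / population[g][0]   # assumed fitness: shorter finish time is better
f_rel = lambda g: population[g][1]         # assumed fitness: higher reliability is better
survivors = round_robin_select(list(population), f_tim, f_rel, size=3)
```

Alternating between the two queues keeps both fast and reliable individuals in the next generation instead of collapsing the population onto a single objective.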
The beneficial effects of the invention are:
The general scheduling mode of the multi-copy task fault-tolerant scheduling method for heterogeneous distributed systems allows the start time of the copy being scheduled to be the maximum, over only a subset of the predecessor copies, of finish time plus communication time. Not all inter-task messages need to be sent; it suffices to send the dependency messages of the required subset of copies. The method can therefore further reduce the start time of the copy being scheduled and hence the schedule Makespan. Provided the reliability requirement of the task set is met, the method further optimizes the schedule Makespan through repeated crossover and mutation operations of a genetic algorithm, and its reliability calculation accounts for both node failures and link failures. In addition, no idle task copies arise during the evolution of the genetic algorithm.
Brief description of the drawings
Fig. 1 Flow chart of the multi-copy task fault-tolerant scheduling method for heterogeneous distributed systems;
Fig. 2 DAG structure of the scheduled tasks;
Fig. 3 Configuration parameters of the nodes and links in the system;
Fig. 4 First individual generated by population initialization when mapping task v_3;
Fig. 5 Second individual generated by population initialization when mapping task v_3;
Fig. 6 Third individual generated by population initialization when mapping task v_3;
Fig. 7 Fourth individual generated by population initialization when mapping task v_3;
Fig. 8 Fifth individual, generated by the first mutation of individual g_3 when mapping task v_3;
Fig. 9 First individual generated by population initialization when mapping task v_4;
Fig. 10 Second individual generated by population initialization when mapping task v_4;
Fig. 11 Third individual generated by population initialization when mapping task v_4;
Fig. 12 Fourth individual generated by population initialization when mapping task v_4;
Fig. 13 Fifth individual, generated by crossover when mapping task v_4;
Fig. 14 Sixth individual, generated by mutation when mapping task v_4;
Fig. 15 The final scheduling scheme.
Specific embodiment
The invention is described in more detail below with reference to the accompanying drawings:
Blind replication wastes resources; other reliability-aware scheduling methods ignore the schedule Makespan and the failure probability of the links that carry inter-task dependencies; and strict scheduling yields a long schedule Makespan. To address these shortcomings, the object of the invention is to provide a multi-copy task fault-tolerant scheduling method for heterogeneous distributed systems based on active replication. The method is based on general scheduling and uses a multi-copy fault tolerance mechanism; provided the task reliability requirements are met, it further optimizes the schedule Makespan of the task set through the crossover and mutation operations of a genetic algorithm.
The specific steps of the multi-copy task fault-tolerant scheduling method for heterogeneous distributed systems are as follows:
The application with dependency constraints is G = <V, E>, where V = {v_1, v_2, ..., v_N} is the task set, N = |V| is the number of tasks, and E is the set of directed, communication-weighted edges between the tasks in V. The system model is the undirected graph GS = <P, L>, where P = {p_1, p_2, ..., p_M} is the set of M heterogeneous nodes, M = |P|, and L is the set of |L| communication links. The reliability requirement of the task set is R.
1. First, from the load of each task and the execution speed of each node in the system, compute the execution time ET(v_j, p_k) of each task v_j of the application on each node p_k of the system.
2. Compute the average execution time of each task over all processor nodes and the average communication time of each message over all links.
3. Compute the bottom-level priority bl(v_j) of every task v_j in the task set according to formula (1), using the bottom-level priority method.
[equation (1) not preserved in the source text]
where succ(v_j) is the set of direct successors of task v_j, the execution term is the average execution time of v_j over all nodes in the node set P, and the communication term is the average transmission time of message e_{j,i} over all links in the link set L.
4. Add the schedulable tasks to the scheduling queue in non-increasing order of priority. A task is schedulable if all its predecessor tasks have been scheduled or if it has no predecessor tasks.
5. Select the highest-priority task v_j from the schedulable tasks in the scheduling queue and compute its reliability requirement r_x according to formula (2), where x is the position of the task in the priority queue.
[equation (2) not preserved in the source text]
where 1 ≤ x ≤ n and the positions follow the priority ordering of the tasks; R is the reliability requirement of the task set; r'_i is the reliability actually achieved by the task at position i of the priority queue, with r'_0 = 1. If the task is the highest-priority task (i.e. an entry task), its reliability requirement is [equation not preserved in the source text].
6. If the reliability requirement is invalid, i.e. the reliability requirement of task v_j satisfies r_x ≥ 1, reject the task and return. Otherwise call the multi-copy general scheduling method to determine the nodes and start times of the copies of the task.
The multi-copy general scheduling method for a task is implemented as follows:
Given the task v_j being scheduled, the node set P of the system and the task reliability requirement r_x:
(1) First initialize the relevant information. Set the number of copies of task v_j to 0 and its set of mapped nodes to empty. Set the idle node set to the node set P.
(2) If task v_j is an entry task, first select the node with the earliest finish time from the idle node queue to execute a copy of the task (if two nodes have the same finish time, pick either at random). Compute the reliability of task v_j according to formula (3). It suffices that the combined reliability of all copies of the task meets the requirement; dependency messages between tasks need not be considered.
[equation (3) not preserved in the source text]
where proc(v_j) is the set of nodes to which task v_j is mapped, λ_{p_n} is the permanent failure probability of processor node p_n, w(v_j) is the load of task v_j, and w(p_n) is the amount of computation node p_n can perform per unit time.
If the reliability requirement of the task is not met, select the node with the next earliest finish time from the idle queue to execute another copy and recompute the reliability of the task according to formula (3), until the requirement is met. If the idle node queue becomes empty and the requirement is still not met, the reliability shortfall can be made up through the reliability calculation formula when the copies of subsequent tasks are scheduled, so that the reliability requirement of the task set is still met.
(3) If task v_j has predecessor tasks, call the genetic-algorithm-based multi-copy general scheduling method to schedule its copies.
The specific steps of the genetic-algorithm-based multi-copy general scheduling method are as follows:
1) First initialize the crossover probability p_c, the mutation probability p_m, the population size GN and the number of generations EN.
2) Then generate the initial population.
To preserve the dependencies between tasks, the encoding contains only valid start-time positions. A valid start-time position must guarantee that the task can receive the messages sent from the nodes to which its predecessor tasks are mapped. For any predecessor task v_i of the task being scheduled, the time at which the message of the copy of v_i mapped to node p_k reaches node p_n is computed according to formula (4).
[equation (4) not preserved in the source text]
where FT(v_i, p_k) is the finish time of task v_i on node p_k; rdy(l_{k,n}) is the time at which link l_{k,n} is ready to communicate, i.e. the completion time of the last message on the link; w(e_{i,j}) is the size of the message e_{i,j} between task v_i and task v_j; and w(l_{k,n}) is the amount of data that link l_{k,n} between node p_k and node p_n can transmit per unit time. If the mapped nodes are the same, i.e. p_k = p_n, then rdy(l_{k,n}) is 0 and the communication overhead is 0, so the arrival time equals FT(v_i, p_k).
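The arrival-time rule can be sketched as a small Python function. Formula (4) itself is not preserved in this text; the sketch assumes the form that the symbol definitions suggest, namely that transmission starts once both the sending copy has finished and the link is free, and then takes w(e_{i,j}) / w(l_{k,n}) time units. The function and parameter names are hypothetical.

```python
def arrival_time(finish: float, link_ready: float, msg_size: float,
                 bandwidth: float, same_node: bool) -> float:
    """Time at which a predecessor copy's message reaches the target node.

    finish      FT(v_i, p_k): finish time of the predecessor copy
    link_ready  rdy(l_{k,n}): time the link finishes its previous message
    msg_size    w(e_{i,j}):   size of the dependency message
    bandwidth   w(l_{k,n}):   data the link can transmit per unit time
    """
    if same_node:
        # Sender and receiver share a node: no link is used and no time is spent.
        return finish
    # Assumed form: wait for both the sender and the link, then transmit.
    return max(finish, link_ready) + msg_size / bandwidth


busy_link = arrival_time(5.0, 7.0, 6.0, 2.0, same_node=False)   # link frees up after the copy ends
free_link = arrival_time(8.0, 3.0, 4.0, 4.0, same_node=False)   # copy ends after the link is free
local = arrival_time(5.0, 7.0, 6.0, 2.0, same_node=True)        # same node, no communication
```

Because copies of the same predecessor on different nodes give different arrival times, this value is what determines which start-time positions on the target node are valid.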
The task encoding scheme encodes, for each node, every position between the minimum valid start-time position and the maximum valid start-time position as a gene of the individual. The minimum valid start-time position EST(v_j, p_n) of task v_j on processor p_n is computed according to formula (5).
[equation (5) not preserved in the source text]
where pred(v_i) is the set of direct predecessors of task v_i; rep(v_i) is the set of copies of task v_i; and rdy(p_n) is, under the current schedule, the finish time PFT(p_n) of the last task mapped to node p_n, computed as shown in formula (6).
[equation (6) not preserved in the source text]
where proc(v_i) is the set of processors to which task v_i is mapped.
The maximum valid start-time position LST(v_j, p_n) of task v_j on processor p_n is computed according to formula (7).
[equation (7) not preserved in the source text]
Randomly select a processor node from the idle node queue, randomly select a valid start-time position on that node, and map a copy of the task being scheduled to it. Compute the reliability of the task copy according to formula (8). If the reliability of the task does not meet the requirement, continue selecting processor nodes from the idle node queue and valid start-time positions on them until the requirement is met, and take the resulting copy mapping scheme as an individual of the population. Repeat this process to generate individuals until the population size is reached. If the number of copies reaches M and the reliability of the task still falls short of the requirement, the copy mapping scheme is nevertheless taken as an individual of the population, because the reliability loss of the task can be compensated to some extent when subsequent tasks are scheduled.
[equation (8) not preserved in the source text]
Formula (8) gives the reliability of the copy of task v_j mapped on node p_n. It refers to the task copies executed on node p_n before the currently scheduled task copy; Prep_n is the set of task copies executed on node p_n; ST(v_j,p_n) is the start time of task v_j on node p_n; et_{p,q} is the time at which the communication message between tasks v_p and v_q starts transmission; ON(l_{k,n}) is the set of all communications that occur on link l_{k,n}; et_{p,q} ≤ et_{l,j} (v_p, v_q ∈ V) means that message e_{p,q} on link l_{k,n} starts transmission no later than message e_{l,j}. λl_{k,n} is the failure probability of link l_{k,n} between node p_k and node p_n. If the two task copies concerned are mapped to the same node, the link communication time is 0 and the reliability of the communication message is 1.
The encoding gene value corresponding to the effective start time position at which a task is mapped is 1, and the value for positions with no mapped task is 0. During task mapping, to prevent a task from being mapped to the same node more than once, at most one position among the genes corresponding to each node may have the value 1 and all other positions are 0; that is, at most one copy of a given task can be mapped to a given node.
The encoding scheme also records the number of effective mapping positions of each node in the individual's encoding; these counts are represented by the array s. If task v_j is assigned to the k-th effective start time position of node p_n, then the corresponding l-th gene of individual g_j is set, g_{j,l} = 1. |s_i| is the number of effective mapping positions of the node p_i represented by array element s_i, and |s_0| = 0. The length of the encoded individual and the gene set that corresponds to array element s_i in individual g_j are both defined in terms of the array s.
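For illustration, the encoding can be sketched as a flat bit list. The expressions for the gene index and the individual length are not reproduced in this text, so the sketch assumes that the genes of the nodes are laid out one after another, that the individual length is the sum of all |s_i|, and that the k-th position of node p_n lands at index |s_1| + ... + |s_(n-1)| + k. The function name and the sample counts are hypothetical.

```python
def encode(position_counts, assignment):
    """Build the 0/1 gene list of one individual.

    position_counts[i] is |s_(i+1)|, the number of effective start time
    positions of node p_(i+1). assignment maps a 1-based node index to the
    1-based position k chosen on that node. Using a dict guarantees that a
    node carries at most one copy of the task.

    ASSUMPTION: genes of consecutive nodes are concatenated, so the gene
    index is the sum of the preceding counts plus k.
    """
    offsets = [0]  # offsets[n - 1] = number of genes before node p_n
    for count in position_counts:
        offsets.append(offsets[-1] + count)

    genes = [0] * offsets[-1]  # individual length = sum of all |s_i|
    for node, k in assignment.items():
        if not 1 <= k <= position_counts[node - 1]:
            raise ValueError(f"position {k} is not valid on node p{node}")
        genes[offsets[node - 1] + k - 1] = 1
    return genes


# Hypothetical example: three nodes with 3, 2 and 4 effective positions;
# copies on node p1 (2nd position) and node p3 (1st position).
print(encode([3, 2, 4], {1: 2, 3: 1}))
```

Keeping the per-node counts alongside the bit list is what lets crossover and mutation address "the genes of node p_n" as one contiguous slice.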
3) Crossover is applied to all individuals in the population according to the crossover probability p_c. If the random number is less than the crossover probability p_c, then for the two selected individuals, one or more nodes in array s whose encoding gene values differ between the two individuals are chosen at random, and the genes corresponding to all chosen nodes are swapped between the two individuals. Finally, the newly generated individuals are added to the population.
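The gene swap of step 3) can be sketched as follows. As a simplifying assumption, an individual is represented here as a dictionary from node name to the start time of the copy on that node (a node without a copy is simply absent) rather than as the bit string described above; the function name is hypothetical. The sample values are those of individuals g_1 and g_4 in the worked example for task v_4 later in this description.

```python
def crossover(parent_a, parent_b, nodes):
    """Swap the genes of the chosen nodes between two individuals.

    Each individual is a dict {node: start_time}; this dictionary form is an
    illustrative stand-in for the per-node bit groups of the encoding.
    """
    child_a, child_b = dict(parent_a), dict(parent_b)
    for node in nodes:
        gene_a, gene_b = parent_a.get(node), parent_b.get(node)
        # Each child receives the other parent's gene for this node.
        for child, gene in ((child_a, gene_b), (child_b, gene_a)):
            if gene is None:
                child.pop(node, None)  # other parent had no copy here
            else:
                child[node] = gene
    return child_a, child_b


g1 = {"p1": 40, "p2": 38}
g4 = {"p3": 34, "p4": 35}
child_a, child_b = crossover(g1, g4, ["p2", "p3"])
print(child_a)  # copies on p1 and p3
print(child_b)  # copies on p2 and p4
```

Swapping nodes p_2 and p_3 yields one child with copies on p_2 and p_4, which is the mapping the worked example reports for the new individual g_5.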
4) Mutation is applied to all individuals in the population according to the mutation probability p_m. If the random number is less than the mutation probability p_m, the selected individual is mutated at a randomly chosen node position in array s. Mutation takes two forms: changing the start time position of a task copy on its mapping node, and changing the mapping node of a task copy.
If the reliability of the individual is lower than the reliability requirement of the currently scheduled task and the node chosen from array s already has a mapped task copy, the start time of the task mapped on the chosen node is postponed in order to increase the task reliability.
If the reliability of the individual is lower than the reliability requirement of the currently scheduled task and the node chosen from array s has no mapped task copy, the gene value of a start time position on the chosen node is set to 1, adding a new task copy to improve reliability.
If the reliability of the individual is higher than the reliability requirement of the currently scheduled task and an effective start time position earlier than the copy's current start time exists, the start time of the task mapped on the chosen node is moved earlier. This may reduce the task reliability, which is acceptable as long as the reliability requirement is still met.
If the reliability of the individual is higher than the reliability requirement of the currently scheduled task and the copy's start time is already the earliest effective start time on its mapping node, the gene value of that start time position is set to 0, cancelling the copy on the chosen node. This may reduce reliability, which is acceptable as long as the task reliability requirement is still met.
Finally, the newly generated individual is added to the population.
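The four mutation cases of step 4) amount to a small decision rule, sketched below. The function and action names are hypothetical, and the "no_change" result for combinations the text does not cover (for example, reliability above the requirement on a node without a copy) is an assumption of this sketch.

```python
def choose_mutation(reliability, requirement, node_has_copy,
                    earlier_position_exists):
    """Pick the mutation applied to the chosen node of an individual."""
    if reliability < requirement:
        # Too unreliable: postpone the existing copy, or add a new one.
        return "postpone_start" if node_has_copy else "add_copy"
    if reliability > requirement and node_has_copy:
        # More reliable than needed: start earlier if possible, otherwise
        # cancel the copy (the requirement must still hold afterwards).
        return "advance_start" if earlier_position_exists else "remove_copy"
    # ASSUMPTION: combinations not described in the text are left unchanged.
    return "no_change"


# Hypothetical reliability values.
print(choose_mutation(0.9990, 0.9995, True, False))   # postpone_start
print(choose_mutation(0.9990, 0.9995, False, False))  # add_copy
print(choose_mutation(0.9999, 0.9995, True, True))    # advance_start
print(choose_mutation(0.9999, 0.9995, True, False))   # remove_copy
```

The rule only selects the kind of change; after applying it, the reliability of the mutated individual would still have to be recomputed with formula (8) and checked against the requirement.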
5) The fitness of each individual g_i in the population is calculated with the completion time evaluation function F_Tim of formula (9) and the reliability evaluation function F_Rel of formula (10). Sorting all individuals by F_Tim and by F_Rel in descending order of function value yields two sorted queues of individuals.
6) Individuals are selected from the two queues by a round-robin (RR) mechanism to form the new population, until the required population size is reached.
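The round-robin selection of step 6) can be sketched as follows. The text does not state which queue is served first or how an individual already picked from the other queue is handled; the sketch assumes the completion time queue goes first and that already selected individuals are skipped, which reproduces the selection reported in the worked example for task v_3 later in this description. The function name is hypothetical.

```python
def round_robin_select(time_queue, reliability_queue, population_size):
    """Alternate between the two sorted queues until enough are selected.

    ASSUMPTIONS: the completion time queue is served first, and an
    individual that was already selected is skipped.
    """
    queues = [list(time_queue), list(reliability_queue)]
    selected = []
    turn = 0
    while len(selected) < population_size and any(queues):
        queue = queues[turn % 2]
        turn += 1
        # Drop heads that the other queue has already contributed.
        while queue and queue[0] in selected:
            queue.pop(0)
        if queue:
            selected.append(queue.pop(0))
    return selected


# Queues from the first generation of the worked example for task v3.
time_queue = ["g3", "g5", "g1", "g2", "g4"]
reliability_queue = ["g1", "g3", "g2", "g4", "g5"]
print(round_robin_select(time_queue, reliability_queue, 4))
```

Alternating between the two rankings keeps both fast and reliable individuals in the population instead of collapsing the two objectives into one weighted score.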
7) If the stop condition is not met, steps 3)-6) are repeated. The search stops after a set number of generations, or when the algorithm converges (neither reliability nor Makespan improves significantly within the prescribed number of generations).
7. The scheduled task is deleted from the scheduling queue, and tasks that have newly become schedulable are added to the scheduling queue according to their priority. The next highest-priority task in the scheduling queue is then selected for scheduling, and steps 5-7 are repeated until all tasks have been scheduled.
Fig. 1 shows the flow chart of the multi-copy task fault-tolerant scheduling method for a heterogeneous distributed system. The implementation of the method is described in detail below with reference to the flow chart and an example.
The example schedules the task set V = {v_1, v_2, v_3, v_4} of Fig. 2 onto the heterogeneous distributed system whose configuration parameters are given in Fig. 3, with node set P = {p_1, p_2, p_3, p_4, p_5}; the reliability requirement R is 0.999.
1. First, according to the load of each task and the execution speed of each node in the system, the execution time of each task on each node of the system is calculated. The results are: the execution times of task v_1 on the five nodes are {18, 9, 9, 18, 6}; those of task v_2 are {20, 10, 10, 20, 6.7}; those of task v_3 are {22, 11, 11, 22, 7.3}; and those of task v_4 are {24, 12, 12, 24, 8}.
2. The average execution time of each task over all processor nodes is calculated: 12 for task v_1, 13.3 for task v_2, 14.7 for task v_3 and 16 for task v_4. The average communication time of each communication message over all links is calculated: 8.5 for message e_{1,2}, 10.6 for message e_{1,3}, 6.4 for message e_{2,4} and 12.8 for message e_{3,4}.
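The averages of step 2 can be checked directly from the execution times listed in step 1. The short Python sketch below uses only those listed values; the variable and function names are hypothetical.

```python
# Execution times of each task on nodes p1..p5, as listed in step 1.
exec_times = {
    "v1": [18, 9, 9, 18, 6],
    "v2": [20, 10, 10, 20, 6.7],
    "v3": [22, 11, 11, 22, 7.3],
    "v4": [24, 12, 12, 24, 8],
}


def average(values):
    """Arithmetic mean over all nodes."""
    return sum(values) / len(values)


# Rounded to one decimal place, as in the text.
avg_exec = {task: round(average(times), 1) for task, times in exec_times.items()}
print(avg_exec)
```

The message averages cannot be recomputed in the same way here, because the per-link transmission times depend on the link parameters of Fig. 3, which are not listed in the text.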
3. The priority of each task is calculated with the bottom-level priority method according to formula (1): bl(v_1) is 54.1, bl(v_2) is 35.7, bl(v_3) is 43.5 and bl(v_4) is 16. The resulting priority order of the tasks is {v_1, v_3, v_2, v_4}.
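The bottom-level calculation can be sketched recursively. Formula (1) is not reproduced in this text, so the sketch assumes the usual recurrence: the bottom level of a task is its average execution time plus the largest value, over its direct successors, of average message time plus successor bottom level. Under that assumption the values for v_2, v_3 and v_4 match the figures above, while v_1 evaluates to 66.1 rather than the quoted 54.1 (54.1 equals the successor term alone); the priority order is the same either way. All names are hypothetical.

```python
AVG_EXEC = {1: 12.0, 2: 13.3, 3: 14.7, 4: 16.0}       # averages from step 2
AVG_COMM = {(1, 2): 8.5, (1, 3): 10.6, (2, 4): 6.4, (3, 4): 12.8}


def bottom_level(task, avg_exec, avg_comm, cache=None):
    """Bottom-level priority of a task in the task graph.

    ASSUMPTION: bl(v) = avg_exec(v) + max over successors s of
    (avg_comm(v, s) + bl(s)), with bl of an exit task equal to avg_exec.
    """
    if cache is None:
        cache = {}
    if task not in cache:
        successors = [j for (i, j) in avg_comm if i == task]
        tail = max(
            (avg_comm[(task, j)] + bottom_level(j, avg_exec, avg_comm, cache)
             for j in successors),
            default=0.0,  # exit task: no successors
        )
        cache[task] = avg_exec[task] + tail
    return cache[task]


levels = {t: round(bottom_level(t, AVG_EXEC, AVG_COMM), 1) for t in AVG_EXEC}
order = sorted(levels, key=levels.get, reverse=True)
print(levels)
print(order)  # task indices in non-increasing priority
```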
4. The schedulable task v_1 is added to the scheduling queue according to its priority.
5. The highest-priority task v_1 is selected from the schedulable tasks in the scheduling queue, and its reliability requirement is calculated according to formula (2).
6. The general multi-copy task scheduling method is called to determine the nodes and start times for the copies of task v_1.
(1) First, the number of copies of task v_1 is set to 0 and its mapping node set is set to empty. The idle node set is set to the node set P.
(2) The node with the earliest completion time in the idle node queue, p_1, is chosen to execute a task copy. The reliability of task v_1 calculated according to formula (3) is 0.998202, which does not meet the reliability requirement of the task. The node with the earliest completion time among the remaining idle nodes, p_4, is therefore chosen to execute a further copy; the reliability of the task calculated according to formula (3) is then 0.999981, which meets the requirement. Thus r'_1 is 0.999981.
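The reliability of an entry task with several copies follows the expression given in the claims, P[E_vj] = 1 - ∏(1 - exp{-λp_n * w(v_j)/w(p_n)}), where w(v_j)/w(p_n) is the execution time of the task on node p_n. The Python sketch below implements that expression. The failure rates 0.0001 for p_1 and 0.0006 for p_4 are assumed values, chosen only because they reproduce the quoted figures to six decimal places; the actual parameters are in Fig. 3, which is not reproduced here.

```python
import math


def task_reliability(exec_times, failure_rates):
    """Reliability of a task replicated on several nodes.

    The task fails only if every copy fails; one copy on a node with
    failure rate lam and execution time t succeeds with exp(-lam * t).
    """
    all_copies_fail = 1.0
    for t, lam in zip(exec_times, failure_rates):
        all_copies_fail *= 1.0 - math.exp(-lam * t)
    return 1.0 - all_copies_fail


# Task v1 runs for 18 time units on both p1 and p4 (see step 1).
# ASSUMED failure rates, not taken from the patent figures.
LAMBDA_P1, LAMBDA_P4 = 0.0001, 0.0006

print(round(task_reliability([18], [LAMBDA_P1]), 6))                     # one copy
print(round(task_reliability([18, 18], [LAMBDA_P1, LAMBDA_P4]), 6))      # two copies
```

Adding a copy multiplies the probability that all copies fail by a further factor below 1, which is why each extra copy raises the task reliability.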
7. The scheduled task v_1 is deleted from the scheduling queue, and the newly schedulable tasks v_2 and v_3 are added to the scheduling queue according to their priority. The highest-priority task v_3 is selected for scheduling next.
8. The highest-priority task v_3 is selected from the schedulable tasks in the scheduling queue, and its reliability requirement is calculated according to formula (2).
9. The general multi-copy task scheduling method is called to determine the nodes and start times for the copies of task v_3.
(1) First, the number of copies of task v_3 is set to 0 and its mapping node set is set to empty. The idle node set is set to the node set P.
(2) The genetic-algorithm-based general multi-copy task scheduling algorithm is called to schedule the copies, in the following steps:
1) First, the crossover probability p_c = 0.5, the mutation probability p_m = 0.25, the population size GN = 4 and the number of generations EN = 3 are initialized.
2) The initial population is then generated.
First, individual g_1 is generated by mapping task v_3 to node p_1. Here EST(v_3,p_1) is 18 and LST(v_3,p_1) is 38. The start time position of the copy is 18 (the notation for this position identifies the task copy, the node p_n, and the predecessor task copy of v_j to which it corresponds). The reliability of task v_3 is now 0.996008. Node p_2 is then chosen to map a further copy of task v_3. Here EST(v_3,p_2) is 23 and LST(v_3,p_2) is 28. The start time position of this copy is 23. The reliability of task v_3 is now 0.999967. The encoding of individual g_1 for task v_3 is shown in Fig. 4.
The second individual g_2 is then generated by mapping task v_3 to node p_1. Here EST(v_3,p_1) is 18 and LST(v_3,p_1) is 38. The start time position of the copy is 18. The reliability of task v_3 is now 0.996008. Node p_4 is then chosen to map a further copy of task v_3. Here EST(v_3,p_4) is 18 and LST(v_3,p_4) is 38. The start time position of this copy is 18. The reliability of task v_3 is now 0.999905. The encoding of individual g_2 for task v_3 is shown in Fig. 5.
The third individual g_3 is then generated by mapping task v_3 to node p_2. Here EST(v_3,p_2) is 23 and LST(v_3,p_2) is 28. The start time position of the copy is 23. The reliability of task v_3 is now 0.991635. Node p_3 is then chosen to map a further copy of task v_3. Here EST(v_3,p_3) is 23 and LST(v_3,p_3) is 28. The start time position of this copy is 23. The reliability of task v_3 is now 0.999946. The encoding of individual g_3 for task v_3 is shown in Fig. 6.
Finally, the fourth individual g_4 is generated by mapping task v_3 to node p_2. Here EST(v_3,p_2) is 23 and LST(v_3,p_2) is 28. The start time position of the copy is 23. The reliability of task v_3 is now 0.991635. Node p_4 is then chosen to map a further copy of task v_3. Here EST(v_3,p_4) is 18 and LST(v_3,p_4) is 38. The start time position of this copy is 18. The reliability of task v_3 is now 0.999802. The encoding of individual g_4 for task v_3 is shown in Fig. 7.
3) The first crossover is applied to all individuals in the population according to the crossover probability p_c = 0.5. Assume that the random number for every crossover operation is less than the crossover probability p_c, so no crossover is performed.
4) Mutation is applied to all individuals in the population according to the mutation probability p_m = 0.25. Assume that when the third individual g_3 is considered the random number is greater than the mutation probability p_m, so the third individual is mutated. The mapping position of node p_2 in the third individual is chosen at random for mutation. The reliability of the individual is higher than the reliability requirement of task v_3, and the start time of the task copy is the earliest effective start time on its mapping node, so the gene value of that start time position on node p_2 is set to 0, cancelling the copy on the mutated node p_2 and producing individual g_5. The encoding of individual g_5 generated by mutation for task v_3 is shown in Fig. 8. The newly generated individual is added to the population.
5) The fitness of each individual in the population is calculated with the completion time evaluation function F_Tim of formula (9) and the reliability evaluation function F_Rel of formula (10). Sorting by F_Tim and by F_Rel in descending order of function value yields two sorted queues of individuals. Completion time evaluation queue: g_3, g_5, g_1, g_2, g_4. Reliability evaluation queue: g_1, g_3, g_2, g_4, g_5.
6) Individuals are selected from the two queues by the RR mechanism to form the new population, until the required population size is reached. The selected individuals are g_3, g_1, g_5, g_2. The first generation is complete.
7) The remaining generations proceed according to steps 3)-6) above. In the mutation of the second generation, assume that individual g_5 meets the mutation condition and that the mutation adds a mapped copy of task v_3 on node p_5. After three generations, the final copy mapping scheme of task v_3, consisting of two copies, is produced. The start time of one copy is 23 and that of the other is 21.3; the completion time of task v_3 is 34 and its reliability is 0.999970. Thus r'_2 is 0.999970.
10. The scheduled task v_3 is deleted from the scheduling queue, and the next highest-priority task in the scheduling queue, v_2, is selected for scheduling. The reliability requirement of task v_2 is calculated according to formula (2).
11. The general multi-copy task scheduling method is called to determine the nodes and start times for the copies of task v_2. The calculation is similar to that for task v_3 and is not repeated here. After three generations, the final task copy mapping scheme, consisting of two copies, is produced. The start time of one copy is 18 and that of the other is 22; the completion time of task v_2 is 38 and its reliability is 0.999973. Thus r'_3 is 0.999973.
12. The scheduled task v_2 is deleted from the scheduling queue, and the newly schedulable task v_4 is added to the scheduling queue. The highest-priority task v_4 is selected for scheduling next.
13. The highest-priority task v_4 is selected from the schedulable tasks in the scheduling queue, and its reliability requirement r_4 is calculated according to formula (2): r_4 = R/(r'_1 * r'_2 * r'_3) = 0.99907593.
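The arithmetic of step 13 can be reproduced with a few lines of Python. The sketch covers only the case shown in the text, the last task in priority order, whose requirement is the overall requirement R divided by the product of the reliabilities already achieved; the general form of formula (2) for earlier tasks is not reproduced in this text. The validity check mirrors step (6) of the claims, which refuses a task whose requirement is 1 or more. The function names are hypothetical.

```python
import math


def last_task_requirement(overall_requirement, achieved):
    """Requirement left for the final task: R / (r'_1 * r'_2 * ...)."""
    return overall_requirement / math.prod(achieved)


def is_valid_requirement(requirement):
    """A requirement of 1 or more cannot be met, so the task is refused."""
    return requirement < 1.0


achieved = [0.999981, 0.999970, 0.999973]   # r'_1, r'_2, r'_3 from the text
r4 = last_task_requirement(0.999, achieved)
print(round(r4, 8), is_valid_requirement(r4))
```

Because the earlier tasks exceeded what they strictly needed, the requirement left for v_4 stays only slightly above R itself.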
14. The general multi-copy task scheduling method is called to determine the nodes and start times for the copies of task v_4.
(1) First, the number of copies of task v_4 is set to 0 and its mapping node set is set to empty. The idle node set is set to the node set P.
(2) The genetic-algorithm-based general multi-copy task scheduling algorithm is called to schedule the copies, in the following steps:
1) First, the crossover probability p_c = 0.5, the mutation probability p_m = 0.25, the population size GN = 4 and the number of generations EN = 3 are initialized.
2) The initial population is then generated.
First, individual g_1 is generated by mapping task v_4 to node p_1. Here EST(v_4,p_1) is 40 and LST(v_4,p_1) is 40.6. The start time position of the copy is 40. The reliability of task v_4 is now 0.992627. Node p_2 is then chosen to map a further copy of task v_4. Here EST(v_4,p_2) is 38 and LST(v_4,p_2) is 52.6. The start time position of this copy is 38. The reliability of task v_4 is now 0.999927. The encoding of individual g_1 for task v_4 is shown in Fig. 9.
The second individual g_2 is then generated by mapping task v_4 to node p_1. Here EST(v_4,p_1) is 40 and LST(v_4,p_1) is 40.6. The start time position of the copy is 40. The reliability of task v_4 is now 0.992627. Node p_3 is then chosen to map a further copy of task v_4. Here EST(v_4,p_3) is 34 and LST(v_4,p_3) is 52.6. The start time position of this copy is 34. The reliability of task v_4 is now 0.999911. The encoding of individual g_2 for task v_4 is shown in Fig. 10.
The third individual g_3 is then generated by mapping task v_4 to node p_2. Here EST(v_4,p_2) is 38 and LST(v_4,p_2) is 52.6. The start time position of the copy is 38. The reliability of task v_4 is now 0.990050. Node p_3 is then chosen to map a further copy of task v_4. Here EST(v_4,p_3) is 34 and LST(v_4,p_3) is 52.6. The start time position of this copy is 41. The reliability of task v_4 is now 0.999886. The encoding of individual g_3 for task v_4 is shown in Fig. 11.
Finally, the fourth individual g_4 is generated by mapping task v_4 to node p_3. Here EST(v_4,p_3) is 34 and LST(v_4,p_3) is 52.6. The start time position of the copy is 34. The reliability of task v_4 is now 0.987973. Node p_4 is then chosen to map a further copy of task v_4. Here EST(v_4,p_4) is 35 and LST(v_4,p_4) is 50. The start time position of this copy is 35. The reliability of task v_4 is now 0.999603. The encoding of individual g_4 for task v_4 is shown in Fig. 12.
3) The first crossover is applied to all individuals in the population according to the crossover probability p_c = 0.5. Assume that only for individuals g_1 and g_4 is the random number greater than the crossover probability p_c, so crossover is performed on them. The two randomly chosen positions in array s span the mapping positions of array elements s_2 and s_3. The encoding genes corresponding to nodes p_2 and p_3 are therefore swapped between the two individuals, producing the new individual g_5. Individual g_5 maps task v_4 to nodes p_2 and p_4. Here EST(v_4,p_2) is 38 and LST(v_4,p_2) is 52.6, and the start time position of the copy on p_2 is 38; EST(v_4,p_4) is 35 and LST(v_4,p_4) is 50, and the start time position of the copy on p_4 is 35. The reliability of task v_4 is now 0.999671. The newly generated individual g_5 is added to the population. The encoding of individual g_5 generated by crossover for task v_4 is shown in Fig. 13.
4) Mutation is applied to all individuals in the population according to the mutation probability p_m = 0.25. Assume that when the third individual g_3 is considered the random number is greater than the mutation probability p_m, so the third individual is mutated. The mapping position of node p_3 in the third individual is chosen at random for mutation. The reliability of the individual is higher than the reliability requirement of task v_4, and effective start times exist before the start time of the task copy, so the start time position on node p_3 is moved earlier; the start time position of the copy becomes 34. This produces individual g_6, whose reliability is 0.999880, and the newly generated individual g_6 is added to the population. The encoding of individual g_6 generated by mutation for task v_4 is shown in Fig. 14.
5) The fitness of each individual in the population is calculated with the completion time evaluation function F_Tim of formula (9) and the reliability evaluation function F_Rel of formula (10). Sorting by F_Tim and by F_Rel in descending order of function value yields two sorted queues of individuals. Completion time evaluation queue: g_6, g_3, g_4, g_5, g_1, g_2. Reliability evaluation queue: g_1, g_2, g_3, g_6, g_5, g_4.
6) Individuals are selected from the two queues by the RR mechanism to form the new population, until the required population size is reached. The selected individuals are g_6, g_1, g_3, g_2. The first generation is complete.
7) The remaining generations proceed according to steps 3)-6) above. After three generations, the final copy mapping scheme of task v_4, consisting of two copies, is produced. The start time of one copy is 38 and that of the other is 34; the completion time of task v_4 is 50 and its reliability is 0.999880.
15. The scheduling scheme finally generated is shown in Fig. 15. The scheduling Makespan of the task set is 50 and its reliability is 0.99980401.
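The final reliability figure can be cross-checked from the per-task results quoted in the example. The Python sketch below assumes that the reliability of the task set is the product of the reliabilities achieved by the individual tasks, which is consistent with the way r_4 is derived in step 13; the names are hypothetical.

```python
import math

# Reliability achieved by each task, as quoted in the worked example.
achieved = {"v1": 0.999981, "v3": 0.999970, "v2": 0.999973, "v4": 0.999880}
REQUIRED = 0.999  # task-set reliability requirement R


def task_set_reliability(task_reliabilities):
    """ASSUMPTION: every task must succeed, so reliabilities multiply."""
    return math.prod(task_reliabilities)


overall = task_set_reliability(achieved.values())
print(round(overall, 8), overall >= REQUIRED)
```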
Claims (1)
1. A multi-copy task fault-tolerant scheduling method for a heterogeneous distributed system, characterized in that:
(1) according to the load of each task and the execution speed of each node in the system, the execution time ET(v_j,p_k) of each task v_j of the application when scheduled onto each node p_k of the system is calculated; the application with dependency constraints is G = <V,E>, where V = {v_1, v_2, ..., v_N} is the task set, N = |V| is the number of tasks, and E is the set of directed, weighted communication edges between the tasks in V; the system model is the undirected graph GS = <P,L>, where P = {p_1, p_2, ..., p_M} is the set of M heterogeneous nodes, M = |P|, and L is the set of |L| communication links; the reliability requirement of the task set is R;
(2) the average execution time of each task over all processor nodes and the average communication time of each communication message over all links are calculated;
(3) the bottom-level priority bl(v_j) of any task v_j in the task set is calculated with the bottom-level priority method:
the formula uses succ(v_j), the set of direct successor tasks of task v_j; the average execution time of task v_j over all nodes in the node set P; and the average transmission time of message e_{j,i} over all links of the link set L in the system;
(4) the schedulable tasks are added to the scheduling queue in non-increasing order of priority;
(5) the highest-priority task v_j is selected from the schedulable tasks in the scheduling queue and its reliability requirement r_x is calculated, where x is the position of the task in the priority queue:
in the formula, 1 ≤ x ≤ n, with the tasks ordered by priority; R is the reliability requirement of the task set; r'_i is the reliability actually achieved by the task at position i of the priority queue, and r'_0 = 1; if the task is the highest-priority task, i.e. the entry task, its reliability requirement is determined accordingly;
(6) if the reliability requirement is invalid, i.e. the reliability requirement of task v_j satisfies r_x ≥ 1, scheduling of the task is refused and the method returns; otherwise the general multi-copy task scheduling method is called to calculate the nodes and start times for the copies of the task;
(7) the scheduled task is deleted from the scheduling queue, and newly schedulable tasks are added to the scheduling queue according to priority; the next highest-priority task in the scheduling queue is selected for scheduling, and steps (5)-(7) are repeated until all tasks have been scheduled;
the general multi-copy task scheduling method is:
(6.1) initialization: the number of copies of task v_i is set to 0, its mapping node set is set to empty, and the idle node set is set to the node set P;
(6.2) if task v_j is an entry task, the node with the earliest completion time in the idle node queue is chosen to execute a task copy, and the reliability P[E_vj] of task v_j is calculated:
P[E_vj] = 1 - ∏_{p_n ∈ proc(v_j)} (1 - exp{-λp_n * w(v_j)/w(p_n)})
where proc(v_j) is the set of nodes to which task v_j is mapped, λp_n is the permanent failure probability of processor node p_n, w(v_j) is the load of task v_j, and w(p_n) is the amount of computation that node p_n can perform per unit time; if the reliability requirement of the task is not met, the idle node with the earliest completion time is again chosen to execute a further task copy and the task reliability is recalculated, until the reliability requirement of the task is met; if the idle node queue becomes empty and the reliability requirement is still not met, the reliability shortfall is made up through the reliability formula when copies of subsequent tasks are scheduled;
(6.3) if task v_j has predecessor tasks, the genetic-algorithm-based general multi-copy task scheduling method is called to schedule its copies;
the genetic-algorithm-based general multi-copy task scheduling method is:
(6.3.1) the crossover probability p_c, the mutation probability p_m, the population size GN and the number of generations EN are initialized;
(6.3.2) the initial population is generated:
the time at which the message from the task copy of predecessor task v_i of the currently scheduled task, mapped on node p_k, arrives at node p_n is calculated;
in the formula, FT(v_i,p_k) is the completion time of task v_i on node p_k; rdy(l_{k,n}) is the time at which link l_{k,n} is ready to communicate, i.e. the completion time of the last message transmitted on the link; w(e_{i,j}) is the size of the communication message e_{i,j} between task v_i and task v_j; w(l_{k,n}) is the amount of data that link l_{k,n} between node p_k and node p_n can transmit per unit time; if the mapping nodes are the same, i.e. p_k = p_n, then rdy(l_{k,n}) is 0 and the communication overhead is 0;
the task encoding scheme encodes, for each node, every position between the minimum effective start time position and the maximum effective start time position as a gene of the individual; the minimum effective start time position EST(v_j,p_n) of task v_j on processor p_n is calculated;
in the formula, pred(v_i) is the set of direct predecessor tasks of task v_i; rep(v_i) is the set of copies of task v_i; rdy(p_n) is, under the current schedule, the completion time PFT(p_n) of the last task mapped to node p_n:
PFT(p_k) = max_{v_i ∈ V, p_k ∈ proc(v_i)} {FT(v_i,p_k)}
where proc(v_i) is the set of processors to which task v_i is mapped;
the maximum effective start time position LST(v_j,p_n) of task v_j on processor p_n is calculated;
a processor node is chosen from the idle node queue, an effective start time position is chosen on that processor node, and a copy of the currently scheduled task is mapped there; the reliability of the task copy is calculated; if the reliability of the task does not meet the requirement, further processor nodes are chosen from the idle node queue and an effective start time position is chosen on each, until the reliability of the task meets the requirement; the task copy mapping scheme is taken as an individual in the population, and individuals are generated repeatedly until the population size is reached; if the number of task copies reaches M and the reliability of the task still does not meet the reliability requirement, the task copy mapping scheme is nevertheless taken as an individual in the population, because the reliability shortfall of the task can be compensated to an appropriate extent when subsequent tasks are scheduled;
the reliability formula gives the reliability of the copy of task v_j mapped on node p_n and refers to the task copies executed on node p_n before the currently scheduled task copy; Prep_n is the set of task copies executed on node p_n; ST(v_j,p_n) is the start time of task v_j on node p_n; et_{p,q} is the time at which the communication message between tasks v_p and v_q starts transmission; ON(l_{k,n}) is the set of all communications that occur on link l_{k,n}; et_{p,q} ≤ et_{l,j} (v_p, v_q ∈ V) means that message e_{p,q} on link l_{k,n} starts transmission no later than message e_{l,j}; λl_{k,n} is the failure probability of link l_{k,n} between node p_k and node p_n; if the two task copies concerned are mapped to the same node, the link communication time is 0 and the reliability of the communication message is 1;
the encoding gene value corresponding to the effective start time position at which a task is mapped is 1, and the value for positions with no mapped task is 0; during task mapping, at most one position among the genes corresponding to each node has the value 1, and all other positions are 0;
the encoding also records the number of effective mapping positions of each node in the individual's encoding, represented by the array s; if task v_j is assigned to the k-th effective start time position of node p_n, then the corresponding l-th gene of individual g_j is set, g_{j,l} = 1; |s_i| is the number of effective mapping positions of the node p_i represented by array element s_i, and |s_0| = 0; the length of the encoded individual and the gene set that corresponds to array element s_i in individual g_j are both defined in terms of the array s;
(6.3.3) crossover is applied to all individuals in the population according to the crossover probability p_c:
if the random number is less than the crossover probability p_c, then for the two selected individuals, nodes in array s whose encoding gene values differ between the two individuals are chosen, the genes corresponding to all chosen nodes are swapped between the two individuals, and the new individuals generated are added to the population;
(6.3.4) mutation is applied to all individuals in the population according to the mutation probability p_m:
the newly generated individual is added to the population;
(6.3.5) the fitness of each individual g_i in the population is calculated with the completion time evaluation function F_Tim and the reliability evaluation function F_Rel, and sorting all individuals by F_Tim and by F_Rel in descending order of function value yields two sorted queues of individuals;
(6.3.6) individuals are selected from the two queues by the RR mechanism to form the new population, until the required population size is reached;
(6.3.7) if the stop condition is not met, steps (6.3.3)-(6.3.6) are repeated; if neither reliability nor Makespan improves within the prescribed number of generations, the search stops.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410216137.0A CN103970613B (en) | 2014-05-21 | 2014-05-21 | Multi-copy task fault tolerance scheduling method of heterogeneous distributed system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103970613A CN103970613A (en) | 2014-08-06 |
CN103970613B true CN103970613B (en) | 2017-05-24 |
Family
ID=51240145
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410216137.0A Active CN103970613B (en) | 2014-05-21 | 2014-05-21 | Multi-copy task fault tolerance scheduling method of heterogeneous distributed system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103970613B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108628708A (en) * | 2017-03-20 | 2018-10-09 | 中兴通讯股份有限公司 | Cloud computing fault-tolerance approach and device |
CN108108233B (en) * | 2017-11-29 | 2021-10-01 | 上海交通大学 | Cluster job scheduling method and system for task multi-copy execution |
CN109254841B (en) * | 2018-09-30 | 2021-11-26 | 湘潭大学 | Dual-objective optimization task scheduling method for distributed system |
CN109976890B (en) * | 2019-03-28 | 2023-05-30 | 东南大学 | Variable frequency method for minimizing heterogeneous private cloud computing resource energy consumption |
CN111090783B (en) * | 2019-12-18 | 2023-10-03 | 北京百度网讯科技有限公司 | Recommendation method, device and system, graph embedded wandering method and electronic equipment |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102799474A (en) * | 2012-06-21 | 2012-11-28 | 浙江工商大学 | Cloud resource fault-tolerant scheduling method based on reliability drive |
-
2014
- 2014-05-21 CN CN201410216137.0A patent/CN103970613B/en active Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102799474A (en) * | 2012-06-21 | 2012-11-28 | 浙江工商大学 | Cloud resource fault-tolerant scheduling method based on reliability drive |
Non-Patent Citations (3)
Title |
---|
A Resource Minimizing Scheduling Algorithm with Ensuring the Deadline and Reliability in Heterogeneous Systems; Laiping Zhao et al.; 《2011 IEEE International Conference on Advanced Information Networking and Application》; 20111231; full text * |
Genetic Algorithm based Scheduling Method for Efficiency and Reliability in Mobile Grid; SungHo Chin et al.; 《Proceedings of the 4th International Conference on Ubiquitous Information Technologies & Applications, 2009》; 20091231; full text * |
Optimizing Makespan and Reliability for Workflow Applications with Reputation and Look-ahead Genetic Algorithm; Wang X et al.; 《Future Generation Computer Systems》; 20110315; Vol. 27 (No. 8); full text * |
Also Published As
Publication number | Publication date |
---|---|
CN103970613A (en) | 2014-08-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103970613B (en) | Multi-copy task fault tolerance scheduling method of heterogeneous distributed system | |
Chen et al. | Energy-efficient offloading for DNN-based smart IoT systems in cloud-edge environments | |
CN103870317A (en) | Task scheduling method and system in cloud computing | |
US11223674B2 (en) | Extended mobile grid | |
CN111325356A (en) | Neural network search distributed training system and training method based on evolutionary computation | |
CN103281374B (en) | | Method for fast data scheduling in cloud storage | |
CN106201701A (en) | | Workflow scheduling algorithm with task duplication | |
CN104283963B (en) | | Distributed cooperative CDN load-balancing method | |
Bukhsh et al. | A decentralized edge computing latency-aware task management method with high availability for IoT applications | |
Liu et al. | Task scheduling in cloud computing based on improved discrete particle swarm optimization | |
Emberson et al. | Extending a task allocation algorithm for graceful degradation of real-time distributed embedded systems | |
Zhou et al. | Learning to optimize dag scheduling in heterogeneous environment | |
Sheeba et al. | An efficient fault tolerance scheme based enhanced firefly optimization for virtual machine placement in cloud computing | |
Aliyu et al. | Management of cloud resources and social change in a multi-tier environment: a novel finite automata using ant colony optimization with spanning tree | |
CN109951551A (en) | | Container image management system and method | |
CN102799474A (en) | Cloud resource fault-tolerant scheduling method based on reliability drive | |
CN110730241B (en) | Global scale oriented blockchain infrastructure | |
CN112883526B (en) | Workload distribution method under task delay and reliability constraint | |
Meddeber et al. | Tasks assignment for Grid computing | |
Semmoud et al. | A survey of load balancing in distributed systems | |
CN112698944A (en) | Distributed cloud computing system and method based on human brain simulation | |
CN113285823A (en) | Business function chain arranging method based on container | |
Stavrinides et al. | Resource allocation and scheduling of linear workflow applications with ageing priorities and transient failures | |
Kuang et al. | Level value density task scheduling algorithm for cyber physical systems on cloud | |
Samal et al. | Bio-inspired approach to fault-tolerant scheduling of real-time tasks on multiprocessor-a study |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||