CN109104304A

CN109104304A - A distributed real-time fault processing method

Info

Publication number: CN109104304A
Application number: CN201810819362.1A
Authority: CN
Inventors: 秦佳峰; 杨祎; 林颖; 李程启; 白德盟; 冯新岩; 周超; 刘洋; 贾然; 李龙龙; 郑文杰; 孙景文; 韩明明; 乔颖; 王娟娟; 王宏安; 罗雄飞; 郭超平
Original assignee: State Grid Corp of China SGCC; Institute of Software of CAS; Electric Power Research Institute of State Grid Shandong Electric Power Co Ltd
Current assignee: State Grid Corp of China SGCC; Institute of Software of CAS; Electric Power Research Institute of State Grid Shandong Electric Power Co Ltd
Priority date: 2018-07-24
Filing date: 2018-07-24
Publication date: 2018-12-28
Anticipated expiration: 2038-07-24
Also published as: CN109104304B

Abstract

The present invention provides a distributed real-time fault processing method, characterized in that the method comprises: S1: establishing a task set τ = {τ_i | 1 ≤ i ≤ n} for real-time fault processing, where n is the number of tasks in the task set τ and each task τ_i corresponds to a mixed-criticality fault tree TR_i; S2: determining the scheduling method for fault tasks according to the execution state of the faults; S3: using the task set obtained in step S1, matching each fault generated by the system with the safety operation graph for handling it according to the scheduling method of step S2, thereby eliminating the fault. The method can complete real-time fault handling in a distributed environment and takes into account the recovery mechanism for the case where faults may spread.

Description

A distributed real-time fault processing method

Technical field

The invention belongs to the field of real-time reactive systems and real-time technology, and in particular relates to a distributed real-time fault processing method.

Background art

To monitor and control large-scale, complex distributed systems in real time, powerful sensing nodes are deployed at key nodes of the system and connected directly to the Internet. The information they collect is transmitted in real time to the corresponding server cluster for computation, and the instructions to be executed are returned to the sensing nodes or control nodes to achieve the predetermined safety objectives. Such an ultra-large-scale complex network, formed by the integration of multiple types of networks and operated through the cooperation of centralized and distributed coordination, is called a complex real-time reactive system. The smart grid is a typical example of a complex real-time reactive system.

Complex real-time reactive systems usually bear on the safety of life and property, society and the environment. They are safety-critical and have stringent real-time requirements: after an event of interest occurs, the system must complete the corresponding actions within a given time limit in response. A large, even massive, number of intelligent operations must be carried out on different nodes and different devices, and the execution order and timing of these operations are strictly regulated. If a response exceeds its time limit, or an operation is executed on the wrong device, at the wrong time, for too long, or in the wrong order, the consequences can be catastrophic: serious injury or death of personnel, severe damage to equipment, or harm to the environment.

The network of a complex real-time reactive system spans thousands of miles, its devices are highly varied and its environment is ever-changing. Across the whole network, information must be collected in real time, data processed rapidly and the relevant business operations completed in time so that the operation of the entire system is monitored. Once a fault occurs, the safety-critical real-time reactive system must reduce losses and repair the fault rapidly through quick inspection, diagnosis and similar means. During normal operation the goal is fault prevention, and complex intelligent business operations should be completed in time. When a fault occurs, it should be detected promptly; then, based on the current fault state, the integration status of the multiple networks, the network information state, the state of the distributed devices and other states, emergency action and self-repair are applied to eliminate the fault within the time limit and guarantee the safety of the system. The core problem is the real-time scheduling of the distributed fault processing tasks.

When a fault occurs in a complex real-time reactive system and cannot be handled in time because system resources are limited, it may cause new faults elsewhere that are related to it through business or data dependencies, so that faults keep spreading across the distributed environment. Current complex real-time reactive systems do not consider how to guarantee the timeliness of fault handling under such chain reactions, which reduces the success rate and safety of fault recovery.

Summary of the invention

To address the deficiencies of the prior art for complex real-time reactive systems, the present invention provides a new distributed real-time fault processing method. The method can complete real-time fault handling in a distributed environment and takes into account the recovery mechanism for the case where faults may spread.

The technical solution of the invention is realized as follows:

A distributed real-time fault processing method, the method comprising:

S1: establishing a task set τ = {τ_i | 1 ≤ i ≤ n} for real-time fault processing, where n is the number of tasks in the task set τ and each task τ_i corresponds to a mixed-criticality fault tree TR_i;

S2: determining the scheduling method for fault tasks according to the execution state of the faults;

S3: using the task set obtained in step S1, matching each fault generated by the system with the safety operation graph for handling it according to the scheduling method of step S2, thereby eliminating the fault.

Further, step S1 is implemented as follows:

S11: creating the initial fault node τ_i,1 of the fault tree corresponding to task τ_i;

S12: deriving, from historical fault data, the consequent fault nodes caused by fault τ_i,1 to form the successor nodes of τ_i,1, until all fault nodes τ_i,j have been established;

S13: forming task τ_i from the set of all fault nodes τ_i,j;

S14: establishing the task set τ for real-time fault processing from the tasks τ_i.

Further, the correspondence between a fault node τ_i,j and its safety operation graph is: (formula image not reproduced in this text), where G_i,j is the safety operation graph that must be executed to handle the fault corresponding to τ_i,j and contains n_i,j subtasks τ^k_i,j that perform safety operations, D_i,j is the relative deadline of G_i,j, and C^k_i,j is the execution time that subtask τ^k_i,j needs to complete its safety operation.

Further, the set of fault nodes τ_i,j is τ_i(r_i, TR_i) = {τ_i,j | 1 ≤ j ≤ n_i}, where TR_i is a directed tree, r_i is the ready time of the initial fault node of TR_i, and τ_i,j denotes each node of TR_i.

Further, step S2 is implemented as follows:

S21: analyzing the default execution state of fault τ_i and identifying the critical nodes according to the criticality of the source node of fault tree TR_i;

S22: forming MCE2E task clusters around the critical nodes, where the ordinary nodes in each cluster are selected by jointly considering the urgency of the critical node and the criticality state and urgency of the ordinary nodes; if no critical node has appeared yet, the candidate set of the MCE2E task cluster is first formed from the nodes in the currently highest criticality state;

S23: establishing the scheduling method for the nodes in each cluster according to the rounds of the critical node.

Further, the default execution state of the fault represented by task τ_i is the criticality of the source node of its fault tree TR_i, i.e. τ_i = τ_i,1. The source node of TR_i is: (formula image not reproduced in this text), where G_i,1 is the safety operation graph that must be executed to handle τ_i,1; G_i,1 has exactly one source task and one sink task and contains n_i,j subtasks that perform safety operations.

Further, in step S23 the scheduling method is executed as follows: in each round, within the scheduling window of the cluster's critical node, it is determined which of three possible stages the ordinary nodes in the cluster are in:

if they are in the criticality state retention stage, all nodes execute in the current mixed-criticality state, and the accumulated execution time has not reached the upper limit of that mixed-criticality state;

if they are in the criticality state switching stage, ordinary nodes yield processor resources so that the critical node can execute successfully;

if they are in the criticality state update stage, the information of the successor nodes of the ordinary nodes in other clusters is updated to reflect the criticality state switches produced in the second stage.

Further, in the criticality state switching stage, the specific execution method is:

according to the criticality state and urgency of the ordinary nodes, an ordinary node whose criticality state is lower and whose slack time is relatively ample is selected for degraded execution;

if the ordinary node selected for degraded execution undergoes a criticality state transition, the next ordinary node is selected from the candidate set for degraded execution.

Further, the specific steps of degraded execution of ordinary nodes are:

1) scheduling the task subset with the highest criticality: a schedulable local deadline allocation scheme is found for the subtasks of each critical node on the partial order graph;

2) according to the execution time demand and deadline in the current criticality state, and in combination with the local deadline division scheme, analyzing whether sufficiently long idle processor time can be found on the multiple agents to complete execution;

3) if the task can be scheduled successfully, it is admitted and executed in the current criticality state; otherwise, the task activates the related task of the next criticality level and the procedure returns to 2).

The beneficial effects of the present invention are:

The present invention addresses the safety requirements of complex real-time reactive systems. For the scheduling problems in such systems it provides a real-time fault processing method for distributed environments that increases the success rate of the safety operations applied to fault nodes and reduces the rate at which faults trigger subsequent faults. Based on schedulability conditions, the invention judges whether the existing system resources can satisfy the deadline constraints of the reasoning tasks in the system, determines the processing order of the reasoning tasks according to its scheduling policy, allocates reasonable system resources to the real-time reasoning process, and judges whether a newly arrived reasoning task can be completed safely without affecting the reasoning tasks already in the system. If so, the system carries out the real-time reasoning process of normal operation; otherwise, the self-healing agents in the system are scheduled with the goals of the shortest total repair time and the shortest fault spread length to obtain an effective fault repair solution, so that losses are avoided as far as possible even in the worst case. The method is suitable for complex real-time reactive systems: it guarantees the safe operation of the multi-agent system as a whole, keeps the rate of subsequent faults during fault handling low and the degree of fault expansion small, and thereby improves the real-time performance and reliability of complex real-time reactive systems.

Brief description of the drawings

Fig. 1 is a schematic diagram of the mapping between fault trees and safety operation graphs in the invention;

Fig. 2 is a diagram of the distributed real-time fault processing task model of the invention;

Fig. 3 is a flow chart of the method of the invention.

Detailed description of the embodiments

Specific embodiments of the invention are described in detail below with reference to the drawings. The following disclosure provides specific embodiments of the apparatus and method of the invention so that those skilled in the art can understand more clearly how to implement it. To simplify the disclosure, the components and arrangements of specific examples are described below. In addition, the invention may repeat reference numerals or letters in different examples. This repetition is for simplicity and clarity and does not in itself indicate a relationship between the embodiments or arrangements discussed. Note that the components illustrated in the drawings are not necessarily drawn to scale. Descriptions of well-known components, processing techniques and processes are omitted to avoid unnecessarily limiting the invention. Although preferred embodiments are described, they are illustrative and do not limit the scope of the invention.

The principle of the overall technical solution is as follows:

The fault tree set consists of the fault states, summarized from experience, that may occur or have already occurred. Each fault state corresponds to a time span, defined here as its deadline; if handling cannot be completed within the deadline, a new fault is triggered.

The safety operation graph contains the processing order of all safety tasks for routine maintenance and fault handling, and is stored as a directed graph.

As shown in Fig. 1, every fault tree or normal-state chain corresponds to at least one complete operation sequence subtree in a safety operation graph. Once the corresponding intelligent agent has completed the whole operation sequence, the fault has been handled. If the sequence cannot be completed within the specified time, the fault is not eliminated and a new fault is generated; the fault tree enters its next stage, and more safety operation sequences must be completed.

As shown in Figs. 2 and 3, the method of the present application is a distributed real-time fault processing method that mainly comprises the following steps:

S1: establish a task set τ = {τ_i | 1 ≤ i ≤ n} for real-time fault processing, where n is the number of tasks in the task set τ and each task τ_i corresponds to a mixed-criticality fault tree TR_i. The mixed criticality of a task refers to the different degrees τ_i,j to which fault τ_i has spread on fault tree TR_i.

The implementation principle and process of step S1 are as follows:

S11: create the initial fault node τ_i,1 of the fault tree corresponding to task τ_i.

S12: from historical fault data, derive the consequent fault nodes caused by fault τ_i,1 to form the successor nodes of τ_i,1, until all fault nodes τ_i,j have been established, forming fault tree TR_i.

If TR_i has a directed edge from τ_i,j to τ_i,l, then τ_i,j is the parent node of τ_i,l and τ_i,l is a child node of τ_i,j. τ_i,l is triggered and becomes ready only when execution of the safety operation graph corresponding to τ_i,j exceeds the deadline of τ_i,j; at that point τ_i,j stops executing immediately.

A node without a parent node is the source node, and a node without child nodes is a terminal node. Each node has at most one parent node and may have multiple child nodes; TR_i has exactly one source node and may have multiple terminal nodes.
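
The tree structure described above can be sketched as follows. This is a minimal illustration rather than the patented implementation; the class name `FaultNode`, the helper `terminal_nodes` and the deadline values are assumptions introduced only for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class FaultNode:
    """One fault node tau_{i,j} of a fault tree TR_i (hypothetical names)."""
    name: str
    deadline: float                        # relative deadline D_{i,j}
    parent: Optional["FaultNode"] = None
    children: List["FaultNode"] = field(default_factory=list)

    def add_child(self, child: "FaultNode") -> "FaultNode":
        # A node may have at most one parent, so the structure stays a tree.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child


def terminal_nodes(root: FaultNode) -> List[FaultNode]:
    """Collect the nodes without children, depth first from the source node."""
    if not root.children:
        return [root]
    found: List[FaultNode] = []
    for child in root.children:
        found.extend(terminal_nodes(child))
    return found


# Source node tau_{i,1} with two consequent faults; one of them spreads further.
source = FaultNode("tau_i1", deadline=10.0)
left = source.add_child(FaultNode("tau_i2", deadline=8.0))
right = source.add_child(FaultNode("tau_i3", deadline=6.0))
leaf = left.add_child(FaultNode("tau_i4", deadline=4.0))
```

Here `source` is the only node without a parent, and `terminal_nodes(source)` returns the two nodes without children.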

S13: form task τ_i from the set of all fault nodes τ_i,j. Each node τ_i,j on a fault tree corresponds to a safety operation graph G_i,j, defined as a directed acyclic graph, that must be executed to handle fault τ_i,j. Whether fault τ_i,j is eliminated successfully depends on whether all safety operations in its safety operation graph can be completed in time, that is, whether the deadline constraint is met.

TR_i has n_i nodes. Each node τ_i,j is defined by a directed acyclic graph G_i,j, the safety operation graph that must be executed to handle the fault corresponding to τ_i,j, which contains n_i,j subtasks τ^k_i,j that perform safety operations. The correspondence between fault node τ_i,j and its safety operation graph is: (formula image not reproduced in this text), where D_i,j is the relative deadline of G_i,j (that is, the minimum interval between transitions among different nodes of TR_i that change the criticality), and C^k_i,j is the execution time that subtask τ^k_i,j needs to complete its safety operation.

The directed edges of G_i,j represent the execution flow of the safety operations by which τ_i,j handles the fault, and define the precedence constraints among the safety operation subtasks of τ_i,j. If G_i,j has a directed edge from subtask τ^k_i,j to subtask τ^l_i,j, then τ^k_i,j is a direct predecessor of τ^l_i,j and τ^l_i,j is a direct successor of τ^k_i,j. If G_i,j has a directed path from subtask τ^k_i,j to subtask τ^l_i,j, then τ^k_i,j is a predecessor of τ^l_i,j and τ^l_i,j is a successor of τ^k_i,j.

A subtask without predecessors is called the source task, and a subtask without successors is called the sink task. A subtask can start executing only after all of its direct predecessors have completed, and it may have multiple direct predecessors and multiple direct successors. G_i,j has exactly one source task and one sink task.
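
The precedence rules of a safety operation graph can be illustrated with a small adjacency-list sketch. The subtask names and the helper functions below are hypothetical and serve only to show the single-source, single-sink property and the rule that a subtask becomes ready once all of its direct predecessors have completed.

```python
from typing import Dict, List, Set, Tuple

# Safety operation graph G_{i,j}: subtask -> direct successors (hypothetical names).
graph: Dict[str, List[str]] = {
    "isolate": ["reroute", "notify"],
    "reroute": ["verify"],
    "notify": ["verify"],
    "verify": [],
}


def direct_predecessors(g: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Invert the successor lists to obtain the direct predecessors of each subtask."""
    preds: Dict[str, Set[str]] = {v: set() for v in g}
    for u, succs in g.items():
        for v in succs:
            preds[v].add(u)
    return preds


def source_and_sink(g: Dict[str, List[str]]) -> Tuple[str, str]:
    """Return the unique source task and sink task, or raise if they are not unique."""
    preds = direct_predecessors(g)
    sources = [v for v in g if not preds[v]]
    sinks = [v for v in g if not g[v]]
    if len(sources) != 1 or len(sinks) != 1:
        raise ValueError("graph must have exactly one source task and one sink task")
    return sources[0], sinks[0]


def ready_subtasks(g: Dict[str, List[str]], completed: Set[str]) -> List[str]:
    """Subtasks not yet completed whose direct predecessors have all completed."""
    preds = direct_predecessors(g)
    return [v for v in g if v not in completed and preds[v] <= completed]
```

With this graph, `verify` does not become ready until both `reroute` and `notify` have completed.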

S14: establish the task set τ = {τ_i | 1 ≤ i ≤ n} for real-time fault processing from the tasks τ_i.

If the task corresponding to the first node of τ_i completes within its deadline, the fault tree does not propagate further; otherwise, the task corresponding to the next node on the fault tree is triggered. When τ_i reaches its last subtask, it enters the safety-critical state; otherwise it remains in a non-safety-critical state. Every task in the safety-critical state must complete before its deadline, whereas a task in a non-safety-critical state is allowed to miss its deadline and move on to its next subtask.

Engineers usually fix the mapping between the tasks and the processing resources of the system at the design stage, based on their experience and preferences and on factors such as the different functions of the system, the distribution of resources that can be used effectively, the existing constraints of the system, and the proximity of sensors and actuators in the physical environment.

Therefore, the subtasks belonging to different fault trees τ_i are assigned, according to their operation type, to distributed processors of the corresponding category for execution. The set of subtasks that can execute on a given processor h is denoted Ψ(h).

S2: determine the scheduling method for fault tasks according to the execution state of the faults. By scheduling all subtasks in the safety operation graphs corresponding to the currently active fault tree nodes, the method both guarantees that all MCE2E tasks are schedulable and minimizes the degree of fault expansion, i.e. it minimizes either the average fault expansion degree of the task set, MIN(AVG(et_i)), or the maximum fault expansion degree, MIN(MAX(et_i)).

Step S2 is implemented as follows:

First, the default execution state of fault τ_i is analyzed. The default execution state of the fault represented by task τ_i is the criticality of the source node of its fault tree TR_i, i.e. τ_i = τ_i,1.

The source node of TR_i is: (formula image not reproduced in this text), where G_i,1 is the safety operation graph that must be executed to handle τ_i,1; G_i,1 has exactly one source task and one sink task and contains n_i,j subtasks that perform safety operations.

Then MCE2E task clusters are formed around the critical nodes: the active fault tree nodes are clustered so that each cluster consists of one critical node and several ordinary nodes, namely:

MCE2E task clusters are formed on the agents according to the critical nodes. The ordinary nodes in each cluster are selected by jointly considering the urgency of the critical node and the criticality state and urgency of the ordinary nodes; if no critical node has appeared yet, the candidate set of the MCE2E task cluster is first formed from the nodes in the currently highest criticality state.

S23: establish the scheduling method for the nodes in each cluster according to the rounds of the critical node. The scheduling method is executed as follows: in each round, within the scheduling window of the cluster's critical node, it is determined which of three possible stages the ordinary nodes in the cluster are in:

If they are in the criticality state retention stage, all nodes execute in the current mixed-criticality state, and the accumulated execution time has not reached the upper limit of that mixed-criticality state.

If they are in the criticality state switching stage, ordinary nodes yield processor resources so that the critical node can execute successfully, which may cause some ordinary nodes to switch to a higher criticality state.

For the criticality state switching stage, the execution strategy is: according to the criticality state and urgency of the ordinary nodes, an ordinary node whose criticality state is lower and whose slack time is relatively ample is selected for degraded execution; if the ordinary node selected for degraded execution undergoes a criticality state transition, the next ordinary node is selected from the candidate set for degraded execution.
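
The selection rule for degraded execution can be sketched as a simple ordering of the candidate set. The text does not state how criticality state and slack are combined, so the lexicographic order below (lower criticality first, then more slack) is an assumption, as are the names `OrdinaryNode` and `degradation_order`.

```python
from typing import List, NamedTuple


class OrdinaryNode(NamedTuple):
    name: str
    criticality: int   # criticality state; a smaller value means less critical
    slack: float       # idle time left before the node's deadline


def degradation_order(candidates: List[OrdinaryNode]) -> List[OrdinaryNode]:
    """Order candidates for degraded execution.

    Assumed rule: prefer the lowest criticality state, and among equals
    prefer the node with the most slack. If the first choice undergoes a
    criticality transition, the caller moves on to the next entry.
    """
    return sorted(candidates, key=lambda n: (n.criticality, -n.slack))


candidates = [
    OrdinaryNode("A", criticality=2, slack=5.0),
    OrdinaryNode("B", criticality=1, slack=3.0),
    OrdinaryNode("C", criticality=1, slack=7.0),
]
order = degradation_order(candidates)
```

With these hypothetical values, node C is degraded first, then B, and A only as a last resort.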

Note that in the criticality state switching stage, the criticality transitions that degraded execution causes in ordinary nodes should be kept as few as possible.

In the criticality state update stage, because of the criticality state switches produced in the second stage, the information of the successor nodes in other clusters that have precedence constraints with the affected ordinary nodes is updated.

If the critical node completes execution during the criticality state switching stage or the criticality state update stage, the interrupted ordinary nodes can resume execution. This reduces unnecessary criticality state switches and therefore the spread length of deadline misses caused by tasks in higher criticality states.

The scheduling of critical subtasks within a critical node is described in detail below with a concrete example.

Let P^k_i,1 be the set of all paths in G_i,1 from τ^k_i,1 to the sink task. The longest path in P^k_i,1 is called the critical path P_i,1^kcri, and its length is C_i,1^kcri. The subtasks on P_i,1^kcri are called critical subtasks.

Because a delay of any subtask on the critical path delays the response time of the whole task, the critical path method based on depth-first search and related methods can be used to analyze the scheduling of tasks based on the graph model, find the critical execution order for scheduling the task, and better analyze the execution of the task as a whole.
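
A depth-first critical path computation can be sketched as follows. The graph, the execution times and the function name `critical_path` are hypothetical. The sketch also assumes that the path length includes the execution time of the starting subtask itself, which the text does not specify.

```python
from functools import lru_cache
from typing import Dict, List, Tuple

# Hypothetical safety operation graph: subtask -> direct successors.
succ: Dict[str, List[str]] = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
# Hypothetical execution time of each subtask.
exec_time: Dict[str, float] = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 2.0}


def critical_path(start: str) -> Tuple[float, List[str]]:
    """Longest path (by total execution time) from `start` to the sink task.

    Uses a memoized depth-first search, so each subtask is expanded once.
    """
    @lru_cache(maxsize=None)
    def longest(v: str) -> Tuple[float, Tuple[str, ...]]:
        best_len, best_path = 0.0, ()
        for w in succ[v]:
            length, path = longest(w)
            if length > best_len:
                best_len, best_path = length, path
        return exec_time[v] + best_len, (v,) + best_path

    length, path = longest(start)
    return length, list(path)
```

For this graph the path through `b` dominates the path through `c`, so `a`, `b` and `d` are the critical subtasks when starting from `a`.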

The critical subtasks have the least slack time, so executing them as early as possible yields the best response time for the task. Each subtask τ^k_i,1 is assigned a local deadline d^k_i,1 such that all subtasks meet their execution demands on their respective agents, while the deadline d^k_i,1 of any subtask does not exceed the deadline d_i,1 of the task τ_i,1 it belongs to. In this way every task τ_i can be scheduled successfully in its initial criticality state.

To this end, a predecessor subtask must not use up too much of the slack time on its agent, so that enough time remains for the successor subtasks to complete execution. The optimization objective of the local deadline allocation method is therefore to maximize the minimum path slack of τ^k_i,1 on each agent [4] [5], saving as much slack time as possible for the successors of these subtasks and thus helping to satisfy the total deadline constraint of the task they belong to (i.e. d_i,1).

Meanwhile also to guarantee that all subtask set Ψs (h) of the different task on the intelligent body also can successfully be adjusted Degree.

The optimization objective is: max: min{d_i,1 − d^k_i,1 − C_i,1^kcri | τ^k_i,1 ∈ Ψ(h)}. The optimization problem is solved with a mixed-integer linear programming or nonlinear programming model.

The constraint is: r^k_i,1 + C^k_i,1 ≤ d^k_i,1 ≤ d_i,1 − C_i,1^kcri.
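
The objective and constraint above can be checked for a candidate local deadline assignment with a few lines of code. This sketch only evaluates given assignments; it does not solve the mixed-integer or nonlinear program. The type `Subtask`, the function names and all numeric values are hypothetical.

```python
from typing import Iterable, NamedTuple


class Subtask(NamedTuple):
    release: float         # r^k_i,1
    exec_time: float       # C^k_i,1
    local_deadline: float  # d^k_i,1, the decision variable
    task_deadline: float   # d_i,1
    crit_len: float        # C_i,1^kcri


def satisfies_constraint(s: Subtask) -> bool:
    """r^k + C^k <= d^k <= d_i,1 - C^kcri."""
    return s.release + s.exec_time <= s.local_deadline <= s.task_deadline - s.crit_len


def min_path_slack(subtasks: Iterable[Subtask]) -> float:
    """Objective value: min over the subtasks of d_i,1 - d^k - C^kcri."""
    return min(s.task_deadline - s.local_deadline - s.crit_len for s in subtasks)


# Two candidate assignments for the same pair of subtasks on one processor.
plan_a = [Subtask(0.0, 2.0, 4.0, 10.0, 5.0), Subtask(1.0, 1.0, 3.0, 10.0, 4.0)]
plan_b = [Subtask(0.0, 2.0, 3.0, 10.0, 5.0), Subtask(1.0, 1.0, 3.0, 10.0, 4.0)]
```

Both plans satisfy the constraint, but `plan_b` gives the first subtask an earlier local deadline and therefore leaves a larger minimum path slack, so it would be preferred under the stated objective.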

If a solution to this problem can be found for every Ψ(h), all τ_i,1 execute successfully in the initial criticality state of the source node. Otherwise, a τ_i,1 that fails to be scheduled immediately terminates the operations on its safety operation graph and enters a higher criticality; the faults represented by all child nodes of τ_i,1 are triggered, and all newly triggered faults must be handled.

When the set of current nodes of τ_i contains a node τ_i,j that is a terminal node of TR_i, the system defines the criticality of τ_i as the highest criticality and τ_i,j as a critical node; the other nodes are ordinary nodes with different criticalities. The system periodically monitors whether critical nodes have been triggered and checks whether their execution can meet its deadline constraint. If not, suitable ordinary nodes on the processors used by the critical node must be selected to be interrupted and postponed, yielding processor resources to the critical node. Once the critical node meets its deadline constraint, the interrupted ordinary nodes can continue executing. This procedure is called degraded execution of ordinary nodes.

The specific strategy for degraded execution of ordinary nodes is as follows:

1) Schedule the highest-criticality task subset Υ^cri: for the subtasks of each critical node on the partial order graph, a schedulable local deadline allocation scheme for Υ^cri is found in a distributed manner, ensuring that the processor resources of the multiple agents can successfully schedule the subtasks of all critical nodes while reserving as much idle processor resource as possible for the other tasks.

2) Schedule the other tasks Υ^non-cri, which default to the criticality state of τ_i,1: according to the execution time demand and deadline in the current criticality state, and in combination with the local deadline division scheme, analyze whether sufficiently long idle processor time can be found on the multiple agents to complete execution.

Assume the system has m processors, and let Ψ(h) be the set of all safety operation subtasks preassigned to processor h (these subtasks may preempt one another), i.e. Ψ = {Ψ(h), 1 ≤ h ≤ m}. Each agent schedules the subtasks of Ψ(h) according to the local deadlines into which the task's subtasks are divided on the partial order graph. The steps are as follows:

Assign new local deadlines to all subtasks in Ψ, and select the subtask with the smallest local deadline in Ψ(h) for execution;

When a subtask of Ψ(h) completes, its completion information is sent to the processors that host its direct successors, and the completed subtask is removed from Ψ(h); the subtask with the smallest local deadline in Ψ(h) is then selected for execution;

After Ψ(l) receives the completion information, the corresponding successor subtask becomes ready, its parameters are computed, and new local deadlines are assigned to all subtasks in Ψ(l).

If dividing the local deadlines fails for a subtask and its node τ_i,j is a task in Υ^non-cri, the next criticality level triggered by τ_i,j is activated.

The above operations are repeated until all tasks have finished executing.
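
The per-processor loop described above (run the ready subtask with the smallest local deadline, then notify its successors on completion) can be sketched as a small event-driven simulation. The sketch makes two simplifying assumptions that differ from the text: subtasks run without preemption, and local deadlines stay fixed instead of being reassigned after each completion. The task table `SUBTASKS` and the function `run` are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

# name -> (processor h, execution time, local deadline, direct predecessors)
SUBTASKS: Dict[str, Tuple[int, float, float, List[str]]] = {
    "a": (0, 2.0, 3.0, []),
    "b": (0, 1.0, 2.0, []),
    "c": (1, 2.0, 6.0, ["a", "b"]),
}


def run(subtasks: Dict[str, Tuple[int, float, float, List[str]]]) -> Dict[str, float]:
    """Return the finish time of every subtask (non-preemptive simplification)."""
    waiting = {n: set(s[3]) for n, s in subtasks.items()}
    ready: Dict[int, List[Tuple[float, str]]] = {}
    for n, s in subtasks.items():
        ready.setdefault(s[0], [])
        if not waiting[n]:
            heapq.heappush(ready[s[0]], (s[2], n))
    running: List[Tuple[float, str]] = []   # heap of (finish time, name)
    idle = set(ready)
    finish: Dict[str, float] = {}
    now = 0.0
    while True:
        # Each idle processor starts its ready subtask with the smallest local deadline.
        for h in sorted(idle):
            if ready[h]:
                _, n = heapq.heappop(ready[h])
                heapq.heappush(running, (now + subtasks[n][1], n))
                idle.discard(h)
        if not running:
            break
        now, done = heapq.heappop(running)
        finish[done] = now
        idle.add(subtasks[done][0])
        # Completion notice: a successor becomes ready once all its
        # direct predecessors have finished.
        for n, s in subtasks.items():
            if done in waiting[n]:
                waiting[n].discard(done)
                if not waiting[n]:
                    heapq.heappush(ready[s[0]], (s[2], n))
    return finish


finish_times = run(SUBTASKS)
```

In this example processor 0 runs `b` before `a` because `b` has the smaller local deadline, and `c` on processor 1 starts only after both have finished.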

If τ_i,1 completes all operations on its safety operation graph before its deadline D_i,1, then τ_i has also executed successfully. Otherwise, the execution of τ_i,1 has exceeded D_i,1; τ_i,1 immediately terminates the operations on its safety operation graph and enters the next criticality level τ_i,2 (τ_i,2 is a child node of τ_i,1), and fault handling proceeds in that criticality state (that is, from the current time on τ_i = τ_i,2, the deadline of τ_i is updated to D_i,2, and τ_i,2 is called the current node).

If τ_i,1 has multiple child nodes, the faults represented by all of them are triggered; they all become current nodes, each with its own deadline constraint.

By analogy, the fault tree TR_i represented by τ_i may carry multiple relative deadline constraints, each defined by one of the current nodes.

If all current nodes on fault tree TR_i complete the operations on their safety operation graphs before their deadlines, task τ_i is schedulable; in that case every current node on every fault tree either is not a terminal node or satisfies C_{i,SINK_NODE} ≤ D_{i,SINK_NODE}. If, for any terminal node of fault tree TR_i, execution of the operations on its safety operation graph exceeds its deadline, then handling of the fault represented by τ_i has failed.
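
The propagation and success rules above can be sketched as a recursive walk over the fault tree. The tree, the deadlines and the function `handle` are hypothetical, and the sketch takes the time each safety operation graph needs as a given input instead of deriving it from a schedule.

```python
from typing import Dict, List, Tuple

# Hypothetical fault tree TR_i as child lists, with a relative deadline D per node.
children: Dict[str, List[str]] = {"t1": ["t2", "t3"], "t2": [], "t3": ["t4"], "t4": []}
deadline: Dict[str, float] = {"t1": 5.0, "t2": 4.0, "t3": 3.0, "t4": 2.0}


def handle(node: str, needed: Dict[str, float]) -> Tuple[List[str], bool]:
    """Return (nodes triggered in order, whether fault handling succeeded).

    `needed[node]` is the time the node's safety operation graph takes.
    A node that exceeds its relative deadline is abandoned and all of its
    children are triggered; a terminal node that exceeds its deadline
    means handling of the fault has failed.
    """
    triggered = [node]
    ok = True
    if needed[node] > deadline[node]:
        if not children[node]:
            return triggered, False
        for child in children[node]:
            sub, sub_ok = handle(child, needed)
            triggered.extend(sub)
            ok = ok and sub_ok
    return triggered, ok
```

For example, if `t1` and `t3` miss their deadlines but `t2` and `t4` finish in time, all four nodes are triggered and handling still succeeds; if the terminal node `t4` also misses its deadline, handling fails.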

In addition, the scope of application of the invention is not limited to the processes, mechanisms, manufacture, compositions of matter, means, methods and steps of the specific embodiments described in the specification. From this disclosure, those of ordinary skill in the art will readily understand that processes, mechanisms, manufacture, compositions of matter, means, methods or steps, whether existing now or developed later, that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described here may be applied according to the invention. Accordingly, the appended claims are intended to include such processes, mechanisms, manufacture, compositions of matter, means, methods or steps within their scope of protection.

Claims

1. A distributed real-time fault processing method, characterized in that the method comprises:

S1: establishing a task set τ = {τ_i | 1 ≤ i ≤ n} for real-time fault processing, where n is the number of tasks in the task set τ and each task τ_i corresponds to a mixed-criticality fault tree TR_i;

S2: determining the scheduling method for fault tasks according to the execution state of the faults;

S3: using the task set obtained in step S1, matching each fault generated by the system with the safety operation graph for handling it according to the scheduling method of step S2, thereby eliminating the fault.

2. The distributed real-time fault processing method according to claim 1, characterized in that step S1 is implemented as follows:

S11: creating the initial fault node τ_i,1 of the fault tree corresponding to task τ_i;

S12: deriving, from historical fault data, the consequent fault nodes caused by fault τ_i,1 to form the successor nodes of τ_i,1, until all fault nodes τ_i,j have been established;

S13: forming task τ_i from the set of all fault nodes τ_i,j;

S14: establishing the task set τ for real-time fault processing from the tasks τ_i.

3. The distributed real-time fault processing method according to claim 2, characterized in that the correspondence between a fault node τ_i,j and its safety operation graph is: (formula image not reproduced in this text), where G_i,j is the safety operation graph that must be executed to handle the fault corresponding to τ_i,j and contains n_i,j subtasks τ^k_i,j that perform safety operations, D_i,j is the relative deadline of G_i,j, and C^k_i,j is the execution time that subtask τ^k_i,j needs to complete its safety operation.

4. The distributed real-time fault processing method according to claim 3, characterized in that the set of fault nodes τ_i,j is τ_i(r_i, TR_i) = {τ_i,j | 1 ≤ j ≤ n_i}, where TR_i is a directed tree, r_i is the ready time of the initial fault node of TR_i, and τ_i,j denotes each node of TR_i.

5. The distributed real-time fault processing method according to claim 1, characterized in that step S2 is implemented as follows:

S21: analyzing the default execution state of task τ_i and identifying the critical nodes according to the criticality of the source node of fault tree TR_i;

S22: forming MCE2E task clusters around the critical nodes, where the ordinary nodes in each cluster are selected by jointly considering the urgency of the critical node and the criticality state and urgency of the ordinary nodes; if no critical node has appeared yet, the candidate set of the MCE2E task cluster is first formed from the nodes in the currently highest criticality state;

S23: establishing the scheduling method for the nodes in each cluster according to the rounds of the critical node.

6. The distributed real-time fault processing method according to claim 5, characterized in that the default execution state of the fault represented by task τ_i is the criticality of the source node of its fault tree TR_i, i.e. τ_i = τ_i,1, and the source node of TR_i is: (formula image not reproduced in this text), where G_i,1 is the safety operation graph that must be executed to handle τ_i,1; G_i,1 has exactly one source task and one sink task and contains n_i,j subtasks that perform safety operations.

7. The distributed real-time fault processing method according to claim 5, characterized in that in step S23 the scheduling method is executed as follows: in each round, within the scheduling window of the cluster's critical node, it is determined which of three possible stages the ordinary nodes in the cluster are in:

if they are in the criticality state retention stage, all nodes execute in the current mixed-criticality state, and the accumulated execution time has not reached the upper limit of that mixed-criticality state;

if they are in the criticality state switching stage, ordinary nodes yield processor resources so that the critical node can execute successfully;

if they are in the criticality state update stage, the information of the successor nodes of the ordinary nodes in other clusters is updated to reflect the criticality state switches produced in the second stage.

8. The distributed real-time fault processing method according to claim 7, characterized in that in the criticality state switching stage the specific execution method is:

according to the criticality state and urgency of the ordinary nodes, an ordinary node whose criticality state is lower and whose slack time is relatively ample is selected for degraded execution;

if the ordinary node selected for degraded execution undergoes a criticality state transition, the next ordinary node is selected from the candidate set for degraded execution.

9. The distributed real-time fault processing method according to claim 8, characterized in that the specific steps of degraded execution of ordinary nodes are:

1) scheduling the task subset with the highest criticality: a schedulable local deadline allocation scheme is found for the subtasks of each critical node on the partial order graph;

2) according to the execution time demand and deadline in the current criticality state, and in combination with the local deadline division scheme, analyzing whether sufficiently long idle processor time can be found on the multiple agents to complete execution;

3) if the task can be scheduled successfully, it is admitted and executed in the current criticality state; otherwise, the task activates the related task of the next criticality level and the procedure returns to 2).