CN104698839B - Multi-agent fault detection and compensation control method based on information interaction - Google Patents

Multi-agent fault detection and compensation control method based on information interaction

Info

Publication number
CN104698839B
Authority
CN
China
Prior art keywords
node
fault
information interaction
information
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410832047.4A
Other languages
Chinese (zh)
Other versions
CN104698839A (en)
Inventor
方浩
陈杰
李俨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN201410832047.4A
Publication of CN104698839A
Application granted
Publication of CN104698839B

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention addresses the problem that current distributed multi-agent systems are prone to faults yet lack a simple and practical scheme for handling them in real time, and proposes a distributed real-time fault detection and compensation control method based on information interaction. Step one, system and fault modeling: the modeling comprises a node dynamics model, an information interaction model, and typical fault models. Step two, real-time multi-agent fault detection based on information interaction. Step three, information integration and processing based on the gossip algorithm. Step four, calculation and application of a control-oriented compensation quantity. Step five, connectivity maintenance based on two-hop information: starting from the information interaction model, the communication content involving faulty nodes is analyzed, and the two-hop information it contains is used to set up virtual information transmission paths, ensuring that the fault handling scheme does not disturb the normal operation of the system.

Description

Multi-agent fault detection and compensation control method based on information interaction
Technical field
The present invention relates to a multi-agent fault detection and compensation control method based on information interaction, and belongs to the technical field of multi-agent system control.
Background
In recent years, with the rapid development of computer and network technology, the scale of multi-agent systems has grown quickly. Traditional centralized control schemes are limited by the computing speed and sensing range of the central node and increasingly struggle to meet practical demands. Distributed control schemes place lower requirements on each individual agent and scale well, so they have gradually become the mainstream of multi-agent control research. However, because a distributed scheme has no central node coordinating the behavior of all nodes, the system is vulnerable to faulty and malicious nodes, and in severe cases the whole system may be paralyzed. Designing a safe and efficient fault detection scheme for distributed multi-agent systems, one that lets the system detect and repair faulty nodes automatically, is therefore an urgent task with broad application prospects.
For fault detection in distributed multi-agent systems, the existing solutions mainly include the following:
Scheme 1: the documents (I. Shames, A. M. H. Teixeira, H. Sandberg, and K. H. Johansson. Distributed fault detection for interconnected second-order systems [J]. Automatica, Oct. 2011) and (S. Sundaram and C. N. Hadjicostis. Distributed function calculation via linear iterations in the presence of malicious agents, part I: Attacking the network [C]. American Control Conference, June 2008) propose using an unknown input observer (UIO): by observing over a long period, enough data are accumulated to estimate the initial state of the system and hence its final state, and on this basis it is judged whether the motion of the current node meets expectations. Using observers to monitor faults in real time is the current mainstream approach to fault diagnosis in multi-agent systems. The fault signal acts as an unknown input to the system and drives the observer to produce an error output; the error signal is then used to diagnose and compensate for the fault.
Fault diagnosis with a UIO has its own advantages: clear physical meaning and ease of understanding, no reliance on a physical model, wide applicability, and so on. In addition, the document (Chung, W. H., Speyer, J. L., & Chen, R. H. A decentralized fault detection filter [J]. Journal of Dynamic Systems, Measurement, and Control, 123(2), 237-247, 2001) points out that, compared with other observers such as the Beard-Jones fault detection filter, the UIO has a relatively simple structure and lends itself to optimization algorithms that yield an approximately optimal solution. On the other hand, the scheme has shortcomings: a node with N adjacent nodes needs N+1 UIOs to detect faults in all of its adjacent nodes. When the system topology is complex or the number of nodes is large, the data and computation required become very large, which places high demands on node hardware. Running the scheme also consumes substantial computing resources and adversely affects the system's other control tasks.
Scheme 2: the document (M. Franceschelli, M. Egerstedt, and A. Giua. Motion probes for fault detection and recovery in networked control systems [C]. American Control Conference, pages 4358-4363, June 2008) proposes using motion probes: an additional excitation signal is applied to excite the networked control system, the running state of the system is judged from its response, and faulty nodes are thereby detected. Unlike the first scheme, this one detects actively. Its main problem is that it is difficult to put into practice: the choice of excitation signal, the time at which it is applied, and so on are subject to many constraints.
Scheme 3: the document (Guo, M., Dimarogonas, D. V., & Johansson, K. H. (2012, June). Distributed real-time fault detection and isolation for cooperative multi-agent systems [C]. In American Control Conference (ACC), 2012 (pp. 5270-5275). IEEE) proposes using data exchange between nodes: by receiving the control input of an adjacent node, a node simulates and reproduces that neighbor's motion state, compares it with the neighbor's measured actual motion state, and uses the comparison as the basis for fault detection. The biggest advantages of this scheme are its simple computation and practicality, but it also has obvious shortcomings: the restrictive conditions are too strict, the scope of application is narrow, and the system's misoperation rate is too high.
Inspired by scheme 3, the present invention draws on its advantages and, to address its shortcomings, proposes a distributed fault detection and compensation control scheme based on information interaction. The scheme improves the fault discrimination mechanism of the nodes and, by adopting the gossip algorithm, effectively mitigates the excessive misoperation rate. It also redefines the content of the information exchanged between nodes so that nodes can use the received information more effectively. In addition, considering the complexity of control protocols, the invention proposes a control-oriented fault repair scheme that greatly relaxes the restrictions the system places on the node information interaction protocol and widens the scheme's range of application.
Summary of the invention
The present invention addresses the problem that current distributed multi-agent systems are prone to faults yet lack a simple and practical scheme for handling them in real time. It proposes a distributed real-time fault detection and compensation control method based on information interaction, in which faulty nodes are detected, isolated, and repaired through information exchange between nodes, so that system faults are handled promptly and the losses they cause are reduced.
The distributed real-time fault detection and compensation control method based on information interaction of the present invention comprises the following steps:
Step one, system and fault modeling: the modeling comprises a node dynamics model, an information interaction model, and typical fault models. The node dynamics model is a single-integrator model, which describes the motion state of a node by a first-order differential equation. The information interaction model is described by an undirected graph, meaning that communication between nodes is bidirectional; the autonomous agents exchange information over it to complete the system control task. The typical fault models cover the fault types that agents commonly exhibit in practice.
Step two, real-time multi-agent fault detection based on information interaction: from the expression of the node dynamics model in step one, a relevant state variable is chosen to describe the running state of a node. A threshold function is set to classify the running state and distinguish normal nodes from faulty ones. Each node obtains the state information of its adjacent nodes through the information interaction model of step one, uses a detection algorithm to check whether they are faulty, and forms its single-node detection result.
Step three, information integration and processing based on the gossip algorithm: because of communication packet loss, time delay, and similar problems, the single-node detection result of step two is strongly affected by the environment and has low confidence. The gossip algorithm is therefore used to integrate the single-node detection results into a more reliable combined detection result, which serves as the final basis for judging the running state of a node and distinguishing normal nodes from faulty ones.
Step four, calculation and application of a control-oriented compensation quantity: if a faulty node is detected, it is isolated by the corresponding operation. At the same time, starting from the influence of the faulty node on the control inputs of its adjacent nodes, a calculation scheme is designed to obtain the compensation quantity, which is added to the original control input to offset the effect of the faulty node on the system.
Step five, connectivity maintenance based on two-hop information: starting from the information interaction model, the communication content involving faulty nodes is analyzed, and the two-hop information it contains is used to set up virtual information transmission paths, ensuring that the fault handling scheme does not disturb the normal operation of the system.
The fault types comprise destructive faults, runaway (out-of-control) faults, and interference faults.
Compared with existing schemes, the advantages and innovations of the present invention are mainly the following:
(1) Most existing schemes place high demands on system hardware and consume a large amount of computing resources (as in schemes 1 and 2). The present invention starts from the basic control law of the multi-agent system and makes full use of results the system already computes, so that real-time monitoring of adjacent nodes is achieved with very few computing resources, which greatly reduces the cost of applying the invention. At the price of a small increase in communication content, the gossip algorithm also overcomes the interference of random signals with the fault detection result (a problem of scheme 3). This ensures the reliability of the detection result and gives the invention practical value.
(2) Existing schemes pay little attention to the isolation and repair of faulty nodes. Most simply stop communicating with the faulty node, and their fault repair schemes apply only to linear control protocols (as in scheme 3), which restricts their range of application. The present invention starts from the control result of the system and designs a fault isolation and compensation algorithm based on the control input. The algorithm takes full account of saturation, the most common system characteristic, and can effectively solve the fault repair problem for systems under nonlinear control protocols, greatly extending the scope of application of the invention.
(3) Existing schemes have not studied in depth how to keep the system connected after a faulty node has been isolated. Using the existing communication content and the reliable detection information obtained with the gossip algorithm, the present invention designs a scheme for maintaining the system topology based on two-hop information. As long as the faulty node has not left the communication range of the normal nodes, the adjacent-node information it relays can be used to set up virtual information transmission channels, so that the system stays connected and its normal function is not interrupted by the isolation of the faulty node.
Brief description of the drawings
Fig. 1 - topology of the multi-agent system;
Fig. 2 - schematic diagram of the fault detection scheme;
Fig. 3 - schematic diagram of the information processing scheme based on the gossip algorithm;
Fig. 4 - relationship between the expected and actual outputs of a node;
Fig. 5 - system topology after two-hop information is used;
Fig. 6 - comparison of results for fault 1 with and without the handling scheme;
Fig. 7 - comparison of results for fault 2 with and without the handling scheme;
Fig. 8 - comparison of results for fault 3 with and without the handling scheme;
Fig. 9 - flow chart of the distributed real-time fault detection and compensation control method based on information interaction.
Detailed description of the embodiments
The present invention is further described below with reference to the drawings and an example.
The system and detection models are given first:
In a real multi-agent system, the motion of the nodes is usually taken as the control objective, so that the motion states or position distribution of the nodes meet the control requirement. For a node modeled as a single integrator, the dynamics take the following form:
$$\dot{x}_i(t) = u_i(t) \tag{1}$$
This formula shows that the control input of a node corresponds to the derivative of its state, which in general is the node's velocity. Formula (1) gives the node dynamics in continuous time. In a real control system, however, nodes must sample state information and the sampling period cannot be infinitely small, so a discrete-time node dynamics model is needed. From computer control theory, discretizing the model above yields the typical discrete-time single-integrator model:
$$z_i((k+1)T) = z_i(kT) + u_i(kT)\,T, \quad i = 1, \dots, N \tag{2}$$
where T is the sampling period. For brevity, write $z_i^k = z_i(kT)$ and $u_i^k = u_i(kT)$, where $z_i^k \in \mathbb{R}^2$ is the position of node i in two-dimensional space and $u_i^k \in \mathbb{R}^2$ is the control input of node i at time step k. Physically, the model takes the node positions as the control objective: adjusting each node's velocity at every time step k changes its position until the distribution of nodes meets the control requirement. Whether information is carried over a wired or a wireless network, a multi-agent system achieves cooperative control through information exchange between nodes. The whole information interaction network can be described by a graph $G = \{V, E\}$, where $V = \{1, \dots, N\}$ is the vertex set, each vertex representing a node of the system, and $E \subseteq V \times V$ is the edge set. We define: if node i can transmit its own information to node j, then i is an adjacent node of j, i.e. $(i, j) \in E$. Let $N_i^k = \{i_1, \dots, i_p\}$ denote the set of adjacent nodes of node i at time step k, and $|N_i^k|$ its cardinality. We further assume that G is an undirected graph, that is, $(i, j) \in E \Leftrightarrow (j, i) \in E$.
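The discrete model of formula (2) and the undirected-graph assumption can be illustrated with a minimal Python sketch. The function names, the edge set, and the numeric values below are illustrative assumptions and do not come from the patent.

```python
# Discrete single-integrator update, formula (2): z_i^{k+1} = z_i^k + u_i^k * T.
# All names and sample values here are illustrative assumptions.

def step(z, u, T):
    """Advance one node's 2-D position z by one sampling period T under input u."""
    return (z[0] + u[0] * T, z[1] + u[1] * T)

def is_undirected(edges):
    """Check the assumption (i, j) in E <=> (j, i) in E on a set of ordered pairs."""
    return all((j, i) in edges for (i, j) in edges)

def neighbors(i, edges):
    """Adjacent-node set N_i: every j that can transmit its information to i."""
    return sorted(src for (src, dst) in edges if dst == i)

T = 0.5
z_next = step((1.0, 2.0), (2.0, -1.0), T)
edges = {(1, 2), (2, 1), (2, 3), (3, 2)}   # a three-node line graph
```

With these values the node moves from (1.0, 2.0) to (2.0, 1.5), and node 2 has the adjacent nodes 1 and 3.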
In addition, the control law of a node is assumed to have the following structure:
$$u_i^k = P_i(z_i^k, I_i^k) \tag{3}$$
where $P_i: \mathbb{R}^2 \to \mathbb{R}^2$ is the control protocol, determined by the control objective of the node, and $I_i^k = \{z_{i_1}^k, \dots, z_{i_p}^k\}$ is the set of states of the adjacent nodes of node i at time step k, with $N_i^k = \{i_1, \dots, i_p\}$ and $p = |N_i^k|$. The structure in formula (3) is common in multi-agent cooperative control: a node's control input is determined jointly by its own current state and the states of all its adjacent nodes. $P = \{P_1, \dots, P_N\}$ is the preset set of control protocols; if $P_1 = P_2 = \dots = P_N$, P is called a homogeneous control protocol, otherwise a non-homogeneous control protocol. The present invention considers only the case in which the control protocol is homogeneous.
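The patent fixes only the structure of the control law in formula (3), not the protocol itself. As a hedged illustration, the sketch below plugs in a standard consensus-style protocol; the function name `consensus_protocol`, its `gain` parameter, and the specific form of the law are assumptions chosen for the example.

```python
def consensus_protocol(z_i, neighbor_states, gain=1.0):
    """Hypothetical homogeneous protocol P with the structure of formula (3).

    Computes u_i = gain * sum_j (z_j - z_i) over the neighbor states I_i.
    This specific form is an assumption; the source only requires
    u_i^k = P(z_i^k, I_i^k).
    """
    ux = gain * sum(z_j[0] - z_i[0] for z_j in neighbor_states)
    uy = gain * sum(z_j[1] - z_i[1] for z_j in neighbor_states)
    return (ux, uy)

# Node at the origin with two neighbors; every node would use the same function,
# which is what makes the protocol homogeneous.
u = consensus_protocol((0.0, 0.0), [(1.0, 0.0), (0.0, 2.0)])
```

Because every node evaluates the same function, any node that knows $z_i^k$ and $I_i^k$ can reproduce the control input node i should have computed, which is the property the detection scheme relies on.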
Let F denote the set of all faulty nodes in the system, and let $q_{i,j}^k$ denote the detection result of node i for node j at time step k, which satisfies:
$$q_{i,j}^k = \begin{cases} 0, & j \notin F \\ 1, & j \in F \end{cases} \tag{4}$$
At each time step k, a node performs fault detection on all of its adjacent nodes and obtains the detection results $\{q_{i,j}^k \mid j \in N_i^k\}$. In addition, the system's detection result for a node is defined as the combination of the detection results reported for that node by all of its adjacent nodes:
$$Q_i^k = \frac{1}{p} \sum_{j=i_1}^{i_p} q_{j,i}^k \tag{5}$$
where $N_i^k = \{i_1, \dots, i_p\}$ and $p = |N_i^k|$. Through data exchange, each node can obtain the system's evaluation of itself; how this is obtained is described in detail below.
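Formula (5) is a plain average of binary votes. A small sketch makes the arithmetic concrete; the dictionary layout and the name `combined_result` are assumptions for illustration.

```python
def combined_result(votes):
    """Formula (5): Q_i^k = (sum of q_{j,i}^k over the neighbors j of i) / p."""
    p = len(votes)
    if p == 0:
        raise ValueError("node has no adjacent nodes")
    return sum(votes.values()) / p

# q_{j,i}^k reported about node i by its neighbors j = 2, 3, 5 (1 = judged faulty).
votes_about_i = {2: 1, 3: 0, 5: 1}
Q_i = combined_result(votes_about_i)
```

Here two of three neighbors flag node i, so $Q_i^k = 2/3$.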
The information interaction model between nodes is analyzed below:
In a multi-agent system, each node controls itself by perceiving its surroundings. If a node's perception of the environment is based on the exchange of information between nodes, the model is called a model based on information interaction. In the information interaction model discussed here, the information exchanged between nodes consists of the following parts:
Content 1: at time step k, node $i \in V$ transmits the control input $u_i^k$ obtained from formula (3) and its own current state $z_i^k$ to all of its adjacent nodes $j \in N_i^k$.
Content 2: at time step k, node $i \in V$ transmits the states of its adjacent nodes, $I_i^k$, and the system's detection results for its adjacent nodes obtained from formula (5), $\{Q_j^k \mid j \in N_i^k\}$, to all of its adjacent nodes $j \in N_i^k$.
Content 3: at time step k, node $i \in V$ transmits its detection results for the corresponding adjacent nodes, $\{q_{i,j}^k\}$, and the adjacent nodes' detection results for i, $\{q_{j,i}^k\}$, to all of its adjacent nodes $j \in N_i^k$.
It is worth noting that in content 3 the adjacent nodes' detection results for i are transmitted as the individual values $q_{j,i}^k$ rather than in the aggregated form $Q_i^k$, although the two carry equivalent meaning. This mainly covers the case in which node i is malicious: if $Q_i^k$ were transmitted directly, a malicious node could deliberately alter it so that the system fails to detect the node. With the $q_{j,i}^k$ form, the data include the receiving node's own detection result for the malicious node, which can be used to cross-check the information, or the result can be confirmed by checking data with the other adjacent nodes of the malicious node. This belongs to the field of information countermeasures and is not discussed in detail here.
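The three content items can be pictured as one per-step broadcast record. The sketch below is only one possible layout: the class name `Broadcast`, its field names, and the helper `recompute_Q` are hypothetical. It also shows why sending the raw votes $q_{j,i}^k$ lets a receiver rebuild $Q_i^k$ itself instead of trusting a value reported by node i.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Vec = Tuple[float, float]

@dataclass
class Broadcast:
    """What node i sends to every adjacent node at step k (hypothetical layout)."""
    sender: int
    u: Vec                                                          # content 1: u_i^k
    z: Vec                                                          # content 1: z_i^k
    neighbor_states: Dict[int, Vec] = field(default_factory=dict)   # content 2: I_i^k
    neighbor_Q: Dict[int, float] = field(default_factory=dict)      # content 2: {Q_j^k}
    my_votes: Dict[int, int] = field(default_factory=dict)          # content 3: q_{i,j}^k
    votes_about_me: Dict[int, int] = field(default_factory=dict)    # content 3: q_{j,i}^k

def recompute_Q(msg: Broadcast) -> float:
    """Receiver side: rebuild Q_i^k from the raw votes with formula (5)."""
    votes = msg.votes_about_me
    return sum(votes.values()) / len(votes)

msg = Broadcast(sender=1, u=(0.0, 0.0), z=(1.0, 1.0),
                votes_about_me={2: 1, 3: 1, 4: 0, 5: 0})
Q_1 = recompute_Q(msg)
```

A receiver that is itself one of the voters can additionally compare its own entry in `votes_about_me` with the vote it actually sent, which is the cross-check the text mentions.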
Typical fault types are given below:
Different systems define faults differently because their composition and modes of operation differ. For a system whose functions are relatively independent and whose structure is fully known, faults can be defined by their cause. For an automobile, for example, faults such as engine damage or brake failure can be defined specifically for the power system, the braking system, and so on. The benefit is precision: damage to the system is minimized and its functions stay unaffected, and most real-world systems define faults this way. For a multi-agent system, however, the modes of operation are complex and varied and the topologies differ, so the specific cause of a fault is often hard to determine and faults cannot be defined at the source. Since a multi-agent system completes its control task through the cooperation of many agents and a single agent has no decisive effect on the system, one can skip analyzing the cause of a fault and instead define faults by a node's actual running result: whenever the running result of a node fails to meet the system requirement, the node is deemed faulty and removed from the system. This way of defining faults does some damage to the system, but the loss is still small compared with halting the whole system to repair one faulty node. It also greatly reduces the difficulty of detecting faults and allows faults to be handled in real time, so that the system can still complete its intended task as far as possible while faults are present. Based on real multi-agent systems, the following typical fault types are given:
Fault 1: destructive fault. The node abnormally stops moving during operation, or it has a tendency to move but its actual state does not change as expected. Possible causes are an external attack that damages the power system, or exhaustion of the node's energy so that it loses its power source. A special case is also included: the node cannot continue moving because of its surroundings or its own program, for example when it is stuck in impassable terrain or when a defect in the control program traps it at a local extremum. The node itself is undamaged in this case, but it cannot move normally and is therefore classed as having a destructive fault.
Fault 2: runaway (out-of-control) fault. The node's motion is uncontrolled: its velocity stays constant or changes abnormally, so that the control effect fails to meet the system requirement. Possible causes are an error in the control system that prevents it from generating control information, or a disconnection between the node's power system and control system so that the actuator does not receive the correct control input. This fault is also the one most likely to appear when the system is under malicious attack, so it can serve as one indicator of whether malicious nodes are present. If it is detected, precautions should be taken promptly to prevent malicious information from spreading further.
Fault 3: interference fault. The node exhibits a large amount of random motion, and the deviation between its actual and theoretical running states is so large that it endangers the normal operation of the system. This fault has many causes and is common in real multi-agent systems. The node may be subject to strong random external disturbances, such as very rough terrain, or the precision of its actuators may be insufficient, producing large random errors. Such faults should be handled with caution, because errors are unavoidable: if the detection procedure is too strict, many nodes may be judged faulty, causing unnecessary losses. Filtering algorithms and similar techniques can be considered to compensate for this problem.
In addition, for a model based on information interaction, a node's perception of its surroundings and the computation of its own control input rely entirely on data exchange with adjacent nodes. Data exchange is therefore the bridge connecting nodes and is vital to their normal operation. For the three fault types above, if the fault lies in a node's controller or power system while its information interaction function remains normal, it is called a class I fault; if the node's information interaction system is damaged and it cannot exchange data normally, it is called a class II fault.
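For simulation purposes, the three fault types can be mimicked by distorting the velocity a node actually executes. The sketch below is a simplified assumption, not the patent's fault model: the function name, the fault labels, the fixed runaway velocity `stuck`, and the uniform noise of amplitude `noise` are all hypothetical choices.

```python
import random

def actual_velocity(u_cmd, fault=None, stuck=(1.0, 0.0), noise=0.5, rng=random):
    """Velocity a node really executes, given its commanded input u_cmd."""
    if fault is None:
        return u_cmd
    if fault == "destructive":       # fault 1: the node stops moving
        return (0.0, 0.0)
    if fault == "runaway":           # fault 2: velocity no longer follows the command
        return stuck
    if fault == "interference":      # fault 3: large random disturbance on the command
        return (u_cmd[0] + rng.uniform(-noise, noise),
                u_cmd[1] + rng.uniform(-noise, noise))
    raise ValueError(f"unknown fault type: {fault}")

rng = random.Random(0)               # seeded so the run is repeatable
u_cmd = (2.0, -1.0)
samples = {f: actual_velocity(u_cmd, f, rng=rng)
           for f in (None, "destructive", "runaway", "interference")}
```

All three cases are class I faults in the sense of the text: the node's motion is wrong while its communication is assumed to keep working.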
The concrete implementation of the fault detection scheme based on information interaction is given below:
As discussed above, the invention defines faults by the control effect of a node: once a node's actual motion state fails to meet the control requirement, it is considered faulty. A fault detection scheme follows naturally: monitor the running state of the system, and if the theoretical and actual running states of a node differ by an error r that exceeds a certain range, conclude that the node is faulty.
Because the node model considered here is the single-integrator model, the output is reflected in the node's successive displacement $z_i^{k+1} - z_i^k$, or equivalently in its velocity output $u_i^k$. We therefore use $u_i^k$ as the performance index for computing the system residual signal: $r_i^k = u_i^{r,k} - u_i^{a,k}$, where $u_i^{r,k} \in \mathbb{R}^2$ is the theoretical motion state obtained from the control protocol P at time step k, and $u_i^{a,k} \in \mathbb{R}^2$ is the actual motion state obtained by real-time measurement. They satisfy:
$$u_i^{r,k} = u_i^k = P_i(z_i^k, I_i^k) \tag{6}$$
$$u_i^{a,k} = h(z_i^{k+1}, z_i^k) \tag{7}$$
If the state of node i is continuous, $z_i^{k+1}$ and $z_i^k$ can be measured by the node's built-in sensors, and h can take the simple first-order difference form $(z_i^{k+1} - z_i^k)/[(k+1)T - kT]$.
A faulty node is now defined as follows:
Definition 1: a node i modeled as a single integrator is called a faulty node if it satisfies the condition in formula (8).
$$\|r_i^k\| = \|u_i^{r,k} - u_i^{a,k}\| > \chi(\|u_i^{r,k}\|, \delta) \tag{8}$$
Here $\chi(\|u_i^{r,k}\|, \delta)$ is called the threshold function; its value depends on the magnitude of the input signal $\|u_i^{r,k}\|$ and on the disturbance $\delta$. In general one can take $\chi(\|u_i^{r,k}\|, \delta) = \gamma_1 + \gamma_2 \|u_i^{r,k}\|$, where the constant $\gamma_1$ depends on the disturbance $\delta$ and the time-varying term $\gamma_2 \|u_i^{r,k}\|$ depends on the node's instantaneous input.
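The residual test of formulas (7) and (8) can be sketched in a few lines. The affine threshold follows the form given in the text; the function names and the numeric values of $\gamma_1$, $\gamma_2$, and T are illustrative assumptions.

```python
import math

def norm(v):
    """Euclidean norm of a 2-D vector."""
    return math.hypot(v[0], v[1])

def threshold(u_ref, gamma1, gamma2):
    """Threshold function chi = gamma1 + gamma2 * ||u_ref||."""
    return gamma1 + gamma2 * norm(u_ref)

def measured_velocity(z_next, z_now, T):
    """Formula (7) with h taken as a first-order difference over one period T."""
    return ((z_next[0] - z_now[0]) / T, (z_next[1] - z_now[1]) / T)

def is_faulty(u_ref, u_act, gamma1, gamma2):
    """Formula (8): fault if ||u_ref - u_act|| exceeds the threshold."""
    residual = (u_ref[0] - u_act[0], u_ref[1] - u_act[1])
    return norm(residual) > threshold(u_ref, gamma1, gamma2)

T, gamma1, gamma2 = 0.5, 0.1, 0.2
u_ref = (3.0, 4.0)                                   # ||u_ref|| = 5, so chi = 1.1
healthy = measured_velocity((1.5, 2.0), (0.0, 0.0), T)   # node moved as commanded
stuck = measured_velocity((0.0, 0.0), (0.0, 0.0), T)     # node did not move
flags = (is_faulty(u_ref, healthy, gamma1, gamma2),
         is_faulty(u_ref, stuck, gamma1, gamma2))
```

The term $\gamma_2 \|u_i^{r,k}\|$ makes the test more tolerant when the commanded velocity is large, so that proportional actuation error is not flagged as a fault.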
For a system that may contain faulty nodes, the control objective is that the system completes its original task while detecting and isolating the faulty nodes. Because faulty nodes cannot take part in the original task, we stipulate that the whole system is considered to have reached its goal if the nodes that have not failed complete the original task.
As shown in Fig. 2, the present invention proposes the following fault detection scheme:
Suppose node j performs fault detection on node i. At time step k, through information interaction contents 1 and 2 described above, node j obtains node i's current state $z_i^k$ and the information of all of i's adjacent nodes, $I_i^k$. Because the control protocol is homogeneous, node j can use its own control protocol together with $I_i^k$ to compute the theoretical control input $u_i^{r,k}$ of node i. At the next time step k+1, node j similarly obtains $z_i^{k+1}$ and $u_i^{r,k+1}$, and uses formula (7) to compute $u_i^{a,k}$. Node j can then use formula (8) to judge whether node i was faulty at time step k.
Intuitively, this fault detection scheme obtains the information of the target node's adjacent nodes, computes the target node's theoretical motion state through the homogeneous control protocol, and compares it with the measured actual motion state; if the error exceeds a certain magnitude, the node is judged to be faulty.
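The two-step procedure (predict at step k, verify at step k+1) can be sketched as a small monitor that node j keeps for each adjacent node i. The class name `NeighborMonitor`, the consensus-style protocol used as the shared control law, and all numeric values are assumptions for illustration.

```python
import math

def protocol(z_i, neighbor_states):
    """Assumed homogeneous protocol: u_i = sum_j (z_j - z_i)."""
    return (sum(z[0] - z_i[0] for z in neighbor_states),
            sum(z[1] - z_i[1] for z in neighbor_states))

class NeighborMonitor:
    """Node j's detector for one adjacent node i (hypothetical helper)."""

    def __init__(self, T, gamma1, gamma2):
        self.T, self.gamma1, self.gamma2 = T, gamma1, gamma2
        self.prev = None                  # (z_i^k, u_i^{r,k}) from the last step

    def update(self, z_i, states_of_i_neighbors):
        """Feed the data received this step; return q for the previous step or None."""
        u_ref_now = protocol(z_i, states_of_i_neighbors)       # formula (6)
        verdict = None
        if self.prev is not None:
            z_prev, u_ref = self.prev
            u_act = ((z_i[0] - z_prev[0]) / self.T,             # formula (7)
                     (z_i[1] - z_prev[1]) / self.T)
            residual = math.hypot(u_ref[0] - u_act[0], u_ref[1] - u_act[1])
            chi = self.gamma1 + self.gamma2 * math.hypot(u_ref[0], u_ref[1])
            verdict = 1 if residual > chi else 0                # formula (8)
        self.prev = (z_i, u_ref_now)
        return verdict

ok = NeighborMonitor(T=0.5, gamma1=0.1, gamma2=0.1)
first = ok.update((0.0, 0.0), [(2.0, 0.0)])       # no verdict yet
q_ok = ok.update((1.0, 0.0), [(2.0, 0.0)])        # moved exactly as predicted

bad = NeighborMonitor(T=0.5, gamma1=0.1, gamma2=0.1)
bad.update((0.0, 0.0), [(2.0, 0.0)])
q_bad = bad.update((0.0, 0.0), [(2.0, 0.0)])      # did not move at all
```

The verdict for step k only becomes available at step k+1, because the actual velocity in formula (7) needs two consecutive positions.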
The scheme above yields the detection result of a single node, which is strongly affected by random factors and is therefore not very reliable. For example, delays and data loss often occur during information exchange. If a node does not receive a neighbor's information in time, or receives incomplete information, it is likely to misjudge that neighbor as faulty and isolate it; these operations will in turn be regarded as faults by its own adjacent nodes, so that the node itself is detected as faulty. If this continues, many normal nodes are isolated by mistake, wasting resources and possibly preventing the global objective from being achieved. In view of this, the invention adopts the gossip algorithm to process the detection results of the individual nodes and improve their accuracy. The scheme is illustrated in Fig. 3 and implemented as follows:
Take node i. First, at time step k, node i performs fault diagnosis on its own, using formulas (8) and (4) to obtain its diagnostic results $\{q_{i,j}^k \mid j \in N_i^k\}$ for all adjacent nodes; meanwhile, all adjacent nodes of i do the same. Then, as in information interaction content 3, node i sends each adjacent node its diagnostic result for that node and receives the adjacent nodes' diagnostic results for i, $\{q_{j,i}^k \mid j \in N_i^k\}$. Finally, node i sends the adjacent nodes' detection results about itself to all of its adjacent nodes and receives the corresponding results from them. Using formula (5), node i can then compute the system's detection result $Q_j^k$ for each adjacent node j. A parameter $Q_{con}$ is set, and when $Q_j^k \geq Q_{con}$ node j is judged to be faulty. In general, $Q_{con}$ is a constant in the interval (0, 1], whose value is affected by factors such as the precision of the node actuators, the strength of environmental interference, and the quality of information exchange between nodes. The larger $Q_{con}$ is, the more reliable a fault verdict is, but the greater the probability of a missed detection; its value should therefore be chosen to suit the actual system.
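The effect of the vote exchange can be sketched by tallying the individual results and applying the confidence parameter. The sketch collapses the message passing into a single tally, which is a simplification; the comparison $Q \geq Q_{con}$, the function name, and the example votes are assumptions.

```python
def system_verdicts(votes, q_con):
    """Combine single-node results into system results.

    votes[j][i] is q_{j,i}: node j's verdict on its adjacent node i.
    Returns {i: (Q_i, is_faulty)} with Q_i from formula (5) and the
    assumed decision rule Q_i >= q_con.
    """
    if not 0.0 < q_con <= 1.0:
        raise ValueError("q_con must lie in (0, 1]")
    tally = {}
    for voter, ballot in votes.items():
        for target, q in ballot.items():
            tally.setdefault(target, []).append(q)
    result = {}
    for target, qs in tally.items():
        Q = sum(qs) / len(qs)
        result[target] = (Q, Q >= q_con)
    return result

# Node 3 is really faulty; node 4 missed it (e.g. packet loss), and node 3
# wrongly flags node 2.
votes = {1: {2: 0, 3: 1}, 2: {1: 0, 3: 1}, 3: {1: 0, 2: 1}, 4: {3: 0}}
verdicts = system_verdicts(votes, q_con=0.6)
faulty = sorted(i for i, (_, bad) in verdicts.items() if bad)
```

With $Q_{con} = 0.6$, node 3 is isolated on a 2/3 vote, while the single wrong vote against node 2 (Q = 0.5) is not enough to isolate it.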
The specific implementation of the fault isolation and repair scheme is given below:
After fault detection is complete, the system usually needs to isolate the faulty node in order to eliminate its influence on the remaining normal nodes. In a distributed multi-agent system, moreover, each node carries out its fault detection task independently, so a faulty node is likely to be detected by its neighbors at different times. Applying the gossip algorithm to process the detection results also introduces some delay. As a result, the time at which a node fails and the time at which the system diagnoses it as faulty generally differ. During this interval the faulty node continues to act on the system and causes the final control result to deviate. To eliminate this effect, a fault repair algorithm is proposed that compensates the control quantity of the system by applying an external signal.
From the discussion above, the nodes of the system may detect a faulty node at different times. If each node isolated and repaired the fault at the moment it detected it, that isolation and repair operation would itself probably be diagnosed as a fault by the node's neighbors, which would then take the same action against it. This would spread step by step and finally cause the whole system to collapse. It is therefore necessary to fix a common moment at which all nodes act on faults. A new parameter is introduced, the fault detection and repair period, denoted $T_p = p^* T$, where the constant $p^* \in Z^+$ and $T$ is the sampling time. In each period $T_p$, nodes carry out fault detection and information processing during $k \in [k^* T_p + T, (k^*+1) T_p - T]$, and carry out isolation and repair of faulty nodes at $k = (k^*+1) T_p$, $k \in Z^+$. Note that fault isolation and repair are not routine operations and are likely to be detected as faults by neighboring nodes; the fault detection function of every node should therefore be temporarily disabled at $k = (k^*+1) T_p$, $k \in Z^+$.
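The periodic schedule can be sketched in Python as follows. The sketch assumes that time is counted in integer multiples of the sampling time T, so that one detection and repair period spans `period_steps` steps; the function names and the string labels are illustrative and do not come from the patent.

```python
def phase(k: int, period_steps: int) -> str:
    """Classify time step k (counted in units of the sampling time T).

    Steps k*P + 1 .. (k*+1)P - 1 of each period are used for detection and
    gossip processing; the last step (k*+1)P is reserved for isolation and
    repair. P = period_steps.
    """
    if period_steps < 2:
        raise ValueError("period_steps must be >= 2 so a detection window exists")
    if k < 1:
        raise ValueError("time steps are counted from 1")
    return "repair" if k % period_steps == 0 else "detect"


def detection_enabled(k: int, period_steps: int) -> bool:
    """Fault detection is masked during the isolation and repair step."""
    return phase(k, period_steps) == "detect"
```

For a period of four steps, steps 1 to 3 are detection steps and step 4 is the common repair step, after which the pattern repeats. Because every node evaluates the same rule on the same step counter, all nodes act on faults at the same moment and none of them monitors its neighbors while repairs are in progress.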
The isolation scheme for faulty nodes is given below:
Fault isolation means removing the control contribution of the faulty node from its neighbors and at the same time blocking the information interaction channel through which the faulty node receives its neighbors' information, so that the influence of the faulty node is eliminated. It is easy to see that when a node detects that one of its neighbors is faulty, it only needs to remove that neighbor from its neighbor set and stop sending its own state information to it; this completes the isolation. Note that the faulty node's information is still passed on to the node's own neighbors: those neighbors have no direct link to the faulty node, so sending this information has little effect on the system. The node stops sending its own information mainly for reasons of information security. Because the cause of the fault is unknown, the faulty node may be under the control of an adversary, and data sent to it could be exploited by a malicious node to affect the sender's own control. However, stopping transmission also creates a problem: the faulty node can no longer receive its neighbors' information, so it judges those neighbors to be faulty and likewise stops sending to them. Once the faulty node has stopped sending information to all of its neighbors, it is completely invisible to the system, and the harm its motion does to the system can no longer be avoided. This can lead to many situations unfavorable to the system, such as collisions between nodes or catastrophic collapse of the system topology. To avoid these situations, the following operation is defined for the nodes:
Operation 1: when a node cannot receive the information of one of its neighbors, it uses the evaluations of itself that it has already received from its neighbors, together with formula (5), to obtain the system's diagnostic result for itself. If this result exceeds the threshold $Q_{con}$, the node concludes that it is itself faulty. It then disables its own fault detection function but continues to send data to its neighbors, except that the information no longer includes the part listed in content 3.
With this operation, the faulty node no longer evaluates the remaining normal nodes, but its motion remains visible to the system, so the system can react to its destructive behavior early. In addition, the retained information interaction contents 1 and 2 turn the faulty node into a relay node for information transmission, which prevents the fault isolation operation from causing catastrophic collapse of the system topology. This point is discussed in detail later.
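A minimal Python sketch of the message rule in Operation 1 follows. The field layout is an assumption: content 2 is taken to be the states of the sender's neighbors and content 3 the sender's diagnostic results, as the text indicates, while content 1 is assumed to be the sender's own state because it is not defined in this passage. The names `Message` and `build_message` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Message:
    own_state: float                        # content 1 (assumed: sender's own state)
    neighbor_states: Dict[int, float]       # content 2: states of the sender's neighbors
    diagnoses: Optional[Dict[int, float]]   # content 3: sender's verdicts, None if withheld


def build_message(own_state: float,
                  neighbor_states: Dict[int, float],
                  diagnoses: Dict[int, float],
                  self_faulty: bool) -> Message:
    """Build the outgoing message of a node.

    A node that has judged itself faulty keeps sending contents 1 and 2,
    so it stays visible and can act as a relay, but drops content 3 so
    that it no longer evaluates the remaining normal nodes.
    """
    return Message(
        own_state=own_state,
        neighbor_states=dict(neighbor_states),
        diagnoses=None if self_faulty else dict(diagnoses),
    )
```

Keeping contents 1 and 2 in the message is what later allows the neighbors of a faulty node to learn about each other through it.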
The fault repair scheme is given below:
The purpose of fault repair is as follows: if a faulty node is not isolated in time after it fails and has already affected the system, fault repair is applied to remove that effect. An intuitive approach is to separate out the control contribution of the faulty node, negate it, and add it back to the original control quantity so as to cancel the faulty node's influence. This approach, however, requires the control protocol of the nodes to have a linearly superposable form, so that the contribution of the faulty node can be separated. In complex multi-agent systems the control protocol is often nonlinear and not superposable. The present invention therefore starts from the actual control effect of the nodes, defines a new compensation calculation and fault repair scheme, and proves the feasibility of this scheme.
For any node i in the multi-agent system, if it is not faulty, its desired output $u_i$ and its actual output $y_i$ must satisfy the relation shown in Fig. 4. The execution error of the actuator itself is ignored here.
In the figure, $u_{imax}$ denotes the maximum desired output of node i and $y_{imax}$ the upper limit of the node's actual output. For a real node, when the desired output exceeds the upper limit the node can actually reach, the node can only operate at its maximum output $y_{imax}$, so part of the control quantity is not reflected in the output. To prevent this saturation behavior from being diagnosed as a fault, an amplitude limit must be imposed on the control quantity so that the desired output does not exceed $u_{imax}$. The desired and actual outputs then satisfy the following relation:
$$y_i = a \cdot u_i \tag{9}$$
where the constant $a \in R^+$ is the output gain of the system; in the present invention, $a = 1$ is assumed.
Note that $u_i$ here is only a mathematical description of the node's desired output, not the actual output of the controller. The nonlinearities of a real actuator are complex and varied and are not limited to simple saturation; the controller must also use a suitable control algorithm, such as PID control or fuzzy control, to ensure that the node performs its output task normally. In addition, the actual output $y_i$ mentioned above refers to the idealized steady-state response of the node; its dynamic response is outside the scope of the present invention.
The following operation is defined for calculating the compensation quantity:
Operation 2: when node i detects at time $k = T_i$ that its neighbor j is faulty, it uses formula (3) to compute its own control quantity $u_{i \setminus j}^k$ under the condition that j has no influence. At the same time it computes the value of $u_i^k - u_{i \setminus j}^k$ and accumulates its negation, until the next fault isolation and repair moment $k = T_{ip}$ is reached.
From Operation 2, the compensation quantity that node i must apply to eliminate the influence of the faulty node j is:
$$u_i^{\mathrm{comp},j} = -\sum_{k=T_i}^{T_{ip}} \left( u_i^k - u_{i \setminus j}^k \right) \tag{10}$$
Because the output of a real system has a maximum amplitude, the compensation quantity obtained from formula (10) cannot simply be added to the original control quantity. It must be checked whether adding the compensation quantity would push the control quantity beyond the amplitude limit and leave the compensation incomplete. The following operation is defined for this purpose:
Operation 3: at $k = (k^*+1) T_p$, $k \in Z^+$, if node i has confirmed that node j is faulty, the compensation quantity obtained from formula (10) is added to the original control quantity, and the resulting control quantity is checked against the amplitude limit. If it exceeds the limit, the part beyond the limit is assigned back to the compensation quantity, and compensation continues at the next isolation and repair moment. If it does not, the compensation quantity is reset to zero and the repair is complete.
From Operation 3, after several fault detection and repair periods the compensation quantity will have been added to the control quantity in full, at which point the repair for the faulty node is complete.
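Operations 2 and 3 can be sketched in Python as follows. The sketch assumes a scalar control quantity and a symmetric amplitude limit `[-u_max, u_max]`; both are simplifications introduced for illustration, and the function names are hypothetical. The accumulation follows formula (10).

```python
from typing import Sequence, Tuple


def accumulate_compensation(u_with_j: Sequence[float],
                            u_without_j: Sequence[float]) -> float:
    """Formula (10): negated sum of (u_i^k - u_{i\\j}^k) over the detection window."""
    return -sum(u - u_free for u, u_free in zip(u_with_j, u_without_j))


def apply_compensation(u: float, comp: float, u_max: float) -> Tuple[float, float]:
    """Operation 3 at one repair instant.

    Adds the pending compensation to the control quantity, clips the result
    to the assumed limit [-u_max, u_max], and returns the applied control
    together with the compensation that is still outstanding.
    """
    desired = u + comp
    applied = max(-u_max, min(u_max, desired))
    leftover = desired - applied  # zero when the limit is not exceeded
    return applied, leftover
```

As an example, if the control quantity with the faulty neighbor was 1.0, 1.5, 2.0 over three steps and would have been 0.5 at each step without it, the accumulated compensation is -3.0. Applying it to a nominal control of 0.5 with `u_max = 2.0` saturates at -2.0 and leaves -0.5 outstanding, which is then applied in full at the next repair instant.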
The network connectivity maintenance scheme based on two-hop information is given below:
From the analysis above, for a node with a class II fault, i.e. a node whose information interaction system has been destroyed, the effect on network connectivity cannot be repaired under the existing topology. However, if the node still retains normal information interaction capability, it can be regarded as a relay node for information transmission and used to establish two-hop information transmission paths, thereby repairing a network topology that might otherwise be catastrophically damaged. The implementation is as follows:
As shown in Fig. 1, suppose node 3 fails. If only the isolation and repair operations are applied to it, no information will be exchanged between nodes 1, 2 and nodes 4, 5, 6, 7; the connectivity of the original graph is destroyed and the system can no longer achieve its cooperative control objective. From the rule for faulty nodes in Operation 1, however, the neighbors of the faulty node can still receive information interaction contents 1 and 2 from it, and content 2 contains the complete information of its neighbors. The faulty node can therefore serve as a relay node for information transmission, and virtual information transmission paths can be set up between pairs of its neighbors that are not adjacent to each other, preserving the connectivity of the original graph. The following operation is defined:
Operation 4: if node i detects that its neighbor j is faulty, then after completing the fault isolation operation it examines the information about j's neighbors contained in node j's information interaction content 2. If a neighbor l of node j is not already a neighbor of node i, node i sets $z_l^k \in I_i^k$, $Q_l^k \in \{ m \in N_i^k \mid Q_m^k \}$.
After Operation 4, the topology of Fig. 1 becomes the topology shown in Fig. 5, in which node 3 is faulty. The dashed arrows represent two-hop information transmission channels that use node 3 as a relay node.
From Operation 4, if two neighbors of the faulty node have no direct information interaction link, the operation establishes a virtual information transmission channel between them, so that the two nodes become neighbors in a theoretical sense. This ensures that the connectivity of the original graph is not destroyed.
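The effect of Operation 4 on connectivity can be illustrated with a short Python sketch. The graph is stored as an adjacency map of an undirected graph, and the example topology is hypothetical (it is not the topology of Fig. 1). In the sketch the faulty node is dropped from the logical topology and its neighbors are linked pairwise; in the scheme itself these links are virtual and are relayed through the faulty node.

```python
from itertools import combinations
from typing import Dict, Set

Graph = Dict[int, Set[int]]


def isolate_with_two_hop(adj: Graph, faulty: int) -> Graph:
    """Remove `faulty` and add a virtual edge between each pair of its neighbors."""
    neighbors = adj[faulty]
    new_adj = {n: set(nb) - {faulty} for n, nb in adj.items() if n != faulty}
    for a, b in combinations(sorted(neighbors), 2):
        new_adj[a].add(b)
        new_adj[b].add(a)
    return new_adj


def is_connected(adj: Graph) -> bool:
    """Depth-first search connectivity check for an undirected graph."""
    if not adj:
        return True
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(adj)


# Hypothetical example: node 3 is the only link between {1, 2} and {4, 5}.
example: Graph = {1: {3}, 2: {3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
```

In this example, simply deleting node 3 leaves nodes 1 and 2 cut off from nodes 4 and 5, whereas `isolate_with_two_hop(example, 3)` yields a connected graph because nodes 1, 2 and 4 become mutual neighbors.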
The feasibility of the proposed fault detection, isolation and repair scheme is now verified for the consistency (consensus) problem in multi-agent system control.
First suppose that the weights $a_{ij}$ of all edges are equal. The control protocol P is then homogeneous, i.e. all nodes generate their control quantities in the same way. Using information interaction contents 1 to 3, each node receives the information of its neighbors and diagnoses them with its own control protocol; the gossip algorithm is then used to process the results, which completes fault detection. Operations 1 to 4 are then used to carry out fault isolation, repair and network connectivity maintenance. No special condition restricts the application of the fault handling scheme anywhere in this process. It follows that if the edge weights in the multi-agent network are equal, the fault handling scheme can be applied to the consistency problem.
In a real multi-agent system, however, the edge weights may differ between edges and between information interaction networks, so that the control protocol P is no longer homogeneous. In that case information interaction content 2 must be modified: node i no longer transmits the state information $I_i^k$ of its neighbors but transmits new information instead, where $\{i_1, \dots, i_p\} = N_i^k$. The control protocol between nodes then no longer contains the inhomogeneous terms $a_{ij}$ and can still be treated as a homogeneous control protocol. By the analysis above, the original scheme therefore remains applicable.
Software simulation results are given below:
Figs. 6 to 8 show the results of a consistency control simulation of 8 agents carried out in MATLAB. Fig. 6 (left), Fig. 7 (left) and Fig. 8 (left) show the results after fault detection, isolation and repair have been applied to fault 1, fault 2 and fault 3 respectively; Fig. 6 (right), Fig. 7 (right) and Fig. 8 (right) show the corresponding results when no fault handling is applied. The figures show that if the fault is not handled, the remaining nodes are gradually pulled away from the intended target by the faulty node, so that the control objective of the whole system cannot be achieved. Once the fault handling scheme is applied, the faulty node no longer affects the remaining nodes, and the remaining normal nodes still achieve consistency control as expected.
The above describes only preferred embodiments of the present invention, and the present invention is not limited to these embodiments. Any partial modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (2)

1. A distributed real-time fault detection and compensation control method based on information interaction, characterized by comprising the following steps:
Step 1, system and fault modeling: the modeling comprises a node dynamics model, an information interaction model and typical fault models; the node dynamics model is a single-integrator model in which the motion state of a node is described by a first-order differential equation; the information interaction model is described by an undirected graph, i.e. communication between nodes is bidirectional, and the autonomous agents exchange information in this way to complete the system control task; the typical fault models cover the fault types that commonly occur in real agents;
Step 2, multi-agent real-time fault detection based on information interaction: from the expression of the node dynamics model of Step 1, relevant state variables are selected to describe the running state of a node; a threshold function is set to classify the running state of a node and so distinguish normal nodes from faulty nodes; each individual node obtains the state information of its neighbors through the information interaction model of Step 1 and uses the detection algorithm to determine whether they are faulty, forming a single-node detection result;
Step 3, information integration and processing based on the gossip algorithm: because of problems such as communication packet loss and time delay, the single-node detection result of Step 2 is strongly affected by the environment and its confidence is low; the gossip algorithm is therefore used to integrate the single-node detection results into an integrated detection result of higher reliability, which serves as the final basis for judging the running state of a node and for distinguishing normal nodes from faulty nodes;
Step 4, calculation and application of the compensation quantity for the control quantity: if a faulty node is detected, it is isolated by the corresponding operations; at the same time, starting from the influence of the faulty node on the control quantities of its neighbors, a calculation scheme is designed to obtain the value of the compensation quantity, which is added to the original control quantity so as to cancel the influence of the faulty node on the system;
Step 5, connectivity maintenance based on two-hop information: starting from the information interaction model, the content communicated by faulty nodes is analyzed, and the two-hop information it contains is used to establish virtual information transmission paths, ensuring that the fault handling scheme does not affect the normal operation of the system.
2. The distributed real-time fault detection and compensation control method based on information interaction according to claim 1, characterized in that the fault types comprise destructive faults, runaway faults and interference faults.
CN201410832047.4A 2014-12-26 2014-12-26 A kind of multiple agent fault detect based on information interaction and compensating control method Active CN104698839B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410832047.4A CN104698839B (en) 2014-12-26 2014-12-26 A kind of multiple agent fault detect based on information interaction and compensating control method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410832047.4A CN104698839B (en) 2014-12-26 2014-12-26 A kind of multiple agent fault detect based on information interaction and compensating control method

Publications (2)

Publication Number Publication Date
CN104698839A CN104698839A (en) 2015-06-10
CN104698839B true CN104698839B (en) 2016-04-27

Family

ID=53346083

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410832047.4A Active CN104698839B (en) 2014-12-26 2014-12-26 A kind of multiple agent fault detect based on information interaction and compensating control method

Country Status (1)

Country Link
CN (1) CN104698839B (en)


Also Published As

Publication number Publication date
CN104698839A (en) 2015-06-10


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant