GB2510253A - Evaluating the operating dependability of a complex system - Google Patents

Evaluating the operating dependability of a complex system Download PDF

Info

Publication number
GB2510253A
GB2510253A GB1321941.5A GB201321941A GB2510253A GB 2510253 A GB2510253 A GB 2510253A GB 201321941 A GB201321941 A GB 201321941A GB 2510253 A GB2510253 A GB 2510253A
Authority
GB
United Kingdom
Prior art keywords
component
failures
failure
cuts
minimal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1321941.5A
Other versions
GB201321941D0 (en)
Inventor
André Leblond
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Thales SA
Original Assignee
Thales SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thales SA filed Critical Thales SA
Publication of GB201321941D0 publication Critical patent/GB201321941D0/en
Publication of GB2510253A publication Critical patent/GB2510253A/en
Withdrawn legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/008 Reliability or availability analysis
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B64 AIRCRAFT; AVIATION; COSMONAUTICS
    • B64F GROUND OR AIRCRAFT-CARRIER-DECK INSTALLATIONS SPECIALLY ADAPTED FOR USE IN CONNECTION WITH AIRCRAFT; DESIGNING, MANUFACTURING, ASSEMBLING, CLEANING, MAINTAINING OR REPAIRING AIRCRAFT, NOT OTHERWISE PROVIDED FOR; HANDLING, TRANSPORTING, TESTING OR INSPECTING AIRCRAFT COMPONENTS, NOT OTHERWISE PROVIDED FOR
    • B64F5/00 Designing, manufacturing, assembling, cleaning, maintaining or repairing aircraft, not otherwise provided for; Handling, transporting, testing or inspecting aircraft components, not otherwise provided for
    • B64F5/60 Testing or inspecting aircraft components or systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/32 Circuit design at the digital level
    • G06F30/33 Design verification, e.g. functional simulation or model checking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • Manufacturing & Machinery (AREA)
  • Transportation (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Evaluating the operating dependability of a system formed of hardware components and/or software components, such as a system for displaying flight information on an aircraft instrument panel. The dependability evaluation can consider the reliability, availability and maintainability of the system. The method comprises: decomposing the components of the system into interconnected functions and defining associated functional failures for each function; defining a logical expression of a feared event of the system whose terms are inputs and/or outputs of the functions; determining a set of minimal combinations of failures of functions causing the feared event, called minimal functional cuts; reducing this set further to a set of hardware component minimal cuts; computing, for each component failure intervening in a component cut, a failure rate and an exposure time; and computing a probability of occurrence of the feared event as a function of the failure rates and exposure times. The exposure time may be the duration of a mission for visible failures, or the duration separating two maintenance tests for failures which are not detectable by visual inspection or during normal operation.

Description

METHOD FOR EVALUATING THE OPERATING DEPENDABILITY OF A COMPLEX SYSTEM
GENERALITIES
The present invention relates to a method for evaluating the operating dependability of a software and/or hardware complex system such as for example a system for displaying flight information on an aircraft instrument panel. A system for displaying flight information comprises notably a software part for processing data and a hardware part comprising a computer, carrying out the data processing, and a viewing screen for example.
The operating dependability of a system can be defined as being the property that allows users of the system to place justified confidence in the services that it delivers to them. A user of the system may be an individual such as an operator or a supervisor, or else another hardware or software system having interactions with the system considered. According to the applications for which the system is intended, the operating dependability may be characterized according to different but complementary properties such as: reliability, availability, safety-security, and maintainability of the system.
Reliability corresponds notably to the continuity of the services that the system must provide to its users, in the absence of repairs. Reliability may also be defined as the ability of a system to accomplish a required function, under given conditions, for a given duration. All the failures of the system of an accidental nature are then taken into account without any discrimination in relation to their criticality. An exemplary measure of reliability is the failure rate, the inverse of the mean operating time up to the first failure.
The maintainability of a system conveys its ability to support repairs and upgrades. The maintenance of the system must then be accomplished under given conditions with prescribed procedures and means. An exemplary measure of maintainability is for example the mean time to repair or restore the system to a state of proper operation.
The availability of a system is the capacity of a system to deliver a service correctly, in terms of deadline and quality, when the user needs it.
Availability is a unit-less measure; it corresponds notably to the proportion of the time of proper operation over the total time of operation of the system.
Safety-security is aimed notably at self-protection from catastrophic failures, that is to say failures some consequences of which are unacceptable in relation to the users of the system or the environment, for example an accident possibly involving human lives. It is important, or indeed vital, to curb and to reduce these failures.
In order to take safety constraints into account, it is therefore necessary to identify cases of critical malfunctions of the system, potentially endangering the system and/or the users and/or the environment. An exemplary critical malfunction for a flight parameters display macro-function may be the displaying of a false attitude on a screen of the instrument panel, with no alarm signalling that such a failure has occurred. Such a malfunction may bring about a collision with the ground. The critical malfunctions identified are dubbed feared events.
The quantitative safety constraints may pertain to feared events, for example imposing a maximum objective of 10⁻⁹ for the probability of occurrence per flight hour of the feared event ER. Another qualitative constraint of operating dependability may be that a simple fault cannot by itself produce this feared event.
A means of processing such constraints consists in decomposing each feared event ER down to its root causes by means of a failure tree. A failure tree can be represented in the form of a tree-like structure whose vertex is the feared event ER, the leaves (base events) representing faults (root causes in respect of the hardware, such as for example a breakdown of an electronic component) or defects (root causes in respect of hardware design or software design, such as for example a coding error).
On the basis of the probabilities of occurrence of the base events, it is possible, by ascending through the failure tree, to compute a probability of occurrence of the feared event ER. Moreover, the computation of the minimal cuts of the failure tree can be performed. A minimal cut is a combination of base events (faults in respect of the hardware itself, defects in respect of the hardware design or software design) leading to the feared event, such that if an event is removed from this combination, the feared event ER no longer takes place.
The minimal cuts being by definition the critical combinations leading to the feared event ER, one then knows which hardware and software elements of the system are the ones whose root failure or failures (faults in respect of the hardware, design defects in respect of the hardware components or software components) contribute the most, isolated or in combination, to the realization of the feared event ER.
An analysis by failure tree such as described hereinabove can make it possible in an upstream phase, that is to say during the design of the system architecture, to favour one hardware or software architecture from among several architectures, in view of the probabilities of occurrence of feared events ER and of the existence of simple cuts, for one or the other of the proposed architectures.
PRIOR ART
Critical systems, in terms of operating dependability, are becoming ever more complex; hence the increasing difficulties for the associated tasks of design and analysis.
Numerous approaches in respect of operating dependability have appeared recently, among which is the approach of "failure logic modelling" of the components or functions, which is very suited to the hierarchical and modular construction of the system.
Failure logic modelling techniques comprise notably the AltaRica modelling language, around which tools have been developed. This language has been developed by the Laboratoire Bordelais de Recherche en Informatique, situated at 351 cours de la Libération, 33405 Talence, France.
They also comprise the FPTN (Failure Propagation and Transformation Notation) notation, and the HiP-HOPS (Hierarchically Performed Hazard and Operability Studies) scheme.
The FPTN notation is for example described in the publication by P. Fenelon et al., "Towards Integrated Safety Analysis and Design", ACM Applied Computing Review, 2(1): 21-32, 1994, ACM Press. The HiP-HOPS scheme is for example described in the publication by Y. Papadopoulos et al., "Analysis and Synthesis of the Behavior of Complex Programmable Electronic Systems in Conditions of Failure", Reliability Engineering and System Safety, 71(3): 229-247, 2001, Elsevier Science.
The FPTN notation has remained a mainly academic concept, but the HiP-HOPS scheme and the AltaRica language have attracted appreciable interest on the part of industry, notably in the avionics and automobile sectors.
They have been applied to numerous industrial cases, and have given rise to trials.
These approaches come up against difficulties however. During the upstream design phases, the only malfunctional information ordinarily available in respect of the hardware components is coarse estimations of the total failure rate. Other sources of inaccuracy appear during the detailed design phase.
Analyses of operating dependability are often based at this juncture on the results of a process of hardware AMDEC type (analysis of the failure modes, of their effects and of their criticality), known in the literature by the name FMEA for "Failure Modes and Effects Analysis", including the identification, for each elementary electronic constituent, of the elementary failure modes (faults) with their associated rates, as well as their effects at the component and system levels.
Nonetheless, the FMEA process is very empirical and prone to errors. In a general manner, the effects induced at the component level by the hardware faults identified during an FMEA analysis do not correspond in either a clear or a systematic manner with the item of information required at the component level for the needs of the malfunctional logic modelling. Indeed, the construction of failure logic on the basis of FMEAs very often culminates in arbitrary models which are rather unrepresentative of reality.
In the absence of precise knowledge of the malfunctional behaviour of the components, a possible choice for their modelling, disregarding the FMEA process, is the definition and the use, for each hardware or software component, of component global failures, notably "the complete loss of the component", and the generation of erroneous data or "generation of errors".
"Complete loss of the component" signifies that all the data generated as output from the component are absent or invalid, the failure rate adopted (for the hardware components) being the complete failure rate A of the component.
"Generation of erroneous data" by the component signifies that all the data generated as output are corrupted, the failure rate adopted being, in the absence of other details, for example estimated at 10% A for the hardware components.
Although the modelling of component failure logic is feasible with the help of these conventions, as well as the automatic generation of failure trees and the quantification of the feared event ER, the qualitative and quantitative results obtained are unreliable. Worse, these results may be rashly optimistic, in so far as component malfunctional behaviours contributing to the feared events ER are frequently ignored, this being due to the simplistic character of the component failures envisaged (complete loss, generation of errors on all the outputs).
SUMMARY OF THE INVENTION
The aim of the invention is to propose a new scheme for the modelling of component failure logic accompanied by an appropriate analysis process, which at least partially addresses the difficulties mentioned hereinabove.
For this purpose, the subject of the invention is a method for evaluating the operating dependability of a system formed of hardware components and/or software components, characterized in that it comprises at least the following steps:
* Decomposing the components of the system into the form of interconnected functions; defining, for each function, associated functional failures;
* Defining a feared event of the system in the form of a logical expression whose terms are inputs and/or outputs of the functions;
* Determining minimal functional cuts each defining a minimal combination of failures of functions causing the feared event of the system;
* Synthesizing the minimal functional cuts determined as a further-reduced set of hardware component minimal cuts, each cut element being a component failure;
* Computing for each component failure intervening in a component cut a failure rate and an exposure time;
* Computing a probability of occurrence of the feared event as a function of the failure rates and exposure times.
The probability computation can be done for the feared event by multiplying for each component cut the probabilities associated with the failures of the cut concerned and then by adding together the products obtained for the set of component cuts.
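This computation can be sketched in a few lines of code. The following Python fragment is purely illustrative and not part of the patent; the function name and the numerical values are assumptions, and the plain sum of products corresponds to the usual rare-event approximation.

    def feared_event_probability(component_cuts):
        # component_cuts: list of cuts, each cut being a list of component-failure
        # probabilities (failure rate multiplied by exposure time).
        total = 0.0
        for cut in component_cuts:
            weight = 1.0
            for probability in cut:
                weight *= probability   # product over the failures of one cut
            total += weight             # sum over all component minimal cuts
        return total

    # Hypothetical example: two order-2 cuts and one order-1 cut.
    cuts = [[1.0e-4, 1.0e-5], [1.0e-5, 1.0e-6], [1.0e-7]]
    print(feared_event_probability(cuts))   # 1e-9 + 1e-11 + 1e-7, about 1.01e-7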
Advantageously, the functional failures are classed according to at least one set of types, only a single type per set being adopted for each functional failure; within a set, one of the types is graded as dominant, the others as recessive.
It is possible to attach to each component failure intervening in a component cut the same set or sets of types as those associated with the functional failures. The type adopted for each of these component failures and for each set will be the dominant type if at least one of the underlying functional failures is of the dominant type, the type adopted in the converse case being the recessive type.
It is possible to class the failures according to a first pair of alternative types: "erroneous" graded as dominant or "loss" graded as recessive, the erroneous type being defined when the failure causes the corruption of an item of information, the loss type being defined in the converse case, notably when the failure causes the loss of an item of information.
It is possible to class the failures according to a second pair of alternative types: "visible" graded as dominant or "dormant" graded as recessive, the visible type being defined when the failure is detectable during nominal operation of the system, the dormant type being defined in the converse case, notably when the failure is detectable only through the maintenance tests or not detectable at all.
The failure rates and the exposure times of the various component failures are advantageously defined as a function of the types adopted in respect of these failures.
During the synthesis, it is possible to generate for each functional minimal cut a component cut whose elements are component failures, each of these component failures being obtained by synthesis of the functional failures of this cut that are included in the component concerned.
Advantageously, the set of component cuts is reduced by deleting all the non-minimal component cuts, a component cut being deleted if it is identical to, or if it strictly includes, another component cut with identical types for each of the component failures which are common to them.
Moreover, it is possible to perform another synthesis of the minimal functional cuts determined as a further-reduced set of minimal cuts of hardware and/or software component design defects.
In this other synthesis, it is possible to consider components of similar design as being one and the same component.
DESCRIPTION OF THE FIGURES
Other advantages of the invention will become apparent on reading the detailed description, which description is illustrated by the attached drawing wherein:
Figure 1 represents in a schematic manner an avionics system whose outputs are displayed critical reticles and to which it is possible to apply the method of the invention;
Figure 2 represents the internal structure of the equipment of the system of Figure 1;
Figure 3 represents a decomposition of a hardware component of the system into interconnected functions;
Figures 4a and 5a represent in table form the content of functional minimal cuts in terms of functional failures, with associated types;
Figure 4b makes it possible to illustrate an operation of translating the functional minimal cuts into non-minimal hardware component cuts;
Figure 4c makes it possible to illustrate a complementary operation of compression of the component cuts into a set of hardware component minimal cuts;
Figures 5b, 5c also make it possible to illustrate these same operations of translating, and then compressing, the functional minimal cuts into non-minimal cuts of design defects, and then into minimal cuts of design defects for the hardware components and/or software components;
Figure 6 represents in table form an exemplary list of the hardware component minimal cuts for the system of Figure 1 serving as example, as well as their associated probabilities;
Figures 7a and 7b represent in the form of tables two examples of minimal defects cuts.
DETAILED DESCRIPTION OF THE INVENTION
As mentioned hereinabove, the list of the hardware faults and of their effects, identified by virtue of the FMEA (when it is available), is by nature incomplete and empirical, and therefore not very usable for the modelling of component failure logic. It does indeed lead to models which lack consistency and which only very partially cover the set of component malfunctional behaviours leading to the feared events ER.
Moreover, the notions in respect of the components of "complete loss" and "generation of errors" which in principle make it possible to avoid the FMEA process are very restrictive notions, which lead to coarse results in terms of minimal cuts, sometimes with errors in the order of magnitude of the probability of the associated feared event ER.
The invention allows evaluations of operating dependability without recourse to the FMEA process, or to the notions of "complete loss" or of "generation of errors".

Decomposition into functions and functional failures
The start of the methodology of the invention consists in breaking the complexity by decomposing the system into a set of interconnected functions hosted by components. Two types of components are envisaged: hardware components and software components hosted by hardware components.
Functional failures are associated with these functions, such as for example: "Loss of data on an output", "Data erroneous on an output". The malfunctional logic of the states of the outputs of each function (valid, erroneous, absent...) produced by the states of the inputs is thereafter defined in the form of boolean expressions. This logic takes into account the set of functional failures associated with the function.
It is possible to class the functional failures according to at least one set of types, each functional failure being assigned only one type per set. Customarily, the sets are pairs and there are therefore two alternative types in a set. In a type pair, one is graded as dominant and the other as recessive. It is possible to define more than two types in a set; just one is then dominant. For example, it is possible to choose to class the functional failures according to a first pair of types, "erroneous" vs. "loss", i.e. E vs. L. It is also possible to choose to class them according to a second pair, "visible" vs. "dormant", i.e. V vs. D. It is also possible to class them according to both these pairs simultaneously. The four cases thus defined are then as follows: "erroneous visible" denoted EV, "loss visible" denoted LV, "erroneous dormant" denoted ED, "loss dormant" denoted LD.
A functional failure is of the erroneous type when it corrupts the value of an item of information, otherwise it is of the loss type (notably, but not exclusively, when the item of information is lost). Moreover, it is of the visible type when it is detectable via its effects by a user or by the tests implemented during nominal operation of the system, otherwise it is of the dormant type. This type of test is known in the literature by the name CBIT for "Continuous Built In Test". A functional failure is of the dormant type when it is not detectable visually or by CBIT during nominal operation of the system, but solely by the maintenance tests, or else when it is not detectable by any test. For example, "Loss of a data item" and "Corruption of a data item" are usually considered to be visible if the data item is presented as primary output during nominal operation of the system, whereas the "loss of an alarm" can be considered to be dormant, since the alarm is not triggered during nominal operation of the system.
Minimal functional cuts
The feared event ER under analysis is itself defined as a boolean combination of states of the inputs and/or of the outputs of certain functions of the system. The continuation of the methodology consists in determining the minimal functional cuts, each defining a minimal combination of base events (functional failures) causing the feared event ER.
Examples of minimal functional cuts are given in the table of Figure 4a. The rows of the table of this figure are minimal functional cuts. The base events (functional failures), denoted FFk, form the columns of the table. Each minimal cut FMCi is expressed by crosses placed in the table at the intersection of row i and the relevant columns. The base events are contained in hardware components denoted HC1, HC2, etc. There are for example 4 base events FF11, FF12, FF13, and FF14, contained in the component HC1. Moreover, one of the four values of the previously defined type is associated with each base event.
Synthesis of the minimal functional cuts as component minimal cuts
The continuation of the methodology consists in translating each functional minimal cut into a hardware component cut. Each element of such a cut (component failure) corresponds to a component and synthesizes the functional failure or failures included in this cut, and also included in the component (partial cut). The component cut is therefore the translation of the combination of partial cuts which constitutes the functional minimal cut, each partial cut being associated with a component.
It is possible to also choose to associate one of the four types of failure "erroneous visible" (EV), "loss visible" (LV), "erroneous dormant" (ED) and "loss dormant" (LD) with each component failure. The type of a hardware component failure will depend on the types of the functional failures that it synthesizes.
A hardware component failure will for example be of the "erroneous visible" type if at least one of the functional failures included in this component failure is erroneous, and if moreover at least one of these functional failures is visible. Note that the erroneous failure can be distinct from the visible failure.
A hardware component failure will for example be of the "loss visible" type if none of the functional failures included in this component failure is erroneous, and if moreover at least one of these functional failures is visible.
A hardware component failure will for example be of the "erroneous dormant" type if at least one of the functional failures included in this component failure is erroneous, and if moreover none of these functional failures is visible.
A hardware component failure will be of the "loss dormant" type if none of the functional failures included in this component failure is erroneous and if moreover none of these functional failures is visible.
More generally, for each component failure intervening in a component cut, the type adopted for each set will be the dominant type if at least one of the underlying functional failures is of the dominant type, the type adopted in the converse case being the recessive type.
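By way of illustration, the translation and type-synthesis rules above can be sketched as follows in Python. This fragment is not part of the patent; the data structures, names and example data are assumptions chosen to mirror Figures 4a and 4b.

    def synthesize_type(functional_types):
        # functional_types: two-letter types such as "EV", "LD" of the functional
        # failures synthesized into one component failure.
        erroneous = any(t[0] == "E" for t in functional_types)   # E dominant over L
        visible = any(t[1] == "V" for t in functional_types)     # V dominant over D
        return ("E" if erroneous else "L") + ("V" if visible else "D")

    def translate_cut(functional_cut, component_of):
        # functional_cut: {functional failure name: type}
        # component_of:   {functional failure name: hosting hardware component}
        partial_cuts = {}
        for failure, ftype in functional_cut.items():
            partial_cuts.setdefault(component_of[failure], []).append(ftype)
        # one synthesized component failure (with its type) per component
        return {comp: synthesize_type(types) for comp, types in partial_cuts.items()}

    # Hypothetical data in the spirit of Figures 4a and 4b.
    component_of = {"FF21": "HC2", "FF41": "HC4", "FF42": "HC4"}
    fmc1 = {"FF21": "LD", "FF41": "LV", "FF42": "ED"}
    print(translate_cut(fmc1, component_of))   # {'HC2': 'LD', 'HC4': 'EV'}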
Figure 4b illustrates, for the previous example, the translation of the functional minimal cuts into component cuts. The component cut CMC1 translated from FMC1 comprises the loss dormant type for the component HC2 and the erroneous visible type for the component HC4, in accordance with the rule stated hereinabove. The component cut CMC2 translated from FMC2 consists of the erroneous visible type for HC1. The cut FMC2 is particular in the sense that it is translated into a simple component cut.
The question then arises of generating, on the basis of the non-minimal set of hardware component cuts, the set of component minimal cuts corresponding to the feared event ER. A component minimal cut will also consist of hardware component failures of one of the four types EV, LV, ED, LD.
Advantageously, the set of component cuts is reduced by deleting all the non-minimal component cuts. Stated otherwise, all the redundant cuts are deleted, as are the cuts which completely contain other cuts. Two component cuts are considered to be redundant if they have the same failure types for the same components. This reduction operation is shown diagrammatically in Figure 4c. For example, the cut CMC3 contains the cut CMC1. CMC3 is therefore not minimal and can be deleted.
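One possible implementation of this reduction is sketched below in Python; it is not part of the patent and the cut contents are hypothetical. A cut is dropped when a cut already kept is contained in it with identical failure types for the common components, which also eliminates exact duplicates.

    def is_contained(small, big):
        # small, big: component cuts represented as {component: failure type}
        return all(component in big and big[component] == ftype
                   for component, ftype in small.items())

    def reduce_to_minimal(cuts):
        minimal = []
        for cut in cuts:
            if any(is_contained(kept, cut) for kept in minimal):
                continue              # duplicate of, or strictly includes, a kept cut
            minimal = [kept for kept in minimal
                       if not (is_contained(cut, kept) and kept != cut)]
            minimal.append(cut)
        return minimal

    cmc1 = {"HC2": "LD", "HC4": "EV"}
    cmc2 = {"HC1": "EV"}
    cmc3 = {"HC1": "EV", "HC2": "LD", "HC4": "EV"}   # strictly includes CMC1
    print(reduce_to_minimal([cmc1, cmc2, cmc3]))     # CMC3 is deleted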
This example is very simple (three functional cuts). In practice there may be several hundred, or indeed thousands of, minimal functional cuts. The translation into component cuts and the synthesis into minimal component cuts may for example lead to sets of 10 to 60 minimal component cuts.
Quantification of the component cuts and of the feared event
The continuation of the methodology consists in associating probabilities of occurrence with the component failures. These probabilities of occurrence will be the products of the component failure rates and the exposure times associated with these same failures.
The failure rate associated with a component failure of the types LV or LD is for example that of the hardware component. The failure rate associated with a failure of the types EV or ED is for example a portion of the failure rate for the hardware component (for example 10%).
More generally, the failure rates and the exposure times of the various component failures are defined as a function of the types adopted in respect of these failures.
The probability of occurrence (weight) of a component failure is for example equal to the failure rate mentioned hereinabove multiplied by the duration of the mission for the failures of visible type EV or LV, or by the duration separating two maintenance tests making it possible to detect this component failure, for the dormant types ED and LD. It is possible to compute the probability of occurrence differently as a function of the envisaged types.
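A possible quantification of a single component failure, following these conventions, is sketched below in Python. It is not part of the patent; the 10% share for erroneous failures, the 1-hour mission and the 12-hour interval between maintenance tests are the example values used further on for the PFD system and are assumptions here.

    def failure_probability(failure_type, lambda_component,
                            mission_h=1.0, maintenance_interval_h=12.0,
                            erroneous_share=0.10):
        # failure_type: "EV", "LV", "ED" or "LD";
        # lambda_component: complete failure rate of the hardware component (per hour).
        if failure_type[0] == "E":
            rate = lambda_component * erroneous_share
        else:
            rate = lambda_component
        exposure = mission_h if failure_type[1] == "V" else maintenance_interval_h
        return rate * exposure

    print(failure_probability("EV", 1.0e-5))   # 0.1 * 1e-5 * 1  = 1e-6
    print(failure_probability("LD", 1.0e-5))   # 1e-5 * 12       = 1.2e-4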
The hardware component minimal cuts are then easily quantifiable.
The weight of such a cut is equal to the product of the weights of the component failures of which it consists. By summing these weights for the set of component minimal cuts, an estimation for the probability of occurrence of the feared event ER is ultimately obtained.
Synthesis of the minimal functional cuts as component defect cuts
The minimal functional cuts are also synthesized into component design defect minimal cuts including only defects (a single type denoted DF, for the hardware components and for the software components alike), and taking account of the design similarities between components.
Each functional minimal cut is translated into a component defects cut. Each element of such a cut (component defect) corresponds to a component (or to a set of components of similar design) and synthesizes the functional failure or failures of the functional cut which are also included in this component or this set (partial cut). Figure 5b illustrates, for the previous example, the translation of the functional minimal cuts into component defect cuts. The components HC1 and HC2 have been declared of similar design and grouped together under the name HC12. The components HC3 and HC4 have also been declared similar and grouped together under the name HC34.
The continuation consists in generating, on the basis of the non-minimal set of component defect cuts, the set of component defect minimal cuts corresponding to the feared event ER. The algorithm carrying out this task eliminates the duplicates from among the defect cuts, and then reduces the set obtained by deleting all the non-minimal defect cuts. Figure 5c illustrates this compression of defect cuts into minimal cuts.
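This translation and compression of defect cuts can be sketched as follows in Python; the fragment is not part of the patent, and the similarity grouping and example data are assumptions modelled on Figures 5b and 5c. Since all defects share the single type DF, a defect cut reduces to a set of (grouped) component names, which makes duplicates and inclusions easy to detect.

    similar_group = {"HC1": "HC12", "HC2": "HC12", "HC3": "HC34", "HC4": "HC34"}

    def defect_cut(functional_cut, component_of):
        # functional_cut: iterable of functional-failure names; components of
        # similar design are replaced by the name of their group.
        return frozenset(similar_group.get(component_of[f], component_of[f])
                         for f in functional_cut)

    def reduce_defect_cuts(cuts):
        unique = set(cuts)                    # duplicate cuts collapse here
        return [c for c in unique
                if not any(other < c for other in unique)]   # strict inclusion

    component_of = {"FF11": "HC1", "FF21": "HC2", "FF41": "HC4"}
    print(reduce_defect_cuts([defect_cut(["FF21", "FF41"], component_of),
                              defect_cut(["FF11"], component_of)]))
    # keeps only frozenset({'HC12'}); the larger cut {'HC12', 'HC34'} is not minimal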
The method of synthesis described hereinabove, that is to say of translating/compressing the minimal functional cuts into component defect minimal cuts, can be implemented on two distinct sets of components. The first set consists of the modelled hardware components. The second set consists of the modelled software components, to which are added the hardware components that do not host any software components. The synthesis method applied to these two sets will therefore produce two distinct suites of component defects minimal cuts.
Conclusion
The approach described hereinabove makes it possible to obtain estimations of the probabilities of occurrence of the feared events ER in the context of the evaluation of the dependability for a given projection of the functions considered onto the components. The optimization of this projection corresponds to the system upstream design phases. It is performed easily through comparative estimations on various projections, since the functional minimal cuts remain unchanged when the projection varies, the underlying array of the interconnected functions likewise remaining invariant. The algorithm hereinabove is therefore repeated on the same set of input functional cuts, the adjustment parameter being the distribution, for a given projection, of the functional failures in the components.
The same scheme can be applied during the system detailed design phase, providing estimations of the probabilities of occurrence of the feared events ER based on component failure rates such as described hereinabove, rather than on individual failure rates arising from an FMEA, or on rates for the "complete loss" or "generation of errors" component failures.
SYSTEM MODELLING WITH THE AID OF THE ALTARICA LANGUAGE
The Altarica language has been developed for modelling the functional and malfunctional behaviours of systems. The construction of the Altarica Dataflow models (reduced language, general Altarica subset) is for example supported by the OCAS tool developed by Dassault Aviation whose headquarters are situated at 9 Rpt Champs Elysees 75008 Paris (France) and by the SD9 tool developed by Dassault Systemes whose headquarters are situated at rue Marcel Dassault 78140 Velizy-Villacoublay (France). These tools provide the graphical editing and simulation functionalities and comprise an automatic generator of failure trees, as well as an extractor of minimal cuts.
A complete Altarica Dataflow model constructed under the SD9 tool or under OCAS for a complex system consists of a suite of interconnected nodes comprising sub-nodes, likewise interconnected. A "leaf" node is defined as a node not possessing any sub-node. A leaf node will for example be able to represent a function, or a software or hardware component containing a set of functions described "flat" in textual Altarica. The nodes other than the leaf nodes will represent for example components of a higher hierarchical level.
The feared events ER are input under SD9 or OCAS as boolean expressions of certain flow variables present in the AltaRica model. A failure tree generation can then be launched regarding a feared event ER, generating a tree whose vertex is the feared event ER itself, and whose leaves are AltaRica base events. Stated otherwise, the feared event ER is expressed as a boolean expression of the AltaRica events, considered to be boolean themselves. Such a tree can be compiled with the aid of specialized tools (Arbor, Simtree), so as to generate a set of minimal cuts and optionally to compute a probability of occurrence for the feared event ER.
IMPLEMENTATION OF THE INVENTION ON AN EXAMPLE
By way of example, the method of the invention is implemented on an avionics system comprising two main flight screens of the flight deck. These screens are known in the literature by the name "Primary Flight Displays", the avionics system itself being denoted "PFD system". Of course the invention can be implemented outside of the aeronautical context and for any other complex system.
The display of critical parameters such as speed, attitude, angular deviations from the landing approach axis, etc. of an aeroplane equipped with the PFD system is a critical task for the safety of the aeroplane. A feared event classed as catastrophic at the PFD system level will be for example "unsignalled erroneous display of a critical reticle on both of the two PFD screens of the system". A feared event classed as dangerous will be for example "unsignalled erroneous display of a critical reticle on one of the PFD screens".
Description of the system
Figure 1 represents in a simplified manner the highest hierarchical level of the SD9 model for the PFD system. It is composed of several interconnected items of equipment, each item of equipment being modelled by an Altarica node. More precisely, the system comprises two main screens F_PFD and P_PFD (higher-level nodes) and two sensors F_GNSS and P_GNSS (hardware components modelled by leaf nodes) each generating a critical parameter in terms of safety (for example an angular deviation), respectively F_Par and P_Par. The two parameters are processed by processors of each screen and displayed in the form of two graphical reticles, respectively F_Ret and P_Ret. In the processors, comparisons are performed between the two critical input parameters. In the case of inconsistency, alarm flags respectively F_Disc and P_Disc are displayed on the screens, for example in the form of specific reticles or of messages.
Figure 2 represents the immediately lower hierarchical level of the model, namely the internal architecture of a PFD item of equipment of Figure 1, consisting of interconnected hardware components. Each of these components is modelled by an Altarica leaf node. A programmable gate array ABAC serves for routing between the input-output card IOB, the processor Proc, the memory components (not represented), and the graphical processor GRPR. The two critical parameters P_Par and F_Par (Figure 1) correspond to the signals Prim_Par and Sec_Par (Figure 2). These signals pass through the input-output card IOB, and then the programmable gate array ABAC, before being presented as input to the processor Proc. They are thereafter processed by the processor Proc to generate a graphical command Cmd_Par, returned to the graphical processor GRPR through the programmable gate array ABAC. A signature module Sign receives predefined images (pixels) generated by the graphical processor GRPR, effects a signature thereof, which is thereafter retransmitted to the processor Proc by way of the programmable gate array ABAC. The graphical processor GRPR produces pixels and drives the actual screen, which is for example a back-lit LCD screen. Here the back-lighting hardware component bears the tag Disp. A monitoring hardware component MDLM monitors the display and notably its refreshing. It transmits its finding directly to the processor Proc. Moreover, monitoring is performed in real time by the processor Proc itself on its own graphical control computation. In the case of erroneous processing, the value "active" is allotted to the flag Inv_Flag generated by the processor Proc. The Inv_Flag flag is processed by the programmable gate array ABAC which generates, in the case where the value of Inv_Flag is "active", a Reset signal at output, i.e. a reset to zero of the PFD. The parameters Prim_Par and Sec_Par are also compared at input by the processor Proc, producing an output flag Disc_Flag of value "active" in the case of inconsistency. The Disc_Flag flag is dispatched to the graphical processor GRPR through the programmable gate array ABAC.
A more precise definition for the feared events mentioned hereinabove takes account of the monitoring means and associated sanctions.
The feared event classed as dangerous will then be for example "erroneous display of a critical reticle on one of the two PFD screens without active alarm flag and no taking of sanction of resetting to zero".
Decomposition into functions and functional failures
The components of the system are firstly decomposed into functions.
Figure 3 represents for example the decomposition of the hardware component Proc into interconnected functions, also included in software components FDSA, FDSB, PA. The software component FDSA contains the functions "Computation of the divergence flag", "Inverse computation of the sensor parameter on the basis of the graphical command", "Computation of the inverse flag". The software component FDSB contains the function "Computation of the image parameter on the basis of the sensor parameter", and the software component PA the function "Computation of the graphical command".
The parameter Prim_Par is processed by FDSB to generate an image parameter Img_Par. It is thereafter processed by PA to generate the graphical command Cmd_Par, itself intended for the hardware component GRPR. The component FDSA makes it possible to generate on the one hand the Disc_Flag alarm flag by comparing the parameters Prim_Par and Sec_Par at input, and on the other hand the Inv_Flag alarm flag by comparing Prim_Par and a signal Inv_Par obtained on the basis of Cmd_Par by computation inverse to the processings performed by PA and FDSB. Disc_Flag signals a possible divergence between the signals Prim_Par and Sec_Par arising from the two sensors. Inv_Flag signals a possible error in the computation of the graphical command Cmd_Par.
Functional failures are defined for each of the functions included in the components, notably those listed hereinabove. For the function "Computation of the graphical command", the two failures "Loss of the graphical command" and "Corruption of the graphical command" are considered for example. For the function "Computation of the divergence flag", the two failures "Loss of the divergence flag" and "Forcing of the divergence flag to inactive" are considered.
The malfunctional logic linking the states of the outputs of each function to their failures and to the states of the inputs which produce them is thereafter expressed in the form of boolean expressions.
One of the types EV, LV, ED, LD is thereafter allotted to each of the previously defined functional failures. For example, "Loss of the graphical command" is classed as "loss visible" LV. "Corruption of the graphical command" is classed as "erroneous visible" EV. These two failures are visible since they are detectable by the pilot during nominal operation of the system (they impact the display). On the other hand, "Forcing of the Disc_Flag flag to inactive" is classed as "loss dormant" LD, since it has no impact during nominal operation of the system.
Altarica language modelling of the system
Functional failures are modelled in Altarica in the form of events.
"Loss of the graphical command" is for example modelled by the event Cmd_Par_LO. Likewise, "Corruption of the graphical command" is modelled by the event Cmd_Par_EO. "Loss of the divergence flag" is modelled by Disc_Flag_LO, and "Forcing of the divergence flag to inactive" by Disc_Flag_STF.
With the events _LO, _EO, _STF, etc. are respectively associated the Altarica state variables _Loss, _Err, _StuckF. The variables Cmd_Par_Loss, Cmd_Par_Err, Disc_Flag_StuckF are for example associated with the events Cmd_Par_LO, Cmd_Par_EO and Disc_Flag_STF. The activation of the events _LO and _EO causes the value of the state variables _Loss and _Err to toggle from false to true. The activation of Cmd_Par_LO causes the value of Cmd_Par_Loss to toggle for example from false to true. Likewise, the activation of Cmd_Par_EO causes the value of Cmd_Par_Err to toggle for example from false to true. The activation of Disc_Flag_LO or of Disc_Flag_STF causes the value of Disc_Flag_Loss or of Disc_Flag_StuckF respectively to toggle from false to true.
Moreover, the information Sec_Par, Prim_Par, Inv_Par, Img_Par, Cmd_Par, Fix_Par, Micr_Img, Ret, Img_Sign of Figures 2 and 3 is modelled in Altarica with the aid of two booleans _APR and _ARL. The boolean Cmd_Par_APR indicates for example whether the item of information Cmd_Par is present (true) or absent (false). Moreover, the boolean Cmd_Par_ARL indicates for example whether the item of information Cmd_Par, considered when it is present, has an intact (true) or corrupted (false) value.
The flags Disc_Flag, Inv_Flag, Disc of Figures 2 and 3 are also modelled with the aid of two booleans _APR and _AST. The boolean Disc_Flag_APR indicates for example whether the flag Disc_Flag is present (true) or absent (false). The boolean Disc_Flag_AST indicates for example whether the flag Disc_Flag, considered when it is present, is active (true) or inactive (false).
The boolean expressions expressing the malfunctional logic of the functions are modelled in Altarica by assertions. By way of example, the Altarica code corresponding to the function "Computation of the graphical command" is as follows:

    trans
      -Cmd_Par_Loss Cmd_Par_LO -> Cmd_Par_Loss true;
      -Cmd_Par_Err Cmd_Par_EO -> Cmd_Par_Err true;
    assert
      Cmd_Par_APR = (Img_Par_APR and -Cmd_Par_Loss);
      Cmd_Par_ARL = (Img_Par_ARL and -Cmd_Par_Err);

The section trans models the activation of the two events. The section assert models the malfunctional logic. The booleans Img_Par_APR and Img_Par_ARL are logical expressions of the input booleans Prim_Par_APR and Prim_Par_ARL, as well as of the values of the two state variables Cmd_Par_Loss and Cmd_Par_Err.
Minimal functional cuts
The continuation of the methodology consists in specifying the feared event ER considered in the form of an Altarica assertion. For the PFD system, this assertion is for example:

    ER = ((F_Ret_APR and -F_Ret_ARL and -F_Disc and -F_Reset)
       or (P_Ret_APR and -P_Ret_ARL and -P_Disc and -P_Reset));

The functional minimal cuts are thereafter generated under the tool SD9 or OCAS for this feared event. For the example studied, there are several hundred of them, of order 2, 3 and 4.
Synthesis of the minimal functional cuts as component minimal cuts
The continuation of the scheme consists in automatically synthesizing the set of minimal functional cuts as minimal hardware component cuts. The synthesis takes account of the types "erroneous visible" (EV), "loss visible" (LV), "erroneous dormant" (ED) and "loss dormant" (LD) associated previously with each functional failure. Figure 6 shows the component cuts thus obtained. The various components are labelled column-wise in the table and the component minimal cuts are labelled row-wise. The types of the component failures are given at the intersections of the rows and columns as the result of the synthesis.
Trials have shown that, with respect to the prior art in which the modelling and the computation of the component cuts are carried out with the aid of the two component failures "complete loss" and "generation of errors on all the outputs", the component cuts determined by synthesis are more numerous.
The new scheme therefore improves the exhaustiveness of the evaluation of operating dependability by casting light on a larger number of causes of feared events.
Quantification of the component cuts and of the feared event
The continuation of the methodology consists in associating probabilities of occurrence with the component failures. These probabilities of occurrence are the products of the component failure rates and the duration separating two tests (CBIT or maintenance) making it possible to detect failures.
The failure rate associated with a component failure of the types LV or LD is for example 1/MTBF, that of the hardware component. The failure rate associated with a failure of the types EV or ED is for example a portion of the failure rate for the hardware component, for example 10% (1/MTBF).
The probabilities of occurrence (weight) of a component failure for the types EV and LV are equal to 10% (1/MTBF) and to 1/MTBF respectively, the exposure time being 1 hour (mission duration). The probabilities of occurrence of a component failure for the types ED and LD are equal to 10% (1/MTBF)*12 and to (1/MTBF)*12 respectively, the duration separating two maintenance tests making it possible to detect this component failure being 12 hours.
The weight of a minimal cut is equal to the product of the weights of the component failures of which it consists. By summing these weights for the set of component minimal cuts, an estimation for the probability of occurrence of the feared event ER is ultimately obtained. The numbers obtained are shown in Figure 6.
Synthesis of the minimal functional cuts as component defect cuts
It remains to automatically synthesize the minimal functional cuts as component design defect minimal cuts, having regard to design similarities. In the example presented, the components of the F_PFD and P_PFD sub-systems, as well as the F_GNSS and P_GNSS sensors, are all of similar design.
The synthesis is implemented on two distinct sets of components.
The first set consists of the hardware components. The second set consists of the software components FDSA, FDSB and PA, to which the hardware components other than Proc are added. Figures 7a and 7b show the two tables obtained.

Claims (11)

  1. Method for evaluating the operating dependability of a system formed of hardware components and/or software components, characterized in that it comprises at least the following steps:
* Decomposing the components of the system into the form of interconnected functions; defining, for each function, associated functional failures;
* Defining a feared event (ER) of the system in the form of a logical expression whose terms are inputs and/or outputs of the functions;
* Determining minimal functional cuts (FMC) each defining a minimal combination of failures of functions (FF) causing the feared event (ER) of the system;
* Synthesizing the minimal functional cuts (FMC) determined as a further-reduced set of hardware component minimal cuts, each cut element being a component failure;
* Computing for each component failure intervening in a component cut a failure rate and an exposure time;
* Computing a probability of occurrence of the feared event (ER) as a function of the failure rates and exposure times.
  2. Method according to Claim 1, characterized in that the probability computation is done for the feared event by multiplying for each component cut the probabilities associated with the failures of the cut concerned and then by adding together the products obtained for the set of component cuts.
  3. Method according to one of the preceding claims, characterized in that the functional failures are classed according to at least one set of types, only a single type per set being able to be adopted for each functional failure, and in that in a set, one of the types is graded as dominant, the others as recessive.
  4. Method according to Claim 3, characterized in that the same set or sets of types as those associated with the functional failures is or are attached to each component failure intervening in a component cut and in that for each of these component failures and for each set, the type adopted will be the dominant type if at least one of the underlying functional failures is of the dominant type, the type adopted in the converse case being the recessive type.
  5. Method according to one of Claims 3 or 4, characterized in that the failures are classed according to a first pair of alternative types: "erroneous" (E) graded as dominant or "loss" (L) graded as recessive, the erroneous type (E) being defined when the failure causes the corruption of an item of information, the loss type (L) being defined in the converse case, notably when the failure causes the loss of an item of information.
  6. Method according to one of Claims 3 to 5, characterized in that it is possible to class the failures according to a second pair of alternative types: "visible" (V) graded as dominant or "dormant" (D) graded as recessive, the visible type (V) being defined when the failure is detectable during nominal operation of the system, the dormant type (D) being defined in the converse case, notably when the failure is detectable only through the maintenance tests or not detectable at all.
  7. Method according to one of Claims 3 to 6, characterized in that the failure rates and the exposure times of the various component failures are defined as a function of the types adopted in respect of these failures.
  8. Method according to one of the preceding claims, characterized in that during the synthesis, a component cut whose elements are component failures is generated for each functional minimal cut, each of these component failures being obtained by synthesis of the functional failures of this cut that are included in the component concerned.
  9. Method according to one of the preceding claims, characterized in that the set of component cuts is reduced by deleting all the non-minimal component cuts, a component cut being deleted if it is identical to, or if it strictly includes, another component cut with identical types for each of the component failures which are common to them.
  10. Method according to one of the preceding claims, characterized in that another synthesis is performed of the minimal functional cuts (FMC) determined as a further-reduced set of minimal cuts of hardware and/or software component design defects.
  11. Method according to Claim 10, characterized in that components of similar design are considered to be one and the same component.
GB1321941.5A 2012-12-12 2013-12-11 Evaluating the operating dependability of a complex system Withdrawn GB2510253A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
FR1203373A FR2999318A1 (en) 2012-12-12 2012-12-12 METHOD FOR EVALUATING THE OPERATING SAFETY OF A COMPLEX SYSTEM

Publications (2)

Publication Number Publication Date
GB201321941D0 GB201321941D0 (en) 2014-01-22
GB2510253A true GB2510253A (en) 2014-07-30

Family

ID=49639908

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1321941.5A Withdrawn GB2510253A (en) 2012-12-12 2013-12-11 Evaluating the operating dependability of a complex system

Country Status (2)

Country Link
FR (1) FR2999318A1 (en)
GB (1) GB2510253A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10360741B2 (en) 2016-06-02 2019-07-23 Airbus Operations (S.A.S) Predicting failures in an aircraft

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6794116B2 (en) * 2016-02-10 2020-12-02 三菱航空機株式会社 Combination event evaluation device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1842191A4 (en) * 2005-01-19 2012-05-09 Favoweb Ltd A system and method for bouncing failure analysis
FR2923925B1 (en) * 2007-11-16 2009-11-27 Thales Sa METHOD FOR EVALUATING THE OPERATING SAFETY OF A SYSTEM

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
None *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10360741B2 (en) 2016-06-02 2019-07-23 Airbus Operations (S.A.S) Predicting failures in an aircraft

Also Published As

Publication number Publication date
GB201321941D0 (en) 2014-01-22
FR2999318A1 (en) 2014-06-13

Similar Documents

Publication Publication Date Title
Littlewood et al. Software reliability and dependability: a roadmap
AU2014208308B2 (en) Safety analysis of a complex system using component-oriented fault trees
Graydon et al. Assurance based development of critical systems
Bozzano et al. Formal design and safety analysis of AIR6110 wheel brake system
US10061670B2 (en) Method and apparatus for automatically generating a component fault tree of a safety-critical system
US10539955B2 (en) Failure analysis validation and visualization
Papadopoulos et al. The potential for a generic approach to certification of safety critical systems in the transportation sector
US9087419B2 (en) Method and apparatus for remote e-Enabled aircraft solution management using an electronic flight bag (EFB)
US20040169591A1 (en) Certifying software for safety-critical systems
Feiler et al. Automated fault tree analysis from aadl models
Bouzekri et al. Engineering issues related to the development of a recommender system in a critical context: Application to interactive cockpits
Wilfredo Software fault tolerance: A tutorial
Zhao et al. Safety assessment of the reconfigurable integrated modular avionics based on STPA
US20180074484A1 (en) Method and apparatus for generating a fault tree for a failure mode of a complex system
GB2510253A (en) Evaluating the operating dependability of a complex system
US8271845B2 (en) Method for evaluating the operating safety of a system
Hugues et al. Model-based design and automated validation of ARINC653 architectures using the AADL
Mueller et al. Automated test artifact generation for a distributed avionics platform utilizing abstract state machines
Park et al. Model-based concurrent systems design for safety
Saraç Certification aspects of model based development for airborne software
Frazza et al. MBSA in Aeronautics: A Way to Support Safety Activities
Gürbüz et al. Safety perspective for supporting architectural design of safety-critical systems
Micouin et al. Property model methodology: a first assessment in the avionics domain
Schwierz et al. Assurance Benefits of ISO 26262 compliant Microcontrollers for safety-critical Avionics
Marques et al. Requirements Engineering in Aircraft Systems, Hardware, Software, and Database Development

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)