CN107844407A - A PRISM-based reliability verification method for SEU resistance - Google Patents
A PRISM-based reliability verification method for SEU resistance
- Publication number
- CN107844407A (application number CN201711102436.1A)
- Authority
- CN
- China
- Prior art keywords
- failure
- component
- prism
- security
- reliability
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3447—Performance evaluation by modeling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
- G06F18/295—Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a PRISM-based reliability verification method for SEU (single-event upset) resistance. The method applies formal verification in the early design phase of a system to analyze its reliability, availability, and safety under different hardening techniques and parameters, helping designers develop more reliable and more effective solutions. The system design is modeled with the PRISM probabilistic model checker, taking into account hardening techniques such as memory scrubbing and redundancy, and using CDFG scheduling (for fault recovery), fault detection coverage (related to safety), and a feature library (which provides the SEU probability of each component). The resulting model describes every state the system can reach.
Description
Technical field
The invention belongs to the field of computer system model checking, and specifically concerns a reliability verification method for embedded systems in the aerospace field.
Background art
In recent years, with the continuous development of aerospace technology, more and more integrated circuit devices (such as those in satellites) have had to operate in radiation environments. In 1962, Wallmark and Marcus predicted the effect of cosmic rays on microelectronic devices. In 1978, Pickel and Blandford analyzed anomalies in the memory system of a classified U.S. satellite and confirmed that they were caused by single-event upsets (SEUs). If no protective measures are taken against SEUs, the consequences of an upset in a critical electronic chip can be disastrous. Of 1589 U.S. satellite anomalies, 621 (39.1%) were caused by single-event upsets. China's Fengyun-1 (B) satellite suffered multiple SEU events when its main control computer was irradiated by high-energy charged particles; its attitude control system eventually failed and the satellite's life ended prematurely. Devices used in radiation environments therefore need stronger radiation-hardness and reliability guarantees than other kinds of devices.
Current radiation-hardened chips fall mainly into three types: application-specific integrated circuits (ASICs), one-time programmable devices (such as antifuse-based FPGAs), and reprogrammable devices (such as SRAM-based FPGAs). Taking aerospace applications as an example, reprogrammable devices have two advantages over the first two types. The first is cost. Because the production volume for any single aerospace application is usually very low, the unit cost of an ASIC is high, and the device must be redesigned and remanufactured for each application. SRAM-based FPGAs, by contrast, can be mass-produced, which lowers the manufacturing cost per device, and since the designer largely does not need to consider process and manufacturing issues, design cost falls and the product development cycle shortens. The second is reconfigurability. Given the environmental complexity and high cost of aerospace applications, reconfigurability is extremely valuable. On the one hand, if a design problem is found or a fault occurs after launch, a reconfigurable device only needs the corrected circuit to be uploaded remotely for the system to return to normal, which greatly improves maintainability. On the other hand, because the system is reconfigurable, its tasks and functions can be changed and updated as needed, which avoids redevelopment and gives the aerospace system a high degree of intelligence. Because of these advantages, SRAM-based FPGAs have attracted wide attention and have been used in many space projects. For example, Xilinx Virtex FPGAs (xQvR400oxL) were used successfully in the 2003 Mars exploration mission, and the European Venus exploration mission of 2005 also used a system based on Virtex FPGAs (XQVR600). Compared with the first two device types, however, SRAM-based FPGAs are more susceptible to SEUs, so the device must be hardened against SEUs to make it fault tolerant and improve system reliability.
In 2011, researchers at the manned spaceflight general department of the China Academy of Space Technology proposed a hardening scheme against SEUs. The scheme hardens the FPGA with several measures: first, detection by readback and comparison; second, memory scrubbing; and third, self-correcting BRAM macros. All three are methods recommended by Xilinx; they counter SEUs effectively and improve system reliability. The Indian Institute of Technology proposed a dynamic partial reconfiguration scheme to counter single-event upsets and improve reliability. The scheme uses partial reconfiguration based on a modular design (RM) to reconfigure the logic resources in the FPGA that are prone to single-event upsets, which both uses FPGA resources efficiently and improves reliability. Much work on FPGA reliability has thus been done in China and abroad, and it improves system reliability to some extent. Even with these hardening schemes, however, the final system may still fail to meet the reliability standard. The system must therefore undergo rigorous reliability verification before it is formally put into operation.
Formal verification means using mathematical methods, during the design of computer hardware and software systems, to prove or disprove that the system satisfies a formal specification or property. With traditional methods such as simulation, covering all possible behaviors of the system is a classic problem. Formal verification is superior in this respect, because formal methods can verify a given formal specification by exhaustively exploring the state space of the system. Model checking is a mature, automated formal verification technique for verifying the correctness of finite-state systems. Given a formal model of the system and the specifications and properties to be verified, a model-checking algorithm automatically and exhaustively searches all states the system can reach to check whether the specifications and properties are satisfied; if one is not, it produces a counterexample. Model checking is widely applicable, for example to examine the safety, performance, or independence of a system.
Summary of the invention
In view of the shortcomings of existing reliability verification methods for single-event effect resistance, the present invention provides a verification method for assessing the reliability, availability, and safety of integrated circuit devices in the aerospace field. It applies formal verification in the early design phase of a system to analyze reliability, availability, and safety under different hardening techniques and parameters, helping designers develop more reliable and more effective solutions and reducing overall design cost.
The technical solution adopted by the invention is a PRISM-based reliability verification method for SEU resistance, which uses a probabilistic model checker to assist chip design in the aerospace field. First, a CDFG is generated from the program to be verified; the CDFG comprises a control flow graph (CFG) and a data flow graph (DFG) and captures all state and behavior information of the system. The CDFG is then divided into time steps corresponding to clock cycles at the register transfer level (RTL), and the fault recovery technique of rescheduling is applied. Next, the CDFG is allocated to different numbers of components and modeled. Finally, the system design is modeled and verified with the PRISM probabilistic model checker. The method specifically comprises the following steps:
Step 1: Build the reliability verification platform for single-event effect resistance: set up PRISM, an open-source probabilistic model checker, then test it and perform the initial configuration.
PRISM is an open-source probabilistic model checker with several model-checking engines, some of which are symbolic (they use binary decision diagrams and extensions such as multi-terminal binary decision diagrams). These engines can verify models with up to 10^10 states (on average, PRISM handles models with up to 10^7 to 10^8 states). PRISM also offers several advanced techniques, such as abstraction refinement and symmetry reduction. In addition, it supports approximate/statistical model checking through a discrete-event simulation engine.
Step 2: Extract the control and data flow graph (CDFG) from a high-level language description (such as C/C++).
A CDFG is composed of arithmetic and logical operations and can represent all behaviors of an algorithm. Tools such as GAUT and SUIF can extract a CDFG from a high-level design description written in a high-level language (such as C/C++). C/C++ is the mainstream high-level language in aerospace, and many programs written in C/C++ are loaded onto embedded chips used in rockets and spacecraft. The method applies equally to other languages, such as Java and Fortran.
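The idea of dividing a DFG into time steps and rescheduling it after a component failure can be illustrated with a small sketch. The Python below is a generic resource-constrained list scheduler, not the scheduler used by the invention; the function name `list_schedule`, the operation names (`m0`, `a0`, and so on), and the four-tap multiply-accumulate graph are hypothetical and chosen only for illustration. Each operation is placed in the earliest time step in which all of its predecessors have finished and a component of the required type is still free.

```python
def list_schedule(ops, deps, resources):
    """Assign each operation to a 1-based time step.

    ops:       {operation name: component type}, in priority order
    deps:      {operation name: [predecessor names]}
    resources: {component type: number of working components}
    """
    for kind in set(ops.values()):
        if resources.get(kind, 0) < 1:
            raise ValueError(f"no working component of type {kind!r}")
    step_of = {}
    step = 0
    while len(step_of) < len(ops):
        step += 1
        free = dict(resources)          # components available in this step
        progressed = False
        for name, kind in ops.items():
            if name in step_of:
                continue
            preds = deps.get(name, [])
            # An operation is ready once every predecessor finished earlier.
            ready = all(p in step_of and step_of[p] < step for p in preds)
            if ready and free[kind] > 0:
                free[kind] -= 1
                step_of[name] = step
                progressed = True
        if not progressed:
            raise ValueError("dependency cycle or unknown predecessor")
    return step_of


# Hypothetical DFG: y = c0*x0 + c1*x1 + c2*x2 + c3*x3
OPS = {"m0": "mul", "m1": "mul", "m2": "mul", "m3": "mul",
       "a0": "add", "a1": "add", "a2": "add"}
DEPS = {"a0": ["m0", "m1"], "a1": ["m2", "m3"], "a2": ["a0", "a1"]}

full = list_schedule(OPS, DEPS, {"add": 2, "mul": 2})
degraded = list_schedule(OPS, DEPS, {"add": 1, "mul": 1})
print(max(full.values()), max(degraded.values()))
```

For this example graph the schedule takes 4 time steps with two adders and two multipliers and 6 time steps with one of each, which is the kind of throughput loss that the degraded states of the later models represent.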
Step 3: Model the extracted CDFG in the PRISM modeling language. Specifically, the CDFG is first extracted, and the PRISM modeling language is then used to model the CDFG under different configurations (numbers of available components) and under radiation-environment conditions (hardening technique, fault occurrence and recovery, and so on). The failure rate of each component is obtained from the component feature library. PRISM is then used to verify various reliability properties automatically and check whether the system meets its requirements.
Probabilistic model checking analyzes systems that exhibit random behavior and automatically verifies the probabilistic properties defined for them. It has been applied successfully in many fields, such as randomized distributed algorithms, communication and security protocols, and biology. A Markov model is the model corresponding to the class of random processes known as Markov chains, which have the following property: given the current state (the present), the future evolution does not depend on the past evolution. Many real-world processes can be regarded as Markov processes, such as the Brownian motion of particles in a liquid, the number of people infected by an infectious disease, or the number of people waiting at a station. In 1931, A. N. Kolmogorov, in his paper "Analytical Methods in Probability Theory", first applied analytical methods such as differential equations to this class of processes and laid the theoretical foundation of Markov processes. Four models are most common in probabilistic model checking: discrete-time Markov chains (DTMCs), continuous-time Markov chains (CTMCs), Markov decision processes (MDPs), and probabilistic timed automata (PTAs). When modeling a system, the model type is chosen according to the characteristics of the system's behavior.
Brief description of the drawings
Fig. 1 is a diagram of radiation-induced single-event effects;
Fig. 2 is a schematic diagram of the single-event upset effect;
Fig. 3 is a schematic diagram of the basic structure of Virtex-series FPGAs;
Fig. 4 is a schematic diagram of a logic error in a three-input look-up table caused by an SEU;
Fig. 5 is a schematic diagram of the basic structure of the TMR method;
Fig. 6 shows pseudocode and the corresponding DFG;
Fig. 7 is an example of the CDFG rescheduling fault recovery technique;
Fig. 8 is a schematic diagram of probabilistic model checking;
Fig. 9 is an example of CTMC reliability and availability analysis;
Fig. 10 is the safety model that considers whether a failure is safe.
Embodiment
To make the technical means, inventive features, objectives, and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments.
The invention is a verification method for assessing the reliability, availability, and safety of integrated circuit devices in the aerospace field. It applies formal verification in the early design phase of a system to analyze reliability, availability, and safety under different hardening techniques and parameters, helping designers develop more reliable and more effective solutions and reducing overall design cost. The system design is modeled with the PRISM probabilistic model checker, taking into account hardening techniques such as memory scrubbing and redundancy; the resulting model describes every state the system can reach. The invention uses CDFG scheduling (for fault recovery), fault detection coverage (related to safety), and a feature library (which provides the SEU probability of each component). The theory of probabilistic model checking and the specific modeling are described in detail below.
Different kinds of models have their own model checking techniques. Probabilistic model checking analyzes systems that exhibit random behavior and automatically verifies the probabilistic properties defined for them. It has been applied successfully in many fields, such as randomized distributed algorithms, communication and security protocols, and biology.
At present, all of these probabilistic models are Markov models, that is, models of the class of random processes known as Markov chains, which were introduced by the Russian mathematician A. A. Markov in 1907. Such a process has the following property: given the current state (the present), its future evolution does not depend on its past evolution. Many real-world processes can be regarded as Markov processes, such as the Brownian motion of particles in a liquid, the number of people infected by an infectious disease, or the number of people waiting at a station. In 1931, A. N. Kolmogorov, in his paper "Analytical Methods in Probability Theory", first applied analytical methods such as differential equations to this class of processes and laid the theoretical foundation of Markov processes. A CTMC is a random process with the Markov property: given the state X(s) at the present time s and the states X(u) at all past times u, 0 ≤ u ≤ s, the conditional distribution of the state X(t+s) at the future time t+s depends only on the present state X(s) and is independent of the past. A CTMC has continuous time and discrete states; examples include weather forecasting, the random walk of a particle, and the gambler's ruin problem. The process by which an SRAM-based FPGA fails over time because of SEUs can likewise be regarded as a continuous-time Markov process. A CTMC consists of a set of states S and a transition rate matrix $R : S \times S \to \mathbb{R}_{\ge 0}$. The rate R(s, s') determines the delay before a transition between states s and s'. If R(s, s') ≠ 0, the probability that the transition from s to s' fires within time t is $1 - e^{-R(s, s') \cdot t}$. If R(s, s') = 0, no transition occurs.
PRISM (Probabilistic Symbolic Model Checker) is a widely used probabilistic model checking tool. It is free, open-source software released by the University of Oxford in 1999. Once PRISM has loaded a model file and a properties file, the user can verify one particular property or all properties. PRISM also defines the notion of an experiment: an experiment assigns initial values to all the variables of the model and runs through one execution of the model. As model parameters vary, PRISM can plot how the model's behavior changes, so experiments make it easy to see which factors influence system behavior. The PRISM language has two parts: a model language and a property language. The model language is a state-based specification language used for modeling. The property language includes temporal logics such as PCTL and CSL. PRISM supports automated analysis of a wide range of quantitative properties of a model.
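The CTMC transition probability $1 - e^{-R(s, s') \cdot t}$ is easy to evaluate numerically. The following minimal Python sketch does so; the function name `transition_probability` and the example rate of 1e-6 per second are illustrative assumptions, not values from the invention.

```python
import math


def transition_probability(rate: float, t: float) -> float:
    """Probability that a CTMC transition with the given rate fires within time t."""
    if rate < 0 or t < 0:
        raise ValueError("rate and t must be non-negative")
    return 1.0 - math.exp(-rate * t)


# Illustrative values only: a rate of 1e-6 per second observed over one day.
p_one_day = transition_probability(1e-6, 86_400.0)
print(f"{p_one_day:.4f}")
```

A rate of zero gives probability zero for any t, which matches the statement that no transition occurs when R(s, s') = 0.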
Because the SEU rate λ of an SRAM-based FPGA depends heavily on the device process technology, the architecture, and the orbit, this parameter differs from one case to another. CREME96 [7] is used to obtain the per-bit SEU rate $\lambda_{bit}$ of the Xilinx Virtex-5 in HEO and in low Earth orbit (LEO). The failure rate of a component can then be calculated with the following equation:
$\lambda_{component} = \lambda_{bit} \times \text{Number of critical bits}$ (1)
In the present invention, $\lambda_{bit} = 7.31 \times 10^{-12}$ SEUs/bit/s, and "Number of critical bits" is the number of critical bits of the component.
Table 1 lists the parameters for SEU-induced component failures. The first column is the component and the second column is its number of essential bits. In general, a component has fewer critical bits than essential bits; the experiments assume the worst case, in which all essential bits of a component are critical. The third column is the mean time between failures (MTBF) in days. The component failure rate λ and the MTBF are related as follows:
$\mathrm{MTBF} = 1 / \lambda_{component}$ (2)
Table 1: Feature library
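As a worked example of equation (1), the sketch below computes a component failure rate from the per-bit rate given in the text and converts it to an MTBF in days. It assumes the usual relation MTBF = 1/λ for exponentially distributed failures; the bit count of 100,000 is hypothetical and is not taken from Table 1, and the function names are illustrative.

```python
LAMBDA_BIT = 7.31e-12      # SEUs per bit per second, the value given in the text
SECONDS_PER_DAY = 86_400


def component_failure_rate(critical_bits: int, lambda_bit: float = LAMBDA_BIT) -> float:
    """Equation (1): failure rate of a component in failures per second."""
    return lambda_bit * critical_bits


def mtbf_days(failure_rate_per_sec: float) -> float:
    """Mean time between failures in days, assuming MTBF = 1 / lambda."""
    return 1.0 / failure_rate_per_sec / SECONDS_PER_DAY


# Hypothetical component with 100,000 critical bits (not a Table 1 entry).
rate = component_failure_rate(100_000)
print(f"lambda = {rate:.3e} per second, MTBF = {mtbf_days(rate):.2f} days")
```

Because the worst case treats every essential bit as critical, the resulting rate is an upper bound on the failure rate under these assumptions.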
1) Reliability and availability analysis modeling
CTMC models are often used for reliability modeling of degradable systems. Each state in the CTMC model of a particular configuration can be classified according to the number of working components. For example, an FIR filter needs at least one adder and one multiplier to complete a successful operation, so any state that does not meet this minimum resource requirement is marked as a failed state. The state marked as all failed represents the situation in which all components of the system have failed one after another because of SEUs. Safe and unsafe failures are not considered at this modeling stage; how safety is included in the model is described in the next subsection. In the initial state of a configuration all components are available and the system has maximum throughput. The edges between states represent transition rates. The modeling assumptions are as follows.
Assumption 1: Components fail independently, and both the time to component failure caused by configuration bit flips and the memory scrubbing interval follow an exponential distribution. Under this assumption the repair rate is $\mu = 1/\tau$, where τ is the memory scrubbing interval.
Assumption 2: Only data flow components fail. In many systems the data flow components make up the overwhelming majority of the design compared with the control flow components, so they are far more likely to be affected by SEUs.
Assumption 3: Only one component can fail because of an SEU at a time, and the failure of each component can easily be detected. This assumption keeps the complexity of the Markov model manageable.
Assumption 4: A cold spare component can fail because of cosmic radiation only while it is active. Cold spares provide redundancy and become active only when a component of the same type fails.
Assumption 5: The reconfiguration and rescheduling time (that is, the time the system needs to reschedule after a component fails and the time needed to repair by memory scrubbing) is very small compared with the time between failures and repairs. Rescheduling takes at most a few clock cycles, and memory scrubbing takes only a few milliseconds.
Assumption 6: All states in the CTMC model can be divided into three types:
1) Operational: all components of the system are working and the throughput of the system is at its maximum.
2) Degraded: at least one component has failed; the system continues to work with the remaining components, but its throughput is lower.
3) Failed: the remaining components are not enough to complete a successful operation, so the throughput is 0.
Take a device with 2 adders and 2 multipliers as an example. Its Markov state transition model is shown in Fig. 8. The device is assumed to need at least 1 adder and 1 multiplier to complete a successful operation. In each state, A and M denote adders and multipliers, and the number in front of each letter gives how many of them are working. Based on the state transition diagram, PRISM code describes all states and behaviors of this system. The states are classified with formulas: three formulas, operational, degraded, and failed, identify the states in which the system is operating normally, degraded, and failed, respectively.
The system with 2 adders and 2 multipliers is modeled in PRISM as shown above, where num_A and num_M are the numbers of adders and multipliers available in the initial state of the configuration. The variables λA and λM are the failure rates of the adder and the multiplier, and miu is the repair rate. Each repair (memory scrub) returns the system to its initial state. PRISM then builds the corresponding probabilistic model, which in this case is a CTMC. The repair transitions in the code are synchronized on the label [repair], to model the fact that all components are repaired at the same time when the FPGA memory is scrubbed. The formulas operational, degraded, and failed classify the normal, degraded, and failed states of the model.
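A comparable CTMC can be sketched outside PRISM to make the state space and the three state classes concrete. The Python below is only an illustration under stated assumptions: each working component is assumed to fail independently, so a state with a working adders has an adder failure rate of a × λA (the original model may use different rates); a scrub at rate μ returns any non-initial state, including failed ones, to the initial state; the numeric rates are hypothetical per-day values; and uniformization is this sketch's choice of transient solver. All function names are illustrative.

```python
import math
from itertools import product


def build_ctmc(num_a, num_m, lam_a, lam_m, mu):
    """Return (states, rates, init); a state (a, m) counts working adders/multipliers."""
    states = list(product(range(num_a + 1), range(num_m + 1)))
    init = (num_a, num_m)
    rates = {s: {} for s in states}
    for a, m in states:
        if a > 0:
            rates[(a, m)][(a - 1, m)] = a * lam_a   # assumed independent failures
        if m > 0:
            rates[(a, m)][(a, m - 1)] = m * lam_m
        if (a, m) != init and mu > 0:
            rates[(a, m)][init] = mu                # scrub repairs everything at once
    return states, rates, init


def classify(state, init):
    """Mirror the operational / degraded / failed formulas described in the text."""
    a, m = state
    if a == 0 or m == 0:
        return "failed"
    return "operational" if state == init else "degraded"


def transient(states, rates, init, t, eps=1e-12):
    """State distribution at time t, computed by plain uniformization."""
    vec = {s: 1.0 if s == init else 0.0 for s in states}
    q = max(sum(out.values()) for out in rates.values())
    if q == 0:
        return vec
    if q * t > 700:
        raise ValueError("q*t is too large for this naive implementation")
    dist = {s: 0.0 for s in states}
    weight, acc, k = math.exp(-q * t), 0.0, 0      # Poisson(q*t) weight for k jumps
    while acc < 1.0 - eps and k < 10_000:
        for s in states:
            dist[s] += weight * vec[s]
        acc += weight
        nxt = {s: 0.0 for s in states}
        for s, out in rates.items():               # one step of the uniformized chain
            nxt[s] += vec[s] * (1.0 - sum(out.values()) / q)
            for dest, r in out.items():
                nxt[dest] += vec[s] * r / q
        vec = nxt
        k += 1
        weight *= q * t / k
    return dist


def class_probabilities(dist, init):
    totals = {"operational": 0.0, "degraded": 0.0, "failed": 0.0}
    for s, p in dist.items():
        totals[classify(s, init)] += p
    return totals


# Hypothetical rates per day, not taken from the feature library.
states, rates, init = build_ctmc(2, 2, lam_a=0.02, lam_m=0.05, mu=1.0)
print(class_probabilities(transient(states, rates, init, t=10.0), init))
```

Setting `mu=0.0` removes the repair transitions, which turns the same sketch into a pure reliability model in which failures are never undone.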
2) Safety analysis modeling
The model so far assumes that the fault detection algorithm correctly detects and handles every fault. This is not the case in practice, because some faults always escape the implemented fault detection mechanism. When that happens, the system cannot reschedule and keeps running in a faulty mode. Each component of the configuration that implements the CDFG may therefore fail in either a safe or an unsafe way, and the model must be refined by introducing the concepts of safe failure and unsafe failure, defined as follows:
Definition 1: A safe failure state is entered when the SEU-induced failures of components have been detected correctly and rescheduling finds that the remaining components are not enough to complete a successful operation.
Definition 2: An unsafe failure state is one in which a component fails silently, that is, the system does not detect the failure when it occurs. If all component failures are detected, the system eventually enters the safe failure state; but if even a single unsafe failure occurs, the system immediately enters the unsafe failure state. The fault detection coverage of a component is given by the conditional probability C:
$C = P(\text{fault detection} \mid \text{fault existence})$ (3.2)
Fig. 10 shows the safety model of a simple single-component-type system with only two adders (including repair transitions). In this case the system is assumed to need at least one adder to complete an operation successfully. Initially the system is operational with two adders. When one adder fails and the failure is detected, the system reschedules and continues with a single adder; if the failure is not detected, the system moves to the unsafe failure state. If the other adder then fails, the system can no longer operate and fails safely; but if this failure is not detected, the system ends up failing unsafely. Including safety in the model requires Assumption 6 to be changed as follows:
Assumption 6: All states in the CTMC can be divided into four types:
1) Operational: all components work normally and the system throughput is highest.
2) Degraded: at least one component has failed.
3) Safe failure: too few working components remain to perform a successful operation, so the throughput is 0. To reach the safe failure state, all the failures that caused the system to fail must have been safe.
4) Unsafe failure: at least one failure was not detected by the fault detection algorithm. A silent component failure causes the system to enter the unsafe failure state immediately.
With the fault detection coverage C added, the code above is modified as follows:
The above is the PRISM model of the system with 2 adders and 2 multipliers with the coverage C added. The steps above describe in detail how the analysis method proposed in this patent is carried out. The core of the method is probabilistic model checking: the SRAM-based FPGA in a radiation environment is idealized and abstracted as a continuous-time Markov model, and the system states are divided by the number of working components into three classes: operational, degraded, and failed. The transitions between states are determined by the SEU rate λ and the repair rate μ, where λ is related to the component MTBF in the feature library and μ is related to the memory scrubbing rate.
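The effect of the coverage C on the two-adder safety model can be explored with a short sketch. The Python below is an illustration under assumptions that are not stated in the source: the two working adders fail independently (so the exit rate of the operational state is 2λ), a detected failure has probability C and an undetected one 1 − C, the scrub at rate μ repairs the degraded state, and both failure states are treated as absorbing. The state names, function names, and numeric values are hypothetical.

```python
def build_safety_ctmc(lam, mu, c):
    """Rate table of a two-adder system with fault detection coverage c."""
    return {
        "operational": {
            "degraded": 2 * lam * c,              # first failure, detected
            "unsafe_failed": 2 * lam * (1 - c),   # first failure, silent
        },
        "degraded": {
            "operational": mu,                    # scrub repairs the failed adder
            "safe_failed": lam * c,               # last adder fails, detected
            "unsafe_failed": lam * (1 - c),       # last adder fails, silent
        },
        "safe_failed": {},                        # absorbing (assumption)
        "unsafe_failed": {},                      # absorbing (assumption)
    }


def absorption_probability(rates, target, start, iterations=10_000):
    """Probability of ending in `target`, by iterating on the embedded jump chain."""
    prob = {s: 1.0 if s == target else 0.0 for s in rates}
    for _ in range(iterations):
        for s, out in rates.items():
            total = sum(out.values())
            if total > 0:                         # skip absorbing states
                prob[s] = sum(r / total * prob[d] for d, r in out.items())
    return prob[start]


# Hypothetical values: failure rate 0.05/day, scrub rate 1/day, coverage 0.9.
model = build_safety_ctmc(lam=0.05, mu=1.0, c=0.9)
p_unsafe = absorption_probability(model, "unsafe_failed", "operational")
p_safe = absorption_probability(model, "safe_failed", "operational")
print(f"P(unsafe failure) = {p_unsafe:.3f}, P(safe failure) = {p_safe:.3f}")
```

Under these assumptions the system ends in the unsafe failure state with probability 0.73 even at a coverage of 0.9, because every repair gives a later fault another chance to go undetected; with C = 1 the unsafe state is never reached.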
Claims (6)
1. A PRISM-based reliability verification method for SEU resistance, which applies formal verification in the early design phase of a system to analyze system reliability, availability, and safety under different hardening techniques and parameters and helps designers develop more reliable and more effective solutions, characterized in that it is a verification method for assessing the reliability, availability, and safety of integrated circuit devices in the aerospace field; the system design is modeled with the PRISM probabilistic model checker, taking into account hardening techniques such as memory scrubbing and redundancy, and the resulting model describes every state the system can reach; the method uses CDFG scheduling (for fault recovery), fault detection coverage (related to safety), and a feature library (which provides the SEU probability of each component).
2. The PRISM-based reliability verification method for SEU resistance according to claim 1, characterized in that it comprises the following steps. A: Build the reliability verification platform for single-event effect resistance: set up PRISM, an open-source probabilistic model checker, then test it and perform the initial configuration. B: Extract the control and data flow graph (CDFG) from a high-level language description (such as C/C++). C: Model the extracted CDFG in the PRISM modeling language; specifically, first extract the CDFG, then use the PRISM modeling language to model the CDFG under different configurations (numbers of available components) and under radiation-environment conditions (hardening technique, fault occurrence and recovery, and so on). The failure rate of each component is obtained from the component feature library, and PRISM is then used to verify various reliability properties automatically and check whether the system meets its requirements.
3. The PRISM-based anti-SEU reliability verification method according to claim 1, characterized in that six assumptions are adopted in the reliability and availability analysis modeling. It is assumed that components fail independently, and that the time to failure of a component caused by configuration bit flips and the memory refresh interval both follow an exponential distribution, which simplifies the problem. In many systems, data-flow components make up the vast majority of the design, so it is assumed that only data-flow components fail. It is assumed that only one component suffers an SEU fault at a time, which keeps the complexity of the Markov chain low. It is assumed that a cold-spare component can fail due to cosmic radiation only while it is active; cold-spare components provide redundancy, and a cold spare becomes active only when a component of the same type fails. It is assumed that the reconfiguration and rescheduling times are very small compared with the time between failure and repair. Finally, it is assumed that all states in the CTMC model can be divided into three types: normal operation, degraded operation, and failure.
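The exponential assumption can be written out explicitly. As a sketch, let λ denote a per-component failure rate and n the number of active components of one type that fail independently (both symbols are introduced here for illustration and do not appear in the claim):

```latex
\[
P(T \le t) = 1 - e^{-\lambda t}, \qquad R(t) = P(T > t) = e^{-\lambda t}
\]
\[
T_{\min} = \min(T_1, \dots, T_n)
\;\Rightarrow\;
P(T_{\min} > t) = \prod_{i=1}^{n} e^{-\lambda t} = e^{-n\lambda t}
\]
```

The time to the first failure among n active components is therefore again exponentially distributed, with rate nλ, which is what allows the system to be described as a continuous-time Markov chain.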
4. The PRISM-based anti-SEU reliability verification method according to claim 1, characterized in that, in the safety analysis modeling, the model is improved to cover cases in which a fault cannot be handled by introducing the concepts of safe failure and unsafe failure. A safe failure is defined as follows: the faults caused in components by SEUs are correctly detected, the system reschedules and finds that the number of remaining components is no longer sufficient to complete a successful operation, and the system enters the safe-failure state. The unsafe-failure state refers to the fail-silent behavior that occurs when the system cannot detect a component fault. If all component faults are safely detected, the system eventually enters the safe-failure state; but if even a single unsafe fault occurs, the system immediately enters the unsafe-failure state. The fault detection coverage of a component can be determined by the conditional probability C.
5. The fault detection coverage of a component according to claim 4, characterized in that the fault detection coverage of a component is determined by the conditional probability C, given by C = P(fault detection | fault existence). This is illustrated by the safety model of a simple system with a single component type and only two adders (including repair transitions). In this case, it is assumed that the system needs at least one adder to complete its operation successfully. Initially, the system is in the normal operating mode with two adders. When one adder fails and the fault is detected, the system reschedules and simply continues with one adder. If the fault is not detected, the system moves to the unsafe-failure state. If the other adder then fails, the system can no longer continue its operation and therefore fails safely.
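The two-adder example can be worked through numerically. The following is a minimal sketch, not the patented implementation. It assumes a per-adder failure rate `lam`, an optional repair rate `mu` from the degraded state back to normal, and that the coverage C applies to every fault, so a fault is detected at rate C·lam and missed at rate (1−C)·lam. The function names and all numeric values are hypothetical, and the transient solution uses uniformization in plain Python rather than PRISM itself.

```python
import math

STATES = ["normal", "degraded", "safe_failure", "unsafe_failure"]


def two_adder_generator(lam, c, mu=0.0):
    """Build the CTMC generator matrix for the two-adder safety model."""
    Q = [[0.0] * 4 for _ in range(4)]
    Q[0][1] = 2 * lam * c          # one of two adders fails, fault detected
    Q[0][3] = 2 * lam * (1 - c)    # one of two adders fails, fault missed
    Q[1][2] = lam * c              # last adder fails, fault detected
    Q[1][3] = lam * (1 - c)        # last adder fails, fault missed
    Q[1][0] = mu                   # repair back to two adders (assumption)
    for i in range(4):
        Q[i][i] = -sum(Q[i][j] for j in range(4) if j != i)
    return Q


def transient(Q, p0, t, tol=1e-12, max_terms=10000):
    """State distribution at time t, computed by uniformization."""
    n = len(Q)
    q = max(-Q[i][i] for i in range(n))  # uniformization rate
    if q == 0.0 or t == 0.0:
        return list(p0)
    # Embedded DTMC: P = I + Q / q
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / q for j in range(n)]
         for i in range(n)]
    term = list(p0)                  # p0 * P^k, starting with k = 0
    weight = math.exp(-q * t)        # Poisson probability of k jumps
    result = [weight * x for x in term]
    acc, k = weight, 0
    while 1.0 - acc > tol and k < max_terms:
        k += 1
        term = [sum(term[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= q * t / k
        acc += weight
        result = [r + weight * x for r, x in zip(result, term)]
    return result


if __name__ == "__main__":
    # Placeholder parameters, not measured values.
    Q = two_adder_generator(lam=0.1, c=0.9)
    p = transient(Q, [1.0, 0.0, 0.0, 0.0], t=5.0)
    for name, prob in zip(STATES, p):
        print(f"{name}: {prob:.6f}")
```

With C = 1 the unsafe-failure state is unreachable in this sketch, and lowering C moves probability mass from the safe-failure state to the unsafe one.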
6. The fault detection coverage of a component according to claim 5, characterized in that, in order to include safety in the model, assumption 6 is modified: it is assumed that all states in the CTMC can be divided into four types. (1) Normal operation: all components function normally and the throughput of the system is at its highest. (2) Degraded operation: at least one component has failed. (3) Safe failure: the number of remaining fault-free components is insufficient to execute a successful operation, so the throughput is 0; to reach the safe-failure state, all faults that led to the system failure must have been safe. (4) Unsafe failure: at least one fault was not detected by the fault detection algorithm. The fail-silent behavior of a component immediately causes the system to enter the unsafe-failure state.
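The four state types can be expressed as a simple classification rule. This is an illustrative sketch only: the enum, the function, and its parameters are hypothetical names, and the rule assumes a single component type with a known minimum number of components required for operation.

```python
from enum import Enum


class SystemState(Enum):
    NORMAL = "normal operation"
    DEGRADED = "degraded operation"
    SAFE_FAILURE = "safe failure"
    UNSAFE_FAILURE = "unsafe failure"


def classify(total, working, required, undetected_faults):
    """Map a system snapshot to one of the four state types.

    total             -- number of components in the configuration
    working           -- number of components still fault-free
    required          -- minimum number needed for a successful operation
    undetected_faults -- faults missed by the fault detection algorithm
    """
    if undetected_faults > 0:
        # A single undetected (fail-silent) fault is enough.
        return SystemState.UNSAFE_FAILURE
    if working < required:
        # All faults were detected, but too few components remain.
        return SystemState.SAFE_FAILURE
    if working < total:
        return SystemState.DEGRADED
    return SystemState.NORMAL


if __name__ == "__main__":
    # Two adders, one required, as in the two-adder example.
    print(classify(total=2, working=2, required=1, undetected_faults=0))
    print(classify(total=2, working=1, required=1, undetected_faults=0))
    print(classify(total=2, working=0, required=1, undetected_faults=0))
    print(classify(total=2, working=1, required=1, undetected_faults=1))
```

The check for undetected faults comes first because an unsafe failure takes precedence over every other condition.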
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711102436.1A CN107844407A (en) | 2017-11-06 | 2017-11-06 | A kind of reliability verification method of the anti-SEU based on PRISM |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107844407A (en) | 2018-03-27 |
Family
ID=61682547
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711102436.1A Pending CN107844407A (en) | 2017-11-06 | 2017-11-06 | A kind of reliability verification method of the anti-SEU based on PRISM |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107844407A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104007427A (en) * | 2014-05-20 | 2014-08-27 | 上海微小卫星工程中心 | Mean free error time assessment method and system based on irradiation test satellite-borne responder |
CN104035828A (en) * | 2014-05-19 | 2014-09-10 | 上海微小卫星工程中心 | FPGA space irradiation comprehensive protection method and device |
CN105185413A (en) * | 2015-09-24 | 2015-12-23 | 中国航天科技集团公司第九研究院第七七一研究所 | Automatic verification platform and method for on-chip memory management unit fault-tolerant structure |
CN106649173A (en) * | 2016-10-10 | 2017-05-10 | 上海航天控制技术研究所 | High-reliability in-orbit self-correction system and method for on-board computer on the basis of 1553B bus |
WO2017101238A1 (en) * | 2015-12-16 | 2017-06-22 | 南京南瑞继保电气有限公司 | Apparatus and method for ensuring reliability of protection trip of intelligent substation |
Non-Patent Citations (3)
Title |
---|
LUCA CASSANO: "Analysis and test of the effects of single event upsets affecting the configuration memory of SRAM-based FPGAs", 《IEEE INTERNATIONAL TEST CONFERENCE》 * |
MARWAN AMMAR ET AL: "System-Level Analysis of the Vulnerability of Processors Exposed to Single-Event Upsets via Probabilistic Model Checking", 《IEEE TRANSACTIONS ON NUCLEAR SCIENCE》 * |
王辉等: "一种考虑防护措施的缓存可靠性评估方法", 《东南大学学报(自然科学版)》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112799890A (en) * | 2020-12-31 | 2021-05-14 | 南京航空航天大学 | Bus SEU-resistant reliability modeling and evaluating method |
CN114741133A (en) * | 2022-04-21 | 2022-07-12 | 中国航空无线电电子研究所 | Comprehensive modularized avionics system resource allocation and evaluation method based on model |
CN114741133B (en) * | 2022-04-21 | 2023-10-27 | 中国航空无线电电子研究所 | Comprehensive modularized avionics system resource allocation and assessment method based on model |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180327 |