CN105955834A - Method for optimizing system available time through considering system life and soft errors - Google Patents

Info

Publication number
CN105955834A
Authority
CN
China
Prior art keywords
mttf
task
time
frequency
delta
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610311676.1A
Other languages
Chinese (zh)
Inventor
魏同权
周俊龙
梁文彬
邵高原
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University filed Critical East China Normal University
Priority to CN201610311676.1A priority Critical patent/CN105955834A/en
Publication of CN105955834A publication Critical patent/CN105955834A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/008Reliability or availability analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The present invention discloses a method for optimizing system available time by jointly considering system lifetime and soft errors. The method comprises the following steps: modeling a task set $T_n$ and a processor; and, given the task set $T_n$ and the set $F_n$ of task execution frequencies on the processor, calculating the mean time to failure due to transient faults, $MTTF_T$, with the provided analytical method, and the mean time to failure due to permanent faults, $MTTF_P$, with a system-level lifetime reliability modeling tool. The overall available time of the system depends on the smaller of the two values. The relationship between $\gamma MTTF_T$ and $MTTF_P$, where $\gamma$ is defined in the description, is classified into four cases, each of which is handled accordingly so that the overall available time of the system is maximized. For case (ii), a partial backup and speedup heuristic algorithm (PRS) is adopted. Traditional reliability optimization usually focuses on only one of system lifetime reliability and soft-error reliability. The present invention formulates and solves the problem of maximizing system available time while fully considering permanent faults, transient faults and a throughput constraint.

Description

A method for optimizing system available time by balancing system lifetime and soft errors
Technical field
The present invention relates to real-time systems, and in particular to a method that considers two different kinds of reliability, system lifetime reliability and soft-error reliability, and optimizes the overall available time of the system by balancing the two; specifically, a method for optimizing system available time by balancing system lifetime and soft errors.
Background
Over the past 30 years, as transistor integration density on chips has increased, processor performance has improved greatly. However, rising power density causes operating temperature to climb steeply and fluctuate frequently, which accelerates the chip aging caused by electromigration (EM), time-dependent dielectric breakdown (TDDB), stress migration (SM), thermal cycling (TC) and similar mechanisms. As aging accelerates, permanent faults occur more readily and system lifetime shrinks. Lower operating voltages make integrated circuits more vulnerable to transient faults, which reduces soft-error reliability. Researchers in China and abroad have studied these problems in depth. Amrouch et al. studied the effect of individual aging mechanisms on system lifetime and the interactions among the mechanisms. Chantem et al. proposed an online reliability-aware task allocation and scheduling algorithm that slows processor aging. Huang et al. developed a software/hardware recovery mechanism based on a scheduling algorithm to tolerate both permanent and transient faults.
Most existing work focuses on only one of the two kinds of reliability, and optimizing one usually weakens the other. It is therefore important to consider both together and to measure them with a unified metric. For most systems, users care only about maximizing system available time by increasing the MTTF (mean time to failure), not about the kind of fault (transient or permanent). To reduce the cost of repairing or replacing the whole system and to maintain quality of service, improving both system lifetime and soft-error reliability is highly valuable.
Summary of the invention
The object of the present invention is to address the optimization of the overall available time of a system. The optimization fully considers both system lifetime reliability and soft-error reliability and, depending on the relationship between the two, applies different processing to optimize the overall available time; the result is a method for optimizing system available time by balancing system lifetime and soft errors. For case (ii), $\gamma MTTF_T < MTTF_P$ but not $\gamma MTTF_T \ll MTTF_P$, a partial backup and speedup heuristic algorithm (PRS) is adopted. The present invention formulates and solves the problem of maximizing system available time while fully considering permanent faults, transient faults and a throughput constraint.
The object of the present invention is achieved as follows:
A method for optimizing system available time by balancing system lifetime and soft errors, characterized in that the method comprises the following steps:
Step 1: establish the task set model and the processor model;
Step 2: calculate $\gamma MTTF_T$ and $MTTF_P$, where $MTTF_T$ is the mean time to failure due to transient faults, $MTTF_P$ is the mean time to failure due to permanent faults, and $\gamma$ is a ratio treated as a constant; the smaller of the two values is defined as the overall available time of the system;
Step 3: handle the relationship between $\gamma MTTF_T$ and $MTTF_P$ case by case: in case (i), $\gamma MTTF_T \ll MTTF_P$, go to Step 4; in case (ii), $\gamma MTTF_T < MTTF_P$ but not $\gamma MTTF_T \ll MTTF_P$, go to Step 5; in case (iii), $\gamma MTTF_T > MTTF_P$ but not $\gamma MTTF_T \gg MTTF_P$, go to Step 6; in case (iv), $\gamma MTTF_T \gg MTTF_P$, go to Step 7;
Step 4: in case (i), $\gamma MTTF_T \ll MTTF_P$, back up every original task in the task set; the processor executes both the original tasks and the backup tasks at the maximum frequency $f_{max}$; go to Step 8;
Step 5: in case (ii), $\gamma MTTF_T < MTTF_P$ but not $\gamma MTTF_T \ll MTTF_P$, use a partial backup and speedup heuristic algorithm (PRS) to select some tasks to be backed up or to be executed at the maximum frequency $f_{max}$; go to Step 8;
Step 6: in case (iii), $\gamma MTTF_T > MTTF_P$ but not $\gamma MTTF_T \gg MTTF_P$, use the dynamic voltage scaling (DVS) method to lower the operating temperature by scaling task frequencies, thereby increasing $MTTF_P$; go to Step 8;
Step 7: in case (iv), $\gamma MTTF_T \gg MTTF_P$, adjust the system with a method that optimizes lifetime reliability, thereby increasing $MTTF_P$; go to Step 8;
Step 8: the optimization ends.
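The dispatch in Step 3 can be sketched in Python as follows. This is only an illustration: the text does not say how large the gap must be for "much smaller" or "much larger" to apply, so the factor `much` is a hypothetical parameter, and the names `Case` and `classify` are likewise introduced here for the example.

```python
from enum import Enum


class Case(Enum):
    I = "back up every task and run at f_max"          # gamma*MTTF_T << MTTF_P
    II = "PRS: partial backup or speedup"              # gamma*MTTF_T <  MTTF_P
    III = "DVS: scale task frequencies down"           # gamma*MTTF_T >  MTTF_P
    IV = "lifetime-reliability optimization"           # gamma*MTTF_T >> MTTF_P


def classify(gamma, mttf_t, mttf_p, much=10.0):
    """Return which of the four cases of Step 3 applies.

    ASSUMPTION: `much` stands in for "<<" and ">>"; the source gives no
    numeric threshold, so 10x is purely illustrative.
    """
    scaled = gamma * mttf_t
    if scaled * much <= mttf_p:
        return Case.I
    if scaled < mttf_p:
        return Case.II
    if scaled >= mttf_p * much:
        return Case.IV
    # Also reached when scaled == mttf_p, a boundary the source leaves open.
    return Case.III
```

For instance, `classify(1.0, 50.0, 100.0)` falls into case (ii), the only case for which the method introduces a new heuristic.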
Step 1 specifically comprises:
Step A1: establishing the task set model:
where $n$ is the number of independent tasks in the task set;
Step A2: establishing the task execution time model:
$$t_i = c_i / f_i$$
where $t_i$ is the execution time of task $\tau_i$ at frequency $f_i$; $f_i$ ($f_{min} \le f_i \le f_{max}$) is the operating frequency of the processor, normalized with respect to $f_{max}$ so that frequency values lie in [0, 1]; $f_{min}$ is the minimum operating frequency of the processor; $f_{max}$ is the maximum operating frequency of the processor; and $c_i$ is the execution time of task $\tau_i$ at the maximum frequency $f_{max}$;
Step 2 specifically comprises:
Step B1: establishing the failure rate model:
$$\lambda(f) = \lambda_0 \cdot 10^{\frac{d(1-f)}{1-f_{min}}}$$
where $\lambda_0$ is the failure rate at the maximum operating frequency $f_{max}$ of the processor, $d$ ($> 0$) is a hardware-specific constant, $f$ is the processor frequency, and $f_{min}$ is the minimum operating frequency of the processor;
Step B2: establishing the task reliability model:
$$R_i = e^{-\lambda(f_i)\frac{c_i}{f_i}}$$
where $\lambda(f_i)$ is the failure rate at operating frequency $f_i$, $c_i$ is the execution time of task $\tau_i$ at the maximum frequency $f_{max}$, and $f_i$ is the operating frequency of the processor;
Step B3: establishing the task reliability model when backup is used:
$$R_i^{rep} = 1 - \left(1 - e^{-\lambda(f_i)\frac{c_i}{f_i}}\right)^2$$
where $\lambda(f_i)$ is the failure rate at operating frequency $f_i$, $c_i$ is the execution time of task $\tau_i$ at the maximum frequency $f_{max}$, and $f_i$ is the operating frequency of the processor;
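The three models of Steps B1 to B3 translate directly into code. The following Python sketch assumes frequencies normalized so that the maximum frequency equals 1; the function names are chosen here for illustration and do not come from the source.

```python
import math


def failure_rate(f, lam0, d, f_min):
    """Step B1: transient failure rate at normalized frequency f."""
    return lam0 * 10 ** (d * (1 - f) / (1 - f_min))


def reliability(f, c, lam0, d, f_min):
    """Step B2: probability that a task with worst-case time c (at f_max)
    completes one execution at frequency f without a soft error."""
    return math.exp(-failure_rate(f, lam0, d, f_min) * c / f)


def reliability_with_backup(f, c, lam0, d, f_min):
    """Step B3: reliability when the task has one backup copy, i.e. the
    task fails only if both the original and the backup fail."""
    r = reliability(f, c, lam0, d, f_min)
    return 1 - (1 - r) ** 2
```

Lowering the frequency hurts reliability twice in this model: the failure rate grows exponentially and the execution time $c_i/f_i$ grows as well, which is why speeding a task up is a competitor to backing it up.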
Step B4: establishing the failure rate model of the task set:
where $R_i$ is the reliability of task $\tau_i$ at execution frequency $f_i$;
Step B5: establishing the model of the mean time to failure due to transient faults, $MTTF_T$:
where the model uses the execution time of the whole task set and the expected time at which the first fault occurs within the first round of execution of the task set;
Step B6: establishing the system reliability model:
$$\left\{ \frac{MTTF_T}{MTTF_T + MTTR_T},\ \frac{MTTF_P}{MTTF_P + MTTR_P} \right\}$$
where $MTTF_T$ is the mean time to failure due to transient faults, $MTTF_P$ is the mean time to failure due to permanent faults, $MTTR_T$ is the mean time to repair a transient fault, and $MTTR_P$ is the mean time to repair a permanent fault;
Step B7: optimizing system reliability:
$$\max: \{\gamma MTTF_T,\ MTTF_P\}$$
where $MTTF_T$ is the mean time to failure due to transient faults, $MTTF_P$ is the mean time to failure due to permanent faults, and $\gamma$ is a ratio treated as a constant;
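As a small illustration of Steps B6 and B7, the sketch below computes the two availability ratios and the quantity that Step 2 defines as the overall available time, namely the smaller of $\gamma MTTF_T$ and $MTTF_P$. The function names and all numeric inputs are hypothetical.

```python
def availability(mttf, mttr):
    """Fraction of time a component is up: MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)


def overall_available_time(gamma, mttf_t, mttf_p):
    """The smaller of gamma*MTTF_T and MTTF_P, which the method maximizes."""
    return min(gamma * mttf_t, mttf_p)


# Hypothetical values, in arbitrary time units.
a_transient = availability(mttf=99.0, mttr=1.0)
bottleneck = overall_available_time(gamma=1.0, mttf_t=5.0, mttf_p=8.0)
```

Because the objective is a minimum of two terms, improving the larger term alone never helps; this is what motivates the case split on which term is the bottleneck.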
Step 4 specifically comprises:
Step C1: optimizing system available time in case (i), $\gamma MTTF_T \ll MTTF_P$:
Every original task in the task set is backed up; the processor executes both the original tasks and the backup tasks at the maximum frequency $f_{max}$,
where $f_{max}$ is the maximum frequency of the processor;
Step 5 specifically comprises:
Step D1: optimizing system available time in case (ii), $\gamma MTTF_T < MTTF_P$ but not $\gamma MTTF_T \ll MTTF_P$:
The partial backup and speedup algorithm (PRS) selects some tasks to be backed up or sped up (executed at the maximum frequency $f_{max}$),
where $f_{max}$ is the maximum frequency of the processor;
Step D2: establishing the backup-or-speedup flag of a single task:
Step D3: establishing the model of the difference between the soft-error reliability increment of backup and the soft-error reliability increment of speedup:
$$\Delta R_i^{r-s} = \Delta R_i^{r} - \Delta R_i^{s} = 2R_i - R_i^2 - R_i|_{f_i = f_{max}}$$
where $\Delta R_i^{r} = R_i^{rep} - R_i$ and $\Delta R_i^{s} = R_i|_{f_i = f_{max}} - R_i$; $R_i^{rep}$ is the reliability of task $\tau_i$ with backup, $R_i$ is the reliability of task $\tau_i$ without backup, and $R_i|_{f_i = f_{max}}$ is the reliability of task $\tau_i$ at operating frequency $f_{max}$;
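The closed form in Step D3 follows from the backup reliability model of Step B3. A short worked derivation, using only the symbols already defined:

```latex
\begin{aligned}
\Delta R_i^{r} &= R_i^{rep} - R_i = 1 - (1 - R_i)^2 - R_i = R_i - R_i^2,\\
\Delta R_i^{s} &= R_i\big|_{f_i = f_{max}} - R_i,\\
\Delta R_i^{r-s} &= \left(R_i - R_i^2\right) - \left(R_i\big|_{f_i = f_{max}} - R_i\right)
                 = 2R_i - R_i^2 - R_i\big|_{f_i = f_{max}}.
\end{aligned}
```

A positive difference means that backing the task up gains more soft-error reliability than running it at the maximum frequency, and a negative difference means the opposite.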
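Step D4: RS state determination algorithm:
1) if $\Delta R_i^{r-s}(f_{min}) > 0$
2) then $RS_i = 1$; // $\Delta R_i^{r-s}(f_i) > \Delta R_i^{r-s}(f_{min}) \Rightarrow \Delta R_i^{r} > \Delta R_i^{s}$
3) if $\Delta R_i^{r-s}(f_{max}) < 0$
4) then $RS_i = 0$; // $\Delta R_i^{r-s}(f_i) < \Delta R_i^{r-s}(f_{max}) \Rightarrow \Delta R_i^{r} < \Delta R_i^{s}$
5)
6) then $RS_i = 1$; // $\Delta R_i^{r-s}(f_i) > \Delta R_i^{r-s}(f^*) \Rightarrow \Delta R_i^{r} > \Delta R_i^{s}$
7)
8) then $RS_i = 0$; // $\Delta R_i^{r-s}(f_i) < \Delta R_i^{r-s}(f^*) \Rightarrow \Delta R_i^{r} < \Delta R_i^{s}$
9) return $RS_i$
where $\Delta R_i^{r} = R_i^{rep} - R_i$ and $\Delta R_i^{s} = R_i|_{f_i = f_{max}} - R_i$; $R_i^{rep}$ is the reliability of task $\tau_i$ with backup, $R_i$ is the reliability of task $\tau_i$ without backup, $R_i|_{f_i = f_{max}}$ is the reliability of task $\tau_i$ at operating frequency $f_{max}$, and $f^*$ is the processor frequency at which $\Delta R_i^{r-s}(f^*) = 0$. The task is backed up when $RS_i = 1$ and sped up when $RS_i = 0$, so as to optimize soft-error reliability.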
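The RS decision can be sketched in Python as below. Two points are assumptions rather than statements of the source: the conditions on lines 5 and 7 of the algorithm are not reproduced in the text, so the comparison of $f_i$ with $f^*$ is inferred from the comments on lines 6 and 8; and $f^*$ is located here by bisection, which relies on $\Delta R_i^{r-s}$ increasing with frequency under the Step B1 and B2 models. Function names are illustrative, and frequencies are normalized so that $f_{max} = 1$.

```python
import math


def delta_r_minus_s(f, c, lam0, d, f_min, f_max=1.0):
    """Step D3: backup gain minus speedup gain at frequency f."""
    def rel(freq):
        lam = lam0 * 10 ** (d * (1 - freq) / (1 - f_min))
        return math.exp(-lam * c / freq)

    r = rel(f)
    return 2 * r - r * r - rel(f_max)


def rs_state(f_i, c, lam0, d, f_min, f_max=1.0):
    """Return 1 (back the task up) or 0 (speed the task up)."""
    def diff(f):
        return delta_r_minus_s(f, c, lam0, d, f_min, f_max)

    if diff(f_min) > 0:   # lines 1-2: backup wins over the whole range
        return 1
    if diff(f_max) < 0:   # lines 3-4: speedup wins over the whole range
        return 0
    # Lines 5-8 (ASSUMED): find f* with diff(f*) = 0, then compare f_i to f*.
    lo, hi = f_min, f_max
    for _ in range(60):
        mid = (lo + hi) / 2
        if diff(mid) < 0:
            lo = mid
        else:
            hi = mid
    f_star = hi
    return 1 if f_i > f_star else 0
```

With the embodiment's $\lambda_0 = 10^{-7}$ and $d = 3$, a hypothetical task with $c_i = 10$ and $f_{min} = 0.5$ is sped up when it currently runs at $f_{min}$ and backed up when it already runs at $f_{max}$, where speedup can no longer add anything.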
Step D5:PRS heuritic approach:
Wherein:For the task-set of input, mission frequency set Fn={ f1,f2,…fi,…fn, fminMinimum for processor Operation frequency, fmaxFor processor maximum operation frequency, γ is MTTFPAnd MTTRTRatio,For pending backup or add The task-set of speed operation,For the most not backing up the system current task collection the most not accelerating operation, first system preserves one IndividualBackup, update after decision when every step is finishedThen iteration comparesWith's Value, whenTime, first calculate each task τ to be operatediThe MTTF increasedT, determine that it is standby Part or acceleration operation, the MTTF of systemTMaximal increment is the most found, updates currentWhen Condition when being unsatisfactory for, algorithm returns maximum system uptime;
Step 6 specifically comprises:
Step E1: optimizing system available time in case (iii), $\gamma MTTF_T > MTTF_P$ but not $\gamma MTTF_T \gg MTTF_P$:
The DVS method is used to lower the operating temperature by scaling the task frequency $f_i$,
where $f_i$ ($f_{min} \le f_i \le f_{max}$) is the operating frequency of the processor;
Step 7 specifically comprises:
Step F1: optimizing system available time in case (iv), $\gamma MTTF_T \gg MTTF_P$:
A method that optimizes lifetime reliability is used for adjustment.
The present invention calculates the mean time to failure due to transient faults, $MTTF_T$, and uses a system-level lifetime reliability modeling tool to calculate the mean time to failure due to permanent faults, $MTTF_P$. According to the relationship between $\gamma MTTF_T$ and $MTTF_P$, four cases are distinguished and handled accordingly to maximize the overall available time of the system, where $\gamma$ is set to a fixed value. For case (ii), a partial backup and speedup heuristic algorithm (PRS) is adopted. Traditional reliability optimization often focuses on only one of system lifetime reliability and soft-error reliability, and optimizing one usually weakens the other. The present invention formulates and solves the problem of maximizing system available time while fully considering permanent faults, transient faults and a throughput constraint.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention;
Fig. 2 shows $\gamma MTTF_T$ and $MTTF_P$ for 15 different benchmark task sets;
Fig. 3 illustrates the value of the reliability increment difference $\Delta R_i^{r-s}$ under four different conditions;
Fig. 4 compares the PRS algorithm of the present invention with four other algorithms on synthetic benchmark task sets;
Fig. 5 compares the PRS algorithm of the present invention with four other algorithms on real-world benchmark task sets.
Detailed description
The present invention is described in further detail below with reference to the drawings and specific embodiments.
The task set in the present invention consists of $n$ independent tasks and is executed continuously while meeting a given throughput. The task set is executed with dynamic voltage/frequency scaling; the frequency of task $\tau_i$ is written $f_i$ ($f_{min} \le f_i \le f_{max}$), where $f_{min}$ is the minimum operating frequency of the processor and $f_{max}$ is the maximum operating frequency of the processor. The time required to execute a task at frequency $f_i$ is $t_i = c_i / f_i$, where $c_i$ is the execution time of task $\tau_i$ at the maximum frequency $f_{max}$.
The present invention uses $\lambda_0$ to denote the failure rate at the maximum operating frequency $f_{max}$ of the processor. The average failure rate is modeled as an exponential function scaled by $\lambda_0$ and can be expressed as $\lambda(f) = \lambda_0 \cdot 10^{\frac{d(1-f)}{1-f_{min}}}$. When a task executes at frequency $f_i$, its reliability can be expressed as $R_i = e^{-\lambda(f_i)\frac{c_i}{f_i}}$. The present invention tolerates transient faults by backup; the task reliability with backup can be expressed as $R_i^{rep} = 1 - \left(1 - e^{-\lambda(f_i)\frac{c_i}{f_i}}\right)^2$. A failure rate model of the task set is then established. The present invention also defines the execution time of the whole task set and the expected time at which the first fault occurs within the first round of execution of the task set. $MTTF_T$ (the mean time to failure due to transient faults) is expressed in terms of these three parameters.
Subject to the given throughput constraint, and considering failures that affect both system lifetime and soft-error reliability, the present invention formulates a single-objective optimization problem whose goal is to maximize system available time. The system available time optimization model is established as $\left\{ \frac{MTTF_T}{MTTF_T + MTTR_T},\ \frac{MTTF_P}{MTTF_P + MTTR_P} \right\}$, where $MTTF_T$ is the mean time to failure due to transient faults, $MTTF_P$ is the mean time to failure due to permanent faults, $MTTR_T$ is the mean time to repair a transient fault, and $MTTR_P$ is the mean time to repair a permanent fault. A ratio is denoted $\gamma$, which can be regarded as a constant. With both system lifetime and soft-error reliability taken into account, the overall available time of the system is then determined by the smaller of $\gamma MTTF_T$ and $MTTF_P$; to maximize system available time, the problem can be converted into $\max: \{\gamma MTTF_T,\ MTTF_P\}$.
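The conversion from two availability ratios to a comparison of $\gamma MTTF_T$ with $MTTF_P$ can be made explicit. The derivation below shows one constant for which the conversion is exact; reading $\gamma$ as $MTTR_P / MTTR_T$ is an assumption made here for illustration, since the defining expression of $\gamma$ is not reproduced in this text. All four quantities are positive, so cross-multiplication preserves the inequality.

```latex
\begin{aligned}
\frac{MTTF_T}{MTTF_T + MTTR_T} \le \frac{MTTF_P}{MTTF_P + MTTR_P}
&\iff MTTF_T \,(MTTF_P + MTTR_P) \le MTTF_P \,(MTTF_T + MTTR_T)\\
&\iff MTTF_T \cdot MTTR_P \le MTTF_P \cdot MTTR_T\\
&\iff \frac{MTTR_P}{MTTR_T}\, MTTF_T \le MTTF_P .
\end{aligned}
```

Under that reading, whichever of $\gamma MTTF_T$ and $MTTF_P$ is smaller identifies the fault type that currently limits availability, which matches the case split used by the method.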
As shown in Fig. 1, the present invention distinguishes four cases according to the relationship between $\gamma MTTF_T$ and $MTTF_P$ and handles each separately. Before doing so, it first verifies that all four cases actually occur. Simulation experiments compared the values of $\gamma MTTF_T$ and $MTTF_P$ under different benchmark task sets, core states, temperature profiles and frequencies. For the same processor, after the benchmark task set and the parameter settings were varied, the results shown in Fig. 2 confirm that all four cases do occur.
In case (i), $\gamma MTTF_T \ll MTTF_P$, every original task in the task set is backed up, and the processor executes both the original tasks and the backup tasks at the maximum frequency $f_{max}$. In case (iii), $\gamma MTTF_T > MTTF_P$ but not $\gamma MTTF_T \gg MTTF_P$, the DVS method lowers the operating temperature by scaling the task frequency $f_i$ so as to maximize system available time. In case (iv), $\gamma MTTF_T \gg MTTF_P$, a lifetime-reliability-aware strategy is used for adjustment so as to maximize system available time. Many solutions already exist for these three cases, so the present invention describes them only briefly. In case (ii), $\gamma MTTF_T < MTTF_P$ but not $\gamma MTTF_T \ll MTTF_P$, a partial backup and speedup heuristic algorithm (PRS) is adopted.
In the partial backup and speedup heuristic, the value of $\Delta R_i^{r-s} = \Delta R_i^{r} - \Delta R_i^{s}$ first determines the relationship between $\Delta R_i^{r}$ and $\Delta R_i^{s}$. The soft-error reliability increment of backup is $\Delta R_i^{r} = R_i^{rep} - R_i$, where $R_i^{rep}$ is the reliability of task $\tau_i$ with backup and $R_i$ is the reliability of task $\tau_i$ without backup. The soft-error reliability increment of speedup is $\Delta R_i^{s} = R_i|_{f_i = f_{max}} - R_i$, where $R_i|_{f_i = f_{max}}$ is the reliability of task $\tau_i$ at operating frequency $f_{max}$ and $R_i$ is the reliability of the task at frequency $f_i$. Fig. 3 illustrates the value of $\Delta R_i^{r-s}$ under four different conditions.
The value of $RS_i$ is determined by the RS state determination algorithm of Step D4. When $RS_i$ is 1, task $\tau_i$ is backed up; when $RS_i$ is 0, task $\tau_i$ is sped up. A temporary task set holds the tasks awaiting a decision (backup or speedup); a copy of it is saved first and is updated after each decision. The current system task set is then initialized. The partial backup and speedup heuristic algorithm (PRS) iteratively compares the values of $\gamma MTTF_T$ and $MTTF_P$ for the current system workload. While the loop condition holds, the algorithm first calculates the increase in $MTTF_T$ for each pending task, decides its operation (backup or speedup), and then finds the largest increase in the system's $MTTF_T$. After the backup or speedup has been performed, the current task set is updated and $\tau_i$ is removed from the pending set. Finally, when the condition is no longer satisfied, the algorithm returns the maximum system available time.
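The greedy loop described above can be sketched in Python as follows. Several parts are assumptions labeled as such: the loop condition is taken to be $\gamma MTTF_T < MTTF_P$ (the source's condition is not reproduced), the $MTTF_T$ model is passed in as a hypothetical callable `mttf_t_of` because its formula is missing from the text, and the toy model at the bottom exists only to make the example executable.

```python
def prs(states, gamma, mttf_p, mttf_t_of, choose_op):
    """Greedy sketch of the PRS loop.

    states    : dict task -> "plain", "backup" or "speedup"
    mttf_t_of : callable(dict) -> MTTF_T of the task set (HYPOTHETICAL hook)
    choose_op : callable(task) -> "backup" or "speedup" (the RS decision)
    """
    current = dict(states)  # working copy of the current task set
    pending = [t for t, s in current.items() if s == "plain"]
    # ASSUMED loop condition: transient faults are still the bottleneck.
    while pending and gamma * mttf_t_of(current) < mttf_p:
        base = mttf_t_of(current)
        best = None
        for task in pending:
            op = choose_op(task)
            trial = dict(current)
            trial[task] = op
            gain = mttf_t_of(trial) - base
            if best is None or gain > best[0]:
                best = (gain, task, op)
        _, task, op = best
        current[task] = op      # apply the operation with the largest gain
        pending.remove(task)    # the task no longer awaits a decision
    return current, min(gamma * mttf_t_of(current), mttf_p)


# Toy MTTF_T model, purely illustrative: each task contributes a failure
# weight that shrinks when the task is backed up or sped up.
WEIGHT = {"a": 4.0, "b": 1.0}
FACTOR = {"plain": 1.0, "backup": 0.1, "speedup": 0.5}


def toy_mttf_t(states):
    return 1.0 / sum(WEIGHT[t] * FACTOR[s] for t, s in states.items())


final, available = prs({"a": "plain", "b": "plain"}, gamma=1.0, mttf_p=0.5,
                       mttf_t_of=toy_mttf_t, choose_op=lambda task: "backup")
```

In this toy run only task `a` is backed up: once $\gamma MTTF_T$ reaches $MTTF_P$, further backups would cost execution time without raising the minimum of the two terms, so the loop stops.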
After the corresponding operation has been applied in each of the four cases, system available time is maximized with both system lifetime failures and soft-error failures taken into account, and the optimization is complete.
To show how the partial backup and speedup heuristic algorithm (PRS) proposed in the present invention optimizes system available time, the experiments proceed as follows:
1. Comparison of the PRS algorithm of the present invention with four other algorithms on synthetic benchmark task sets.
2. Comparison of the PRS algorithm of the present invention with four other algorithms on real-world benchmark task sets.
Embodiment
Step 1: an Alpha 21264 microprocessor is used as the hardware platform, with the following processor frequency/voltage levels: 1.0 GHz/0.7 V, 1.25 GHz/0.8 V, 1.5 GHz/0.9 V, 1.75 GHz/1.0 V and 2.0 GHz/1.1 V. HotSpot 5.0 is used to obtain temperature profiles; the initial temperature and the ambient temperature are set to 330 K and 318.5 K.
Step 2: a system-level lifetime modeling tool is used to obtain $MTTF_P$. To estimate $MTTF_T$, the parameters are set to $\lambda_0 = 10^{-7}$ and $d = 3$. In the simulation experiments, $\gamma = 1$.
Step 3: the PRS algorithm and four other algorithms are evaluated on five groups of benchmarks based on synthetic task sets; each task set contains 20 tasks. The experimental results are shown in Fig. 4.
The four comparison algorithms are: the random algorithm (RA), in which each task is randomly backed up or sped up; the full speedup algorithm (FSA), in which every task is executed at the maximum frequency; the full backup algorithm (FRA), in which every task has one backup; and the energy-saving and reliability-aware algorithm (ERA), in which every task is executed at an energy-efficient frequency and all tasks share one recovery task to guarantee system reliability.
Step 4: the PRS algorithm and the four other algorithms are evaluated on four groups of benchmark task sets based on real applications, namely automotive-industrial, consumer-networking, telecom and MPEG, whose task sets contain 16, 20, 17 and 15 tasks respectively. The experimental results are shown in Fig. 5. The four comparison algorithms are the same as above.
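To connect the embodiment's settings with the models of Steps A2 and B1, the sketch below normalizes the five frequency levels by the maximum frequency and evaluates the failure rate at each level with $\lambda_0 = 10^{-7}$ and $d = 3$. The execution time `C_I` is a hypothetical value, and the variable names are chosen here for illustration.

```python
LEVELS_GHZ = [1.0, 1.25, 1.5, 1.75, 2.0]  # frequency levels from Step 1
LAMBDA0 = 1e-7                            # from Step 2
D = 3                                     # from Step 2

f_max_ghz = max(LEVELS_GHZ)
normalized = [f / f_max_ghz for f in LEVELS_GHZ]  # f_max maps to 1.0
f_min = min(normalized)


def failure_rate(f):
    """Step B1 evaluated with the embodiment's parameters."""
    return LAMBDA0 * 10 ** (D * (1 - f) / (1 - f_min))


rates = {f: failure_rate(f) for f in normalized}

C_I = 1.0  # HYPOTHETICAL execution time of one task at f_max
exec_times = {f: C_I / f for f in normalized}  # Step A2: t_i = c_i / f_i
```

At the lowest level the failure rate is $10^{d} = 1000$ times the rate at the highest level while the task also runs twice as long, which shows how strongly frequency scaling affects soft-error reliability in this model.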
The overall available time of the system is determined by the smaller of $\gamma MTTF_T$ and $MTTF_P$. The results in Fig. 4 and Fig. 5 show that the PRS algorithm of the present invention outperforms the RA, FSA, FRA and ERA algorithms, with average improvements of 36.6%, 19.8%, 24.3% and 11.5% respectively. The improvement of PRS over the four comparison algorithms can reach up to 85%, 33.5%, 51% and 45.1% respectively.
The present invention proposes a new analytical method for calculating the MTTF determined by soft errors, and sets out a method for optimizing system available time by balancing system lifetime and soft errors. For case (ii), a partial backup and speedup heuristic algorithm (PRS) is adopted. Experiments show that, compared with four representative algorithms, the PRS heuristic optimizes system available time well.

Claims (7)

1. A method for optimizing system available time by balancing system lifetime and soft errors, characterized in that the method comprises the following steps:
Step 1: establishing the task set model and the processor model;
Step 2: calculating $\gamma MTTF_T$ and $MTTF_P$, where $MTTF_T$ is the mean time to failure due to transient faults, $MTTF_P$ is the mean time to failure due to permanent faults, and $\gamma$ is a ratio treated as a constant; the smaller of the two values is defined as the overall available time of the system;
Step 3: handling four cases according to the relationship between $\gamma MTTF_T$ and $MTTF_P$: in case (i), $\gamma MTTF_T \ll MTTF_P$, go to Step 4; in case (ii), $\gamma MTTF_T < MTTF_P$ but not $\gamma MTTF_T \ll MTTF_P$, go to Step 5; in case (iii), $\gamma MTTF_T > MTTF_P$ but not $\gamma MTTF_T \gg MTTF_P$, go to Step 6; in case (iv), $\gamma MTTF_T \gg MTTF_P$, go to Step 7;
Step 4: in case (i), $\gamma MTTF_T \ll MTTF_P$, backing up every original task in the task set, the processor executing both the original tasks and the backup tasks at the maximum frequency $f_{max}$, and going to Step 8;
Step 5: in case (ii), $\gamma MTTF_T < MTTF_P$ but not $\gamma MTTF_T \ll MTTF_P$, using a partial backup and speedup heuristic algorithm (PRS) to select some tasks to be backed up or to be executed at the maximum frequency $f_{max}$, and going to Step 8;
Step 6: in case (iii), $\gamma MTTF_T > MTTF_P$ but not $\gamma MTTF_T \gg MTTF_P$, using the DVS method to lower the operating temperature by scaling task frequencies, thereby increasing $MTTF_P$, and going to Step 8;
Step 7: in case (iv), $\gamma MTTF_T \gg MTTF_P$, adjusting the system with a method that optimizes lifetime reliability, thereby increasing $MTTF_P$, and going to Step 8;
Step 8: ending the optimization.
2. The method for optimizing system available time according to claim 1, characterized in that Step 1 specifically comprises:
Step A1: establishing the task set model:
where $n$ is the number of independent tasks in the task set;
Step A2: establishing the task execution time model:
$$t_i = c_i / f_i$$
where $t_i$ is the execution time of task $\tau_i$ at frequency $f_i$; $f_i$ ($f_{min} \le f_i \le f_{max}$) is the operating frequency of the processor, normalized with respect to $f_{max}$ so that frequency values lie in [0, 1]; $f_{min}$ is the minimum operating frequency of the processor; $f_{max}$ is the maximum operating frequency of the processor; and $c_i$ is the execution time of task $\tau_i$ at the maximum frequency $f_{max}$.
3. The method for optimizing system available time according to claim 1, characterized in that Step 2 specifically comprises:
Step B1: establishing the failure rate model:
$$\lambda(f) = \lambda_0 \cdot 10^{\frac{d(1-f)}{1-f_{min}}}$$
where $\lambda_0$ is the failure rate at the maximum operating frequency $f_{max}$ of the processor, $d$ ($> 0$) is a hardware-specific constant, $f$ is the processor frequency, and $f_{min}$ is the minimum operating frequency of the processor;
Step B2: establishing the task reliability model:
$$R_i = e^{-\lambda(f_i)\frac{c_i}{f_i}}$$
where $\lambda(f_i)$ is the failure rate at operating frequency $f_i$, $c_i$ is the execution time of task $\tau_i$ at the maximum frequency $f_{max}$, and $f_i$ is the operating frequency of the processor;
Step B3: establishing the task reliability model when backup is used:
$$R_i^{rep} = 1 - \left(1 - e^{-\lambda(f_i)\frac{c_i}{f_i}}\right)^2$$
where $\lambda(f_i)$ is the failure rate at operating frequency $f_i$, $c_i$ is the execution time of task $\tau_i$ at the maximum frequency $f_{max}$, and $f_i$ is the operating frequency of the processor;
Step B4: establishing the failure rate model of the task set:
where $R_i$ is the reliability of task $\tau_i$ at execution frequency $f_i$;
Step B5: establishing the model of the mean time to failure due to transient faults, $MTTF_T$:
where the model uses the execution time of the whole task set and the expected time at which the first fault occurs within the first round of execution of the task set;
Step B6: establishing the model of the overall available time of the system:
$$\left\{ \frac{MTTF_T}{MTTF_T + MTTR_T},\ \frac{MTTF_P}{MTTF_P + MTTR_P} \right\}$$
where $MTTF_T$ is the mean time to failure due to transient faults, $MTTF_P$ is the mean time to failure due to permanent faults, $MTTR_T$ is the mean time to repair a transient fault, and $MTTR_P$ is the mean time to repair a permanent fault;
Step B7: optimizing system available time:
$$\max: \{\gamma MTTF_T,\ MTTF_P\}$$
where $MTTF_T$ is the mean time to failure due to transient faults, $MTTF_P$ is the mean time to failure due to permanent faults, and $\gamma$ is a ratio treated as a constant.
4. The method for optimizing system available time according to claim 1, characterized in that Step 4 specifically comprises:
Step C1: optimizing system available time in case (i), $\gamma MTTF_T \ll MTTF_P$:
Every original task in the task set is backed up; the processor executes both the original tasks and the backup tasks at the maximum frequency $f_{max}$,
where $f_{max}$ is the maximum frequency of the processor.
5. The method for optimizing system available time according to claim 1, characterized in that Step 5 specifically comprises:
Step D1: optimizing system available time in case (ii), $\gamma MTTF_T < MTTF_P$ but not $\gamma MTTF_T \ll MTTF_P$:
The partial backup and speedup algorithm (PRS) selects some tasks to be backed up or sped up (executed at the maximum frequency $f_{max}$),
where $f_{max}$ is the maximum frequency of the processor;
Step D2: establishing the backup-or-speedup flag of a single task:
Step D3: establishing the model of the difference between the soft-error reliability increment of backup and the soft-error reliability increment of speedup:
$$\Delta R_i^{r-s} = \Delta R_i^{r} - \Delta R_i^{s} = 2R_i - R_i^2 - R_i|_{f_i = f_{max}}$$
where $\Delta R_i^{r} = R_i^{rep} - R_i$ and $\Delta R_i^{s} = R_i|_{f_i = f_{max}} - R_i$; $R_i^{rep}$ is the reliability of task $\tau_i$ with backup, $R_i$ is the reliability of task $\tau_i$ without backup, and $R_i|_{f_i = f_{max}}$ is the reliability of task $\tau_i$ at operating frequency $f_{max}$;
Step D4: RS state determination algorithm:
1) if $\Delta R_i^{r-s}(f_{min}) > 0$
2) then $RS_i = 1$; // $\Delta R_i^{r-s}(f_i) > \Delta R_i^{r-s}(f_{min}) \Rightarrow \Delta R_i^{r} > \Delta R_i^{s}$
3) if $\Delta R_i^{r-s}(f_{max}) < 0$
4) then $RS_i = 0$; // $\Delta R_i^{r-s}(f_i) < \Delta R_i^{r-s}(f_{max}) \Rightarrow \Delta R_i^{r} < \Delta R_i^{s}$
5)
6) then $RS_i = 1$; // $\Delta R_i^{r-s}(f_i) > \Delta R_i^{r-s}(f^*) \Rightarrow \Delta R_i^{r} > \Delta R_i^{s}$
7)
8) then $RS_i = 0$; // $\Delta R_i^{r-s}(f_i) < \Delta R_i^{r-s}(f^*) \Rightarrow \Delta R_i^{r} < \Delta R_i^{s}$
9) return $RS_i$
where $\Delta R_i^{r} = R_i^{rep} - R_i$ and $\Delta R_i^{s} = R_i|_{f_i = f_{max}} - R_i$; $R_i^{rep}$ is the reliability of task $\tau_i$ with backup, $R_i$ is the reliability of task $\tau_i$ without backup, $R_i|_{f_i = f_{max}}$ is the reliability of task $\tau_i$ at operating frequency $f_{max}$, and $f^*$ is the processor frequency at which $\Delta R_i^{r-s}(f^*) = 0$; the task is backed up when $RS_i = 1$ and sped up when $RS_i = 0$, so as to optimize soft-error reliability;
Step D5: PRS heuristic algorithm:
where the input is the task set together with the task frequency set $F_n = \{f_1, f_2, \ldots, f_i, \ldots, f_n\}$; $f_{min}$ is the minimum operating frequency of the processor; $f_{max}$ is the maximum operating frequency of the processor; $\gamma$ is the ratio of $MTTF_P$ and $MTTR_T$; one set holds the tasks awaiting a backup or speedup operation, and another is the current system task set, in which no task has yet been backed up or sped up; the system first saves a copy of the pending set and updates it after each decision; it then iteratively compares the values of $\gamma MTTF_T$ and $MTTF_P$; while the loop condition holds, it first calculates the increase in $MTTF_T$ for each pending task, decides whether that task is to be backed up or sped up, and then finds the largest increase in the system's $MTTF_T$; after the backup or speedup has been performed, the current task set is updated and $\tau_i$ is removed from the pending set; finally, when the condition is no longer satisfied, the algorithm returns the maximum system available time.
6. The method for optimizing system available time according to claim 1, characterized in that Step 6 specifically comprises:
Step E1: optimizing system available time in case (iii), $\gamma MTTF_T > MTTF_P$ but not $\gamma MTTF_T \gg MTTF_P$:
The DVS method is used to lower the operating temperature by scaling the task frequency $f_i$,
where $f_i$ ($f_{min} \le f_i \le f_{max}$) is the operating frequency of the processor.
7. The method for optimizing system available time according to claim 1, characterized in that Step 7 specifically comprises:
Step F1: optimizing system available time in case (iv), $\gamma MTTF_T \gg MTTF_P$:
A method that optimizes lifetime reliability is used for adjustment.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610311676.1A CN105955834A (en) 2016-05-12 2016-05-12 Method for optimizing system available time through considering system life and soft errors

Publications (1)

Publication Number Publication Date
CN105955834A true CN105955834A (en) 2016-09-21

Family

ID=56911279

Country Status (1)

Country Link
CN (1) CN105955834A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160921