CN103106139B - Based on the software failure time forecasting methods that relevance vector regression is estimated - Google Patents
- Publication number
- CN103106139B (application number CN201310013004.9A)
- Authority
- CN
- China
- Prior art keywords
- software
- software failure
- sigma
- relevance vector
- failure time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a software failure time prediction method based on relevance vector regression estimation. Each software failure time is learned together with the m failure times that precede it, so that the inherent dependence between failure times is captured and a software reliability prediction method based on the relevance vector machine is built on it. Because the method takes full account of the small-sample nature of software reliability prediction, and because kernel function techniques can cope both with observed variables that outnumber the samples and with multicollinearity among the variables, it avoids the model over-fitting produced by modeling methods such as neural networks. In the new prediction method, as software failures continue to occur, the model parameters are automatically and continually adjusted to the dynamic changes of the failure process. This achieves adaptive prediction of software reliability and effectively improves the adaptability of the software failure prediction model.
Description
[technical field]
The present invention relates to a method for predicting the next software failure time, or the failure times over a longer future period, during software reliability testing and evaluation.
[background technology]
Software reliability is the probability that software does not fail under specified conditions within a specified time. Traditional approaches to reliability prediction rest on large-sample statistical theory, and are therefore prone to problems such as over-learning and poor applicability.
Statistical learning theory is built on a more solid theoretical foundation and provides a unified framework for learning from finite samples. It subsumes many existing methods and is expected to help solve many previously intractable problems, such as neural network structure selection and local minima. The relevance vector machine (RVM) is a sparse Bayesian learning model proposed by Tipping in 2001. It has been applied successfully in many fields, such as object tracking, 3D pose estimation, 3D model recovery, power load forecasting, and channel equalization.
[summary of the invention]
The technical problem to be solved by the invention is to provide a software failure time prediction method based on relevance vector regression estimation that achieves adaptive prediction of software reliability. To this end, the invention adopts the following technical solution, which comprises the following steps:
(1) first, observe and record the software failure time data set, and normalize all input and output data;
(2) through abstraction and assumptions, convert the software failure time prediction problem into a function regression problem;
(3) select the kernel function used for prediction and set the initial values of its parameters;
(4) select the number of failure data used for learning;
(5) use the relevance vector regression estimation algorithm to learn and optimize on different failure time data sets;
(6) finally, use the optimized parameters to predict new failure times.
Further, in step (2) the software failure time prediction problem is converted into a function regression problem as follows:
Assume that the software failure times that have occurred are $t_1, t_2, \ldots, t_n$, and let $t_l = f(t_{l-m}, t_{l-m+1}, \ldots, t_{l-1})$, so that $t_l$ follows a fixed but unknown conditional distribution function $F(t_l \mid t_{l-m}, t_{l-m+1}, \ldots, t_{l-1})$. Predicting $t_{k+1}$ given $t_1, t_2, \ldots, t_k$ then becomes: given the $k-m$ observations $(T_1, t_{m+1}), (T_2, t_{m+2}), \ldots, (T_{k-m}, t_k)$ and the $(k-m+1)$-th input $T_{k-m+1}$, estimate the $(k-m+1)$-th output value $\hat{t}_{k+1}$, where $T_i$ denotes the $m$-dimensional vector $[t_i, t_{i+1}, \ldots, t_{i+m-1}]$.
The kernel function used in step (3) is the Gaussian kernel function $\kappa(x, y) = e^{-g\langle x-y,\,x-y\rangle^{2}}$, with initial parameter value $g = 1$.
The number of failure data in step (4) is an integer between 5 and 8.
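As a small worked instance of this conversion, take the hypothetical values $k = 7$ and $m = 5$ (chosen only for illustration), with each $T_i$ holding the $m$ failure times immediately preceding its target. There are then $k - m = 2$ training observations and one input whose output is to be estimated:

```latex
\begin{aligned}
T_1 &= [t_1, t_2, t_3, t_4, t_5] &&\longmapsto\; t_6 && \text{(observation 1)}\\
T_2 &= [t_2, t_3, t_4, t_5, t_6] &&\longmapsto\; t_7 && \text{(observation 2)}\\
T_3 &= [t_3, t_4, t_5, t_6, t_7] &&\longmapsto\; \hat{t}_8 && \text{(output to be estimated)}
\end{aligned}
```

Here $T_3 = T_{k-m+1}$ and $\hat{t}_8 = \hat{t}_{k+1}$, matching the indices in the text.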
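Further, in step (5), learning and optimization with the relevance vector regression estimation algorithm on different failure time data sets comprises the following process:
(5.1) Given a set of vectors $\{x_i\}_{i=1}^{N}$ and the corresponding target values $\{t_i\}_{i=1}^{N}$ as input, assume that the relation between $x$ and $t$ satisfies:
$p(t_i) = N(t_i \mid y(x_i; w), \sigma^2)$
(5.2) The probability distribution of $t$ is then:
$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{-\frac{1}{2\sigma^2}\lVert t - \Phi w\rVert^2\right\}$
where $\Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]^T$, $\phi(x_n) = [1, k(x_n, x_1), k(x_n, x_2), \ldots, k(x_n, x_N)]^T$,
and $w = [w_0, w_1, \ldots, w_N]^T$.
(5.3) Define a prior probability distribution over each weight $w_i$:
$p(w \mid \alpha) = \prod_{i=0}^{N} N(w_i \mid 0, \alpha_i^{-1})$
where $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_N)$.
(5.4) Compute the posterior distribution of the unknowns:
$p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{p(t)}$
(5.5) After integration and simplification:
$p(w \mid t, \alpha, \sigma^2) = N(w \mid \mu, \Sigma), \quad p(t \mid \alpha, \sigma^2) = N(t \mid 0, \Omega)$
where $\mu = \sigma^{-2}\Sigma\Phi^T t$, $\Sigma = (A + \sigma^{-2}\Phi^T\Phi)^{-1}$, $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$, and $\Omega = \sigma^2 I + \Phi A^{-1}\Phi^T$.
(5.6) Compute the approximate solution of $p(t_* \mid t)$:
$p(t_* \mid t) \approx \int p(t_* \mid w, \sigma_{MP}^2)\, p(w \mid t, \alpha_{MP}, \sigma_{MP}^2)\, dw$
(5.7) Solve for $\alpha_{MP}$ iteratively with the following formulas:
$\alpha_i^{new} = \frac{\gamma_i}{\mu_i^2}, \quad (\sigma^2)^{new} = \frac{\lVert t - \Phi\mu\rVert^2}{N - \sum_i \gamma_i}, \quad \gamma_i = 1 - \alpha_i\Sigma_{ii}$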
By adopting the technical solution of the invention, the RVM learns each software failure time together with the m failure times preceding it, capturing the inherent dependence between failure times, and a software reliability prediction method based on the relevance vector machine is built on this basis. Through kernel function techniques, the software reliability prediction problem is converted into a regression estimation problem, which is solved with the relevance vector regression estimation algorithm. In the new prediction method, as software failures continue to occur, the model parameters are automatically and continually adjusted to the dynamic changes of the failure process, thereby achieving adaptive prediction of software reliability.
[brief description of the drawings]
Fig. 1 is a flowchart of the software failure time prediction method of the invention.
[detailed description of the invention]
1) Data normalization
When a regression estimation algorithm is used for learning and prediction, all input and output data must first be normalized to the interval [0.1, 0.9]. The conversion formula is:
$y = \frac{0.8}{\Delta}\,x + \left(0.9 - 0.8\,\frac{x_{\max}}{\Delta}\right)$
where $y$ is the normalized value, $x$ is the actual value, $x_{\max}$ is the maximum in the data set, $x_{\min}$ is the minimum, and $\Delta = x_{\max} - x_{\min}$. After prediction, the following mapping converts the data back to actual values:
$x = \frac{y - 0.9}{0.8}\,\Delta + x_{\max}$
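The normalization step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names and the sample values are hypothetical, and the mapping is written as $y = 0.1 + 0.8\,(x - x_{\min})/\Delta$, which is the linear map that sends $x_{\min}$ to 0.1 and $x_{\max}$ to 0.9.

```python
def normalize(values, lo=0.1, hi=0.9):
    """Map values linearly onto [lo, hi]; returns (scaled, x_min, x_max)."""
    x_min, x_max = min(values), max(values)
    delta = x_max - x_min
    if delta == 0:
        raise ValueError("all values are equal; the range is zero")
    scaled = [lo + (hi - lo) * (x - x_min) / delta for x in values]
    return scaled, x_min, x_max


def denormalize(scaled, x_min, x_max, lo=0.1, hi=0.9):
    """Inverse mapping from [lo, hi] back to the original scale."""
    delta = x_max - x_min
    return [x_min + (y - lo) * delta / (hi - lo) for y in scaled]


# Made-up cumulative failure times, for illustration only
failure_times = [3.0, 33.0, 146.0, 227.0, 342.0]
scaled, x_min, x_max = normalize(failure_times)
restored = denormalize(scaled, x_min, x_max)
```

The minimum and maximum are returned alongside the scaled data because the same two values are needed later to map predictions back to real failure times.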
2) Problem conversion
Assume that the software failure times that have occurred are $t_1, t_2, \ldots, t_n$, and let $t_l = f(t_{l-m}, t_{l-m+1}, \ldots, t_{l-1})$, so that $t_l$ follows a fixed but unknown conditional distribution function $F(t_l \mid t_{l-m}, t_{l-m+1}, \ldots, t_{l-1})$. Learning the software failure time data with the RVM captures the inherent dependence between failure times. The input of the RVM is the $m$-dimensional vector $[t_{l-m}, t_{l-m+1}, \ldots, t_{l-1}]$ and its output is $t_l$, so the overall RVM input sequence is $t_1, t_2, \ldots, t_n, \ldots$;
the output sequence is $t_{m+1}, t_{m+2}, \ldots, t_n, t_{n+1}, \ldots$.
If the failure time sequence used for RVM learning is $t_1, t_2, \ldots, t_k$ ($k > m$), then predicting $t_{k+1}$ given $t_1, t_2, \ldots, t_k$ becomes: given the $k-m$ observations $(T_1, t_{m+1}), (T_2, t_{m+2}), \ldots, (T_{k-m}, t_k)$ and the $(k-m+1)$-th input $T_{k-m+1}$, estimate the $(k-m+1)$-th output value $\hat{t}_{k+1}$, where $T_i$ denotes the $m$-dimensional vector $[t_i, t_{i+1}, \ldots, t_{i+m-1}]$. Taking $\hat{t}_{k+1}$ as input, $\hat{t}_{k+2}$ can then be predicted, and $\hat{t}_{k+3}, \ldots$ can be obtained in the same way.
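The windowing and the recursive reuse of estimates can be sketched as below. The function names are hypothetical, and `last_gap_predictor` is only a placeholder standing in for a trained regression model; it is not part of the described method.

```python
def make_windows(times, m):
    """Return (inputs, targets): each input holds the m values preceding its target."""
    if len(times) <= m:
        raise ValueError("need more than m failure times")
    inputs = [times[i:i + m] for i in range(len(times) - m)]
    targets = [times[i + m] for i in range(len(times) - m)]
    return inputs, targets


def forecast(times, m, steps, predict_one):
    """Recursive multi-step forecast: each estimate is appended and reused as input."""
    history = list(times)
    estimates = []
    for _ in range(steps):
        nxt = predict_one(history[-m:])
        estimates.append(nxt)
        history.append(nxt)
    return estimates


def last_gap_predictor(window):
    # Placeholder for the trained model: repeats the last inter-failure gap
    return window[-1] + (window[-1] - window[-2])


# Made-up failure times with k = 7 and m = 5, for illustration only
times = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0]
inputs, targets = make_windows(times, m=5)
future = forecast(times, m=5, steps=2, predict_one=last_gap_predictor)
```

With k = 7 and m = 5 this yields k - m = 2 training pairs, and the two forecast steps correspond to the estimates for $t_{k+1}$ and $t_{k+2}$.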
The mean value function of predictive value is given by:
Probabilistic forecasting distribution function is:
3) Select the kernel function used for prediction and set the initial values of its parameters
4) Determine the values of the kernel function parameters
Kernel parameter selection is in essence an optimization problem, and grid search is used to select the kernel parameters. For example, when predicting with an SVM that uses a Gaussian kernel, two parameters must be determined: the penalty factor C and the kernel parameter g. With the grid method, C ranges over $[C_1, C_2]$ in steps of $C_s$ and g ranges over $[g_1, g_2]$ in steps of $g_t$. The model is trained for every parameter pair (C, g), and the best-performing pair is chosen as the model parameters.
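A grid search of this kind can be sketched as follows. All names are hypothetical, and `toy_error` is a stand-in: in practice it would train the model with the given pair (C, g) and return its error on held-out data.

```python
def frange(start, stop, step):
    """Inclusive arithmetic grid from start to stop with the given step."""
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]


def grid_search(c_grid, g_grid, validation_error):
    """Evaluate every (C, g) pair and return (error, C, g) for the best one."""
    best = None
    for c in c_grid:
        for g in g_grid:
            err = validation_error(c, g)
            if best is None or err < best[0]:
                best = (err, c, g)
    return best


def toy_error(c, g):
    # Stand-in objective with a known minimum at C = 4, g = 0.5
    return (c - 4.0) ** 2 + (g - 0.5) ** 2


best_err, best_c, best_g = grid_search(
    frange(1.0, 8.0, 1.0),     # C in [C1, C2] = [1, 8], step Cs = 1
    frange(0.25, 1.0, 0.25),   # g in [g1, g2] = [0.25, 1], step gt = 0.25
    toy_error,
)
```

The cost grows with the product of the two grid sizes, since each pair requires a full training run.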
5) Relevance vector regression estimation algorithm
Solving a regression problem with the RVM can be described as follows: given a set of vectors $\{x_i\}_{i=1}^{N}$ and the corresponding target values $\{t_i\}_{i=1}^{N}$ as input, find the relation between $x_i$ and $t_i$ so that, for a new vector $x_*$, the corresponding target value $t_*$ can be predicted; $t_i$ may be any real number. The relation between $x$ and $t$ satisfies:
$p(t_i) = N(t_i \mid y(x_i; w), \sigma^2)$
It is reasonable to assume that the $t_i$ are mutually independent random variables.
Given $w$ and $\sigma^2$, the probability distribution of $t$ is
$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{-\frac{1}{2\sigma^2}\lVert t - \Phi w\rVert^2\right\}$
where $\Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]^T$, $\phi(x_n) = [1, k(x_n, x_1), k(x_n, x_2), \ldots, k(x_n, x_N)]^T$,
and $w = [w_0, w_1, \ldots, w_N]^T$. A prior probability distribution is defined over each weight $w_i$:
$p(w \mid \alpha) = \prod_{i=0}^{N} N(w_i \mid 0, \alpha_i^{-1})$
where $\alpha_i$ is the hyperparameter that determines the prior distribution of $w_i$, and $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_N)$.
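The design matrix $\Phi$ defined above can be built as in the sketch below. It assumes the common Gaussian kernel form $\exp(-g\lVert x - y\rVert^2)$, which may differ from the exact kernel expression of the patent; the function names and sample inputs are hypothetical.

```python
import math


def gaussian_kernel(x, y, g=1.0):
    """exp(-g * ||x - y||^2): the common Gaussian (RBF) form, assumed here."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-g * sq_dist)


def design_matrix(xs, centers, g=1.0):
    """Row n is phi(x_n) = [1, k(x_n, c_1), ..., k(x_n, c_N)]."""
    return [[1.0] + [gaussian_kernel(x, c, g) for c in centers] for x in xs]


# Made-up, already normalized input vectors, for illustration only
xs = [[0.1, 0.2], [0.2, 0.4], [0.4, 0.7]]
phi = design_matrix(xs, xs)  # N rows, N + 1 columns (bias column first)
```

Each training input acts as a kernel centre, so $\Phi$ has $N$ rows and $N + 1$ columns, one per weight $w_0, \ldots, w_N$.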
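From the prior distribution of the weights and the likelihood of the sample set, the posterior distribution of the unknowns follows from Bayes' formula:
$p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{p(t)}$
Therefore, given a new vector $x_*$, the predictive distribution of $t_*$ is:
$p(t_* \mid t) = \int p(t_* \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2 \mid t)\, dw\, d\alpha\, d\sigma^2,$
$p(w, \alpha, \sigma^2 \mid t) = p(w \mid t, \alpha, \sigma^2)\, p(\alpha, \sigma^2 \mid t)$
Thus
$p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)}$
In this expression, $p(t \mid w, \sigma^2)$ and $p(w \mid \alpha)$ are both products of Gaussian functions; integrating and simplifying gives:
$p(w \mid t, \alpha, \sigma^2) = N(w \mid \mu, \Sigma), \quad p(t \mid \alpha, \sigma^2) = N(t \mid 0, \Omega)$
where $\mu = \sigma^{-2}\Sigma\Phi^T t$, $\Sigma = (A + \sigma^{-2}\Phi^T\Phi)^{-1}$, $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$, and $\Omega = \sigma^2 I + \Phi A^{-1}\Phi^T$, so that an approximate solution of $p(t_* \mid t)$ can be found:
$p(t_* \mid t) \approx \int p(t_* \mid w, \sigma_{MP}^2)\, p(w \mid t, \alpha_{MP}, \sigma_{MP}^2)\, dw$
Both factors of the integrand are Gaussian functions, so the integral evaluates to:
$p(t_* \mid t) = N(t_* \mid y_*, \sigma_*^2), \quad y_* = \mu^T\phi(x_*), \quad \sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^T\Sigma\,\phi(x_*)$
with $\phi(x_*) = [1, k(x_*, x_1), k(x_*, x_2), \ldots, k(x_*, x_N)]^T$.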
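Finally, the remaining problem is to solve for
$(\alpha_{MP}, \sigma_{MP}^2) = \arg\max_{\alpha,\, \sigma^2}\; p(t \mid \alpha, \sigma^2)$
which is done with the iterative updates
$\alpha_i^{new} = \frac{\gamma_i}{\mu_i^2}, \quad (\sigma^2)^{new} = \frac{\lVert t - \Phi\mu\rVert^2}{N - \sum_i \gamma_i}, \quad \gamma_i = 1 - \alpha_i\Sigma_{ii}$
where $\Sigma_{ii}$ is the $i$-th diagonal element of $\Sigma$. Starting from initial guesses for $\alpha$ and $\sigma^2$, repeated application of these updates approaches $\alpha_{MP}$ and $\sigma_{MP}^2$.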
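The estimation procedure can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the patent's implementation: it uses the standard relevance vector regression re-estimation formulas ($\gamma_i = 1 - \alpha_i\Sigma_{ii}$, $\alpha_i \leftarrow \gamma_i/\mu_i^2$, $\sigma^2 \leftarrow \lVert t - \Phi\mu\rVert^2/(N - \sum_i\gamma_i)$), the common Gaussian kernel form $\exp(-g\lVert x - y\rVert^2)$, and a fixed iteration count. The function names, the numerical floors and caps, and the sample series are all hypothetical choices.

```python
import math


def kernel(x, y, g=1.0):
    # Gaussian kernel in the common form exp(-g * ||x - y||^2) (an assumption)
    return math.exp(-g * sum((a - b) ** 2 for a, b in zip(x, y)))


def phi_row(x, centers, g):
    # phi(x) = [1, k(x, x_1), ..., k(x, x_N)]
    return [1.0] + [kernel(x, c, g) for c in centers]


def invert(a):
    """Gauss-Jordan inverse with partial pivoting; fine for small matrices."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def posterior(ptp, ptt, alpha, sigma2):
    """Sigma = (A + Phi^T Phi / sigma^2)^-1 and mu = Sigma Phi^T t / sigma^2."""
    m = len(alpha)
    prec = [[ptp[i][j] / sigma2 + (alpha[i] if i == j else 0.0) for j in range(m)]
            for i in range(m)]
    cov = invert(prec)
    mu = [sum(cov[i][j] * ptt[j] for j in range(m)) / sigma2 for i in range(m)]
    return mu, cov


def rvm_fit(xs, ts, g=1.0, iters=100):
    n, m = len(xs), len(xs) + 1
    phi = [phi_row(x, xs, g) for x in xs]
    ptp = [[sum(row[i] * row[j] for row in phi) for j in range(m)] for i in range(m)]
    ptt = [sum(row[i] * t for row, t in zip(phi, ts)) for i in range(m)]
    mean_t = sum(ts) / n
    alpha = [1.0] * m  # initial guess for the hyperparameters
    sigma2 = max(0.1 * sum((t - mean_t) ** 2 for t in ts) / n, 1e-6)  # initial guess
    for _ in range(iters):
        mu, cov = posterior(ptp, ptt, alpha, sigma2)
        # gamma_i = 1 - alpha_i * Sigma_ii, clamped for numerical safety
        gamma = [min(1.0, max(1e-12, 1.0 - alpha[i] * cov[i][i])) for i in range(m)]
        # alpha_i <- gamma_i / mu_i^2, kept within [1e-6, 1e9]
        alpha = [min(1e9, max(1e-6, gamma[i] / max(mu[i] ** 2, 1e-12)))
                 for i in range(m)]
        resid = sum((t - sum(p * w for p, w in zip(row, mu))) ** 2
                    for row, t in zip(phi, ts))
        dof = n - sum(gamma)
        if dof > 1e-6:  # sigma^2 <- ||t - Phi mu||^2 / (N - sum gamma), floored
            sigma2 = max(resid / dof, 1e-6)
    mu, cov = posterior(ptp, ptt, alpha, sigma2)
    return mu, cov, sigma2


def rvm_predict(x_new, xs, mu, cov, sigma2, g=1.0):
    """Predictive mean mu^T phi(x*) and variance sigma^2 + phi^T Sigma phi."""
    p = phi_row(x_new, xs, g)
    mean = sum(pi * wi for pi, wi in zip(p, mu))
    var = sigma2 + sum(p[i] * cov[i][j] * p[j]
                       for i in range(len(p)) for j in range(len(p)))
    return mean, var


# Made-up, already normalized failure times with window size 3, illustration only
series = [0.10, 0.13, 0.25, 0.33, 0.44, 0.45, 0.46, 0.55, 0.66, 0.67, 0.81, 0.90]
m_win = 3
xs = [series[i:i + m_win] for i in range(len(series) - m_win)]
ts = [series[i + m_win] for i in range(len(series) - m_win)]
mu, cov, sigma2 = rvm_fit(xs, ts)
mean, var = rvm_predict(series[-m_win:], xs, mu, cov, sigma2)
```

Weights whose $\alpha_i$ grows toward the cap are effectively pruned from the model; the training vectors that keep a moderate $\alpha_i$ are the relevance vectors. A production implementation would normally test for convergence rather than run a fixed number of iterations.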
To compare and analyze the proposed model fairly, experiments were carried out on 10 real failure data sets from different types of software, as shown in Table 2. These data sets describe the failure process of each software system; each data point contains two observed statistics: cumulative execution time and cumulative failure count. In the experiments, the training set covers the complete system failure process from the start of testing. To let the kernel function learn sufficiently, the first third of each data set is used as learning data, and the predictions for the remaining two thirds are compared with the real data.
Table 1 lists the AE values of each model on the ten data sets. Models 1-6 are, respectively, SRGM with Logistic TEF, SRGM with Rayleigh TEF, Delayed S-Shaped Model with Logistic TEF, Delayed S-Shaped Model with Rayleigh TEF, the G-O model, and Yamada Delayed S-Shaped. Model 7 is the method of the invention, where a, b, c, d denote the kernel function used: Gaussian function, linear function, polynomial function, and symmetric triangle function, respectively.
Table 1: AE values of each model's predictions on the 10 data sets
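The evaluation protocol can be sketched as follows. The text does not define AE; the sketch assumes it is the average relative prediction error in percent, which is an assumption rather than the patent's stated formula. The function names and numbers are hypothetical.

```python
def split_first_third(data):
    """Use the first third of the data for learning and the rest for comparison."""
    cut = len(data) // 3
    return data[:cut], data[cut:]


def average_relative_error(predicted, actual):
    """Mean of |predicted - actual| / |actual| in percent (assumed AE definition)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equally long")
    total = sum(abs(p - a) / abs(a) for p, a in zip(predicted, actual))
    return 100.0 * total / len(actual)


# Made-up values, for illustration only
data = [10.0, 20.0, 40.0, 50.0, 80.0, 100.0]
train, test = split_first_third(data)
ae = average_relative_error([44.0, 50.0, 72.0, 100.0], test)
```

A lower value indicates predictions closer to the observed failure data.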
Conclusion: across data sets, prediction performance varies with the kernel function and the regression estimation method used. The software reliability prediction model based on the relevance vector regression estimation algorithm can effectively improve the predictive performance and applicability of the model.
Claims (3)
1. A software failure time prediction method based on relevance vector regression estimation, characterized in that it comprises the following steps:
(1) first, observe and record the software failure time data set, and normalize all input and output data;
(2) through abstraction and assumptions, convert the software failure time prediction problem into a function regression problem;
(3) select the kernel function used for prediction and set the initial values of its parameters;
(4) select the number of failure data used for learning;
(5) use the relevance vector regression estimation algorithm to learn and optimize on different failure time data sets;
(6) finally, use the optimized parameters to predict new failure times;
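wherein, in step (5), learning and optimization with the relevance vector regression estimation algorithm on different failure time data sets comprises the following process:
(5.1) Given a set of vectors $\{x_i\}_{i=1}^{N}$ and the corresponding target values $\{t_i\}_{i=1}^{N}$ as input, assume that the relation between $x$ and $t$ satisfies:
$p(t_i) = N(t_i \mid y(x_i; w), \sigma^2)$
(5.2) The probability distribution of $t$ is then:
$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{-\frac{1}{2\sigma^2}\lVert t - \Phi w\rVert^2\right\}$
where $\Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]^T$, $\phi(x_n) = [1, k(x_n, x_1), k(x_n, x_2), \ldots, k(x_n, x_N)]^T$,
and $w = [w_0, w_1, \ldots, w_N]^T$.
(5.3) Define a prior probability distribution over each weight $w_i$:
$p(w \mid \alpha) = \prod_{i=0}^{N} N(w_i \mid 0, \alpha_i^{-1})$
where $\alpha_i$ is the hyperparameter that determines the prior distribution of $w_i$,
and $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_N)$.
(5.4) Compute the posterior distribution of the unknowns:
$p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{p(t)}$
(5.5) After integration and simplification:
$p(w \mid t, \alpha, \sigma^2) = N(w \mid \mu, \Sigma), \quad p(t \mid \alpha, \sigma^2) = N(t \mid 0, \Omega)$
where $\mu = \sigma^{-2}\Sigma\Phi^T t$, $\Sigma = (A + \sigma^{-2}\Phi^T\Phi)^{-1}$, $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$, and $\Omega = \sigma^2 I + \Phi A^{-1}\Phi^T$.
(5.6) Compute the approximate solution of $p(t_* \mid t)$:
$p(t_* \mid t) \approx \int p(t_* \mid w, \sigma_{MP}^2)\, p(w \mid t, \alpha_{MP}, \sigma_{MP}^2)\, dw$
(5.7) Solve for $\alpha_{MP}$ iteratively with the following formulas:
$\alpha_i^{new} = \frac{\gamma_i}{\mu_i^2}, \quad (\sigma^2)^{new} = \frac{\lVert t - \Phi\mu\rVert^2}{N - \sum_i \gamma_i}, \quad \gamma_i = 1 - \alpha_i\Sigma_{ii}$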
2. The software failure time prediction method based on relevance vector regression estimation as claimed in claim 1, characterized in that, in step (2), the software failure time prediction problem is converted into a function regression problem as follows:
Assume that the software failure times that have occurred are $t_1, t_2, \ldots, t_n$, and let $t_l = f(t_{l-m}, t_{l-m+1}, \ldots, t_{l-1})$, so that $t_l$ follows a fixed but unknown conditional distribution function $F(t_l \mid t_{l-m}, t_{l-m+1}, \ldots, t_{l-1})$. Predicting $t_{k+1}$ given $t_1, t_2, \ldots, t_k$ then becomes: given the $k-m$ observations $(T_1, t_{m+1}), (T_2, t_{m+2}), \ldots, (T_{k-m}, t_k)$ and the $(k-m+1)$-th input $T_{k-m+1}$, estimate the $(k-m+1)$-th output value $\hat{t}_{k+1}$, where $T_i$ denotes the $m$-dimensional vector $[t_i, t_{i+1}, \ldots, t_{i+m-1}]$.
3. The software failure time prediction method based on relevance vector regression estimation as claimed in claim 1, characterized in that the kernel function used in step (3) is the Gaussian kernel function $\kappa(x, y) = e^{-g\langle x-y,\,x-y\rangle^{2}}$, with initial parameter value $g = 1$; and the number of failure data in step (4) is an integer between 5 and 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310013004.9A CN103106139B (en) | 2013-01-14 | 2013-01-14 | Based on the software failure time forecasting methods that relevance vector regression is estimated |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310013004.9A CN103106139B (en) | 2013-01-14 | 2013-01-14 | Based on the software failure time forecasting methods that relevance vector regression is estimated |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103106139A CN103106139A (en) | 2013-05-15 |
CN103106139B true CN103106139B (en) | 2016-06-15 |
Family
ID=48314017
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310013004.9A Active CN103106139B (en) | 2013-01-14 | 2013-01-14 | Based on the software failure time forecasting methods that relevance vector regression is estimated |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103106139B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104111887A (en) * | 2014-07-01 | 2014-10-22 | 江苏科技大学 | Software fault prediction system and method based on Logistic model |
CN105260304B (en) * | 2015-10-19 | 2018-03-23 | 湖州师范学院 | A kind of software reliability prediction method based on QBGSA RVR |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1667587A (en) * | 2005-04-11 | 2005-09-14 | 北京航空航天大学 | Software reliability estimation method based on expanded Markov-Bayesian network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120209575A1 (en) * | 2011-02-11 | 2012-08-16 | Ford Global Technologies, Llc | Method and System for Model Validation for Dynamic Systems Using Bayesian Principal Component Analysis |
-
2013
- 2013-01-14 CN CN201310013004.9A patent/CN103106139B/en active Active
Non-Patent Citations (1)
Title |
---|
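Kernel function methods for software reliability prediction (软件可靠性预测的核函数方法); Lou Jungang (楼俊钢) et al.; Computer Science (《计算机科学》); 2012-04-30; Vol. 39, No. 4; abstract, p. 145 right column line 4 to p. 147 right column line 11 from the bottom, Figs. 1-3 * |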
Also Published As
Publication number | Publication date |
---|---|
CN103106139A (en) | 2013-05-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Kaytez et al. | Forecasting electricity consumption: A comparison of regression analysis, neural networks and least squares support vector machines | |
Corizzo et al. | Anomaly detection and repair for accurate predictions in geo-distributed big data | |
Pagel et al. | Forecasting species ranges by statistical estimation of ecological niches and spatial population dynamics | |
Li et al. | Big data driven vehicle battery management method: A novel cyber-physical system perspective | |
Vanem | Non-stationary extreme value models to account for trends and shifts in the extreme wave climate due to climate change | |
CN104123377B (en) | A kind of microblog topic temperature forecasting system and method | |
Turner et al. | Regime‐shifting streamflow processes: Implications for water supply reservoir operations | |
CN103197983B (en) | Service component reliability online time sequence predicting method based on probability graph model | |
CN104331572A (en) | Wind power plant reliability modeling method considering correlation between air speed and fault of wind turbine generator | |
CN105809264B (en) | Power load prediction method and device | |
Krishna | An integrated approach for weather forecasting based on data mining and forecasting analysis | |
CN104699979B (en) | Urban lake storehouse algal bloom Study on prediction technology of chaotic series based on complex network | |
Nunes et al. | The elimination-selection based algorithm for efficient resource discovery in Internet of Things environments | |
Kajornrit et al. | Estimation of missing precipitation records using modular artificial neural networks | |
CN103093095A (en) | Software failure time forecasting method based on kernel principle component regression algorithm | |
CN103106139B (en) | Based on the software failure time forecasting methods that relevance vector regression is estimated | |
CN114282704A (en) | Charging load prediction method and device for charging station, computer equipment and storage medium | |
Sobolewski et al. | Estimation of wind farms aggregated power output distributions | |
Yang et al. | A new hybrid model based on fruit fly optimization algorithm and wavelet neural network and its application to underwater acoustic signal prediction | |
Koivisto et al. | Statistical modeling of aggregated electricity consumption and distributed wind generation in distribution systems using AMR data | |
Pan et al. | A novel probabilistic modeling framework for wind speed with highlight of extremes under data discrepancy and uncertainty | |
CN103093094A (en) | Software failure time forecasting method based on kernel partial least squares regression algorithm | |
Lee et al. | A big data management system for energy consumption prediction models | |
CN104933052A (en) | Data true value estimation method and data true value estimation device | |
CN116523001A (en) | Method, device and computer equipment for constructing weak line identification model of power grid |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |