CN103093094A - Software failure time forecasting method based on kernel partial least squares regression algorithm - Google Patents
- Publication number
- CN103093094A
- Authority
- CN
- China
- Prior art keywords
- software
- kernel
- failure time
- regression algorithm
- software failure
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
Description
【Technical Field】
The invention relates to software reliability testing and evaluation, and in particular to a method for predicting software failure time data, either for the next failure or over a longer future horizon.
【Background Art】
Software reliability is the probability that software does not fail within a specified time under specified conditions. Stochastic-process reliability models are the most studied and most widely applied class of software reliability growth models. However, the statistical components of real reliability problems cannot be described by classical statistical distribution functions alone, and stochastic-process models require many a priori assumptions about the properties of software faults and about the software failure process. As a result, the prediction accuracy of each model varies greatly between projects; that is, the models have poor applicability.
Methods based on kernel function theory specifically target prediction and classification problems with small samples. They have produced very good results in many fields similar to reliability prediction and are suited to a complex problem such as software reliability prediction. With the help of computer technology, such models have adaptive and learning capabilities and perform well in both model applicability and evaluation and prediction capability. The good behaviour that kernel-based software reliability models show with limited samples can largely resolve problems such as the over-learning of neural networks, which makes them an important avenue in current research on software reliability models.
【Summary of the Invention】
The technical problem to be solved by the invention is to provide a software failure time prediction method based on the kernel partial least squares regression algorithm that achieves adaptive prediction of software reliability and effectively improves the adaptability of software failure prediction models. To this end, the invention adopts the following technical solution, which comprises the following steps:
(1) observe and record the sequential software failure data set, and normalize all input and output data;
(2) through reasonable abstraction and assumptions, transform the software failure time prediction problem into a function regression problem;
(3) select the kernel function used for prediction and give initial values for its parameters;
(4) select the number of failure data used for learning;
(5) use the kernel partial least squares regression algorithm to learn and optimize on different failure data sets;
(6) finally, use the optimized parameters to predict new failure times.
Further, step (2) transforms the software failure time prediction problem into a function regression problem as follows:
Assume the software failure times observed so far are $t_1, t_2, \ldots, t_n$. Let $t_l = f(t_{l-m}, t_{l-m+1}, \ldots, t_{l-1})$; then $t_l$ follows a fixed but unknown conditional distribution function $F(t_l \mid t_{l-m}, t_{l-m+1}, \ldots, t_{l-1})$. With $t_1, t_2, \ldots, t_k$ known, predicting $t_{k+1}$ becomes: given the $k-m$ observations $(T_1, t_{m+1}), (T_2, t_{m+2}), \ldots, (T_{k-m}, t_k)$ and the $(k-m+1)$-th input $T_{k-m+1}$, estimate the $(k-m+1)$-th output value $\hat{t}_{k+1}$, where $T_i$ denotes the $m$-dimensional vector $[t_i, t_{i+1}, \ldots, t_{i+m-1}]$.
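The regression formulation above can be illustrated with a short Python sketch. It assumes that each input $T_i$ holds the $m$ failure times immediately preceding its target; the function name `make_windows` and the sample failure times are hypothetical and serve only as an illustration.

```python
import numpy as np

def make_windows(times, m):
    """Build the pairs (T_i, t_{m+i}) from observed failure times t_1..t_k.

    Row i of X holds m consecutive failure times; y[i] is the failure
    time that immediately follows them. next_input is T_{k-m+1}, the
    window used to predict the not-yet-observed t_{k+1}.
    """
    t = np.asarray(times, dtype=float)
    k = len(t)
    if not 0 < m < k:
        raise ValueError("m must satisfy 0 < m < len(times)")
    X = np.array([t[i:i + m] for i in range(k - m)])  # k-m input vectors
    y = t[m:]                                         # k-m targets
    next_input = t[k - m:]                            # last m observations
    return X, y, next_input

# Hypothetical failure times, for illustration only.
X, y, next_input = make_windows([3, 8, 15, 24, 35, 48], m=3)
```

With six observations and $m = 3$, this yields three training pairs plus one input window for the next prediction.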
The kernel function used in step (3) is the Gaussian kernel function, and the initial value of its parameter is $g = 1$.
The number of failure data in step (4) is an integer between 5 and 8.
Step (5), learning and optimization on different failure data sets with the kernel partial least squares regression algorithm, comprises the following process:
Step 1: the input data are the $k$-dimensional vectors $X = \{x_1, x_2, \ldots, x_l\}$, and the outputs are the vectors $y_s$, $s = 1, 2, \ldots, m$.
Step 2: construct the kernel function matrix $K_{ij} = k(x_i, x_j)$, $i, j = 1, 2, \ldots, l$, where […].
Step 3: let $K_1 = K$, […] the first row of […], $u_j = u_j / \lVert u_j \rVert$.
Step 4: repeatedly compute […], $u_j = u_j / \lVert u_j \rVert$, until convergence.
Step 5: compute $\tau_j = K_j u_j$, […],
$$K_{j+1} = \left(I - \frac{\tau_j \tau_j'}{\lVert \tau_j \rVert^2}\right) K_j \left(I - \frac{\tau_j \tau_j'}{\lVert \tau_j \rVert^2}\right)$$
Step 6: compute $B = [\beta_1, \ldots, \beta_k]$ and $T = [\tau_1, \ldots, \tau_k]$, and obtain the coefficients $\alpha = B (T' K B)^{-1} T' Y$.
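The following NumPy sketch shows one way the six steps could be realized. Several formulas are missing from the text, so the sketch fills them with the standard dual (kernel) partial least squares iteration, which is an assumption rather than the patent's confirmed procedure: the direction vector is initialized from the first column of the deflated output matrix, updated as $u \leftarrow \hat{Y}\hat{Y}'K_j u$, and the outputs are deflated alongside the kernel matrix. The Gaussian kernel form $\exp(-g\lVert x - z\rVert^2)$ and the names `gaussian_kernel` and `kpls_fit` are likewise assumptions. No centering is applied.

```python
import numpy as np

def gaussian_kernel(A, B, g=1.0):
    """Gaussian kernel matrix; the form exp(-g * ||x - z||^2) is assumed."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-g * d2)

def kpls_fit(K, Y, n_components, tol=1e-10, max_iter=500):
    """Dual PLS: returns alpha such that predictions are K_new @ alpha."""
    n = K.shape[0]
    Y = np.asarray(Y, dtype=float).reshape(n, -1)
    Kj, Yj = K.copy(), Y.copy()
    B, T = [], []
    for _ in range(n_components):
        u = Yj[:, 0].copy()                      # assumed initialization
        norm = np.linalg.norm(u)
        if norm == 0.0:                          # outputs fully explained
            break
        u /= norm
        for _ in range(max_iter):                # power iteration (step 4)
            u_new = Yj @ (Yj.T @ (Kj @ u))
            u_new /= np.linalg.norm(u_new)
            converged = np.linalg.norm(u_new - u) < tol
            u = u_new
            if converged:
                break
        tau = Kj @ u                             # step 5: score vector
        tt = tau @ tau
        Yj = Yj - np.outer(tau, Yj.T @ tau) / tt # deflate outputs (assumed)
        P = np.eye(n) - np.outer(tau, tau) / tt
        Kj = P @ Kj @ P                          # deflate kernel matrix
        B.append(u)
        T.append(tau)
    if not B:
        raise ValueError("no component could be extracted")
    B, T = np.column_stack(B), np.column_stack(T)
    # Step 6: alpha = B (T' K B)^{-1} T' Y
    return B @ np.linalg.solve(T.T @ K @ B, T.T @ Y)

# Illustrative data only.
x = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = np.sin(2 * np.pi * x[:, 0])
K = gaussian_kernel(x, x, g=10.0)
alpha = kpls_fit(K, y, n_components=3)
x_new = np.array([[0.25], [0.75]])
y_new = gaussian_kernel(x_new, x, g=10.0) @ alpha   # predictions at new inputs
```

The number of components plays the role of a regularizer: fewer components give a smoother fit, while more components follow the training data more closely.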
The invention takes full account of the small-sample nature of software failure data. Using kernel function theory as its main tool and combining it with the dynamic behaviour exhibited by the software failure process, it transforms the software reliability prediction problem into a regression estimation problem and applies the kernel partial least squares regression algorithm to solve it.
The invention uses the covariance information between the input and output variables to extract latent features of the data. It can cope with cases in which there are more observed variables than observed samples, as well as with multicollinearity among the variables, so the model "overfitting" produced by modelling methods such as neural networks does not arise. In the new prediction method, as software failures continue to occur, the model parameters are adjusted automatically to follow the dynamic changes of the failure process, which achieves adaptive prediction of software reliability and effectively improves the adaptability of the software failure prediction model.
【Brief Description of the Drawings】
Fig. 1 is a flow chart of the software failure time prediction method of the invention.
【Detailed Description】
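1) Data normalization
When a regression estimation algorithm is used for learning and prediction, all input and output data must first be normalized to the interval [0.1, 0.9]. The conversion is
$$y = 0.1 + \frac{0.8\,(x - x_{\min})}{\Delta},$$
where $y$ is the normalized value, $x$ is the actual value, $x_{\max}$ is the maximum value in the data set, $x_{\min}$ is the minimum value, and $\Delta = x_{\max} - x_{\min}$. After prediction, the following mapping returns the data to actual values:
$$x = x_{\min} + \frac{(y - 0.1)\,\Delta}{0.8}$$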
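As a minimal sketch of this normalization, the Python functions below assume the linear min-max map that sends $x_{\min}$ to 0.1 and $x_{\max}$ to 0.9, together with its inverse. The function names and sample values are hypothetical.

```python
import numpy as np

def fit_scaler(x):
    """Return (x_min, delta) with delta = x_max - x_min."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        raise ValueError("data set must contain at least two distinct values")
    return x_min, x_max - x_min

def normalize(x, x_min, delta):
    """Map actual values linearly so that x_min -> 0.1 and x_max -> 0.9."""
    return 0.1 + 0.8 * (np.asarray(x, dtype=float) - x_min) / delta

def denormalize(y, x_min, delta):
    """Inverse map: normalized values back to actual values."""
    return x_min + (np.asarray(y, dtype=float) - 0.1) * delta / 0.8

# Illustrative values only.
x_min, delta = fit_scaler([10.0, 20.0, 50.0])
scaled = normalize([10.0, 20.0, 50.0], x_min, delta)
restored = denormalize(scaled, x_min, delta)
```

Keeping the scaler parameters from the training data and reusing them for the inverse map means predictions can fall slightly outside [0.1, 0.9] and still be mapped back consistently.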
2) Problem transformation
In the software reliability prediction model based on kernel function theory, the relationship between a software failure time and the $m$ failure times preceding it is modelled. The single-step prediction problem then becomes: given the $k-m$ observations $(T_1, t_{m+1}), (T_2, t_{m+2}), \ldots, (T_{k-m}, t_k)$ and the $(k-m+1)$-th input $T_{k-m+1}$, estimate the $(k-m+1)$-th output value $\hat{t}_{k+1}$, where $T_i$ denotes the $m$-dimensional vector $[t_i, t_{i+1}, \ldots, t_{i+m-1}]$. Similarly, by feeding a predicted value back as input, the following failure time can be predicted, and later failure times can be obtained in the same way.
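The step from single-step to longer-horizon prediction can be sketched as a recursive loop in which each predicted value is appended to the input window. The sketch below is illustrative only: `forecast` is a hypothetical helper, and `constant_gap` is a trivial stand-in predictor (last value plus last gap), not the kernel model of the invention.

```python
from collections import deque

def forecast(history, m, predict_one, steps):
    """Predict `steps` future values by feeding predictions back as input."""
    window = deque(history[-m:], maxlen=m)   # the last m observed values
    predictions = []
    for _ in range(steps):
        nxt = predict_one(list(window))
        predictions.append(nxt)
        window.append(nxt)                   # oldest value drops out
    return predictions

def constant_gap(window):
    # Hypothetical stand-in for a trained single-step predictor.
    return window[-1] + (window[-1] - window[-2])

print(forecast([3, 8, 15, 24], m=3, predict_one=constant_gap, steps=3))
# prints [33, 42, 51]
```

Because later predictions depend on earlier ones, errors can accumulate over the horizon, which is one reason to retrain as new failures are observed.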
3) Selected kernel function and initial parameter values
4) Determining the kernel function parameter values
Selecting kernel function parameters is in essence an optimization problem, and grid search is used to select them. For example, when predicting with an SVM with a Gaussian kernel, two parameters must be determined: the penalty factor $C$ and the kernel parameter $g$. In the grid method, $C$ ranges over $[C_1, C_2]$ with step $C_s$ and $g$ ranges over $[g_1, g_2]$ with step $g_t$; the model is trained for each pair $(C, g)$, and the pair that performs best is chosen as the model parameters.
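A grid search of this kind can be sketched as follows for the single kernel parameter $g$. To keep the example short and self-contained, a small kernel ridge regressor stands in for the model being tuned; it is not the kernel partial least squares algorithm. The Gaussian kernel form, the validation split, the `ridge` value, and all function names are assumptions for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, g):
    """Gaussian kernel matrix; the form exp(-g * ||x - z||^2) is assumed."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-g * d2)

def fit_predict(X_tr, y_tr, X_va, g, ridge=1e-3):
    """Stand-in kernel regressor (kernel ridge), used only for illustration."""
    K = gaussian_kernel(X_tr, X_tr, g)
    alpha = np.linalg.solve(K + ridge * np.eye(len(K)), y_tr)
    return gaussian_kernel(X_va, X_tr, g) @ alpha

def grid_search_g(X_tr, y_tr, X_va, y_va, g1, g2, gt):
    """Try every g in [g1, g2] with step gt; keep the lowest validation MSE."""
    best_g, best_err = None, np.inf
    for g in np.arange(g1, g2 + gt / 2, gt):
        err = np.mean((fit_predict(X_tr, y_tr, X_va, g) - y_va) ** 2)
        if err < best_err:
            best_g, best_err = float(g), float(err)
    return best_g, best_err

# Illustrative data only: alternate points form training and validation sets.
X = np.linspace(0.1, 0.9, 24).reshape(-1, 1)
y = 0.5 + 0.3 * np.sin(4 * X[:, 0])
best_g, best_err = grid_search_g(X[::2], y[::2], X[1::2], y[1::2], 0.5, 4.0, 0.5)
```

A second loop over another parameter, such as $C$ for an SVM, extends the same pattern to a two-dimensional grid.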
5) Kernel partial least squares regression algorithm
Solving a kernel regression problem can be described as follows: given a set of vectors $x_i$ and their corresponding target values $t_i$ as input, find the relationship between $x_i$ and $t_i$ so that, for a new vector $x^*$, the corresponding target value $t^*$ can be predicted; $t_i$ may be any real number. The relationship between $x$ and $t$ is assumed to follow a function built from the kernel evaluations $k(x, x_i)$ and the weights $w_i$, where $k(x, x_i)$ is the kernel function. The purpose of the kernel regression estimation algorithm is to find suitable $w_i$. The algorithm is as follows:
Step 1: the input data are the $k$-dimensional vectors $X = \{x_1, x_2, \ldots, x_l\}$, and the outputs are the vectors $y_s$, $s = 1, 2, \ldots, m$.
Step 2: construct the kernel function matrix $K_{ij} = k(x_i, x_j)$, $i, j = 1, 2, \ldots, l$, where […].
Step 3: let $K_1 = K$, […] the first row of […], $u_j = u_j / \lVert u_j \rVert$.
Step 4: repeatedly compute […], $u_j = u_j / \lVert u_j \rVert$, until convergence.
Step 5: compute $\tau_j = K_j u_j$, […],
$$K_{j+1} = \left(I - \frac{\tau_j \tau_j'}{\lVert \tau_j \rVert^2}\right) K_j \left(I - \frac{\tau_j \tau_j'}{\lVert \tau_j \rVert^2}\right)$$
To provide a reasonable comparison and analysis of the model, the proposed model was evaluated experimentally on 10 real failure data sets from different types of software, as shown in Table 1. These data sets describe the failure process of each software system; each data point contains two observed statistics: cumulative execution time and cumulative number of failures. In the experiments, the training set covers the complete system failure process from the start of testing. So that the kernel function can learn sufficiently, the first third of each data set is used as learning data, and predictions for the remaining two thirds are compared with the real data.
The table lists the AE values of each model on the ten data sets. Models 1-6 are, respectively, SRGM with Logistic TEF, SRGM with Rayleigh TEF, Delayed S-Shaped Model with Logistic TEF, Delayed S-Shaped Model with Rayleigh TEF, the G-O model, and the Yamada Delayed S-Shaped model. Model 7 is the method of the invention, where a, b, c, and d denote the kernel function used: Gaussian Function, Linear Function, Polynomial Function, and Symmetric Triangle Function, respectively.
Table 1: AE values predicted by each model on the 10 data sets
Conclusion: across data sets, prediction performance varies with the kernel function and with the regression estimation method used. A software reliability prediction model based on the kernel partial least squares regression algorithm can effectively improve the prediction performance and applicability of the model.
The above embodiment illustrates the invention and does not limit it; any solution obtained by a simple transformation of the invention falls within its scope of protection.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013100130053A CN103093094A (en) | 2013-01-14 | 2013-01-14 | Software failure time forecasting method based on kernel partial least squares regression algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013100130053A CN103093094A (en) | 2013-01-14 | 2013-01-14 | Software failure time forecasting method based on kernel partial least squares regression algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103093094A (en) | 2013-05-08 |
Family
ID=48205653
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2013100130053A Pending CN103093094A (en) | 2013-01-14 | 2013-01-14 | Software failure time forecasting method based on kernel partial least squares regression algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103093094A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105260304A (en) * | 2015-10-19 | 2016-01-20 | 湖州师范学院 | Software reliability prediction method based on QBGSA RVR (Quantum-inspired Binary Gravitational Search Algorithm-Relevance Vector Machine) |
CN107947984A (en) * | 2017-11-24 | 2018-04-20 | 浙江网新电气技术有限公司 | A kind of failure predication processing method and its system towards railway transport of passengers service |
CN108267951A (en) * | 2016-12-30 | 2018-07-10 | 南京理工大学 | A kind of maximum power point-tracing control method based on core offset minimum binary |
Non-Patent Citations (2)
Title |
---|
楼俊钢 等: "《软件可靠性预测的核函数方法》", 《计算机科学》 * |
蒋红卫 等: "《核偏最小二乘回归及其在医学中的应用》", 《中国卫生统计》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105260304A (en) * | 2015-10-19 | 2016-01-20 | 湖州师范学院 | Software reliability prediction method based on QBGSA RVR (Quantum-inspired Binary Gravitational Search Algorithm-Relevance Vector Machine) |
CN105260304B (en) * | 2015-10-19 | 2018-03-23 | 湖州师范学院 | A kind of software reliability prediction method based on QBGSA RVR |
CN108267951A (en) * | 2016-12-30 | 2018-07-10 | 南京理工大学 | A kind of maximum power point-tracing control method based on core offset minimum binary |
CN107947984A (en) * | 2017-11-24 | 2018-04-20 | 浙江网新电气技术有限公司 | A kind of failure predication processing method and its system towards railway transport of passengers service |
CN107947984B (en) * | 2017-11-24 | 2021-08-03 | 浙江网新电气技术有限公司 | Fault prediction processing method and system for railway passenger service |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20130508 |