CN104809066B - A method for predicting open source software maintenance workload through code quality assessment - Google Patents
- Publication number
- CN104809066B (application number CN201510218321.3A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The present invention relates to a method for predicting the maintenance workload of open source software through code quality assessment. All indices that characterize the code quality of the open source software are first obtained; the variance inflation factor is then introduced to remove highly correlated indices, yielding an available index set. Linear regression analysis is performed on the available indices in that set to obtain the functional relation between the maintenance workload of the open source software and the available indices, and this relation can then be used to predict the maintenance workload. The indices that characterize code quality in this method are easy to obtain, the prediction results are accurate, and the method is suitable for large-scale adoption.
Description
Technical field
The present invention relates to open source software maintenance, and in particular to a method for predicting the maintenance workload of open source software through code quality assessment.
Background art
Software maintenance is the final stage of the software life cycle and also the longest. It accounts for 60% to 70% of the cost of the whole software life cycle. Open source software makes its source code available to the public, and the public may use, modify and distribute that source code without being restricted by licensing.
Because software maintenance occupies a very important position in modern software engineering, many researchers have studied this field and created a series of prediction models. According to the indices used for prediction, software maintenance workload prediction models can be divided into static prediction models and dynamic prediction models. The main static prediction models are the COCOMO model, the Walston-Felix model, and the model of Matson, Barnett and Mellichamp. Models of this class treat LOC (lines of code) and FP (function points) as the key indices for estimating software size and deriving maintenance workload, and use them as the basis for workload estimation. For the same KLOC or FP value, different models give different estimates. The main reason is that most of these models were derived only from the empirical data of a limited number of projects in a few application domains, so their applicability is restricted. An estimation model must therefore be chosen to suit the characteristics of the current project and adjusted as needed (for example, by modifying the model constants). Dynamic multivariable models, by contrast, are derived from productivity data collected from more than 4000 contemporary software projects. Such models treat workload as a function of two variables, software size and development time.
1. Among static models, the best known is the ACT model that Boehm proposed for the maintenance of conventional software on the basis of the COCOMO model. The ACT model was the first to propose that software maintenance workload be measured in person-months (MM), and gave the formula:
$$E = ACT \times 2.4 \times KLOC^{1.05} \quad (1);$$
In formula (1), E denotes the workload and ACT denotes the proportion of the system's source code that is changed during one year of maintenance.
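Formula (1) can be evaluated directly. The following is a minimal Python sketch, not part of the patent; the function name and the sample values (a 32 KLOC system with 15% of its code changed per year) are hypothetical and chosen only for illustration.

```python
def act_maintenance_effort(act: float, kloc: float) -> float:
    """Annual maintenance effort in person-months per formula (1).

    act  -- fraction of the source code changed during one year (0.15 = 15%)
    kloc -- size of the system in thousands of lines of code
    """
    if act < 0 or kloc < 0:
        raise ValueError("act and kloc must be non-negative")
    return act * 2.4 * kloc ** 1.05


# Hypothetical example values, not taken from the patent.
effort = act_maintenance_effort(act=0.15, kloc=32.0)
print(f"estimated maintenance effort: {effort:.1f} person-months")
```

Because the only inputs are the change ratio and the code size, the sketch also makes the criticism that follows concrete: nothing about the quality of the code enters the estimate.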
The problem with the ACT model, however, is that it measures software maintenance work only by the number of changed lines of code, and code size is itself a very inaccurate measure of software size. Its greatest weakness is that the model does not take into account any scientific attribute relevant to software maintenance.
Boehm B. W. et al. later developed the COCOMO 2.0 model on the basis of the COCOMO model. The model is expressed as formula (2), in which A denotes thousands of lines of code, function points or object points. SF denotes five scale factors: precedentedness, development flexibility, risk resolution, team cohesion and process maturity. EM denotes 17 effort multipliers. The model also defines the size of a new software project. However, this model applies only to planned software projects (such as projects developed by a software company with well-established processes), and evaluating the maintenance workload of a software project with it requires extensive collection and statistics of the various indices.
2. A typical dynamic multivariable estimation model has the following form:
$$E = \left(LOC \times B^{0.333} / P\right)^{3} \times \left(1 / t^{4}\right) \quad (3);$$
E is the workload in person-months or person-years; t is the project duration in months or years; B is a special skills factor that increases slowly as the need for testing, quality assurance, documentation and management skills grows: for smaller programs (KLOC = 5 to 15), B = 0.16; for programs larger than 70 KLOC, B = 0.39. P is the productivity parameter, which reflects the influence of the following factors on workload: overall process maturity and management practice; the extent to which good software engineering practices are used; the level of the programming language used; the state of the software environment; the skills and experience of the software team; and the complexity of the application.
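The behaviour of formula (3) is easiest to see numerically. The sketch below is an illustration only; the function name and all sample values (including the productivity parameter) are hypothetical. It shows the strong dependence on project duration: because t enters as the fourth power, halving the schedule multiplies the estimated effort by 16.

```python
def dynamic_effort(loc: float, b: float, p: float, t: float) -> float:
    """Effort per formula (3): E = (LOC * B**0.333 / P)**3 * (1 / t**4).

    loc -- size in lines of code
    b   -- special skills factor (0.16 for 5-15 KLOC, 0.39 above 70 KLOC)
    p   -- productivity parameter
    t   -- project duration (months or years, matching the unit of E)
    """
    if p <= 0 or t <= 0:
        raise ValueError("p and t must be positive")
    return (loc * b ** 0.333 / p) ** 3 / t ** 4


# Hypothetical values: a 10 KLOC program, B = 0.16, assumed P = 5000.
for years in (1.0, 2.0):
    print(years, dynamic_effort(loc=10_000, b=0.16, p=5_000, t=years))
```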
In later research on dynamic models, Sajid Anwar, Muhammad Ramzan, Abdul Rauf et al. started from software architecture, evaluated the influence of architecture on software project quality, and applied this influence to the analysis and estimation of software maintenance workload. They analysed the whole software life cycle extensively and ultimately assessed maintenance workload from the perspective of process management. These assessment methods are very effective, but they are not applicable to open source software: open source projects have no strong management constraints, so the related management data cannot be obtained. Kun Chang Lee and Nam Yong Jo used Bayesian network analysis to analyse the factors that influence software maintenance workload and obtained good results. They gave the experimental basis for some currently accepted indices of software maintenance workload and analysed in detail the correlations among them. In the end, however, these methods still cannot predict workload accurately.
Static model methods reflect researchers' efforts to derive software maintenance workload from the most basic components of software, such as source code and documentation. However, static methods are not suitable for software projects under maintenance: during maintenance all basic components of the software change, and current static methods cannot fully capture the effect of these changes on maintenance workload. Dynamic model methods reflect the idea of starting from the maintenance staff and, in the spirit of software engineering, estimating and predicting maintenance workload from process management data. But the data required by dynamic models are difficult to collect, and for open source software management data cannot be collected at all.
Summary of the invention
In view of the above problems in the prior art, the object of the present invention is to provide a method that combines dynamic and static models and predicts the maintenance workload of open source software through code quality assessment.
To achieve the above object, the present invention adopts the following technical solution: a method for predicting the maintenance workload of open source software through code quality assessment, comprising the following steps:
S1: Obtain all indices that characterize the code quality of the open source software. Let all the indices form an index set containing Q elements, where $q_j$ is an element of the index set and denotes the j-th index in the set;
S2: Remove the highly correlated indices from the index set, as follows:
1) Set j = 1;
2) Calculate, according to formula (4), the variance inflation factor $VIF_j$ of index $q_j$ with respect to the other Q-1 indices in the index set:
$$VIF_j = \frac{1}{TOL_j} = \frac{1}{1 - R_j^2} \quad (4);$$
$$R_j = \overline{q_j} - \frac{\sum_{j=1}^{Q} \overline{q_j} - \overline{q_j}}{Q - 1} \quad (5);$$
where $R_j$ denotes the correlation coefficient between the j-th index and the remaining indices in the index set, $TOL_j$ is the tolerance of the j-th index, and $\overline{q_j}$ denotes the value of index $q_j$;
3) Compare the variance inflation factor $VIF_j$ with 10: if $VIF_j$ > 10, index $q_j$ is an index to be removed; otherwise it is an available index;
4) Set j = j + 1;
5) If j ≤ Q, return to step 2); otherwise put all indices to be removed into the to-be-removed index set, put all available indices into the available index set, and perform the next step;
6) The to-be-removed index set has B elements, where $b_f$ is an element of the to-be-removed index set and denotes the f-th index to be removed in that set;
Determine whether the number of elements in the to-be-removed index set is 0; if so, perform step 16); otherwise perform the next step;
7) Set f = 1;
8) Calculate, according to formula (6), the variance inflation factor $VIF_f$ of the index to be removed $b_f$ with respect to the other B-1 indices in the to-be-removed index set:
$$VIF_f = \frac{1}{TOL_f} = \frac{1}{1 - R_f^2} \quad (6);$$
$$R_f = \overline{b_f} - \frac{\sum_{f=1}^{B} \overline{b_f} - \overline{b_f}}{B - 1} \quad (7);$$
where $R_f$ denotes the correlation coefficient between the f-th index to be removed and the remaining indices in the to-be-removed index set, $TOL_f$ is the tolerance of the f-th index to be removed, and $\overline{b_f}$ denotes the value of the index to be removed $b_f$;
9) Compare the variance inflation factor $VIF_f$ with 10: if $VIF_f$ > 10, perform step 11); otherwise mark the index to be removed $b_f$ as to be moved and perform the next step;
10) Set f = f + 1;
11) If f ≤ Q, return to step 8); otherwise perform the next step;
12) Determine whether the number of indices marked as to be moved is 0; if so, perform step 15); otherwise perform the next step;
13) Move all indices marked as to be moved into the available index set, update the elements of the to-be-removed index set and of the available index set, update the number of elements in the to-be-removed index set, and perform the next step;
14) Determine whether the number of elements in the current to-be-removed index set is 0; if so, perform step 16); otherwise return to step 7);
15) Remove all indices in the current to-be-removed index set, and perform the next step;
16) Output the current available index set;
S3: Let the available index set have P elements, where $p_i$ is an element of the available index set and denotes the i-th available index in that set. Perform regression analysis on the P available indices in the available index set to obtain the regression estimation formula relating the maintenance workload of the open source software to the available indices:
1) Obtain the maintenance workload $y_t$ of the open source software at each time point t within a period T;
2) Plot the maintenance workload $y_t$ as the ordinate against time point t as the abscissa to obtain a broken line;
3) Express the broken line in the form of formula (8):
$$y = A_1 p_1 + A_2 p_2 + A_3 p_3 + \dots + A_k p_i + E \quad (8);$$
4) Using linear regression on the time points t within period T and the corresponding maintenance workload $y_t$ of the open source software, obtain the coefficients $A_1, A_2, A_3, \dots, A_k$ and E in formula (8);
S4: With formula (8) obtained, the maintenance workload y of the open source software can be predicted once the available indices $p_i$ are known.
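The two-stage screening of step S2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The VIF is computed in the conventional statistical way, as the j-th diagonal entry of the inverse Pearson correlation matrix, which equals 1/(1 - R_j^2) when R_j^2 comes from regressing index j on the other indices; this replaces the correlation coefficient of formulas (5) and (7). The second loop runs over the B indices to be removed. All function names, the dictionary layout and the sample values are hypothetical.

```python
from statistics import mean


def correlation_matrix(columns):
    """Pearson correlation matrix of equal-length, non-constant columns."""
    centered = []
    for col in columns:
        m = mean(col)
        centered.append([v - m for v in col])
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    size = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(size)] for i in range(size)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfectly collinear indices: VIF is infinite")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF of each column with respect to all the other columns."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


def screen_indices(data, threshold=10.0):
    """Return the names of the available indices after the S2 screening."""
    names = list(data)
    values = vif([data[n] for n in names])
    available = [n for n, v in zip(names, values) if v <= threshold]  # steps 1-5
    pending = [n for n, v in zip(names, values) if v > threshold]
    while pending:                                                     # steps 6-14
        values = vif([data[n] for n in pending])
        moved = [n for n, v in zip(pending, values) if v <= threshold]
        if not moved:
            break                                                      # step 15
        available += moved
        pending = [n for n in pending if n not in moved]
    return available                                                   # step 16


# Hypothetical measurements at six time points.
metrics = {
    "loc": [1, 2, 3, 4, 5, 6],
    "methods": [1, 2, 3, 4, 5, 6.1],      # almost a copy of "loc"
    "complexity": [1, -1, 1, -1, 1, -1],
}
print(screen_indices(metrics))
```

In the sample data the two nearly identical columns inflate each other's VIF far above the threshold of 10, so only the third index survives the screening.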
Compared with the prior art, the present invention has the following advantages: the method introduces indices that characterize code quality, screens these indices, and then establishes the functional relation between the indices and software maintenance workload, so that the maintenance workload of open source software can be predicted accurately using this relation. The indices that characterize code quality in this method are easy to obtain, the prediction results are accurate, and the method is suitable for large-scale adoption.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Fig. 2 shows the results for Lucene in the embodiment.
Fig. 3 shows the results for openwebbeans in the embodiment.
Fig. 4 shows the results for Shindig in the embodiment.
Detailed description
The present invention is described in further detail below.
Referring to Fig. 1, a method for predicting the maintenance workload of open source software through code quality assessment comprises the following steps:
S1: Obtain all indices that characterize the code quality of the open source software. Let all the indices form an index set containing Q elements, where $q_j$ is an element of the index set and denotes the j-th index in the set;
S2: Remove the highly correlated indices from the index set, as follows; the variance inflation factor is used to measure the multicollinearity (that is, the degree of correlation) among indices;
1) Set j = 1;
2) Calculate, according to formula (4), the variance inflation factor $VIF_j$ of index $q_j$ with respect to the other Q-1 indices in the index set:
$$VIF_j = \frac{1}{TOL_j} = \frac{1}{1 - R_j^2} \quad (4);$$
$$R_j = \overline{q_j} - \frac{\sum_{j=1}^{Q} \overline{q_j} - \overline{q_j}}{Q - 1} \quad (5);$$
where $R_j$ denotes the correlation coefficient between the j-th index and the remaining indices in the index set, $TOL_j$ is the tolerance of the j-th index, and $\overline{q_j}$ denotes the value of index $q_j$;
3) Compare the variance inflation factor $VIF_j$ with 10: if $VIF_j$ > 10 (indicating that the multicollinearity between the j-th index and the remaining indices is excessive), index $q_j$ is an index to be removed; otherwise it is an available index;
4) Set j = j + 1;
5) If j ≤ Q, return to step 2); otherwise put all indices to be removed into the to-be-removed index set, put all available indices into the available index set, and perform the next step;
6) The to-be-removed index set has B elements, where $b_f$ is an element of the to-be-removed index set and denotes the f-th index to be removed in that set;
Determine whether the number of elements in the to-be-removed index set is 0; if so, perform step 16); otherwise perform the next step;
7) Set f = 1;
8) Calculate, according to formula (6), the variance inflation factor $VIF_f$ of the index to be removed $b_f$ with respect to the other B-1 indices in the to-be-removed index set:
$$VIF_f = \frac{1}{TOL_f} = \frac{1}{1 - R_f^2} \quad (6);$$
$$R_f = \overline{b_f} - \frac{\sum_{f=1}^{B} \overline{b_f} - \overline{b_f}}{B - 1} \quad (7);$$
where $R_f$ denotes the correlation coefficient between the f-th index to be removed and the remaining indices in the to-be-removed index set, $TOL_f$ is the tolerance of the f-th index to be removed, and $\overline{b_f}$ denotes the value of the index to be removed $b_f$;
9) Compare the variance inflation factor $VIF_f$ with 10:
if $VIF_f$ > 10, perform step 11);
otherwise mark the index to be removed $b_f$ as to be moved and perform the next step;
10) Set f = f + 1;
11) If f ≤ Q, return to step 8); otherwise perform the next step;
12) Determine whether the number of indices marked as to be moved is 0; if so, perform step 15); otherwise perform the next step;
13) Move all indices marked as to be moved into the available index set, update the elements of the to-be-removed index set and of the available index set, update the number of elements in the to-be-removed index set, and perform the next step;
14) Determine whether the number of elements in the current to-be-removed index set is 0; if so, perform step 16); otherwise return to step 7);
15) Remove all indices in the current to-be-removed index set, and perform the next step;
16) Output the current available index set;
Because multicollinearity among indices seriously affects the accuracy of regression analysis, the precondition of regression analysis holds only after the variance inflation factor check has been carried out and the indices with excessive multicollinearity have been removed.
S3: Let the available index set have P elements, where $p_i$ is an element of the available index set and denotes the i-th available index in that set. Perform regression analysis on the P available indices in the available index set to obtain the regression estimation formula relating the maintenance workload of the open source software to the available indices:
1) Obtain the maintenance workload $y_t$ of the open source software at each time point t within a period T (the interval between time points t can be set as needed);
Software maintenance workload is usually measured in person-days. In the maintenance of open source software, however, the specific developers cannot be identified and their working time cannot be measured. But once software has been maintained, its version is necessarily updated, and the log of each version number contains the maintainer's change record for the whole project in that version. Obtaining the index Rev. (revision) is therefore enough to measure the maintenance workload of the developers.
2) Plot the maintenance workload $y_t$ as the ordinate against time point t as the abscissa to obtain a broken line;
3) Express the broken line in the form of formula (8):
$$y = A_1 p_1 + A_2 p_2 + A_3 p_3 + \dots + A_k p_i + E \quad (8);$$
4) Using linear regression on the time points t within period T and the corresponding maintenance workload $y_t$ of the open source software, obtain the coefficients $A_1, A_2, A_3, \dots, A_k$ and E in formula (8);
S4: With formula (8) obtained, the maintenance workload y of the open source software can be predicted once the available indices $p_i$ are known.
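Steps S3 and S4 amount to an ordinary least-squares fit followed by evaluation of formula (8). The following Python sketch illustrates this under stated assumptions: each time point supplies one row of available index values together with the observed workload, the fit is obtained from the normal equations, and all function names and sample numbers are hypothetical. It is not the implementation used in the patent.

```python
def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("indices are linearly dependent; screen them first")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_workload_model(samples, workloads):
    """Least-squares fit of formula (8); returns (coefficients, intercept E)."""
    design = [list(row) + [1.0] for row in samples]   # last column carries E
    k = len(design[0])
    normal = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    moment = [sum(r[i] * y for r, y in zip(design, workloads)) for i in range(k)]
    *coefficients, intercept = solve(normal, moment)
    return coefficients, intercept


def predict_workload(coefficients, intercept, indices):
    """Step S4: evaluate formula (8) for one set of available index values."""
    return sum(a * p for a, p in zip(coefficients, indices)) + intercept


# Hypothetical data: two available indices observed at five time points.
samples = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
workloads = [2 * p1 - 3 * p2 + 5 for p1, p2 in samples]
coefficients, intercept = fit_workload_model(samples, workloads)
print(coefficients, intercept)
print(predict_workload(coefficients, intercept, (6, 4)))
```

The singular-matrix check in `solve` reflects why the VIF screening comes first: if two indices were perfectly collinear, the coefficients in formula (8) would not be uniquely determined.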
Embodiment: The Apache Foundation, a representative of open source projects, supports hundreds of open source projects and has made outstanding contributions to open source software worldwide. Many projects have run for several years under its support and have accumulated a large amount of source code and information on source code changes and project maintenance, and many excellent programmers around the world take part in developing and maintaining them. This embodiment selects the data of three of these projects, shindig, Lucene and openwebbeans, as verification objects. shindig was an Apache incubator project that aims to provide an open source OpenSocial container; it finished incubation in 2010 and became a top-level project, and all data from before and after this transition can be queried through SVN. Lucene has been named by the industry as "one of the most influential Apache projects of the last 10 years". openwebbeans is an implementation of the JSR-299 Web Beans specification.
Because multicollinearity among indices seriously affects the accuracy of the later linear regression analysis, the variance inflation factor is used to measure the multicollinearity among indices. Once the indices with excessive multicollinearity have been removed, the precondition of linear regression analysis holds, and linear regression analysis is then used to drop the indices whose influence is 0. The indices of the three projects shindig, Lucene and openwebbeans were analysed with the variance inflation factor verification method, giving Table 1:
Table 1: Variance inflation factor verification of the indices characterizing code quality
Variance inflation factor (VIF) verification shows that the number of classes, the number of packages, the number of methods and so on all have excessive multicollinearity with the number of lines of code. As Table 1 shows, their VIF values exceed 10, which indicates an excessive degree of correlation (multicollinearity) that would cause large errors in the result of the later linear regression analysis. In linear regression analysis the coefficients of these attributes are also found to be 0, that is, these attributes influence code versions far less than the other attributes do. After the highly correlated indices Package, Class and Methods are removed and LOC, Duplicated, Comment lines and Complexity are retained, VIF verification is performed again: the overall VIF of these four indices is 3.53, which is less than 10. Finally Revision is added to the index group and VIF verification is repeated; the VIF value between Revision and the other indices is 4653.51, far greater than 10. This shows that Revision is highly correlated with these indices, so these data can be used for regression analysis.
LOC (lines of code of the project): measures the size of the whole software project. LOC is an important index of source code size.
Dup. (Duplicated, duplicated lines of code): duplicated code refers to code fragments that are similar or identical to one another. Duplicated code is very common in all software and is one of the reasons for high software maintenance cost. A large amount of duplicated code can sharply increase running time.
Comm. (Comment lines, lines of code comments): code comments play an important role in source code, affecting the maintainability of the code, website loading speed and so on. Too few comments seriously hinder maintainers' understanding of the source code, and the hidden bugs introduced by maintenance increase as a result. Too many comments may affect the running speed of the software (for example, comments in page code).
Comp. (Complexity, project complexity): strictly speaking, the complexity of a software project should be divided into structural complexity and algorithmic complexity. The structural complexity of a software project comes from the components that make up the project, and the effort needed to resolve it can be regarded as the initial intellectual resource. Algorithmic complexity can be measured directly by the time taken to execute the whole software project, and the effort needed to resolve it can be regarded as the initial system resource. When the maintainers are competent, however, the system resources used by the maintained project decrease, and algorithmic complexity becomes unimportant. In the embodiment, therefore, the structural complexity of the software project is used as the primary index.
Rev. (revision): the project source code version defined in SVN; the log of each version number contains the maintainer's change record for the whole project in that version. The health of open source software is related to its maintenance activity, and the level of activity can be derived from code version updates.
The variance inflation factors of these indices show that choosing LOC, Dup., Comm. and Comp. as the independent variables for deriving open source software maintenance workload is feasible, and also show that the indices characterizing source code quality are indeed related to the number of source code versions. They further demonstrate the mathematical soundness of choosing the number of source code versions to assess open source software maintenance workload.
During the life cycle of an open source software project, a steadily growing user base feeds back a large amount of information, and a steadily growing number of developers helps to process that user feedback. Rapid processing and continuous maintenance ensure the healthy operation of an open source project, so a well-run open source project needs to update its source code versions continuously and actively. Moreover, the incremental information in the source code versions reveals the maintenance activity of the open source software. Each source code version contains the developer's name and the code change record of each commit, and this information reflects the workload of source code maintenance. Using the number of source code versions to assess the maintenance workload of open source software is therefore well grounded.
The final effect of maintenance work shows up in the basic data of the source code, and the data recording these source code changes are the source code version change data. Compared with other management indices, the source code version change data of an open source project are easier to obtain and change dynamically. It is therefore necessary to take the number of source code versions as the evaluation index of open source software maintenance workload.
Linear regression analysis is applied to the indices characterizing source code quality to predict software maintenance workload. Linear regression makes multi-factor models simple and convenient to analyse, measures accurately the degree of correlation between factors and the goodness of fit of the regression, improves the effectiveness of the prediction equation, and yields results that are easy to compare statistically. Linear regression analysis of the indices therefore gives the causal relation between changes in the source code indices and the number of source code versions, and once the coefficients are determined the number of source code versions can be predicted.
Let LOC be $X_1$, Comm. be $X_2$, Dup. be $X_3$ and Comp. be $X_4$; the regression estimation formula for the maintenance workload y can then be written as:
$$y = A X_1 + B X_2 + C X_3 + D X_4 + E \quad (9);$$
To determine the coefficients in formula (9), this embodiment takes the shindig, Karaf, Lucene and openwebbeans projects as examples, performs regression analysis on the statistics from July to December 2011, and uses the data from March to August 2012 for verification. Table 2 lists part of the source code measurement indices collected for the shindig, Lucene and openwebbeans projects in different periods:
Table 2: Indices of source code measurement
Because this experiment uses multiple linear regression, the best way to verify the regression is still to compare the actual number of source code versions counted at each time node with the maintenance workload given by the regression estimation formula.
Using the data of the three projects from May 2011 to January 2012, this experiment obtains the following regression estimation formulas:
Lucene: $y = -2.7614 X_1 - 4.4779 X_2 + 1.603 X_3 + 17.7877 X_4$ (10);
Openwebbeans: $y = 0.1131 X_1 + 2.8619 X_2 - 0.1811 X_3 - 2.4017 X_4$ (11);
Shindig: $y = -0.2107 X_1 - 0.7832 X_2 + 0.2308 X_3 + 1.917 X_4$ (12);
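Formulas (10) to (12) can be applied mechanically once the four index values are known. The sketch below is only an illustration: the coefficients are copied from the formulas above, while the function name, the argument names and the sample inputs are hypothetical. The units and scaling of the inputs are those of Table 2, which is not reproduced in this text, so the sample values carry no real-world meaning.

```python
# Coefficients (A, B, C, D) of formulas (10)-(12), in the order
# X1 = LOC, X2 = Comm., X3 = Dup., X4 = Comp.
REGRESSION_COEFFICIENTS = {
    "Lucene": (-2.7614, -4.4779, 1.603, 17.7877),
    "Openwebbeans": (0.1131, 2.8619, -0.1811, -2.4017),
    "Shindig": (-0.2107, -0.7832, 0.2308, 1.917),
}


def predict_revisions(project, loc, comment_lines, duplicated_lines, complexity):
    """Evaluate the project's regression estimation formula."""
    a, b, c, d = REGRESSION_COEFFICIENTS[project]
    return a * loc + b * comment_lines + c * duplicated_lines + d * complexity


# Hypothetical index values, in whatever units Table 2 uses.
for name in REGRESSION_COEFFICIENTS:
    print(name, predict_revisions(name, 1.0, 1.0, 1.0, 1.0))
```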
The index values collected for the three projects after February 2012 are substituted into the regression estimation formulas to verify the maintenance workload increments. The verification figures plot the difference in maintenance workload increment at each time point. This way of plotting shows the difference between the predicted and actual workload increments intuitively without losing accuracy. The verification results are shown in Fig. 2 to Fig. 4:
In Fig. 2 to Fig. 4 the starting time on the abscissa is set to 0 and the ordinate is the maintenance workload increment. The three figures show, for the three projects respectively, the comparison of predicted and actual maintenance workload increments from March to August 2012. The actual errors are shown in Table 3.
Table 3: Error rate of the maintenance workload increment of each project at each time node
The error in Table 3 is the ratio between each predicted maintenance workload increment and the real maintenance workload increment. Because the number of source code versions always increases and is generally very large, directly comparing predicted and actual values would show only very small errors, yet such errors could still cost maintainers several months of maintenance work. Using the incremental error instead exposes the gap between the cumulative increment of the predicted values and that of the real data, which amplifies the error. The table shows that all error rates are below 10%, reflecting the stability and accuracy of the algorithm provided by the present invention.
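The incremental comparison described above can be sketched as follows. The text does not give the exact formula behind Table 3, so this sketch assumes the error rate is the relative error of the predicted increment, |Δŷ - Δy| / |Δy|, between consecutive time nodes; the function names and the sample series are hypothetical.

```python
def increments(series):
    """Differences between consecutive values of a cumulative series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def increment_error_rates(predicted, actual):
    """Relative error of each predicted increment (assumed definition)."""
    rates = []
    for predicted_step, actual_step in zip(increments(predicted), increments(actual)):
        if actual_step == 0:
            raise ValueError("actual increment is zero; relative error undefined")
        rates.append(abs(predicted_step - actual_step) / abs(actual_step))
    return rates


# Hypothetical cumulative revision counts at three time nodes.
actual_revisions = [100, 110, 125]
predicted_revisions = [100, 111, 124.5]
print(increment_error_rates(predicted_revisions, actual_revisions))
```

In this sample the cumulative values differ by less than 1%, while the increments differ by about 10%, which is the amplification effect the paragraph above describes.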
Finally, the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the present invention may be modified or equivalently substituted without departing from its purpose and scope, and all such modifications shall fall within the scope of the claims of the present invention.
Claims (1)
1. A method for predicting the maintenance workload of open source software through code quality assessment, characterized by comprising the following steps:
S1: Obtain all indices that characterize the code quality of the open source software. Let all the indices form an index set containing Q elements, where $q_j$ is an element of the index set and denotes the j-th index in the set;
S2: Remove the highly correlated indices from the index set, as follows:
1) Set j = 1;
2) Calculate, according to formula (4), the variance inflation factor $VIF_j$ of index $q_j$ with respect to the other Q-1 indices in the index set:
$$VIF_j = \frac{1}{TOL_j} = \frac{1}{1 - R_j^2} \quad (4);$$
$$R_j = \overline{q_j} - \frac{\sum_{j=1}^{Q} \overline{q_j} - \overline{q_j}}{Q - 1} \quad (5);$$
where $R_j$ denotes the correlation coefficient between the j-th index and the remaining indices in the index set, $TOL_j$ is the tolerance of the j-th index, and $\overline{q_j}$ denotes the value of index $q_j$;
3) Compare the variance inflation factor $VIF_j$ with 10: if $VIF_j$ > 10, index $q_j$ is an index to be removed; otherwise it is an available index;
4) Set j = j + 1;
5) If j ≤ Q, return to step 2); otherwise put all indices to be removed into the to-be-removed index set, put all available indices into the available index set, and perform the next step;
6) The to-be-removed index set has B elements, where $b_f$ is an element of the to-be-removed index set and denotes the f-th index to be removed in that set;
Determine whether the number of elements in the to-be-removed index set is 0; if so, perform step 16); otherwise perform the next step;
7) Set f = 1;
8) Calculate, according to formula (6), the variance inflation factor $VIF_f$ of the index to be removed $b_f$ with respect to the other B-1 indices in the to-be-removed index set:
$$\mathrm{VIF}_f = \frac{1}{\mathrm{TOL}_f} = \frac{1}{1 - R_f^2} \qquad (6);$$

$$R_f = \overline{b_f} - \frac{\sum_{f=1}^{B} \overline{b_f} - \overline{b_f}}{B - 1} \qquad (7);$$
where R_f denotes the association coefficient between the f-th index to be removed and the remaining indices in the to-be-removed index set, TOL_f is the tolerance of the f-th index to be removed, and $\overline{b_f}$ denotes the value of the index to be removed b_f;
9) compare the variance inflation factor VIF_f with 10: if VIF_f>10, perform step 11); otherwise mark the index to be removed b_f as "to be moved" and perform the next step;
10) set f=f+1;
11) if f≤Q, return to step 8); otherwise perform the next step;
12) determine whether the number of indices marked "to be moved" is 0: if it is 0, perform step 15); otherwise perform the next step;
13) move all indices marked "to be moved" into the available index set, update the elements of the to-be-removed index set and of the available index set, update the number of elements in the to-be-removed index set, and perform the next step;
14) determine whether the number of elements in the current to-be-removed index set is 0: if it is 0, perform step 16); otherwise return to step 7);
15) discard all indices remaining in the current to-be-removed index set, and perform the next step;
16) output the current available index set;
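The screening procedure of S2 can be sketched in Python as follows. This is a minimal sketch, not the patented implementation: it takes formulas (4) to (7) literally and assumes each index is represented by a single scalar value. The names `vif` and `screen_indices` are hypothetical. Two further assumptions are made because the text leaves them open: the second pass iterates over all B elements of the to-be-removed set, and a set with only one element is given R = 0, since formula (7) would otherwise divide by zero.

```python
import math

VIF_THRESHOLD = 10.0  # threshold used in steps 3) and 9)


def vif(values, k):
    """Variance inflation factor of values[k], per formulas (4)-(5) / (6)-(7)."""
    n = len(values)
    if n < 2:
        # Assumption: with no other index to compare against, take R = 0.
        r = 0.0
    else:
        # Formula (5)/(7): own value minus the mean of the other n-1 values.
        r = values[k] - (sum(values) - values[k]) / (n - 1)
    tol = 1.0 - r * r  # tolerance TOL = 1 - R^2
    return math.inf if tol == 0 else 1.0 / tol  # formula (4)/(6)


def screen_indices(index_values):
    """Return the names of the available indices (step 16).

    index_values maps an index name to its scalar value.
    """
    names = list(index_values)
    vals = [index_values[name] for name in names]

    # Steps 1)-5): first pass over all Q indices.
    available, to_remove = [], []
    for k, name in enumerate(names):
        if vif(vals, k) > VIF_THRESHOLD:
            to_remove.append(name)
        else:
            available.append(name)

    # Steps 6)-15): re-check the flagged indices against each other.
    while to_remove:
        rvals = [index_values[name] for name in to_remove]
        moved = [name for k, name in enumerate(to_remove)
                 if vif(rvals, k) <= VIF_THRESHOLD]
        if not moved:
            break  # step 15): discard everything still flagged
        available.extend(moved)  # step 13)
        to_remove = [name for name in to_remove if name not in moved]

    return available
```

Under formula (4), the test VIF>10 is equivalent to 0.9 < R² < 1, so the threshold of 10 flags an index only when its association coefficient is close to 1 in absolute value. The loop terminates because each pass either moves at least one index out of the to-be-removed set or stops.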
S3: Let the available index set have P elements, where p_i is an element of the available index set and denotes the i-th available index; perform regression analysis on the P available indices to obtain the regression estimation formula relating the maintenance workload of the open source software to the available indices:
1) obtain the maintenance workload y_t of the open source software at each time point t within a time period T;
2) plot a polyline with the time point t on the horizontal axis and the maintenance workload y_t of the open source software on the vertical axis;
3) express this polyline in the form of formula (8):
$$y = A_1 p_1 + A_2 p_2 + A_3 p_3 + \dots + A_k p_i + E \qquad (8);$$
4) using linear regression, obtain the coefficients A_1, A_2, A_3, ..., A_k and E in formula (8) from the time points t within the time period T and the corresponding maintenance workloads y_t of the open source software;
S4: Using the resulting formula (8), once the available indices p_i are known, the maintenance workload y of the open source software can be predicted.
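The regression of S3 and the prediction of S4 can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the patented procedure: it assumes that each time point t supplies one observation pairing the values of the available indices with the workload y_t, and it fits formula (8) by ordinary least squares through the normal equations. The function names `fit_linear` and `predict` are hypothetical.

```python
def fit_linear(rows, ys):
    """Least-squares fit of y = A_1*p_1 + ... + A_k*p_k + E, as in formula (8).

    rows: one sequence of available-index values per time point (assumed layout).
    ys:   the maintenance workload y_t observed at the same time points.
    Returns (coefficients A, intercept E).
    """
    # A trailing 1.0 in every row carries the constant term E.
    X = [list(r) + [1.0] for r in rows]
    m = len(X[0])

    # Augmented normal-equation system [X^T X | X^T y].
    M = [[sum(x[i] * x[j] for x in X) for j in range(m)]
         + [sum(x[i] * y for x, y in zip(X, ys))]
         for i in range(m)]

    # Gauss-Jordan elimination with partial pivoting.
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        if abs(M[piv][c]) < 1e-12:
            raise ValueError("indices are collinear; the system is singular")
        M[c], M[piv] = M[piv], M[c]
        for r in range(m):
            if r != c:
                factor = M[r][c] / M[c][c]
                M[r] = [a - factor * b for a, b in zip(M[r], M[c])]

    solution = [M[i][m] / M[i][i] for i in range(m)]
    return solution[:-1], solution[-1]


def predict(coeffs, intercept, p):
    """Step S4: evaluate formula (8) for known available-index values p."""
    return sum(a * v for a, v in zip(coeffs, p)) + intercept
```

A singular system signals that some available indices are still linearly dependent, which is the situation the screening in S2 is intended to prevent.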
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510218321.3A CN104809066B (en) | 2015-04-30 | 2015-04-30 | A kind of method by code quality assessment prediction open source software maintenance workload |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104809066A CN104809066A (en) | 2015-07-29 |
CN104809066B true CN104809066B (en) | 2017-08-25 |
Family
ID=53693908
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510218321.3A Active CN104809066B (en) | 2015-04-30 | 2015-04-30 | A kind of method by code quality assessment prediction open source software maintenance workload |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104809066B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109472453B (en) * | 2018-10-12 | 2021-09-21 | 山大地纬软件股份有限公司 | Power consumer credit evaluation method based on global optimal fuzzy kernel clustering model |
CN109460908A (en) * | 2018-10-29 | 2019-03-12 | 成都安美勤信息技术股份有限公司 | The construction cost assessment method of soft project |
CN110033242B (en) * | 2019-04-23 | 2023-11-28 | 软通智慧科技有限公司 | Working time determining method, device, equipment and medium |
CN110286938B (en) * | 2019-07-03 | 2023-03-31 | 北京百度网讯科技有限公司 | Method and apparatus for outputting evaluation information for user |
CN112799712B (en) * | 2021-01-29 | 2024-02-02 | 中国工商银行股份有限公司 | Maintenance workload determination method, device, equipment and medium |
CN113590486A (en) * | 2021-02-23 | 2021-11-02 | 中国人民解放军军事科学院国防科技创新研究院 | Open source software code quality evaluation method based on measurement |
CN112800167B (en) * | 2021-04-13 | 2021-06-29 | 北京星天科技有限公司 | Method and device for evaluating workload of digital chart drawing |
CN113282299B (en) * | 2021-06-15 | 2024-06-07 | 中国农业银行股份有限公司 | Information processing method, device, equipment and storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005267489A (en) * | 2004-03-22 | 2005-09-29 | Toshiba Corp | Method and apparatus for measuring degree of software quality |
CN101710304A (en) * | 2009-11-27 | 2010-05-19 | 中国科学院软件研究所 | Method for evaluating implementation quality of software process |
CN104574209B (en) * | 2015-01-07 | 2018-03-16 | 国家电网公司 | The modeling method of Medium Early Warning model is overloaded in a kind of city net distribution transforming again |
- 2015-04-30 CN CN201510218321.3A patent/CN104809066B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN104809066A (en) | 2015-07-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 20180402; Address after: 400000 Chongqing city Shapingba District Jingyuan Road No. 8, No. 6-6 of 15; Patentee after: Chongqing michiro science and Technology Co., Ltd.; Address before: 400044 Shapingba District Sha Street, No. 174, Chongqing; Patentee before: Chongqing University |