CN104809066B - Method for predicting open source software maintenance workload through code quality assessment - Google Patents

Info

Publication number
CN104809066B
CN104809066B CN201510218321.3A CN201510218321A CN104809066B CN 104809066 B CN104809066 B CN 104809066B CN 201510218321 A CN201510218321 A CN 201510218321A CN 104809066 B CN104809066 B CN 104809066B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510218321.3A
Other languages
Chinese (zh)
Other versions
CN104809066A (en
Inventor
杨梦宁
罗杨洋
徐玲
洪明坚
葛永新
张小洪
杨丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Michiro Science And Technology Co Ltd
Original Assignee
Chongqing University
Application filed by Chongqing University
Priority to CN201510218321.3A
Publication of CN104809066A
Application granted
Publication of CN104809066B

Abstract

The present invention relates to a method for predicting the maintenance workload of open source software through code quality assessment. First, all indices (metrics) that characterize the code quality of the open source software are obtained. The variance inflation factor is then introduced to remove indices that are strongly correlated with one another, which yields a set of available indices. Linear regression analysis is performed on the available indices in that set to obtain the functional relationship between the maintenance workload of the open source software and the available indices, and this relationship can then be used to predict the maintenance workload. The indices that characterize open source code quality are easy to obtain, the predictions are accurate, and the method is suitable for wide adoption.

Description

Method for predicting open source software maintenance workload through code quality assessment
Technical field
The present invention relates to open source software maintenance, and in particular to a method for predicting the maintenance workload of open source software through code quality assessment.
Background art
Maintenance is the final stage of the software life cycle and also the longest. It accounts for 60% to 70% of the cost of the whole life cycle. Open source software makes its source code available to the public, and the public may use, modify, and distribute that source code free of licensing restrictions.
Because maintenance occupies such an important position in modern software engineering, many researchers have studied this field and created a series of prediction models. Depending on the indices used for prediction, software maintenance workload prediction models can be divided into static prediction models and dynamic prediction models. Static prediction models mainly include the COCOMO model, the Walston-Felix model, and the Matson, Barnett and Mellichamp model. Models of this class treat LOC (lines of code) and FP (function points) as the important indices for assessing software size and deriving maintenance workload, and use them as the basis for workload estimation. For the same KLOC or FP value, different models produce different estimates. The main reason is that most of these models were derived from the empirical data of a limited number of projects in a few application fields, so their applicability is restricted. It is therefore necessary to select an estimation model that suits the characteristics of the current project and to adjust it as needed (for example, by modifying the model constants). Dynamic multivariate models, by contrast, are derived from productivity data collected from more than 4000 contemporary software projects. Such a model treats workload as a function of two variables: software size and development time.
1. The best-known static model is the ACT model, which Boehm proposed for the maintenance of conventional software on the basis of the COCOMO model. The ACT model was the first to propose measuring software maintenance workload in person-months (MM), and it gives the formula:
$E = ACT \times 2.4 \times KLOC^{1.05}$ (1);
In formula (1), E is the workload and ACT is the proportion of the system's source code that is changed during one year of maintenance.
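As a quick numerical illustration of formula (1), the sketch below evaluates the ACT model in Python. The function name and the sample inputs (a 10 KLOC system with 10% of its code changed per year) are hypothetical and chosen only for illustration.

```python
def act_maintenance_effort(act: float, kloc: float) -> float:
    """Annual maintenance effort in person-months according to formula (1).

    act  -- fraction of the source code changed during one year of maintenance
    kloc -- size of the system in thousands of lines of code
    """
    return act * 2.4 * kloc ** 1.05


# Hypothetical example: a 10 KLOC system with 10% of its code changed per year.
effort = act_maintenance_effort(act=0.10, kloc=10.0)
print(f"{effort:.2f} person-months")  # prints "2.69 person-months"
```

Because the exponent 1.05 is only slightly above 1, the estimate grows almost linearly with code size, which is the limitation the text goes on to criticise.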
The problem with the ACT model is that it measures software maintenance work only by the number of changed lines of code, and code size is itself a very imprecise measure of software size. Its greatest weakness is that the model itself does not take any scientific attribute of software maintenance into account.
Boehm B. W. et al. later developed the COCOMO 2.0 model on the basis of the COCOMO model. The model is expressed by formula (2), in which A represents thousands of lines of code, function points, or object points. SF represents five scale factors: precedentedness, development flexibility, risk resolution, team cohesion, and process maturity. EM represents 17 effort multipliers. The model also defines the scale of a new software project. However, it is applicable only to planned software projects (for example, projects developed by software companies with well-established processes), and evaluating the maintenance workload of a software project with it requires collecting and compiling a large amount of data for each index.
2. A typical dynamic multivariate estimation model has the following form:
$E = (LOC \times B^{0.333} / P)^{3} \times (1/t)^{4}$ (3);
E is the workload in person-months or person-years; t is the project duration in months or years; B is a special skills factor that increases slowly as the need for testing, quality assurance, documentation, and management skills grows. For small programs (KLOC = 5 to 15), B = 0.16; for programs larger than 70 KLOC, B = 0.39. P is a productivity parameter that reflects the influence of the following factors on workload: overall process maturity and management practices; the extent to which good software engineering practices are used; the level of the programming language used; the state of the software environment; the skills and experience of the software team; and the complexity of the application.
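Formula (3) can be evaluated directly, as in the minimal sketch below. The function name is hypothetical, and the productivity parameter P = 10,000 is an assumed value used only to make the example run; the source gives no value for P. B = 0.16 is the value the text gives for small programs.

```python
def software_equation_effort(loc: float, b: float, p: float, t: float) -> float:
    """Effort E from formula (3): E = (LOC * B**0.333 / P)**3 * (1 / t)**4.

    loc -- size in lines of code
    b   -- special skills factor (0.16 for 5-15 KLOC, 0.39 above 70 KLOC)
    p   -- productivity parameter (assumed value in the example below)
    t   -- project duration, in the same time unit in which E is expressed
    """
    return (loc * b ** 0.333 / p) ** 3 * (1.0 / t) ** 4


# Hypothetical small project: 10,000 LOC, B = 0.16, assumed P = 10,000.
for duration in (1.0, 2.0):
    print(duration, software_equation_effort(10_000, 0.16, 10_000, duration))
```

The fourth-power dependence on 1/t means that doubling the duration divides the estimated effort by 16, which is why this class of model treats development time as a separate variable.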
In later research on dynamic models, Sajid Anwar, Muhammad Ramzan, Abdul Rauf, et al. started from software architecture: they evaluated the influence of architecture on software project quality and applied this influence to the analysis and estimation of software maintenance workload. They analysed the whole software life cycle extensively and assessed the final maintenance workload from the perspective of process management. These assessment methods are very effective, but they do not apply to open source software: open source projects have no strong management constraints, so the relevant management data cannot be obtained. Kun Chang Lee and Nam Yong Jo used Bayesian network analysis to analyse the factors that influence software maintenance workload and obtained good results. They gave the experimental basis for several currently accepted indices for assessing software maintenance workload and analysed in detail how these indices relate to one another. In the end, however, this approach cannot predict workload accurately.
The static-model approach reflects researchers' efforts to derive software maintenance workload from the most basic components of software, such as source code and documentation. It is not suitable for projects under maintenance, however: during maintenance all the basic components of the software change, and current static methods cannot fully capture the influence of these changes on maintenance workload. The dynamic-model approach reflects the idea of starting from the maintenance staff and, following software engineering practice, using process management data to estimate and predict maintenance workload. But the data for dynamic models are hard to collect, and for open source software the management data cannot be collected at all.
Summary of the invention
In view of the above problems in the prior art, the object of the present invention is to provide a method that combines dynamic and static models and predicts the maintenance workload of open source software through code quality assessment.
To achieve this object, the present invention adopts the following technical solution: a method for predicting open source software maintenance workload through code quality assessment, comprising the following steps:
S1: Obtain all indices that characterize the code quality of the open source software. Let all the indices form an index set containing Q elements, where $q_j$ is an element of the index set and denotes the j-th index;
S2: Remove the highly correlated indices from the index set, as follows:
1) Set j = 1;
2) Calculate the variance inflation factor $VIF_j$ of index $q_j$ with respect to the other Q-1 indices in the index set according to formula (4):
$VIF_j = \frac{1}{TOL_j} = \frac{1}{1 - R_j^{2}}$ (4);
$R_j = \overline{q_j} - \frac{\sum_{j=1}^{Q} \overline{q_j} - \overline{q_j}}{Q - 1}$ (5);
where $R_j$ is the correlation coefficient between the j-th index and the remaining indices in the index set, $TOL_j$ is the tolerance of the j-th index, and $\overline{q_j}$ is the value of index $q_j$;
3) Compare the variance inflation factor $VIF_j$ with 10: if $VIF_j$ > 10, index $q_j$ is an index to be removed; otherwise it is an available index;
4) Let j = j + 1;
5) If j ≤ Q, return to step 2); otherwise put all indices to be removed into the to-be-removed index set, put all available indices into the available index set, and go to the next step;
6) The to-be-removed index set has B elements, where $b_f$ is an element of the to-be-removed index set and denotes the f-th index to be removed;
Check whether the number of elements in the to-be-removed index set is 0: if it is 0, go to step 16); otherwise go to the next step;
7) Let f = 1;
8) Calculate the variance inflation factor $VIF_f$ of index $b_f$ with respect to the other B-1 indices in the to-be-removed index set according to formula (6):
$VIF_f = \frac{1}{TOL_f} = \frac{1}{1 - R_f^{2}}$ (6);
$R_f = \overline{b_f} - \frac{\sum_{f=1}^{B} \overline{b_f} - \overline{b_f}}{B - 1}$ (7);
where $R_f$ is the correlation coefficient between the f-th index to be removed and the remaining indices in the to-be-removed index set, $TOL_f$ is the tolerance of the f-th index to be removed, and $\overline{b_f}$ is the value of index $b_f$;
9) Compare the variance inflation factor $VIF_f$ with 10: if $VIF_f$ > 10, go to step 11); otherwise mark index $b_f$ as "to be moved" and go to the next step;
10) Let f = f + 1;
11) If f ≤ Q, return to step 8); otherwise go to the next step;
12) Check whether the number of to-be-removed indices marked "to be moved" is 0: if it is 0, go to step 15); otherwise go to the next step;
13) Move all indices marked "to be moved" into the available index set, update the elements of the to-be-removed index set and of the available index set, update the number of elements in the to-be-removed index set, and go to the next step;
14) Check whether the number of elements in the current to-be-removed index set is 0: if it is 0, go to step 16); otherwise return to step 7);
15) Remove all indices in the current to-be-removed index set, and go to the next step;
16) Output the current available index set;
S3: Let the available index set have P elements, where $p_i$ is an element of the available index set and denotes the i-th available index. Perform regression analysis on the P available indices in the available index set to obtain the regression estimation formula that relates the maintenance workload of the open source software to the available indices:
1) Obtain the maintenance workload $y_t$ of the open source software at each time point t within a period T;
2) Plot the data with time point t as the abscissa and maintenance workload $y_t$ as the ordinate, obtaining a line chart;
3) Express the line in the form of formula (8):
$y = A_1 p_1 + A_2 p_2 + A_3 p_3 + \dots + A_P p_P + E$ (8);
4) Using linear regression, obtain the coefficients $A_1, A_2, A_3, \dots, A_P$ and E in formula (8) from the time points t in period T and the maintenance workload $y_t$ of the open source software at each time point t;
S4: With formula (8), once the values of the available indices $p_i$ are known, the maintenance workload y of the open source software can be predicted.
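The first screening pass of step S2 (sub-steps 1 to 5) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented computation: it uses the standard statistical definition of the variance inflation factor, in which $R_j^2$ is the coefficient of determination obtained by regressing index j on the other indices, rather than the $R_j$ expression given in the source. All function names and the sample data are hypothetical, and perfectly collinear or constant columns are not handled.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(target, predictors):
    """R^2 of a least-squares fit of target on predictors plus an intercept."""
    rows = [[1.0] + [col[k] for col in predictors] for k in range(len(target))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean = sum(target) / len(target)
    ss_res = sum((y - f) ** 2 for y, f in zip(target, fitted))
    ss_tot = sum((y - mean) ** 2 for y in target)
    return 1.0 - ss_res / ss_tot


def vif(columns, j):
    """VIF of column j against all other columns: 1 / (1 - R_j^2), as in formula (4)."""
    others = [c for i, c in enumerate(columns) if i != j]
    r2 = r_squared(columns[j], others)
    return float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)


def split_by_vif(named_columns, threshold=10.0):
    """Sub-steps 1-5 of S2: split indices into (available, to_remove) at VIF > threshold."""
    names = list(named_columns)
    cols = [named_columns[n] for n in names]
    available, to_remove = [], []
    for j, name in enumerate(names):
        (to_remove if vif(cols, j) > threshold else available).append(name)
    return available, to_remove


# Hypothetical measurements at eight time points; "methods" tracks "loc" closely.
sample = {
    "loc": [10, 12, 15, 19, 24, 30, 37, 45],
    "methods": [1.0, 1.25, 1.45, 1.95, 2.35, 3.05, 3.65, 4.55],
    "complexity": [3, 1, 4, 1, 5, 9, 2, 6],
}
print(split_by_vif(sample))  # prints (['complexity'], ['loc', 'methods'])
```

The threshold of 10 is the one stated in sub-step 3; it corresponds to $R_j^2 = 0.9$ in formula (4).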
Compared with the prior art, the present invention has the following advantages: the method introduces indices that characterize code quality, screens them, and then establishes the functional relationship between the indices and the software maintenance workload, so that the maintenance workload of open source software can be predicted accurately from that relationship. The indices that characterize open source code quality are easy to obtain, the predictions are accurate, and the method is suitable for wide adoption.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the invention.
Fig. 2 shows the results for Lucene in the embodiment.
Fig. 3 shows the results for openwebbeans in the embodiment.
Fig. 4 shows the results for Shindig in the embodiment.
Detailed description of the embodiments
The present invention is described in further detail below.
Referring to Fig. 1, a method for predicting open source software maintenance workload through code quality assessment comprises the following steps:
S1: Obtain all indices that characterize the code quality of the open source software. Let all the indices form an index set containing Q elements, where $q_j$ is an element of the index set and denotes the j-th index;
S2: Remove the highly correlated indices from the index set, as follows. The variance inflation factor is used to measure the multicollinearity (that is, the degree of association) between indices;
1) Set j = 1;
2) Calculate the variance inflation factor $VIF_j$ of index $q_j$ with respect to the other Q-1 indices in the index set according to formula (4):
$VIF_j = \frac{1}{TOL_j} = \frac{1}{1 - R_j^{2}}$ (4);
$R_j = \overline{q_j} - \frac{\sum_{j=1}^{Q} \overline{q_j} - \overline{q_j}}{Q - 1}$ (5);
where $R_j$ is the correlation coefficient between the j-th index and the remaining indices in the index set, $TOL_j$ is the tolerance of the j-th index, and $\overline{q_j}$ is the value of index $q_j$;
3) Compare the variance inflation factor $VIF_j$ with 10: if $VIF_j$ > 10 (which indicates that the multicollinearity between the j-th index and the remaining indices is too high), index $q_j$ is an index to be removed; otherwise it is an available index;
4) Let j = j + 1;
5) If j ≤ Q, return to step 2); otherwise put all indices to be removed into the to-be-removed index set, put all available indices into the available index set, and go to the next step;
6) The to-be-removed index set has B elements, where $b_f$ is an element of the to-be-removed index set and denotes the f-th index to be removed;
Check whether the number of elements in the to-be-removed index set is 0: if it is 0, go to step 16); otherwise go to the next step;
7) Let f = 1;
8) Calculate the variance inflation factor $VIF_f$ of index $b_f$ with respect to the other B-1 indices in the to-be-removed index set according to formula (6):
$VIF_f = \frac{1}{TOL_f} = \frac{1}{1 - R_f^{2}}$ (6);
$R_f = \overline{b_f} - \frac{\sum_{f=1}^{B} \overline{b_f} - \overline{b_f}}{B - 1}$ (7);
where $R_f$ is the correlation coefficient between the f-th index to be removed and the remaining indices in the to-be-removed index set, $TOL_f$ is the tolerance of the f-th index to be removed, and $\overline{b_f}$ is the value of index $b_f$;
9) Compare the variance inflation factor $VIF_f$ with 10: if $VIF_f$ > 10, go to step 11); otherwise mark index $b_f$ as "to be moved" and go to the next step;
10) Let f = f + 1;
11) If f ≤ Q, return to step 8); otherwise go to the next step;
12) Check whether the number of to-be-removed indices marked "to be moved" is 0: if it is 0, go to step 15); otherwise go to the next step;
13) Move all indices marked "to be moved" into the available index set, update the elements of the to-be-removed index set and of the available index set, update the number of elements in the to-be-removed index set, and go to the next step;
14) Check whether the number of elements in the current to-be-removed index set is 0: if it is 0, go to step 16); otherwise return to step 7);
15) Remove all indices in the current to-be-removed index set, and go to the next step;
16) Output the current available index set;
Because multicollinearity among the indices seriously affects the accuracy of regression analysis, the precondition for regression analysis is satisfied only after the variance inflation factors have been checked and the indices with excessive multicollinearity have been removed.
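The second pass of step S2 (sub-steps 6 to 16), which re-examines the rejected indices among themselves, can be sketched as below. The sketch assumes that each round visits every index currently in the to-be-removed set once. The function names are hypothetical; `vif_within` is a caller-supplied callback, and `toy_vif` is only a stand-in that uses the largest pairwise $R^2$ from a made-up table, not a real VIF computation.

```python
def rescue_removed(available, to_remove, vif_within, threshold=10.0):
    """Sub-steps 6-16 of S2.

    vif_within(name, group) must return the VIF of `name` with respect to the
    other members of `group`. Returns (available, discarded).
    """
    available = list(available)
    to_remove = list(to_remove)
    while to_remove:                                  # sub-steps 6 and 14
        movable = [name for name in to_remove         # sub-steps 7-11
                   if vif_within(name, to_remove) <= threshold]
        if not movable:                               # sub-step 12 -> 15
            break
        available.extend(movable)                     # sub-step 13
        to_remove = [name for name in to_remove if name not in movable]
    return available, to_remove                       # sub-step 16


# Made-up pairwise R^2 values between three rejected indices m1, m2, m3.
PAIR_R2 = {
    frozenset(("m1", "m2")): 0.99,
    frozenset(("m1", "m3")): 0.20,
    frozenset(("m2", "m3")): 0.30,
}


def toy_vif(name, group):
    """Stand-in VIF: 1 / (1 - largest pairwise R^2 with the rest of the group)."""
    others = [g for g in group if g != name]
    r2 = max((PAIR_R2[frozenset((name, o))] for o in others), default=0.0)
    return 1.0 / (1.0 - r2)


print(rescue_removed(["m0"], ["m1", "m2", "m3"], toy_vif))
# prints (['m0', 'm3'], ['m1', 'm2'])
```

In this toy run m3 is moved back into the available set in the first round, while m1 and m2 remain strongly related to each other and are discarded in the second round.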
S3: Let the available index set have P elements, where $p_i$ is an element of the available index set and denotes the i-th available index. Perform regression analysis on the P available indices in the available index set to obtain the regression estimation formula that relates the maintenance workload of the open source software to the available indices:
1) Obtain the maintenance workload $y_t$ of the open source software at each time point t within a period T (the interval between time points t can be set as needed);
Software maintenance workload is usually measured in person-days. In the maintenance of open source software, however, the specific developers cannot be identified and their working time cannot be measured. After maintenance the software version is always updated, and the log of each version number contains the maintainer's record of changes to the whole project. Obtaining the index Rev. (revision, the revision version) is therefore enough to measure the developers' maintenance workload.
2) Plot the data with time point t as the abscissa and maintenance workload $y_t$ as the ordinate, obtaining a line chart;
3) Express the line in the form of formula (8):
$y = A_1 p_1 + A_2 p_2 + A_3 p_3 + \dots + A_P p_P + E$ (8);
4) Using linear regression, obtain the coefficients $A_1, A_2, A_3, \dots, A_P$ and E in formula (8) from the time points t in period T and the maintenance workload $y_t$ of the open source software at each time point t;
S4: With formula (8), once the values of the available indices $p_i$ are known, the maintenance workload y of the open source software can be predicted.
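Steps S3 and S4 amount to an ordinary least-squares fit followed by evaluation of the fitted formula. The sketch below, with hypothetical function names and made-up observations, fits the coefficients of formula (8) by solving the normal equations and then predicts the workload for a new set of index values. It is one possible way to carry out the regression, assuming the index columns are not collinear.

```python
def fit_linear(rows, y):
    """Least-squares fit of y = A_1 p_1 + ... + A_P p_P + E.

    rows -- one tuple of index values (p_1, ..., p_P) per time point
    y    -- maintenance workload observed at the same time points
    Returns (coefficients, intercept).
    """
    design = [list(r) + [1.0] for r in rows]          # last column carries E
    n = len(design[0])
    # Augmented normal equations: (X^T X | X^T y)
    a = [[sum(d[i] * d[j] for d in design) for j in range(n)]
         + [sum(d[i] * t for d, t in zip(design, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * pv for x, pv in zip(a[r], a[col])]
    solution = [a[i][n] / a[i][i] for i in range(n)]
    return solution[:-1], solution[-1]


def predict(coefficients, intercept, p):
    """Step S4: evaluate formula (8) for one set of index values."""
    return sum(c * v for c, v in zip(coefficients, p)) + intercept


# Made-up observations generated from y = 2*p1 + 3*p2 + 1.
rows = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 5)]
y = [3, 4, 6, 8, 22]
coeffs, e = fit_linear(rows, y)
print([round(c, 6) for c in coeffs], round(e, 6))  # prints [2.0, 3.0] 1.0
print(round(predict(coeffs, e, (4, 2)), 6))        # prints 15.0
```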
Embodiment: The Apache Software Foundation, a representative of open source projects, supports hundreds of open source projects and has made outstanding contributions to open source software worldwide. Many projects have run for several years with the foundation's support and have accumulated large amounts of source code, source code change information, and project maintenance information, and many excellent programmers around the world take part in developing and maintaining them. This embodiment selects data from the shindig, Lucene, and openwebbeans projects as the objects of verification. shindig is an Apache incubator project intended to provide an open source OpenSocial container; it finished incubation in 2010 and became a top-level project, and all data from before and after this transition can be queried through SVN. Lucene has been rated by the industry as "one of the most influential Apache projects of the past 10 years". openwebbeans is an implementation of the JSR-299 Web Beans specification.
Because multicollinearity among the indices seriously affects the accuracy of the later linear regression analysis, the variance inflation factor is used to measure it. Once the indices with excessive multicollinearity are removed, the precondition for linear regression analysis is satisfied, and linear regression is then used to eliminate the indices whose influence is 0. The indices of the three projects shindig, Lucene, and openwebbeans were analysed with the variance inflation factor check, giving Table 1:
Table 1: Variance inflation factor check of the indices that characterize code quality
The variance inflation factor (VIF) check shows that the number of classes, the number of packages, the number of methods, and so on are all excessively collinear with the number of lines of code. As Table 1 shows, a VIF value above 10 indicates an excessive degree of association (multicollinearity), which would cause large errors in the later linear regression analysis. In the linear regression analysis the coefficients of these attributes are also found to be 0, that is, their influence on the code version is far smaller than that of the other attributes. After removing the highly associated indices Package, Class, and Methods and retaining LOC, Duplicated, Comment lines, and Complexity, the VIF check was repeated: the overall VIF of these four indices is 3.53, which is below 10. Finally, Revision was added to the index group and the VIF check was repeated; the VIF between Revision and the other indices is 4653.51, far greater than 10. This shows that Revision is strongly associated with these indices, so the data can be used for regression analysis.
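Formula (4) ties each VIF value to a coefficient of determination, which makes the numbers quoted above easier to read. Solving formula (4) for $R^2$ gives the relation below; the rounded decimals are computed here from the VIF values in the text and are not reported in the source.

```latex
R^2 = 1 - \frac{1}{VIF}
\qquad\Longrightarrow\qquad
\begin{aligned}
VIF = 10      &\iff R^2 = 0.9 \\
VIF = 3.53    &\iff R^2 \approx 0.717 \\
VIF = 4653.51 &\iff R^2 \approx 0.9998
\end{aligned}
```

Read this way, the threshold of 10 rejects an index once about 90% of its variance is explained by the others, while the very large value for Revision means that the four retained indices explain almost all of its variance, which is the property wanted of a dependent variable.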
LOC (lines of code of the project): measures the scale of the whole software project. LOC is an important index of source code size.
Dup. (Duplicated), duplicated lines of code: duplicated code means code fragments that are similar or identical to one another. Duplicated code is common in all software and is one of the reasons software maintenance is costly. A large amount of duplicated code can cause run time to increase sharply.
Comm. (Comment lines), lines of code comments: comments play an important role in source code, affecting the maintainability of the code, website loading speed, and so on. Too few comments seriously hinder maintainers' understanding of the source code, and the number of hidden bugs introduced by maintenance rises as a result. Too many comments may affect the running speed of the software (for example, comments in web page code).
Comp. (Complexity), project complexity: strictly speaking, the complexity of a software project divides into structural complexity and algorithmic complexity. The structural complexity of a software project comes from the components that make up the project, and what is needed to deal with it can be regarded as the initial intellectual resource. Algorithmic complexity can be measured directly by the time needed to execute the whole software project, and what is needed to deal with it can be regarded as the initial system resource. When the maintainers are competent, however, the system resources used by the maintained project decay, and algorithmic complexity becomes unimportant. In this embodiment, therefore, the structural complexity of the software project is the primary index considered.
Rev. (revision), revision version: the project source code version defined in SVN. The log of each version number contains the maintainer's record of changes to the whole project. The health of open source software is related to how actively it is maintained, and the level of activity can be derived from code version updates.
The variance inflation factors of these indices show that choosing LOC, Dup., Comm., and Comp. as the independent variables for deriving open source software maintenance workload is feasible, and that a relationship does exist between the indices that characterize source code quality and the number of source code versions. They also confirm that choosing the number of source code versions to assess open source software maintenance workload is mathematically reasonable.
During the life cycle of an open source software project, a steadily growing user base feeds back a large amount of information, and a steadily growing number of developers helps to process that feedback. Fast processing and sustained maintenance keep an open source project running well, so a well-run open source project needs continuous, active updates of its source code version. Moreover, the incremental information contained in the source code versions reveals the maintenance activity of the project. Each source code version includes the developer's name and the code change record of the commit, and this information reflects the workload of source code maintenance. Assessing the maintenance workload of open source software by the number of source code versions is therefore well grounded.
The final outcome of maintenance work shows up in the basic data of the source code, and the data that record these source code changes are the source code version change data. Compared with other management indices, the source code version change data of an open source project are easier to obtain and change dynamically, so it is necessary to take the number of source code versions as the evaluation index for open source software maintenance workload.
Linear regression analysis is applied to the indices that characterize source code quality to predict software maintenance workload. Linear regression makes multi-factor models simple and convenient to analyse, measures accurately the degree of correlation between the factors and the goodness of fit of the regression, improves the effectiveness of the prediction equation, and produces results that are easy to compare statistically. Performing linear regression on the indices therefore yields the causal relationship between changes in the source code indices and the number of source code versions, and once the coefficients are determined the number of source code versions can be predicted.
Let LOC be $X_1$, Comm. be $X_2$, Dup. be $X_3$, and Comp. be $X_4$; the regression estimation formula for the maintenance workload y can then be written as:
$y = A X_1 + B X_2 + C X_3 + D X_4 + E$ (9);
To determine the coefficients in formula (9), this embodiment takes the shindig, Karaf, Lucene, and openwebbeans projects as examples, performs regression analysis on the statistics from July to December 2011, and uses the data from March to August 2012 for verification. Table 2 lists some of the source code measurement indices of the shindig, Lucene, and openwebbeans projects collected in different periods:
Table 2: Source code measurement indices
Because this experiment uses multiple linear regression, the best way to check it is still to compare the actual number of source code versions at each time node with the maintenance workload given by the regression estimation formula.
Using the data of the three projects from May 2011 to January 2012, this experiment obtained the following regression estimation formulas:
Lucene: $y = -2.7614 X_1 - 4.4779 X_2 + 1.603 X_3 + 17.7877 X_4$ (10);
Openwebbeans: $y = 0.1131 X_1 + 2.8619 X_2 - 0.1811 X_3 - 2.4017 X_4$ (11);
Shindig: $y = -0.2107 X_1 - 0.7832 X_2 + 0.2308 X_3 + 1.917 X_4$ (12);
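The three regression estimation formulas can be applied mechanically once index values are available. The sketch below stores the coefficients of formulas (10) to (12) and evaluates them; the function name is hypothetical, and because the source does not state the units or scaling of $X_1$ to $X_4$, the input vector is purely illustrative.

```python
# Coefficients for (X1 = LOC, X2 = Comm., X3 = Dup., X4 = Comp.),
# copied from formulas (10)-(12).
COEFFICIENTS = {
    "Lucene": (-2.7614, -4.4779, 1.603, 17.7877),
    "Openwebbeans": (0.1131, 2.8619, -0.1811, -2.4017),
    "Shindig": (-0.2107, -0.7832, 0.2308, 1.917),
}


def estimate_workload(project, x):
    """Evaluate the project's regression estimation formula for index values x."""
    return sum(a * v for a, v in zip(COEFFICIENTS[project], x))


# Illustrative input only: one unit of each index, so y is the sum of the coefficients.
for name in COEFFICIENTS:
    print(name, round(estimate_workload(name, (1.0, 1.0, 1.0, 1.0)), 4))
```

The coefficients differ in sign and magnitude between projects, so a formula fitted on one project should not be expected to transfer to another.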
The index values collected for the three projects after February 2012 were substituted into the regression estimation formulas to verify the maintenance workload increments. The verification plots use the difference in maintenance workload increment at each time point. This way of plotting preserves the visible difference between the predicted and actual workload increments without losing accuracy. The verification results are shown in Fig. 2 to Fig. 4:
In Fig. 2 to Fig. 4 the start time on the abscissa is set to 0 and the ordinate is the maintenance workload increment. The three figures show, for the three projects respectively, the comparison of maintenance workload increments from March to August 2012. The actual errors are given in Table 3.
Table 3: Error rate of the maintenance workload increment at each time node of the projects
The error in Table 3 is the ratio between each predicted maintenance workload increment and the actual maintenance workload increment. Because the number of source code versions always increases and is generally very large, comparing predicted and actual values directly shows only a very small error, yet such an error is enough to cost maintainers several months of work. Using the increment error instead exposes the gap between the cumulative increment of the predicted values and that of the real data, which amplifies the effect of the error. The table shows that the error rates are all below 10%, which reflects the stability and accuracy of the algorithm provided by the invention.
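The increment-based error check can be sketched as follows. The sketch assumes that the error rate is the relative difference between the predicted and the actual increment, which is one reading of the description above; the function names and the revision counts are hypothetical.

```python
def increments(series):
    """Differences between consecutive cumulative values."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def increment_error_rates(predicted, actual):
    """Relative error of each predicted increment against the actual increment."""
    return [abs(p - a) / abs(a)
            for p, a in zip(increments(predicted), increments(actual))]


# Hypothetical cumulative revision counts at three consecutive time nodes.
predicted = [100, 112, 125]
actual = [100, 110, 126]
print(increment_error_rates(predicted, actual))  # prints [0.2, 0.1875]
```

In this made-up example the cumulative values differ by at most 2%, yet the increment errors are about 20%, which illustrates why comparing increments magnifies prediction errors that a direct comparison of totals would hide.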
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution may be modified or equivalently substituted without departing from its spirit and scope, and all such modifications shall fall within the scope of the claims of the present invention.

Claims (1)

1. A method for predicting open source software maintenance workload through code quality assessment, characterized by comprising the following steps:
S1:All indexs for characterizing open source software code quality are obtained, if in all index composing indexes set, the index set Have Q element, qjFor the element of index set, j-th of index in index set is represented;
S2:Remove the high index of the degree of association in index set, method is as follows:
1) j=1 is set;
2) according to formula (4) parameter qjWith the variance inflation factor VIF of Q-1 index of other in index setj
<mrow> <msub> <mi>VIF</mi> <mi>j</mi> </msub> <mo>=</mo> <mfrac> <mn>1</mn> <mrow> <msub> <mi>TOL</mi> <mi>j</mi> </msub> </mrow> </mfrac> <mo>=</mo> <mfrac> <mn>1</mn> <mrow> <mn>1</mn> <mo>-</mo> <msubsup> <mi>R</mi> <mi>j</mi> <mn>2</mn> </msubsup> </mrow> </mfrac> <mo>-</mo> <mo>-</mo> <mo>-</mo> <mrow> <mo>(</mo> <mn>4</mn> <mo>)</mo> </mrow> <mo>;</mo> </mrow>
<mrow> <msub> <mi>R</mi> <mi>j</mi> </msub> <mo>=</mo> <mover> <msub> <mi>q</mi> <mi>j</mi> </msub> <mo>&amp;OverBar;</mo> </mover> <mo>-</mo> <mfrac> <mrow> <munderover> <mo>&amp;Sigma;</mo> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>Q</mi> </munderover> <mover> <msub> <mi>q</mi> <mi>j</mi> </msub> <mo>&amp;OverBar;</mo> </mover> <mo>-</mo> <mover> <msub> <mi>q</mi> <mi>j</mi> </msub> <mo>&amp;OverBar;</mo> </mover> </mrow> <mrow> <mi>Q</mi> <mo>-</mo> <mn>1</mn> </mrow> </mfrac> <mo>-</mo> <mo>-</mo> <mo>-</mo> <mrow> <mo>(</mo> <mn>5</mn> <mo>)</mo> </mrow> <mo>;</mo> </mrow>
where R_j denotes the association coefficient between the j-th index and the remaining indexes in the index set, TOL_j is the tolerance of the j-th index, and \overline{q_j} denotes the value of index q_j;
3) Compare the variance inflation factor VIF_j with 10: if VIF_j > 10, index q_j is an index to be removed; otherwise it is an available index;
4) Let j = j + 1;
5) If j ≤ Q, return to step 2); otherwise put all indexes to be removed into the to-be-removed index set, put all available indexes into the available index set, and perform the next step;
6) Let the to-be-removed index set have B elements, where b_f is an element of that set and denotes the f-th index to be removed;
Determine whether the number of elements in the to-be-removed index set is 0: if it is 0, perform step 16); otherwise perform the next step;
7) Let f = 1;
8) According to formula (6), calculate the variance inflation factor VIF_f of the index to be removed b_f with respect to the other B-1 indexes to be removed in the to-be-removed index set:
$$\mathrm{VIF}_f = \frac{1}{\mathrm{TOL}_f} = \frac{1}{1 - R_f^2} \tag{6}$$
$$R_f = \overline{b_f} - \frac{\sum_{f=1}^{B} \overline{b_f} - \overline{b_f}}{B - 1} \tag{7}$$
where R_f denotes the association coefficient between the f-th index to be removed and the remaining indexes to be removed in the to-be-removed index set, TOL_f is the tolerance of the f-th index to be removed, and \overline{b_f} denotes the value of the index to be removed b_f;
9) Compare the variance inflation factor VIF_f with 10: if VIF_f > 10, perform step 11); otherwise mark the index to be removed b_f as "to be moved" and perform the next step;
10) Let f = f + 1;
11) If f ≤ Q, return to step 8); otherwise perform the next step;
12) Determine whether the number of indexes to be removed that are marked "to be moved" is 0: if it is 0, perform step 15); otherwise perform the next step;
13) Move all indexes to be removed that are marked "to be moved" into the available index set, update the elements of the to-be-removed index set and of the available index set, update the number of elements in the to-be-removed index set, and perform the next step;
14) Determine whether the number of elements in the current to-be-removed index set is 0: if it is 0, perform step 16); otherwise return to step 7);
15) Remove all indexes to be removed that remain in the current to-be-removed index set, and perform the next step;
16) Output the current available index set;
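The screening loop of step S2 can be sketched in Python as follows. This is a minimal illustration that reads formulas (4) to (7) literally, treating each index as a single numeric value; it is not the patentee's implementation. The names `association_coefficient`, `vif`, `filter_indexes`, and the example index names are hypothetical. The sketch visits every element of the to-be-removed set in the second pass, which is how it interprets steps 7) to 11). It also assumes that a single leftover flagged index stays flagged, because formula (7) is undefined when B-1 = 0. In common statistical practice R² is the coefficient of determination obtained by regressing one index on the others, so the coefficient function is left pluggable.

```python
import math
from typing import Callable, Dict, List, Tuple

VIF_THRESHOLD = 10.0  # the value compared against in steps 3) and 9)


def association_coefficient(values: List[float], k: int) -> float:
    """R as written in formulas (5)/(7): own value minus the mean of the other values."""
    if len(values) < 2:
        raise ValueError("need at least two indexes to compare")
    return values[k] - (sum(values) - values[k]) / (len(values) - 1)


def vif(r: float) -> float:
    """VIF = 1 / TOL = 1 / (1 - R^2), as in formulas (4)/(6)."""
    tolerance = 1.0 - r * r
    return math.inf if tolerance == 0.0 else 1.0 / tolerance


def filter_indexes(
    index_values: Dict[str, float],
    coefficient: Callable[[List[float], int], float] = association_coefficient,
) -> Tuple[List[str], List[str]]:
    """Return (available index names, removed index names)."""
    names = list(index_values)
    values = [index_values[n] for n in names]
    available: List[str] = []
    to_remove: List[str] = []
    # Steps 1)-5): screen every index against the whole index set.
    for k, name in enumerate(names):
        flagged = vif(coefficient(values, k)) > VIF_THRESHOLD
        (to_remove if flagged else available).append(name)
    # Steps 6)-15): re-screen flagged indexes against each other only.
    # Assumption: a single leftover index cannot be compared, so it stays flagged.
    while len(to_remove) > 1:
        values = [index_values[n] for n in to_remove]
        movable = [n for k, n in enumerate(to_remove)
                   if vif(coefficient(values, k)) <= VIF_THRESHOLD]
        if not movable:
            break  # step 15): everything still flagged is removed
        available.extend(movable)  # step 13)
        to_remove = [n for n in to_remove if n not in movable]
    return available, to_remove  # step 16)


# Hypothetical, already-normalized index values used only for illustration.
example = {"coupling": 0.98, "comment_ratio": 0.01,
           "duplication": 0.02, "complexity": 0.03}
kept, dropped = filter_indexes(example)
```

With these illustrative values only `coupling` exceeds the threshold, so it is removed and the other three indexes form the available index set. A caller who prefers the conventional regression-based R² can pass a different `coefficient` function without changing the loop.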
S3: Let the available index set have P elements, where p_i is an element of the available index set and denotes the i-th available index. Perform regression analysis on the P available indexes in the available index set to obtain the regression estimation formula relating the maintenance workload of the open source software to the available indexes:
1) Obtain the maintenance workload y_t of the open source software at each time point t within the time period T;
2) Then plot the data with time point t as the abscissa and the maintenance workload y_t of the open source software as the ordinate, obtaining a polyline;
3) Express the above polyline in the form of formula (8):
$$y = A_1 p_1 + A_2 p_2 + A_3 p_3 + \dots + A_k p_i + E \tag{8}$$
4) Using linear regression, obtain the coefficients A_1, A_2, A_3, ..., A_k and E in formula (8) from the time points t within the time period T and the corresponding maintenance workloads y_t of the open source software;
S4: According to the obtained formula (8), once the available indexes p_i are known, the maintenance workload y of the open source software can be predicted.
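Steps S3 and S4 can be illustrated with a short Python sketch that fits formula (8) by ordinary least squares and then uses it for prediction. It assumes that, for each time point t, the values of the available indexes are recorded together with the workload y_t; the claim does not spell out this pairing, so it is an assumption of the sketch. The function names `solve_linear_system`, `fit_workload_model`, and `predict_workload`, and the sample data, are hypothetical.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("indexes are linearly dependent; cannot fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_workload_model(
    index_rows: Sequence[Sequence[float]], workloads: Sequence[float]
) -> Tuple[List[float], float]:
    """Least-squares estimate of the coefficients A_1..A_k and E in formula (8)."""
    rows = [list(r) + [1.0] for r in index_rows]  # trailing 1.0 carries the constant E
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, workloads)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return beta[:-1], beta[-1]


def predict_workload(coeffs: Sequence[float], constant: float,
                     index_values: Sequence[float]) -> float:
    """Step S4: evaluate formula (8) for known available index values."""
    return sum(a * p for a, p in zip(coeffs, index_values)) + constant


# Hypothetical observations: one row of available index values per time point t,
# generated here from y = 2*p1 + 3*p2 + 5 so the fit can be checked by hand.
observed_indexes = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (1.0, 3.0)]
observed_workloads = [7.0, 8.0, 10.0, 12.0, 16.0]
coeffs, constant = fit_workload_model(observed_indexes, observed_workloads)
```

Solving the normal equations directly keeps the sketch free of third-party dependencies; for real data with many indexes, a numerically more robust least-squares routine from a numerical library would be the usual choice.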
CN201510218321.3A 2015-04-30 2015-04-30 A kind of method by code quality assessment prediction open source software maintenance workload Active CN104809066B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510218321.3A CN104809066B (en) 2015-04-30 2015-04-30 A kind of method by code quality assessment prediction open source software maintenance workload

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510218321.3A CN104809066B (en) 2015-04-30 2015-04-30 A kind of method by code quality assessment prediction open source software maintenance workload

Publications (2)

Publication Number Publication Date
CN104809066A CN104809066A (en) 2015-07-29
CN104809066B true CN104809066B (en) 2017-08-25

Family

ID=53693908

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510218321.3A Active CN104809066B (en) 2015-04-30 2015-04-30 A kind of method by code quality assessment prediction open source software maintenance workload

Country Status (1)

Country Link
CN (1) CN104809066B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109472453B (en) * 2018-10-12 2021-09-21 山大地纬软件股份有限公司 Power consumer credit evaluation method based on global optimal fuzzy kernel clustering model
CN109460908A (en) * 2018-10-29 2019-03-12 成都安美勤信息技术股份有限公司 The construction cost assessment method of soft project
CN110033242B (en) * 2019-04-23 2023-11-28 软通智慧科技有限公司 Working time determining method, device, equipment and medium
CN110286938B (en) * 2019-07-03 2023-03-31 北京百度网讯科技有限公司 Method and apparatus for outputting evaluation information for user
CN112799712B (en) * 2021-01-29 2024-02-02 中国工商银行股份有限公司 Maintenance workload determination method, device, equipment and medium
CN113590486A (en) * 2021-02-23 2021-11-02 中国人民解放军军事科学院国防科技创新研究院 Open source software code quality evaluation method based on measurement
CN112800167B (en) * 2021-04-13 2021-06-29 北京星天科技有限公司 Method and device for evaluating workload of digital chart drawing
CN113282299B (en) * 2021-06-15 2024-06-07 中国农业银行股份有限公司 Information processing method, device, equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005267489A (en) * 2004-03-22 2005-09-29 Toshiba Corp Method and apparatus for measuring degree of software quality
CN101710304A (en) * 2009-11-27 2010-05-19 中国科学院软件研究所 Method for evaluating implementation quality of software process
CN104574209B (en) * 2015-01-07 2018-03-16 国家电网公司 The modeling method of Medium Early Warning model is overloaded in a kind of city net distribution transforming again

Also Published As

Publication number Publication date
CN104809066A (en) 2015-07-29

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20180402

Address after: 400000 Chongqing city Shapingba District Jingyuan Road No. 8, No. 6-6 of 15

Patentee after: Chongqing michiro science and Technology Co., Ltd.

Address before: 400044 Shapingba District Sha Street, No. 174, Chongqing

Patentee before: Chongqing University