CN107515822A - Software defect localization method based on multi-objective optimization - Google Patents

Publication number
CN107515822A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710700316.5A
Other languages
Chinese (zh)
Other versions
CN107515822B (en)
Inventor
吴芳芳
顾庆
陈道蓄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/36: Preventing errors by testing or debugging software
    • G06F11/3604: Software analysis for verifying properties of programs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses a software defect localization method based on multi-objective optimization, comprising: collecting the code files, bug reports, and developer information of a software project; extracting keywords from the code files and bug reports, and computing a text similarity function between each code file and a bug report based on the bag-of-words model; computing a structure complexity function for each code file from its structural feature metrics; computing a developer unfamiliarity function from the developer information of each code file; and, for a given bug report, ranking the code files with a two-stage sorting method based on multi-objective optimization over the similarity, structure complexity, and developer unfamiliarity functions, and outputting the code files with high defect suspiciousness. The method is computationally simple and highly extensible, locates defective code quickly and effectively, applies to different types of code files, and suits the development and maintenance of large-scale software products.

Description

Software defect localization method based on multi-objective optimization
Technical field
The present invention relates to the field of software engineering, and in particular to a software defect localization method based on multi-objective optimization.
Background
Software defects are inevitable during software development and maintenance, and as software grows in scale the number of defects grows with it. For example, in 2013 alone, 3389 bug reports were filed for the Eclipse Platform. The main goals of software debugging are to discover, locate, understand, and remove defects, and development practice within software enterprises shows that defect localization is a difficult, time-consuming, and labor-intensive key activity in the debugging process.
Debugging code manually to locate defects is often costly and inefficient; moreover, manual debugging usually depends on the personal experience of developers or maintainers and is not reusable. To locate defects efficiently, researchers have proposed many automated debugging methods. Depending on whether test cases are executed, automated defect localization methods fall into two categories, dynamic and static. Dynamic methods analyze the internal structure of the program under test, collect the execution traces and results of test cases, and determine the location of defective code based on a particular model. Static methods focus on analyzing code files and the internal structure of the program, such as control dependencies and data dependencies, extract features from them, score the code files using machine learning or similar techniques, and output a list of code files with high defect suspiciousness.
Among existing static defect localization methods, computing the text similarity between code files and bug reports with information retrieval techniques is a mainstream approach. For example, BugLocator uses the vector space model to compute the text similarity between the current bug report and both code files and historical bug reports; the BLUiR tool uses structured information retrieval to assign different weights to class names, method names, and other elements of code files. However, existing methods require historical data and train weight parameters through supervised learning, so their computational complexity is high and they are ill-suited to the real-time development and maintenance of large-scale software.
In summary, prior-art defect localization methods have high computational complexity and are impractical for large-scale software development and maintenance.
Summary of the invention
The invention provides a software defect localization method based on multi-objective optimization which, compared with the prior art, is computationally simple, improves debugging efficiency, and saves manpower and time.
The software defect localization method based on multi-objective optimization comprises:
S1: collect the code files, bug reports, and developer experience information of the software under test. For an object-oriented language, a code file is a class file; for a procedural language, it is a single source file. A bug report contains the data of the software defect to be located. The document author data recorded during software project development includes the developer experience information.
S2: apply the keyword extraction method to the code files and the bug report to obtain the code file keywords and the bug report keywords.
S3: use the bag-of-words model to construct the code file S and the bug report R from the code file keywords and the bug report keywords, and define the similarity function from S and R.
S4: obtain the structure complexity function of the code file S from its structural feature metrics, using the structure complexity function generation algorithm.
S5: compute the developer unfamiliarity function from the developer experience information.
S6: rank the code files S with the two-stage sorting method based on multi-objective optimization, using the similarity function, the structure complexity function, and the developer unfamiliarity function, to obtain the complete code file sequence.
S7: mark the top-k code files in the complete code file sequence, i.e. the first k positions, as code files with high defect suspiciousness, where k is a positive integer that can be adjusted manually according to the severity and complexity of the software defect.
Further, the keyword extraction method comprises:
S21: parse the code file and the bug report into a set of unordered identifiers.
S22: filter out punctuation marks, operands, operators, and programming-language reserved words from the set to obtain the filtered set.
S23: split each compound identifier into individual words at its capital letters.
S24: stem the English words in the filtered set, yielding the keyword sets of the code file and the bug report.
Further, S3 comprises:
S31: construct a corpus V from the code file keywords and the bug report keywords.
S32: apply the bag-of-words model over the corpus V to represent the code file and the bug report as the vectors S and R:
$$S = (w_1^S, w_2^S, \dots, w_{|V|}^S), \qquad R = (w_1^R, w_2^R, \dots, w_{|V|}^R)$$
The code file S and the bug report R are one-dimensional vectors, where $w_t^S$ and $w_t^R$ are, for keyword t in S and R respectively, the product of the term frequency $tf_t^d$ and the inverse document frequency $idf_t$:
$$w_t^d = tf_t^d \times idf_t \qquad (1)$$
$$tf_t^d = \log(n_t^d) + 1$$
$$idf_t = \log\left(\frac{N}{N_t}\right)$$
where $n_t^d$ is the number of occurrences of keyword t in document d, d is either the code file S or the bug report R, N is the total number of documents, and $N_t$ is the number of documents containing keyword t.
S33: define the similarity function from the code file S and the bug report R:
$$Sim_R(S) = \frac{R^T S}{\|R\| \cdot \|S\|} \qquad (2)$$
where $R^T S$ is the inner product of R and S, T denotes the vector transpose, and $\|R\|$ and $\|S\|$ are the norms of R and S, i.e. the square root of the sum of the squares of all elements.
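Formulas (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `tfidf_vectors` and `cosine_similarity` are hypothetical, the corpus is assumed to consist of the bug report plus all code files (so N counts both), and a keyword absent from a document is assumed to receive weight 0.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs maps a document name to its keyword list.

    Returns (vocabulary, {name: weight vector}) using
    tf = log(n) + 1 and idf = log(N / N_t) as in formula (1).
    """
    vocab = sorted({t for keywords in docs.values() for t in keywords})
    n_docs = len(docs)  # N: total number of documents (assumed: report + code files)
    # N_t: number of documents that contain keyword t
    doc_freq = Counter(t for keywords in docs.values() for t in set(keywords))
    vectors = {}
    for name, keywords in docs.items():
        counts = Counter(keywords)
        vec = []
        for t in vocab:
            if counts[t] == 0:
                vec.append(0.0)  # assumption: absent keywords weigh 0
            else:
                tf = math.log(counts[t]) + 1
                idf = math.log(n_docs / doc_freq[t])
                vec.append(tf * idf)
        vectors[name] = vec
    return vocab, vectors


def cosine_similarity(r, s):
    """Formula (2): inner product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(r, s))
    norm = math.sqrt(sum(x * x for x in r)) * math.sqrt(sum(y * y for y in s))
    return dot / norm if norm else 0.0
```

Note that under this idf definition a keyword that occurs in every document receives weight 0, so it contributes nothing to the similarity.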
Further, the structural feature metrics comprise:
Lines of code (LOC, Line of Code): the total number of lines of executable statements in the code file;
Maximum McCabe cyclomatic complexity (MAX_CC, Max McCabe complexity): the maximum McCabe cyclomatic complexity over the methods/functions in the code file;
Number of code revisions (NOC, Number of Correct): the number of times the code file has been modified;
Number of depended-on files (DFN, Depended File Number): the number of other code files that the code file depends on;
Non-commented lines of code (NLC, Noncommented Lines of Code): the total number of lines in the code file minus the number of comment lines.
Further, the structure complexity function generation algorithm comprises:
SS1: represent each code file S as a structural feature metric vector $S_a = \{a_1, a_2, a_3, a_4, a_5\}$, where each $a_i$ is a feature metric.
SS2: unify the scales of the feature metrics with a normalization formula to obtain the normalized feature metrics:
$$a_i' = \begin{cases} 0, & \text{if } a_i = a_{\min} \\ \dfrac{a_i - a_{\min}}{a_{\max} - a_{\min}}, & \text{if } a_{\min} < a_i < a_{\max} \\ 1, & \text{if } a_i = a_{\max} \end{cases} \qquad (3)$$
where $a_{\min}$ and $a_{\max}$ are the minimum and maximum of the feature metric, and i = 1, 2, 3, 4, 5.
SS3: define the structure complexity function of the code file from the normalized feature metrics $a_i'$, i = 1, 2, 3, 4, 5.
Further, S5 comprises:
S51: map the cumulative time $Y_{exp}$ that a developer has spent on development work to the developer experience index EXP.
S52: define the developer unfamiliarity function from the developer experience index EXP, where $S_P$ denotes the set of developers of the code file S, $EXP_i$ denotes the experience index of developer i, and i is a positive integer.
Further, the mapping from the cumulative development time $Y_{exp}$ (in years) to the developer experience index EXP is:
$Y_{exp} < 0.5$, EXP = 1;
$0.5 \le Y_{exp} < 1$, EXP = 2;
$1 \le Y_{exp} < 3$, EXP = 3;
$Y_{exp} \ge 3$, EXP = 5.
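The mapping above is a simple step function. A minimal Python sketch follows; the function name `experience_index` and the rejection of negative inputs are assumptions made for illustration, while the thresholds come directly from the mapping.

```python
def experience_index(years: float) -> int:
    """Map cumulative development time Y_exp (in years) to the index EXP."""
    if years < 0:
        # Assumption: negative experience is treated as invalid input.
        raise ValueError("cumulative development time cannot be negative")
    if years < 0.5:
        return 1
    if years < 1:
        return 2
    if years < 3:
        return 3
    return 5
```

Each lower bound is inclusive, so a developer with exactly 0.5, 1, or 3 years of experience falls into the higher bracket.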
Further, S6 comprises:
S61: from the similarity function $Sim_R(S)$, the structure complexity function $Comp(S)$, and the developer unfamiliarity function $Rusd(S)$, define a multi-objective optimization problem in which R is the bug report, S is a code file, the components $y_1$, $y_2$, and $y_3$ of the triple $(y_1, y_2, y_3)$ are the values of the similarity function, the structure complexity function, and the developer unfamiliarity function respectively, Γ is the set of code files, and Y is the solution set of the three functions.
S62: apply the layer-based fast non-dominated sorting method to the code file set Γ, partitioning it into non-dominated layers $F_l$, l = 1, 2, ..., m. The sequence of layers is the first-stage ordering, where m is the number of non-dominated layers; the smaller l is, the higher the defect suspiciousness of the code files in $F_l$.
S63: within each non-dominated layer $F_l$ (l = 1, 2, ..., m), sort the code files a second time in descending order of the similarity function $Sim_R(S)$ to obtain the second-stage ordering.
S64: concatenate the first-stage and second-stage orderings to obtain the complete code file sequence.
S62 comprises:
S621: for code files $S_i \in \Gamma$, $S_j \in \Gamma$, $i \ne j$, if $Sim_R(S_i) > Sim_R(S_j)$, $Comp(S_i) > Comp(S_j)$, and $Rusd(S_i) > Rusd(S_j)$, then $S_i$ dominates $S_j$, denoted $S_i \succ S_j$, where R is the bug report, Γ is the code file set, $Sim_R(\cdot)$ is the text similarity function, $Comp(\cdot)$ is the structure complexity function, and $Rusd(\cdot)$ is the developer unfamiliarity function.
S622: for each code file $S_i$, compute its dominated set $D_i$ and its domination counter $n_i$. For each code file $S_j \in \Gamma$: if $S_i \succ S_j$, then $D_i = D_i \cup \{S_j\}$; if $S_j \succ S_i$, then $n_i$ is incremented by 1; otherwise neither file dominates the other, and $D_i$ and $n_i$ remain unchanged.
S623: generate the first non-dominated layer $F_1$, which contains every code file $S_i$ whose domination counter $n_i = 0$, i = 1, 2, 3, ..., $|F_1|$, where $|F_1|$ is the cardinality of $F_1$.
S624: starting from $F_1$, iteratively compute the non-dominated layers $F_l$ (l = 2, ..., m), where m is the number of layers generated. $F_{l+1}$ is computed from $F_l$: for each $S_i \in F_l$ and each $S_j \in D_i$, decrement the domination counter $n_j$ by 1; if $n_j = 0$, then $F_{l+1} = F_{l+1} \cup \{S_j\}$, where i = 1, 2, 3, ..., $|F_l|$, j = 1, 2, 3, ..., $|D_i|$, $|F_l|$ is the cardinality of $F_l$, and $|D_i|$ is the cardinality of the dominated set $D_i$.
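Steps S621 to S624 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `dominates` and `fast_non_dominated_sort` are hypothetical, and each code file is assumed to be represented by a precomputed triple (similarity, complexity, unfamiliarity). The dominance test follows S621 as written, i.e. strictly greater on all three objectives, which is stricter than the usual Pareto definition.

```python
def dominates(a, b):
    """S621: a dominates b when it is strictly greater on every objective."""
    return all(x > y for x, y in zip(a, b))


def fast_non_dominated_sort(objectives):
    """objectives maps a file name to its (Sim, Comp, Rusd) triple.

    Returns the non-dominated layers [F_1, F_2, ..., F_m].
    """
    dominated = {f: [] for f in objectives}  # D_i: files that f dominates
    counter = {f: 0 for f in objectives}     # n_i: files that dominate f

    # S622: fill D_i and n_i by pairwise comparison.
    for i, obj_i in objectives.items():
        for j, obj_j in objectives.items():
            if i == j:
                continue
            if dominates(obj_i, obj_j):
                dominated[i].append(j)
            elif dominates(obj_j, obj_i):
                counter[i] += 1

    # S623: the first layer holds every file that nothing dominates.
    layers = [[f for f in objectives if counter[f] == 0]]

    # S624: peel off one layer at a time.
    while layers[-1]:
        next_layer = []
        for i in layers[-1]:
            for j in dominated[i]:
                counter[j] -= 1
                if counter[j] == 0:
                    next_layer.append(j)
        layers.append(next_layer)
    layers.pop()  # drop the trailing empty layer
    return layers
```

The pairwise comparison in S622 costs on the order of $|\Gamma|^2$ dominance checks, which is the dominant cost of this first stage.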
Further, k = 10.
The beneficial effects of the invention are as follows: by jointly considering the text similarity between bug reports and code files, the structural features of code files, and developer experience data, and by ranking code files with a two-stage sorting method based on multi-objective optimization, the method outputs the code files with high defect suspiciousness. This simplifies computation, improves the generality and extensibility of the localization method, locates defective code effectively, and improves defect repair efficiency.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is the overall framework of the software defect localization method based on multi-objective optimization;
Fig. 2 is a schematic diagram of the two-stage sorting method based on multi-objective optimization;
Fig. 3 is a flowchart of sorting code files with the fast non-dominated sorting method.
Detailed description of the embodiments
To help those skilled in the art better understand the technical solution of the invention, the invention is described in further detail below with reference to an embodiment.
An embodiment of the invention provides a software defect localization method based on multi-objective optimization which, as shown in Fig. 1, comprises:
S1: collect the code files, bug reports, and developer experience information of the software under test. For an object-oriented language, a code file is a class file; for a procedural language, it is a single source file. A bug report contains the data of the software defect to be located. The document author data recorded during software project development includes the developer experience information.
S2: apply the keyword extraction method to the code files and the bug report to obtain the code file keywords and the bug report keywords.
S3: use the bag-of-words model to construct the code file S and the bug report R from the code file keywords and the bug report keywords, and define the similarity function from S and R.
S4: obtain the structure complexity function of the code file S from its structural feature metrics, using the structure complexity function generation algorithm.
S5: compute the developer unfamiliarity function from the developer experience information.
S6: rank the code files S with the two-stage sorting method based on multi-objective optimization, using the similarity function, the structure complexity function, and the developer unfamiliarity function, to obtain the complete code file sequence.
S7: mark the top-k code files in the complete code file sequence as code files with high defect suspiciousness. In this embodiment k = 10; k can be adjusted manually according to the severity and complexity of the software defect.
In the above steps, the keyword extraction method comprises:
S21: parse the code file and the bug report into a set of unordered identifiers.
S22: filter out punctuation marks, operands, operators, and programming-language reserved words from the set to obtain the filtered set.
S23: split each compound identifier into individual words at its capital letters.
S24: stem the English words in the filtered set. For example, the words "delegating", "delegate", and "delegation" occurring in a document are all reduced to their root form "delegat". This yields the keyword sets of the code file and the bug report.
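Steps S21 to S24 can be sketched in Python as below. This is a minimal sketch under stated assumptions: the names `RESERVED_WORDS`, `split_identifier`, `stem`, and `extract_keywords` are hypothetical, the reserved-word list is a small illustrative subset for a Java-like language, and `stem` is a toy suffix stripper standing in for a real stemmer such as Porter's (the text does not name the stemming algorithm). The result is a list rather than a set so that term frequencies remain available to the bag-of-words step.

```python
import re

# Assumption: a small illustrative subset of reserved words, not a full list.
RESERVED_WORDS = {"public", "private", "class", "void", "return",
                  "if", "else", "new", "int", "static", "final"}


def split_identifier(identifier):
    """S23: split a compound identifier at capital letters."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", identifier)


def stem(word):
    """S24 (toy version): strip one common English suffix."""
    for suffix in ("ing", "ion", "es", "ed", "e", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_keywords(text):
    # S21: keep identifier-like tokens; punctuation, operators, and
    # numeric literals never match the pattern (part of S22).
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text)
    # S22: drop programming-language reserved words.
    tokens = [t for t in tokens if t not in RESERVED_WORDS]
    keywords = []
    for token in tokens:
        for word in split_identifier(token):
            keywords.append(stem(word.lower()))
    return keywords
```

With this sketch, "delegating", "delegate", and "delegation" all reduce to "delegat", matching the example in S24.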
S3 comprises:
S31: construct a corpus V from the code file keywords and the bug report keywords.
S32: apply the bag-of-words model over the corpus V to represent the code file and the bug report as the vectors S and R:
$$S = (w_1^S, w_2^S, \dots, w_{|V|}^S), \qquad R = (w_1^R, w_2^R, \dots, w_{|V|}^R)$$
The code file S and the bug report R are one-dimensional vectors, where $w_t^S$ and $w_t^R$ are, for keyword t in S and R respectively, the product of the term frequency $tf_t^d$ and the inverse document frequency $idf_t$:
$$w_t^d = tf_t^d \times idf_t \qquad (1)$$
$$tf_t^d = \log(n_t^d) + 1$$
$$idf_t = \log\left(\frac{N}{N_t}\right)$$
where $n_t^d$ is the number of occurrences of keyword t in document d, d is either the code file S or the bug report R, N is the total number of documents, and $N_t$ is the number of documents containing keyword t.
S33: define the similarity function from the code file S and the bug report R:
$$Sim_R(S) = \frac{R^T S}{\|R\| \cdot \|S\|} \qquad (2)$$
where $R^T S$ is the inner product of R and S, T denotes the vector transpose, and $\|R\|$ and $\|S\|$ are the norms of R and S, i.e. the square root of the sum of the squares of all elements.
In S4, the structure complexity function of a code file is computed from its structural feature metrics. Code complexity is reflected mainly in aspects such as the number of lines of code and the number of functions called; the more complex the structure of the code, the harder it is for developers to keep under control, and the more likely the code is to contain errors. To compute the complexity of a code file, five code structure feature metrics are defined, as shown in Table 1:
Table 1. Structural metrics of a code file
Metric    Description
LOC       Lines of code: total number of lines of executable statements in the code file
MAX_CC    Maximum McCabe cyclomatic complexity over the methods/functions in the code file
NOC       Number of times the code file has been modified
DFN       Number of other code files that the code file depends on
NLC       Non-commented lines of code: total lines minus comment lines
These structural feature metrics make up the complexity index of a code file, and each code file S can be represented as a structural feature metric vector $S_a = \{a_1, a_2, a_3, a_4, a_5\}$, where each $a_i$ is a feature metric. Because the metrics are measured on different scales, they are unified with a normalization formula:
$$a_i' = \begin{cases} 0, & \text{if } a_i = a_{\min} \\ \dfrac{a_i - a_{\min}}{a_{\max} - a_{\min}}, & \text{if } a_{\min} < a_i < a_{\max} \\ 1, & \text{if } a_i = a_{\max} \end{cases} \qquad (3)$$
where i is the index of the feature metric, and $a_{\min}$ and $a_{\max}$ are the minimum and maximum of the corresponding feature metric. The structure complexity function of the code file is then defined from these normalized feature metrics.
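The normalization step can be sketched as follows. This is a minimal illustration under stated assumptions: the name `normalize_metrics` is hypothetical, and $a_{\min}$ and $a_{\max}$ are assumed to be taken per metric across all code files of the project. When a metric has the same value in every file, the sketch returns 0, which corresponds to the $a_i = a_{\min}$ branch. How the normalized metrics are combined into the complexity function is not reproduced here.

```python
def normalize_metrics(files):
    """files maps a file name to its raw metrics [LOC, MAX_CC, NOC, DFN, NLC].

    Returns {name: normalized metrics}, each value scaled to [0, 1]
    with min-max normalization applied per metric.
    """
    if not files:
        return {}
    names = list(files)
    n_metrics = len(files[names[0]])
    normalized = {name: [] for name in names}
    for i in range(n_metrics):
        column = [files[name][i] for name in names]
        lo, hi = min(column), max(column)  # a_min and a_max for metric i
        for name in names:
            if hi == lo:
                # Constant metric: every value equals a_min, so it maps to 0.
                normalized[name].append(0.0)
            else:
                normalized[name].append((files[name][i] - lo) / (hi - lo))
    return normalized
```

For example, with LOC values of 100, 200, and 300 across three files, the normalized values are 0, 0.5, and 1.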
In S5, the developer unfamiliarity function is computed from the developer information of each code file. In general, code written by experienced programmers has clear logic and good formatting and is easy to read, whereas novice programmers often struggle to write well-styled code, and their code is more prone to errors. Development practice in the software industry shows that most software development involves multiple people whose experience levels differ, so the quality of the code files they write is uneven. A developer experience index EXP is therefore defined as follows, where $Y_{exp}$ is the cumulative time spent on development work:
Development experience $Y_{exp}$ (years)    EXP value
$Y_{exp} < 0.5$                             1
$0.5 \le Y_{exp} < 1$                       2
$1 \le Y_{exp} < 3$                         3
$Y_{exp} \ge 3$                             5
Table 2. Developer experience index
The defect proneness of a code file is measured by the experience index EXP of its developers: the higher the experience value, the lower the defect proneness, and vice versa, so the two are inversely related. The developer unfamiliarity function of the code file S is defined on this basis, where $S_P$ denotes the set of developers of the code file S and $EXP_i$ denotes the experience index of developer i.
In S6, for a given bug report, the code files are ranked with the two-stage sorting method based on multi-objective optimization, using the similarity function, the structure complexity function, and the developer unfamiliarity function, and the code files with high defect suspiciousness are output. First, defective code file localization is converted into a multi-objective optimization problem: the larger the values that a code file S yields for the similarity function, the structure complexity function, and the developer unfamiliarity function, the more likely S is to contain the defect, and the more developers should focus on and inspect it. The problem of finding the code files S that meet this requirement can therefore be expressed as a multi-objective optimization problem with one decision variable and three objective variables.
In this problem, the components $y_1$, $y_2$, and $y_3$ of the triple $(y_1, y_2, y_3)$ are the values of the similarity function, the structure complexity function, and the developer unfamiliarity function respectively, and Γ is the set of all code files in the software project, i.e. the decision variable set of the multi-objective optimization problem.
The first-stage ordering of the code files is then produced with the layer-based fast non-dominated sorting method (fast-non-dominated-sort), whose processing steps are as follows.
First, the dominance relation between code files is defined on the decision variable set: for $S_i \in \Gamma$, $S_j \in \Gamma$, and $i \ne j$, if $Sim_R(S_i) > Sim_R(S_j)$, $Comp(S_i) > Comp(S_j)$, and $Rusd(S_i) > Rusd(S_j)$, then $S_i$ dominates $S_j$, denoted $S_i \succ S_j$. Here R is the bug report, Γ is the code file set, $Sim_R(\cdot)$ is the text similarity function, $Comp(\cdot)$ is the structure complexity function, and $Rusd(\cdot)$ is the developer unfamiliarity function.
Second, the first-stage ordering is completed by following the steps of the fast non-dominated sorting method. Its input is the code file set Γ. The dominance relations between the code files in Γ are determined from the similarity function, the structure complexity function, and the developer unfamiliarity function, and the code files are partitioned into non-dominated layers $F_l$ (l = 1, 2, ..., m), where $F_l$ is the set of code files $S_i$ at non-dominated level l and m is the number of non-dominated layers. The output of the first-stage ordering is therefore $\{F_1, F_2, \dots, F_m\}$.
The fast non-dominated sorting of code files has two parts. The first part computes, for each code file $S_i$ (i = 1, 2, ..., $|\Gamma|$), its dominated set $D_i$ and its domination counter $n_i$, and builds $F_1$; $D_i$ is the set such that $S_i \succ S_j$ holds for every $S_j \in D_i$. The second part uses $D_i$ and updates to $n_i$ to assign the remaining code files to their non-dominated layers $F_l$ (l = 2, ..., m).
For a code file $S_i$, $D_i$ and $n_i$ are computed as follows. For each $S_j \in \Gamma$: a) if $S_i \succ S_j$, then $D_i = D_i \cup \{S_j\}$; b) if $S_j \succ S_i$, then $n_i$ is incremented by 1; c) otherwise, $D_i$ and $n_i$ remain unchanged. If $n_i = 0$, then $S_i$ belongs to the first non-dominated layer: $F_1 = F_1 \cup \{S_i\}$.
The second part starts from $F_1$ and iterates over $F_l$ (l = 1, 2, ..., m-1), updating the domination counters so that each code file is assigned to its non-dominated layer $F_l$ (l = 2, ..., m). In each iteration, for l = 1, ..., m-1: for each $S_i \in F_l$ and each $S_j \in D_i$, decrement $n_j$ by 1; if $n_j = 0$, then $S_j$ belongs to the next layer: $F_{l+1} = F_{l+1} \cup \{S_j\}$.
This layering algorithm partitions the code files of the project into the non-dominated layers $\{F_1, F_2, \dots, F_m\}$. The code files in $F_1$ form the Pareto optimal solution set of the multi-objective optimization problem, those in $F_2$ come next, and so on, with defect suspiciousness decreasing from each layer to the next. However, the code files within a layer $F_l$ (l = 1, 2, ..., m) are not comparable with one another, i.e. there is no ordering of defect suspiciousness among them. To solve this problem, the second stage sorts the code files inside each $F_l$ (l = 1, 2, ..., m) a second time by their text similarity $Sim_R(S)$ to the bug report. Because the number of code files within a layer is limited, a simple algorithm such as bubble sort can be used in the second stage to order the code files inside each non-dominated layer by $Sim_R(S)$.
After the two sorting stages, the complete code file sequence is obtained, and the top-k code files of the sorted sequence are output as the code files with high defect suspiciousness, where k = 10 and k can be adjusted manually according to the severity and complexity of the software defect. Developers should focus on and inspect the code files with high defect suspiciousness.
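The second stage and the top-k selection can be sketched as follows. This is a minimal illustration, not the patented implementation: the name `rank_code_files` is hypothetical, the layers are assumed to come from a first-stage non-dominated sort, and Python's built-in `sorted` is used in place of the bubble sort mentioned in the text (any comparison sort yields the same order up to ties).

```python
def rank_code_files(layers, similarity, k=10):
    """Concatenate the layers in order, sorting inside each layer.

    layers: [F_1, F_2, ..., F_m], each a list of file names.
    similarity: {file name: Sim_R(S)} for the given bug report.
    Returns (complete ranking, top-k files).
    """
    ranking = []
    for layer in layers:
        # Second stage: descending text similarity within one layer.
        ranking.extend(sorted(layer, key=lambda f: similarity[f], reverse=True))
    return ranking, ranking[:k]
```

Because layers are concatenated in order, a file in $F_2$ never outranks a file in $F_1$, even if its similarity to the bug report is higher.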
The invention makes full use of the bug reports, code file information, and developer information available during software development and maintenance. It defines feature indices at the file level and objective functions from three different aspects, and works as an unsupervised method: it first performs the first-stage ordering of the code files with the layer-based fast non-dominated sorting method (fast-non-dominated-sort), then performs the second-stage ordering of the code files inside each non-dominated layer by the text similarity function, and outputs the code with high defect suspiciousness.
In summary, the invention is computationally simple, general, and extensible; it locates defective code quickly and effectively, applies to different types of code files, programming languages, and platforms, and suits the development and maintenance of large-scale software products.
The above is only a specific embodiment of the invention, but the scope of protection of the invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be defined by the claims.

Claims (9)

1. A software defect localization method based on multi-objective optimization, characterized by comprising:
S1: collecting the code files, bug reports, and developer experience information of the software under test;
S2: applying a keyword extraction method to the code files and the bug report to obtain code file keywords and bug report keywords;
S3: using the bag-of-words model to construct the code file S and the bug report R from the code file keywords and the bug report keywords, and defining a similarity function from the code file S and the bug report R;
S4: obtaining the structure complexity function of the code file S from its structural feature metrics, using a structure complexity function generation algorithm;
S5: computing a developer unfamiliarity function from the developer experience information;
S6: ranking the code files S with a two-stage sorting method based on multi-objective optimization, using the similarity function, the structure complexity function, and the developer unfamiliarity function, to obtain a complete code file sequence;
S7: marking the first k code files in the complete code file sequence as code files with high defect suspiciousness, where k is a positive integer.
2. The software defect localization method based on multi-objective optimization according to claim 1, characterized in that the keyword extraction method comprises:
S21: parsing the code file and the bug report into a set of unordered identifiers;
S22: filtering out punctuation marks, operands, operators, and programming-language reserved words from the set to obtain the filtered set;
S23: splitting each compound identifier into individual words at its capital letters;
S24: stemming the English words in the filtered set, yielding the keyword sets of the code file and the bug report.
3. the software defect positioning method according to claim 1 based on multiple-objection optimization, it is characterised in that the S3 bags Include:
S31, corpus V constructed according to the code file keyword and the BUG files keyword;
The code file and BUG reports are expressed as code file S by S32, the corpus V applications bag of words R is reported with BUG,The code file S and BUG reports R is one-dimensional vector collection Close, wherein,WithFor the keyword t word frequency in the code file S and the BUG report R respectivelyWith reverse file Frequency idftProduct, calculation formula is as follows:
<mrow> <msubsup> <mi>w</mi> <mi>t</mi> <mi>d</mi> </msubsup> <mo>=</mo> <msubsup> <mi>tf</mi> <mi>t</mi> <mi>d</mi> </msubsup> <mo>&amp;times;</mo> <msub> <mi>idf</mi> <mi>t</mi> </msub> <mo>-</mo> <mo>-</mo> <mo>-</mo> <mrow> <mo>(</mo> <mn>1</mn> <mo>)</mo> </mrow> </mrow>
<mrow> <msubsup> <mi>tf</mi> <mi>t</mi> <mi>d</mi> </msubsup> <mo>=</mo> <mi>l</mi> <mi>o</mi> <mi>g</mi> <mrow> <mo>(</mo> <msubsup> <mi>n</mi> <mi>t</mi> <mi>d</mi> </msubsup> <mo>)</mo> </mrow> <mo>+</mo> <mn>1</mn> </mrow>
<mrow> <msub> <mi>idf</mi> <mi>t</mi> </msub> <mo>=</mo> <mi>log</mi> <mrow> <mo>(</mo> <mfrac> <mi>N</mi> <msub> <mi>N</mi> <mi>t</mi> </msub> </mfrac> <mo>)</mo> </mrow> </mrow>
Wherein,The number occurred for keyword t in file d, d are that the code file S or described BUG report that R, N are represented File d total number, NtIt is the number of files for including keyword t;
S33, according to the code file S and the BUG report R define the similarity function, the similarity function is:
<mrow> <msub> <mi>Sim</mi> <mi>R</mi> </msub> <mrow> <mo>(</mo> <mi>S</mi> <mo>)</mo> </mrow> <mo>=</mo> <mfrac> <mrow> <msup> <mi>R</mi> <mi>T</mi> </msup> <mi>S</mi> </mrow> <mrow> <mo>|</mo> <mo>|</mo> <mi>R</mi> <mo>|</mo> <mo>|</mo> <mo>&amp;CenterDot;</mo> <mo>|</mo> <mo>|</mo> <mi>S</mi> <mo>|</mo> <mo>|</mo> </mrow> </mfrac> <mo>-</mo> <mo>-</mo> <mo>-</mo> <mrow> <mo>(</mo> <mn>2</mn> <mo>)</mo> </mrow> </mrow>
where $R^T S$ is the inner product of R and S, T denotes vector transposition, and $\|R\|$ and $\|S\|$ are the norms of R and S respectively, i.e. the square root of the sum of the squares of all elements.
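Formulas (1) and (2) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the sample keyword lists, and the file names are hypothetical. Two assumptions are made that the claim does not state: the BUG report is counted as one of the N files, and a keyword that does not occur in a file receives weight 0 (since $\log(n_t^d)$ is undefined for $n_t^d = 0$).

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs maps a file name to its keyword list; returns (vocabulary, vectors)."""
    vocab = sorted({t for tokens in docs.values() for t in tokens})
    n_docs = len(docs)  # N: total number of files (assumption: BUG report included)
    # N_t: number of files that contain keyword t
    doc_freq = Counter(t for tokens in docs.values() for t in set(tokens))
    vectors = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        vec = []
        for t in vocab:
            n = counts.get(t, 0)
            # tf = log(n) + 1 for n >= 1; weight 0 for absent keywords (assumption)
            tf = math.log(n) + 1 if n > 0 else 0.0
            idf = math.log(n_docs / doc_freq[t])
            vec.append(tf * idf)  # formula (1)
        vectors[name] = vec
    return vocab, vectors


def cosine_similarity(r, s):
    """Formula (2): inner product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(r, s))
    norm_r = math.sqrt(sum(x * x for x in r))
    norm_s = math.sqrt(sum(y * y for y in s))
    if norm_r == 0 or norm_s == 0:
        return 0.0  # guard against all-zero vectors (assumption)
    return dot / (norm_r * norm_s)


# Hypothetical keyword lists for one BUG report and two code files.
docs = {
    "bug_report": ["null", "pointer", "parser"],
    "Parser.java": ["parser", "token", "null", "null"],
    "Logger.java": ["log", "file"],
}
vocab, vectors = tfidf_vectors(docs)
for name in ("Parser.java", "Logger.java"):
    print(name, cosine_similarity(vectors["bug_report"], vectors[name]))
```

A code file that shares no keyword with the BUG report has an inner product of zero and therefore a similarity of zero.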
4. The software defect positioning method based on multi-objective optimization according to claim 1, characterized in that the structural feature metrics include: lines of code, maximum McCabe cyclomatic complexity, number of code revisions, number of files depended on, and non-comment lines of code.
5. The software defect positioning method based on multi-objective optimization according to claim 1, characterized in that the algorithm for generating the structure complexity function comprises:
SS1: representing each code file by the structural feature vector $S_a = \{a_1, a_2, a_3, a_4, a_5\}$, where each $a_i$ is a feature metric;
SS2: unifying the scales of the feature metrics with a normalization formula to obtain the normalized feature metrics, as follows:
$$a_i' = \begin{cases} 0, & \text{if } a_i = a_{\min} \\ \dfrac{a_i - a_{\min}}{a_{\max} - a_{\min}}, & \text{if } a_{\min} < a_i < a_{\max} \\ 1, & \text{if } a_i = a_{\max} \end{cases} \tag{3}$$
where $a_{\min}$ and $a_{\max}$ denote the minimum and maximum values of the feature metric, i = 1, 2, 3, 4, 5;
SS3: defining the structure complexity function of the code file from the normalized feature metrics, the structure complexity function being
$$Comp(S) = \frac{1}{5} \sum_{i=1}^{5} a_i \tag{4}$$
where $a_i$ is the feature metric, i = 1, 2, 3, 4, 5.
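Steps SS2 and SS3 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names and sample metric values are hypothetical, $a_{\min}$ and $a_{\max}$ are assumed to be taken per metric across all code files, and a metric with identical values in every file is assumed to normalize to 0 (the claim does not cover that case).

```python
def normalize_metrics(files):
    """files maps a file name to its raw metric list; applies formula (3) per metric."""
    names = list(files)
    n_metrics = len(next(iter(files.values())))
    normalized = {name: [] for name in names}
    for i in range(n_metrics):
        column = [files[name][i] for name in names]
        a_min, a_max = min(column), max(column)  # assumption: taken over all files
        for name in names:
            a = files[name][i]
            if a_max == a_min:
                value = 0.0  # assumption: degenerate metric maps to 0
            else:
                value = (a - a_min) / (a_max - a_min)  # yields 0 at a_min, 1 at a_max
            normalized[name].append(value)
    return normalized


def comp(normalized_metrics):
    """Formula (4): mean of the normalized feature metrics of one code file."""
    return sum(normalized_metrics) / len(normalized_metrics)


# Hypothetical raw metrics in the order listed in claim 4: lines of code,
# maximum cyclomatic complexity, revisions, files depended on, non-comment lines.
files = {
    "A.java": [100, 5, 3, 2, 80],
    "B.java": [300, 15, 9, 6, 240],
    "C.java": [200, 10, 6, 4, 160],
}
normalized = normalize_metrics(files)
for name in files:
    print(name, comp(normalized[name]))
```

Normalizing first keeps a large-valued metric such as lines of code from dominating the average.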
6. The software defect positioning method based on multi-objective optimization according to claim 1, characterized in that step S5 comprises:
S51: mapping the cumulative time $Y_{exp}$ that a developer has been engaged in development to a developer experience metric EXP;
S52: defining the developer unfamiliarity function from the developer experience metric EXP:
$$Rusd(S) = \sum_{i \in S_P} \frac{1}{EXP_i} \tag{5}$$
where $S_P$ denotes the set of developers of the code file S, $EXP_i$ denotes the experience metric of developer i, and i is a positive integer.
7. The software defect positioning method based on multi-objective optimization according to claim 6, characterized in that the mapping between the cumulative time $Y_{exp}$ that a developer has been engaged in development and the developer experience metric EXP is:
$Y_{exp} < 0.5$: EXP = 1;
$0.5 \le Y_{exp} < 1$: EXP = 2;
$1 \le Y_{exp} < 3$: EXP = 3;
$Y_{exp} \ge 3$: EXP = 5.
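The mapping above and formula (5) of claim 6 can be sketched together in a few lines of Python. The function names are hypothetical, and the claims do not state the unit of $Y_{exp}$, so the values are simply passed through as numbers.

```python
def experience_index(y_exp):
    """Map cumulative development time Y_exp to the experience metric EXP."""
    if y_exp < 0.5:
        return 1
    if y_exp < 1:
        return 2
    if y_exp < 3:
        return 3
    return 5


def rusd(developer_times):
    """Formula (5): sum of 1/EXP_i over the developers of one code file."""
    return sum(1 / experience_index(y) for y in developer_times)


# Hypothetical code file touched by a newcomer (0.2) and a veteran (4).
print(rusd([0.2, 4]))
```

Because each developer contributes 1/EXP, a file maintained mostly by inexperienced developers receives a higher unfamiliarity value.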
8. The software defect positioning method based on multi-objective optimization according to claim 6, characterized in that step S6 comprises:
S61: defining a multi-objective optimization problem from the similarity function $Sim_R(S)$, the structure complexity function Comp(S) and the developer unfamiliarity function Rusd(S), as follows:
$$\begin{cases} \max & Y = (Sim_R(S),\ Comp(S),\ Rusd(S)) \\ \text{s.t.} & S \in \Gamma \\ & Y = (y_1, y_2, y_3) \end{cases} \tag{6}$$
where R is the BUG report, S is the code file, the components $y_1$, $y_2$ and $y_3$ of the triple $(y_1, y_2, y_3)$ denote the values of the similarity function, the structure complexity function and the developer unfamiliarity function respectively, Γ is the code file set, and Y is the solution set of the similarity function, the structure complexity function and the developer unfamiliarity function;
S62: applying a layer-based fast non-dominated sorting method to the code file set Γ, dividing Γ into different non-dominated layers $F_l$, l = 1, 2, ..., m; the order of the non-dominated layers is denoted the first-stage ranking, where m is the number of non-dominated layers;
S63: within each non-dominated layer $F_l$ (l = 1, 2, ..., m), sorting the code files a second time in descending order of the similarity function $Sim_R(S)$ to obtain the second-stage ranking;
S64: concatenating the first-stage ranking and the second-stage ranking to obtain the complete ranking of the code files;
wherein step S62 comprises:
S621: letting code files $S_i \in \Gamma$, $S_j \in \Gamma$, i ≠ j,
if $Sim_R(S_i) > Sim_R(S_j)$, $Comp(S_i) > Comp(S_j)$ and $Rusd(S_i) > Rusd(S_j)$,
then the relation between $S_i$ and $S_j$ is: $S_i$ dominates $S_j$;
where R is the BUG report, Γ is the code file set, $Sim_R(\cdot)$ is the text similarity function, $Comp(\cdot)$ is the structure complexity function, and $Rusd(\cdot)$ is the developer unfamiliarity function;
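S622: for the code file $S_i$, computing the set $D_i$ of code files dominated by $S_i$ and the dominated counter $n_i$, i.e. the number of code files that dominate $S_i$;
for each code file $S_j \in \Gamma$, if $S_i$ dominates $S_j$, then $D_i = D_i \cup \{S_j\}$;
if $S_j$ dominates $S_i$, then $n_i$++; otherwise, $S_i$ does not dominate $S_j$ and $S_j$ does not dominate $S_i$, and $D_i$ and $n_i$ remain unchanged;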
S623: generating the first non-dominated layer $F_1$, which contains all code files $S_i$ whose dominated counter $n_i = 0$, i = 1, 2, 3, ..., $|F_1|$, where $|F_1|$ is the cardinality of layer $F_1$;
S624: taking $F_1$ as the initial value, iteratively computing the non-dominated layers $F_l$ (l = 2, ..., m), where m is the number of non-dominated layers generated; the iteration for $F_{l+1}$ is based on $F_l$: for $S_i \in F_l$ and $S_j \in D_i$, decrement the dominated counter ($n_j$--); if $n_j = 0$, then $F_{l+1} = F_{l+1} \cup \{S_j\}$, where i = 1, 2, 3, ..., $|F_l|$, j = 1, 2, 3, ..., $|D_i|$, $|F_l|$ is the cardinality of layer $F_l$, and $|D_i|$ is the cardinality of the set $D_i$.
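Steps S621 to S624 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names and the objective values are hypothetical. The dominance test follows S621 literally and requires a strictly larger value on all three objectives, which is stricter than the usual Pareto definition of "no worse on every objective and better on at least one".

```python
def dominates(a, b):
    """S621: a dominates b when every objective value of a is strictly larger."""
    return all(x > y for x, y in zip(a, b))


def fast_non_dominated_sort(objectives):
    """objectives maps a file name to (Sim, Comp, Rusd); returns the layers F_1..F_m."""
    dominated_set = {name: [] for name in objectives}  # D_i
    counter = {name: 0 for name in objectives}         # n_i
    # S622: fill D_i and n_i by pairwise comparison
    for i, y_i in objectives.items():
        for j, y_j in objectives.items():
            if i == j:
                continue
            if dominates(y_i, y_j):
                dominated_set[i].append(j)
            elif dominates(y_j, y_i):
                counter[i] += 1
    # S623: the first layer holds every file that no other file dominates
    layers = []
    current = [name for name in objectives if counter[name] == 0]
    # S624: peel off one layer at a time
    while current:
        layers.append(current)
        next_layer = []
        for i in current:
            for j in dominated_set[i]:
                counter[j] -= 1
                if counter[j] == 0:
                    next_layer.append(j)
        current = next_layer
    return layers


# Hypothetical (Sim, Comp, Rusd) triples for four code files.
objectives = {
    "A": (0.9, 0.8, 2.0),
    "B": (0.5, 0.4, 1.0),
    "C": (0.7, 0.9, 0.5),
    "D": (0.1, 0.1, 0.2),
}
print(fast_non_dominated_sort(objectives))
```

In this example A and C do not dominate each other and share the first layer, B is dominated only by A, and D is dominated by all three other files.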
9. The software defect positioning method based on multi-objective optimization according to claim 1, characterized in that in step S7, k = 10.
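The two-stage ranking of steps S63 and S64 in claim 8, combined with the value of k given here, can be sketched as below. The function name and sample data are hypothetical, and the sketch assumes that step S7 selects the first k code files of the complete ranking; that reading is an assumption, not a statement of the claimed step.

```python
def rank_files(layers, similarity, k=10):
    """Sort each non-dominated layer by similarity, concatenate, keep the first k."""
    ranking = []
    for layer in layers:  # first-stage order: layer F_1 first, then F_2, ...
        # second-stage order: descending Sim_R within the layer
        ranking.extend(sorted(layer, key=lambda name: similarity[name], reverse=True))
    return ranking[:k]  # assumption: S7 keeps the top k files


# Hypothetical layers and similarity values.
layers = [["C", "A"], ["B"], ["D"]]
similarity = {"A": 0.9, "B": 0.5, "C": 0.7, "D": 0.1}
print(rank_files(layers, similarity, k=3))
```

A file in an earlier layer always precedes a file in a later layer, whatever their similarity values; similarity only breaks ties inside a layer.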
CN201710700316.5A 2017-08-16 2017-08-16 Software defect positioning method based on multiple-objection optimization Active CN107515822B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710700316.5A CN107515822B (en) 2017-08-16 2017-08-16 Software defect positioning method based on multiple-objection optimization

Publications (2)

Publication Number Publication Date
CN107515822A true CN107515822A (en) 2017-12-26
CN107515822B CN107515822B (en) 2019-09-03

Family

ID=60723239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710700316.5A Active CN107515822B (en) 2017-08-16 2017-08-16 Software defect positioning method based on multiple-objection optimization

Country Status (1)

Country Link
CN (1) CN107515822B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109767438A (en) * 2019-01-09 2019-05-17 电子科技大学 A kind of thermal-induced imagery defect characteristic recognition methods based on dynamic multi-objective optimization
CN110580217A (en) * 2018-06-08 2019-12-17 阿里巴巴集团控股有限公司 software code health degree detection method, processing method and device and electronic equipment
CN111831541A (en) * 2019-04-22 2020-10-27 西安邮电大学 Software defect positioning method based on risk track
CN112328475A (en) * 2020-10-28 2021-02-05 南京航空航天大学 Defect positioning method for multiple suspicious code files
CN114510431A (en) * 2022-04-20 2022-05-17 武汉理工大学 Workload-aware intelligent contract defect prediction method, system and equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101231614A (en) * 2008-02-02 2008-07-30 南京大学 Method for locating software unsoundness base on execution track block semblance
CN105786704A (en) * 2016-02-22 2016-07-20 南京大学 Work amount sensitive bug positioning technology effectiveness evaluation method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LU HUIHUA 等: "Defect Prediction between Software Versions with Active Learning and Dimensionality Reduction", 《2014 IEEE 25TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING》 *
陈翔 等: "静态软件缺陷预测方法研究", 《软件学报》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110580217A (en) * 2018-06-08 2019-12-17 阿里巴巴集团控股有限公司 software code health degree detection method, processing method and device and electronic equipment
CN110580217B (en) * 2018-06-08 2023-05-05 阿里巴巴集团控股有限公司 Software code health degree detection method, processing method, device and electronic equipment
CN109767438A (en) * 2019-01-09 2019-05-17 电子科技大学 A kind of thermal-induced imagery defect characteristic recognition methods based on dynamic multi-objective optimization
CN109767438B (en) * 2019-01-09 2021-06-08 电子科技大学 Infrared thermal image defect feature identification method based on dynamic multi-objective optimization
CN111831541A (en) * 2019-04-22 2020-10-27 西安邮电大学 Software defect positioning method based on risk track
CN111831541B (en) * 2019-04-22 2022-10-28 西安邮电大学 Software defect positioning method based on risk track
CN112328475A (en) * 2020-10-28 2021-02-05 南京航空航天大学 Defect positioning method for multiple suspicious code files
CN112328475B (en) * 2020-10-28 2021-11-30 南京航空航天大学 Defect positioning method for multiple suspicious code files
CN114510431A (en) * 2022-04-20 2022-05-17 武汉理工大学 Workload-aware intelligent contract defect prediction method, system and equipment

Also Published As

Publication number Publication date
CN107515822B (en) 2019-09-03

Similar Documents

Publication Publication Date Title
CN107515822A (en) Software defect positioning method based on multiple-objection optimization
CN102662930B (en) Corpus tagging method and corpus tagging device
CN107817404A (en) A kind of Portable metering automatization terminal trouble-shooter and its diagnostic method
CN110659814A (en) Power grid operation risk evaluation method and system based on entropy weight method
CN104536881A (en) Public testing error report priority sorting method based on natural language analysis
CN104881689A (en) Method and system for multi-label active learning classification
CN112364352B (en) Method and system for detecting and recommending interpretable software loopholes
Kumar Measuring Software reusability using SVM based classifier approach
CN104951987B (en) Crop Breeding evaluation method based on decision tree
CN108446885A (en) A kind of automatic collecting method of review comment
Kumar et al. Software fault proneness prediction using genetic based machine learning techniques
CN109711424A (en) A kind of rule of conduct acquisition methods, device and equipment based on decision tree
CN111199469A (en) User payment model generation method and device and electronic equipment
CN107066389A (en) The Forecasting Methodology that software defect based on integrated study is reopened
Sandhu et al. A comparative analysis of conjugate gradient algorithms & PSO based neural network approaches for reusability evaluation of procedure based software systems
CN108763459A (en) Professional trend analysis method and system based on psychological test and DNN algorithms
Kusiak A data mining approach for generation of control signatures
Au et al. Decision model for country site selection of overseas clothing plants
CN115345379A (en) Auxiliary decision-making method for operation and maintenance of power transformation equipment
Alba et al. Comparative analysis of modern optimization tools for the p-median problem
Kaur et al. Performance evaluation of reusable software components
CN107291722A (en) The sorting technique and equipment of a kind of descriptor
Hassanzadeh et al. Developing a new method using Artificial Immune System in order to High Productivity of Inefficient Units in Network DEA approach
Sun Construction principles of physical fitness training objective system based on machine learning and data mining
Manhas et al. Framework for Evaluating Reusability of Procedure Oriented System using Metrics based Approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant