WO2019242108A1 - Software defect repair template extraction method based on cluster analysis - Google Patents

Software defect repair template extraction method based on cluster analysis

Info

Publication number
WO2019242108A1
Authority
WO
WIPO (PCT)
Prior art keywords
modification mode
multiset
bug
modification
level
Prior art date
Application number
PCT/CN2018/104075
Other languages
English (en)
French (fr)
Inventor
孙小兵
朱轩锐
李斌
Original Assignee
扬州大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 扬州大学
Publication of WO2019242108A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/36: Preventing errors by testing or debugging software
    • G06F11/3604: Software analysis for verifying properties of programs
    • G06F11/3608: Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/36: Preventing errors by testing or debugging software
    • G06F11/362: Software debugging
    • G06F11/3628: Software debugging of optimised code
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/40: Transformation of program code
    • G06F8/41: Compilation
    • G06F8/43: Checking; Contextual analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/70: Software maintenance or management

Definitions

  • the invention belongs to the field of software maintenance, in particular to a software defect repair template extraction method based on cluster analysis.
  • a series of techniques revolve around test-suite-based repair, whose purpose is to generate a patch that makes the failing tests pass while the other tests still pass.
  • passing all tests in a real project does not necessarily mean that the program is correct; if the accuracy of the repair system is low, developers still need to review the patches manually, and current test-based repair techniques are not highly accurate.
  • to address the low accuracy of test-based repair techniques, many researchers have studied the problem and found that it is difficult for a repair system to identify the correct patch among a large number of plausible patches.
  • one way to address this is to rank patches by their probability of being correct and return the most probable ones, but the accuracy of this approach is still unsatisfactory.
  • fine-grained repair can complete software repair more accurately and efficiently, but in actual software maintenance there are often many types of defects and repair patterns.
  • the fine-grained repair pattern techniques proposed so far are limited to specific defective code in specific projects; they generalize poorly and cannot meet the needs of arbitrary software defect repair.
  • the technical problem solved by the present invention is to provide a software defect repair template extraction method based on cluster analysis.
  • a technical solution to achieve the purpose of the present invention is: a method for extracting a software defect repair template based on cluster analysis, including the following steps:
  • Step 1 Define the fine-grained modification mode of the bug, and then perform text analysis on the bugs in the bug defect database to identify the fine-grained modification mode related to each bug;
  • Step 2 Use code analysis technology to capture the program elements of the fine-grained modification mode related to each bug
  • Step 3 Determine the relationships between the program elements in each bug captured in step 2, and then group the top-level program elements of the same type into a top-level modification mode multiset of the bug; then perform hierarchical clustering on the top-level modification mode multisets of all bugs to obtain multiple clustered top-level modification mode multisets;
  • Step 4 Obtain the new modification mode multisets corresponding to each top-level modification mode multiset according to the program elements corresponding to that multiset;
  • Step 5 According to the relationships between the program elements, determine the relationships between the new modification mode multisets obtained in step 4, and then connect them to obtain a modification mode multiset graph;
  • Step 6 Use frequent pattern mining to segment and optimize the modification mode multiset graph obtained in step 5 to obtain modification mode clusters;
  • Step 7 Construct the software defect repair template from the modification mode clusters obtained in step 6.
  • compared with the prior art, the present invention has the following notable advantages: (1) based on the fine-grained modification modes of bugs, the method performs cluster analysis on bug modification modes through semantics, context, and dependencies, so the resulting repair templates have semantic features and are more general and widely applicable;
  • (2) the repair templates obtained by the method are more comprehensive, offer guidance for the study of defect patterns and defect classification, and improve the efficiency of defect repair; (3) the repair templates obtained by the method establish relationships between bugs and improve the accuracy of defect repair.
  • FIG. 1 is a schematic flowchart of a method of the present invention.
  • FIG. 2 is a meta-model diagram of a program model in the present invention.
  • FIG. 3 is a schematic diagram of cluster analysis in the method of the present invention.
  • FIG. 4 is a diagram of a modification mode multiset obtained in the embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a software defect repair template obtained in an embodiment of the present invention.
  • a method for extracting a software defect repair template based on cluster analysis of the present invention includes the following steps:
  • Step 1 Define the fine-grained modification mode of the bug, and then perform text analysis on the bugs in the bug defect library to identify the fine-grained modification mode related to each bug.
  • specifically, the text analysis of the bugs in the bug defect library uses data parsing, search and retrieval, and text mining to identify each bug's fine-grained modification modes with respect to conditional statements, assignment statements, interfaces, and variables.
  • Step 2 Use code analysis technology to capture the program elements of the fine-grained modification mode related to each bug.
  • program elements include classes, interfaces, methods, and fields.
  • Step 3 Use a top-down method to determine the relationships between the program elements in each bug captured in step 2, and record the top-level program elements as the top-level modification mode multiset of the bug; then use code similarity and heuristic rules to perform hierarchical clustering on the top-level modification mode multisets of all bugs to obtain multiple clustered top-level modification mode multisets.
  • the relationships among the program elements include declarations, extensions, calls, implementations, and reads.
  • the code similarity is the degree to which code is similar; it is measured by the Jaccard similarity coefficient, in which s_1 and s_2 are the first source code block and the second source code block, respectively.
  • to improve the accuracy of hierarchical clustering, the following rule is defined: the similarity between a class and an interface, and the similarity between a class and its superclass, are both 0, that is, no similarity.
  • Step 4 According to the program element corresponding to the top-level modification mode multiset, obtain a new modification mode multiset corresponding to each top-level modification mode multiset. Specifically:
  • if the program elements corresponding to the top-level modification mode multiset are methods or fields, the top-level modification mode multiset is used directly as its corresponding new modification mode multiset;
  • if the program elements corresponding to the top-level modification mode multiset are classes or interfaces, the declaration rule is applied recursively to the top-level modification mode multiset until a modification mode multiset containing only methods and fields is obtained, and all modification mode multisets produced during the recursion serve as the new modification mode multisets corresponding to the top-level modification mode multiset.
  • Step 5 According to the relationships between the program elements, determine the relationships between the new modification mode multisets obtained in step 4, and then connect them to obtain a modification mode multiset graph.
  • Step 6 Use frequent pattern mining to segment and optimize the modification mode multiset graph obtained in step 5 to obtain modification mode clusters. Specifically:
  • Step 6-1 Filter the modification mode multiset graph: for each modification mode multiset in the graph, filter out its supersets and the modification mode multisets that have the same support as it;
  • Step 6-2 According to the dependencies between the modification mode multisets, sort the modification mode multisets remaining after the filtering in step 6-1 to obtain the modification mode clusters.
  • Step 7 Construct the software defect repair template from the modification mode clusters obtained in step 6.
  • the software defect repair template includes a name and parameters.
  • the method for extracting a software defect repair template based on cluster analysis of the present invention includes the following steps:
  • Step 1 Define the fine-grained modification mode of the bug, and then perform text analysis on the bugs in the bug defect library to identify the fine-grained modification mode related to each bug.
  • the fine-grained modification mode of the bug defined in this embodiment is shown in Table 1 below.
  • a text analysis process is performed on a bug in a bug defect library, and a fine-grained modification mode related to each bug is identified as shown in Table 2 below.
  • Step 2 Use code analysis technology to capture the program elements of the fine-grained modification mode related to each bug.
  • in this embodiment, the bug statement "private ImageButton getEditCancelButton(){return(ImageButton)getToolbarView().findViewById(R.id.edit_cancel);}" is taken as an example, and the repair statement corresponding to the bug is "private View getEditCancelButton(){return getToolbarView().findViewById(R.id.edit_cancel);}"; this yields the program elements (ImageButton, getToolbarView, findViewById, R.id.edit_cancel) and (View, getToolbarView, findViewById, R.id.edit_cancel).
  • Step 3 Use a top-down method to determine the relationships between the program elements in each bug captured in step 2, and group the top-level program elements of the same type into a top-level modification mode multiset of the bug; then use code similarity and heuristic rules to perform hierarchical clustering on the top-level modification mode multisets of all bugs to obtain multiple clustered top-level modification mode multisets.
  • in this embodiment, among the program elements (ImageButton, getToolbarView, findViewById, R.id.edit_cancel) and (View, getToolbarView, findViewById, R.id.edit_cancel) obtained in step 2, ImageButton and View are grouped into one top-level modification mode multiset A={ImageButton, View} of the bug, and getToolbarView and getToolbarView are grouped into another top-level modification mode multiset B={getToolbarView, getToolbarView}.
  • in the hierarchical clustering, a similarity threshold h is assumed, with 0≤h≤1, and top-level modification mode multisets whose similarity, that is, whose Jaccard similarity coefficient, is greater than h are clustered into one class.
  • the specific value of h is chosen freely according to requirements and the required strictness of defect repair. In this embodiment there is only one kind of bug, so no hierarchical clustering of the top-level modification mode multisets is needed.
  • Step 4 According to the program element corresponding to the top-level modification mode multiset, obtain a new modification mode multiset corresponding to each top-level modification mode multiset.
  • the program elements corresponding to the top-level modification mode multisets obtained in step 3 of this embodiment are classes and interfaces. Therefore, the declaration rule is applied recursively to the top-level modification mode multisets until modification mode multisets containing only methods and fields are obtained, which yields the new modification mode multisets C={findViewById, findViewById}, D={R.id.edit_cancel, R.id.edit_cancel}, and E={ImageButton.init(), View.init()}.
  • Step 5 According to the relationships between the program elements, determine the relationships between the new modification mode multisets obtained in step 4, and then connect them to obtain a modification mode multiset graph.
  • the modification mode multiset graph obtained in this embodiment is shown in FIG. 4.
  • Step 6 Use frequent pattern mining to segment and optimize the modification mode multiset graph obtained in step 5 to obtain modification mode clusters.
  • the modification mode clusters obtained in this embodiment are {A}, {B}, {C, D, E}.
  • Step 7 Construct the software defect repair template from the modification mode clusters obtained in step 6, as shown in FIG. 5, where the name is the function type and the parameter is type.
  • based on the fine-grained modification modes of bugs, the method of the present invention performs cluster analysis on bug modification modes through semantics, context, and dependencies.
  • the repair templates obtained have semantic features and are more comprehensive; they offer guidance for the study of defect patterns and defect classification, are more general and widely applicable, and improve the efficiency and accuracy of defect repair.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Stored Programmes (AREA)

Abstract

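A software defect repair template extraction method based on cluster analysis, belonging to the field of software maintenance. The steps are as follows: first, define the fine-grained modification modes of bugs and identify the fine-grained modification modes related to each bug; next, capture the program elements of the fine-grained modification modes related to each bug; then obtain the top-level modification mode multisets of each bug and perform hierarchical clustering to obtain multiple clustered top-level modification mode multisets; then obtain the new modification mode multisets corresponding to each top-level modification mode multiset; then obtain a modification mode multiset graph according to the relationships between program elements; then segment and optimize the modification mode multiset graph to obtain modification mode clusters; finally, construct the software defect repair template from the modification mode clusters. The repair templates obtained by this method have semantic features and are more general and widely applicable, which improves the efficiency and accuracy of defect repair.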

Description

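Software defect repair template extraction method based on cluster analysis
Technical Field
The present invention belongs to the field of software maintenance, and in particular relates to a software defect repair template extraction method based on cluster analysis.
Background
Because the scale and complexity of software products have grown explosively, developing high-quality software has become increasingly challenging, and errors in software systems are unavoidable. By studying software defect patterns, testers can repair defects more quickly during testing; developers can also consider which development techniques to adopt to prevent these defect patterns from recurring, which raises the overall level of the development and testing teams. Research on software defect repair patterns is therefore increasingly important.
Many techniques currently target software defect repair, including patch generation and dynamic program state recovery. A series of techniques revolve around "test-suite-based repair", whose purpose is to generate a patch that makes the failing tests pass while the other tests still pass. However, passing all tests in a real project does not necessarily mean that the program is correct; if the accuracy of the repair system is low, developers still need to review the patches manually, and current test-based repair techniques are not highly accurate. To address this low accuracy, many researchers have studied the problem and found that it is difficult for a repair system to identify the correct patch among a large number of plausible patches. One way to address this is to rank patches by their probability of being correct and return the most probable ones, but the accuracy of this approach is still unsatisfactory. Fine-grained repair can complete software repair more accurately and efficiently. In actual software maintenance, however, there are often many types of defects and repair patterns, and the fine-grained repair pattern techniques proposed so far are limited to specific defective code in specific projects; they generalize poorly and cannot meet the needs of arbitrary software defect repair.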
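Summary of the Invention
The technical problem solved by the present invention is to provide a software defect repair template extraction method based on cluster analysis.
The technical solution that achieves the purpose of the present invention is a software defect repair template extraction method based on cluster analysis, including the following steps:
Step 1: Define the fine-grained modification modes of bugs, and then perform text analysis on the bugs in the bug defect library to identify the fine-grained modification modes related to each bug;
Step 2: Use code analysis to capture the program elements of the fine-grained modification modes related to each bug;
Step 3: Determine the relationships between the program elements in each bug captured in step 2, and then group the top-level program elements of the same type into a top-level modification mode multiset of the bug; then perform hierarchical clustering on the top-level modification mode multisets of all bugs to obtain multiple clustered top-level modification mode multisets;
Step 4: Obtain the new modification mode multisets corresponding to each top-level modification mode multiset according to the program elements corresponding to that multiset;
Step 5: According to the relationships between the program elements, determine the relationships between the new modification mode multisets obtained in step 4, and then connect them to obtain a modification mode multiset graph;
Step 6: Use frequent pattern mining to segment and optimize the modification mode multiset graph obtained in step 5 to obtain modification mode clusters;
Step 7: Construct the software defect repair template from the modification mode clusters obtained in step 6.
Compared with the prior art, the present invention has the following notable advantages: (1) based on the fine-grained modification modes of bugs, the method performs cluster analysis on bug modification modes through semantics, context, and dependencies, so the resulting repair templates have semantic features and are more general and widely applicable; (2) the repair templates obtained by the method are more comprehensive, offer guidance for the study of defect patterns and defect classification, and improve the efficiency of defect repair; (3) the repair templates obtained by the method establish relationships between bugs and improve the accuracy of defect repair.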
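The present invention is described in further detail below with reference to the accompanying drawings.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of the method of the present invention.
FIG. 2 is a meta-model diagram of the program model in the present invention.
FIG. 3 is a schematic diagram of the cluster analysis in the method of the present invention.
FIG. 4 is the modification mode multiset graph obtained in the embodiment of the present invention.
FIG. 5 is a schematic diagram of the software defect repair template obtained in the embodiment of the present invention.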
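Detailed Description
With reference to FIG. 1, the steps of the software defect repair template extraction method based on cluster analysis of the present invention are as follows:
Step 1: Define the fine-grained modification modes of bugs, and then perform text analysis on the bugs in the bug defect library to identify the fine-grained modification modes related to each bug. Specifically, the text analysis uses data parsing, search and retrieval, and text mining to identify each bug's fine-grained modification modes with respect to conditional statements, assignment statements, interfaces, and variables.
Step 2: Use code analysis to capture the program elements of the fine-grained modification modes related to each bug. The program elements include classes, interfaces, methods, and fields.
Step 3: Use a top-down method to determine the relationships between the program elements in each bug captured in step 2, and record the top-level program elements as the top-level modification mode multiset of the bug; then use code similarity and heuristic rules to perform hierarchical clustering on the top-level modification mode multisets of all bugs to obtain multiple clustered top-level modification mode multisets. The relationships between the program elements include declaration, extension, call, implementation, and read. The code similarity is the degree to which code is similar; it is measured by the Jaccard similarity coefficient, which is:
$$J(s_1, s_2) = \frac{|s_1 \cap s_2|}{|s_1 \cup s_2|}$$
where s_1 and s_2 are the first source code block and the second source code block, respectively.
To improve the accuracy of hierarchical clustering, the following rule is defined: the similarity between a class and an interface, and the similarity between a class and its superclass, are both 0, that is, no similarity.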
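The similarity measure and the heuristic rule can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text does not say how a source code block is turned into a set, so the identifier-level tokenizer, the function names, and the `superclass_pair` flag are all assumptions made here.

```python
import re

def tokens(code):
    """Return the set of identifier and number tokens in a source code block.
    Tokenizing by identifiers is an assumption; the text only says 'code block'."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+", code))

def jaccard(s1, s2):
    """Jaccard similarity coefficient |s1 ∩ s2| / |s1 ∪ s2| of two code blocks."""
    a, b = tokens(s1), tokens(s2)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty blocks are treated as dissimilar
    return len(a & b) / len(union)

def element_similarity(kind1, code1, kind2, code2, superclass_pair=False):
    """Apply the heuristic rule first, then fall back to Jaccard similarity."""
    # Rule from the text: class vs. interface and class vs. superclass score 0.
    if {kind1, kind2} == {"class", "interface"} or superclass_pair:
        return 0.0
    return jaccard(code1, code2)

bug = ("private ImageButton getEditCancelButton(){return(ImageButton)"
       "getToolbarView().findViewById(R.id.edit_cancel);}")
fix = ("private View getEditCancelButton(){return "
       "getToolbarView().findViewById(R.id.edit_cancel);}")
print(jaccard(bug, fix))  # 8 shared tokens out of 10 distinct tokens
```

With this tokenizer the bug statement and its repair from the embodiment differ only in `ImageButton` versus `View`, so they share 8 of 10 distinct tokens.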
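Step 4: Obtain the new modification mode multisets corresponding to each top-level modification mode multiset according to the program elements corresponding to that multiset. Specifically:
If the program elements corresponding to the top-level modification mode multiset are methods or fields, the top-level modification mode multiset is used directly as its corresponding new modification mode multiset;
If the program elements corresponding to the top-level modification mode multiset are classes or interfaces, the declaration rule is applied recursively to the top-level modification mode multiset until a modification mode multiset containing only methods and fields is obtained, and all modification mode multisets produced during the recursion serve as the new modification mode multisets corresponding to the top-level modification mode multiset.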
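The two cases of step 4 can be sketched as a small recursion over a program model. The `MODEL` dictionary below is hypothetical (the element names come from the embodiment, but which members each class declares is assumed), and including the starting multiset in the result is one reading of "all multisets produced during the recursion".

```python
from collections import Counter

# Hypothetical program model: element name -> (kind, elements it declares).
MODEL = {
    "ImageButton": ("class", ["ImageButton.init()"]),
    "View": ("class", ["View.init()"]),
    "ImageButton.init()": ("method", []),
    "View.init()": ("method", []),
    "findViewById": ("method", []),
}

def expand(multiset, model):
    """Return the multisets visited while following 'declares' edges downward,
    stopping once a multiset contains only methods and fields."""
    result = [multiset]
    kinds = {model[elem][0] for elem in multiset}
    if kinds <= {"method", "field"}:
        return result  # case 1: methods/fields are used directly
    declared = Counter()
    for elem, count in multiset.items():
        kind, members = model[elem]
        if kind in ("class", "interface"):
            for member in members:  # case 2: descend via the declaration rule
                declared[member] += count
        else:
            declared[elem] += count
    if declared:
        result.extend(expand(declared, model))
    return result

A = Counter(["ImageButton", "View"])
for ms in expand(A, MODEL):
    print(dict(ms))
```

Using `Counter` keeps multiplicities, which matters for multisets such as {findViewById, findViewById} in the embodiment.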
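Step 5: According to the relationships between the program elements, determine the relationships between the new modification mode multisets obtained in step 4, and then connect them to obtain a modification mode multiset graph.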
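One way to picture step 5 is to lift element-level relationships to the multisets that contain those elements. The sketch below assumes relationships are given as (source, relation, target) triples; that representation and the function name are illustrative choices, not something the text specifies.

```python
def build_multiset_graph(multisets, element_relations):
    """Lift element-level relations to edges between named multisets.

    multisets: dict mapping a multiset name to a list of its elements.
    element_relations: iterable of (source_element, relation, target_element).
    Returns a set of (source_multiset, relation, target_multiset) edges.
    """
    owner = {}
    for name, elements in multisets.items():
        for element in elements:
            owner.setdefault(element, set()).add(name)
    edges = set()
    for src, relation, dst in element_relations:
        for a in owner.get(src, ()):
            for b in owner.get(dst, ()):
                if a != b:  # ignore relations inside a single multiset
                    edges.add((a, relation, b))
    return edges

# Hypothetical input built from the embodiment's multisets A and E.
multisets = {
    "A": ["ImageButton", "View"],
    "E": ["ImageButton.init()", "View.init()"],
}
relations = [
    ("ImageButton", "declares", "ImageButton.init()"),
    ("View", "declares", "View.init()"),
]
print(build_multiset_graph(multisets, relations))
```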
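Step 6: Use frequent pattern mining to segment and optimize the modification mode multiset graph obtained in step 5 to obtain modification mode clusters. Specifically:
Step 6-1: Filter the modification mode multiset graph: for each modification mode multiset in the graph, filter out its supersets and the modification mode multisets that have the same support as it;
Step 6-2: According to the dependencies between the modification mode multisets, sort the modification mode multisets remaining after the filtering in step 6-1 to obtain the modification mode clusters.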
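Steps 6-1 and 6-2 are stated only briefly, so the sketch below is one possible reading rather than the patent's procedure: it drops a multiset when it is a strict superset of another multiset with the same support, and then orders the survivors so that each one follows the multisets it depends on. The support values, the dependency map, and the function names are assumptions; `graphlib` requires Python 3.9 or later.

```python
from collections import Counter
from graphlib import TopologicalSorter

def is_strict_superset(big, small):
    """True if multiset `big` contains every element of `small` and differs from it."""
    return big != small and all(big[k] >= v for k, v in small.items())

def filter_multisets(multisets, support):
    """Step 6-1 (assumed reading): drop same-support strict supersets.

    multisets: dict name -> Counter; support: dict name -> int.
    """
    kept = {}
    for name, ms in multisets.items():
        redundant = any(
            other != name
            and support[other] == support[name]
            and is_strict_superset(ms, multisets[other])
            for other in multisets
        )
        if not redundant:
            kept[name] = ms
    return kept

def order_by_dependency(names, depends_on):
    """Step 6-2 (assumed reading): dependencies come before their dependents."""
    graph = {n: [d for d in depends_on.get(n, ()) if d in names] for n in names}
    return list(TopologicalSorter(graph).static_order())

multisets = {"X": Counter("ab"), "Y": Counter("abc"), "Z": Counter("d")}
support = {"X": 3, "Y": 3, "Z": 2}
kept = filter_multisets(multisets, support)
print(order_by_dependency(kept, {"Z": ["X"]}))
```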
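Step 7: Construct the software defect repair template from the modification mode clusters obtained in step 6. The software defect repair template includes a name and parameters.
Embodiment
With reference to FIG. 1, the software defect repair template extraction method based on cluster analysis of the present invention includes the following steps:
Step 1: Define the fine-grained modification modes of bugs, and then perform text analysis on the bugs in the bug defect library to identify the fine-grained modification modes related to each bug. The fine-grained modification modes of bugs defined in this embodiment are shown in Table 1 below.
Table 1: Fine-grained modification modes of bugs
Figure PCTCN2018104075-appb-000002
In this embodiment, text analysis is performed on the bugs in a bug defect library, and the fine-grained modification modes identified for each bug are shown in Table 2 below.
Table 2: Analysis of fine-grained bug modification modes
Figure PCTCN2018104075-appb-000003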
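Step 2: Use code analysis to capture the program elements of the fine-grained modification modes related to each bug. In this embodiment, the bug statement "private ImageButton getEditCancelButton(){return(ImageButton)getToolbarView().findViewById(R.id.edit_cancel);}" is taken as an example, and the repair statement corresponding to the bug is "private View getEditCancelButton(){return getToolbarView().findViewById(R.id.edit_cancel);}". This yields the program elements of the fine-grained modification mode related to the bug: (ImageButton, getToolbarView, findViewById, R.id.edit_cancel) and (View, getToolbarView, findViewById, R.id.edit_cancel).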
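The element tuples in this example can be reproduced with a toy extractor. This is only a regex-based sketch for single-method statements like the one above; the patent refers to code analysis in general, and a real implementation would more plausibly work on a parsed syntax tree. The keyword list and the rule for skipping the declared method name are assumptions.

```python
import re

# Small, deliberately incomplete keyword list (assumption for this sketch).
JAVA_KEYWORDS = {"private", "public", "protected", "static", "final",
                 "return", "new", "void"}
# An identifier, optionally qualified with dots, e.g. R.id.edit_cancel.
NAME = re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*")

def program_elements(statement):
    """Return the distinct program elements of a one-method statement, in order."""
    # Heuristic: the first identifier followed by "(" is the method being
    # declared; it is unchanged by the fix, so it is not reported.
    declared = re.search(r"([A-Za-z_]\w*)\s*\(", statement)
    skip = set(JAVA_KEYWORDS)
    if declared:
        skip.add(declared.group(1))
    seen = []
    for name in NAME.findall(statement):
        if name not in skip and name not in seen:
            seen.append(name)
    return tuple(seen)

bug = ("private ImageButton getEditCancelButton(){return(ImageButton)"
       "getToolbarView().findViewById(R.id.edit_cancel);}")
fix = ("private View getEditCancelButton(){return "
       "getToolbarView().findViewById(R.id.edit_cancel);}")
print(program_elements(bug))
print(program_elements(fix))
```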
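Step 3: Use a top-down method to determine the relationships between the program elements in each bug captured in step 2, and group the top-level program elements of the same type into a top-level modification mode multiset of the bug; then use code similarity and heuristic rules to perform hierarchical clustering on the top-level modification mode multisets of all bugs to obtain multiple clustered top-level modification mode multisets. In this embodiment, among the program elements (ImageButton, getToolbarView, findViewById, R.id.edit_cancel) and (View, getToolbarView, findViewById, R.id.edit_cancel) obtained in step 2, ImageButton and View are grouped into one top-level modification mode multiset A={ImageButton, View} of the bug, and getToolbarView and getToolbarView are grouped into another top-level modification mode multiset B={getToolbarView, getToolbarView}. For the hierarchical clustering, a similarity threshold h is assumed, with 0≤h≤1, and top-level modification mode multisets whose similarity, that is, whose Jaccard similarity coefficient, is greater than h are clustered into one class; the specific value of h is chosen freely according to requirements and the required strictness of defect repair. In this embodiment there is only one kind of bug, so no hierarchical clustering of the top-level modification mode multisets is needed.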
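The grouping and the threshold-based clustering can be sketched as follows. The text does not say which linkage the hierarchical clustering uses; this sketch merges any two multisets whose similarity exceeds h, which is equivalent to cutting a single-linkage dendrogram at h. The element kinds, the multiset form of the Jaccard coefficient, and the third-party multiset {Button, View} are assumptions for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

def top_level_multisets(elements):
    """Group (name, kind) pairs of one bug into one multiset per kind."""
    groups = defaultdict(Counter)
    for name, kind in elements:
        groups[kind][name] += 1
    return dict(groups)

def multiset_jaccard(a, b):
    """Jaccard coefficient on multisets: sum of min counts over sum of max counts."""
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 0.0

def cluster(items, similarity, h):
    """Merge items whose pairwise similarity exceeds h (single linkage cut at h)."""
    parent = list(range(len(items)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if similarity(items[i], items[j]) > h:
            parent[find(i)] = find(j)
    groups = {}
    for i, item in enumerate(items):
        groups.setdefault(find(i), []).append(item)
    return list(groups.values())

elements = [("ImageButton", "class"), ("View", "class"),
            ("getToolbarView", "method"), ("getToolbarView", "method")]
grouped = top_level_multisets(elements)
A, B = grouped["class"], grouped["method"]
other = Counter(["Button", "View"])  # hypothetical multiset from another bug
print(cluster([A, other, B], multiset_jaccard, h=0.3))
```

With h=0.3, A and the hypothetical multiset share one of three elements (similarity 1/3) and are merged, while B stays separate; raising h to 0.5 keeps all three apart.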
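Step 4: Obtain the new modification mode multisets corresponding to each top-level modification mode multiset according to the program elements corresponding to that multiset. The program elements corresponding to the top-level modification mode multisets obtained in step 3 of this embodiment are classes and interfaces, so the declaration rule is applied recursively to the top-level modification mode multisets until modification mode multisets containing only methods and fields are obtained. The new modification mode multisets corresponding to the bug's top-level modification mode multisets are thus C={findViewById, findViewById}, D={R.id.edit_cancel, R.id.edit_cancel}, and E={ImageButton.init(), View.init()}.
Step 5: According to the relationships between the program elements, determine the relationships between the new modification mode multisets obtained in step 4, and then connect them to obtain a modification mode multiset graph. The modification mode multiset graph obtained in this embodiment is shown in FIG. 4.
Step 6: Use frequent pattern mining to segment and optimize the modification mode multiset graph obtained in step 5 to obtain modification mode clusters. The modification mode clusters obtained in this embodiment are {A}, {B}, {C, D, E}.
Step 7: Construct the software defect repair template from the modification mode clusters obtained in step 6, as shown in FIG. 5, where the name is the function type and the parameter is type.
Based on the fine-grained modification modes of bugs, the method of the present invention performs cluster analysis on bug modification modes through semantics, context, and dependencies. The repair templates obtained have semantic features and are more comprehensive; they offer guidance for the study of defect patterns and defect classification, are more general and widely applicable, and improve the efficiency and accuracy of defect repair.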

Claims (9)

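  1. A software defect repair template extraction method based on cluster analysis, characterized by comprising the following steps:
    Step 1: Define the fine-grained modification modes of bugs, and then perform text analysis on the bugs in the bug defect library to identify the fine-grained modification modes related to each bug;
    Step 2: Use code analysis to capture the program elements of the fine-grained modification modes related to each bug;
    Step 3: Determine the relationships between the program elements in each bug captured in step 2, and then group the top-level program elements of the same type into a top-level modification mode multiset of the bug; then perform hierarchical clustering on the top-level modification mode multisets of all bugs to obtain multiple clustered top-level modification mode multisets;
    Step 4: Obtain the new modification mode multisets corresponding to each top-level modification mode multiset according to the program elements corresponding to that multiset;
    Step 5: According to the relationships between the program elements, determine the relationships between all new modification mode multisets obtained in step 4, and then connect the new modification mode multisets to obtain a modification mode multiset graph;
    Step 6: Use frequent pattern mining to segment and optimize the modification mode multiset graph obtained in step 5 to obtain modification mode clusters;
    Step 7: Construct the software defect repair template from the modification mode clusters obtained in step 6.
  2. The software defect repair template extraction method based on cluster analysis according to claim 1, characterized in that the text analysis of the bugs in the bug defect library in step 1, which identifies the fine-grained modification modes related to each bug, is specifically: using data parsing, search and retrieval, and text mining to identify each bug's fine-grained modification modes with respect to conditional statements, assignment statements, interfaces, and variables.
  3. The software defect repair template extraction method based on cluster analysis according to claim 1, characterized in that the program elements in step 2 include classes, interfaces, methods, and fields.
  4. The software defect repair template extraction method based on cluster analysis according to claim 1, characterized in that step 3 is specifically:
    Step 3-1: Use a top-down method to determine the relationships between the program elements in each bug captured in step 2, and group the top-level program elements of the same type into a top-level modification mode multiset of the bug, wherein the relationships between the program elements include declaration, extension, call, implementation, and read;
    Step 3-2: Use code similarity and heuristic rules to perform hierarchical clustering on the top-level modification mode multisets of all bugs to obtain multiple clustered top-level modification mode multisets.
  5. The software defect repair template extraction method based on cluster analysis according to claim 4, characterized in that the code similarity in step 3-2 is the degree to which code is similar; it is measured by the Jaccard similarity coefficient, which is:
    $$J(s_1, s_2) = \frac{|s_1 \cap s_2|}{|s_1 \cup s_2|}$$
    where s_1 and s_2 are the first source code block and the second source code block, respectively.
  6. The software defect repair template extraction method based on cluster analysis according to claim 1, 3, or 5, characterized in that, when code similarity and heuristic rules are used in step 3-2 to perform hierarchical clustering on the top-level program elements obtained in step 2, the following rule is defined: the similarity between a class and an interface, and the similarity between a class and its superclass, are both 0, that is, no similarity.
  7. The software defect repair template extraction method based on cluster analysis according to claim 1, characterized in that obtaining, in step 4, the new modification mode multisets corresponding to each top-level modification mode multiset according to the program elements corresponding to that multiset is specifically:
    if the program elements corresponding to the top-level modification mode multiset are methods or fields, the top-level modification mode multiset is used directly as its corresponding new modification mode multiset;
    if the program elements corresponding to the top-level modification mode multiset are classes or interfaces, the declaration rule is applied recursively to the top-level modification mode multiset until a modification mode multiset containing only methods and fields is obtained, and all modification mode multisets produced during the recursion serve as the new modification mode multisets corresponding to the top-level modification mode multiset.
  8. The software defect repair template extraction method based on cluster analysis according to claim 1, characterized in that using frequent pattern mining in step 6 to segment and optimize the modification mode multiset graph obtained in step 5 to obtain modification mode clusters is specifically:
    Step 6-1: Filter the modification mode multiset graph: for each modification mode multiset in the graph, filter out its supersets and the modification mode multisets that have the same support as it;
    Step 6-2: According to the dependencies between the modification mode multisets, sort the modification mode multisets remaining after the filtering in step 6-1 to obtain the modification mode clusters.
  9. The software defect repair template extraction method based on cluster analysis according to claim 1, characterized in that the software defect repair template in step 7 includes a name and parameters.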
PCT/CN2018/104075 2018-06-20 2018-09-05 Software defect repair template extraction method based on cluster analysis WO2019242108A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810637180.2 2018-06-20
CN201810637180.2A CN109165155B (zh) 2018-06-20 2018-06-20 Software defect repair template extraction method based on cluster analysis

Publications (1)

Publication Number Publication Date
WO2019242108A1 (zh) 2019-12-26

Family

ID=64897173

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/104075 WO2019242108A1 (zh) 2018-06-20 2018-09-05 Software defect repair template extraction method based on cluster analysis

Country Status (2)

Country Link
CN (1) CN109165155B (zh)
WO (1) WO2019242108A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
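CN109918100B (zh) * 2019-01-25 2022-05-17 扬州大学 Repair recommendation method based on repair patterns for version defects
CN113590167B (zh) * 2021-07-09 2023-03-24 四川大学 Method for generating and verifying patches for conditional statement defects in object-oriented programs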

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6092189A (en) * 1998-04-30 2000-07-18 Compaq Computer Corporation Channel configuration program server architecture
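CN105653444A (zh) * 2015-12-23 2016-06-08 北京大学 Software defect fault identification method and system based on Internet log data
CN106598850A (zh) * 2016-12-03 2017-04-26 浙江理工大学 Fault localization method based on cluster analysis of program failures
CN107329770A (zh) * 2017-07-04 2017-11-07 扬州大学 Personalized recommendation method for software security bug repair
CN107608732A (zh) * 2017-09-13 2018-01-19 扬州大学 Bug search and localization method based on a bug knowledge graph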

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559025B (zh) * 2013-10-21 2017-01-25 沈阳建筑大学 Method for software refactoring using clustering
CN103729197B (zh) * 2014-01-22 2017-01-18 扬州大学 Multi-granularity hierarchical software clustering method based on the LDA model

Also Published As

Publication number Publication date
CN109165155A (zh) 2019-01-08
CN109165155B (zh) 2021-06-22

Similar Documents

Publication Publication Date Title
CN106909510B (zh) Method and server for obtaining test cases
Roy Detection and analysis of near-miss software clones
Ray et al. Detecting and characterizing semantic inconsistencies in ported code
US9354867B2 (en) System and method for identifying, analyzing and integrating risks associated with source code
Zhang et al. Analyzing and supporting adaptation of online code examples
JP6674459B2 (ja) Debugging a graph
JP7404839B2 (ja) Identification of software program fault locations
JP2022037061A (ja) Application testing
US11347484B2 (en) Format-specific data processing operations
JP2019096292A (ja) Selection of automated software program repair candidates
An et al. An empirical study of crash-inducing commits in mozilla firefox
WO2019242108A1 (zh) Software defect repair template extraction method based on cluster analysis
US9563541B2 (en) Software defect detection identifying location of diverging paths
CN108897678B (zh) Static code detection method, static code detection system, and storage device
Lavoie et al. A case study of TTCN-3 test scripts clone analysis in an industrial telecommunication setting
CN116483700A (zh) API misuse detection and correction method based on a feedback mechanism
Sadiq et al. On the Evolutionary Relationship between Change Coupling and Fix-Inducing Changes.
CN115292571A (zh) App data collection method and system
Ramler et al. Noise in bug report data and the impact on defect prediction results
CN114625633A (zh) Method, system, and storage medium for interface testing
CN113051171A (zh) Interface testing method, apparatus, device, and storage medium
US11714743B2 (en) Automated classification of defective code from bug tracking tool data
CN114610320B (zh) LLVM-based method and system for repairing and comparing variable type information
CN112579440B (zh) Method and apparatus for determining virtual test dependency objects
CN115543836A (zh) Script quality detection method and related device

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18923210

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 18923210

Country of ref document: EP

Kind code of ref document: A1