CN103744788A - Feature localization method based on multi-source software data analysis - Google Patents


Info

Publication number
CN103744788A
Authority
CN
China
Prior art keywords
information
characteristic
feature localization
technology
corpus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410031303.XA
Other languages
Chinese (zh)
Other versions
CN103744788B (en)
Inventor
孙小兵
吴鹏
李云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangzhou Gezhi Photoelectric Technology Co ltd
Original Assignee
Yangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yangzhou University filed Critical Yangzhou University
Priority to CN201410031303.XA priority Critical patent/CN103744788B/en
Publication of CN103744788A publication Critical patent/CN103744788A/en
Application granted granted Critical
Publication of CN103744788B publication Critical patent/CN103744788B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a feature localization method based on multi-source software data analysis, in the technical field of software engineering, which addresses the inaccurate and incomplete feature localization results of the prior art. The method combines information retrieval, data mining, and dynamic analysis to perform feature localization on the current software system, the evolution history repository, and execution traces, respectively. The results of the three techniques are then intersected to obtain the final feature localization result. This realizes feature localization based on multi-source software data analysis and improves accuracy, completeness, and efficiency. Because all three techniques are mature, the method is easy to operate and implement. The method can be applied at the class level and the method level: the granularity level can be selected according to the practical circumstances of the analysis, which provides a flexible selection framework for multi-granularity feature localization in practice.

Description

Feature localization method based on multi-source software data analysis
Technical field
The present invention relates to a feature localization method, in particular a feature localization method based on multi-source software data analysis, and belongs to the technical field of software engineering.
Background art
As the information society grows ever more dependent on software, users place more, and increasingly demanding, requirements on existing software systems, so these systems must be upgraded and maintained continually. A modification request for such an upgrade or maintenance task is commonly referred to as a feature. In a software system, a feature represents a function that is defined according to the requirements of developers and users and to what they find acceptable. Software maintenance and evolution comprise various modification activities, such as adding new functions, improving existing functions, and fixing bugs. Determining where a known, specific function is implemented in the source code is called feature localization. The process is: determine a start node; select the next node to visit; visit that node; judge whether the node is relevant to the feature under investigation; and check whether all relevant nodes have been obtained. To implement a modification request in existing software, the start node of the request must first be found accurately; if the location of the feature cannot be found, the modification cannot be completed successfully.
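The five-step process described above can be sketched as a worklist traversal. This is only an illustrative sketch: the text does not say how the next node is chosen or when the search stops, so the breadth-first order, the rule that irrelevant nodes are not expanded, and the names `locate_feature`, `graph`, and `is_relevant` are all assumptions.

```python
from collections import deque


def locate_feature(graph, start, is_relevant):
    """Walk a dependency graph from `start`, collecting nodes judged relevant.

    graph: dict mapping a node to the nodes it depends on (hypothetical input).
    is_relevant: callable standing in for the developer's relevance judgement.
    """
    relevant = []
    visited = {start}
    worklist = deque([start])          # step 1: determine the start node
    while worklist:                    # step 5: stop when nothing is left to visit
        node = worklist.popleft()      # steps 2-3: select and visit the next node
        if not is_relevant(node):      # step 4: judge relevance to the feature
            continue                   # assumption: irrelevant nodes are not expanded
        relevant.append(node)
        for neighbour in graph.get(node, ()):
            if neighbour not in visited:
                visited.add(neighbour)
                worklist.append(neighbour)
    return relevant


# Hypothetical four-element program in which e3 is judged irrelevant.
graph = {"e1": ["e2", "e3"], "e2": ["e4"], "e3": [], "e4": []}
found = locate_feature(graph, "e1", lambda node: node != "e3")
print(found)
```

The sketch makes the dependence on the start node visible: if `start` is chosen badly, nothing reachable from the true feature location is ever visited.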
Current research on feature localization mainly comprises methods based on the static structure of the program and methods based on dynamic program slicing. These two kinds of methods perform feature localization only by statically analysing the target program or by dynamically analysing the target code. They cannot take into account other important feature information, such as the history of modifications to the target program, which reduces the accuracy and completeness of feature localization. Moreover, software data include not only static and dynamic information but also information about the software's evolution process; using only one of these types of information may make the feature localization result imprecise and incomplete.
The prior art includes the Java Platform Debugger Architecture (JPDA). JPDA is a complete set of tools and interfaces for debugging on the Java virtual machine. Using the interfaces and protocol that JPDA provides, debugger developers can extend and customise Java debugging applications to suit the needs of particular developers and build debugging tools that developers want to use. JPDA consists of three main parts: 1. the Java Virtual Machine Tool Interface (JVMTI), which defines the services a virtual machine (VM) must provide for debugging, including debugging information (such as stack information), debugging actions (such as a client setting a breakpoint), and notifications (such as notifying the client when a breakpoint is reached); 2. the Java Debug Wire Protocol (JDWP), which defines the format of the information and requests transferred between the process being debugged and the debugger front end; 3. the Java Debug Interface (JDI), which defines the debugging interface available to the person debugging, making it convenient to interact with a remote debugging service.
The prior art also includes the Test and Performance Tools Platform (TPTP). TPTP is a top-level project of the Eclipse Foundation that provides a full-featured set of open-source testing and performance tools. It covers the whole testing and performance life cycle, from early testing to the monitoring of production applications, and includes test authoring and execution, monitoring, tracing and profiling, and log analysis capabilities.
Summary of the invention
The object of the present invention is to provide a feature localization method based on multi-source software data analysis, solving the technical problem in the prior art that only a single type of feature information is analysed and mined, which makes feature localization results imprecise and incomplete.
The object of the present invention is achieved by a feature localization method based on multi-source software data analysis, comprising the following steps:
Step 1: retrieve from the current software system using information retrieval: query the source code of the current software system for program code relevant to the current modification request, and record that program code as feature information a;
Step 2: mine historical evolution information using data mining: query the evolution history repository for historical modification requests relevant to the current modification request, take the union of the elements modified by those relevant historical modification requests, and record the resulting program code as feature information b;
Step 3: analyse execution traces using dynamic analysis: the execution traces comprise marked execution information and complete execution information; subtract the marked execution information from the complete execution information and record the result as pending execution information; then statically analyse the pending execution information to obtain the information set within it that is relevant to the current modification request; take the union of that information set and the marked execution information, and record the resulting program code as feature information c;
Step 4: compute the intersection of the three kinds of feature information a, b, and c, and output the feature localization result m.
The retrieval steps of the information retrieval technique are as follows:
A) build the corpus: define the document granularity and build a corpus at that granularity level;
B) natural language processing: preprocess the corpus using natural language processing techniques; the preprocessing comprises deleting source code operators and programming language keywords, splitting identifiers and compound terms, and stemming words to their roots;
C) index the corpus: retrieve from the corpus the source code that contains keywords of the current modification request.
The marked execution information is collected using JPDA, and the complete execution information is collected using TPTP.
Compared with the prior art, the present invention has the following beneficial effects: 1. It combines information retrieval, data mining, and dynamic analysis to perform feature localization on the current software system, the evolution history repository, and execution traces, respectively, realizing feature localization based on multi-source software data analysis. Compared with prior-art methods that perform feature localization using only a single type of information, the feature localization results of the invention are more accurate and more complete, and are obtained more efficiently. 2. The three techniques used, information retrieval, data mining, and dynamic analysis, are mature, so the invention is easy to operate and implement. 3. The invention can be used for feature localization at the class level and the method level; the granularity level can be chosen according to practical considerations such as cost analysis, which provides a flexible selection framework for multi-granularity feature localization in practice.
Brief description of the drawings
Fig. 1 is a flowchart of the present invention.
Fig. 2 is a block diagram of the working principle of the information retrieval technique of the present invention.
Fig. 3 is a block diagram of the working principle of the data mining technique of the present invention.
Fig. 4 is a block diagram of the working principle of the dynamic analysis technique of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, the feature localization method based on multi-source software data analysis comprises the following steps:
Construct the feature localization model based on multi-source software data analysis: define a four-tuple <m, a, b, c>, where m is the final feature localization result and a, b, and c are the feature information extracted from the three different data sources.
Feature information a is located as follows: retrieve from the current software system using information retrieval, querying the source code of the current software system for program code relevant to the current modification request, and record that program code as feature information a.
Fig. 2 shows a block diagram of the working principle of the information retrieval technique of the present invention. The specific retrieval steps are as follows:
A) build the corpus: define the document granularity and build a corpus at that granularity level. The document granularity can be package, class, or method;
B) natural language processing: preprocess the corpus using natural language processing techniques. The preprocessing comprises deleting source code operators and programming language keywords; splitting identifiers and compound terms, for example separating "impactAnalysis" into "impact" and "Analysis"; and stemming words to their roots, for example reducing "impacted" to "impact";
C) index the corpus: retrieve from the corpus the source code that contains keywords of the current modification request. Suppose this source code is e1, e2, e4, e6, e8, e10; then feature information a = {e1, e2, e4, e6, e8, e10}.
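Steps A) to C) can be sketched as follows, at method granularity. This is a minimal sketch under stated assumptions, not the patented implementation: the keyword list is a small excerpt, `stem` is a naive suffix stripper standing in for a real stemmer, matching is plain keyword overlap, and the corpus contents and the names `split_identifier`, `stem`, `preprocess`, and `retrieve` are hypothetical.

```python
import re

# Assumption: a small excerpt of Java keywords; a real corpus builder would use the full list.
JAVA_KEYWORDS = {"public", "private", "void", "class", "return", "new", "int", "static"}


def split_identifier(identifier):
    """Split camelCase and snake_case, e.g. 'impactAnalysis' -> ['impact', 'Analysis']."""
    parts = []
    for chunk in identifier.split("_"):
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk))
    return parts


def stem(word):
    """Naive suffix stripping (stand-in for a real stemmer), e.g. 'impacted' -> 'impact'."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Drop operators and keywords, split identifiers, and stem the resulting terms."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text)  # operators and punctuation are skipped
    terms = set()
    for token in tokens:
        if token in JAVA_KEYWORDS:
            continue
        terms.update(stem(part) for part in split_identifier(token))
    return terms


def retrieve(corpus, request):
    """Return the corpus entries sharing at least one preprocessed term with the request."""
    query = preprocess(request)
    return {name for name, source in corpus.items() if query & preprocess(source)}


# Hypothetical method-level corpus: element name -> source text.
corpus = {
    "e1": "public void impactAnalysis() { return; }",
    "e2": "int computeTotal(int x) { return x + 1; }",
}
a = retrieve(corpus, "impacted elements are wrong")
print(a)
```

Because the request and the corpus pass through the same `preprocess` function, "impacted" in the request and "impactAnalysis" in the code both reduce to the term "impact" and therefore match. A ranked retrieval model could replace the overlap test without changing the surrounding steps.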
Feature information b is located as follows: mine historical evolution information using data mining. Fig. 3 shows a block diagram of the working principle of the data mining technique of the present invention. First, historical modification requests are extracted from the evolution history repository and their information is integrated. Suppose that, after integration, the historical modification requests and the elements each of them modified are as shown in Table 1:
[Table 1: historical modification requests and their corresponding modified elements; the table image is not reproduced in this text]
Then the evolution history repository is queried for historical modification requests relevant to the current modification request, by computing a similarity matrix between the historical modification requests and the current modification request. Finally, the union is taken of the elements modified by the relevant historical modification requests, and the resulting program code is recorded as feature information b. If historical modification requests c2 and c5 are relevant to the current modification request, then feature information b = c2 ∪ c5 = {e2, e3, e5, e7, e4, e12}.
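The mining step can be sketched as below. Several things here are assumptions rather than content of the patent: the similarity measure is not specified in the text, so Jaccard similarity over term sets with a threshold is used as a stand-in; Table 1 is not available, so the request terms and the split of elements between c2 and c5 are invented placeholders, chosen only so that their union matches the b given in the text; `jaccard` and `mine_history` are hypothetical names.

```python
def jaccard(x, y):
    """Jaccard similarity of two term sets (0.0 when both are empty)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def mine_history(history, current_terms, threshold=0.3):
    """Union the modified elements of every historical request similar to the current one."""
    b = set()
    for terms, elements in history.values():
        if jaccard(terms, current_terms) >= threshold:
            b |= elements
    return b


# Placeholder data: request id -> (terms of the request, elements it modified).
history = {
    "c1": ({"login", "timeout"}, {"e1", "e9"}),
    "c2": ({"export", "report", "pdf"}, {"e2", "e3", "e5"}),
    "c5": ({"report", "pdf", "layout"}, {"e4", "e7", "e12"}),
}
b = mine_history(history, {"report", "pdf", "export", "layout"})
print(sorted(b))
```

With these placeholders, c2 and c5 each score 0.75 against the current request and c1 scores 0, so only c2 and c5 contribute to the union.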
Feature information c is located as follows: analyse execution traces using dynamic analysis. Fig. 4 shows a block diagram of the working principle of the dynamic analysis technique of the present invention. First, JPDA is used to collect the marked execution information and TPTP is used to collect the complete execution information; the marked execution information is subtracted from the complete execution information, and the result is recorded as pending execution information. Then the pending execution information is statically analysed to obtain the information set within it that is relevant to the current modification request; the union of that information set and the marked execution information is taken, and the resulting program code is recorded as feature information c. Suppose the marked execution information is g1 = {e1, e3, e4} and the complete execution information is g2 = {e1, e2, e3, e4, e5, e6, e7, e8}. The pending execution information is then g3 = g2 - g1 = {e2, e5, e6, e7, e8}. Static analysis of g3 yields the information set g4 relevant to the current modification request; if g4 = {e2, e5}, then the union of g1 and g4 gives feature information c = g1 ∪ g4 = {e1, e2, e3, e4, e5}.
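The set operations in this step can be checked with a short sketch using the example values from the text. Trace collection through JPDA and TPTP is outside its scope, and the static analysis is represented by a predicate `is_related`, which is an assumption standing in for whatever analysis selects g4; `locate_from_traces` is a hypothetical name.

```python
def locate_from_traces(marked, full, is_related):
    """Compute feature information c from marked and complete execution information."""
    pending = full - marked                                 # g3 = g2 - g1
    related = {elem for elem in pending if is_related(elem)}  # g4, chosen by static analysis
    return marked | related                                 # c = g1 ∪ g4


g1 = {"e1", "e3", "e4"}                                     # marked execution information
g2 = {"e1", "e2", "e3", "e4", "e5", "e6", "e7", "e8"}       # complete execution information
g4 = {"e2", "e5"}                                           # outcome of static analysis in the example
c = locate_from_traces(g1, g2, lambda elem: elem in g4)
print(sorted(c))
```

The subtraction keeps the static analysis focused on elements that were executed but not already marked, which is why g1 is added back with a union at the end.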
The feature localization result m is computed as follows: take the intersection of the three kinds of feature information a, b, and c, and output the feature localization result m, that is, m = a ∩ b ∩ c = {e1, e2, e4, e6, e8, e10} ∩ {e2, e3, e5, e7, e4, e12} ∩ {e1, e2, e3, e4, e5} = {e2, e4}.
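The final combination is a plain set intersection, which the following sketch reproduces with the example values from the text. The function name `combine` is hypothetical.

```python
def combine(a, b, c):
    """Return the elements reported by all three data sources."""
    return a & b & c


a = {"e1", "e2", "e4", "e6", "e8", "e10"}   # from information retrieval
b = {"e2", "e3", "e5", "e7", "e4", "e12"}   # from evolution history mining
c = {"e1", "e2", "e3", "e4", "e5"}          # from execution trace analysis
m = combine(a, b, c)
print(sorted(m))
```

An element survives only if every source reports it, so the result can be no larger than the smallest of the three inputs; an element missed by any single source is dropped from m.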
The present invention is not limited to the above embodiment. For example, the steps that locate feature information a, feature information b, and feature information c have no required order and may be performed in any order. On the basis of the technical solution disclosed by the invention, a person skilled in the art can, according to the disclosed technical content and without creative effort, make substitutions and variations to some of the technical features; all such substitutions and variations fall within the scope of protection of the present invention.

Claims (3)

1. A feature localization method based on multi-source software data analysis, characterized in that it comprises the following steps:
Step 1: retrieve from the current software system using information retrieval: query the source code of the current software system for program code relevant to the current modification request, and record said program code as feature information a;
Step 2: mine historical evolution information using data mining: query the evolution history repository for historical modification requests relevant to said current modification request, take the union of the elements modified by the relevant historical modification requests, and record the resulting program code as feature information b;
Step 3: analyse execution traces using dynamic analysis, said execution traces comprising marked execution information and complete execution information: subtract the marked execution information from the complete execution information and record the result as pending execution information; then statically analyse the pending execution information to obtain the information set within it that is relevant to said current modification request; take the union of said information set and the marked execution information, and record the resulting program code as feature information c;
Step 4: compute the intersection of the three kinds of feature information a, b, and c, and output the feature localization result m.
2. The feature localization method based on multi-source software data analysis according to claim 1, characterized in that the retrieval steps of said information retrieval technique are as follows:
A) build the corpus: define the document granularity and build a corpus at said granularity level;
B) natural language processing: preprocess said corpus using natural language processing techniques, said preprocessing comprising deleting source code operators and programming language keywords, splitting identifiers and compound terms, and stemming words to their roots;
C) index the corpus: retrieve from the corpus the source code that contains keywords of said current modification request.
3. The feature localization method based on multi-source software data analysis according to claim 1, characterized in that said marked execution information is collected using JPDA, and said complete execution information is collected using TPTP.
CN201410031303.XA 2014-01-22 2014-01-22 Feature localization method based on multi-source software data analysis Active CN103744788B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410031303.XA CN103744788B (en) 2014-01-22 2014-01-22 Feature localization method based on multi-source software data analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410031303.XA CN103744788B (en) 2014-01-22 2014-01-22 Feature localization method based on multi-source software data analysis

Publications (2)

Publication Number Publication Date
CN103744788A true CN103744788A (en) 2014-04-23
CN103744788B CN103744788B (en) 2016-08-31

Family

ID=50501807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410031303.XA Active CN103744788B (en) 2014-01-22 2014-01-22 Feature localization method based on multi-source software data analysis

Country Status (1)

Country Link
CN (1) CN103744788B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070260628A1 (en) * 2006-05-02 2007-11-08 Tele Atlas North America, Inc. System and method for providing a virtual database environment and generating digital map information
CN102508767A (en) * 2011-09-30 2012-06-20 东南大学 Software maintenance method based on formal concept analysis
CN103136332A (en) * 2013-01-28 2013-06-05 福州新锐同创电子科技有限公司 Method for achieving making, management and retrieval of knowledge points

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨帅等: "软件演化过程中运行实例的在线可信演化", 《计算机应用研究》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106663003A (en) * 2014-06-13 2017-05-10 查尔斯斯塔克德拉珀实验室公司 Systems and methods for software analysis
CN105930162A (en) * 2016-04-24 2016-09-07 复旦大学 Subgraph search-based feature location method
CN105930162B (en) * 2016-04-24 2019-05-03 复旦大学 A kind of characteristic positioning method based on subgraph search
CN107491299A (en) * 2017-07-04 2017-12-19 扬州大学 Towards developer's portrait modeling method of multi-source software development data fusion
CN107832781A (en) * 2017-10-18 2018-03-23 扬州大学 A kind of software defect towards multi-source data represents learning method
CN107832781B (en) * 2017-10-18 2021-09-14 扬州大学 Multi-source data-oriented software defect representation learning method
CN112699253A (en) * 2019-10-23 2021-04-23 广州彩熠灯光股份有限公司 Source code positioning method, system, medium and device

Also Published As

Publication number Publication date
CN103744788B (en) 2016-08-31


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210528

Address after: Yangzhou xiaonaxiong robot Co., Ltd., 3 / F, building 9, 20 Hongyang Road, Yangzhou Economic Development Zone, Jiangsu Province, 225000

Patentee after: Yangzhou xiaonaxiong robot Co.,Ltd.

Address before: 225009 No. 88, South University Road, Jiangsu, Yangzhou

Patentee before: YANGZHOU University

TR01 Transfer of patent right

Effective date of registration: 20210709

Address after: Yangzhou Gezhi Optoelectronic Technology Co., Ltd., 2601, 217 Kaifa West Road, high tech Industrial Development Zone, Yangzhou City, Jiangsu Province, 225000

Patentee after: Yangzhou Gezhi Photoelectric Technology Co.,Ltd.

Address before: Yangzhou xiaonaxiong robot Co., Ltd., 3 / F, building 9, 20 Hongyang Road, Yangzhou Economic Development Zone, Jiangsu Province, 225000

Patentee before: Yangzhou xiaonaxiong robot Co.,Ltd.