CN109918129B - Software key function identification method based on g-kernel decomposition - Google Patents

Software key function identification method based on g-kernel decomposition

Info

Publication number
CN109918129B
Authority
CN
China
Prior art keywords
function
software
node
nodes
core
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910033265.4A
Other languages
Chinese (zh)
Other versions
CN109918129A (en)
Inventor
潘伟丰
李�浩
王家乐
姜波
柴春来
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhun Shu Technology Co ltd
Original Assignee
Shenzhen Zhun Shu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Zhun Shu Technology Co ltd
Priority to CN201910033265.4A
Publication of CN109918129A
Application granted
Publication of CN109918129B
Active
Anticipated expiration

Landscapes

  • Stored Programmes (AREA)

Abstract

The invention discloses a software key function identification method based on g-kernel decomposition, comprising the following steps: abstracting the function execution process of software written in the Java language at run time into a function dependency graph; calculating the g-core number of each function node based on the function dependency graph; and sorting the function nodes in descending order, using the g-core number as the measure of node importance, to obtain the key functions. The function dependency graph is built from dynamic analysis of the Java software during execution, so it represents the functions in the software and the real interactions among them. It is more accurate than static analysis methods based on source code and, to a certain extent, solves the problem that the models used by key element identification methods based on static analysis are inaccurate. The method addresses the scarcity of prior-art techniques for identifying fine-grained key functions, and is of significance for improving the efficiency of software understanding, software testing, code maintenance and the like.

Description

Software key function identification method based on g-kernel decomposition
Technical Field
The invention relates to a software key function identification method, in particular to a software key function identification method based on g-kernel decomposition.
Background
Computer software has entered every aspect of our lives and has become indispensable. Software is changing our lives and will continue to do so. As demands on the functionality and performance of software keep rising, software grows in scale and complexity, and its quality becomes difficult to guarantee. When new requirements arise, existing software often has to be adapted to them through maintenance work. However, the complexity of software makes maintenance increasingly difficult, and maintenance cost remains high, accounting for more than 60% of the total cost of software. To maintain existing software, one must first understand it, yet the complexity of the software makes it hard to understand. Providing an effective technique that helps maintainers understand software, thereby simplifying maintenance work and reducing maintenance cost, is therefore a technical problem to be solved.
Understanding software starting from its key elements (packages, classes, functions, attributes, etc.) is one feasible approach. The key elements are understood first, then the elements associated with them, so that the entire software is understood step by step. Several approaches have been proposed for identifying key elements in software. Zaidman et al. construct a static class dependency graph and identify key classes using the HITS algorithm. Zhou Yuming et al. abstract a software system at class granularity as a class dependency graph and identify key classes using methods such as the PageRank algorithm, HITS and betweenness centrality. Jiang Shujuan et al. construct a state transition model of the software and identify key classes by calculating the complexity of the nodes of the state transition tree. Pan Weifeng et al. construct software structure graphs at class granularity and package granularity, and identify key classes and key packages in the software using the PageRank algorithm. Although some work exists on identifying key elements in software, the following shortcomings remain:
(1) Existing work mainly focuses on static analysis of software code and lacks dynamic analysis of the software as it actually runs. Static analysis does not need to run the software and depends only on the source code; the relationships it extracts among elements are effectively the relationships under the "worst" case and may contain redundant relationships. Dynamic analysis needs to run the software and collects the elements, and the relationships among them, during execution, so it represents the real interactions among elements. Dynamic analysis is therefore more accurate than static analysis.
(2) Existing work mainly targets the identification of key packages and key classes and lacks the identification of key functions.
Packages and classes are relatively coarse-grained software elements, while functions are relatively fine-grained ones. A technique for identifying key functions makes up for this shortcoming of existing work, so that key elements of software can be identified comprehensively from coarse to fine granularity, providing technical support for software understanding, software testing and software maintenance.
Disclosure of Invention
The invention aims to provide a software key function identification method based on g-kernel decomposition aiming at the defects of the prior art.
The technical problem of the invention is mainly solved by the following technical scheme: a software key function identification method based on g-kernel decomposition comprises the following steps:
(1) Abstracting software written in the Java language into a function dependency graph FG = (N, D) at function granularity, wherein N is the set of function nodes in the software; $D = \{(f_i, f_j)\}$ ($f_i \in N$, $f_j \in N$) is the set of undirected edges representing the calling relationships among functions; and each edge is assigned a non-negative integer as the strength value of the function call relationship.
(2) Calculating the g-core number g(i) of function node i, based on the FG constructed in step (1), as the importance value of the function corresponding to the node.
(3) Sorting the function nodes in descending order of the g-core numbers obtained in step (2) to obtain the key functions.
Further, the functions in step (1) and the calling relationships among them are obtained from the actual running of the Java software on the Java virtual machine, that is, by dynamic analysis rather than by static analysis of source code.
Further, the strength value of an edge in step (1) is the number of calls between the two functions. The number of calls is likewise obtained from the actual running of the Java software on the Java virtual machine, by dynamic analysis rather than by static analysis of source code.
Further, the calculation of the g-core number g(i) of node i in step (2) specifically comprises the following sub-steps:
(2.1) Calculating the weighted degree of every function node in the FG obtained in step (1). The weighted degree $w_j$ of function node j is defined as the sum of the strength values of all edges in the FG that are connected to that function node, i.e.:
$$w_j = \sum_{m \in v_j} w(j, m)$$
where $v_j$ is the set of neighbor function nodes of function node j, and w(j, m) is the strength value on edge (j, m).
(2.2) Calculating the degree of every function node in the FG obtained in step (1). The degree $k_j$ of function node j is defined as the number of edges in the FG that are connected to that function node.
(2.3) Calculating the geometric mean degree of every function node in the FG obtained in step (1). The geometric mean degree $s_j$ of function node j is numerically equal to the integer closest to the arithmetic square root of the product of $w_j$ and $k_j$. $s_j$ is calculated as:
$$s_j = \mathrm{round}\left(\sqrt{w_j \times k_j}\right)$$
where round(n) returns the integer closest to the value n.
(2.4) Obtaining the g-cores (g = 1, 2, 3, …) of the FG obtained in step (1): function nodes whose geometric mean degree is smaller than g are repeatedly removed from the FG together with their connecting edges; the subgraph that remains is the g-core of the FG.
(2.5) Calculating the g-core numbers of all function nodes in the FG obtained in step (1): the function nodes in the g-core and in the (g+1)-core are compared; if a function node exists in the g-core but has been removed in the (g+1)-core, the g-core number of that function node is g.
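The three per-node quantities of sub-steps (2.1) to (2.3) can be sketched in Python as follows. This is only an illustrative sketch: the function names, the representation of edges as a dictionary keyed by unordered pairs, and the toy graph with nodes a, b and c are assumptions, not part of the invention. Because the strength values are non-negative integers, round(√n) is computed here with integer arithmetic; the square root of an integer is never exactly halfway between two integers, so no tie-breaking rule is needed.

```python
from math import isqrt

def weighted_degree(edges, node):
    """w_j: sum of the strength values of all edges incident to `node`."""
    return sum(w for pair, w in edges.items() if node in pair)

def degree(edges, node):
    """k_j: number of edges incident to `node`."""
    return sum(1 for pair in edges if node in pair)

def round_sqrt(n):
    """Nearest integer to sqrt(n) for a non-negative integer n.

    With r = floor(sqrt(n)), sqrt(n) > r + 0.5 exactly when n > r*r + r.
    """
    r = isqrt(n)
    return r + 1 if n > r * r + r else r

def geometric_mean_degree(edges, node):
    """s_j = round(sqrt(w_j * k_j))."""
    return round_sqrt(weighted_degree(edges, node) * degree(edges, node))

# Hypothetical toy graph: unordered pair of function names -> call count.
edges = {
    frozenset(("a", "b")): 4,
    frozenset(("b", "c")): 1,
}
for name in ("a", "b", "c"):
    print(name, weighted_degree(edges, name), degree(edges, name),
          geometric_mean_degree(edges, name))
```

Scanning all edges for each node keeps the sketch short; for a large FG an adjacency list would avoid the repeated scans.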
Further, in step (3), the function nodes are sorted in descending order using the bubble sort algorithm.
Further, in step (3), after the descending sort, the functions ranked in the top 15% (rounded up) are taken as the identified key functions.
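The cut-off in step (3) can be sketched as follows. The helper names and the sample scores are hypothetical, and Python's built-in stable sort is used here in place of the bubble sort named above, since both produce a descending order. The count is computed with integer arithmetic so that the rounding up does not depend on floating-point behaviour.

```python
def key_function_count(total, percent=15):
    """Number of functions in the top `percent` percent, rounded up."""
    return (percent * total + 99) // 100

def select_key_functions(core_numbers, percent=15):
    """Return the names ranked in the top `percent` percent by g-core number.

    `core_numbers` maps a function name to its g-core number. Ties keep their
    original order, so which of several tied functions is selected depends on
    the input order (an assumption of this sketch).
    """
    ranked = sorted(core_numbers.items(), key=lambda item: -item[1])
    count = key_function_count(len(ranked), percent)
    return [name for name, _ in ranked[:count]]

# Hypothetical g-core numbers for seven functions.
scores = {"f1": 5, "f2": 3, "f3": 3, "f4": 2, "f5": 1, "f6": 1, "f7": 0}
print(key_function_count(len(scores)))
print(select_key_functions(scores))
```

With seven functions, 15% is 1.05, which rounds up to two key functions.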
Compared with the prior art, the invention has the following advantages and positive effects:
(1) The FG is constructed from dynamic analysis of the Java software during execution and represents the functions in the software and the real interactions among them. It is more accurate than static analysis methods based on source code and, to a certain extent, solves the problem that the models used by key element identification methods based on static analysis are inaccurate.
(2) The invention provides a method for identifying key functions in software based on g-kernel decomposition. To a certain extent, it solves the problem that existing methods focus only on identifying coarse-grained key packages and key classes while ignoring fine-grained key functions, and it can provide technical support for software understanding, software testing, code maintenance and similar work.
Drawings
FIG. 1 shows source code written in the Java language according to an embodiment of the invention;
FIG. 2 shows the FG constructed in an embodiment of the invention;
FIG. 3 shows the g-kernel decomposition process of an embodiment of the invention.
Detailed Description
The technical scheme of the invention is further explained below through an embodiment and the accompanying drawings.
The invention provides a software key function identification method based on g-kernel decomposition, with the following specific steps:
(1) Software written in the Java language is abstracted at function granularity into a function dependency graph FG = (N, D). FIG. 1 shows a piece of Java source code. When this code runs on the JVM, the main function is executed first; main then calls the add function 9 times and the sub1 function once; during its execution, sub1 calls sub2 and add once each. From this run, the FG shown in FIG. 2 is obtained, in which the text beside each node is the name of the function the node represents. Here N = {main, sub1, sub2, add} is the set of function nodes; D = {(main, sub1), (main, add), (sub1, sub2), (sub1, add)} is the set of undirected edges representing the calling relationships among functions; and the number on each edge is the number of times the corresponding call occurred.
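The construction of the FG from such a run can be sketched in Python as below. The sketch assumes the run has already been recorded as a list of (caller, callee) pairs; how the trace is captured on the JVM is not shown, and the names trace and build_fg are hypothetical. Ignoring recursive self-calls is also an assumption of the sketch, since the description does not address them.

```python
from collections import Counter

# Hypothetical call trace for the run described above: one (caller, callee)
# pair per observed call. A real trace would come from instrumenting the JVM.
trace = [("main", "add")] * 9 + [
    ("main", "sub1"),
    ("sub1", "sub2"),
    ("sub1", "add"),
]

def build_fg(call_trace):
    """Return (nodes, edges); edges maps an unordered pair to its call count."""
    edges = Counter()
    for caller, callee in call_trace:
        if caller != callee:  # assumption: recursive self-calls are ignored
            # frozenset makes the edge undirected: a->b and b->a share one key
            edges[frozenset((caller, callee))] += 1
    nodes = set()
    for pair in edges:
        nodes |= pair
    return nodes, dict(edges)

nodes, edges = build_fg(trace)
print(sorted(nodes))
for pair, count in sorted(edges.items(), key=lambda item: sorted(item[0])):
    print(sorted(pair), count)
```

Keying each edge by an unordered pair means that calls in either direction between two functions accumulate on the same undirected edge, which matches the definition of D as a set of undirected edges.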
(2) The g-core number g(i) of function node i is calculated, based on the FG constructed in step (1), as the importance value of the function corresponding to the node. The calculation of g(i) specifically comprises the following sub-steps (together referred to as the g-kernel decomposition process of the FG):
(2.1) Calculating the weighted degree of every function node in the FG obtained in step (1). The weighted degree $w_j$ of function node j is defined as the sum of the strength values of all edges in the FG that are connected to that function node, i.e.:
$$w_j = \sum_{m \in v_j} w(j, m)$$
where $v_j$ is the set of neighbor function nodes of function node j, and w(j, m) is the strength value on edge (j, m). Thus, in FIG. 2 the weighted degree of function node main is $w_{main} = 9 + 1 = 10$, that of sub1 is $w_{sub1} = 1 + 1 + 1 = 3$, that of sub2 is $w_{sub2} = 1$, and that of add is $w_{add} = 9 + 1 = 10$.
(2.2) Calculating the degree of every function node in the FG obtained in step (1). The degree $k_j$ of function node j is defined as the number of edges in the FG that are connected to that function node. Thus, in FIG. 2 the degree of function node main is $k_{main} = 1 + 1 = 2$, that of sub1 is $k_{sub1} = 1 + 1 + 1 = 3$, that of sub2 is $k_{sub2} = 1$, and that of add is $k_{add} = 1 + 1 = 2$.
(2.3) Calculating the geometric mean degree of every function node in the FG obtained in step (1). The geometric mean degree $s_j$ of function node j is numerically equal to the integer closest to the arithmetic square root of the product of $w_j$ and $k_j$. $s_j$ is calculated as:
$$s_j = \mathrm{round}\left(\sqrt{w_j \times k_j}\right)$$
where round(n) returns the integer closest to the value n. Thus, the geometric mean degree of function node main in FIG. 2 is
$$s_{main} = \mathrm{round}\left(\sqrt{10 \times 2}\right) = 4$$
(2.4) Obtaining the g-cores (g = 1, 2, 3, …) of the FG obtained in step (1): function nodes whose geometric mean degree is smaller than g are repeatedly removed from the FG together with their connecting edges; the subgraph that remains is the g-core of the FG. The g-kernel decomposition process of the FG in FIG. 2 is shown in FIG. 3.
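Calculating the 1-core (g = 1). Based on step (2.3), we obtain
$$s_{main} = 4 \geq 1, \quad s_{sub1} = \mathrm{round}\left(\sqrt{3 \times 3}\right) = 3 \geq 1$$
$$s_{sub2} = \mathrm{round}\left(\sqrt{1 \times 1}\right) = 1 \geq 1, \quad s_{add} = \mathrm{round}\left(\sqrt{10 \times 2}\right) = 4 \geq 1$$
so no node is removed. The resulting 1-core is shown in the second sub-graph from the left of FIG. 3.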
Calculating the 2-core (g = 2). Starting from the 1-core, because
$$s_{main} = 4 \geq 2, \quad s_{sub1} = 3 \geq 2$$
$$s_{sub2} = 1 < 2, \quad s_{add} = 4 \geq 2$$
node sub2 and its connecting edges are removed from the 1-core. The resulting 2-core is shown in the third sub-graph from the left of FIG. 3.
Calculating the 3-core (g = 3). Starting from the 2-core, the geometric mean degree of each node is recalculated on the third sub-graph from the left of FIG. 3. Because
$$s_{sub1} = \mathrm{round}\left(\sqrt{2 \times 2}\right) = 2 < 3$$
node sub1 and its connecting edges are removed from the 2-core. The resulting 3-core is shown in the fourth sub-graph from the left of FIG. 3.
Calculating the 4-core (g = 4). Starting from the 3-core, the geometric mean degree of each node is recalculated on the fourth sub-graph from the left of FIG. 3. Because
$$s_{main} = s_{add} = \mathrm{round}\left(\sqrt{9 \times 1}\right) = 3 < 4$$
nodes main and add and their connecting edge are all removed from the 3-core. The resulting 4-core is empty, i.e. the FG of FIG. 2 has no 4-core.
(2.5) Calculating the g-core numbers of all function nodes in the FG obtained in step (1): the function nodes in the g-core and in the (g+1)-core are compared; if a function node exists in the g-core but has been removed in the (g+1)-core, the g-core number of that function node is g. In FIG. 3, node sub2 is in the 1-core but not in the 2-core, so its g-core number is 1. Similarly, the g-core number of node sub1 is 2, and the g-core number of nodes main and add is 3.
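Sub-steps (2.1) to (2.5) can be checked on the FG of FIG. 2 with a short Python sketch. The data structure and the function names are assumptions made for illustration; the peeling loop follows the description above: for each g, nodes with geometric mean degree below g are removed repeatedly, with the degrees recalculated after every removal, and a node removed while computing the g-core receives the g-core number g − 1.

```python
from math import isqrt

# FG of FIG. 2 as described in step (1): unordered pair -> number of calls.
EDGES = {
    frozenset(("main", "add")): 9,
    frozenset(("main", "sub1")): 1,
    frozenset(("sub1", "sub2")): 1,
    frozenset(("sub1", "add")): 1,
}

def geometric_mean_degree(edges, node):
    """s_j = round(sqrt(w_j * k_j)), computed with integer arithmetic."""
    incident = [w for pair, w in edges.items() if node in pair]
    n = sum(incident) * len(incident)       # w_j * k_j
    r = isqrt(n)
    return r + 1 if n > r * r + r else r    # nearest integer to sqrt(n)

def g_core_numbers(edges):
    """Return {node: g-core number} by peeling the graph for g = 1, 2, 3, ..."""
    edges = dict(edges)
    nodes = set().union(*edges) if edges else set()
    core = {}
    g = 1
    while nodes:
        while True:
            # Nodes that cannot belong to the g-core of the current subgraph.
            weak = {v for v in nodes if geometric_mean_degree(edges, v) < g}
            if not weak:
                break
            for v in weak:
                core[v] = g - 1  # present in the (g-1)-core, removed in the g-core
            nodes -= weak
            edges = {p: w for p, w in edges.items() if not (p & weak)}
        g += 1
    return core

core_numbers = g_core_numbers(EDGES)
print(core_numbers)
```

Run on the FG of FIG. 2, the sketch assigns 1 to sub2, 2 to sub1, and 3 to both main and add, in agreement with the values derived above.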
(3) Based on the g-core numbers of the function nodes obtained in step (2), the function nodes are sorted in descending order using the bubble sort algorithm, and the functions ranked in the top 15% (a value commonly adopted in the prior art) are taken as the identified key functions. Based on the result of step (2.5), the resulting ranking is (main = add: 3) > (sub1: 2) > (sub2: 1). The function node ranked in the top 15% (15% × 4 = 0.6, rounded up to 1) is therefore main or add.
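The descending bubble sort of step (3) can be sketched in Python as follows, using the g-core numbers derived in step (2.5). The function name and the input order of the nodes are assumptions. Because elements are swapped only when the left one is strictly smaller, nodes with equal g-core numbers keep their input order, which is why main is listed before add here; the description treats the two as equally ranked.

```python
def bubble_sort_desc(items):
    """Sort (name, g_core_number) pairs by g-core number, largest first."""
    items = list(items)  # work on a copy
    n = len(items)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            # Swap only on strict "<" so that ties keep their input order.
            if items[j][1] < items[j + 1][1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:  # already sorted, stop early
            break
    return items

# g-core numbers from step (2.5); the input order is arbitrary.
core_numbers = [("main", 3), ("sub1", 2), ("sub2", 1), ("add", 3)]
ranked = bubble_sort_desc(core_numbers)

top_count = (15 * len(ranked) + 99) // 100  # 15% of 4 = 0.6, rounded up to 1
key_functions = [name for name, _ in ranked[:top_count]]
print(ranked)
print(key_functions)
```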
The specific embodiments described herein merely illustrate the spirit of the invention; the equal values of main and add are just one situation that can occur in practice and do not represent all situations. Those skilled in the art may make various modifications or additions to the described embodiments, or substitute them in a similar manner, without departing from the spirit of the invention or exceeding the scope defined in the appended claims.

Claims (4)

1. A software key function identification method based on g-kernel decomposition is characterized by comprising the following steps:
(1) Abstracting software written in the Java language into a function dependency graph FG = (N, D) at function granularity; wherein N is the set of function nodes in the software; D is the set of undirected edges and represents the calling relationships among functions; each edge is assigned a non-negative integer as the strength value of the function call relationship; the strength value on an edge is the number of calls between the functions; the numbers of calls and the calling relationships among the functions are obtained from the actual running of the Java software on the Java virtual machine, by dynamic analysis rather than by static analysis based on source code;
(2) Calculating the g-core number g(i) of function node i, based on the FG constructed in step (1), as the importance value of the function corresponding to the node;
(3) Sorting the function nodes in descending order of the g-core numbers obtained in step (2) to obtain the key functions.
2. The software key function identification method based on g-kernel decomposition as claimed in claim 1, wherein the calculation of the g-core number g(i) of node i in step (2) specifically comprises the following sub-steps:
(2.1) calculating the weighted degree of every function node in the FG obtained in step (1); the weighted degree $w_j$ of function node j is defined as the sum of the strength values of all edges in the FG that are connected to that function node, i.e.:
$$w_j = \sum_{m \in v_j} w(j, m)$$
wherein $v_j$ is the set of neighbor function nodes of function node j, and w(j, m) is the strength value on edge (j, m);
(2.2) calculating the degree of every function node in the FG obtained in step (1); the degree $k_j$ of function node j is defined as the number of edges in the FG that are connected to that function node;
(2.3) calculating the geometric mean degree of every function node in the FG obtained in step (1); the geometric mean degree $s_j$ of function node j is numerically equal to the integer closest to the arithmetic square root of the product of $w_j$ and $k_j$; $s_j$ is calculated as:
$$s_j = \mathrm{round}\left(\sqrt{w_j \times k_j}\right)$$
wherein round(n) returns the integer closest to the value n;
(2.4) obtaining the g-cores of the FG obtained in step (1), wherein g = 1, 2, 3, …: function nodes whose geometric mean degree is smaller than g are repeatedly removed from the FG together with their connecting edges, and the subgraph that remains is the g-core of the FG;
(2.5) calculating the g-core numbers of all function nodes in the FG obtained in step (1): the function nodes in the g-core and in the (g+1)-core are compared; if a function node exists in the g-core but has been removed in the (g+1)-core, the g-core number of that function node is g.
3. The software key function identification method based on g-kernel decomposition as claimed in claim 1, wherein in the step (3), the function nodes are sorted in descending order by using a bubble sorting algorithm.
4. The method for identifying key functions of software based on g-kernel decomposition as claimed in claim 1, wherein in step (3), after descending order, the rounded-up top 15% function is used as the identified key function.
CN201910033265.4A 2019-01-14 2019-01-14 Software key function identification method based on g-kernel decomposition Active CN109918129B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910033265.4A CN109918129B (en) 2019-01-14 2019-01-14 Software key function identification method based on g-kernel decomposition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910033265.4A CN109918129B (en) 2019-01-14 2019-01-14 Software key function identification method based on g-kernel decomposition

Publications (2)

Publication Number Publication Date
CN109918129A CN109918129A (en) 2019-06-21
CN109918129B CN109918129B (en) 2022-12-23

Family

ID=66960257

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910033265.4A Active CN109918129B (en) 2019-01-14 2019-01-14 Software key function identification method based on g-kernel decomposition

Country Status (1)

Country Link
CN (1) CN109918129B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045574A (en) * 2015-06-24 2015-11-11 广东电网有限责任公司电力科学研究院 Software key function identification method based on complex network fault propagation
CN105389192A (en) * 2015-12-18 2016-03-09 浙江工商大学 Method for measuring importance of software class based on weighted q2 index

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102009059939A1 (en) * 2009-12-22 2011-06-30 Giesecke & Devrient GmbH, 81677 Method for compressing identifiers

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"基于软件网络加权k-核分析的关键类识别方法";潘伟丰等;《电子学报》;20180830;第46卷(第5期);第1071-1077页,图1-2 *

Also Published As

Publication number Publication date
CN109918129A (en) 2019-06-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220927

Address after: No. n817, 3rd floor, xingguangyingjing, No. 117, Shuiyin Road, Yuexiu District, Guangzhou, Guangdong 510000

Applicant after: Zhiyueyun (Guangzhou) Digital Information Technology Co.,Ltd.

Address before: 310018, No. 18 Jiao Tong Street, Xiasha Higher Education Park, Hangzhou, Zhejiang

Applicant before: ZHEJIANG GONGSHANG University

TA01 Transfer of patent application right

Effective date of registration: 20221202

Address after: 1906A, 19/F, Jingdi Building, No. 3, Fuhua Third Road, Fushan Community, Futian District, Shenzhen, Guangdong 518000

Applicant after: Shenzhen Zhun Shu Technology Co.,Ltd.

Address before: No. n817, 3rd floor, xingguangyingjing, No. 117, Shuiyin Road, Yuexiu District, Guangzhou, Guangdong 510000

Applicant before: Zhiyueyun (Guangzhou) Digital Information Technology Co.,Ltd.

GR01 Patent grant