CN107832080A - Component importance measurement method based on node betweenness in a software evolution environment

Info

Publication number
CN107832080A
Authority
CN
China
Prior art keywords
component
node
betweenness
software
dependency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710977888.8A
Other languages
Chinese (zh)
Inventor
成蕾
林英
李彤
谢仲文
莫启
秦江龙
王晓芳
郑交交
李响
杨真谛
郑明�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan University YNU
Original Assignee
Yunnan University YNU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan University (YNU)
Priority to CN201710977888.8A
Publication of CN107832080A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/70: Software maintenance or management
    • G06F8/77: Software metrics

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The invention belongs to the technical field of software component importance measurement and discloses a component importance measurement method based on node betweenness in a software evolution environment. Taking the software architecture as a blueprint and support, it proposes a directed graph model of the software architecture and introduces node betweenness to measure the importance of components. The request dependency and service dependency of each component are analyzed, and the Pearson correlation coefficient is used to find the factor most relevant to node betweenness. Experiments on a large number of open source software source codes show that measuring component importance with node betweenness is effective, and that the sum of a component's request dependency and service dependency is the factor most strongly correlated with its node betweenness. This also points to a further direction of analysis: using dependency relationships to measure component importance.

Description

Component importance measurement method based on node betweenness in software evolution environment
Technical Field
The invention belongs to the technical field of software component importance measurement, and particularly relates to a component importance measurement method based on node betweenness in a software evolution environment.
Background
Software systems are increasingly delivered as combinations of services and components, and are continuously adjusted and extended as the needs of society develop. As a result, their scale grows and their structure acquires multiple levels, different granularities and multiple integration modes. The term evolution is used to describe this continuous change. Software evolution is the series of complex change activities, common to software systems, through which a system gradually changes until it reaches the desired form.
Software has two basic properties: construction and evolution. Software architecture (SA) has matured as a blueprint that supports understanding of the overall structure of software at the macro level. However, as software systems grow in functionality and scale, understanding and controlling software evolution becomes increasingly complex and difficult. Traditional measurement methods have made important contributions to software evolution and reveal certain of its characteristics. However, they share a common weakness: they descend into the complicated details of the software structure too early, pay insufficient attention to the macro level, and make it difficult to grasp the software structure as a whole.
In the 1990s, Bohner described software changes using the concept of reachability matrices, based on a process framework for software change analysis, but gave no notion of how much each constituent element contributes to the software. Valverde et al. were the first to analyze object-oriented software systems, abstracting the class diagrams of the systems into directed network graphs. Myers, Valverde, Moura et al. used directed networks to represent the structure of software systems and proposed a reconstruction-based software model on this basis. Subsequently, a group of researchers in China, including wangnao et al., used weighted networks to analyze the software networks of complex software systems, and others such as Kingshihui and Zhangukin carried out software structure analysis, obtaining a series of results.
In summary, the problems of the prior art are as follows. Traditional component measurement methods all descend into the complicated details of the software structure too early, so they pay insufficient attention to the macro level and make it difficult to grasp the software structure as a whole. So far there is no standard, generally agreed set of factors for measuring the importance of components in a complex software system. Moreover, because of the complexity of software architecture, current architecture measurements often encounter nodes with similar structures and cannot strictly reflect the differences between software components and their importance. On the one hand, the present technology provides a component importance measurement method that takes these aspects into account at a reasonable computational cost; on the other hand, it ranks the importance of components in a software architecture by combining node dependency with node betweenness, and verifies the result with the Pearson correlation coefficient.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a component importance measurement method based on node betweenness in a software evolution environment.
The invention is realized as follows.
A component importance measurement method based on node betweenness in a software evolution environment takes the software architecture as a blueprint and support, proposes an unweighted directed graph model of the software architecture, and introduces node betweenness to measure the importance of components; the request dependency, service dependency and total dependency of the components are analyzed using the Pearson correlation coefficient to find the factor most relevant to node betweenness.
Further, the method of proposing the directed graph model of the software architecture, taking the software architecture as a blueprint and support, comprises the following:
1) The model G of the SA of a software system is an unweighted directed graph triple <N_G, V(G), E(G)>:
N_G is the name of the SA model of the software system;
V(G) is the set of nodes representing the components that make up the software system;
E(G) is the set of unweighted directed edges representing the relationships between the components that make up the software system;
2) The component V represented by a node is a pair <N_c, F_c>:
N_c is the name of the component;
F_c is a functional description of the component;
3) The interaction between components is an unweighted directed edge E, which is a triple <E_n, V_i, V_j>:
E_n is the unique identifier of the directed edge;
V_i is the component that initiates the dependency, i.e. the start node;
V_j is the component that accepts the dependency, i.e. the end node;
<V_i, V_j> indicates that node V_i points to node V_j;
4) In the SA model G = <N_G, V(G), E(G)>, for a component v_i ∈ V(G), the total number of edges having v_i as start node is the request dependency of v_i, denoted d_req(v_i);
5) In the SA model G = <V(G), E(G)>, for a component v_i ∈ V(G), the total number of edges having v_i as end node is the service dependency of v_i, denoted d_ser(v_i);
The sum of the request dependency and the service dependency of a component is the total dependency of the component, denoted d_sum(v_i);
6) Given a graph G = <V(G), E(G)> and a node v_i ∈ V(G), the ratio of the number of shortest paths in G that pass through v_i to the number of all shortest paths in G is the node betweenness of v_i, denoted C(v_i) and computed according to formula (1),
where δ_st is the total number of shortest paths from node s to node t, and δ_st(v) is the number of those shortest paths that pass through node v.
Further, the method of measuring the importance of components by introducing node betweenness comprises the following steps:
obtaining the components and the associations between them: each class in the source code is taken as one component, and the source code is scanned to obtain the component name identifiers and the relationships between components;
processing the relationship data between components and mapping it into an adjacency matrix: if the number of components is n, the relationships between components are mapped into an n × n adjacency matrix M, where M_11, M_22, ..., M_nn default to 0, except for components with a special self-call;
calculating the request dependency, service dependency and total dependency of the component at each node on the basis of the adjacency matrix M;
calculating the node betweenness of each node: the shortest paths of the whole graph are computed to obtain the total number of shortest paths in the graph and the number of shortest paths passing through each node, and the node betweenness of each component is then calculated according to formula (1); the important components in the SA are identified according to the magnitude of the node betweenness;
calculating the Pearson correlation coefficients between node betweenness and, respectively, the request dependency, the service dependency and the total dependency of the components, each according to formula (2), and analyzing which factor is most relevant to node betweenness;
the Pearson correlation coefficient is calculated as:
$$ r = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}} \qquad (2) $$
where X and Y are two vectors of equal length n whose correlation is to be computed, $X_i$ and $Y_i$ are their i-th elements, $\bar{X}$ and $\bar{Y}$ are the means of X and Y respectively, and exchanging X and Y does not affect the resulting Pearson correlation coefficient.
Further, the method of processing the relationship data between components and mapping it into an adjacency matrix comprises:
Input: the linked list Name of component name identifiers and the linked list Connection of interactions between components;
Output: the adjacency matrix Matrix of the SA model;
Initialization: the two-dimensional array Matrix stores the adjacency matrix of the SA model, and its row and column lengths are both equal to the length of the linked list Name;
all of M_11, M_22, ..., M_nn are initialized to 0, except for components with a special self-call;
starting with the integer pointer i = 0, loop:
assign to the variable row the index in the linked list Name of the start component of Connection[i], the i-th element of the linked list Connection;
assign to the variable column the index in the linked list Name of the end component of Connection[i];
set Matrix[row][column] to 1;
set i to i + 1; if i is less than the length of the linked list Connection, the loop continues; otherwise the loop terminates and the adjacency matrix is obtained;
after the adjacency matrix of the SA model is obtained, the request dependency and service dependency of each component are calculated.
Further, calculating the shortest paths of the whole graph comprises:
Input: the adjacency matrix Matrix of the SA model;
Output: the full-graph shortest paths Pathes of the SA model;
Initialization: integers i and j are used as indices of the elements Matrix[i][j] of the adjacency matrix, where i and j denote different nodes; every pair of nodes is traversed, the shortest paths between all nodes are found from the contents of the adjacency matrix, and each shortest path found is stored in the full-graph shortest path list Pathes.
Further, calculating the node betweenness of each node comprises:
Input: the full-graph shortest paths Pathes of the SA model and the linked list Name of component name identifiers;
Output: the node betweenness of each component of the SA model;
Initialization: the linked list Betweenness stores the node betweenness of the SA model, and its length is equal to the length of the linked list Name;
initialize integers k = 0 and i = 0 and traverse the linked list Name from its first element; each time a path containing node Name[i] is found in the full-graph shortest paths Pathes, add 1 to k; when the traversal is finished, the node betweenness Betweenness[i] of node Name[i] is k divided by the length of the linked list Pathes.
Another object of the invention is to provide a component importance measurement system based on node betweenness in a software evolution environment.
The advantages and positive effects of the invention are as follows. Taking the software architecture as a blueprint and support, the invention proposes a directed graph model of the software architecture and introduces node betweenness to measure the importance of components. The request dependency and service dependency of the components are analyzed, and the Pearson correlation coefficient is used to find the factor most relevant to node betweenness. Experiments on a large number of open source software source codes show that measuring component importance with node betweenness is effective and that the sum of a component's request dependency and service dependency is most strongly correlated with its node betweenness, which points to a further direction of analysis: using dependency relationships to measure component importance.
According to experimental statistics, compared with traditional software evolution measurement, which descends into complex details too early and neglects the macro structure, measuring the importance of the components of a complex software system with this method saves on average 12% of the time otherwise spent observing and evaluating unnecessary components. In addition, the method measures component importance using node dependency together with node betweenness, which is more accurate than node betweenness alone, distinguishes nodes with similar structures more precisely, and eliminates about 9% of similar nodes on average.
The technology is found to be effective in software systems with a good software architecture: regardless of the scale and functional type of the software, the trend of the total dependency of components agrees with that of node betweenness. There is no obvious correlation between the request dependency and the service dependency of a component, that is, no regular correspondence between the two: a component with high request dependency may have either high or low service dependency. The fluctuation trends of total dependency and node betweenness are basically consistent, which means that components with high total dependency also have high node betweenness. The higher the node betweenness of a component, the more important its function and position in the software architecture.
By calculating the total dependency and node betweenness of components, the importance of each component within the whole software architecture can be measured clearly. The evolution of important nodes can then be better controlled as the software architecture evolves, which reduces evolution risk and helps to monitor and manage activities and components that are difficult to control during evolution. The request dependency and service dependency of components in a software architecture show no regular correlation, whereas the total dependency and node betweenness of components usually show an extremely strong or strong positive correlation, i.e. components with high total dependency also have high node betweenness.
Drawings
Fig. 1 is a flowchart of a component importance measurement method based on node betweenness in a software evolution environment according to an embodiment of the present invention.
Fig. 2 is a model relationship diagram of an SA provided in an embodiment of the present invention.
Fig. 3 is a percentage line graph of the request dependency, service dependency, total dependency and node betweenness of the components of Eclipse 3.0 provided by an embodiment of the present invention.
In the figure: (a) distribution curves of the request dependency and service dependency of the components; (b) percentage line graph of total dependency and node betweenness.
Fig. 4 is a percentage line graph of the request dependency, service dependency, total dependency and node betweenness of the components of Jabref provided by an embodiment of the present invention.
In the figure: A. distribution curves of the request dependency and service dependency of the components; B. percentage line graph of the total dependency of the components and node betweenness.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The application of the principles of the present invention will now be further described with reference to the accompanying drawings.
As shown in Fig. 1, the component importance measurement method based on node betweenness in a software evolution environment provided by the embodiment of the present invention comprises the following steps:
S101: taking the software architecture as a blueprint and support, propose an unweighted directed graph model of the software architecture and introduce node betweenness to measure the importance of components;
S102: analyze the request dependency, service dependency and total dependency of the components, and use the Pearson correlation coefficient to find the factor most relevant to node betweenness.
The invention is further described below with reference to a specific analysis.
The model of the software architecture is as follows.
Since there is no universally accepted definition of SA, the present invention adopts a relatively popular, simple definition: SA is a high-level abstraction of the components and connections that make up the system, with the interactions between components treated as connections.
A component implements specific functions required in the system, conforms to a set of interface standards and implements a group of interfaces. It appears as a data or computing unit that carries certain functions in the system, and also as a reusable software module oriented to the software system architecture; it is a replaceable part that actually exists in the system.
In the present invention, a component is regarded as an opaque whole, regardless of its internal structure. When a software system instance is viewed as an SA, the interactions and dependencies between the components in the SA are directed and unweighted, so the model of the SA can be defined as follows:
Definition 1, model of SA (Software architecture model): the model G of the SA of a software system instance is an unweighted directed graph triple <N_G, V(G), E(G)>:
(1) N_G is the name of the SA model of the software system instance;
(2) V(G) is the set of nodes representing the components that make up the software system;
(3) E(G) is the set of unweighted directed edges representing the relationships between the components that make up the software system.
Definition 2, component (Component): the component V represented by a node is a pair <N_c, F_c>:
(1) N_c is the name of the component;
(2) F_c is a functional description of the component.
Definition 3, inter-component association: the interaction between components is an unweighted directed edge E, which is a triple <E_n, V_i, V_j>:
(1) E_n is the unique identifier of the directed edge;
(2) V_i is the component that initiates the dependency, i.e. the start node;
(3) V_j is the component that accepts the dependency, i.e. the end node;
(4) <V_i, V_j> indicates that node V_i points to node V_j.
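Definitions 1 to 3 can be illustrated with a small data model. The following is a minimal sketch only: the class names `Component`, `Edge` and `SAModel`, the helper methods and the generated edge identifiers are all hypothetical and do not come from the invention.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    """Definition 2: a component <N_c, F_c>."""
    name: str            # N_c, the component name
    function: str = ""   # F_c, a functional description


@dataclass(frozen=True)
class Edge:
    """Definition 3: an unweighted directed edge <E_n, V_i, V_j>."""
    edge_id: str   # E_n, unique identifier of the edge
    source: str    # V_i, name of the component initiating the dependency
    target: str    # V_j, name of the component accepting the dependency


@dataclass
class SAModel:
    """Definition 1: the SA model <N_G, V(G), E(G)>."""
    name: str                                        # N_G
    components: dict = field(default_factory=dict)   # V(G), keyed by name
    edges: list = field(default_factory=list)        # E(G)

    def add_component(self, name, function=""):
        self.components[name] = Component(name, function)

    def add_dependency(self, source, target):
        # Both endpoints must already be components of the model.
        if source not in self.components or target not in self.components:
            raise KeyError("unknown component")
        # Hypothetical identifier scheme: e1, e2, ...
        edge = Edge(f"e{len(self.edges) + 1}", source, target)
        self.edges.append(edge)
        return edge


# Hypothetical three-component system: A depends on B, B depends on C.
model = SAModel("demo")
for component_name in ("A", "B", "C"):
    model.add_component(component_name)
model.add_dependency("A", "B")
model.add_dependency("B", "C")
```

Keeping components opaque (a name plus a description) mirrors the statement that a component is treated as a whole regardless of its internal structure.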
Definition 4, request dependency of a component: in the SA model G = <N_G, V(G), E(G)>, for a component v_i ∈ V(G), the total number of edges having v_i as start node is called the request dependency of v_i, denoted d_req(v_i).
The request dependency of a component describes the extent to which the component depends on other modules. The higher the request dependency, the more components the component directly depends on, the more complex its behavior, and the higher its level in the hierarchy.
Definition 5, service dependency of a component: in the SA model G = <V(G), E(G)>, for a component v_i ∈ V(G), the total number of edges having v_i as end node is called the service dependency of v_i, denoted d_ser(v_i).
The service dependency of a component characterizes the extent to which the component is directly depended upon by other modules in the SA. The higher the service dependency, the more strongly other components depend on it directly and the higher its reuse rate in the SA, which means that its behavior and function are more fixed.
The sum of the request dependency and the service dependency of a component is called the total dependency of the component, denoted d_sum(v_i).
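Written in set notation, Definitions 4 and 5 and the total dependency amount to counting outgoing and incoming edges. The notation below is an illustrative restatement, and the three-component example graph is hypothetical.

```latex
\begin{align*}
d_{req}(v_i) &= \left|\{\, \langle E_n, V_a, V_b \rangle \in E(G) : V_a = v_i \,\}\right| \\
d_{ser}(v_i) &= \left|\{\, \langle E_n, V_a, V_b \rangle \in E(G) : V_b = v_i \,\}\right| \\
d_{sum}(v_i) &= d_{req}(v_i) + d_{ser}(v_i)
\end{align*}
% Hypothetical example with edges A -> B, A -> C, B -> C:
%   d_req(A) = 2, d_ser(A) = 0, d_sum(A) = 2
%   d_req(B) = 1, d_ser(B) = 1, d_sum(B) = 2
%   d_req(C) = 0, d_ser(C) = 2, d_sum(C) = 2
```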
Definition 6, node betweenness (Betweenness): given a graph G = <V(G), E(G)> and a node v_i ∈ V(G), the ratio of the number of shortest paths in G that pass through v_i to the number of all shortest paths in G is called the node betweenness of v_i, denoted C(v_i) and computed according to formula (1),
where δ_st is the total number of shortest paths from node s to node t, and δ_st(v) is the number of those shortest paths that pass through node v.
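Formula (1) is not reproduced in this text. As a hedged reconstruction only, two readings are consistent with the surrounding definitions. The symbols δ_st and δ_st(v) suggest the usual pairwise form of shortest-path betweenness, while the wording of Definition 6 (paths through v_i divided by all shortest paths in G) suggests a single global ratio. Which of the two the original formula uses is an assumption either way.

```latex
% Pairwise form (assumed): for every pair (s, t), the fraction of shortest
% s-t paths that pass through v, summed over all pairs.
C(v) = \sum_{s \neq v \neq t \in V(G)} \frac{\delta_{st}(v)}{\delta_{st}}

% Global-ratio form (assumed, follows the wording of Definition 6):
% all shortest paths through v_i divided by all shortest paths in G.
C(v_i) = \frac{\sum_{s \neq t} \delta_{st}(v_i)}{\sum_{s \neq t} \delta_{st}}
```

The two forms generally give different values, although both increase as more shortest paths pass through the node. The global-ratio form corresponds to the node betweenness procedure described for the method, in which a count of paths containing the node is divided by the length of the list Pathes.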
Node betweenness is an important global geometric quantity that reflects the role and influence of a node in the whole graph. By abstracting the SA model into a directed graph and introducing node betweenness into SA evolution, the position and influence of the node corresponding to a component within the whole SA can be observed intuitively. Node betweenness is therefore an important index of how critical a component is, and of its position, during SA evolution, and it offers intuitive guidance and strong practical value for understanding and controlling the scope and strength of a component's influence before and after the SA evolves.
In the experiments of the invention, classes are modeled as components and class relationships are modeled as the unweighted directed edges of the SA model; the correspondence is shown in Fig. 2.
The method for measuring the importance of components in the evolution environment provided by the embodiment of the invention is as follows.
Evolution is a necessary activity of every software system. As the overall structure of a system tends to become complex and the number of components large, finding the important components in the structure provides a basis for monitoring and controlling software evolution and is an important aspect of understanding and evaluating evolution, which makes it particularly important for software evolution work.
The importance of components is measured in 5 steps:
(1) Obtain the components and the associations between them. Each class in the source code is taken as a component, and the source code is scanned to obtain the component name identifiers and the relationships between components.
(2) Process the relationship data between components and map it into an adjacency matrix. If the number of components is n, the relationships between components are mapped into an n × n adjacency matrix M, where M_11, M_22, ..., M_nn default to 0, except for components with a special self-call. For example, if component 1 depends on component 2 and component 2 does not depend on component 1, then M_12 = 1 and M_21 = 0.
(3) Calculate the request dependency, service dependency and total dependency of the component at each node on the basis of the adjacency matrix M.
(4) Calculate the node betweenness of each node. The shortest paths of the whole graph are computed; once the total number of shortest paths in the graph and the number of shortest paths passing through each node are known, the node betweenness of each component is calculated according to formula (1). The important components in the SA are identified according to the magnitude of the node betweenness.
(5) Calculate the Pearson correlation coefficients between node betweenness and, respectively, the request dependency, the service dependency and the total dependency of the components, each according to formula (2), and analyze which factor is most relevant to node betweenness.
The Pearson correlation coefficient used in the present invention is calculated as:
$$ r = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}} \qquad (2) $$
where X and Y are two vectors of equal length n whose correlation is to be computed, $X_i$ and $Y_i$ are their i-th elements, $\bar{X}$ and $\bar{Y}$ are the means of X and Y respectively, and exchanging X and Y does not affect the resulting Pearson correlation coefficient.
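The Pearson correlation coefficient of formula (2) can be computed directly from two equal-length lists. The sketch below uses only the standard library; the function name `pearson` and the sample values are hypothetical and are not experimental data from the invention.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means.
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: product of the root sums of squared deviations.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("undefined when either sequence is constant")
    return numerator / (dev_x * dev_y)


# Hypothetical values: total dependency and node betweenness of four components.
total_dependency = [1, 3, 2, 5]
betweenness = [0.1, 0.4, 0.2, 0.6]
r = pearson(total_dependency, betweenness)
```

The explicit check for a constant sequence reflects the requirement that the standard deviation of the observed values must not be 0 for the coefficient to be defined.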
The Pearson correlation coefficient is interpreted as follows: an absolute value in [0.8, 1.0] indicates extremely strong correlation; in [0.6, 0.8], strong correlation; in [0.4, 0.6], moderate correlation; in [0.2, 0.4], weak correlation; and in [0, 0.2], very weak or no correlation.
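These interpretation bands can be expressed as a small lookup. This is a sketch under one assumption: the stated intervals share their end points, so a value lying exactly on a boundary (for example 0.8) is assigned here to the stronger band. The function name `correlation_strength` is hypothetical.

```python
def correlation_strength(r):
    """Map a Pearson coefficient to the strength bands listed above."""
    magnitude = abs(r)
    # Allow a tiny tolerance for floating-point rounding above 1.0.
    if magnitude > 1.0 + 1e-9:
        raise ValueError("a Pearson coefficient must lie in [-1, 1]")
    # Assumption: boundary values fall into the stronger band.
    if magnitude >= 0.8:
        return "extremely strong"
    if magnitude >= 0.6:
        return "strong"
    if magnitude >= 0.4:
        return "moderate"
    if magnitude >= 0.2:
        return "weak"
    return "very weak or none"
```

Only the absolute value is banded, so the sign of the coefficient (positive or negative correlation) has to be reported separately.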
The Pearson correlation coefficient is used to calculate the correlation between node betweenness and each of the request dependency, service dependency and total dependency of the components. In the experiments, the observed values of node betweenness are paired with those of request dependency, service dependency and total dependency; the pairs are independent of one another and the standard deviations of the observed values are not 0, so the Pearson correlation coefficient is well defined.
Algorithm
Algorithm 1: obtaining the adjacency matrix of the SA model.
Input: the linked list Name of component name identifiers and the linked list Connection of interactions between components.
Output: the adjacency matrix Matrix of the SA model.
Initialization: the two-dimensional array Matrix stores the adjacency matrix of the SA model, and its row and column lengths are both equal to the length of the linked list Name.
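A minimal sketch of Algorithm 1 follows, based on the loop described for the method: each interaction sets Matrix[row][column] to 1, where row and column are the positions of the start and end components in Name. Representing Connection as a list of (start, end) name pairs and using a dictionary for the index lookup are assumptions made for illustration.

```python
def build_adjacency_matrix(names, connections):
    """Build the 0/1 adjacency matrix of the SA model.

    names: list of component name identifiers (the list Name).
    connections: list of (start, end) name pairs (the list Connection).
    """
    index = {name: position for position, name in enumerate(names)}
    size = len(names)
    # All entries, including the diagonal, start at 0.
    matrix = [[0] * size for _ in range(size)]
    for start, end in connections:
        row = index[start]       # position of the initiating component
        column = index[end]      # position of the accepting component
        matrix[row][column] = 1  # a listed self-call sets a diagonal entry
    return matrix


# Hypothetical example: A depends on B and C, B depends on C.
names = ["A", "B", "C"]
connections = [("A", "B"), ("A", "C"), ("B", "C")]
matrix = build_adjacency_matrix(names, connections)
```

Because the matrix is unweighted, repeated interactions between the same pair of components still yield a single entry of 1.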
For example, the partial matrix obtained in experiment one with eclipse3.0 is as follows:
Once the adjacency matrix of the SA model has been obtained, the request dependency and service dependency of each component can be calculated. The request dependency of the component corresponding to node v_i is the number of 1s in row i of the matrix, Matrix[i][ ], obtained by summing that row; similarly, the service dependency of the component corresponding to node v_j is the number of 1s in column j, Matrix[ ][j], obtained by summing that column. The request dependency and service dependency of each component are added to obtain the total dependency of the node.
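The three dependency values can be read off the adjacency matrix as sketched below, assuming that Matrix[i][j] = 1 means component i depends on component j, as in the example given for step (2). The function name `dependencies` and the sample matrix are hypothetical.

```python
def dependencies(matrix):
    """Return (d_req, d_ser, d_sum) lists computed from a 0/1 adjacency matrix."""
    size = len(matrix)
    # Request dependency: edges starting at node i, i.e. the sum of row i.
    d_req = [sum(matrix[i]) for i in range(size)]
    # Service dependency: edges ending at node j, i.e. the sum of column j.
    d_ser = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    # Total dependency: request dependency plus service dependency.
    d_sum = [req + ser for req, ser in zip(d_req, d_ser)]
    return d_req, d_ser, d_sum


# Hypothetical example: A depends on B and C, B depends on C.
example_matrix = [[0, 1, 1],
                  [0, 0, 1],
                  [0, 0, 0]]
d_req, d_ser, d_sum = dependencies(example_matrix)
# d_req is [2, 1, 0], d_ser is [0, 1, 2], d_sum is [2, 2, 2]
```

In this small example every component has the same total dependency even though the request and service dependencies differ, which illustrates why the two values are also reported separately.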
Algorithm 2: full-graph shortest path algorithm for the directed graph model of the SA.
Input: the adjacency matrix Matrix of the SA model.
Output: the full-graph shortest paths Pathes of the SA model.
Initialization: the linked list Pathes stores all shortest paths in the graph.
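The text does not spell out how the shortest paths are searched. One possible sketch for an unweighted directed graph is a breadth-first search from every node that records all predecessors lying on a shortest path and then expands them into explicit paths. Breadth-first search is a substituted technique chosen for illustration, not necessarily the procedure used in the invention, and the function names are hypothetical.

```python
from collections import deque


def all_shortest_paths(matrix):
    """Enumerate every shortest path between every ordered pair of nodes."""
    size = len(matrix)
    paths = []
    for source in range(size):
        distance = {source: 0}
        predecessors = {source: []}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(size):
                if u == v or not matrix[u][v]:
                    continue
                if v not in distance:
                    # First time v is reached: record its distance.
                    distance[v] = distance[u] + 1
                    predecessors[v] = [u]
                    queue.append(v)
                elif distance[v] == distance[u] + 1:
                    # Another equally short way to reach v.
                    predecessors[v].append(u)
        for target in distance:
            if target != source:
                paths.extend(_expand(source, target, predecessors))
    return paths


def _expand(source, target, predecessors):
    """Rebuild all shortest paths from source to target as lists of nodes."""
    if target == source:
        return [[source]]
    return [path + [target]
            for u in predecessors[target]
            for path in _expand(source, u, predecessors)]


# Hypothetical diamond-shaped graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
diamond = [[0, 1, 1, 0],
           [0, 0, 0, 1],
           [0, 0, 0, 1],
           [0, 0, 0, 0]]
paths = all_shortest_paths(diamond)
```

For the diamond graph there are two equally short routes from node 0 to node 3, and both are stored, so pairs with several shortest paths contribute several entries to the result.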
Algorithm 3: node betweenness calculation for the SA model.
Input: the full-graph shortest paths Pathes of the SA model and the linked list Name of component name identifiers.
Output: the node betweenness of each component of the SA model.
Initialization: the linked list Betweenness stores the node betweenness of the SA model, and its length is equal to the length of the linked list Name.
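A minimal sketch of Algorithm 3, following the counting rule described for the method: for each component, count the shortest paths that contain it and divide by the total number of shortest paths. Whether a path's end points count as containing the node is not stated, so the `count_endpoints` parameter is a hypothetical switch added for illustration.

```python
def node_betweenness(paths, names, count_endpoints=True):
    """Node betweenness of each component as a fraction of all shortest paths.

    paths: list of shortest paths, each a list of component names (Pathes).
    names: list of component name identifiers (Name).
    count_endpoints: assumption switch; if False, only interior nodes of a
        path are treated as lying on that path.
    """
    if not paths:
        return [0.0] * len(names)
    betweenness = []
    for name in names:
        k = 0  # number of shortest paths containing this component
        for path in paths:
            nodes = path if count_endpoints else path[1:-1]
            if name in nodes:
                k += 1
        betweenness.append(k / len(paths))
    return betweenness


# Hypothetical diamond-shaped system: A -> B, A -> C, B -> D, C -> D.
component_names = ["A", "B", "C", "D"]
shortest_paths = [["A", "B"], ["A", "C"], ["A", "B", "D"],
                  ["A", "C", "D"], ["B", "D"], ["C", "D"]]
values = node_betweenness(shortest_paths, component_names)
```

With end points counted, start and end components of many paths score highly; with `count_endpoints=False` only components that mediate between others receive a non-zero value, which is closer to the classical notion of betweenness.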
The invention is further described below with reference to a specific analysis.
1. Analysis of experiments
Nearly one hundred open source software projects were selected for the invention, covering a variety of functions such as software development platforms, programming language source packages and open source professional software. They can be divided into three classes by number of nodes: small-scale software with fewer than 50 nodes, medium-scale software with 50 to 200 nodes, and large-scale software with more than 200 nodes.
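The three size classes can be written as a simple threshold function. This is a sketch in which the function name `software_scale` is hypothetical and the boundary values 50 and 200 are assumed to belong to the medium class, following the wording "50 to 200".

```python
def software_scale(node_count):
    """Classify a software system by the number of nodes in its SA model."""
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    if node_count < 50:
        return "small"
    if node_count <= 200:   # assumption: 50 and 200 both count as medium
        return "medium"
    return "large"
```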
The results show that the method is practical and effective: the trend of the total dependency of components agrees with that of node betweenness regardless of software scale and functional category. How node betweenness manifests is related to the structural design of the software. In software with a well-designed architecture, the experimental results are the most ideal: not only do the total dependency of components and node betweenness change in step, but the differences in node betweenness between components are also more pronounced. Conversely, in a software system without good architectural support, the node betweenness values of the components are nearly identical, which makes it difficult to distinguish between components.
Owing to space limitations, the source code of Eclipse 3.0, a large-scale system, and of Jabref, a medium-scale system, are selected as two typical experimental examples for analysis.
1.1 experiment one
The source code of eclipse3.0 was used as experimental data.
FIG. 3 shows, for eclipse3.0, the distribution of the request dependency and service dependency of the components and a percentage line graph of the total dependency and node betweenness of the components. Because node betweenness takes values in [-1, 1] and therefore varies over a small range, while the values of total dependency are much larger, a percentage line graph is used so that the relationship and trend between the two can be seen more clearly. In the figure, the Y-axis represents the magnitude of the request dependency and service dependency of the components, and the X-axis represents the nodes: (a) distribution curves of the request dependency and service dependency of the components; (b) percentage line graph of total dependency and node betweenness.
The trend of the line graph shows that there is no obvious correlation between the request dependency and the service dependency of a component; that is, the two do not correspond in any regular way, and a component with high request dependency may have either high or low service dependency. The fluctuations of total dependency and node betweenness are basically consistent, which means that components with high total dependency also have high node betweenness. The higher the node betweenness of a component, the more important its function and position in the software architecture.
The correlation between the dependencies (degrees) and node betweenness is calculated using formula (2), and the Pearson correlation coefficient is used to judge whether the correlation suggested by the line graph actually exists.
TABLE 1 eclipse3.0 correlation analysis
P(X, Y)            Pearson correlation coefficient value
P(d_ser, d_req)    -0.00283943
P(d_ser, C)         0.49037850
P(d_req, C)         0.51885469
P(d_sum, C)         0.69526782
The calculated Pearson correlation coefficients confirm that the trend shown by the line graph is correct: the total dependency of the components is strongly positively correlated with node betweenness, so that the larger the total dependency of a component, the larger its node betweenness; the service dependency and the request dependency are each moderately positively correlated with node betweenness; and the request dependency and the service dependency are extremely weakly negatively correlated or uncorrelated.
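The coefficient used in this analysis is the standard Pearson product-moment correlation. A minimal sketch follows, assuming two equal-length lists of numbers; the function name `pearson` is hypothetical, and the values in the usage lines are made up for illustration and are not experimental data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: product of the root sums of squared deviations.
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


# Illustrative values only: a total-dependency vector and a betweenness vector.
d_sum = [1, 4, 2, 8]
betweenness = [0.1, 0.5, 0.2, 0.9]
r = pearson(d_sum, betweenness)
```

Swapping the two arguments gives the same value, which matches the remark that the order of X and Y does not affect the result.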
1.2 Experiment two
The source code of Jabref was used as experimental data.
FIG. 4 shows, for Jabref, the distribution of the request dependency and service dependency of the components and a percentage line graph of the total dependency and node betweenness of the components. The Y-axis represents the magnitude of the request dependency and service dependency of the components, and the X-axis represents the nodes. In the figure: A. distribution curves of the request dependency and service dependency of the components; B. percentage line graph of the total dependency and node betweenness of the components.
In the line graph of Jabref, the request dependency and the service dependency of the components show no regular correlation, while the fluctuations of total dependency and node betweenness almost overlap, which indicates that in Jabref components with high total dependency also have high node betweenness.
TABLE 2 Jabref correlation analysis
P(X, Y)            Pearson correlation coefficient value
P(d_ser, d_req)     0.11752896
P(d_ser, C)         0.62832354
P(d_req, C)         0.60910461
P(d_sum, C)         0.80746250
The correlation analysis of Jabref again confirms that the trend shown by the line graph is correct: the request dependency and the service dependency of the components are extremely weakly positively correlated or uncorrelated; the request dependency and the service dependency are each positively correlated with node betweenness; and total dependency is extremely strongly positively correlated with node betweenness, which increases as total dependency increases.
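The verbal labels used in the two analyses (extremely weak, moderate, strong, extremely strong) can be reproduced from the tabulated coefficients with a simple banding rule. The thresholds below (0.2, 0.4, 0.6, 0.8 on the absolute value) are a commonly used convention and an assumption here; the document does not state which cut-offs were applied. The function name `correlation_strength` is hypothetical.

```python
def correlation_strength(r):
    """Map a Pearson coefficient to a verbal strength label.

    The band edges are an assumed convention, not taken from the patent.
    """
    a = abs(r)
    if a < 0.2:
        return "extremely weak or none"
    if a < 0.4:
        return "weak"
    if a < 0.6:
        return "moderate"
    if a < 0.8:
        return "strong"
    return "extremely strong"


# Coefficients of P(d_sum, C) from Table 1 (eclipse3.0) and Table 2 (Jabref).
eclipse_label = correlation_strength(0.69526782)
jabref_label = correlation_strength(0.80746250)
```

Under these assumed bands the labels agree with the wording of both experiments, including the moderate values for eclipse3.0 and the near-zero correlation between service and request dependency.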
The Pearson correlation coefficients between the service dependency and the request dependency of the components of eclipse3.0 and Jabref show that the two quantities are distributed irregularly and are not correlated with each other.
According to the Pearson correlation analysis, in a software architecture the total dependency and the node betweenness of a component are strongly or extremely strongly positively correlated, whereas the Pearson coefficient between service dependency and node betweenness and the coefficient between request dependency and node betweenness vary unstably. It follows from the definition of node betweenness that the node betweenness of a component is directly related to its service dependency, request dependency and total dependency: the larger the total dependency of a component, the higher its position and importance in the whole software architecture, and the higher the corresponding node betweenness; conversely, the lower the total dependency of a component (possibly zero), the lower its position and importance in the whole software architecture, and the lower its node betweenness.
The total dependency of a component is the factor most closely related to changes in its node betweenness, and it is the key factor when the position and importance of a component in the whole software architecture are judged through node betweenness.
The experiments show that measuring the importance of components in a software evolution environment by node betweenness is effective and makes up for the weakness of traditional measurement methods in capturing the macroscopic characteristics of the software architecture.
In the whole software architecture, nodes whose components have high request dependency usually have poor independence, depend strongly on bottom-layer components or other basic components, are highly coupled, and have complex functions; nodes whose components have high service dependency usually have high cohesion, a stable structure, and a single function.
By calculating the total dependency and node betweenness of the components, the importance of each component in the whole software architecture can be measured clearly. When the software architecture evolves, the evolution of the important nodes can then be controlled better, the evolution risk is reduced, and activities and components that are difficult to control during evolution can be monitored and managed more easily. The request dependency and the service dependency of the components in a software architecture have no regular correlation, while the total dependency and node betweenness of the components usually show a strong or extremely strong positive correlation; that is, components with high total dependency also have high node betweenness.
The strong positive correlation between total dependency and node betweenness points to another direction for analyzing component importance. In very large software architectures in particular, where the large number of nodes in the source code calls for dimension reduction during calculation, computing the total dependency of the components more quickly is the focus of subsequent analysis.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (7)

1. A method for measuring component importance based on node betweenness in a software evolution environment, characterized in that the method takes the software architecture as blueprint and support, provides an unweighted directed graph model of the software architecture, and introduces node betweenness to measure the importance of components; and the request dependency, service dependency and total dependency of the components are analyzed using Pearson correlation coefficients to find the factor most correlated with node betweenness.
2. The method for measuring component importance based on node betweenness in a software evolution environment according to claim 1, wherein providing the directed graph model of the software architecture, with the software architecture as blueprint and support, comprises:
1) The model G of the SA of the software system is an unweighted directed graph triple <N_G, V(G), E(G)>:
N_G is the name of the software system SA model;
V(G) is the set of nodes representing the components that make up the software system;
E(G) is the set of unweighted directed edges representing the relationships between the components that make up the software system;
2) A component V represented by a node is a pair <N_C, F_C>:
N_C is the name of the component;
F_C is the functional description of the component;
3) An interaction between components is an unweighted directed edge E, which is a triple <E_n, V_i, V_j>:
E_n is the unique identifier of the directed edge;
V_i is the component that initiates the dependency and is the start node;
V_j is the component that accepts the dependency and is the end node;
<V_i, V_j> denotes that node V_i points to node V_j;
4) In the SA model G = <N_G, V(G), E(G)>, for a component v_i ∈ V(G), the total number of edges with v_i as start node is the request dependency of component v_i, denoted d_req(v_i);
5) In the SA model G = <V(G), E(G)>, for a component v_i ∈ V(G), the total number of edges with v_i as end node is the service dependency of component v_i, denoted d_ser(v_i);
the sum of the request dependency and the service dependency of a component is its total dependency, denoted d_sum(v_i);
6) Given the graph G = <V(G), E(G)> and a node v_i ∈ V(G), the ratio of the number of shortest paths in G that pass through v_i to the number of all shortest paths in G is the node betweenness of v_i, denoted C(v_i); then:
$$C(v) = \sum_{s \neq v \neq t \in V(G)} \frac{\delta_{st}(v)}{\delta_{st}} \qquad (1)$$
where δ_st is the total number of shortest paths from node s to node t, and δ_st(v) is the number of those shortest paths that pass through node v.
3. The method for measuring component importance based on node betweenness in a software evolution environment according to claim 1, wherein introducing node betweenness to measure the importance of components comprises:
acquiring the associations between components: each class in the source code is treated as one component, and the source code is scanned to obtain the component name identifiers and the relationships between components;
processing the relationship data between components and mapping it into an adjacency matrix: if the number of components is n, the relationships between components are mapped into an n-dimensional adjacency matrix M, with the diagonal entries M_11, M_22, ..., M_nn defaulting to 0;
calculating, on the basis of the adjacency matrix M, the request dependency, service dependency and total dependency of the component of each node;
calculating the node betweenness of each node: the shortest paths of the whole graph are calculated to obtain the total number of shortest paths in the whole graph and the number of shortest paths passing through each node, and the node betweenness of each component is then calculated according to formula (1); the important components in the SA are measured according to the magnitude of node betweenness;
calculating, each according to formula (2), the Pearson correlation coefficients between node betweenness and, respectively, the request dependency, service dependency and total dependency of the components, and analyzing which factor is most correlated with node betweenness;
the Pearson correlation coefficient is calculated as:
$$P(X,Y) = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^{2}}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^{2}}} \qquad (2)$$
where X and Y are two vectors of equal length whose correlation is to be computed, $\bar{X}$ and $\bar{Y}$ are the means of vectors X and Y respectively, and the order of X and Y does not affect the value of the Pearson correlation coefficient.
4. The method for measuring component importance based on node betweenness in a software evolution environment according to claim 3, wherein processing the relationship data between components and mapping it into an adjacency matrix comprises:
input: the component name identifier linked list Name and the inter-component interaction linked list Connection;
output: the adjacency matrix Matrix of the SA model;
initialization: the two-dimensional array Matrix stores the adjacency matrix of the SA model, and its row and column lengths equal the length of the linked list Name;
initializing all of M_11, M_22, ..., M_nn to 0, except for components with special self-calls;
starting the loop with the integer pointer i = 0:
assigning to the variable row the index, in the linked list Name, of the start component of the i-th element Connection[i] of the linked list Connection;
assigning to the variable column the index, in the linked list Name, of the end component of Connection[i];
setting Matrix[row][column] to 1;
incrementing i by 1; if the new value of i is smaller than the length of the linked list Connection, the loop continues; otherwise the loop terminates and the adjacency matrix is obtained;
after the adjacency matrix of the SA model is obtained, the request dependency and the service dependency of each component are calculated.
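The matrix construction and the dependency calculation that follows it can be sketched as below. This is an illustrative sketch only: it assumes Name is a list of unique component names and Connection is a list of (start, end) name pairs, and it reads the request dependency as a row sum (outgoing edges) and the service dependency as a column sum (incoming edges), in line with the definitions of d_req and d_ser. The function names are hypothetical.

```python
def build_adjacency(names, connections):
    """Map component relationships onto an n x n 0/1 adjacency matrix."""
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    matrix = [[0] * n for _ in range(n)]  # diagonal starts at 0
    for start, end in connections:
        row = index[start]      # component that initiates the dependency
        column = index[end]     # component that accepts the dependency
        matrix[row][column] = 1
    return matrix


def dependencies(matrix):
    """Return (d_req, d_ser, d_sum) lists, one value per component."""
    n = len(matrix)
    d_req = [sum(row) for row in matrix]
    d_ser = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    d_sum = [req + ser for req, ser in zip(d_req, d_ser)]
    return d_req, d_ser, d_sum


# Hypothetical three-component system: A uses B and C, B uses C.
names = ["A", "B", "C"]
connections = [("A", "B"), ("A", "C"), ("B", "C")]
matrix = build_adjacency(names, connections)
d_req, d_ser, d_sum = dependencies(matrix)
```

A dictionary lookup replaces the linear search through Name for each connection; this is a design choice of the sketch and does not change the resulting matrix.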
5. The method for measuring component importance based on node betweenness in a software evolution environment according to claim 3, wherein calculating the shortest paths of the whole graph comprises:
input: the adjacency matrix Matrix of the SA model;
output: the full-graph shortest paths Pathes of the SA model;
initialization: integers i and j are used as indices of each element Matrix[i][j] of the adjacency matrix, where i and j each represent a different node; every pair of nodes is traversed, the shortest paths between all nodes are found from the content of the adjacency matrix, and each shortest path found is stored in the full-graph shortest paths Pathes.
6. The method for measuring component importance based on node betweenness in a software evolution environment according to claim 3, wherein calculating the node betweenness of each node comprises:
input: the full-graph shortest paths Pathes of the SA model and the component name identifier linked list Name;
output: the node betweenness of each component of the SA model;
initialization: the linked list Betweenness stores the node betweenness of the SA model, and its length equals the length of the component name identifier linked list Name;
initializing an integer k = 0 and an integer i = 0, and traversing the component name identifier linked list Name from its first element; each time a path containing the node Name[i] is found in the full-graph shortest paths Pathes, k is incremented by 1; after the traversal, the node betweenness Betweenness[i] of the node Name[i] is k divided by the length of the linked list Pathes.
7. A system for measuring component importance based on node betweenness in a software evolution environment, using the method for measuring component importance based on node betweenness in a software evolution environment according to claim 1.
CN201710977888.8A 2017-10-17 2017-10-17 Component Importance measure based on node betweenness under a kind of Software Evolution environment Pending CN107832080A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710977888.8A CN107832080A (en) 2017-10-17 2017-10-17 Component Importance measure based on node betweenness under a kind of Software Evolution environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710977888.8A CN107832080A (en) 2017-10-17 2017-10-17 Component Importance measure based on node betweenness under a kind of Software Evolution environment

Publications (1)

Publication Number Publication Date
CN107832080A true CN107832080A (en) 2018-03-23

Family

ID=61648447

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710977888.8A Pending CN107832080A (en) 2017-10-17 2017-10-17 Component Importance measure based on node betweenness under a kind of Software Evolution environment

Country Status (1)

Country Link
CN (1) CN107832080A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086050A (en) * 2018-07-04 2018-12-25 烽火通信科技股份有限公司 A kind of analysis method and system of module dependencies
CN109660406A (en) * 2019-01-18 2019-04-19 天津七二通信广播股份有限公司 A method of based on blueprint and chained list implementation trade-off radio frequency system function remodeling
CN112015890A (en) * 2020-09-07 2020-12-01 广东工业大学 Movie scenario abstract generation method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030115025A1 (en) * 2001-12-19 2003-06-19 Lee Moon Soo Method and apparatus for wrapping existing procedure oriented program into component based system
CN101114222A (en) * 2007-07-26 2008-01-30 南京大学 Reflexion type architecture self-evolvement method based on noumenon
CN102880642A (en) * 2012-08-20 2013-01-16 浙江工业大学 Bus transfer method based on weighted directed network model
CN105893257A (en) * 2016-03-30 2016-08-24 东南大学 Software architecture evaluation method based on evolution

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
成蕾,林英等: ""软件演化环境下基于节点介数的构件重要性度量方法"", 《计算机应用与软件》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086050A (en) * 2018-07-04 2018-12-25 烽火通信科技股份有限公司 A kind of analysis method and system of module dependencies
CN109086050B (en) * 2018-07-04 2021-11-30 烽火通信科技股份有限公司 Method and system for analyzing module dependency relationship
CN109660406A (en) * 2019-01-18 2019-04-19 天津七二通信广播股份有限公司 A method of based on blueprint and chained list implementation trade-off radio frequency system function remodeling
CN112015890A (en) * 2020-09-07 2020-12-01 广东工业大学 Movie scenario abstract generation method and device
CN112015890B (en) * 2020-09-07 2024-01-23 广东工业大学 Method and device for generating movie script abstract

Similar Documents

Publication Publication Date Title
Cont et al. Recovering volatility from option prices by evolutionary optimization
WO2021208079A1 (en) Method and apparatus for obtaining power battery life data, computer device, and medium
CN111693931A (en) Intelligent electric energy meter error remote calculation method and device and computer equipment
CN107832080A (en) Component Importance measure based on node betweenness under a kind of Software Evolution environment
CN110110529B (en) Software network key node mining method based on complex network
CN112907026A (en) Comprehensive evaluation method based on editable mesh index system
Ajwani et al. Average-case analysis of incremental topological ordering
CN111814342A (en) Complex equipment reliability hybrid model and construction method thereof
Guolo et al. A simulation‐based comparison of techniques to correct for measurement error in matched case–control studies
CN114331349B (en) Scientific research project management method and system based on Internet of things technology
Vats et al. Estimation of efforts during testing of OOP using the AVISAR framework
CN109344050A (en) A kind of interface parameters analysis method and device based on structure tree
CN113987261A (en) Video recommendation method and system based on dynamic trust perception
Guan et al. Impact of uncertainty and correlations on mapping of embedded systems
Vidotto et al. Averaging models: parameters estimation with the R-Average procedure
Sangeetha et al. Software Sizing with Use case point
Kaur et al. Comparative analysis of the software effort estimation models
CN105281977B (en) A kind of intelligent behaviour method of testing and system based on binary tree algorithm
Pattnaik et al. Prediction of software quality using neuro-fuzzy model
Lee et al. Optimal weighting systems for direct age‐adjustment of vital rates
ShahMohammadi Evaluation of the Software Architecture Styles From Maintainability Viewpoint
CN114897340B (en) GitLab-based small-scale team software developer work metric method
CN114398291B (en) Software regression testing influence domain analysis method and system
CN111242593B (en) Method for detecting consistency of overlapping corresponding behaviors of trading system based on partner matrix
Csárdi Dynamics of citation networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180323