CN113743467B - Case diagram similarity judging method based on maximum public subgraph calculation - Google Patents
- Publication number
- CN113743467B
- Authority
- CN
- China
- Prior art keywords
- graph
- use case
- uml
- similarity
- maxcsg
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention belongs to the technical field of software development and discloses a use case diagram similarity judging method based on maximum common subgraph calculation, comprising the following steps. Step 1: preprocess the UML use case diagrams to be compared and represent each of them as a directed graph. Step 2: calculate the maximum common subgraph of the directed graphs to be compared. Step 3: calculate the similarity using a similarity determination algorithm. The maximum common subgraph algorithm used by the method follows a simple procedure and analyzes the graph structure directly, so it is efficient and convenient to use and is widely applicable.
Description
Technical Field
The invention belongs to the technical field of software development, and particularly relates to a use case diagram similarity judging method based on maximum common subgraph calculation.
Background
During software development, software reuse strategies are frequently applied; that is, existing software artifacts, including code fragments, designs, test data and cost estimates, are used to build new software. Software reuse saves development cost and time and improves the software development process. As software grows more complex, reuse now extends to every stage of the software life cycle, including requirements analysis, design, testing and even maintenance, and is no longer limited to code.
Requirements analysis is a key foundation for software design, implementation, testing and maintenance, and it sets the working direction and development strategy for developers. The use case diagram is the simplest representation of the interaction between users and a system. Because it is intuitive, standardized and precise, it has become the most commonly used tool in the requirements analysis stage and plays a vital role in gathering software requirements. Reusing use case diagrams helps developers build new use case models quickly, determine requirements sooner and thereby work more efficiently. However, the semi-structured nature of use case diagrams makes them difficult to reuse.
In prior research on determining the similarity of UML use case diagrams, Reza Fauzan, Daniel Siahaan, Siti Rochimah et al. proposed a method that combines cosine similarity with word-level semantic similarity to determine the similarity between two UML use case diagrams. The method preprocesses and structures the two diagrams and then judges the semantic similarity of two words using Wu-Palmer similarity and WordNet. Wu-Palmer yields a similarity value in the range [0, 1] for two words in WordNet, where 0 indicates that the two words have no similarity and 1 indicates that they are completely similar. Because a text label in a UML use case diagram may consist of several words, the method also combines cosine similarity with Wu-Palmer to obtain an overall measure. The disadvantage of this approach is that it cannot make a correct similarity determination for two UML use case diagrams that come from different projects but are similar in structure.
As shown in FIG. 1 and FIG. 2, if the two UML use case diagrams are compared using the semantic similarity method proposed by Reza Fauzan et al., they will be judged dissimilar (that is, the score falls below the set threshold) because their text similarity is low. In reality, however, the two diagrams differ only in text content and are highly similar in structure; in actual software reuse, the UML use case diagram shown in FIG. 1 can be reused to produce the one shown in FIG. 2. This gap between the expected and the actual result shows that, when reusing UML use case diagrams, structural similarity must be considered in addition to text semantic similarity. The present method converts the UML use case diagrams into a structured form and calculates their maximum common subgraph as the basis for the similarity judgment, which effectively addresses the weak applicability of semantic similarity to structural comparison.
Furthermore, Mohammad Nazir Arifin, Daniel Siahaan et al. proposed a method that determines the similarity between two UML use case diagrams using the minimum edit distance together with word-level semantic similarity. That method likewise preprocesses and structures the two diagrams, measures their structural similarity by the minimum edit distance between them, and then combines this with the text semantic similarity under a chosen weighting to obtain the final similarity. However, the minimum edit distance algorithm it uses is procedurally complex and has high time complexity, so it executes inefficiently. The maximum common subgraph algorithm used by the present method follows a simple procedure and analyzes the graph structure directly, so it is efficient and convenient to use and is widely applicable.
Disclosure of Invention
The invention aims to provide a use case diagram similarity judging method based on maximum common subgraph calculation, which supports software reuse and ultimately improves software development efficiency.
To solve the above technical problem, the specific technical scheme of the use case diagram similarity judging method based on maximum common subgraph calculation is as follows:
a use case diagram similarity judging method based on maximum common subgraph calculation comprises the following steps:
step 1: preprocessing the UML use case diagrams to be compared, and representing each UML use case diagram as a directed graph;
step 2: calculating the maximum common subgraph of the directed graphs to be compared;
step 3: calculating the similarity using a similarity determination algorithm.
Further, the UML use case diagram is composed of participants, use cases and the relationships among these elements, wherein the participants of the UML use case diagram are the users, organizations or external systems that interact with the application or system; the use cases are the functions contained in the system; and the relationships among the elements comprise association, generalization, inclusion and extension relationships.
Further, the step 1 comprises the following specific steps:
step 1.1: the preprocessing of the UML use case diagram extracts the elements and data in the diagram by converting it into a structured markup language; the UML use case diagrams are built using an open-source UML modeling tool, so each use case diagram (model) is exported as an XML Metadata Interchange (XMI) file;
step 1.2: parsing the XMI file and representing its elements as a directed graph; let g(V, E) be a graph comprising a vertex set V and a directed edge set E, where each vertex in V represents a participant or a use case, and each directed edge in E represents a relationship between two participants, between a participant and a use case, or between two use cases.
Further, the theoretical basis of the UML use case diagram similarity judgment based on the maximum common subgraph is as follows:
the closer two graphs are in structure, the more parts they have in common, that is, a common subgraph exists between them, so the maximum common subgraph of the two graphs can be used to compare their degree of structural similarity; the related concepts are defined as follows:
definition one (subgraph): given two graphs c1(Vc, Ec) and g(V, E), graph c1 is called a subgraph of graph g, written $c1 \subseteq g$, if:
(1) $V_c \subseteq V$;
(2) $E_c \subseteq E$;
definition two (maximum common subgraph): given two graphs g1 and g2, if there is a graph m that satisfies the following conditions:
(1) $m \subseteq g1$;
(2) $m \subseteq g2$;
and no graph m′ satisfies the following conditions:
(1) $m' \subseteq g1$;
(2) $m' \subseteq g2$;
(3) $|m'| > |m|$;
then graph m is the maximum common subgraph of graphs g1 and g2, denoted maxcsg(g1, g2);
step 2.1: denoting the maximum common subgraph of g1 and g2 as maxcsg, traversing and comparing the nodes of g1 and g2, and taking the nodes of the same type that g1 and g2 have in common as the nodes of maxcsg;
step 2.2: traversing the nodes of the graph maxcsg obtained in step 2.1 again, and, if two nodes are adjacent in both g1 and g2 and the edges connecting them are of the same type, generating an edge of that type and adding it to the graph maxcsg;
step 2.3: obtaining the maximum common subgraph maxcsg of g1 and g2.
Further, the step 3 comprises the following specific steps:
after the maximum common subgraph maxcsg is obtained, the similarity is calculated from the proportion that the nodes and edges of maxcsg make up of the compared graphs, and the similarity calculation formula is as follows:
$$Sim_{usecase}(g1, g2) = \alpha \sum_{v \in VT} \gamma_v \frac{\mathrm{VertexNum}(maxcsg, v)}{\max(\mathrm{VertexNum}(g1, v), \mathrm{VertexNum}(g2, v))} + \beta \sum_{x \in ET} \theta_x \frac{\mathrm{EdgeNum}(maxcsg, x)}{\max(\mathrm{EdgeNum}(g1, x), \mathrm{EdgeNum}(g2, x))}$$
wherein VT denotes the element types present in the UML use case diagram, namely participants (normal participants and generalized participants) and use cases (normal, generalized, extended and included use cases); $\gamma_v$ is the weight set for nodes of element type v, which is defined manually, with $\sum_{v \in VT} \gamma_v = 1$; VertexNum(maxcsg, v) is the number of nodes of type v in the maximum common subgraph maxcsg; max(VertexNum(g1, v), VertexNum(g2, v)) is the larger of the numbers of nodes of type v in g1 and g2; ET denotes the relationship types present in the UML use case diagram, namely association, generalization, inclusion and extension; $\theta_x$ is the weight set for edges of type x, which is defined manually, with $\sum_{x \in ET} \theta_x = 1$; EdgeNum(maxcsg, x) is the number of edges of type x in the maximum common subgraph maxcsg; max(EdgeNum(g1, x), EdgeNum(g2, x)) is the larger of the numbers of edges of type x in g1 and g2; and α and β are set manually, with α, β ∈ (0, 1).
The use case diagram similarity judging method based on maximum common subgraph calculation has the following advantages: the maximum common subgraph algorithm it uses follows a simple procedure and analyzes the graph structure directly, so it is efficient and convenient to use and is widely applicable.
Drawings
FIG. 1 is a UML use case diagram of a banking system;
FIG. 2 is a UML use case diagram of a warehouse management system;
FIG. 3 is an overview of the method of the present invention;
FIG. 4 is a UML use case diagram of the bank counter system in an embodiment of the present invention;
FIG. 5 is a UML use case diagram of the warehouse management system in an embodiment of the present invention;
FIG. 6 is the bank counter system directed graph g1(V1, E1);
FIG. 7 is the warehouse management system directed graph g2(V2, E2);
FIG. 8 is a diagram of the maximum common subgraph generation process of the present invention.
Detailed Description
For a better understanding of the purpose, structure and function of the present invention, the use case diagram similarity judging method based on maximum common subgraph calculation is described in further detail below with reference to the accompanying drawings.
An overview of the method is shown in FIG. 3. To judge the similarity between two UML use case diagrams, we calculate their maximum common subgraph and base the judgment on it. To obtain a maximum common subgraph that the algorithm can work with, we first preprocess the UML use case diagrams to facilitate the computation. The algorithm proceeds in the following steps:
1) Preprocess the UML use case diagrams. In this step, we first convert each UML use case diagram into an XMI file and then convert the XMI file into a directed graph using custom rules.
2) Calculate the maximum common subgraph of the compared directed graphs. In this step, the maximum common subgraph of the directed graphs to be compared is calculated by a maximum common subgraph algorithm.
3) Calculate the similarity. In this step, we substitute the maximum common subgraph into the custom similarity calculation formula to obtain the similarity between the graphs.
The method comprises the following specific steps:
(1) Preprocessing
The UML use case diagram consists of participants, use cases and the relationships among these elements. The participants of a UML use case diagram are the users, organizations or external systems that interact with an application or system, and are typically drawn as a stick figure. Use cases are the functions contained in the system and are typically drawn as an ellipse. The relationships among the elements include association, generalization, inclusion and extension relationships. The role and notation of each relationship are shown in Table 1:
TABLE 1 Element relationship table
The preprocessing of a UML use case diagram mainly extracts the elements and data in the diagram by converting it into a structured markup language. The UML use case diagrams are built using an open-source UML modeling tool, so we export each use case diagram (model) as an XML Metadata Interchange (XMI) file.
The bank counter system and the warehouse management system are representative of financial service systems and warehousing service systems, respectively, so we use them as examples to describe the method. The UML use case diagrams of the bank counter system and the warehouse management system are shown in FIG. 4 and FIG. 5:
In the UML use case diagram of the bank counter system in FIG. 4, there are four participants (normal depositor, VIP depositor, depositor, teller), seven use cases (transfer, deposit, withdrawal, loss reporting, personal transfer, public transfer, freeze account) and thirteen relationships (association, inclusion, extension, generalization).
In the UML use case diagram of the warehouse management system in FIG. 5, there are five participants (temporary buyer, staff buyer, warehouse buyer, logistics driver, warehouse manager), six use cases (maintenance, warehouse entry, purchase, warehouse exit, self-maintenance, return-to-factory maintenance) and eleven relationships (association, inclusion, generalization).
We first use a tool to convert each UML use case diagram into a file in XMI format.
The next step is to parse the XMI file and represent its elements as a directed graph. The method defines g(V, E) as a graph comprising a vertex set V and a directed edge set E. Each vertex in V represents a participant or a use case. Each directed edge in E represents a relationship between two participants, between a participant and a use case, or between two use cases.
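The parsing step can be sketched as follows. This is a minimal illustration, not the tool chain used in the patent: it assumes an XMI 2.x style export in which actors, use cases and associations appear as `packagedElement` entries and inclusion appears as a nested `include` element. The namespace URIs, the inline sample document, the function name `parse_xmi` and the direction chosen for inclusion edges are all assumptions, and real modeling tools differ in their export layout.

```python
import xml.etree.ElementTree as ET

# Assumed XMI namespace URI; exports from different tools may use another one.
XMI_NS = "http://www.omg.org/spec/XMI/20131001"

# Hypothetical, heavily simplified export used only for this illustration.
SAMPLE = """<xmi:XMI xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
                  xmlns:uml="http://www.omg.org/spec/UML/20131001">
  <uml:Model name="demo">
    <packagedElement xmi:type="uml:Actor" xmi:id="a1" name="teller"/>
    <packagedElement xmi:type="uml:UseCase" xmi:id="u1" name="deposit"/>
    <packagedElement xmi:type="uml:UseCase" xmi:id="u2" name="transfer">
      <include xmi:id="i1" addition="u3"/>
    </packagedElement>
    <packagedElement xmi:type="uml:UseCase" xmi:id="u3" name="personal transfer"/>
    <packagedElement xmi:type="uml:Association" xmi:id="as1">
      <ownedEnd xmi:id="e1" type="a1"/>
      <ownedEnd xmi:id="e2" type="u1"/>
    </packagedElement>
  </uml:Model>
</xmi:XMI>"""


def parse_xmi(text):
    """Return (vertices, edges): vertices maps id -> type, edges is a list of (src, dst, type)."""
    root = ET.fromstring(text)
    type_key = f"{{{XMI_NS}}}type"  # ElementTree expands prefixes to {uri}name
    id_key = f"{{{XMI_NS}}}id"
    vertices, edges = {}, []
    for el in root.iter("packagedElement"):
        kind, xid = el.get(type_key), el.get(id_key)
        if kind == "uml:Actor":
            vertices[xid] = "participant"
        elif kind == "uml:UseCase":
            vertices[xid] = "use case"
            # Inclusion becomes a single arc; base -> included is an assumed direction.
            for inc in el.findall("include"):
                edges.append((xid, inc.get("addition"), "inclusion"))
        elif kind == "uml:Association":
            a, b = [end.get("type") for end in el.findall("ownedEnd")]
            # Association becomes two arcs in opposite directions.
            edges.append((a, b, "association"))
            edges.append((b, a, "association"))
    return vertices, edges


vertices, edges = parse_xmi(SAMPLE)
print(vertices)
print(edges)
```

Storing an association as two opposite arcs and the other relationships as one arc mirrors the bidirectional and unidirectional edges described in the text.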
According to this rule, we convert the participants normal depositor and VIP depositor in the XMI file of the bank counter system into the vertices FA1 and FA2, respectively, and the participants depositor and teller into the vertices A1 and A2, respectively. We then convert the use cases transfer, deposit, withdrawal and loss reporting into the vertices U1, U2, U3 and U4, respectively, and the use cases freeze account, personal transfer and public transfer into the vertices T1, B1 and B2, respectively. At the same time, we convert the generalization relationships between normal depositor and depositor and between VIP depositor and depositor into the unidirectional edges ef1 and ef2 (that is, the two vertices are connected by a single arc). The associations between the participant depositor and the use cases transfer, deposit, withdrawal and loss reporting are converted into the bidirectional edges eg1, eg2, eg3 and eg4, respectively (that is, the two vertices are connected by two arcs in opposite directions); the associations between the participant teller and the use cases transfer, deposit, withdrawal and loss reporting are converted into the bidirectional edges eg5, eg6, eg7 and eg8, respectively; and the relationships between the use cases freeze account, personal transfer and public transfer and their associated base use cases, such as loss reporting, are converted into the unidirectional edges et1, eb1 and eb2, respectively.
Similarly, we convert the participants temporary buyer and staff buyer in the XMI file of the warehouse management system into the vertices FA3 and FA4, respectively, and the participants warehouse buyer, logistics driver and warehouse manager into the vertices A3, A4 and A5, respectively. We then convert the use cases maintenance, warehouse entry, purchase and warehouse exit into the vertices U5, U6, U7 and U8, respectively, and the use cases self-maintenance and return-to-factory maintenance into the vertices B3 and B4, respectively. At the same time, we convert the generalization relationships between temporary buyer and warehouse buyer and between staff buyer and warehouse buyer into the unidirectional edges ef3 and ef4. The associations between the participant warehouse buyer and the use cases warehouse entry and purchase are converted into the bidirectional edges eg9 and eg10, respectively; the associations between the participant warehouse manager and the use cases maintenance, warehouse entry, purchase and warehouse exit are converted into the bidirectional edges eg11, eg12, eg13 and eg14, respectively; the association between the participant logistics driver and the use case warehouse exit is converted into the bidirectional edge eg15; and the relationships between the use cases self-maintenance and return-to-factory maintenance and the use case maintenance are converted into the unidirectional edges eb3 and eb4, respectively.
The converted directed graphs are shown in FIG. 6 and FIG. 7. The labels used in FIG. 6 and FIG. 7 for the different types of elements are listed in the label correspondence table, Table 2:
TABLE 2 Use case directed graph label correspondence table
No. | Label | Type | No. | Label | Type |
---|---|---|---|---|---|
1 | A1 | Participant | 24 | eg2 | Association relationship |
2 | A2 | Participant | 25 | eg3 | Association relationship |
3 | A3 | Participant | 26 | eg4 | Association relationship |
4 | A4 | Participant | 27 | eg5 | Association relationship |
5 | A5 | Participant | 28 | eg6 | Association relationship |
6 | FA1 | Participant (generalization) | 29 | eg7 | Association relationship |
7 | FA2 | Participant (generalization) | 30 | eg8 | Association relationship |
8 | FA3 | Participant (generalization) | 31 | eg9 | Association relationship |
9 | FA4 | Participant (generalization) | 32 | eg10 | Association relationship |
10 | U1 | Use case | 33 | eg11 | Association relationship |
11 | U2 | Use case | 34 | eg12 | Association relationship |
12 | U3 | Use case | 35 | eg13 | Association relationship |
13 | U4 | Use case | 36 | eg14 | Association relationship |
14 | U5 | Use case | 37 | eg15 | Association relationship |
15 | U6 | Use case | 38 | ef1 | Generalization relationship |
16 | U7 | Use case | 39 | ef2 | Generalization relationship |
17 | U8 | Use case | 40 | ef3 | Generalization relationship |
18 | B1 | Use case (inclusion) | 41 | ef4 | Generalization relationship |
19 | B2 | Use case (inclusion) | 42 | eb1 | Inclusion relationship |
20 | B3 | Use case (inclusion) | 43 | eb2 | Inclusion relationship |
21 | B4 | Use case (inclusion) | 44 | eb3 | Inclusion relationship |
22 | T1 | Use case (extension) | 45 | eb4 | Inclusion relationship |
23 | eg1 | Association relationship | 46 | et1 | Extension relationship |
It should be noted that no generalization relationship between use cases occurs in this example. If such a relationship occurs in an application, the corresponding node is labeled FUX, where X is the serial number of the label within its type.
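The labeling scheme of Table 2 (a type prefix followed by a serial number within that type) can be illustrated with a short sketch. The prefixes come from the table and the note above; the function name `assign_labels`, the type names used as dictionary keys and the shared-counter mechanism for continuing the numbering from one diagram to the next are assumptions made for illustration.

```python
from collections import defaultdict

# Label prefix per element type, following Table 2; "FU" is the prefix the
# text reserves for generalized use cases.
PREFIX = {
    "participant": "A",
    "generalized participant": "FA",
    "use case": "U",
    "generalized use case": "FU",
    "included use case": "B",
    "extended use case": "T",
}


def assign_labels(elements, counters=None):
    """Map each element name to prefix + serial number within its type.

    `counters` may be shared between calls so that numbering continues
    across diagrams, as it does between FIG. 6 and FIG. 7.
    """
    counters = defaultdict(int) if counters is None else counters
    labels = {}
    for name, kind in elements:
        counters[kind] += 1
        labels[name] = f"{PREFIX[kind]}{counters[kind]}"
    return labels


counters = defaultdict(int)
bank = [
    ("normal depositor", "generalized participant"),
    ("VIP depositor", "generalized participant"),
    ("depositor", "participant"),
    ("teller", "participant"),
    ("freeze account", "extended use case"),
]
warehouse = [
    ("temporary buyer", "generalized participant"),
    ("warehouse buyer", "participant"),
]
bank_labels = assign_labels(bank, counters)
warehouse_labels = assign_labels(warehouse, counters)
print(bank_labels["VIP depositor"], warehouse_labels["temporary buyer"])  # FA2 FA3
```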
(2) Calculating the maximum common subgraph between the compared directed graphs
The theoretical basis of the UML use case diagram similarity judgment based on the maximum common subgraph is as follows: the closer two graphs are in structure, the more parts they have in common, that is, a common subgraph exists between them. The maximum common subgraph of the two graphs can therefore be used to compare their degree of structural similarity. Before the comparison, the relevant concepts are defined.
Definition one (subgraph): given two graphs c1(Vc, Ec) and g(V, E), graph c1 is called a subgraph of graph g, written $c1 \subseteq g$, if:
(1) $V_c \subseteq V$;
(2) $E_c \subseteq E$.
Definition two (maximum common subgraph): given two graphs g1 and g2, if there is a graph m that satisfies the following conditions:
(1) $m \subseteq g1$;
(2) $m \subseteq g2$;
and no graph m′ satisfies the following conditions:
(1) $m' \subseteq g1$;
(2) $m' \subseteq g2$;
(3) $|m'| > |m|$;
then graph m is the maximum common subgraph of graphs g1 and g2, denoted maxcsg(g1, g2).
The maximum common subgraph is found in two major steps. Here we use the graphs g1(V1, E1) and g2(V2, E2) of FIG. 6 and FIG. 7 for illustration; a schematic diagram is shown in FIG. 8:
In the first step, we denote the maximum common subgraph of g1 and g2 as maxcsg, then traverse and compare the nodes of g1 and g2, and take the nodes of the same type that g1 and g2 have in common as the nodes of maxcsg.
In the second step, we traverse the nodes of the graph maxcsg obtained in the first step again; if two nodes are adjacent in both g1 and g2 and the edges connecting them are of the same type, an edge of that type is generated and added to the graph maxcsg.
Through the above steps we obtain the maximum common subgraph maxcsg of g1 and g2.
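The two steps can be sketched in code as follows. This is one possible reading, not the patented procedure itself: the text does not say how a node of g1 is paired with a node of g2 of the same type, so the sketch assumes a greedy pairing in order of decreasing degree, which makes it a heuristic rather than an exact maximum common subgraph search. The function name, the graph representation and the toy graphs are hypothetical.

```python
from collections import defaultdict


def max_common_subgraph(g1, g2):
    """Each graph is (vertices, edges): a dict id -> type and a set of (src, dst, type)."""
    v1, e1 = g1
    v2, e2 = g2

    def group_by_type(vertices, edges):
        degree = defaultdict(int)
        for src, dst, _ in edges:
            degree[src] += 1
            degree[dst] += 1
        groups = defaultdict(list)
        for vid, vtype in vertices.items():
            groups[vtype].append(vid)
        for ids in groups.values():
            # Assumed pairing order: highest degree first, ties broken by id.
            ids.sort(key=lambda vid: (-degree[vid], vid))
        return groups

    groups1, groups2 = group_by_type(v1, e1), group_by_type(v2, e2)

    # Step 1: nodes of the same type present in both graphs become nodes of maxcsg.
    mapping = {}
    for vtype, ids1 in groups1.items():
        for a, b in zip(ids1, groups2.get(vtype, [])):
            mapping[a] = b
    nodes = {a: v1[a] for a in mapping}

    # Step 2: keep an edge when the paired nodes are adjacent in both graphs
    # through an edge of the same type.
    edges = {
        (s, d, t)
        for (s, d, t) in e1
        if s in mapping and d in mapping and (mapping[s], mapping[d], t) in e2
    }
    return nodes, edges, mapping


# Toy graphs; an association is stored as two arcs in opposite directions.
toy1 = (
    {"A1": "participant", "U1": "use case", "U2": "use case"},
    {("A1", "U1", "association"), ("U1", "A1", "association"),
     ("A1", "U2", "association"), ("U2", "A1", "association")},
)
toy2 = (
    {"A3": "participant", "A4": "participant", "U5": "use case"},
    {("A3", "U5", "association"), ("U5", "A3", "association")},
)
nodes, edges, mapping = max_common_subgraph(toy1, toy2)
print(mapping)
print(sorted(edges))
```

In the toy example, one participant and one use case are paired, and the association between them survives as a pair of arcs. A different pairing rule could yield a different common subgraph, which is why the pairing order is flagged as an assumption.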
(3) Similarity calculation
After the maximum common subgraph maxcsg is obtained, the similarity is calculated from the proportion that the nodes and edges of maxcsg make up of the compared graphs. The specific similarity calculation formula is given in Formula 1:
$$Sim_{usecase}(g1, g2) = \alpha \sum_{v \in VT} \gamma_v \frac{\mathrm{VertexNum}(maxcsg, v)}{\max(\mathrm{VertexNum}(g1, v), \mathrm{VertexNum}(g2, v))} + \beta \sum_{x \in ET} \theta_x \frac{\mathrm{EdgeNum}(maxcsg, x)}{\max(\mathrm{EdgeNum}(g1, x), \mathrm{EdgeNum}(g2, x))} \quad (1)$$
Here VT denotes the element types present in the UML use case diagram, namely participants (normal participants and generalized participants) and use cases (normal, generalized, extended and included use cases); $\gamma_v$ is the weight set for nodes of element type v, which is defined manually, with $\sum_{v \in VT} \gamma_v = 1$; VertexNum(maxcsg, v) is the number of nodes of type v in the maximum common subgraph maxcsg; max(VertexNum(g1, v), VertexNum(g2, v)) is the larger of the numbers of nodes of type v in g1 and g2; ET denotes the relationship types present in the UML use case diagram, namely association, generalization, inclusion and extension; $\theta_x$ is the weight set for edges of type x, which is defined manually, with $\sum_{x \in ET} \theta_x = 1$; EdgeNum(maxcsg, x) is the number of edges of type x in the maximum common subgraph maxcsg; max(EdgeNum(g1, x), EdgeNum(g2, x)) is the larger of the numbers of edges of type x in g1 and g2; and α and β are set manually, with α, β ∈ (0, 1).
Based on the present case, we set α and β to 0.55 and 0.45, respectively; the γ values of the normal participant, generalized participant, normal use case, generalized use case, extended use case and included use case to 0.2, 0.2, 0.2, 0, 0.2 and 0.2, respectively; and the θ values of the association, generalization, inclusion and extension relationships to 0.2, 0.266, 0.266 and 0.266, respectively. From these, $Sim_{usecase}(g1, g2) = 0.7102$ can be calculated.
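The arithmetic behind this figure can be retraced with a short sketch of the weighted-ratio calculation: for each node type and each edge type, the count in maxcsg is divided by the larger of the counts in g1 and g2, multiplied by its weight, and the node and edge sums are combined using α and β. Several inputs are assumptions rather than values stated in the text: the six γ values are read as 0.2, 0.2, 0.2, 0, 0.2, 0.2 (consistent with the requirement that they sum to 1), and the per-type counts for maxcsg, in particular six common association edges, were chosen because they are consistent with the reported result. The function and variable names are hypothetical.

```python
def weighted_ratio(weights, common, count1, count2):
    """Sum over types of weight * common count / max(count in g1, count in g2)."""
    total = 0.0
    for kind, weight in weights.items():
        denom = max(count1.get(kind, 0), count2.get(kind, 0))
        if denom:  # a type absent from both graphs contributes nothing
            total += weight * common.get(kind, 0) / denom
    return total


def similarity(alpha, beta, gamma, theta, nodes, edges):
    """`nodes` and `edges` are (maxcsg, g1, g2) triples of per-type counts."""
    return (alpha * weighted_ratio(gamma, *nodes)
            + beta * weighted_ratio(theta, *edges))


gamma = {"participant": 0.2, "generalized participant": 0.2, "use case": 0.2,
         "generalized use case": 0.0, "extended use case": 0.2,
         "included use case": 0.2}  # assumed reading of the six gamma values
theta = {"association": 0.2, "generalization": 0.266,
         "inclusion": 0.266, "extension": 0.266}

# Per-type counts for g1 (bank counter) and g2 (warehouse) follow the text.
g1_nodes = {"participant": 2, "generalized participant": 2, "use case": 4,
            "extended use case": 1, "included use case": 2}
g2_nodes = {"participant": 3, "generalized participant": 2, "use case": 4,
            "included use case": 2}
g1_edges = {"association": 8, "generalization": 2, "inclusion": 2, "extension": 1}
g2_edges = {"association": 7, "generalization": 2, "inclusion": 2}

# Counts for maxcsg are assumptions consistent with the reported 0.7102.
mcs_nodes = {"participant": 2, "generalized participant": 2, "use case": 4,
             "included use case": 2}
mcs_edges = {"association": 6, "generalization": 2, "inclusion": 2}

sim = similarity(0.55, 0.45, gamma, theta,
                 (mcs_nodes, g1_nodes, g2_nodes),
                 (mcs_edges, g1_edges, g2_edges))
print(round(sim, 4))  # 0.7102
```

Under these assumptions the node term is about 0.7333 and the edge term is 0.682, so 0.55 × 0.7333 + 0.45 × 0.682 ≈ 0.7102.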
This completes the similarity judgment for g1 and g2; in practical applications, each parameter can be adjusted to meet the requirements.
It will be understood that the invention has been described in terms of several embodiments, and that various changes and equivalents may be made to these features and embodiments by those skilled in the art without departing from the spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.
Claims (3)
1. A use case diagram similarity judging method based on maximum common subgraph calculation, characterized by comprising the following steps:
step 1: preprocessing the UML use case diagrams to be compared, and representing each UML use case diagram as a directed graph; the UML use case diagram consists of participants, use cases and the relationships among these elements, wherein the participants of the UML use case diagram are the users, organizations or external systems that interact with the application or system; the use cases are the functions contained in the system; and the relationships among the elements comprise association, generalization, inclusion and extension relationships;
step 1.1: the preprocessing of the UML use case diagram extracts the elements and data in the diagram by converting it into a structured markup language; the UML use case diagrams are built using an open-source UML modeling tool, so each use case diagram is exported as an XML Metadata Interchange (XMI) file;
step 1.2: parsing the XMI file and representing its elements as a directed graph; let g(V, E) be a graph comprising a vertex set V and a directed edge set E, wherein each vertex in V represents a participant or a use case, and each directed edge in E represents a relationship between two participants, between a participant and a use case, or between two use cases;
step 2: calculating the maximum common subgraph of the directed graphs to be compared;
step 3: calculating the similarity using a similarity determination algorithm.
2. The use case graph similarity judging method based on the maximum public subgraph calculation according to claim 1, wherein the theoretical basis of the UML use case similarity judgment based on the maximum public subgraph is: if two graphs are closer in structure, the more common parts of the two graphs are, i.e. there will be a common sub-graph between them, whereby we can use the largest common sub-graph of the two graphs to compare their degree of similarity in structure; the related concepts are defined as follows:
defining a sub-graph: given two graphs c1 (Vc, ec) and g (V, E), we call graph c1 a sub-graph of graph g, written as
(1)
(2)
Defining two maximum public subgraphs: given two graphs g1 and g2, if there is an additional graph m, the following condition can be satisfied:
(1)
(2)
and no graph m' satisfies the following condition:
(1)
(2)
(3)|m′|>|m|;
then figure m is the largest common subgraph of figures g1 and g2, denoted maxcsg (g 1, g 2);
step 2.1: letting the maximum common subgraph of g1 and g2 be maxcsg, traversing and comparing the nodes of g1 and g2, and taking the nodes of the same type that are common to g1 and g2 as the nodes of maxcsg;
step 2.2: traversing the nodes of the graph maxcsg obtained in step 2.1 again, and, if two nodes are adjacent in both g1 and g2 and the edges connecting them are of the same type, generating an edge of the corresponding type and adding it to the graph maxcsg;
step 2.3: obtaining the maximum common subgraph maxcsg of g1 and g2.
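Steps 2.1 to 2.3 can be sketched as follows. The sketch assumes that a node is "common" to both graphs when it has the same name and the same type in each, which reduces the construction to label matching rather than a general (NP-hard) maximum common subgraph search. The `UseCaseGraph` class, the function name and the sample graphs are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UseCaseGraph:
    # node name -> node type ("actor" or "usecase")
    nodes: dict = field(default_factory=dict)
    # set of (source name, target name, edge type) triples
    edges: set = field(default_factory=set)


def max_common_subgraph(g1, g2):
    # Step 2.1: keep nodes that appear in both graphs with the same type.
    common_nodes = {
        name: kind for name, kind in g1.nodes.items()
        if g2.nodes.get(name) == kind
    }
    # Step 2.2: keep edges that join two common nodes in both graphs
    # and carry the same edge type in both.
    common_edges = {
        (src, dst, kind) for (src, dst, kind) in g1.edges & g2.edges
        if src in common_nodes and dst in common_nodes
    }
    # Step 2.3: the result is maxcsg under the matching assumption above.
    return UseCaseGraph(common_nodes, common_edges)


g1 = UseCaseGraph(
    nodes={"Customer": "actor", "Place Order": "usecase", "Pay": "usecase"},
    edges={("Customer", "Place Order", "association"),
           ("Place Order", "Pay", "include")},
)
g2 = UseCaseGraph(
    nodes={"Customer": "actor", "Place Order": "usecase", "Track Order": "usecase"},
    edges={("Customer", "Place Order", "association"),
           ("Customer", "Track Order", "association")},
)

maxcsg = max_common_subgraph(g1, g2)
print(maxcsg.nodes)
print(maxcsg.edges)
```

Here only "Customer" and "Place Order" occur in both graphs, so maxcsg keeps those two nodes and the single association between them.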
3. The use case diagram similarity judging method based on maximum common subgraph calculation according to claim 2, wherein step 3 comprises the following specific steps:
after the maximum common subgraph maxcsg is obtained, the similarity is calculated from the proportion of the nodes and edges of maxcsg in the graphs being compared, and the similarity calculation formula is as follows:
wherein VT represents the set of node types (participants and use cases) present in the UML use case diagram; $\gamma_v$ represents the manually defined weight set for nodes of element type $v$, with $\sum_{v \in VT} \gamma_v = 1$; VertexNum(maxcsg, v) represents the number of nodes of type v in the maximum common subgraph maxcsg; max(VertexNum(g1, v), VertexNum(g2, v)) represents the larger of the numbers of nodes of type v in g1 and g2; ET represents the set of relationship types (association, generalization, inclusion, extension, etc.) present in the UML use case diagram; $\theta_x$ represents the manually defined weight set for edges of type $x$, with $\sum_{x \in ET} \theta_x = 1$; EdgeNum(maxcsg, x) represents the number of edges of type x in the maximum common subgraph maxcsg; max(EdgeNum(g1, x), EdgeNum(g2, x)) represents the larger of the numbers of edges of type x in g1 and g2; and $\alpha$ and $\beta$ are set manually, with $\alpha, \beta \in (0, 1)$.
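The formula itself is not reproduced in this text. One reading consistent with the symbol definitions, offered as an assumption rather than as the claimed formula, is $Sim(g_1, g_2) = \alpha \sum_{v \in VT} \gamma_v \frac{VertexNum(maxcsg, v)}{\max(VertexNum(g_1, v), VertexNum(g_2, v))} + \beta \sum_{x \in ET} \theta_x \frac{EdgeNum(maxcsg, x)}{\max(EdgeNum(g_1, x), EdgeNum(g_2, x))}$. The sketch below computes that assumed form. The function names, the weight values and the decision to skip a type that is absent from both graphs (to avoid dividing by zero) are illustrative choices.

```python
from collections import Counter


def weighted_ratio(common, a, b, weights):
    """Sum of weight * count(common) / max(count(a), count(b)) over all types."""
    total = 0.0
    for kind, weight in weights.items():
        denom = max(a[kind], b[kind])
        if denom:  # assumption: a type absent from both graphs contributes 0
            total += weight * common[kind] / denom
    return total


def similarity(g1, g2, maxcsg, node_w, edge_w, alpha, beta):
    """Each graph is (nodes, edges): nodes maps name -> type,
    edges is a set of (source, target, type) triples."""
    def node_counts(g):
        return Counter(g[0].values())

    def edge_counts(g):
        return Counter(kind for _, _, kind in g[1])

    node_term = weighted_ratio(node_counts(maxcsg), node_counts(g1),
                               node_counts(g2), node_w)
    edge_term = weighted_ratio(edge_counts(maxcsg), edge_counts(g1),
                               edge_counts(g2), edge_w)
    return alpha * node_term + beta * edge_term


# Hypothetical graphs and their common subgraph.
g1 = ({"Customer": "actor", "Place Order": "usecase", "Pay": "usecase"},
      {("Customer", "Place Order", "association"),
       ("Place Order", "Pay", "include")})
g2 = ({"Customer": "actor", "Place Order": "usecase", "Track Order": "usecase"},
      {("Customer", "Place Order", "association"),
       ("Customer", "Track Order", "association")})
maxcsg = ({"Customer": "actor", "Place Order": "usecase"},
          {("Customer", "Place Order", "association")})

# Manually chosen weights; each set sums to 1 as the definitions require.
node_w = {"actor": 0.5, "usecase": 0.5}
edge_w = {"association": 0.4, "include": 0.2, "extend": 0.2, "generalization": 0.2}

score = similarity(g1, g2, maxcsg, node_w, edge_w, alpha=0.5, beta=0.5)
print(round(score, 3))
```

With these inputs the node term is 0.5·1/1 + 0.5·1/2 = 0.75 and the edge term is 0.4·1/2 = 0.2, so the score is 0.5·0.75 + 0.5·0.2 = 0.475. The source only states that α and β lie in (0, 1); choosing them so that α + β = 1 is an additional assumption that bounds the score by 1.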
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110886966.XA CN113743467B (en) | 2021-08-03 | 2021-08-03 | Case diagram similarity judging method based on maximum public subgraph calculation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113743467A (en) | 2021-12-03 |
CN113743467B (en) | 2024-01-12 |
Family
ID=78729978
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110886966.XA Active CN113743467B (en) | 2021-08-03 | 2021-08-03 | Case diagram similarity judging method based on maximum public subgraph calculation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113743467B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114201598B (en) * | 2022-02-18 | 2022-05-17 | 药渡经纬信息科技(北京)有限公司 | Text recommendation method and text recommendation device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008146300A (en) * | 2006-12-08 | 2008-06-26 | Nec Corp | Information processor, information processing method and program |
CN107133257A (en) * | 2017-03-21 | 2017-09-05 | 华南师范大学 | A kind of similar entities recognition methods and system based on center connected subgraph |
CN108363563A (en) * | 2018-02-05 | 2018-08-03 | 海南大学 | Uml model consistency detecting method based on data collection of illustrative plates, Information Atlas and knowledge mapping framework |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9158503B2 (en) * | 2013-10-08 | 2015-10-13 | King Fahd University Of Petroleum And Minerals | UML model integration and refactoring method |
Non-Patent Citations (1)
Title |
---|
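Research on an ontology mapping method based on the maximum common subgraph; Guo Zhuwei; Liu Shengquan; Liu Yan; Zhao Meiling; Fu Xianzhe; Computer Engineering (05); full text * |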
Also Published As
Publication number | Publication date |
---|---|
CN113743467A (en) | 2021-12-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||