CN113743467B - Case diagram similarity judging method based on maximum public subgraph calculation - Google Patents


Publication number
CN113743467B
CN113743467B CN202110886966.XA CN202110886966A CN113743467B CN 113743467 B CN113743467 B CN 113743467B CN 202110886966 A CN202110886966 A CN 202110886966A CN 113743467 B CN113743467 B CN 113743467B
Authority
CN
China
Prior art keywords
graph
use case
uml
similarity
maxcsg
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110886966.XA
Other languages
Chinese (zh)
Other versions
CN113743467A (en)
Inventor
汪烨
宋师哲
周澳回
姜波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Gongshang University
Original Assignee
Zhejiang Gongshang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Gongshang University
Priority to CN202110886966.XA
Publication of CN113743467A
Application granted
Publication of CN113743467B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of software development and discloses a use case diagram similarity judging method based on maximum common subgraph calculation, comprising the following steps: step 1: preprocess the UML use case diagrams to be compared and represent each of them as a directed graph; step 2: compute the maximum common subgraph of the directed graphs to be compared; step 3: calculate the similarity using a similarity determination algorithm. Because the maximum common subgraph algorithm used by the method is procedurally simple and analyzes the graph structure directly, the method is efficient, convenient to use, and widely applicable.

Description

Use case diagram similarity judging method based on maximum common subgraph calculation
Technical Field
The invention belongs to the technical field of software development, and particularly relates to a use case diagram similarity judging method based on maximum common subgraph calculation.
Background
Software development frequently relies on software reuse, that is, building new software from existing software artifacts such as code fragments, designs, test data, or cost estimates. Software reuse saves development cost and time and improves the software development process. As software grows more complex, reuse is no longer limited to code and now extends to every stage of the software lifecycle, including requirements analysis, design, testing, and even maintenance.
Requirements analysis is the key basis for software design, implementation, testing, and maintenance, and it sets the working direction and development strategy for developers. The use case diagram is the simplest representation of the interaction between a user and a system. Because it is intuitive, standardized, and precise, it has become the most commonly used tool in the requirements analysis stage and plays a vital role in requirements gathering. Reusing use case diagrams helps developers build new use case models quickly and settle requirements sooner, which improves efficiency. However, the semi-structured nature of use case diagrams makes them difficult to reuse.
In research on judging the similarity of UML use case diagrams, Reza Fauzan, Daniel Siahaan, Siti Rochimah et al. proposed a method that combines cosine similarity with word-level semantic similarity to judge the similarity between two UML use case diagrams. The method preprocesses and structures the two diagrams and then judges the semantic similarity of two words using the Wu-Palmer measure over WordNet. Wu-Palmer yields a similarity value in the range [0, 1] for two words in WordNet, where 0 indicates no similarity and 1 indicates complete similarity. Because the text in a UML use case diagram may consist of several words, the method combines cosine similarity with Wu-Palmer to obtain an overall measure. The disadvantage of this approach is that it cannot correctly judge the similarity of two UML use case diagrams that come from different projects but are similar in structure.
As shown in fig. 1 and 2, if the two UML use case diagrams are compared using the semantic similarity method proposed by Reza Fauzan et al., they are judged dissimilar (the value falls below the set threshold) because their text similarity is low. In reality, the two diagrams differ only in text content and are highly similar in structure; in actual software reuse, the use case diagram in fig. 1 can be reused to generate the one in fig. 2. This gap between the expected and actual results shows that reusing UML use case diagrams requires considering structural similarity as well as textual semantic similarity. The present method converts the UML use case diagrams into a structured form and computes their maximum common subgraph as the basis for the similarity judgment, which addresses the weak applicability of semantic similarity to judging structural similarity.
In addition, Mohammad Nazir Arifin, Daniel Siahaan et al. proposed a method that judges the similarity between two UML use case diagrams using the minimum edit distance together with word-level semantic similarity. That method likewise preprocesses and structures the two diagrams, measures structural similarity by the minimum edit distance between them, and then combines it with the weighted text semantic similarity to compute the final similarity. However, the minimum edit distance algorithm is procedurally complex and has high time complexity, so it executes inefficiently. The maximum common subgraph algorithm used by the present method is procedurally simpler and analyzes the graph structure directly, which makes the method efficient, convenient to use, and widely applicable.
Disclosure of Invention
The invention aims to provide a use case diagram similarity judging method based on maximum common subgraph calculation, which supports software reuse and ultimately improves software development efficiency.
To solve the above technical problem, the specific technical scheme of the use case diagram similarity judging method based on maximum common subgraph calculation is as follows:
a use case diagram similarity judging method based on maximum common subgraph calculation comprises the following steps:
step 1: preprocess the UML use case diagrams to be compared and represent each of them as a directed graph;
step 2: compute the maximum common subgraph of the directed graphs to be compared;
step 3: calculate the similarity using a similarity determination algorithm.
Further, the UML use case diagram consists of participants, use cases, and the relationships among these elements, wherein a participant is a user, organization, or external system that interacts with the application or system; a use case is a function contained in the system; and the relationships among the elements comprise the association, generalization, include, and extend relationships.
Further, the step 1 comprises the following specific steps:
step 1.1: preprocessing extracts the elements and data of the UML use case diagram by converting the diagram into a structured markup language; the UML use case diagrams are built with an open-source UML modeling tool, so each use case diagram (model) is exported as an XML Metadata Interchange (XMI) file;
step 1.2: parse the XMI file and represent its elements as a directed graph; let g(V, E) be a graph comprising a vertex set V and a directed edge set E, where each vertex in V represents a participant or a use case, and each directed edge in E represents a relationship between two participants, between a participant and a use case, or between two use cases.
Further, the theoretical basis of the UML use case diagram similarity judgment based on the maximum common subgraph is:
the closer two graphs are in structure, the more parts they have in common, that is, they share a common subgraph; the maximum common subgraph of the two graphs can therefore be used to compare their degree of structural similarity; the related concepts are defined as follows:
definition one (subgraph): given two graphs $c_1(V_c, E_c)$ and $g(V, E)$, $c_1$ is called a subgraph of $g$, written $c_1 \subseteq g$, if the following conditions are satisfied:
(1) $V_c \subseteq V$;
(2) $E_c \subseteq E$;
definition two (maximum common subgraph): given two graphs $g_1$ and $g_2$, if there is a graph $m$ that satisfies the following conditions:
(1) $m \subseteq g_1$;
(2) $m \subseteq g_2$;
and no graph $m'$ satisfies the following conditions:
(1) $m' \subseteq g_1$;
(2) $m' \subseteq g_2$;
(3) $|m'| > |m|$;
then graph $m$ is the maximum common subgraph of $g_1$ and $g_2$, denoted $\mathrm{maxcsg}(g_1, g_2)$;
step 2.1: denote the maximum common subgraph of g1 and g2 as maxcsg; traverse and compare the nodes of g1 and g2, and take the nodes of the same type that g1 and g2 have in common as the nodes of maxcsg;
step 2.2: traverse the nodes of maxcsg obtained in step 2.1 again; if two nodes are adjacent in both g1 and g2 and the edges connecting them are of the same type, generate an edge of that type and add it to maxcsg;
step 2.3: obtain the maximum common subgraph maxcsg of g1 and g2.
Further, the step 3 comprises the following specific steps:
after the maximum common subgraph maxcsg is obtained, the similarity is calculated from the proportion of the nodes and edges of maxcsg in the compared graphs; the similarity calculation formula is:
$$\mathrm{SimUseCase}(g_1, g_2) = \alpha \sum_{v \in VT} \gamma_v \frac{\mathrm{VertexNum}(\mathrm{maxcsg}, v)}{\max(\mathrm{VertexNum}(g_1, v), \mathrm{VertexNum}(g_2, v))} + \beta \sum_{x \in ET} \theta_x \frac{\mathrm{EdgeNum}(\mathrm{maxcsg}, x)}{\max(\mathrm{EdgeNum}(g_1, x), \mathrm{EdgeNum}(g_2, x))}$$
where VT is the set of node types present in the UML use case diagram, namely participants (ordinary participants and generalized participants) and use cases (ordinary, generalized, extend, and include use cases); $\gamma_v$ is the manually defined weight of node type $v$, with $\sum_{v \in VT} \gamma_v = 1$; VertexNum(maxcsg, v) is the number of nodes of type v in the maximum common subgraph maxcsg; max(VertexNum(g1, v), VertexNum(g2, v)) is the larger of the numbers of nodes of type v in g1 and g2; ET is the set of relationship types present in the UML use case diagram, namely association, generalization, include, and extend; $\theta_x$ is the manually defined weight of edge type $x$, with $\sum_{x \in ET} \theta_x = 1$; EdgeNum(maxcsg, x) is the number of edges of type x in maxcsg; max(EdgeNum(g1, x), EdgeNum(g2, x)) is the larger of the numbers of edges of type x in g1 and g2; and $\alpha$ and $\beta$ are set manually, with $\alpha, \beta \in (0, 1)$.
The use case diagram similarity judging method based on maximum common subgraph calculation has the following advantages: the maximum common subgraph algorithm it uses is procedurally simple and analyzes the graph structure directly, so the method is efficient, convenient to use, and widely applicable.
Drawings
FIG. 1 is a UML use case diagram of a banking system;
FIG. 2 is a UML use case diagram of a warehouse management system;
FIG. 3 is an overview of the method of the present invention;
FIG. 4 is a UML use case diagram of the bank counter system in an embodiment of the present invention;
FIG. 5 is a UML use case diagram of the warehouse management system in an embodiment of the present invention;
FIG. 6 is the directed graph $g_1(V_1, E_1)$;
FIG. 7 is the warehouse management system directed graph $g_2(V_2, E_2)$;
FIG. 8 is a diagram of the maximum common subgraph generation process of the present invention.
Detailed Description
For a better understanding of the purpose, structure, and function of the present invention, the use case diagram similarity judging method based on maximum common subgraph calculation is described in further detail below with reference to the accompanying drawings.
An overview of the method is shown in fig. 3. To judge the similarity between two UML use case diagrams, we compute their maximum common subgraph. To obtain a maximum common subgraph that the algorithm can use, the UML use case diagrams are first preprocessed. The algorithm proceeds in the following steps:
1) Preprocess the UML use case diagrams. In this step, each UML use case diagram is first converted into an XMI file, and the XMI file is then converted into a directed graph through custom rules.
2) Compute the maximum common subgraph of the compared directed graphs. In this step, the maximum common subgraph of the directed graphs to be compared is computed by the maximum common subgraph algorithm.
3) Calculate the similarity. In this step, the maximum common subgraph is substituted into the custom similarity calculation formula to obtain the similarity between the diagrams.
The method comprises the following specific steps:
(1) Preprocessing
A UML use case diagram consists of participants, use cases, and the relationships among these elements. A participant is a user, organization, or external system that interacts with the application or system and is typically drawn as a stick figure. A use case is a function contained in the system and is typically drawn as an ellipse. The relationships between elements are the association, generalization, include, and extend relationships. The roles and notations of the relationships are shown in Table 1:
TABLE 1 Element relationship table
Preprocessing mainly extracts the elements and data of a UML use case diagram by converting the diagram into a structured markup language. The UML use case diagrams are built with an open-source UML modeling tool, so we export each use case diagram (model) as an XML Metadata Interchange (XMI) file.
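The extraction step can be sketched as follows. This is a minimal illustration under stated assumptions, not the parser used by the method: the sample document is a hand-written, simplified XMI 2.x-style fragment, the element and attribute names (`packagedElement`, `ownedEnd`, `include`, `extend`, `generalization`) follow common UML XMI conventions but real exports differ between modeling tools, and `SAMPLE_XMI` and `parse_use_case_xmi` are hypothetical names.

```python
import xml.etree.ElementTree as ET

XMI_NS = "http://www.omg.org/spec/XMI/20131001"
XMI_TYPE = f"{{{XMI_NS}}}type"  # ElementTree key for the xmi:type attribute
XMI_ID = f"{{{XMI_NS}}}id"      # ElementTree key for the xmi:id attribute

# Hypothetical, simplified export; real tool output is usually more verbose.
SAMPLE_XMI = """<xmi:XMI xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
         xmlns:uml="http://www.omg.org/spec/UML/20131001">
  <uml:Model xmi:id="m1" name="BankCounter">
    <packagedElement xmi:type="uml:Actor" xmi:id="a1" name="Depositor"/>
    <packagedElement xmi:type="uml:Actor" xmi:id="a2" name="VIP depositor">
      <generalization xmi:id="gen1" general="a1"/>
    </packagedElement>
    <packagedElement xmi:type="uml:UseCase" xmi:id="u1" name="Transfer">
      <include xmi:id="i1" addition="u2"/>
    </packagedElement>
    <packagedElement xmi:type="uml:UseCase" xmi:id="u2" name="Personal transfer"/>
    <packagedElement xmi:type="uml:Association" xmi:id="as1">
      <ownedEnd xmi:id="e1" type="a1"/>
      <ownedEnd xmi:id="e2" type="u1"/>
    </packagedElement>
  </uml:Model>
</xmi:XMI>"""


def parse_use_case_xmi(text):
    """Return (nodes, edges): nodes maps id -> (kind, name); edges are (src, dst, relation)."""
    root = ET.fromstring(text)
    nodes, edges = {}, []
    for el in root.iter("packagedElement"):
        kind, ident = el.get(XMI_TYPE), el.get(XMI_ID)
        if kind in ("uml:Actor", "uml:UseCase"):
            nodes[ident] = (kind.split(":")[1], el.get("name"))
            for gen in el.findall("generalization"):
                edges.append((ident, gen.get("general"), "generalization"))
            for inc in el.findall("include"):
                edges.append((ident, inc.get("addition"), "include"))
            for ext in el.findall("extend"):
                edges.append((ident, ext.get("extendedCase"), "extend"))
        elif kind == "uml:Association":
            a, b = [end.get("type") for end in el.findall("ownedEnd")]
            # An association becomes two arcs in opposite directions.
            edges.append((a, b, "association"))
            edges.append((b, a, "association"))
    return nodes, edges


nodes, edges = parse_use_case_xmi(SAMPLE_XMI)
print(nodes)
print(edges)
```

The direction chosen for the generalization and include arcs (from the owning element to the referenced one) is also an assumption of this sketch.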
The bank counter system and the warehouse management system are representative of financial service systems and warehousing service systems respectively, so we use them as examples to describe the method. Their UML use case diagrams are shown in fig. 4 and 5:
In the UML use case diagram of the bank counter system in fig. 4, there are four participants (ordinary depositor, VIP depositor, depositor, teller), seven use cases (transfer, deposit, withdrawal, loss reporting, personal transfer, public transfer, freeze account), and thirteen relationships (association, include, extend, generalization).
In the UML use case diagram of the warehouse management system in fig. 5, there are five participants (temporary buyer, on-staff buyer, warehouse buyer, logistics driver, warehouse manager), six use cases (maintenance, warehouse entry, purchase, warehouse exit, self-maintenance, return-to-factory maintenance), and eleven relationships (association, include, generalization).
We first use a tool to convert each UML use case diagram into a file in XMI format.
The next step is to parse the XMI file and represent its elements as a directed graph. The method defines g(V, E) as a graph comprising a vertex set V and a directed edge set E. Each vertex in V represents a participant or a use case. Each directed edge in E represents a relationship between two participants, between a participant and a use case, or between two use cases.
According to this rule, we convert the participants ordinary depositor and VIP depositor in the XMI file of the bank counter system to vertices FA1 and FA2, respectively, and the participants depositor and teller to vertices A1 and A2, respectively. We then convert the use cases transfer, deposit, withdrawal, and loss reporting to vertices U1, U2, U3, and U4, respectively, and the use cases freeze account, personal transfer, and public transfer to vertices T1, B1, and B2, respectively. We convert the generalization relationships between the ordinary depositor and the depositor and between the VIP depositor and the depositor into one-way edges ef1 and ef2 (a one-way edge connects two vertices with a single arc). The associations between the participant depositor and the use cases transfer, deposit, withdrawal, and loss reporting are converted into two-way edges eg1, eg2, eg3, and eg4 (a two-way edge connects two vertices with two arcs in opposite directions); the associations between the participant teller and the same four use cases are converted into two-way edges eg5, eg6, eg7, and eg8; and the extend and include relationships of the use cases freeze account, personal transfer, and public transfer are converted into one-way edges et1, eb1, and eb2, respectively.
Similarly, we convert the participants temporary buyer and on-staff buyer in the XMI file of the warehouse management system to vertices FA3 and FA4, respectively, and the participants warehouse buyer, logistics driver, and warehouse manager to vertices A3, A4, and A5, respectively. We then convert the use cases maintenance, warehouse entry, purchase, and warehouse exit to vertices U5, U6, U7, and U8, respectively, and the use cases self-maintenance and return-to-factory maintenance to vertices B3 and B4, respectively. We convert the generalization relationships between the temporary buyer and the warehouse buyer and between the on-staff buyer and the warehouse buyer into one-way edges ef3 and ef4. The associations between the participant warehouse buyer and the use cases warehouse entry and purchase are converted into two-way edges eg9 and eg10; the associations between the participant warehouse manager and the use cases maintenance, warehouse entry, purchase, and warehouse exit are converted into two-way edges eg11, eg12, eg13, and eg14; the association between the participant logistics driver and the use case warehouse exit is converted into the two-way edge eg15; and the include relationships between the use cases self-maintenance and return-to-factory maintenance and the use case maintenance are converted into one-way edges eb3 and eb4, respectively.
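The conversion rule can be illustrated with a small typed directed graph for the bank counter system. This is a sketch only: the class `UseCaseGraph` and its method names are hypothetical, and the endpoints and directions chosen for the generalization, include, and extend arcs (for example B1 and B2 pointing at U1, T1 pointing at U4) are assumptions, since the figures that fix them are not reproduced in the text.

```python
from collections import Counter


class UseCaseGraph:
    """Typed directed graph: labelled vertices plus typed arcs (hypothetical helper)."""

    def __init__(self):
        self.vertex_type = {}  # label -> vertex type
        self.arcs = set()      # (source, target, relation) triples
        self.relations = []    # one relation name per diagram relationship

    def add_vertices(self, vertex_type, *labels):
        for label in labels:
            self.vertex_type[label] = vertex_type

    def one_way(self, source, target, relation):
        """A one-way edge is a single arc."""
        self.arcs.add((source, target, relation))
        self.relations.append(relation)

    def two_way(self, a, b, relation="association"):
        """A two-way edge is two arcs in opposite directions."""
        self.arcs.add((a, b, relation))
        self.arcs.add((b, a, relation))
        self.relations.append(relation)


g1 = UseCaseGraph()
g1.add_vertices("participant", "A1", "A2")
g1.add_vertices("generalized participant", "FA1", "FA2")
g1.add_vertices("use case", "U1", "U2", "U3", "U4")
g1.add_vertices("include use case", "B1", "B2")
g1.add_vertices("extend use case", "T1")

# eg1..eg8: depositor (A1) and teller (A2) are each associated with U1..U4.
for participant in ("A1", "A2"):
    for use_case in ("U1", "U2", "U3", "U4"):
        g1.two_way(participant, use_case)

g1.one_way("FA1", "A1", "generalization")  # ef1 (direction assumed)
g1.one_way("FA2", "A1", "generalization")  # ef2 (direction assumed)
g1.one_way("B1", "U1", "include")          # eb1 (endpoints assumed)
g1.one_way("B2", "U1", "include")          # eb2 (endpoints assumed)
g1.one_way("T1", "U4", "extend")           # et1 (endpoints assumed)

print(Counter(g1.vertex_type.values()))
print(Counter(g1.relations))
```

Keeping one entry per relationship in `relations`, separately from the arcs, lets a two-way association count once when relationships are tallied by type.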
The converted directed graphs are shown in fig. 6 and fig. 7. The labels used for the different element types in fig. 6 and fig. 7 are listed in Table 2:
TABLE 2 Use case directed graph label correspondence table
No.  Label  Type                          No.  Label  Type
1    A1     Participant                   24   eg2    Association
2    A2     Participant                   25   eg3    Association
3    A3     Participant                   26   eg4    Association
4    A4     Participant                   27   eg5    Association
5    A5     Participant                   28   eg6    Association
6    FA1    Participant (generalization)  29   eg7    Association
7    FA2    Participant (generalization)  30   eg8    Association
8    FA3    Participant (generalization)  31   eg9    Association
9    FA4    Participant (generalization)  32   eg10   Association
10   U1     Use case                      33   eg11   Association
11   U2     Use case                      34   eg12   Association
12   U3     Use case                      35   eg13   Association
13   U4     Use case                      36   eg14   Association
14   U5     Use case                      37   eg15   Association
15   U6     Use case                      38   ef1    Generalization
16   U7     Use case                      39   ef2    Generalization
17   U8     Use case                      40   ef3    Generalization
18   B1     Use case (include)            41   ef4    Generalization
19   B2     Use case (include)            42   eb1    Include
20   B3     Use case (include)            43   eb2    Include
21   B4     Use case (include)            44   eb3    Include
22   T1     Use case (extend)             45   eb4    Include
23   eg1    Association                   46   et1    Extend
Note that no generalization relationship between use cases occurs in this example; if one occurs in an application, the corresponding node is labeled FUX, where X is the serial number within that label type.
(2) Calculating the maximum common subgraph between the compared directed graphs
The theoretical basis for UML use case diagram similarity judgment based on the maximum common subgraph is as follows: the closer two graphs are in structure, the more parts they have in common, that is, they share a common subgraph. The maximum common subgraph of the two graphs can therefore be used to compare their degree of structural similarity. Before the comparison, the relevant concepts are defined.
Definition one (subgraph): given two graphs $c_1(V_c, E_c)$ and $g(V, E)$, $c_1$ is called a subgraph of $g$, written $c_1 \subseteq g$, if the following conditions are satisfied:
(1) $V_c \subseteq V$;
(2) $E_c \subseteq E$.
Definition two (maximum common subgraph): given two graphs $g_1$ and $g_2$, if there is a graph $m$ that satisfies the following conditions:
(1) $m \subseteq g_1$;
(2) $m \subseteq g_2$;
and no graph $m'$ satisfies the following conditions:
(1) $m' \subseteq g_1$;
(2) $m' \subseteq g_2$;
(3) $|m'| > |m|$;
then graph $m$ is the maximum common subgraph of $g_1$ and $g_2$, denoted $\mathrm{maxcsg}(g_1, g_2)$.
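The two definitions can be checked mechanically on small graphs. The sketch below assumes the usual reading of definition one (the vertex set and the edge set of the subgraph are subsets of those of the larger graph) and assumes that the size |m| counts vertices plus edges; the function names are hypothetical, and the toy graphs share vertex labels, whereas the method itself matches nodes by type.

```python
def is_subgraph(c, g):
    """c and g are (vertices, edges) pairs of sets; edges are (source, target) tuples."""
    c_vertices, c_edges = c
    g_vertices, g_edges = g
    return c_vertices <= g_vertices and c_edges <= g_edges


def is_common_subgraph(m, g1, g2):
    """Conditions (1) and (2) of definition two."""
    return is_subgraph(m, g1) and is_subgraph(m, g2)


def size(graph):
    """Assumed size measure |m|: number of vertices plus number of edges."""
    vertices, edges = graph
    return len(vertices) + len(edges)


g1 = ({"A", "U1", "U2"}, {("A", "U1"), ("A", "U2")})
g2 = ({"A", "U1", "U3"}, {("A", "U1"), ("A", "U3")})
m = ({"A", "U1"}, {("A", "U1")})
smaller = ({"A"}, set())

print(is_common_subgraph(m, g1, g2), size(m))
print(is_common_subgraph(smaller, g1, g2), size(smaller))
```

Both `m` and `smaller` are common subgraphs here; `m` is the larger of the two, which is what condition (3) of definition two compares.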
The maximum common subgraph is solved in two major steps. We illustrate them with the graphs $g_1(V_1, E_1)$ and $g_2(V_2, E_2)$ from fig. 6 and fig. 7; a schematic diagram is shown in fig. 8:
In the first step, we denote the maximum common subgraph of g1 and g2 as maxcsg, then traverse and compare the nodes of g1 and g2, and take the nodes of the same type that g1 and g2 have in common as the nodes of maxcsg.
In the second step, we traverse the nodes of maxcsg obtained in the first step again; if two nodes are adjacent in both g1 and g2 and the edges connecting them are of the same type, we generate an edge of that type and add it to maxcsg.
Through the above steps we obtain the maximum common subgraph maxcsg of g1 and g2.
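One possible reading of the two steps is sketched below. The text does not say how nodes of the same type are paired across the two graphs, so this sketch makes an assumption: it tries every type-preserving one-to-one pairing and keeps the one that preserves the most typed arcs. That exhaustive search is only practical for small diagrams. The function names and the toy graphs are hypothetical.

```python
from itertools import permutations, product


def candidate_pairings(types1, types2):
    """Step 1: for each shared node type, list every way to pair g1 nodes with distinct g2 nodes."""
    options = []
    for kind in sorted(set(types1.values()) & set(types2.values())):
        a = sorted(n for n, t in types1.items() if t == kind)
        b = sorted(n for n, t in types2.items() if t == kind)
        if len(a) <= len(b):
            options.append([list(zip(a, p)) for p in permutations(b, len(a))])
        else:
            options.append([list(zip(p, b)) for p in permutations(a, len(b))])
    return options


def max_common_subgraph(types1, edges1, types2, edges2):
    """Return (mapping, common_edges); common edges are arcs of g1 that also exist in g2."""
    best_map, best_edges = {}, set()
    for choice in product(*candidate_pairings(types1, types2)):
        mapping = {u: v for pairs in choice for u, v in pairs}
        # Step 2: keep an arc when both endpoints are matched and g2 has
        # an arc of the same type between the matched nodes.
        common = {
            (u, v, t)
            for (u, v, t) in edges1
            if u in mapping and v in mapping and (mapping[u], mapping[v], t) in edges2
        }
        if not best_map or len(common) > len(best_edges):
            best_map, best_edges = mapping, common
    return best_map, best_edges


# Toy graphs: vertex label -> type, and (source, target, relation) arcs.
types1 = {"A1": "actor", "U1": "usecase", "U2": "usecase"}
edges1 = {("A1", "U1", "association"), ("U1", "A1", "association"),
          ("A1", "U2", "association"), ("U2", "A1", "association")}
types2 = {"A3": "actor", "U5": "usecase", "U6": "usecase", "B3": "include_usecase"}
edges2 = {("A3", "U6", "association"), ("U6", "A3", "association"),
          ("B3", "U5", "include")}

mapping, common = max_common_subgraph(types1, edges1, types2, edges2)
print(mapping)
print(sorted(common))
```

Every candidate pairing matches the same number of nodes per type (the smaller of the two counts), so the pairings differ only in how many arcs they preserve.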
(3) Similarity calculation
After the maximum common subgraph maxcsg is obtained, the similarity is calculated from the proportion of the nodes and edges of maxcsg in the compared graphs. The similarity calculation formula is given in formula 1:
$$\mathrm{SimUseCase}(g_1, g_2) = \alpha \sum_{v \in VT} \gamma_v \frac{\mathrm{VertexNum}(\mathrm{maxcsg}, v)}{\max(\mathrm{VertexNum}(g_1, v), \mathrm{VertexNum}(g_2, v))} + \beta \sum_{x \in ET} \theta_x \frac{\mathrm{EdgeNum}(\mathrm{maxcsg}, x)}{\max(\mathrm{EdgeNum}(g_1, x), \mathrm{EdgeNum}(g_2, x))} \tag{1}$$
where VT is the set of node types present in the UML use case diagram, namely participants (ordinary participants and generalized participants) and use cases (ordinary, generalized, extend, and include use cases); $\gamma_v$ is the manually defined weight of node type $v$, with $\sum_{v \in VT} \gamma_v = 1$; VertexNum(maxcsg, v) is the number of nodes of type v in the maximum common subgraph maxcsg; max(VertexNum(g1, v), VertexNum(g2, v)) is the larger of the numbers of nodes of type v in g1 and g2; ET is the set of relationship types present in the UML use case diagram, namely association, generalization, include, and extend; $\theta_x$ is the manually defined weight of edge type $x$, with $\sum_{x \in ET} \theta_x = 1$; EdgeNum(maxcsg, x) is the number of edges of type x in maxcsg; max(EdgeNum(g1, x), EdgeNum(g2, x)) is the larger of the numbers of edges of type x in g1 and g2; and $\alpha$ and $\beta$ are set manually, with $\alpha, \beta \in (0, 1)$.
For the present example, we set $\alpha$ and $\beta$ to 0.55 and 0.45, respectively; the $\gamma$ values of the ordinary participant, generalized participant, ordinary use case, generalized use case, extend use case, and include use case to 0.2, 0.2, 0.2, 0, 0.2, and 0.2, respectively; and the $\theta$ values of the association, generalization, include, and extend relationships to 0.2, 0.266, 0.266, and 0.266, respectively. This gives SimUseCase(g1, g2) = 0.7102.
This completes the similarity judgment for g1 and g2. In practical applications, each parameter can be adjusted to suit the requirements.
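The worked example can be reproduced with a short script. It is a sketch under stated assumptions: the similarity is taken to be the weighted sum of per-type node ratios and per-type edge ratios described above; the six γ weights are assumed to be 0.2 for every node type except the generalized use case (0), so that they sum to 1; a type absent from both graphs is skipped; and the per-type counts for maxcsg (including six common associations) are an interpretation of the example rather than values stated in the text. The counts for g1 and g2 follow Table 2, and the function names are hypothetical.

```python
def weighted_ratio(weights, common, first, second):
    """Sum of weight * common_count / max(count_in_first, count_in_second) over types."""
    total = 0.0
    for kind, weight in weights.items():
        largest = max(first.get(kind, 0), second.get(kind, 0))
        if largest:  # assumption: skip a type that is absent from both graphs
            total += weight * common.get(kind, 0) / largest
    return total


def sim_use_case(alpha, beta, gamma, theta, v_common, v1, v2, e_common, e1, e2):
    return (alpha * weighted_ratio(gamma, v_common, v1, v2)
            + beta * weighted_ratio(theta, e_common, e1, e2))


# Node-type weights (assumed reconstruction) and edge-type weights from the example.
gamma = {"participant": 0.2, "generalized participant": 0.2, "use case": 0.2,
         "generalized use case": 0.0, "extend use case": 0.2, "include use case": 0.2}
theta = {"association": 0.2, "generalization": 0.266, "include": 0.266, "extend": 0.266}

# Per-type counts for the bank counter system (g1) and warehouse system (g2), from Table 2.
v1 = {"participant": 2, "generalized participant": 2, "use case": 4,
      "extend use case": 1, "include use case": 2}
v2 = {"participant": 3, "generalized participant": 2, "use case": 4,
      "include use case": 2}
e1 = {"association": 8, "generalization": 2, "include": 2, "extend": 1}
e2 = {"association": 7, "generalization": 2, "include": 2}

# Assumed per-type counts of the maximum common subgraph.
v_common = {"participant": 2, "generalized participant": 2, "use case": 4,
            "include use case": 2}
e_common = {"association": 6, "generalization": 2, "include": 2}

similarity = sim_use_case(0.55, 0.45, gamma, theta, v_common, v1, v2, e_common, e1, e2)
print(round(similarity, 4))  # 0.7102
```

With these inputs the node term is about 0.7333 and the edge term is 0.682, so the result is 0.55 × 0.7333 + 0.45 × 0.682 ≈ 0.7102, which agrees with the value reported in the example.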
It will be understood that the invention has been described in terms of several embodiments, and that various changes and equivalents may be made to these features and embodiments by those skilled in the art without departing from the spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.

Claims (3)

1. A use case diagram similarity judging method based on maximum common subgraph calculation, characterized by comprising the following steps:
step 1: preprocessing the UML use case diagrams to be compared and representing each of them as a directed graph; the UML use case diagram consists of participants, use cases, and the relationships among these elements, wherein a participant is a user, organization, or external system that interacts with the application or system; a use case is a function contained in the system; and the relationships among the elements comprise the association, generalization, include, and extend relationships;
step 1.1: the preprocessing extracts the elements and data of the UML use case diagram by converting the diagram into a structured markup language; the UML use case diagrams are built with an open-source UML modeling tool, so each use case diagram is exported as an XML Metadata Interchange (XMI) file;
step 1.2: parsing the XMI file and representing its elements as a directed graph; let g(V, E) be a graph comprising a vertex set V and a directed edge set E, wherein each vertex in V represents a participant or a use case, and each directed edge in E represents a relationship between two participants, between a participant and a use case, or between two use cases;
step 2: computing the maximum common subgraph of the directed graphs to be compared;
step 3: calculating the similarity using a similarity determination algorithm.
2. The use case diagram similarity judging method based on maximum common subgraph calculation according to claim 1, wherein the theoretical basis of the UML use case diagram similarity judgment based on the maximum common subgraph is: the closer two graphs are in structure, the more parts they have in common, that is, they share a common subgraph, so the maximum common subgraph of the two graphs can be used to compare their degree of structural similarity; the related concepts are defined as follows:
definition one (subgraph): given two graphs $c_1(V_c, E_c)$ and $g(V, E)$, $c_1$ is called a subgraph of $g$, written $c_1 \subseteq g$, if the following conditions are satisfied:
(1) $V_c \subseteq V$;
(2) $E_c \subseteq E$;
definition two (maximum common subgraph): given two graphs $g_1$ and $g_2$, if there is a graph $m$ that satisfies the following conditions:
(1) $m \subseteq g_1$;
(2) $m \subseteq g_2$;
and no graph $m'$ satisfies the following conditions:
(1) $m' \subseteq g_1$;
(2) $m' \subseteq g_2$;
(3) $|m'| > |m|$;
then graph $m$ is the maximum common subgraph of $g_1$ and $g_2$, denoted $\mathrm{maxcsg}(g_1, g_2)$;
step 2.1: denoting the maximum common subgraph of g1 and g2 as maxcsg, traversing and comparing the nodes of g1 and g2, and taking the nodes of the same type that g1 and g2 have in common as the nodes of maxcsg;
step 2.2: traversing the nodes of maxcsg obtained in step 2.1 again, and if two nodes are adjacent in both g1 and g2 and the edges connecting them are of the same type, generating an edge of that type and adding it to maxcsg;
step 2.3: obtaining the maximum common subgraph maxcsg of g1 and g2.
3. The usage graph similarity determination method based on maximum common subgraph calculation according to claim 2, wherein step 3 includes the following specific steps:
after the maximum common subgraph maxcsg is obtained, the similarity is calculated from the proportion of the nodes and edges of maxcsg in the compared graphs, and the similarity calculation formula is as follows:
wherein VT represents the node types (participants and use cases) existing in the UML use case diagram; γ_v represents the weight set for nodes of each type v, which is defined manually, with Σ_{v∈VT} γ_v = 1; VertexNum(maxcsg, v) represents the number of nodes of type v in the maximum common subgraph maxcsg; max(VertexNum(g1, v), VertexNum(g2, v)) represents the larger of the numbers of nodes of that type in g1 and g2; ET represents the relationship types (association, generalization, inclusion, extension, etc.) existing in the UML use case diagram; θ_x represents the weight set for edges of each type x, which is defined manually, with Σ_{x∈ET} θ_x = 1; EdgeNum(maxcsg, x) represents the number of edges of type x in the maximum common subgraph maxcsg; max(EdgeNum(g1, x), EdgeNum(g2, x)) represents the larger of the numbers of edges of that type in g1 and g2; α and β are set manually, with α, β ∈ (0, 1).
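The formula itself did not survive in this text, so the following Python sketch assumes one form that is consistent with the symbol definitions: α times the γ-weighted sum of VertexNum(maxcsg, v) / max(VertexNum(g1, v), VertexNum(g2, v)) over node types, plus β times the θ-weighted sum of the corresponding EdgeNum ratios over edge types. That combination, the function names, the sample weights, and the choice to skip a type that occurs in neither graph are all assumptions, not the claimed formula.

```python
from collections import Counter
from typing import Dict, Iterable


def weighted_ratio(common: Iterable[str], a: Iterable[str],
                   b: Iterable[str], weights: Dict[str, float]) -> float:
    """Weighted share of each type in the common subgraph.

    `common`, `a` and `b` list one type label per node (or per edge).
    """
    n_common, n_a, n_b = Counter(common), Counter(a), Counter(b)
    total = 0.0
    for kind, weight in weights.items():
        denom = max(n_a[kind], n_b[kind])
        if denom:  # assumption: a type absent from both graphs adds 0
            total += weight * n_common[kind] / denom
    return total


def similarity(mcs_nodes, g1_nodes, g2_nodes, mcs_edges, g1_edges, g2_edges,
               gamma: Dict[str, float], theta: Dict[str, float],
               alpha: float, beta: float) -> float:
    node_part = weighted_ratio(mcs_nodes, g1_nodes, g2_nodes, gamma)
    edge_part = weighted_ratio(mcs_edges, g1_edges, g2_edges, theta)
    return alpha * node_part + beta * edge_part


# Illustrative weights; each weight table sums to 1 as the text requires.
gamma = {"actor": 0.5, "usecase": 0.5}
theta = {"association": 0.4, "generalization": 0.2,
         "include": 0.2, "extend": 0.2}

sim = similarity(
    mcs_nodes=["actor", "usecase", "usecase"],
    g1_nodes=["actor", "usecase", "usecase", "usecase"],
    g2_nodes=["actor", "usecase", "usecase", "usecase"],
    mcs_edges=["association"],
    g1_edges=["association", "association", "include"],
    g2_edges=["association", "association", "include"],
    gamma=gamma, theta=theta, alpha=0.5, beta=0.5,
)
```

Using the larger count of g1 and g2 in each denominator keeps every ratio at most 1, so under this assumed form the score stays within [0, α + β].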
CN202110886966.XA 2021-08-03 2021-08-03 Case diagram similarity judging method based on maximum public subgraph calculation Active CN113743467B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110886966.XA CN113743467B (en) 2021-08-03 2021-08-03 Case diagram similarity judging method based on maximum public subgraph calculation

Publications (2)

Publication Number Publication Date
CN113743467A CN113743467A (en) 2021-12-03
CN113743467B CN113743467B (en) 2024-01-12

Family

ID=78729978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110886966.XA Active CN113743467B (en) 2021-08-03 2021-08-03 Case diagram similarity judging method based on maximum public subgraph calculation

Country Status (1)

Country Link
CN (1) CN113743467B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114201598B (en) * 2022-02-18 2022-05-17 药渡经纬信息科技(北京)有限公司 Text recommendation method and text recommendation device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008146300A (en) * 2006-12-08 2008-06-26 Nec Corp Information processor, information processing method and program
CN107133257A (en) * 2017-03-21 2017-09-05 华南师范大学 A kind of similar entities recognition methods and system based on center connected subgraph
CN108363563A (en) * 2018-02-05 2018-08-03 海南大学 Uml model consistency detecting method based on data collection of illustrative plates, Information Atlas and knowledge mapping framework

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9158503B2 (en) * 2013-10-08 2015-10-13 King Fahd University Of Petroleum And Minerals UML model integration and refactoring method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
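Research on an ontology mapping method based on the maximum common subgraph; Guo Zhuwei; Liu Shengquan; Liu Yan; Zhao Meiling; Fu Xianzhe; Computer Engineering (05); full text *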

Also Published As

Publication number Publication date
CN113743467A (en) 2021-12-03

Similar Documents

Publication Publication Date Title
CN107679221B (en) Time-space data acquisition and service combination scheme generation method for disaster reduction task
CN106202536B (en) Global metadata standardized platform system and its construction method based on XBRL
US6874141B1 (en) Method of compiling schema mapping
CN106067094A (en) A kind of dynamic assessment method and system
WO2005055001A2 (en) Method for assisting in automated conversion of data and associated metadata
CN113743467B (en) Case diagram similarity judging method based on maximum public subgraph calculation
CN113326028A (en) Micro-service decomposition method based on domain-driven design and service panoramic event storm
CN110502667A (en) The parsing of ODX document and generation technique based on DOM frame
CN104750499A (en) Constraint solving and description logic based web service combination method
CN116860856A (en) Financial data processing method and device, computer equipment and storage medium
Zhu et al. Tool support for design pattern recognition at model level
Jain et al. Towards automating the development of federated distributed simulations for modeling sustainable urban infrastructures
Giannoulis et al. Model-driven strategic awareness: From a unified business strategy meta-model (UBSMM) to enterprise architecture
Huang et al. Research on precision marketing of real estate market based on data mining
CN113642291B (en) Method, system, storage medium and terminal for constructing logical structure tree reported by listed companies
CN115934693A (en) Dynamic calculation method for regional real population
CN113344604B (en) User subdivision method based on user behavior data and stream calculation
Mousavian Anaraki et al. Providing a hybrid clustering method as an auxiliary system in automatic labeling to divide employee into different levels of productivity and their retention
JP2003337697A (en) Development system for business system, and development method for business system
CN114238263A (en) Database modeling system based on data dictionary
Albassam A black-box computational business rules extraction approach through test-driven development
Mêda et al. Towards Software Integration in the Construction Industry–ERP and ICIS Case Study
Ricaurte et al. Representing interoperability between software systems by using pre-conceptual schemas
CN109300023A (en) A kind of method and system that increment tax on land value big data is extracted and applied
Zhao et al. A cloud platform architecture recovery metric method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant