CN104598591B - A model element matching method for typed attributed graph models - Google Patents

Info

Publication number
CN104598591B
Authority
CN
China
Prior art keywords
node
model
characteristic vector
segmentation
dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510028158.4A
Other languages
Chinese (zh)
Other versions
CN104598591A (en)
Inventor
覃征
张任伟
李胜男
杨晓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201510028158.4A
Publication of CN104598591A
Application granted
Publication of CN104598591B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9027 Trees

Abstract

The invention discloses a model element matching method for typed attributed graph models. The method comprises: building a weighted multi-dimensional search tree for the model to be analyzed; performing a range search on the weighted multi-dimensional search tree for the model element to be matched, thereby obtaining the set of nodes similar to that model element; and computing the similarity between each element node in the similar node set and the model element to be matched, thereby determining the element node with the highest similarity to the model element to be matched. The method is highly general and can also be applied when several people edit a model collaboratively. In addition, matching model elements with this method greatly reduces the amount of computation, which substantially shortens the time taken by the whole matching process and improves the speed and efficiency of model element matching.

Description

A model element matching method for typed attributed graph models
Technical field
The present invention relates to the field of information technology, and in particular to a model element matching method for typed attributed graph models.
Background
As model-driven software development methods and their supporting technology continue to develop, developers' attention is gradually shifting from writing code to designing models, which places higher demands on model maintenance and management.
On the one hand, a model version management system has to cope with ever-growing model sizes and must be able to match large models containing tens of thousands of model elements. On the other hand, to shorten the development cycle, several developers often need to edit the same model collaboratively, which requires the model version management system to compare multiple versions of the same model quickly. How to improve the efficiency of model element matching between large models is therefore a research focus in the field of model version control.
Current model element matching methods have the following problems. First, the models to be matched should not be restricted to one particular kind, that is, the algorithm should be able to match general models, yet the algorithms of current model matching methods are not very general. Second, most mainstream matching algorithms are based on static identifiers: each model element is assigned a unique identifier and elements are compared directly by identifier. Because identifiers are not deterministic when a model is developed concurrently, this approach is unsuitable when several people edit a model collaboratively. Third, common model version management tools have to traverse all model elements, so the computational complexity of their matching algorithms is high. When many model elements have to be matched, matching takes a long time and reduces the efficiency of subsequent model conflict detection and model merging.
To address these problems of current model matching methods, a new model element matching method is needed that produces better model element matching results.
Summary of the invention
To address the problems of current model matching methods, the invention provides a model element matching method for typed attributed graph models. The method comprises the following steps:
Step 1: build a weighted multi-dimensional search tree for the model to be analyzed. The weighted multi-dimensional search tree comprises directory nodes and element nodes that stand in a hierarchical parent-child relation to one another. An element node describes the corresponding model elements of the model to be analyzed. A directory node contains multiple subtrees, and each directory node or element node is built in a subtree of its parent directory node.
Step 2: perform a range search on the weighted multi-dimensional search tree for the model element to be matched, finding all element nodes on the tree that are similar to the model element to be matched, and from them construct the similar node set of the model element to be matched.
Step 3: compute the similarity between each element node in the similar node set and the model element to be matched, and thereby determine the element node with the highest similarity to the model element to be matched.
In one embodiment, step 1 comprises the following steps:
A feature vector group construction step: construct feature vectors from the features of the model to be analyzed, thereby building the first feature vector group of the model to be analyzed;
A weighted multi-dimensional search tree construction step: build the weighted multi-dimensional search tree from the first feature vector group.
In one embodiment, the feature vector group construction step comprises the following steps:
Analyze the model to be analyzed to obtain its feature vectors, where each model element of the model to be analyzed corresponds to one feature vector and each descriptive feature of a model element corresponds to one dimension of the feature vector;
Assign a weight to each dimension of the feature vectors based on the semantics of the descriptive features.
In one embodiment, the weighted multi-dimensional search tree construction step comprises the following steps:
A split target set construction step: create a split target set and add the first feature vector group to it;
A split judgment step: pick any feature vector group from the split target set, judge whether it needs to be split, and remove the judged feature vector group from the split target set, where a feature vector group needs to be split when the number of feature vectors in it exceeds a particular value predefined by the user;
A split step: when the feature vector group needs to be split, perform one split operation on it according to a specific split strategy to obtain multiple sub-feature-vector groups, and add each sub-feature-vector group to the split target set;
A directory node construction step: after the current split step has finished, build the corresponding directory node from the sub-feature-vector groups it produced, where each sub-feature-vector group corresponds to one subtree of the directory node;
An element node construction step: when the target feature vector group does not need to be split, construct an element node from the target feature vector group;
A split target traversal step: perform the split judgment step for every feature vector group in the split target set and, depending on its result, perform either the split step and the directory node construction step or the element node construction step, until the split target set is empty.
In one embodiment, the split strategy comprises a split dimension and a split value. In the split step, the value of each feature vector in the feature vector group on the split dimension is compared with the split value, and the feature vector group is divided into two sub-feature-vector groups according to the comparison result.
In one embodiment, for each dimension, the product of the dimension's weight and the variance of all feature vectors in the feature vector group on that dimension is computed, and the dimension with the largest product is chosen as the split dimension.
In one embodiment, the median of the values of all feature vectors in the feature vector group on the split dimension is chosen as the split value.
In one embodiment, in the split step, the two sub-feature-vector groups are defined as the first sub-feature-vector group and the second sub-feature-vector group, where:
Feature vectors in the target feature vector group whose value is less than the split value are assigned to the first sub-feature-vector group;
Feature vectors in the target feature vector group whose value is greater than the split value are assigned to the second sub-feature-vector group;
Feature vectors in the target feature vector group whose value is equal to the split value are assigned according to whether the first and second sub-feature-vector groups are balanced.
In one embodiment, the directory node records the split dimension of the current directory node, the split value for that split dimension, the weight of the split dimension, the minimum value of the feature vectors in the first sub-feature-vector group of the current directory node, and the maximum value of the feature vectors in the second sub-feature-vector group.
In one embodiment, step 2 comprises the following steps:
A search node set construction step: create a search node set and add the first (top-level) directory node of the weighted multi-dimensional search tree to it;
A node type judgment step: pick any node from the search node set, judge whether it is an element node, and remove the judged node from the search node set;
A similar node acquisition step: when the node is an element node, add it to the similar node set;
A search range acquisition step: when the node is not an element node, obtain the search range of the model element to be matched with respect to that node;
A search node set update step: find the subtrees of the node that fall within the search range, and add the nodes on those subtrees to the search node set;
A search node traversal step: perform the node type judgment step for every node in the search node set and, depending on its result, perform either the search range acquisition step and the search node set update step or the similar node acquisition step, until the search node set is empty.
In one embodiment, in the search range acquisition step, the search range of the model element to be matched on the split dimension is defined based on the weight of the split dimension corresponding to the search node.
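The range search steps above can be sketched as a worklist loop. This is only an illustration under stated assumptions: the class names, the `tolerance` parameter and the rule `radius = tolerance / weight` (a heavier dimension gets a narrower range) are hypothetical, since the text says only that the search range is defined from the weight of the split dimension. The sketch also uses just the split value of each directory node, not the recorded minimum and maximum.

```python
from dataclasses import dataclass

@dataclass
class ElementNode:
    vectors: dict          # element name -> feature vector

@dataclass
class DirectoryNode:
    split_dim: int
    split_value: float
    weight: float
    left: object           # subtree holding values <= split_value
    right: object          # subtree holding values >= split_value

def range_search(root, query, tolerance=1.0):
    """Collect the element nodes whose region may hold vectors similar to query."""
    similar = []
    pending = [root]                        # search node set
    while pending:                          # search node traversal step
        node = pending.pop()                # node type judgment step
        if isinstance(node, ElementNode):
            similar.append(node)            # similar node acquisition step
            continue
        # Search range acquisition step (assumed rule, not from the source).
        radius = tolerance / node.weight
        low = query[node.split_dim] - radius
        high = query[node.split_dim] + radius
        # Search node set update step: keep only subtrees the range overlaps.
        if low <= node.split_value:
            pending.append(node.left)
        if high >= node.split_value:
            pending.append(node.right)
    return similar

# Hand-built two-leaf tree, split on dimension 0 (model name length) at 8.
tree = DirectoryNode(
    0, 8, 0.3,
    left=ElementNode({"city": (4, 4, 2, 1, 4), "country": (7, 10, 3, 1, 7),
                      "district": (8, 3, 1, 1, 3)}),
    right=ElementNode({"continent": (9, 3, 1, 1, 3),
                       "countrylanguage": (15, 4, 1, 1, 4)}),
)
near_city = range_search(tree, (4, 4, 2, 1, 4), tolerance=0.6)
near_split = range_search(tree, (9, 3, 1, 1, 3), tolerance=0.6)
```

A query far from the split value visits one subtree only, while a query close to it visits both, which is where the reduction in similarity computations comes from.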
Compared with the prior art, the invention has the following advantages:
The model element matching method of the invention is highly general;
The method largely avoids the influence that the uncertainty of static identifier assignment during concurrent model development has on matching, so it can be applied when several people edit a model collaboratively;
Matching model elements with the method greatly reduces the amount of computation, which substantially shortens the time taken by the whole matching process and improves the speed and efficiency of model element matching.
Further features and advantages of the invention are set out in the description that follows. Some of them will become apparent from the description or can be understood by practicing the invention. The objects and some of the advantages of the invention can be realized or obtained through the steps specifically pointed out in the description, the claims and the drawings.
Brief description of the drawings
The drawings provide a further understanding of the invention and form part of the description. Together with the embodiments they serve to explain the invention and do not limit it. In the drawings:
Fig. 1 is the execution flow chart of one embodiment of the invention;
Fig. 2 is a schematic diagram of the weighted multi-dimensional search tree structure of one embodiment of the invention;
Fig. 3 is a structural diagram of the source model of one embodiment of the invention;
Fig. 4 is a structural diagram of the modified model of one embodiment of the invention.
Detailed description
Embodiments of the invention are described in detail below with reference to the drawings and examples, so that those practicing the invention can fully understand how it applies technical means to solve the technical problem and achieve the technical effect, and can implement it accordingly. Provided they do not conflict, the embodiments of the invention and the features of each embodiment may be combined with one another, and the resulting technical solutions all fall within the scope of protection of the invention.
The invention provides a model element matching method for typed attributed graph models. To make matching between two models efficient when there are many model elements, the method first screens out the model elements of the model to be analyzed that are similar to the model element to be matched, then computes the similarity of each of them, and finally obtains the model element of the model to be analyzed that best matches the model element to be matched.
Preliminary screening greatly reduces the number of model elements for which similarity has to be computed, and therefore greatly reduces the amount of computation. Moreover, because range search on a multi-dimensional search tree is fast, the overall process takes much less time and matching efficiency improves considerably.
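The second stage, scoring only the screened candidates, might look like the sketch below. The text does not name a similarity measure, so the weighted Euclidean distance and the `1 / (1 + distance)` similarity are assumptions for illustration, as are the function names, the query vector and the candidate set.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance (assumed metric; the source names none)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def similarity(a, b, weights):
    """Hypothetical similarity in (0, 1]; 1.0 means identical vectors."""
    return 1.0 / (1.0 + weighted_distance(a, b, weights))

def best_match(query, candidates, weights):
    """Return the name of the candidate most similar to the query vector."""
    return max(candidates,
               key=lambda name: similarity(query, candidates[name], weights))

# Illustrative values: one weight per feature dimension, and a small
# candidate set as it might come out of the preliminary screening.
weights = (0.3, 0.25, 0.25, 0.1, 0.1)
candidates = {"continent": (9, 3, 1, 1, 3), "district": (8, 3, 1, 1, 3)}
query = (8, 3, 1, 1, 4)
match = best_match(query, candidates, weights)
```

Only the candidates returned by the range search are scored, so the cost of this stage depends on the size of the similar node set rather than on the size of the model.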
The invention is further described below with reference to the drawings and specific embodiments. The steps shown in the flow chart can be executed in a computer system, for example as a set of computer-executable instructions. Although the flow chart shows a logical order of the steps, in some cases the steps shown or described may be performed in a different order.
To screen out the model elements of the model to be analyzed that are similar to the model element to be matched, the method first performs step S100, as shown in Fig. 1, and builds the weighted multi-dimensional search tree of the model to be analyzed. To improve generality, the method takes typed attributed graph models as its objects. When the model element to be matched or the model to be analyzed does not belong to a typed attributed graph model, a model transformation tool is used to convert it into one.
Typed attributed graph models are a commonly used kind of model today, and most current kinds of model can be converted into typed attributed graph models with model transformation tools. The method of the invention is therefore more general than the prior art.
In this embodiment, the typed attributed graph model is constructed with the Eclipse Modeling Framework (EMF). EMF provides a framework for constructing models that supports designing a model directly and generating the corresponding code. In EMF, each model element can have multiple attributes, and model elements are connected by association relations (inheritance, aggregation, and so on).
In this embodiment, the weighted multi-dimensional search tree built from the EMF model consists of multiple nodes that stand in a hierarchical parent-child relation to one another. Nodes fall into two classes, element nodes and directory nodes. Directory nodes describe the logical structure of the weighted multi-dimensional search tree, and element nodes represent the model elements of the model. There is a hierarchical parent-child relation between directory nodes and between directory nodes and element nodes: a directory node contains multiple subtrees, and each directory node or element node is built in a subtree of its parent directory node.
The parent-child relations between directory nodes and between directory nodes and element nodes are determined by the directory range recorded by each directory node and the concrete model element content recorded by each element node. As shown in Fig. 2, directory node 201 is the top-level node of the weighted multi-dimensional search tree. Directory nodes 202 and 203 belong to directory node 201 and are built in its subtrees. Element nodes 211 and 212 belong to directory node 202 and are built in its subtrees. Directory node 204 and element node 215 belong to directory node 203 and are built in its subtrees. Element nodes 213 and 214 belong to directory node 204 and are built in its subtrees. Element nodes 211, 212, 213, 214 and 215 are the bottom-level nodes of the weighted multi-dimensional search tree.
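The hierarchy of Fig. 2 can be written down as a small nested structure. The sketch below is illustrative only: the `Node` class is hypothetical, it models just the parent-child relation (not the split data a directory node records), and the left-to-right order of the children is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: int
    children: list = field(default_factory=list)   # empty for element nodes

    @property
    def is_element(self):
        return not self.children

def element_labels(node):
    """Return the labels of all element nodes below node, left to right."""
    if node.is_element:
        return [node.label]
    return [label for child in node.children for label in element_labels(child)]

# Directory nodes 201-204 and element nodes 211-215 as described for Fig. 2.
fig2 = Node(201, [
    Node(202, [Node(211), Node(212)]),
    Node(203, [Node(204, [Node(213), Node(214)]), Node(215)]),
])
leaves = element_labels(fig2)
```

Walking the structure from directory node 201 reaches exactly the five element nodes at the bottom of the tree.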
The construction of the weighted multi-dimensional search tree in this embodiment is described in detail next. To build the tree, step S101, the feature vector group construction step, is performed first. In step S101, feature vectors are constructed from the features of the model to be analyzed, thereby building its feature vector group. To construct the feature vector group, the model to be analyzed is first analyzed to obtain its feature vectors. In this embodiment, each model element of the model to be analyzed corresponds to one feature vector, and each descriptive feature of a model element corresponds to one dimension of the feature vector.
Take a concrete model as an example. The model shown in Fig. 3 contains five model elements: model element 301 (continent), model element 302 (city), model element 303 (district), model element 304 (country) and model element 305 (countrylanguage). The descriptive features of a model element can be expressed as model name length, attribute count, reference count, primary key attribute count and non-null attribute count. Arranging the values of a model element's descriptive features into an array gives the feature vector of that model element. For each model element:
The descriptive features of model element 301 (continent) are model name length 9, attribute count 3, reference count 1, primary key attribute count 1 and non-null attribute count 3, which can be expressed as the feature vector
continent (9, 3, 1, 1, 3);
The descriptive features of model element 302 (city) are model name length 4, attribute count 4, reference count 2, primary key attribute count 1 and non-null attribute count 4, which can be expressed as the feature vector
city (4, 4, 2, 1, 4);
The descriptive features of model element 303 (district) are model name length 8, attribute count 3, reference count 1, primary key attribute count 1 and non-null attribute count 3, which can be expressed as the feature vector
district (8, 3, 1, 1, 3);
The descriptive features of model element 304 (country) are model name length 7, attribute count 10, reference count 3, primary key attribute count 1 and non-null attribute count 7, which can be expressed as the feature vector
country (7, 10, 3, 1, 7);
The descriptive features of model element 305 (countrylanguage) are model name length 15, attribute count 4, reference count 1, primary key attribute count 1 and non-null attribute count 4, which can be expressed as the feature vector
countrylanguage (15, 4, 1, 1, 4);
The model shown in Fig. 3 can then be expressed as the feature vector group:
(continent (9, 3, 1, 1, 3),
city (4, 4, 2, 1, 4),
country (7, 10, 3, 1, 7),
countrylanguage (15, 4, 1, 1, 4),
district (8, 3, 1, 1, 3)) (1)
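As a minimal sketch of how feature vector group (1) could be derived, the snippet below uses a hypothetical `ModelElement` record whose field names are assumptions. Only the name length is computed; the other counts are entered as given for Fig. 3, because the EMF calls that would extract them from a real model are not part of the text.

```python
from dataclasses import dataclass

@dataclass
class ModelElement:
    name: str
    attributes: int      # attribute count
    references: int      # reference count
    primary_keys: int    # primary key attribute count
    non_null: int        # non-null attribute count

    def feature_vector(self):
        """One dimension per descriptive feature, in the order used in the text."""
        return (len(self.name), self.attributes, self.references,
                self.primary_keys, self.non_null)

# The five model elements of Fig. 3, with the counts stated in the description.
fig3 = [
    ModelElement("continent", 3, 1, 1, 3),
    ModelElement("city", 4, 2, 1, 4),
    ModelElement("country", 10, 3, 1, 7),
    ModelElement("countrylanguage", 4, 1, 1, 4),
    ModelElement("district", 3, 1, 1, 3),
]
group = {element.name: element.feature_vector() for element in fig3}
```

Keeping the dimension order fixed matters, because the per-dimension weights and the split dimension later refer to positions in this tuple.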
When analyzing the similarity of model elements during matching, the different influence of each descriptive feature on model element similarity has to be taken into account. Different features of a model affect its overall similarity to different degrees; for example, key descriptive features influence model element similarity more than non-key ones. In this embodiment, so that the subsequent screening of similar model elements preferentially selects model elements that are similar in the important descriptive features, different weights are assigned to the descriptive features according to their semantics, that is, each dimension of the feature vector group is given a corresponding weight. For the model shown in Fig. 3, the weights are listed in Table 1:
Descriptive feature             Weight
Model name length               0.3
Attribute count                 0.25
Reference count                 0.25
Primary key attribute count     0.1
Non-null attribute count        0.1
Table 1
Once the feature vector group has been constructed, the weighted multi-dimensional search tree can be built from it. Because each model contains multiple model elements, in this embodiment the user can set, according to actual needs, how many model elements one element node represents. Before building the weighted multi-dimensional search tree, the user presets a particular value according to actual needs; this particular value is the number of model elements that one element node corresponds to.
Next, the feature vector group can be split so that the feature vectors representing the model elements are assigned to element nodes. In this embodiment, to make splitting the feature vector group easier, step S108, the split target set creation step, is performed first: a split target set is created and the feature vector group of the model to be analyzed is added to it.
Step S102, the split judgment step, can then be performed: any feature vector group is picked from the split target set, it is judged whether the group needs to be split, and the judged group is removed from the split target set. Clearly, the feature vector group picked first is the one corresponding to the model to be analyzed.
In step S102, the target feature vector group needs to be split when the number of feature vectors in it exceeds the particular value predefined by the user. Since the particular value is normally set lower than the number of feature vectors in the feature vector group of the model to be analyzed, step S102 can be skipped at the very start and that group can be judged directly to need splitting.
When the feature vector group does not need to be split, step S104, the element node construction step, is performed next and an element node is constructed from the feature vector group. In step S104, all feature vectors in the feature vector group are added to an element node, which constructs the element node.
When the target feature vector group needs to be split, step S103, the split step, is performed next to split the feature vector group. In step S103, one split operation is performed on the feature vector group according to a specific split strategy to obtain multiple sub-feature-vector groups. Note that, depending on user requirements, a single split may divide a feature vector group into different numbers of sub-feature-vector groups. In this embodiment, one split operation divides a feature vector group into two sub-feature-vector groups.
In this embodiment, the split strategy defines two parameters, the split dimension and the split value. The split dimension is one particular dimension of the feature vector group, that is, it corresponds to one particular descriptive feature of the model to be analyzed. The split value is a possible value of the feature vectors in the feature vector group on the split dimension.
In this embodiment, the value of each feature vector in the feature vector group on the split dimension is compared with the split value, and the feature vector group is divided into two sub-feature-vector groups according to the comparison result.
In this embodiment, different split dimensions and split values are used for different feature vector groups. The dimension of a Euclidean space with the largest variance is the direction in which the data are most scattered. Splitting on such a dimension gives the highest discrimination and helps improve range search efficiency. In addition, the dimensions of the weighted multi-dimensional search tree differ in influence, so this embodiment combines the two factors, variance and weight, to determine the split dimension. Each time a split dimension is chosen, the dimension whose data are both heavily weighted and most scattered is selected as far as possible: $\mathrm{dim} = \arg\max_i (w_i d_i)$, where $w_i$ is the weight of dimension $i$ and $d_i$ is the variance of all points on dimension $i$.
For each dimension, the product of the dimension's weight and the variance of all feature vectors in the feature vector group on that dimension is computed, and the dimension with the largest product is chosen as the split dimension. The median of the values of all feature vectors in the target feature vector group on the split dimension is then chosen as the split value. Specifically, the values of all feature vectors in the group on the split dimension are sorted; when the group contains an odd number of feature vectors, the middle value of the sorted sequence is chosen as the split value, and when it contains an even number, the mean of the two middle values is chosen.
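The selection rule can be tried on the Fig. 3 feature vector group with the Table 1 weights. The sketch below is an illustration, not the patented implementation: the function name `choose_split` is hypothetical, and population variance is assumed because the text does not say whether population or sample variance is meant (for this data both pick the same dimension).

```python
from statistics import median, pvariance

def choose_split(vectors, weights):
    """Return (split_dim, split_value) using weight x variance and the median."""
    dims = range(len(weights))
    # Score each dimension by weight times (population) variance.
    scores = [weights[i] * pvariance([v[i] for v in vectors]) for i in dims]
    split_dim = max(dims, key=lambda i: scores[i])
    # statistics.median averages the two middle values for even-sized input.
    split_value = median(v[split_dim] for v in vectors)
    return split_dim, split_value

vectors = [
    (9, 3, 1, 1, 3),     # continent
    (4, 4, 2, 1, 4),     # city
    (7, 10, 3, 1, 7),    # country
    (15, 4, 1, 1, 4),    # countrylanguage
    (8, 3, 1, 1, 3),     # district
]
weights = (0.3, 0.25, 0.25, 0.1, 0.1)   # Table 1
split_dim, split_value = choose_split(vectors, weights)
```

For this data the model name length (dimension 0) has variance 13.04 and weighted score 3.912, the largest of the five, and the median of the name lengths 4, 7, 8, 9, 15 is 8.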
The feature vector group is split as follows. First, feature vectors in the target feature vector group whose value is less than the split value are assigned to one sub-feature-vector group. Then, feature vectors whose value is greater than the split value are assigned to the other sub-feature-vector group. Finally, feature vectors whose value is equal to the split value are assigned according to whether the two sub-feature-vector groups are balanced.
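A possible reading of this three-way assignment is sketched below. The tie rule is an assumption: the text says only that vectors equal to the split value are placed according to balance, so here each such vector goes to whichever group is currently smaller, with the first group winning when both have the same size. The function name `split_group` is hypothetical.

```python
def split_group(vectors, split_dim, split_value):
    """Split vectors into (first, second) around split_value on split_dim."""
    first = [v for v in vectors if v[split_dim] < split_value]
    second = [v for v in vectors if v[split_dim] > split_value]
    for v in vectors:
        if v[split_dim] == split_value:
            # Assumed balancing rule: add to the currently smaller group.
            (first if len(first) <= len(second) else second).append(v)
    return first, second

vectors = [
    (9, 3, 1, 1, 3),     # continent
    (4, 4, 2, 1, 4),     # city
    (7, 10, 3, 1, 7),    # country
    (15, 4, 1, 1, 4),    # countrylanguage
    (8, 3, 1, 1, 3),     # district
]
# Split on model name length (dimension 0) at the median value 8.
first, second = split_group(vectors, 0, 8)
```

With the Fig. 3 data, city and country fall below the split value, continent and countrylanguage fall above it, and district (exactly 8) joins the first group under the assumed rule.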
Once the current splitting step has finished, the next step, S106 (directory node construction step), can be performed: a corresponding directory node is built from the sub-feature-vector groups obtained by the current splitting step. In step S106, each sub-feature-vector group corresponds to one subtree of the directory node.
Except for the directory node of the highest level, every directory node is built in the corresponding subtree of the directory node to which it is subordinate. That is, the directory node currently being constructed is built in the subtree that corresponds to the object of its splitting operation (the current target feature vector group).
In the present embodiment, one splitting step (S103) comprises one splitting operation, and one splitting operation splits only one target feature vector group. One directory node is built for each splitting operation; the sub-feature-vector groups obtained by a splitting operation are subordinate to the directory node of that operation and correspond one-to-one to the subtrees of that directory node.
A directory node records its split dimension, the split value on that dimension, the weight of that dimension, and the minimum and maximum values of the feature vectors in its two sub-feature-vector groups.
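The contents of a directory node can be pictured as a small record type. The sketch below is illustrative only: the class and field names are hypothetical, and the sample values are those that the worked example of this embodiment gives for its first directory node.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

@dataclass
class ElementNode:
    """Leaf of the tree: describes one model element (hypothetical layout)."""
    name: str
    vector: Tuple[float, ...]

@dataclass
class DirectoryNode:
    """Inner node of the tree; field names are assumptions of this sketch."""
    split_dim: int      # index of the split dimension (0-based here)
    split_value: float  # split value on that dimension
    weight: float       # weight of the split dimension
    min_value: float    # minimum recorded for the sub-feature-vector groups
    max_value: float    # maximum recorded for the sub-feature-vector groups
    subtrees: List[Union["DirectoryNode", ElementNode]] = field(default_factory=list)

# First dimension (model name length), split value 8, weight 0.3, range (4, 15)
node_201 = DirectoryNode(split_dim=0, split_value=8, weight=0.3,
                         min_value=4, max_value=15)
print(node_201)
```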
Every split generates multiple sub-feature-vector groups, and for each of them the split judgment step (S102) must be performed, followed, according to its result, by either the splitting step (S103) and the directory node construction step (S106) or the element node construction step (S104). To guarantee that the weighted multi-dimensional search tree is complete (that no sub-feature-vector group is missed), in the present embodiment step S105, the split target update step, is carried out after each splitting step S103: each sub-feature-vector group obtained by the split is added to the split target set.
The split target traversal step is then performed: for every feature vector group in the split target set, the split judgment step (S102) is performed and, according to its result, either the splitting step (S103) and the directory node construction step (S106) or the element node construction step (S104).
After a split has been completed and the corresponding directory node built, new sub-feature-vector groups have been added to the split target set, so the split judgment step S102 can be performed directly.
Once an element node has been built, the weighted multi-dimensional search tree has reached its end on that branch, and the branches that have not yet reached their end must be processed. It must first be determined whether any such branch exists; that is, step S107, the split target set judgment step, is performed to judge whether the split target set is empty.
When the split target set is not empty, that is, when sub-feature-vector groups still need to be processed, an arbitrary feature vector group is selected from the split target set and step S102 is performed on it; according to the result of step S102, either step S104 or steps S103, S105 and S106 are performed. This process is repeated until the split target set is empty, at which point construction of the weighted multi-dimensional search tree is complete.
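The construction loop driven by the split target set can be sketched as a simple worklist algorithm. This is a hedged illustration under several assumptions: all function and class names are hypothetical, population variance is used, ties are balanced toward the smaller side, a group is split whenever it holds more than `max_leaf_size` vectors, and the third to fifth weights are assumed values (only 0.3 and 0.25 are stated explicitly in this embodiment).

```python
import statistics
from dataclasses import dataclass, field

@dataclass
class ElementNode:
    vectors: list            # the (name, values) pairs held by this leaf

@dataclass
class DirectoryNode:
    split_dim: int
    split_value: float
    subtrees: list = field(default_factory=list)

def choose_split(group, weights):
    # weight * variance per dimension; the largest product wins
    scores = [w * statistics.pvariance([v[1][k] for v in group])
              for k, w in enumerate(weights)]
    dim = scores.index(max(scores))
    return dim, statistics.median(v[1][dim] for v in group)

def partition(group, dim, value):
    low = [v for v in group if v[1][dim] < value]
    high = [v for v in group if v[1][dim] > value]
    for v in group:                      # assumed tie rule: keep sides balanced
        if v[1][dim] == value:
            (low if len(low) <= len(high) else high).append(v)
    return low, high

def build_tree(first_group, weights, max_leaf_size=1):
    root_slot = [None]                        # holder for the top-level node
    targets = [(first_group, root_slot, 0)]   # split target set
    while targets:                            # S107: stop when the set is empty
        group, slots, i = targets.pop()       # S102: take any group and remove it
        if len(group) <= max_leaf_size:
            slots[i] = ElementNode(group)     # S104: build an element node
            continue
        dim, value = choose_split(group, weights)        # S103: split
        low, high = partition(group, dim, value)
        node = DirectoryNode(dim, value, [None, None])   # S106: directory node
        slots[i] = node
        targets.append((low, node.subtrees, 0))          # S105: update the set
        targets.append((high, node.subtrees, 1))
    return root_slot[0]

VECTORS = [("city", (4, 4, 2, 1, 4)), ("country", (7, 10, 3, 1, 7)),
           ("district", (8, 3, 1, 1, 3)), ("continent", (9, 3, 1, 1, 3)),
           ("countrylanguage", (15, 4, 1, 1, 4))]
WEIGHTS = (0.3, 0.25, 0.25, 0.1, 0.1)   # last three values are assumptions
tree = build_tree(VECTORS, WEIGHTS)
print(tree.split_dim, tree.split_value)
```

Each pending group carries a reference to the subtree slot it will fill, which is one way to make sure a node is always built inside the subtree of the directory node it is subordinate to.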
Taking the model shown in Fig. 3 as an example, one element node in the weighted multi-dimensional search tree of the model is defined to correspond to one model element. The feature vector group of the model is the feature vector group shown in formula (1).
The feature vector group of the model is first split in order to construct a directory node.
First, the split dimension is selected. The product of the variance and the weight of each dimension of the feature vector group is calculated; the results for the five dimensions are 3.9, 1.74, 0.16, 0 and 0.216 respectively. The first dimension (model name length) is therefore selected as the split dimension;
Next, the split value is selected: the values along the first dimension are sorted, giving (4, 7, 8, 9, 15), and their median (here 8) is taken as the split value;
The feature vector group is then split into the following two sub-feature-vector groups (vector group a and vector group b).
Vector group a: (city (4,4,2,1,4), country (7,10,3,1,7), district (8,3,1,1,3))
Vector group b: (continent (9,3,1,1,3), countrylanguage (15,4,1,1,4))
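The numbers in this selection can be checked with a few lines of Python. The sketch assumes population variance (dividing by the number of vectors), which is the variant that reproduces the products stated above. The weight of the fourth dimension is not recoverable from the example, because that dimension has zero variance, so 0.1 is an assumed placeholder; the third and fifth weights are inferred from the stated products.

```python
vectors = {
    "city": (4, 4, 2, 1, 4),
    "country": (7, 10, 3, 1, 7),
    "district": (8, 3, 1, 1, 3),
    "continent": (9, 3, 1, 1, 3),
    "countrylanguage": (15, 4, 1, 1, 4),
}
# 0.3 and 0.25 are stated in the text; 0.25 and the final 0.1 are inferred;
# the fourth weight (0.1) is an assumption with no effect on this example.
weights = (0.3, 0.25, 0.25, 0.1, 0.1)

def population_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

columns = list(zip(*vectors.values()))          # one tuple of values per dimension
products = [w * population_variance(col) for w, col in zip(weights, columns)]
split_dim = products.index(max(products))       # 0-based index of the split dimension

ordered = sorted(columns[split_dim])
mid = len(ordered) // 2
# odd count: middle value; even count: average of the two middle values
split_value = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2

print([round(p, 3) for p in products], split_dim, split_value)
```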
As shown in Fig. 2, the corresponding directory node 201 and its subtrees 221 and 222 are built; vector group b corresponds to subtree 221 and vector group a corresponds to subtree 222. Directory node 201 records the split dimension (the first dimension, model name length), the split value (8), the weight of the split dimension (0.3), and the minimum and maximum values of the feature vectors in the two sub-feature-vector groups (4, 15).
Next, vector group b is split and directory node 202 is constructed in subtree 221. Vector group b is divided into vector group c (continent (9,3,1,1,3)), corresponding to subtree 223, and vector group d (countrylanguage (15,4,1,1,4)), corresponding to subtree 224. Directory node 202 records the split dimension (the second dimension, attribute number), the split value (3.5), the weight of the split dimension (0.25), and the minimum and maximum values of the feature vectors in the two sub-feature-vector groups (3, 4).
From vector group c (continent (9,3,1,1,3)) and vector group d (countrylanguage (15,4,1,1,4)), element node 211 is constructed in subtree 223 and element node 212 is constructed in subtree 224 respectively.
Vector group a is split and directory node 203 is constructed in subtree 222. Vector group a is divided into vector group e (city (4,4,2,1,4), district (8,3,1,1,3)), corresponding to subtree 225, and vector group f (country (7,10,3,1,7)), corresponding to subtree 226. Directory node 203 records the split dimension (the second dimension, attribute number), the split value (4), the weight of the split dimension (0.25), and the minimum and maximum values of the feature vectors in the two sub-feature-vector groups (3, 10).
From vector group f (country (7,10,3,1,7)), element node 215 is constructed in subtree 226.
Vector group e is split and directory node 204 is constructed in subtree 225. Vector group e is divided into vector group g (city (4,4,2,1,4)), corresponding to subtree 227, and vector group h (district (8,3,1,1,3)), corresponding to subtree 228. Directory node 204 records the split dimension (the first dimension, model name length), the split value (6), the weight of the split dimension (0.3), and the minimum and maximum values of the feature vectors in the two sub-feature-vector groups (4, 8).
From vector group g (city (4,4,2,1,4)) and vector group h (district (8,3,1,1,3)), element node 213 is constructed in subtree 227 and element node 214 is constructed in subtree 228 respectively.
Returning to Fig. 1, once the weighted multi-dimensional search tree has been built, step S110, the range search step, can be performed. In step S110, a range search for the model element to be matched is carried out on the weighted multi-dimensional search tree, so as to find all nodes on the tree that are similar to the model element to be matched and thereby construct the similar node set of the model element to be matched.
Step S110 is carried out as follows. To facilitate the range search, step S117, the search node set construction step, is performed first: a search node set is created, and the node of the highest level of the weighted multi-dimensional search tree (the first directory node, that is, the directory node constructed when the feature vector group of the model to be analyzed was split for the first time) is added to it.
Step S111, the node type judgment step, is then performed: an arbitrary node is selected from the search node set to be searched, and it is first judged whether that node is an element node. At the same time, the node on which the node type judgment step is performed is removed from the search node set.
Because the initial search node set contains only the first directory node of the weighted multi-dimensional search tree, the target of the first execution of step S111 is that first directory node, and the search node set is empty after this first execution.
If the node currently being searched is not an element node, step S113, the search range acquisition step, is performed. Since the node of the highest level of a weighted multi-dimensional search tree built by the method of the present embodiment is always a directory node, in the present embodiment step S111 need not be performed at the start of the search and step S113 can be performed directly.
In step S113, the search range of the model element to be matched with respect to the current node is obtained. In a weighted multi-dimensional search tree, the differing weights mean that the search region should not be the same in every direction of the space. In step S113, therefore, the search range of the model element to be matched along the split dimension is defined on the basis of the weight of the split dimension of the current node.
In the present embodiment, the search range of the weighted multi-dimensional search tree is defined as follows:
$$R_w = \{\, x \mid w_k\,|x_k - a_k| \le d \,\}$$
where $R_w$ is the search range and $d$ is the search radius. The condition $w_k|x_k - a_k| \le d$ describes the set of all points in the multi-dimensional space whose distance from the target point $a$ along a dimension, multiplied by the weight of that dimension, does not exceed the search radius $d$. The search radius is generally set according to the actual situation.
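For a single dimension with a positive weight, the condition $w_k|x_k - a_k| \le d$ is equivalent to an interval of half-width $d / w_k$ around $a_k$, so a larger weight narrows the range on that dimension. The sketch below shows this; the function names and the sample numbers are hypothetical.

```python
def search_interval(a_k: float, w_k: float, d: float):
    """Interval of x_k values satisfying w_k * |x_k - a_k| <= d (requires w_k > 0)."""
    half_width = d / w_k
    return a_k - half_width, a_k + half_width

def in_range(x_k: float, a_k: float, w_k: float, d: float) -> bool:
    """Direct check of the weighted range condition on one dimension."""
    return w_k * abs(x_k - a_k) <= d

# Hypothetical values: target coordinate 10, weight 0.25, search radius 0.5
low, high = search_interval(10, 0.25, 0.5)
print(low, high, in_range(12, 10, 0.25, 0.5), in_range(12.5, 10, 0.25, 0.5))
```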
Once the search range is known, the nodes to be searched next can be determined. Step S115, the step of searching for subtrees that satisfy the condition, is therefore performed: the subtrees that satisfy the search range are found among all subtrees of the current node.
In practice, step S115 may find several subtrees that satisfy the search range (in the present embodiment, at most two for one search node), whereas step S111 analyzes and judges only one node at a time. To ensure that no node is missed and that every node that needs to be searched is searched, step S114, the search node set update step, is performed in the present embodiment: the nodes in the subtrees that satisfy the search range are added to the search node set.
Step S111 is performed again after step S114: an arbitrary node is selected from the search node set and it is judged whether it is an element node. In the present embodiment this means that, if two subtrees are found, the node in either one of them is taken for step S111 and the node in the other is added to the search node set; if only one subtree is found, step S111 is performed directly on its node and the search node set remains unchanged.
If the new node is still a directory node, steps S113, S115 and S114 continue to be performed. This process is repeated until the node in step S111 is an element node, whereupon step S112, the similar node acquisition step, is performed and the current node (an element node) is added to the similar node set.
When an element node has been added to the similar node set, the search of the weighted multi-dimensional search tree on that branch is complete, and the remaining unfinished branches (the nodes in the search node set) must be processed. Step S116, the search node set judgment step, is performed after step S112 to judge whether any node remains in the search node set. While nodes remain, an arbitrary node is selected from the search node set and step S111 is performed on it. When no node remains, the search is complete.
It should be noted that step S115 may fail to find any subtree that satisfies the search range. This also means that the search on that branch is complete, and step S116 is performed next in that case.
In addition, in the present embodiment, if the similar node set contains no element node when the search finally completes, the search range set in step S113 is relaxed and the search is repeated.
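The search loop can be sketched as a worklist traversal over a search node set. Everything below is an illustration under stated assumptions: the class and function names are hypothetical, the tree and query are made-up data, a subtree is taken to satisfy the search range when the weighted interval on the split dimension overlaps the value range recorded for that side, and relaxing the range is modelled as doubling the search radius. The text does not fix these details, so the actual rule may differ.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass
class Element:
    name: str
    vector: Tuple[float, ...]

@dataclass
class Directory:
    split_dim: int
    split_value: float
    weight: float
    min_value: float                      # smallest value on the split dimension
    max_value: float                      # largest value on the split dimension
    low: Union["Directory", Element]      # side holding the smaller values
    high: Union["Directory", Element]     # side holding the larger values

def range_search(root, query, d):
    similar = []
    pending = [root]                      # S117: search node set with the top node
    while pending:                        # S116: continue while nodes remain
        node = pending.pop()              # S111: take any node and remove it
        if isinstance(node, Element):
            similar.append(node.name)     # S112: element node joins the similar set
            continue
        half = d / node.weight            # S113: range on the split dimension
        lo = query[node.split_dim] - half
        hi = query[node.split_dim] + half
        # S115/S114 (assumed overlap rule): queue each side the interval touches
        if lo <= node.split_value and hi >= node.min_value:
            pending.append(node.low)
        if hi >= node.split_value and lo <= node.max_value:
            pending.append(node.high)
    return similar

def search_with_relaxation(root, query, d, factor=2.0, max_rounds=10):
    """Repeat the search with a larger radius while nothing is found."""
    for _ in range(max_rounds):
        found = range_search(root, query, d)
        if found:
            return found, d
        d *= factor                       # assumed way of relaxing the range
    return [], d

# Hypothetical two-level tree with three element nodes
inner = Directory(1, 4, 0.25, 2, 6, Element("B", (9, 2)), Element("C", (7, 6)))
root = Directory(0, 5, 0.5, 1, 9, Element("A", (1, 2)), inner)
print(sorted(range_search(root, (8, 5), 0.5)))
```

For the query (8, 5) with radius 0.5, only the high side of the top node is visited and both of its element nodes are collected. A query far outside the recorded ranges finds nothing at first, which is the case the relaxation step handles.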
Taking a specific application as an example, the model shown in Fig. 4 is a revised version of the model shown in Fig. 3. Compared with Fig. 3, in the model shown in Fig. 4:
1) model element 303 (district) is renamed street (model element 403);
2) the attribute cityRank is added to model element 302 (city) (model element 402);
3) model element 406 (province) is added.
To search for the nodes in the model of Fig. 3 that are similar to model element 404 (country), the weighted multi-dimensional search tree shown in Fig. 2 is searched. The search flow is roughly as follows:
a search node set (initially empty) is created, and directory node 201 is searched;
subtree 222 is found to be the subtree that satisfies the search range;
directory node 203 is added to the search node set;
directory node 203 is taken out of the search node set (which is then empty) and searched;
subtree 226 is found to be the subtree that satisfies the search range;
element node 215 is added to the search node set;
element node 215 is taken out of the search node set (which is then empty) and searched;
element node 215 (country (7,10,3,1,7)) is added to the similar node set;
the search node set is empty, and the search ends.
The search radius is set to 0.8 in the above search process of the present embodiment.
Once the similar node set has been obtained, step S120 can be performed: the similarity between each element node in the similar node set and the model element to be matched is calculated, so as to determine the element node with the highest similarity to the model element to be matched.
The model element similarity formula defined in the present embodiment is as follows:
$$S_m = \lambda S_w + (1 - \lambda) S_s, \quad 0 \le \lambda \le 1 \qquad (3)$$
In formula (3), the model element similarity $S_m$ is the weighted average of two parts, the spatial distance similarity $S_w$ and the model element string similarity $S_s$; $\lambda$ is a parameter between 0 and 1.
The spatial distance similarity is defined as:
$$S_w = \frac{1}{1 + dist_w(\alpha, \beta)}$$
where $dist_w(\alpha, \beta)$ is the weighted Euclidean distance between the two points, defined as:
$$dist_w(\alpha, \beta) = \sqrt{\sum_{k} w_k (\alpha_k - \beta_k)^2}$$
where $w = (w_1, w_2, \dots, w_n)$ is the weight vector.
The string similarity is defined by means of the string edit distance. The edit distance of two strings is the minimum number of edit operations required to transform the source string into the target string, where the operations are inserting a character, replacing a character and deleting a character. The size of the edit distance reflects how similar the strings are. The string similarity is defined as follows:
$$S_s = 1 - \frac{edit(a, b)}{\max(L_a, L_b)}$$
where $edit(a, b)$ is the edit distance of the two strings $a$ and $b$, and $L_a$, $L_b$ are their lengths. When $edit(a, b) = 0$, $S_s = 1$, meaning that the two strings are identical. When $edit(a, b)$ equals the maximum string length $\max(L_a, L_b)$, the two strings are completely different and the similarity is 0. The formula also shows that the similarity is influenced by both the edit distance and the string lengths.
In summary, for a given model element to be matched, the most similar model element in the model can be calculated from its similar node set.
During model construction, many factors such as the developer, the construction environment and the intended use affect the model elements that are finally built. This uncertainty greatly increases the difficulty of model element matching. Because the method of the present invention establishes a unified standard for constructing feature vector groups, a unified standard for building the weighted multi-dimensional search tree, and a unified similarity calculation method, it reduces the influence of model element uncertainty on the final matching result. The method does not rely on static identifiers, and so avoids the effect on the matching calculation of the uncertain assignment of static identifiers when a model is developed concurrently. The method can therefore be applied where several people edit a model collaboratively.
Suppose now that the model name of model element 402 (city) in the model of Fig. 4 is changed to cities, its other features remaining unchanged, giving a new model element (cities).
Suppose the model element in the model of Fig. 3 that best matches model element (cities) is to be found, with the search radius set to 2. Searching the weighted multi-dimensional search tree shown in Fig. 2 then yields a similar node set containing element nodes 213 and 214 (city (4,4,2,1,4), district (8,3,1,1,3)), which correspond to model elements 302 and 303.
The feature vector of model element (cities) is (6,5,2,1,5).
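How such a feature vector is assembled from a model element can be sketched as follows. The `ModelElement` class and its field names are hypothetical. The first two dimensions (model name length, attribute number) are named in this embodiment; the order of the remaining three (number of references, number of primary key attributes, number of non-null attributes) is assumed to follow the order in which claim 2 lists them.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class ModelElement:
    """Hypothetical description of one model element (for example a table)."""
    name: str
    attribute_count: int
    reference_count: int
    primary_key_count: int
    non_null_count: int

def feature_vector(element: ModelElement) -> Tuple[int, ...]:
    # Dimension order is an assumption beyond the first two dimensions.
    return (
        len(element.name),           # model name length
        element.attribute_count,     # number of attributes
        element.reference_count,     # number of references
        element.primary_key_count,   # number of primary key attributes
        element.non_null_count,      # number of non-null attributes
    )

cities = ModelElement("cities", attribute_count=5, reference_count=2,
                      primary_key_count=1, non_null_count=5)
print(feature_vector(cities))
```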
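The similarity of cities to city and to district is calculated in turn.
Let $\lambda = 0.5$.
(1) cities and city
$dist_w(\alpha, \beta) = (0.3 \times 2^2 + 0.25 \times 1^2 + 0 + 0 + 0.1 \times 1^2)^{1/2} = 1.24$
$S_w = 1/(1 + 1.24) = 0.44$
$S_s = 1 - 3/6 = 0.5$
$S_m = 0.5 \times 0.44 + 0.5 \times 0.5 = 0.47$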
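(2) cities and district
$dist_w(\alpha, \beta) = (0.3 \times 2^2 + 0.25 \times 2^2 + 0.25 \times 1^2 + 0 + 0.1 \times 2^2)^{1/2} = 1.69$
$S_w = 1/(1 + 1.69) = 0.37$
$S_s = 1 - 5/8 = 0.375$ (the edit distance between cities and district is 5)
$S_m = 0.5 \times 0.37 + 0.5 \times 0.375 = 0.37$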
It can be seen from the above that model element (city) is more similar to model element (cities) than model element (district) is. The model element in the model of Fig. 3 that matches model element (cities) is therefore model element (city). Obtaining this matching result required only 2 similarity calculations. Compared with the prior-art method of traversing all elements of the model (which would require 5 calculations in this example), the amount of calculation is reduced, which shortens the time taken by the whole model element matching process and improves its speed and efficiency.
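The comparison above can be sketched in Python. This is an illustrative implementation under assumptions: the function names are hypothetical, the edit distance is the standard Levenshtein distance with unit costs, and the weight of the fourth dimension (0.1) is an assumed value that does not affect this example because the vectors agree on that dimension.

```python
import math

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and replacements cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # replace or keep
        prev = cur
    return prev[-1]

def string_similarity(a: str, b: str) -> float:
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest

def weighted_distance(x, y, weights) -> float:
    return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, x, y)))

def model_similarity(name_a, vec_a, name_b, vec_b, weights, lam=0.5) -> float:
    s_w = 1 / (1 + weighted_distance(vec_a, vec_b, weights))
    s_s = string_similarity(name_a, name_b)
    return lam * s_w + (1 - lam) * s_s

WEIGHTS = (0.3, 0.25, 0.25, 0.1, 0.1)   # fourth weight is an assumption
query = ("cities", (6, 5, 2, 1, 5))
candidates = {"city": (4, 4, 2, 1, 4), "district": (8, 3, 1, 1, 3)}

scores = {name: model_similarity(query[0], query[1], name, vec, WEIGHTS)
          for name, vec in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores["city"], 2))
```

The sketch evaluates the city candidate to about 0.47 and selects city as the best match.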
Although embodiments are disclosed above, the content described is only an embodiment adopted to facilitate understanding of the present invention and is not intended to limit it. The method of the present invention may have various other embodiments. Without departing from the essence of the present invention, those skilled in the art may make various corresponding changes or modifications according to the present invention, but all such changes or modifications shall fall within the scope of the claims of the present invention.

Claims (9)

1. A model element matching method for a type attribute graph model, characterized in that the method comprises the following steps:
Step 1: building a weighted multi-dimensional search tree of a model to be analyzed, wherein the weighted multi-dimensional search tree comprises directory nodes and element nodes that stand in a hierarchical subordination relationship to one another, each element node describes a corresponding model element of the model to be analyzed, each directory node comprises multiple subtrees, and each directory node or element node is constructed in a subtree of the directory node to which it is subordinate;
Step 2: carrying out a range search on the weighted multi-dimensional search tree for a model element to be matched, so as to find all element nodes on the weighted multi-dimensional search tree that are similar to the model element to be matched, and thereby constructing a similar node set of the model element to be matched;
Step 3: calculating the similarity between each element node in the similar node set and the model element to be matched, so as to determine the element node with the highest similarity to the model element to be matched;
wherein step 1 comprises the following steps:
a feature vector group construction step: constructing corresponding feature vectors according to the features of the model to be analyzed, so as to build a first feature vector group of the model to be analyzed;
a weighted multi-dimensional search tree construction step: building the weighted multi-dimensional search tree using the first feature vector group;
wherein the weighted multi-dimensional search tree construction step comprises the following steps:
a split target set construction step: creating a split target set and adding the first feature vector group to the split target set;
a split judgment step: selecting an arbitrary feature vector group from the split target set, judging whether the feature vector group needs to be split, and removing from the split target set the feature vector group on which the split judgment step has been completed, wherein the feature vector group needs to be split when the number of feature vectors in it is greater than a particular value predefined by the user;
a splitting step: when the feature vector group needs to be split, determining a split strategy according to user requirements, performing one splitting operation on the feature vector group based on the split strategy so as to obtain multiple sub-feature-vector groups, and adding each sub-feature-vector group to the split target set, the split strategy comprising a split dimension and a split value;
a directory node construction step: after the current splitting step has finished, building a corresponding directory node according to the sub-feature-vector groups obtained by the current splitting step, wherein each sub-feature-vector group corresponds to one subtree of the directory node;
an element node construction step: when a target feature vector group does not need to be split, constructing the element node based on the target feature vector group;
a split target set traversal step: for all feature vector groups in the split target set, performing the split judgment step and, according to the result of the split judgment step, performing either the splitting step and the directory node construction step or the element node construction step, until the split target set is empty.
2. The method of claim 1, characterized in that the feature vector group construction step comprises the following steps:
analyzing the model to be analyzed so as to obtain the feature vectors of the model to be analyzed, wherein each model element of the model to be analyzed corresponds to one feature vector, each typical feature of the model element corresponds to one dimension of the feature vector, and the typical features comprise: model name length, number of attributes, number of references, number of primary key attributes and number of non-null attributes;
assigning a weight to each dimension of the feature vector based on the semantics of the typical feature.
3. The method of claim 1, characterized in that, in the splitting step, the value of each feature vector of the feature vector group along the split dimension is compared with the split value to obtain a comparison result, and the feature vector group is split into two sub-feature-vector groups according to the comparison result.
4. The method of claim 3, characterized in that, for each dimension, the product of the weight of the dimension and the variance of all feature vectors of the feature vector group along the dimension is calculated, and the dimension with the largest product is chosen as the split dimension.
5. The method of claim 4, characterized in that the median of the values of all feature vectors of the feature vector group along the split dimension is chosen as the split value.
6. The method of claim 5, characterized in that, in the splitting step, the two sub-feature-vector groups are defined as a first sub-feature-vector group and a second sub-feature-vector group respectively, wherein:
the feature vectors of the target feature vector group whose value is less than the split value belong to the first sub-feature-vector group;
the feature vectors of the target feature vector group whose value is greater than the split value belong to the second sub-feature-vector group;
the assignment of the feature vectors of the target feature vector group whose value equals the split value is determined according to whether the first sub-feature-vector group and the second sub-feature-vector group are balanced.
7. The method of claim 6, characterized in that the directory node records the split dimension of the current directory node, the split value of the split dimension, the weight of the split dimension, the minimum value of the feature vectors in the first sub-feature-vector group of the current directory node, and the maximum value of the feature vectors in the second sub-feature-vector group.
8. The method of claim 7, characterized in that step 2 comprises the following steps:
a search node set construction step: creating a search node set and adding the first directory node of the weighted multi-dimensional search tree to the search node set;
a node type judgment step: selecting an arbitrary node from the search node set, judging whether the node is an element node, and removing from the search node set the node on which the node type judgment step is performed;
a similar node acquisition step: when the node is an element node, adding the node to the similar node set;
a search range acquisition step: when the node is not an element node, obtaining the search range of the model element to be matched with respect to the node;
a search node set update step: finding the subtrees of the node that satisfy the search range, and adding the nodes in those subtrees to the search node set;
a search node set traversal step: for all nodes in the search node set, performing the node type judgment step and, according to its result, performing either the search range acquisition step and the search node set update step or the similar node acquisition step, until the search node set is empty.
9. The method of claim 8, characterized in that, in the search range acquisition step, the search range of the model element to be matched along the split dimension is defined based on the weight of the split dimension of the current directory node.
CN201510028158.4A 2015-01-20 2015-01-20 A kind of model element matching process for type attribute graph model Active CN104598591B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510028158.4A CN104598591B (en) 2015-01-20 2015-01-20 A kind of model element matching process for type attribute graph model


Publications (2)

Publication Number Publication Date
CN104598591A CN104598591A (en) 2015-05-06
CN104598591B true CN104598591B (en) 2017-08-04

Family

ID=53124376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510028158.4A Active CN104598591B (en) 2015-01-20 2015-01-20 A kind of model element matching process for type attribute graph model

Country Status (1)

Country Link
CN (1) CN104598591B (en)


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant