CN107203529A - Multi-service correlation analysis method and device based on metadata graph structural similarity - Google Patents

Publication number
CN107203529A
Authority
CN
China
Prior art keywords
metadata
business
graph
relation
similarity
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610150952.0A
Other languages
Chinese (zh)
Other versions
CN107203529B (en)
Inventor
李湛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Hebei Co Ltd
Original Assignee
China Mobile Group Hebei Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by China Mobile Group Hebei Co Ltd
Priority to CN201610150952.0A
Publication of CN107203529A
Application granted
Publication of CN107203529B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a multi-service correlation analysis method and device based on metadata graph structural similarity. The method includes: after obtaining metadata from multiple businesses, establishing a relationship graph of metadata objects; determining whether common metadata objects and metadata object attributes of the same meta-model exist in the relationship graph of the metadata objects and, if so, obtaining the structural similarity of the metadata graph according to the similarity of the vertices, vertex attributes, and edges of the structure in the relationship graph; and determining the association relationship between the multiple businesses based on the structural similarity of the metadata graph.

Description

Multi-service correlation analysis method and device based on metadata graph structural similarity
Technical field
The present invention relates to correlation analysis technology, and in particular to a multi-service correlation analysis method and device based on metadata graph structural similarity.
Background
Metadata is data that describes data; it mainly describes the concepts, relationships, rules, and semantics of a domain. Metadata is an effective way to manage mass-data systems (for example, data warehouses, data marts, and Hadoop big data platforms). It provides a complete and clear catalog for accessing data, so that users can understand the data as a whole and use it efficiently.
When correlation analysis is performed on metadata using the prior art, the following defects mainly exist:
One, a relationship link of metadata is a single business process whose elements have direct reference relationships or direct data-flow relationships from beginning to end, whereas the multiple businesses of an enterprise usually also have many indirect relationships with one another. Existing metadata systems have no way to determine the key connection hubs between multiple businesses, so when the caliber (statistical definition) of one business changes, its impact on other businesses cannot be assessed intuitively; the impact of each metadata object on the business processes can only be traced back manually.
Two, existing metadata association analysis only roughly compares the number of metadata objects that overlap in two relationship links. In practice, different businesses often use different attributes of a metadata object, and their logical flow relationships also tend to differ. An association analysis that does not take the attribute information and logical relationships of metadata objects into account therefore often lacks accuracy.
Three, businesses are currently classified mainly by hand, by business personnel relying on experience. This is barely manageable when the data volume is small, but it becomes clearly inadequate for massive, complex big data, and existing metadata application systems lack methods and mechanisms for assisted classification.
Summary of the invention
In view of this, the embodiments of the present invention are intended to provide a multi-service correlation analysis method and device based on metadata graph structural similarity, which solve at least the problems existing in the prior art.
The technical solution of the embodiments of the present invention is implemented as follows:
A multi-service correlation analysis method based on metadata graph structural similarity according to an embodiment of the present invention includes:
after obtaining metadata from multiple businesses, establishing a relationship graph of metadata objects;
determining whether common metadata objects and metadata object attributes of the same meta-model exist in the relationship graph of the metadata objects and, if so, obtaining the structural similarity of the metadata graph according to the similarity of the vertices, vertex attributes, and edges of the structure in the relationship graph of the metadata objects;
determining the association relationship between the multiple businesses based on the structural similarity of the metadata graph.
In the above scheme, establishing the relationship graph of metadata objects after obtaining metadata from multiple businesses includes:
dividing the metadata into multiple classes according to different granularities, the description model established for each class being a meta-model;
forming the metadata objects from instances or entities of the meta-models;
establishing metadata relationships according to the reference or data-flow relationships between the metadata objects, and establishing a directed graph of the metadata objects with the metadata objects as vertices and the relationships between metadata objects as edges, the directed graph of the metadata objects serving as the relationship graph of the metadata objects.
In the above scheme, the method further includes:
the resource objects involved in each business and the relationships between those resource objects can all be represented using the directed graph of the metadata objects.
In the above scheme, obtaining the structural similarity of the metadata graph according to the similarity of the vertices, vertex attributes, and edges of the structure in the relationship graph of the metadata objects includes:
obtaining the similarity of the vertices combined with vertex attributes of the structure in the relationship graph of the metadata objects;
obtaining the similarity of the edges in the relationship graph of the metadata objects;
obtaining the structural similarity of the metadata graph according to the similarity of the vertices combined with vertex attributes and the similarity of the edges.
In the above scheme, obtaining the similarity of the vertices combined with vertex attributes of the structure in the relationship graph of the metadata objects includes:
representing each business with a metadata subgraph of a metadata directed graph;
obtaining the proportion of the common vertices and their attributes of two metadata subgraphs relative to a graph of a specified size, and calculating, according to the proportion, the similarity of the vertices combined with vertex attributes of the metadata subgraph structures corresponding to any two businesses.
In the above scheme, obtaining the similarity of the edges in the relationship graph of the metadata objects includes:
representing each business with a metadata subgraph of a metadata directed graph;
obtaining the proportion of the common edges of two metadata subgraphs relative to a graph of a specified size, and calculating, according to the proportion, the edge similarity of the metadata subgraph structures corresponding to any two businesses.
In the above scheme, determining the association relationship between the multiple businesses based on the structural similarity of the metadata graph includes:
combining the similarity of the vertices combined with vertex attributes and the edge similarity of the metadata subgraph structures corresponding to any two businesses to measure the association between any different businesses;
according to the aspect of practical concern, adjusting the weights by an adjustment factor to obtain a business association degree value, and determining the association relationship between the multiple businesses from the business association degree value.
A multi-service correlation analysis device based on metadata graph structural similarity according to an embodiment of the present invention includes:
an establishing unit, configured to establish a relationship graph of metadata objects after metadata is obtained from multiple businesses;
a processing unit, configured to determine whether common metadata objects and metadata object attributes of the same meta-model exist in the relationship graph of the metadata objects and, if so, to obtain the structural similarity of the metadata graph according to the similarity of the vertices, vertex attributes, and edges of the structure in the relationship graph of the metadata objects;
a determining unit, configured to determine the association relationship between the multiple businesses based on the structural similarity of the metadata graph.
In the above scheme, the establishing unit further includes:
a classification subunit, configured to divide the metadata into multiple classes according to different granularities, the description model established for each class being a meta-model;
a composition subunit, configured to form the metadata objects from instances or entities of the meta-models;
a relationship establishing subunit, configured to establish metadata relationships according to the reference or data-flow relationships between the metadata objects, and to establish a directed graph of the metadata objects with the metadata objects as vertices and the relationships between metadata objects as edges, the directed graph of the metadata objects serving as the relationship graph of the metadata objects.
In the above scheme, the device further provides that:
the resource objects involved in each business and the relationships between those resource objects can all be represented using the directed graph of the metadata objects.
In the above scheme, the processing unit further includes:
a first processing subunit, configured to obtain the similarity of the vertices combined with vertex attributes of the structure in the relationship graph of the metadata objects;
a second processing subunit, configured to obtain the similarity of the edges in the relationship graph of the metadata objects;
a third processing subunit, configured to obtain the structural similarity of the metadata graph according to the similarity of the vertices combined with vertex attributes and the similarity of the edges.
In the above scheme, the first processing subunit is further configured to:
represent each business with a metadata subgraph of a metadata directed graph;
obtain the proportion of the common vertices and their attributes of two metadata subgraphs relative to a graph of a specified size, and calculate, according to the proportion, the similarity of the vertices combined with vertex attributes of the metadata subgraph structures corresponding to any two businesses.
In the above scheme, the second processing subunit is further configured to:
represent each business with a metadata subgraph of a metadata directed graph;
obtain the proportion of the common edges of two metadata subgraphs relative to a graph of a specified size, and calculate, according to the proportion, the edge similarity of the metadata subgraph structures corresponding to any two businesses.
In the above scheme, the determining unit is further configured to:
combine the similarity of the vertices combined with vertex attributes and the edge similarity of the metadata subgraph structures corresponding to any two businesses to measure the association between any different businesses;
according to the aspect of practical concern, adjust the weights by an adjustment factor to obtain a business association degree value, and determine the association relationship between the multiple businesses from the business association degree value.
The multi-service correlation analysis method based on metadata graph structural similarity according to the embodiments of the present invention includes: after obtaining metadata from multiple businesses, establishing a relationship graph of metadata objects; determining whether common metadata objects and metadata object attributes of the same meta-model exist in the relationship graph of the metadata objects and, if so, obtaining the structural similarity of the metadata graph according to the similarity of the vertices, vertex attributes, and edges of the structure in the relationship graph; and determining the association relationship between the multiple businesses based on the structural similarity of the metadata graph. With the embodiments of the present invention, the accuracy and effectiveness of correlation analysis can be improved.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the three-layer architecture of a metadata system in an application scenario of the embodiment of the present invention;
Fig. 3 is a schematic diagram of multi-service correlation analysis based on metadata graph structural similarity in an application scenario of the embodiment of the present invention;
Fig. 4 is a flowchart of multi-service correlation analysis based on metadata graph structural similarity in an application scenario of the embodiment of the present invention.
Detailed description
The implementation of the technical solution is described in further detail below with reference to the accompanying drawings.
A multi-service correlation analysis method based on metadata graph structural similarity according to an embodiment of the present invention, as shown in Fig. 1, includes:
Step 101: after obtaining metadata from multiple businesses, establish a relationship graph of metadata objects.
Step 102: determine whether common metadata objects and metadata object attributes of the same meta-model exist in the relationship graph of the metadata objects and, if so, obtain the structural similarity of the metadata graph according to the similarity of the vertices, vertex attributes, and edges of the structure in the relationship graph of the metadata objects.
Step 103: determine the association relationship between the multiple businesses based on the structural similarity of the metadata graph.
In one implementation of the embodiment of the present invention, establishing the relationship graph of metadata objects after obtaining metadata from multiple businesses includes: dividing the metadata into multiple classes according to different granularities, the description model established for each class being a meta-model; forming the metadata objects from instances or entities of the meta-models; establishing metadata relationships according to the reference or data-flow relationships between the metadata objects, and establishing a directed graph of the metadata objects with the metadata objects as vertices and the relationships between metadata objects as edges, the directed graph of the metadata objects serving as the relationship graph of the metadata objects.
In one implementation of the embodiment of the present invention, the method further includes: the resource objects involved in each business and the relationships between those resource objects can all be represented using the directed graph of the metadata objects.
In one implementation of the embodiment of the present invention, obtaining the structural similarity of the metadata graph according to the similarity of the vertices, vertex attributes, and edges of the structure in the relationship graph of the metadata objects includes: obtaining the similarity of the vertices combined with vertex attributes of the structure in the relationship graph of the metadata objects; obtaining the similarity of the edges in the relationship graph of the metadata objects; and obtaining the structural similarity of the metadata graph according to the similarity of the vertices combined with vertex attributes and the similarity of the edges.
In one implementation of the embodiment of the present invention, obtaining the similarity of the vertices combined with vertex attributes of the structure in the relationship graph of the metadata objects includes: representing each business with a metadata subgraph of a metadata directed graph; obtaining the proportion of the common vertices and their attributes of two metadata subgraphs relative to a graph of a specified size (for example, the minimal graph), and calculating, according to the proportion, the similarity of the vertices combined with vertex attributes of the metadata subgraph structures corresponding to any two businesses.
In one implementation of the embodiment of the present invention, obtaining the similarity of the edges in the relationship graph of the metadata objects includes: representing each business with a metadata subgraph of a metadata directed graph; obtaining the proportion of the common edges of two metadata subgraphs relative to a graph of a specified size (for example, the minimal graph), and calculating, according to the proportion, the edge similarity of the metadata subgraph structures corresponding to any two businesses.
In one implementation of the embodiment of the present invention, determining the association relationship between the multiple businesses based on the structural similarity of the metadata graph includes: combining the similarity of the vertices combined with vertex attributes and the edge similarity of the metadata subgraph structures corresponding to any two businesses to measure the association between any different businesses; and, according to the aspect of practical concern, adjusting the weights by an adjustment factor to obtain a business association degree value, and determining the association relationship between the multiple businesses from the business association degree value.
A multi-service correlation analysis device based on metadata graph structural similarity according to an embodiment of the present invention includes: an establishing unit, configured to establish a relationship graph of metadata objects after metadata is obtained from multiple businesses; a processing unit, configured to determine whether common metadata objects and metadata object attributes of the same meta-model exist in the relationship graph of the metadata objects and, if so, to obtain the structural similarity of the metadata graph according to the similarity of the vertices, vertex attributes, and edges of the structure in the relationship graph of the metadata objects; and a determining unit, configured to determine the association relationship between the multiple businesses based on the structural similarity of the metadata graph.
In one implementation of the embodiment of the present invention, the establishing unit further includes:
a classification subunit, configured to divide the metadata into multiple classes according to different granularities, the description model established for each class being a meta-model;
a composition subunit, configured to form the metadata objects from instances or entities of the meta-models;
a relationship establishing subunit, configured to establish metadata relationships according to the reference or data-flow relationships between the metadata objects, and to establish a directed graph of the metadata objects with the metadata objects as vertices and the relationships between metadata objects as edges, the directed graph of the metadata objects serving as the relationship graph of the metadata objects.
In one implementation of the embodiment of the present invention, the device further provides that:
the resource objects involved in each business and the relationships between those resource objects can all be represented using the directed graph of the metadata objects.
In one implementation of the embodiment of the present invention, the processing unit further includes:
a first processing subunit, configured to obtain the similarity of the vertices combined with vertex attributes of the structure in the relationship graph of the metadata objects;
a second processing subunit, configured to obtain the similarity of the edges in the relationship graph of the metadata objects;
a third processing subunit, configured to obtain the structural similarity of the metadata graph according to the similarity of the vertices combined with vertex attributes and the similarity of the edges.
In one implementation of the embodiment of the present invention, the first processing subunit is further configured to:
represent each business with a metadata subgraph of a metadata directed graph;
obtain the proportion of the common vertices and their attributes of two metadata subgraphs relative to a graph of a specified size, and calculate, according to the proportion, the similarity of the vertices combined with vertex attributes of the metadata subgraph structures corresponding to any two businesses.
In one implementation of the embodiment of the present invention, the second processing subunit is further configured to:
represent each business with a metadata subgraph of a metadata directed graph;
obtain the proportion of the common edges of two metadata subgraphs relative to a graph of a specified size, and calculate, according to the proportion, the edge similarity of the metadata subgraph structures corresponding to any two businesses.
In one implementation of the embodiment of the present invention, the determining unit is further configured to:
combine the similarity of the vertices combined with vertex attributes and the edge similarity of the metadata subgraph structures corresponding to any two businesses to measure the association between any different businesses;
according to the aspect of practical concern, adjust the weights by an adjustment factor to obtain a business association degree value, and determine the association relationship between the multiple businesses from the business association degree value.
The embodiment of the present invention is described below taking a practical application scenario as an example.
The application scenario is first described as follows:
In the current big data era, the successful implementation and operation of business intelligence (BI) depends on effective metadata management and application. Metadata is defined as data that describes other data, mainly including the subjects, concepts, terms, structures, processes, relationships, and rules related to the business, technical, and management domains. High-level metadata applications can serve as signposts for the data of complex systems and massive data sets, help users better understand the details of each business, strengthen the basic support that data provides to the business, improve the control of data quality, and enable efficient business management. However, metadata applications are currently still at a simple service stage, lack in-depth, high-level research and application, and need substantial improvement in analyzing the complex relationships among multiple businesses.
In this application scenario, the embodiment of the present invention measures the association between multiple businesses according to the similarity of the vertices, their attributes, and the edges of the metadata directed graphs corresponding to the businesses. It can reflect the complex, intersecting influence relationships among multiple businesses, guide business personnel in becoming familiar with those relationships, and support decision-making for enterprise performance analysis. The main problems it can solve are: 1) By establishing a relationship graph of metadata objects and comparing the similarity of the vertices, their attributes, and the edges of the graph structures to determine the association relationships between multiple businesses, it can intuitively reflect the degree to which a change in one business affects other businesses. 2) It takes into account the resource attributes that different businesses use and the preceding and following logical relationships of the businesses; by making full use of the vertices of the metadata objects, their attribute information, and the edges corresponding to the logical flow relationships when measuring the similarity of metadata graph structures, it makes the business association analysis results more credible. 3) After the business association degree is calculated based on metadata graph structural similarity, a series of applications can be built, for example a business change impact early-warning system, combing and merging redundant or repeated business processes, and automatic assisted business classification, which can address some of the complex problems that big data faces.
The three-layer architecture of the metadata system is shown in Fig. 2. In this application scenario, the embodiment of the present invention adds a multi-service meta-object analysis function module based on metadata graph structural similarity (shown as A15 in Fig. 2) and, on this basis, further provides the high-level extended applications shown as A16 in Fig. 2, such as a business change early-warning module, a module for combing and merging redundant or repeated business processes, and an automatic assisted business classification module. The remaining modules, marked A11, A12, A13, and A14, are existing modules.
The main principle of the multi-service correlation analysis function module based on metadata graph structural similarity is shown in Fig. 3, a schematic diagram of multi-service correlation analysis based on metadata graph structural similarity, and is described in detail as follows:
One, the resource objects used by different businesses partly coincide; that is, different businesses jointly use certain resource objects or certain attributes of those resource objects. Mapped onto the metadata graph, their vertices and vertex attributes have a common part, so an association exists between these businesses.
For example, in Fig. 3 the company has two businesses that correspond to two market-share reports. The data of both reports is aggregated from the same table, but according to different calibers they use some of the same fields of that table. The reports of the two businesses are therefore associated through the jointly used table and its attribute fields.
At present, structured data (for example, relational databases and OLAP online analytical data) and the description information of unstructured data (for example, log files, XML files, Webservice interfaces, and Hadoop platform data) are the main sources from which metadata is commonly generated. Automatically or manually extracting and entering the description data of these data is the main way in which the acquisition layer of the metadata system obtains data.
In the logical layer of the metadata system, metadata is divided into $\delta$ classes according to different granularities, and a description model, called a meta-model, is established for each class. All metadata can thus be classified by meta-model and expressed as a set $M = \{m_1, m_2, \ldots, m_\delta\}$, where each meta-model $m_\chi$ can be described by several attributes, i.e. $m_\chi = (a_1, a_2, \ldots, a_\kappa)$. An instance or entity of a meta-model is called a meta-object. Metadata relationships are established according to the reference or data-flow relationships between meta-objects; $r_{\chi,\gamma}$ denotes the relationship between two metadata objects. With meta-objects as vertices and the relationships between meta-objects as edges, a directed graph of the metadata can be established, expressed as $G = \langle V, E \rangle$, where the vertices are expressed as a set and the edges as an adjacency matrix. In this way, the resource objects involved in each business and the relationships between those resource objects can be represented by a directed graph of metadata. In the functional layer of the metadata system, on the basis of the metadata directed graphs obtained by abstracting the businesses, the graphs are compared to determine whether common meta-objects and meta-object attributes of the same meta-model exist, and the association between different businesses is measured according to the similarity of the vertices and vertex attributes of the metadata graph structures.
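The construction described above can be sketched in Python. This is a minimal illustration only: the class names `MetaModel`, `MetaObject`, and `MetadataGraph`, the helper `build_example_graph`, and the example objects (a stored procedure, a table, and a report) are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetaModel:
    """A description model for one class of metadata (hypothetical layout)."""
    name: str
    attributes: tuple  # attribute names a_1 ... a_k


@dataclass
class MetaObject:
    """An instance of a meta-model; unused attributes are left as None."""
    object_id: str
    model: MetaModel
    values: dict


@dataclass
class MetadataGraph:
    """Directed graph G = <V, E> with meta-objects as vertices."""
    vertices: dict = field(default_factory=dict)  # object_id -> MetaObject
    edges: set = field(default_factory=set)       # (source_id, target_id)

    def add_object(self, obj):
        self.vertices[obj.object_id] = obj

    def add_relation(self, source_id, target_id):
        # A reference or data-flow relation becomes one directed edge.
        for object_id in (source_id, target_id):
            if object_id not in self.vertices:
                raise KeyError(f"unknown meta-object: {object_id}")
        self.edges.add((source_id, target_id))

    def adjacency_matrix(self):
        # Rows and columns follow the sorted vertex ids.
        ids = sorted(self.vertices)
        index = {object_id: i for i, object_id in enumerate(ids)}
        matrix = [[0] * len(ids) for _ in ids]
        for source_id, target_id in self.edges:
            matrix[index[source_id]][index[target_id]] = 1
        return ids, matrix


def build_example_graph():
    """Hypothetical example: stored procedure -> table -> report."""
    table = MetaModel("table", ("name", "fields"))
    procedure = MetaModel("stored_procedure", ("name", "sql"))
    report = MetaModel("report", ("name", "caliber"))
    graph = MetadataGraph()
    graph.add_object(MetaObject("proc_load", procedure, {"name": "load", "sql": None}))
    graph.add_object(MetaObject("table_sales", table, {"name": "sales", "fields": "region,revenue"}))
    graph.add_object(MetaObject("report_share", report, {"name": "share", "caliber": "monthly"}))
    graph.add_relation("proc_load", "table_sales")
    graph.add_relation("table_sales", "report_share")
    return graph
```

Calling `build_example_graph().adjacency_matrix()` returns the sorted vertex ids together with a 0/1 matrix in which entry (i, j) is 1 when a directed relation runs from vertex i to vertex j.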
Because the attribute dimensions of a given meta-model are fixed, the meta-objects derived from the same meta-model have attributes of the same dimension. Because the businesses differ, however, the specific meta-objects or their attribute values may differ. In the embodiment of the present invention, cosine similarity is used to measure the similarity between the attributes of different meta-objects of the same meta-model $m_\chi$, as given by formula (1-1).
[Formula (1-1): not reproduced in this text.]
Here, an attribute of a meta-object is represented as 1 if it is not empty and as 0 otherwise.
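As a hedged illustration of the attribute comparison described above, the following Python sketch applies the standard cosine similarity to 0/1 presence vectors. Formula (1-1) itself is not preserved in this text, so this is an assumption about its general shape rather than the patent's exact expression; the function names and the sample attributes are hypothetical.

```python
import math


def presence_vector(values, attribute_names):
    """Map each attribute to 1 if it is non-empty for this meta-object, else 0."""
    return [1 if values.get(name) not in (None, "") else 0 for name in attribute_names]


def cosine_similarity(u, v):
    """Standard cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def attribute_similarity(values_a, values_b, attribute_names):
    """Similarity between the attribute usage of two meta-objects of one meta-model."""
    return cosine_similarity(
        presence_vector(values_a, attribute_names),
        presence_vector(values_b, attribute_names),
    )


# Hypothetical meta-model with four attributes; each business fills three of them.
ATTRIBUTES = ("region", "month", "revenue", "cost")
OBJECT_A = {"region": "Hebei", "month": "2016-03", "revenue": 100, "cost": None}
OBJECT_B = {"region": "Hebei", "month": None, "revenue": 90, "cost": 10}
```

With these sample objects the presence vectors are [1, 1, 1, 0] and [1, 0, 1, 1], so `attribute_similarity(OBJECT_A, OBJECT_B, ATTRIBUTES)` evaluates to 2/3.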
Each business is represented by a subgraph of a metadata directed graph. Considering the proportion of the common vertices and their attributes of two graphs relative to the minimal graph, the similarity of the vertices combined with vertex attributes of the metadata graph structures corresponding to any two businesses $\alpha$ and $\beta$ can be calculated, as shown in formula (1-2).
[Formula (1-2): not reproduced in this text.]
Here the subgraphs $g_\alpha, g_\beta \in G$.
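One possible reading of the vertex similarity described above is sketched below in Python. Because formula (1-2) is not preserved in this text, the sketch rests on two stated assumptions: the "minimal graph" is taken to be the subgraph with fewer vertices, and each common vertex contributes the cosine similarity of its 0/1 attribute usage. The function names and sample subgraphs are hypothetical.

```python
import math


def attribute_set_similarity(attrs_a, attrs_b):
    """Cosine similarity of two 0/1 attribute vectors, given as sets of used attribute names."""
    if not attrs_a or not attrs_b:
        return 0.0
    return len(attrs_a & attrs_b) / math.sqrt(len(attrs_a) * len(attrs_b))


def vertex_similarity(graph_a, graph_b):
    """Assumed sv(g_a, g_b): attribute-weighted common vertices over the smaller vertex count.

    Each graph is a dict mapping a vertex id to the set of attribute names
    that the business actually uses on that vertex.
    """
    smaller = min(len(graph_a), len(graph_b))
    if smaller == 0:
        return 0.0
    common = graph_a.keys() & graph_b.keys()
    weighted = sum(attribute_set_similarity(graph_a[v], graph_b[v]) for v in common)
    return weighted / smaller


# Hypothetical subgraphs for two report businesses that share one table.
GRAPH_A = {
    "table_sales": {"region", "revenue"},
    "report_a": {"share"},
}
GRAPH_B = {
    "table_sales": {"region", "revenue", "month", "cost"},
    "report_b": {"share"},
    "proc_load": {"sql"},
}
```

Under these assumptions the only common vertex is `table_sales`, whose attribute similarity is 2/sqrt(8), and the smaller subgraph has two vertices, so `vertex_similarity(GRAPH_A, GRAPH_B)` is about 0.354. Normalizing by the smaller graph keeps the value at 1.0 when one subgraph is fully contained in the other with identical attribute usage.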
Two, the logical processing of different businesses is similar; that is, these businesses all use logic flows that map from certain resource objects or their attributes to other resource objects or their corresponding attributes. In the metadata graph this appears as common, continuous directed edges, and such businesses are associated.
In this embodiment, from the perspective of the abstracted metadata directed graph, if common, continuous, directed edges exist between meta objects, the relevance between different businesses can be measured according to the similarity of the edges of the metadata graph structure. In the example above, the related forms of the company's two businesses are associated through a jointly used table and its attribute fields, and this table and its fields are all processed by the same stored procedure, so there is a continuous, directed logical link from the stored procedure to the table and on to its fields.
By considering the proportion of the smaller graph that is accounted for by the common edges of the metadata directed graphs corresponding to two businesses, the edge similarity of the metadata graph structures corresponding to any two businesses α and β can be calculated, as shown in formula (1-3):
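Formula (1-3) is not reproduced in this text. A minimal sketch of the described ratio, assuming each subgraph's edges are stored as a set of (source, target) pairs and that "proportion of the smaller graph" means division by the smaller edge count, could look as follows. The name `edge_similarity` and the sample identifiers are hypothetical, and the sketch counts shared edges without checking that they form a continuous path.

```python
def edge_similarity(edges_alpha, edges_beta):
    """Share of the smaller edge set that both businesses have in common."""
    smaller = min(len(edges_alpha), len(edges_beta))
    if smaller == 0:
        return 0.0
    return len(edges_alpha & edges_beta) / smaller

# Directed edges: stored procedure -> table -> field, and so on.
edges_alpha = {
    ("sp_load", "t_orders"),
    ("t_orders", "t_orders.amount"),
    ("t_users", "t_orders"),
}
edges_beta = {
    ("sp_load", "t_orders"),
    ("t_orders", "t_orders.amount"),
    ("t_orders", "rpt_daily"),
    ("sp_load", "t_log"),
}
se = edge_similarity(edges_alpha, edges_beta)
```

The two businesses share the link from the stored procedure to the table and from the table to its field, that is, 2 of the 3 edges of the smaller graph, so `se` is 2/3.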
Three, the two aspects, namely the similarity of the vertices and vertex attributes of the metadata graph structure and the similarity of its edges, are combined to measure the relevance between multiple different businesses; according to the aspect of actual concern, the weights are adjusted through an adjustment factor to obtain the business association degree value.
In practice, the resource objects and attributes that two businesses use in common and their business logic flows are often considered together when comparing the relationship between the two businesses. This embodiment therefore combines formulas (1-2) and (1-3) above and proposes a formula for the degree of association between any two businesses α and β, as shown in (1-4):
rel(α, β) = sim(gα, gβ) = θ·sv(gα, gβ) + (1-θ)·se(gα, gβ)    (1-4)
If one of the two businesses α and β is a sub-business of the other, the degree of association of the two businesses is 100%, i.e. rel(α, β) = 1.
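Formula (1-4) and the sub-business rule can be sketched directly. The function name `association`, the default θ of 0.5, and the requirement that θ lie in [0, 1] are assumptions made for this illustration; the text only states that θ is an adjustment factor weighting the two similarities.

```python
def association(sv, se, theta=0.5, is_sub_business=False):
    """Degree of association per (1-4): theta * sv + (1 - theta) * se.

    sv: similarity of vertices combined with vertex attributes
    se: edge similarity
    theta: adjustment factor (assumed here to lie in [0, 1])
    is_sub_business: True if one business is a sub-business of the other
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    if is_sub_business:
        return 1.0  # a sub-business is treated as 100% associated
    return theta * sv + (1.0 - theta) * se

# Emphasize shared resources (vertices) slightly over shared logic flows (edges).
rel = association(sv=0.8, se=0.5, theta=0.6)
```

With these inputs the result is 0.6 × 0.8 + 0.4 × 0.5 = 0.68. A larger θ emphasizes shared resource objects and attributes, while a smaller θ emphasizes shared logic flows.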
Four, according to the association degree values between different businesses, a series of applications can be created, for example: a business change impact early-warning system, combing and merging of redundant duplicate business flows, automatic assisted business classification, and so on.
In the functional layer of the metadata system, a series of extended applications can be built on the results of the multi-service correlation analysis based on metadata graph structural similarity.
Specifically, the business change impact early-warning system can assess in advance the impact that changing one business will have on other businesses. If the degree of association between the businesses is high and the impact exceeds the warning threshold, an alarm is issued. This avoids considering only the business being changed while ignoring serious adverse effects on other businesses.
The application for combing and merging redundant duplicate business flows can, according to the degree of association between businesses, find redundant duplicate flows that may exist in those businesses and adjust and merge them, thereby saving resources and cost.
The automatic assisted business classification application can perform automatic assisted classification according to the degree of association between businesses and the existing business categories, reducing the workload of manual classification.
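The early-warning check can be illustrated with a small sketch. It assumes that association degrees have already been computed and stored per pair of businesses, and it simplifies the description by using the association degree itself as the measure of impact. The function name `businesses_to_alert`, the business names, and the threshold value are all hypothetical.

```python
def businesses_to_alert(changed, association, threshold):
    """List businesses whose association with `changed` exceeds the threshold.

    association: dict mapping a (business, business) pair to its degree.
    Returns (business, degree) pairs, highest degree first.
    """
    alerts = []
    for (first, second), degree in association.items():
        if changed in (first, second) and degree > threshold:
            other = second if first == changed else first
            alerts.append((other, degree))
    return sorted(alerts, key=lambda item: item[1], reverse=True)

association = {
    ("billing", "crm"): 0.82,
    ("billing", "reporting"): 0.35,
    ("crm", "reporting"): 0.90,
}
alerts = businesses_to_alert("billing", association, threshold=0.6)
```

Here a planned change to the hypothetical `billing` business would raise an alert for `crm` only, since the association with `reporting` is below the threshold.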
Fig. 4 is a flowchart of the multi-service correlation analysis based on metadata graph structural similarity. As shown in Fig. 4, the overall flow of the multi-service correlation analysis method based on metadata graph structural similarity is as follows:
Step 11: Collect metadata for the businesses that need to be managed.
Here, metadata collection covers resource objects such as the interfaces, source code, and documents of business systems; database tables, views, and stored procedures; rules for ETL data extraction, cleaning, transformation, mapping, and loading; data models described by modeling tools; APIs; and OLAP online analysis data. It also covers the descriptions of relationship rules. Structured data can be obtained through the data dictionary, while unstructured data, including XML files, log files, Webservice interfaces, the Hadoop platform, and the like, is obtained by parsing according to provided standard rules.
Step 12: Establish meta-models according to the specified granularity. Each meta-model mχ is described by several attributes, i.e. mχ = (a1, a2, ..., aκ), and the metadata is classified by meta-model.
Step 13: Describe the metadata according to the meta-models to establish meta objects. Establish metadata relationships rχ,γ, i.e. the relationships between two metadata objects, according to rules such as reference/referenced-by or data outflow/inflow between meta objects. Taking meta objects as vertices and the relationships between meta objects as edges, establish the metadata directed graph G = <V, E>, where the vertices form a set and the edges form an adjacency matrix.
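The construction in step 13 can be sketched as follows, assuming meta objects are identified by unique strings and each relationship is a directed (source, target) pair. The function name `build_metadata_digraph` and the sample identifiers are hypothetical; the adjacency matrix is a plain list of lists in which entry [i][j] is 1 when there is an edge from vertex i to vertex j.

```python
def build_metadata_digraph(meta_objects, relations):
    """Return (index, adjacency) for the directed graph G = <V, E>.

    meta_objects: list of unique vertex identifiers (the set V)
    relations: iterable of (source, target) pairs between meta objects
    """
    index = {obj: i for i, obj in enumerate(meta_objects)}
    size = len(meta_objects)
    adjacency = [[0] * size for _ in range(size)]
    for source, target in relations:
        adjacency[index[source]][index[target]] = 1
    return index, adjacency

# A stored procedure writes a table, and the table contains a field.
meta_objects = ["sp_load", "t_orders", "t_orders.amount"]
relations = [("sp_load", "t_orders"), ("t_orders", "t_orders.amount")]
index, adjacency = build_metadata_digraph(meta_objects, relations)
```

For this input the adjacency matrix is [[0, 1, 0], [0, 0, 1], [0, 0, 0]]. The subgraph for a single business can then be taken as the vertices and edges that the business touches.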
Step 14: On the principle that businesses are associated if they jointly use certain resource objects or certain attributes of those resource objects, calculate, by the aforementioned formulas (1-1) and (1-2), the similarity of vertices combined with vertex attributes of the metadata graph structures corresponding to any two businesses α and β.
Step 15: On the principle that businesses are associated if they all use logic flows from certain resource objects or their attributes to other resource objects or their corresponding attributes, calculate, by the aforementioned formula (1-3), the edge similarity of the metadata graph structures corresponding to any two businesses α and β.
Step 16: Combine the two aspects, namely the similarity of the vertices and vertex attributes of the metadata graph structure and the similarity of its edges, and calculate the degree of association between any two businesses α and β by formula (1-4).
Step 17: Judge, according to the association degree threshold, whether the two businesses α and β are associated. If so, perform step 18; otherwise, return to step 12.
Step 18: Create a series of applications according to the association degree value of the two businesses α and β.
Here, the applications created according to the association degree value include, for example: a business change impact early-warning system, combing and merging of redundant duplicate business flows, automatic assisted business classification, and so on.
With the embodiment of the present invention: 1) the similarity of the vertices, their attributes, and the edges of the metadata directed graph structure is abstracted from the fact that different businesses jointly use certain resource objects or certain of their attributes, and jointly use business logic flows from certain resource objects or their attributes to other resource objects or their corresponding attributes, and the relevance between multiple businesses is measured according to this similarity; 2) the similarity of the vertices, their attributes, and the edges of the metadata directed graph structure is used to analyze the relevance between multiple businesses and to implement the metadata association analysis module of the application layer. The embodiment of the present invention makes up for the inability of existing metadata technology to handle complex relationships between multiple businesses. The proposed multi-service correlation analysis method based on metadata graph structural similarity determines the association relationships between multiple businesses by establishing a relationship graph of metadata objects and comparing the similarity of the graph structure's vertices, their attributes, and its edges. It can directly reflect the degree to which a change in one business affects other businesses. On this basis, a series of applications can be created to implement business change early warning, merging of redundant business flows, automatic assisted business classification, and so on, solving some of the problems faced in big data. The scheme has high practicability in actual applications.
If the integrated module described in the embodiment of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiment of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. Thus, the embodiment of the present invention is not limited to any specific combination of hardware and software.
Accordingly, an embodiment of the present invention also provides a computer storage medium storing a computer program, the computer program being used to perform the multi-service correlation analysis method based on metadata graph structural similarity of the embodiment of the present invention.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention.

Claims (14)

1. A multi-service correlation analysis method based on metadata graph structural similarity, characterized in that the method comprises:
after metadata is obtained from multiple businesses, establishing a relationship graph of metadata objects;
judging whether common metadata objects and metadata object attributes of the same meta-model exist in the relationship graph of the metadata objects, and if so, obtaining the structural similarity of the metadata graph according to the similarity of the vertices, vertex attributes, and edges of the structure in the relationship graph of the metadata objects;
determining the association relationship between the multiple businesses based on the structural similarity of the metadata graph.
2. The method according to claim 1, characterized in that establishing the relationship graph of metadata objects after metadata is obtained from multiple businesses comprises:
dividing the metadata into multiple classes according to different granularities, the descriptive model established for each class being the meta-model;
constituting the metadata objects from instances or entities of the meta-model;
establishing metadata relationships according to the reference or data-flow relationships between the metadata objects, and, with metadata objects as vertices and the relationships between metadata objects as edges, establishing a directed graph of the metadata objects, the directed graph of the metadata objects being used as the relationship graph of the metadata objects.
3. The method according to claim 2, characterized in that the method further comprises:
supporting the representation, by the directed graph of the metadata objects, of the resource objects involved in each business and the relationships between the resource objects.
4. The method according to claim 2, characterized in that obtaining the structural similarity of the metadata graph according to the similarity of the vertices, vertex attributes, and edges of the structure in the relationship graph of the metadata objects comprises:
obtaining the similarity of vertices combined with vertex attributes of the structure in the relationship graph of the metadata objects;
obtaining the similarity of the edges in the relationship graph of the metadata objects;
obtaining the structural similarity of the metadata graph according to the similarity of vertices combined with vertex attributes and the similarity of the edges.
5. The method according to claim 4, characterized in that obtaining the similarity of vertices combined with vertex attributes of the structure in the relationship graph of the metadata objects comprises:
representing each business by a metadata subgraph of the metadata directed graph;
obtaining the proportion of the graph of the specified specification that is accounted for by the common vertices of two metadata subgraphs and their attributes, and calculating, according to the proportion, the similarity of vertices combined with vertex attributes of the metadata subgraph structures corresponding to any two businesses.
6. The method according to claim 4 or 5, characterized in that obtaining the similarity of the edges in the relationship graph of the metadata objects comprises:
representing each business by a metadata subgraph of the metadata directed graph;
obtaining the proportion of the graph of the specified specification that is accounted for by the common edges of two metadata subgraphs, and calculating, according to the proportion, the similarity of the edges of the metadata subgraph structures corresponding to any two businesses.
7. The method according to claim 6, characterized in that determining the association relationship between multiple businesses based on the structural similarity of the metadata graph comprises:
combining the similarity of vertices combined with vertex attributes of the metadata subgraph structures corresponding to any two businesses with the similarity of the edges of the metadata subgraph structures corresponding to the two businesses, to measure the relevance between any different businesses;
according to the aspect of actual concern, adjusting the weights through an adjustment factor to obtain a business association degree value, the association relationship between multiple businesses being determined by the business association degree value.
8. A multi-service correlation analysis device based on metadata graph structural similarity, characterized in that the device comprises:
an establishing unit, configured to establish a relationship graph of metadata objects after metadata is obtained from multiple businesses;
a processing unit, configured to judge whether common metadata objects and metadata object attributes of the same meta-model exist in the relationship graph of the metadata objects, and if so, to obtain the structural similarity of the metadata graph according to the similarity of the vertices, vertex attributes, and edges of the structure in the relationship graph of the metadata objects;
a determining unit, configured to determine the association relationship between the multiple businesses based on the structural similarity of the metadata graph.
9. The device according to claim 8, characterized in that the establishing unit further comprises:
a classification subunit, configured to divide the metadata into multiple classes according to different granularities, the descriptive model established for each class being the meta-model;
a constituting subunit, configured to constitute the metadata objects from instances or entities of the meta-model;
a relationship establishing subunit, configured to establish metadata relationships according to the reference or data-flow relationships between the metadata objects, and, with metadata objects as vertices and the relationships between metadata objects as edges, to establish a directed graph of the metadata objects, the directed graph of the metadata objects being used as the relationship graph of the metadata objects.
10. The device according to claim 9, characterized in that the device further comprises:
supporting the representation, by the directed graph of the metadata objects, of the resource objects involved in each business and the relationships between the resource objects.
11. The device according to claim 9, characterized in that the processing unit further comprises:
a first processing subunit, configured to obtain the similarity of vertices combined with vertex attributes of the structure in the relationship graph of the metadata objects;
a second processing subunit, configured to obtain the similarity of the edges in the relationship graph of the metadata objects;
a third processing subunit, configured to obtain the structural similarity of the metadata graph according to the similarity of vertices combined with vertex attributes and the similarity of the edges.
12. The device according to claim 11, characterized in that the first processing subunit is further configured to:
represent each business by a metadata subgraph of the metadata directed graph;
obtain the proportion of the graph of the specified specification that is accounted for by the common vertices of two metadata subgraphs and their attributes, and calculate, according to the proportion, the similarity of vertices combined with vertex attributes of the metadata subgraph structures corresponding to any two businesses.
13. The device according to claim 11 or 12, characterized in that the second processing subunit is further configured to:
represent each business by a metadata subgraph of the metadata directed graph;
obtain the proportion of the graph of the specified specification that is accounted for by the common edges of two metadata subgraphs, and calculate, according to the proportion, the similarity of the edges of the metadata subgraph structures corresponding to any two businesses.
14. The device according to claim 13, characterized in that the determining unit is further configured to:
combine the similarity of vertices combined with vertex attributes of the metadata subgraph structures corresponding to any two businesses with the similarity of the edges of the metadata subgraph structures corresponding to the two businesses, to measure the relevance between any different businesses;
according to the aspect of actual concern, adjust the weights through an adjustment factor to obtain a business association degree value, the association relationship between multiple businesses being determined by the business association degree value.
CN201610150952.0A 2016-03-16 2016-03-16 Multi-service relevance analysis method and device based on metadata graph structure similarity Active CN107203529B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610150952.0A CN107203529B (en) 2016-03-16 2016-03-16 Multi-service relevance analysis method and device based on metadata graph structure similarity

Publications (2)

Publication Number Publication Date
CN107203529A true CN107203529A (en) 2017-09-26
CN107203529B CN107203529B (en) 2020-02-21

Family

ID=59903515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610150952.0A Active CN107203529B (en) 2016-03-16 2016-03-16 Multi-service relevance analysis method and device based on metadata graph structure similarity

Country Status (1)

Country Link
CN (1) CN107203529B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1460580A1 (en) * 2001-12-14 2004-09-22 NEC Corporation Face meta-data creation and face similarity calculation
CN102239458A (en) * 2008-12-02 2011-11-09 起元技术有限责任公司 Visualizing relationships between data elements
CN102982168A (en) * 2012-12-12 2013-03-20 江苏省电力公司信息通信分公司 Metadata schema matching method based on XML (extensive markup language) document
CN104850632A (en) * 2015-05-22 2015-08-19 东北师范大学 Generic similarity calculation method and system based on heterogeneous information network

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416525B (en) * 2018-03-13 2020-10-30 三峡大学 Flow model similarity measurement method based on metadata
CN108416525A (en) * 2018-03-13 2018-08-17 三峡大学 A kind of procedural model method for measuring similarity based on metadata
CN109766940B (en) * 2018-12-29 2024-02-02 北京天诚同创电气有限公司 Method and apparatus for evaluating similarity between multiple sewage treatment systems
CN109766940A (en) * 2018-12-29 2019-05-17 北京天诚同创电气有限公司 The method and apparatus for assessing the similarity between multiple sewage disposal systems
CN111951035A (en) * 2019-05-17 2020-11-17 上海树融数据科技有限公司 Consumption analysis method, system, device and consumption analysis platform
CN111951035B (en) * 2019-05-17 2024-06-11 嘉兴树融数据科技有限公司 Consumption analysis method, system, device and platform
CN110287223A (en) * 2019-06-24 2019-09-27 北京明略软件系统有限公司 Information storage means and device, electronic device and storage medium
CN110795524A (en) * 2019-10-31 2020-02-14 北京东软望海科技有限公司 Main data mapping processing method and device, computer equipment and storage medium
CN110795524B (en) * 2019-10-31 2022-07-05 望海康信(北京)科技股份公司 Main data mapping processing method and device, computer equipment and storage medium
WO2022242524A1 (en) * 2021-05-19 2022-11-24 中兴通讯股份有限公司 Modeling method, network element data processing method and apparatus, electronic device, and medium
CN113687825A (en) * 2021-08-25 2021-11-23 恒安嘉新(北京)科技股份公司 Software module construction method, device, equipment and storage medium
CN113687825B (en) * 2021-08-25 2023-12-12 恒安嘉新(北京)科技股份公司 Method, device, equipment and storage medium for constructing software module
CN116149831B (en) * 2023-04-20 2023-08-11 山东海量信息技术研究院 Task scheduling method, system, electronic device, quantum cloud system and storage medium
CN116149831A (en) * 2023-04-20 2023-05-23 山东海量信息技术研究院 Task scheduling method, system, electronic device, quantum cloud system and storage medium

Also Published As

Publication number Publication date
CN107203529B (en) 2020-02-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant