CN109359289B - Web service function similarity measurement method based on ontology - Google Patents
Web service function similarity measurement method based on ontology
- Publication number
- CN109359289B (application CN201810939188.4A)
- Authority
- CN
- China
- Prior art keywords
- prop
- concept
- similarity
- service
- sim
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Data Mining & Analysis (AREA)
- Probability & Statistics with Applications (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An ontology-based method for measuring the functional similarity of Web services comprises the following steps: first, calculating the semantic similarity between two concepts A and B in a domain ontology; second, using the concept similarity calculation of the first step to give a method for calculating the input similarity Sim_input of service S_1 and service S_2; third, using the concept similarity calculation of the first step to give a method for calculating the output similarity Sim_output of service S_1 and service S_2; fourth, combining the service input similarity Sim_input and the service output similarity Sim_output obtained in the second and third steps to calculate the functional similarity FunctionalSim(S_1, S_2) of service S_1 and service S_2. The invention can reasonably measure the functional similarity between services and optimize the service clustering effect.
Description
Technical Field
The invention relates to the field of Web service evolution, in particular to a Web service function similarity measurement method based on an ontology.
Background
A Web service is a software system designed to support machine-to-machine interaction across a network. There are currently two main types of Web services: SOAP-based and REST-based. They differ in the interface they use. SOAP-based Web services pass messages through a SOAP interface and are described with the Web Services Description Language (WSDL). WSDL gives Web service providers a mechanism that is independent of protocol and encoding; it is an XML vocabulary that describes the services accessible on the network and maps them to a collection of communication endpoints capable of exchanging messages. Web services with REST interfaces use the generic HTTP methods (GET, DELETE, POST, and PUT) to describe, publish, and use the relevant resources.
Current research efforts aim to provide semantic descriptions of Web services by using conceptualized knowledge called ontologies. An ontology is a vocabulary that describes a set of concepts within a domain (a domain may be a particular subject area or knowledge area) and the relationships between those concepts. It is used to reason about the attributes of a domain, or to define the domain itself. In the context of Web services, ontologies play an important role as a way of providing semantic descriptions of the services. Richer Web service descriptions have prompted the development of semantic Web services, which are described in a machine-understandable manner. Semantic Web services are expected to have a significant impact on areas such as e-commerce and application integration, because they enable dynamic, extensible and efficient collaboration between different systems and organizations.
With the continuous development of Web services, the Web services on the Internet need to evolve continuously in order to adapt to changes in the environment and in user requirements. Web service evolution has therefore become one of the important research topics in the field of service computing. Because Web services are also an important technology for building software services, research on them has significant theoretical and practical value for enabling software systems to run adaptively and for supporting the dynamic evolution of services.
The evolution of Web services generally refers to the process by which a service changes after it has been released and put into operation, in order to adapt to environmental changes and keep meeting user requirements. Because Web services are dynamic, heterogeneous, autonomous and distributed, and because most of the services a system integrates come from different organizations, the evolution of Web services faces more challenges than traditional software evolution.
In the field of service evolution research, many researchers have proposed optimization schemes for measuring the semantic similarity between services. Ana G. Maguitman of the computer science department of the School of Informatics at Indiana University, USA, introduced graph theory into the calculation of semantic similarity and proposed a calculation method. Dekang Lin of the University of Manitoba, Canada, proposed that the calculation of semantic similarity should consider not only the amount of information shared between concepts but also the amount of information in which they differ. Yuhua Li et al. of Manchester Metropolitan University also proposed a method for measuring semantic similarity, which takes into account the influence of concept density on semantic similarity.
Disclosure of Invention
Web service evolution generally uses service clustering to narrow the search space of service samples, so that service matching can be performed within a particular cluster rather than in a large pool containing many unrelated services. Calculating the similarity between objects is in general an important step of a clustering algorithm. When calculating the similarity between ontology concept attributes, the invention takes into account that the same attribute may carry different names, and it calculates the similarity between attribute names with a word semantic similarity measurement method based on a naive Bayes model. It also strengthens the influence of the difference in parameter counts on the service input similarity through the ratio of the parameter count difference to the parameter count. As a result, the functional similarity between services can be measured reasonably and the service clustering effect is optimized.
In order to solve the technical problems, the invention adopts the technical scheme that:
a Web service function similarity measurement method based on an ontology comprises the following steps:
first step: calculate the semantic similarity between two concepts A and B in a domain ontology, as follows:
step (1.1) if concepts A and B are identical, or they are declared as equivalent classes, the similarity Sim_concept of concepts A and B is 1; otherwise, go to step (1.2);
step (1.2) if concept A is directly or indirectly a subclass of concept B, the similarity Sim_concept of concepts A and B is calculated as follows:
where prop(A) and prop(B) denote the attribute sets of concept A and concept B, and Size(prop(B)) and Size(prop(A)) denote the number of attributes of concept B and concept A, respectively; otherwise, go to step (1.3);
step (1.3) if concept B is directly or indirectly a subclass of concept A, the similarity Sim_concept of concepts A and B is calculated as follows:
step (1.4) if concept A and concept B have no parent-child relationship but directly or indirectly share a common parent concept C, a word semantic similarity measurement method based on a naive Bayes model is used: first, all attributes of concept A and concept B are traversed and features are extracted from their attribute names by the ComputeFeature function; then the conditional probability distribution table and the adjustment factors obtained from sample training are used to calculate the similarity Sim_word between concept attributes; Sim_word is compared with the similarity judgment factor η to decide whether the two attributes are the same attribute, and the matches are counted; finally, the similarity Sim_concept of concepts A and B is calculated;
step (1.5) if the relationship between concept A and concept B satisfies none of the above conditions, the similarity Sim_concept of concepts A and B is set to 0;
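The case analysis of steps (1.1) to (1.5) can be sketched as a small dispatch function. This is only an illustration under stated assumptions: the ontology is modelled as a single-inheritance tree, and because the formulas of steps (1.2) to (1.4) are not reproduced in this text, they are passed in as callables. The names `Ontology`, `concept_similarity`, `subclass_sim` and `sibling_sim` are hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


class Ontology:
    """Toy domain ontology: a single-parent class tree plus declared equivalences."""

    def __init__(self, parent: Dict[str, Optional[str]],
                 equivalent: Iterable[Tuple[str, str]] = ()):
        self.parent = parent
        self.equivalent = {frozenset(pair) for pair in equivalent}

    def ancestors(self, concept: str) -> List[str]:
        """All direct and indirect parent classes of a concept."""
        out = []
        p = self.parent.get(concept)
        while p is not None:
            out.append(p)
            p = self.parent.get(p)
        return out

    def is_subclass(self, child: str, parent: str) -> bool:
        return parent in self.ancestors(child)

    def have_common_parent(self, a: str, b: str) -> bool:
        return bool(set(self.ancestors(a)) & set(self.ancestors(b)))


def concept_similarity(a: str, b: str, onto: Ontology,
                       subclass_sim: Callable[[str, str], float],
                       sibling_sim: Callable[[str, str], float]) -> float:
    """Dispatch over steps (1.1)-(1.5); the two formulas are injected (assumption)."""
    if a == b or frozenset((a, b)) in onto.equivalent:   # step (1.1)
        return 1.0
    if onto.is_subclass(a, b):                           # step (1.2): A below B
        return subclass_sim(a, b)                        # called as (child, parent)
    if onto.is_subclass(b, a):                           # step (1.3): B below A
        return subclass_sim(b, a)
    if onto.have_common_parent(a, b):                    # step (1.4): shared parent C
        return sibling_sim(a, b)
    return 0.0                                           # step (1.5)
```

Passing the formulas as parameters keeps the control flow testable without committing to a particular form for the attribute-count ratios.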
second step: using the concept similarity calculation method of the first step, the input similarity Sim_input of service S_1 and service S_2 is calculated as follows:
step (2.1) create and initialize the maximum-matching array InSim of service input parameter similarities, and go to step (2.2);
step (2.2) subtract the number of input parameters of service S_2 from the number of input parameters of service S_1 to obtain the parameter count difference d, and go to step (2.3);
step (2.3) if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise, set service S_2 as S_short and service S_1 as S_long; go to step (2.4);
step (2.4) traverse the input parameters of S_long: if the traversal is complete, go to step (2.8); otherwise, take the next input parameter long_i from S_long and go to step (2.5);
step (2.5) traverse the input parameters of S_short: if the traversal is complete, return to step (2.4); otherwise, take the next input parameter short_j from S_short and go to step (2.6);
step (2.6) calculate the similarity Sim_ij of parameter long_i and parameter short_j using the concept similarity calculation method of the first step, and go to step (2.7);
step (2.7) compare Sim_ij with InSim[i]: if Sim_ij is greater than InSim[i], set InSim[i] to Sim_ij; otherwise, InSim[i] keeps its original value; return to step (2.5);
step (2.8) calculate the input similarity Sim_input of service S_1 and service S_2 as follows:
where Size(S_long.Input) and Size(S_short.Input) denote the number of input parameters of service S_long and of service S_short, respectively, |d| denotes the absolute difference between the numbers of input parameters of the two services, and InSim is the maximum-matching array of input parameter similarities;
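Steps (2.1) to (2.7) amount to a nested loop that records, for every parameter of the longer list, its best match in the shorter list. The sketch below illustrates that loop only; the closing formula of step (2.8) is not reproduced in this text, so the function stops at returning InSim and d. The name `max_matching`, the list-of-concept-names representation, and the initialisation of InSim to zeros are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def max_matching(params_1: Sequence[str], params_2: Sequence[str],
                 concept_sim: Callable[[str, str], float]) -> Tuple[List[float], int]:
    """Return (InSim, d) for the parameter concepts of services S_1 and S_2."""
    d = len(params_1) - len(params_2)              # step (2.2)
    if d <= 0:                                     # step (2.3)
        short, long_ = params_1, params_2
    else:
        short, long_ = params_2, params_1
    in_sim = [0.0] * len(long_)                    # step (2.1), zeros assumed
    for i, long_i in enumerate(long_):             # step (2.4)
        for short_j in short:                      # step (2.5)
            sim_ij = concept_sim(long_i, short_j)  # step (2.6)
            if sim_ij > in_sim[i]:                 # step (2.7)
                in_sim[i] = sim_ij
    return in_sim, d
```

The same function applies unchanged to output parameters in the third step, since steps (3.1) to (3.7) mirror steps (2.1) to (2.7).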
third step: using the concept similarity calculation method of the first step, the output similarity Sim_output of service S_1 and service S_2 is calculated as follows:
step (3.1) create and initialize the maximum-matching array OutSim of service output parameter similarities, and go to step (3.2);
step (3.2) subtract the number of output parameters of service S_2 from the number of output parameters of service S_1 to obtain the parameter count difference d, and go to step (3.3);
step (3.3) if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise, set service S_2 as S_short and service S_1 as S_long; go to step (3.4);
step (3.4) traverse the output parameters of S_long: if the traversal is complete, go to step (3.8); otherwise, take the next output parameter long_i from S_long and go to step (3.5);
step (3.5) traverse the output parameters of S_short: if the traversal is complete, return to step (3.4); otherwise, take the next output parameter short_j from S_short and go to step (3.6);
step (3.6) calculate the similarity Sim_ij of parameter long_i and parameter short_j using the concept similarity calculation method of the first step, and go to step (3.7);
step (3.7) compare Sim_ij with OutSim[i]: if Sim_ij is greater than OutSim[i], set OutSim[i] to Sim_ij; otherwise, OutSim[i] keeps its original value; return to step (3.5);
step (3.8) calculate the output similarity Sim_output of service S_1 and service S_2 as follows:
where Size(S_long.Output) and Size(S_short.Output) denote the number of output parameters of service S_long and of service S_short, respectively, |d| denotes the absolute difference between the numbers of output parameters of the two services, and OutSim is the maximum-matching array of output parameter similarities;
fourth step: combine the service input similarity Sim_input and the service output similarity Sim_output obtained in the second and third steps to calculate the functional similarity FunctionalSim(S_1, S_2) of service S_1 and service S_2 as follows:
FunctionalSim(S_1, S_2) = w_1 × Sim_input + w_2 × Sim_output, where the weights w_1 and w_2 are real numbers between 0 and 1 that sum to 1; they represent the importance the service consumer attaches to input similarity and to output similarity.
Further, step (1.4) comprises the following steps:
step (1.4.1) set a variable i to represent the number of identical attributes in concept A and concept B, with an initial value of 0, and go to step (1.4.2);
step (1.4.2) if the traversal of the attribute set prop(A) of concept A is complete, go to step (1.4.7); otherwise, take the next attribute prop(A)_j from prop(A), remove it from prop(A), and go to step (1.4.3);
step (1.4.3) if the traversal of the attribute set prop(B) of concept B is complete, return to step (1.4.2); otherwise, take the next attribute prop(B)_k from prop(B), remove it from prop(B), and go to step (1.4.4);
step (1.4.4) based on the naive Bayes model and combined with the WordNet English dictionary, extract features from the attribute names of prop(A)_j and prop(B)_k with the ComputeFeature function to obtain L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k), as follows:
calculate the word senses of each attribute name; since each word corresponds to one or more senses, each word pair corresponds to one or more semantic pairs; the shortest semantic-node distance among all semantic pairs corresponding to the word pair is defined as the word-pair distance L(prop(A)_j, prop(B)_k), and the depth of the semantic pair with the shortest semantic-node distance is defined as the word-pair depth D(prop(A)_j, prop(B)_k); given that attribute name prop(A)_j occurs in the synonym sets of semantic nodes v_j1, v_j2, …, v_jn and attribute name prop(B)_k occurs in the synonym sets of semantic nodes v_k1, v_k2, …, v_km, the distance and the depth of prop(A)_j and prop(B)_k are calculated as follows:
where L(v_ja, v_kb) denotes the distance between semantic node v_ja and semantic node v_kb, and D(v_ja, v_kb) denotes the depth of the semantic pair (v_ja, v_kb);
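The distance and depth formulas themselves are not reproduced in this text. Read literally, the prose definition above suggests the following form, given here as a reconstruction rather than as the original notation; the indices a* and b* are introduced only for illustration.

```latex
\begin{aligned}
L\bigl(\mathrm{prop}(A)_j,\mathrm{prop}(B)_k\bigr)
  &= \min_{1 \le a \le n,\; 1 \le b \le m} L(v_{ja}, v_{kb}) \\
(a^{*}, b^{*})
  &= \operatorname*{arg\,min}_{1 \le a \le n,\; 1 \le b \le m} L(v_{ja}, v_{kb}) \\
D\bigl(\mathrm{prop}(A)_j,\mathrm{prop}(B)_k\bigr)
  &= D(v_{ja^{*}}, v_{kb^{*}})
\end{aligned}
```

In words: the word-pair distance is the minimum node distance over all n × m semantic pairs, and the word-pair depth is the depth of the pair that attains that minimum.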
further, mean functions LW(i) and DW(o) are generated from the training set of the naive Bayes model, and the conditional probability distribution tables P(L(prop(A)_j, prop(B)_k) | C) and P(D(prop(A)_j, prop(B)_k) | C) are then calculated using LW(i) and DW(o), where C is the word-pair class with value range {U, N}, U denoting "consistent" and N denoting "inconsistent"; finally, the adjustment factors α and β are calculated as follows:
step (1.4.5) based on the naive Bayes model, the features L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k) of the ontology concept attributes are looked up in the conditional probability distribution tables obtained in step (1.4.4), and the conditional probabilities V_1 = P(L(prop(A)_j, prop(B)_k) = i | C = U), V_2 = P(D(prop(A)_j, prop(B)_k) = o | C = U), V_3 = P(L(prop(A)_j, prop(B)_k) = i | C = N) and V_4 = P(D(prop(A)_j, prop(B)_k) = o | C = N) are extracted in turn; finally, with the adjustment factors α and β from step (1.4.4), the similarity Sim_word between prop(A)_j and prop(B)_k is calculated as follows:
Sim_word(prop(A)_j, prop(B)_k) = (α × V_1 × V_2) / (α × V_1 × V_2 + β × V_3 × V_4); go to step (1.4.6);
step (1.4.6) if Sim_word is greater than or equal to the similarity judgment factor η, prop(A)_j and prop(B)_k are the same attribute: add 1 to the variable i and return to step (1.4.2); otherwise, return to step (1.4.3);
step (1.4.7) calculate the similarity Sim_concept of concepts A and B as follows:
where i denotes the number of identical attributes in concept A and concept B, prop(A) and prop(B) denote the attribute sets of concept A and concept B, and Size(prop(B)) and Size(prop(A)) denote the number of attributes of concept B and concept A, respectively; go to step (1.5).
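The counting loop of steps (1.4.1) to (1.4.6) can be sketched as follows. This is one interpretation, not the patent's definitive procedure: the sketch assumes that an attribute of B is withdrawn only once it has been matched, so that it cannot be counted twice, and it returns only the count i because the formula of step (1.4.7) is not reproduced in this text. The names `count_same_attributes` and `word_sim` are hypothetical.

```python
from typing import Callable, Sequence


def count_same_attributes(props_a: Sequence[str], props_b: Sequence[str],
                          word_sim: Callable[[str, str], float],
                          eta: float) -> int:
    """Count attribute pairs whose name similarity reaches the threshold eta."""
    remaining_b = list(props_b)      # working copy of prop(B)
    i = 0                            # step (1.4.1)
    for prop_a in props_a:           # step (1.4.2)
        for prop_b in remaining_b:   # step (1.4.3)
            # steps (1.4.4)-(1.4.5) are hidden behind word_sim
            if word_sim(prop_a, prop_b) >= eta:   # step (1.4.6)
                i += 1
                remaining_b.remove(prop_b)  # assumption: a matched attribute is not reused
                break                       # back to step (1.4.2)
    return i
```

With a threshold such as η = 0.5, attribute names like "price" and "cost" would count as one shared attribute as long as `word_sim` scores them at or above 0.5.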
The advantages of the method are as follows. When calculating the similarity between ontology concept attributes, it takes into account that the same attribute may carry different names, and it calculates the similarity between attribute names with a word semantic similarity measurement method based on a naive Bayes model; this avoids, to a certain extent, the errors caused by simply testing whether attribute names are equal. In addition, when calculating the input (output) similarity, the algorithm considers the parameter count difference d and strengthens the influence of that difference on the service input similarity through the ratio of the count difference to the parameter count, so that the functional similarity between services can be measured reasonably.
Detailed Description
The present invention is further explained below.
An ontology-based Web service function similarity measurement method considers the domain ontology concepts of the inputs and outputs of services; matching between inputs (outputs) mainly means matching the concepts associated with those inputs (outputs). To calculate the similarity of two concepts A and B, the relationship between the two concepts in the domain ontology must be considered.
A domain ontology is a specialized ontology that describes the knowledge of a given domain. The "domain" is established according to the needs of the ontology builder; it may be a subject area, a combination of several areas, or a small part of one area. If two concepts in a domain ontology have different names but the same set of individuals, they are called equivalent classes.
The measuring method comprises the following steps:
first step: calculate the semantic similarity between two concepts A and B in a domain ontology, as follows:
step (1.1) if concepts A and B are identical, or they are declared as equivalent classes, the similarity Sim_concept of concepts A and B is 1; otherwise, go to step (1.2);
step (1.2) if concept A is directly or indirectly a subclass of concept B, the similarity Sim_concept of concepts A and B is calculated as follows:
where prop(A) and prop(B) denote the attribute sets of concept A and concept B, and Size(prop(B)) and Size(prop(A)) denote the number of attributes of concept B and concept A, respectively; otherwise, go to step (1.3);
step (1.3) if concept B is directly or indirectly a subclass of concept A, the similarity Sim_concept of concepts A and B is calculated as follows:
step (1.4) if concept A and concept B have no parent-child relationship but directly or indirectly share a common parent concept C, a word semantic similarity measurement method based on a naive Bayes model is used: first, all attributes of concept A and concept B are traversed and features are extracted from their attribute names by the ComputeFeature function; then the conditional probability distribution table and the adjustment factors obtained from sample training are used to calculate the similarity Sim_word between concept attributes; Sim_word is compared with the similarity judgment factor η to decide whether the two attributes are the same attribute, and the matches are counted; finally, the similarity Sim_concept of concepts A and B is calculated. The steps are as follows:
step (1.4.1) set a variable i to represent the number of identical attributes in concept A and concept B, with an initial value of 0, and go to step (1.4.2);
step (1.4.2) if the traversal of the attribute set prop(A) of concept A is complete, go to step (1.4.7); otherwise, take the next attribute prop(A)_j from prop(A), remove it from prop(A), and go to step (1.4.3);
step (1.4.3) if the traversal of the attribute set prop(B) of concept B is complete, return to step (1.4.2); otherwise, take the next attribute prop(B)_k from prop(B), remove it from prop(B), and go to step (1.4.4);
step (1.4.4) based on the naive Bayes model and combined with the WordNet English dictionary, extract features from the attribute names of prop(A)_j and prop(B)_k with the ComputeFeature function to obtain L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k). The naive Bayes model is one of the two most widely used classification models, and WordNet is an English dictionary created and maintained by the Cognitive Science Laboratory at Princeton University under the direction of psychology professor George A. Miller. The specific calculation process is as follows:
the word senses of each attribute name are calculated; since each word corresponds to one or more senses, each word pair corresponds to one or more semantic pairs. The shortest semantic-node distance among all semantic pairs corresponding to the word pair is defined as the word-pair distance L(prop(A)_j, prop(B)_k), and the depth of the semantic pair with the shortest semantic-node distance is defined as the word-pair depth D(prop(A)_j, prop(B)_k). Given that attribute name prop(A)_j occurs in the synonym sets of semantic nodes v_j1, v_j2, …, v_jn and attribute name prop(B)_k occurs in the synonym sets of semantic nodes v_k1, v_k2, …, v_km, the distance and the depth of prop(A)_j and prop(B)_k are calculated as follows:
where L(v_ja, v_kb) denotes the distance between semantic node v_ja and semantic node v_kb, and D(v_ja, v_kb) denotes the depth of the semantic pair (v_ja, v_kb).
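A toy version of this feature extraction can make the definitions concrete. The sketch below does not use WordNet; it replaces it with a small hand-written taxonomy (`PARENT`) and assumes that the distance between two semantic nodes is the number of edges on the path through their lowest common ancestor and that the depth of a semantic pair is the depth of that ancestor (root at depth 0). Both conventions, the taxonomy, and the function names are illustrative assumptions; `compute_feature` is a hypothetical stand-in for the ComputeFeature function, whose internals the text does not give.

```python
from itertools import product
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical toy taxonomy standing in for WordNet: node -> parent node.
PARENT: Dict[str, Optional[str]] = {
    "entity": None, "object": "entity", "vehicle": "object",
    "car": "vehicle", "bicycle": "vehicle",
    "location": "entity", "city": "location",
}


def path_to_root(node: str) -> List[str]:
    """The node followed by all of its ancestors up to the root."""
    path = [node]
    while PARENT[node] is not None:
        node = PARENT[node]
        path.append(node)
    return path


def node_distance_and_depth(u: str, v: str) -> Tuple[int, int]:
    """(edge count between u and v, depth of their lowest common ancestor)."""
    pu, pv = path_to_root(u), path_to_root(v)
    index_in_pv = {n: i for i, n in enumerate(pv)}
    for i, n in enumerate(pu):
        if n in index_in_pv:                     # first shared ancestor seen from u
            distance = i + index_in_pv[n]
            depth = len(path_to_root(n)) - 1     # root has depth 0
            return distance, depth
    raise ValueError("nodes share no ancestor")


def compute_feature(senses_a: Sequence[str], senses_b: Sequence[str]) -> Tuple[int, int]:
    """Word-pair (L, D): taken from the semantic pair with the shortest distance."""
    pairs = (node_distance_and_depth(u, v) for u, v in product(senses_a, senses_b))
    return min(pairs, key=lambda t: t[0])
```

For example, a word whose senses are "car" and "city" compared with a word whose only sense is "bicycle" yields the features of the (car, bicycle) pair, because that pair has the shorter path.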
Further, mean functions LW(i) and DW(o) are generated from the training set of the naive Bayes model, and the conditional probability distribution tables P(L(prop(A)_j, prop(B)_k) | C) and P(D(prop(A)_j, prop(B)_k) | C) are then calculated using LW(i) and DW(o), where C is the word-pair class with value range {U, N}, U denoting "consistent" and N denoting "inconsistent". Finally, the adjustment factors α and β are calculated as follows:
step (1.4.5) based on the naive Bayes model, the features L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k) of the ontology concept attributes are looked up in the conditional probability distribution tables obtained in step (1.4.4), and the conditional probabilities V_1 = P(L(prop(A)_j, prop(B)_k) = i | C = U), V_2 = P(D(prop(A)_j, prop(B)_k) = o | C = U), V_3 = P(L(prop(A)_j, prop(B)_k) = i | C = N) and V_4 = P(D(prop(A)_j, prop(B)_k) = o | C = N) are extracted in turn. Finally, with the adjustment factors α and β from step (1.4.4), the similarity Sim_word between prop(A)_j and prop(B)_k is calculated as follows:
Sim_word(prop(A)_j, prop(B)_k) = (α × V_1 × V_2) / (α × V_1 × V_2 + β × V_3 × V_4); go to step (1.4.6);
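The Sim_word formula is a normalised ratio of the weighted "consistent" evidence to the total evidence, and it can be checked with a few lines of code. The function name `sim_word` and the handling of a zero denominator are assumptions added for illustration; the text does not say what happens when both products are zero.

```python
def sim_word(v1: float, v2: float, v3: float, v4: float,
             alpha: float, beta: float) -> float:
    """Sim_word = (alpha*V1*V2) / (alpha*V1*V2 + beta*V3*V4)."""
    consistent = alpha * v1 * v2      # evidence for class U ("consistent")
    inconsistent = beta * v3 * v4     # evidence for class N ("inconsistent")
    total = consistent + inconsistent
    if total == 0:
        return 0.0                    # assumption: no evidence counts as no similarity
    return consistent / total
```

With V_1 = 0.6, V_2 = 0.5, V_3 = 0.2, V_4 = 0.5 and α = β = 1, the numerator is 0.3 and the denominator 0.4, so Sim_word is 0.75; a judgment factor η below that value would treat the two attribute names as the same attribute.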
step (1.4.6) if Sim_word is greater than or equal to the similarity judgment factor η, prop(A)_j and prop(B)_k are the same attribute: add 1 to the variable i and return to step (1.4.2); otherwise, return to step (1.4.3);
step (1.4.7) calculate the similarity Sim_concept of concepts A and B as follows:
where i denotes the number of identical attributes in concept A and concept B, prop(A) and prop(B) denote the attribute sets of concept A and concept B, and Size(prop(B)) and Size(prop(A)) denote the number of attributes of concept B and concept A, respectively; go to step (1.5);
step (1.5) if the relationship between concept A and concept B satisfies none of the above conditions, the similarity Sim_concept of concepts A and B is set to 0;
second step: using the concept similarity calculation method of the first step, the input similarity Sim_input of service S_1 and service S_2 is calculated as follows:
step (2.1) create and initialize the maximum-matching array InSim of service input parameter similarities, and go to step (2.2);
step (2.2) subtract the number of input parameters of service S_2 from the number of input parameters of service S_1 to obtain the parameter count difference d, and go to step (2.3);
step (2.3) if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise, set service S_2 as S_short and service S_1 as S_long; go to step (2.4);
step (2.4) traverse the input parameters of S_long: if the traversal is complete, go to step (2.8); otherwise, take the next input parameter long_i from S_long and go to step (2.5);
step (2.5) traverse the input parameters of S_short: if the traversal is complete, return to step (2.4); otherwise, take the next input parameter short_j from S_short and go to step (2.6);
step (2.6) calculate the similarity Sim_ij of parameter long_i and parameter short_j using the concept similarity calculation method of the first step, and go to step (2.7);
step (2.7) compare Sim_ij with InSim[i]: if Sim_ij is greater than InSim[i], set InSim[i] to Sim_ij; otherwise, InSim[i] keeps its original value; return to step (2.5);
step (2.8) calculate the input similarity Sim_input of service S_1 and service S_2 as follows:
where Size(S_long.Input) and Size(S_short.Input) denote the number of input parameters of service S_long and of service S_short, respectively, |d| denotes the absolute difference between the numbers of input parameters of the two services, and InSim is the maximum-matching array of input parameter similarities;
third step: using the concept similarity calculation method of the first step, the output similarity Sim_output of service S_1 and service S_2 is calculated as follows:
step (3.1) create and initialize the maximum-matching array OutSim of service output parameter similarities, and go to step (3.2);
step (3.2) subtract the number of output parameters of service S_2 from the number of output parameters of service S_1 to obtain the parameter count difference d, and go to step (3.3);
step (3.3) if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise, set service S_2 as S_short and service S_1 as S_long; go to step (3.4);
step (3.4) traverse the output parameters of S_long: if the traversal is complete, go to step (3.8); otherwise, take the next output parameter long_i from S_long and go to step (3.5);
step (3.5) traverse the output parameters of S_short: if the traversal is complete, return to step (3.4); otherwise, take the next output parameter short_j from S_short and go to step (3.6);
step (3.6) calculate the similarity Sim_ij of parameter long_i and parameter short_j using the concept similarity calculation method of the first step, and go to step (3.7);
step (3.7) compare Sim_ij with OutSim[i]: if Sim_ij is greater than OutSim[i], set OutSim[i] to Sim_ij; otherwise, OutSim[i] keeps its original value; return to step (3.5);
step (3.8) calculate the output similarity Sim_output of service S_1 and service S_2 as follows:
where Size(S_long.Output) and Size(S_short.Output) denote the number of output parameters of service S_long and of service S_short, respectively, |d| denotes the absolute difference between the numbers of output parameters of the two services, and OutSim is the maximum-matching array of output parameter similarities;
fourth step: combine the service input similarity Sim_input and the service output similarity Sim_output obtained in the second and third steps to calculate the functional similarity FunctionalSim(S_1, S_2) of service S_1 and service S_2 as follows:
FunctionalSim(S_1, S_2) = w_1 × Sim_input + w_2 × Sim_output, where the weights w_1 and w_2 are real numbers between 0 and 1 that sum to 1. They represent the importance the service consumer attaches to input similarity and to output similarity. By default, w_1 and w_2 are both set to 0.5.
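The final combination is a plain weighted sum, which the following sketch illustrates with the default weights of 0.5 stated above. The function name `functional_sim` and the explicit validation of the weights are illustrative additions, not part of the described method.

```python
def functional_sim(sim_input: float, sim_output: float,
                   w1: float = 0.5, w2: float = 0.5) -> float:
    """FunctionalSim(S_1, S_2) = w1 * Sim_input + w2 * Sim_output."""
    # The weights must lie in [0, 1] and sum to 1, as required by the fourth step.
    if not (0.0 <= w1 <= 1.0 and 0.0 <= w2 <= 1.0):
        raise ValueError("weights must lie between 0 and 1")
    if abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w1 * sim_input + w2 * sim_output
```

A consumer who cares mainly about what a service returns could, for instance, call `functional_sim(sim_in, sim_out, w1=0.3, w2=0.7)` to weight output similarity more heavily.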
Claims (2)
1. An ontology-based Web service function similarity measurement method, characterized by comprising the following steps:
a first step: calculate the semantic similarity between two concepts A and B in a domain ontology, as follows:
step (1.1): if concepts A and B are identical or are declared equivalent classes, the similarity Sim_concept of concepts A and B is 1; otherwise, proceed to step (1.2);
step (1.2): if concept A is directly or indirectly a subclass of concept B, the similarity Sim_concept of concepts A and B is calculated by the following formula:
where prop(A) and prop(B) denote the attribute sets of concept A and concept B respectively, and Size(prop(B)) and Size(prop(A)) denote the number of attributes of concept B and concept A respectively; otherwise, proceed to step (1.3);
step (1.3): if concept B is directly or indirectly a subclass of concept A, the similarity Sim_concept of concepts A and B is calculated by the following formula:
step (1.4): if concept A and concept B have no parent-child relationship but directly or indirectly share a common parent concept C, a word semantic similarity measurement method based on a naive Bayes model is adopted: first, the attributes of concept A and concept B are traversed respectively, and features are extracted from their attribute names through the computeFeature function; then the conditional probability distribution lists and adjustment factors obtained from sample training are used to calculate the similarity Sim_word between concept attributes; Sim_word is compared with the similarity judgment factor η to judge whether two attributes are the same attribute, and the matches are counted; finally, the similarity Sim_concept of concepts A and B is calculated;
step (1.5): if the relationship between concept A and concept B meets none of the above conditions, the similarity Sim_concept of concepts A and B is set to 0;
a second step: based on the concept similarity calculation method of the first step, the input similarity Sim_input of service S_1 and service S_2 is calculated as follows:
step (2.1): create and initialize the maximum matching array InSim of service input parameter similarities; proceed to step (2.2);
step (2.2): subtract the number of input parameters of service S_2 from the number of input parameters of service S_1 to obtain the parameter-count difference d; proceed to step (2.3);
step (2.3): if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise set service S_2 as S_short and service S_1 as S_long; proceed to step (2.4);
step (2.4): traverse S_long; if the traversal is complete, go to step (2.8); otherwise take the next input parameter long_i from S_long and proceed to step (2.5);
step (2.5): traverse S_short; if the traversal is complete, return to step (2.4); otherwise take the next input parameter short_j from S_short and proceed to step (2.6);
step (2.6): calculate the similarity Sim_ij between parameter long_i and parameter short_j using the concept similarity calculation method of the first step; proceed to step (2.7);
step (2.7): compare Sim_ij with InSim[i]; if Sim_ij is greater than InSim[i], set InSim[i] to Sim_ij; otherwise InSim[i] keeps its original value; return to step (2.5);
step (2.8): calculate the input similarity Sim_input of service S_1 and service S_2; the calculation formula is as follows:
where Size(S_long.Input) and Size(S_short.Input) denote the number of input parameters of service S_long and service S_short respectively, |d| denotes the absolute difference in the number of input parameters of the two services, and InSim is the maximum matching array of input parameter similarities;
a third step: based on the concept similarity calculation method of the first step, the output similarity Sim_output of service S_1 and service S_2 is calculated as follows:
step (3.1): create and initialize the maximum matching array OutSim of service output parameter similarities; proceed to step (3.2);
step (3.2): subtract the number of output parameters of service S_2 from the number of output parameters of service S_1 to obtain the parameter-count difference d; proceed to step (3.3);
step (3.3): if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise set service S_2 as S_short and service S_1 as S_long; proceed to step (3.4);
step (3.4): traverse S_long; if the traversal is complete, go to step (3.8); otherwise take the next output parameter long_i from S_long and proceed to step (3.5);
step (3.5): traverse S_short; if the traversal is complete, return to step (3.4); otherwise take the next output parameter short_j from S_short and proceed to step (3.6);
step (3.6): calculate the similarity Sim*_ij between parameter long_i and parameter short_j using the concept similarity calculation method of the first step; proceed to step (3.7);
step (3.7): compare Sim*_ij with OutSim[i]; if Sim*_ij is greater than OutSim[i], set OutSim[i] to Sim*_ij; otherwise OutSim[i] keeps its original value; return to step (3.5);
step (3.8): calculate the output similarity Sim_output of service S_1 and service S_2; the calculation formula is as follows:
where Size(S_long.Output) and Size(S_short.Output) denote the number of output parameters of service S_long and service S_short respectively, |d| denotes the absolute difference in the number of output parameters of the two services, and OutSim is the maximum matching array of output parameter similarities;
a fourth step: combine the service input similarity Sim_input and the service output similarity Sim_output obtained in the second and third steps to calculate the functional similarity FunctionalSim(S_1, S_2) of service S_1 and service S_2; the calculation formula is as follows:
FunctionalSim(S_1, S_2) = w_1 × Sim_input + w_2 × Sim_output, where the weights w_1 and w_2 are real numbers between 0 and 1 that sum to 1 and represent the importance that the service consumer assigns to input similarity and output similarity.
2. The method of claim 1, wherein the step (1.4) comprises the following steps:
step (1.4.1): set a variable i to denote the number of identical attributes in concept A and concept B, with initial value 0; proceed to step (1.4.2);
step (1.4.2): if the traversal of the attribute set prop(A) of concept A is complete, proceed to step (1.4.7); otherwise take the next attribute prop(A)_j from prop(A) and remove it from prop(A); proceed to step (1.4.3);
step (1.4.3): if the traversal of the attribute set prop(B) of concept B is complete, return to step (1.4.2); otherwise take the next attribute prop(B)_k from prop(B) and remove it from prop(B); proceed to step (1.4.4);
step (1.4.4): based on the naive Bayes model and the WordNet English dictionary, extract features from the attribute names of prop(A)_j and prop(B)_k through the computeFeature function to obtain L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k), as follows:
calculate the word senses of each attribute name: each word corresponds to one or more senses, so each word pair corresponds to one or more sense pairs. Among all sense pairs of a word pair, the semantic node distance of the pair with the shortest distance is defined as the word-pair distance L(prop(A)_j, prop(B)_k), and the depth of that shortest-distance sense pair is defined as the word-pair depth D(prop(A)_j, prop(B)_k). Given that attribute name prop(A)_j appears in the synonym sets of semantic nodes v_j1, v_j2, …, v_jn and attribute name prop(B)_k appears in the synonym sets of semantic nodes v_k1, v_k2, …, v_km, the distance and depth calculation formulas for prop(A)_j and prop(B)_k are as follows:
where L(v_ja, v_kb) denotes the distance between semantic node v_ja and semantic node v_kb, and D(v_ja, v_kb) denotes the depth of the semantic pair (v_ja, v_kb);
according to the principle of the naive Bayes model, mean functions LW(i) and DW(o) are generated from a training set and used to calculate the conditional probability distribution lists P(L(prop(A)_j, prop(B)_k) | C) and P(D(prop(A)_j, prop(B)_k) | C), where C is the class of the word pair with value range {U, N}, U denoting "consistent" and N denoting "inconsistent"; finally, the adjustment factors α and β are calculated according to the following formula:
step (1.4.5): based on the naive Bayes model, the features L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k) of the ontology concept attributes are looked up in the conditional probability distribution lists obtained in step (1.4.4), extracting in turn the conditional probabilities V_1 = P(L(prop(A)_j, prop(B)_k) = i | C = U), V_2 = P(D(prop(A)_j, prop(B)_k) = o | C = U), V_3 = P(L(prop(A)_j, prop(B)_k) = i | C = N) and V_4 = P(D(prop(A)_j, prop(B)_k) = o | C = N); finally, the adjustment factors α and β from step (1.4.4) are used to calculate the similarity Sim_word between prop(A)_j and prop(B)_k; the calculation formula is as follows:
Sim_word(prop(A)_j, prop(B)_k) = (α × V_1 × V_2) / (α × V_1 × V_2 + β × V_3 × V_4); proceed to step (1.4.6);
step (1.4.6): if Sim_word is greater than or equal to the similarity judgment factor η, prop(A)_j and prop(B)_k are the same attribute; add 1 to the variable i and return to step (1.4.2); otherwise, return to step (1.4.3);
step (1.4.7): calculate the similarity Sim_concept of concepts A and B; the calculation formula is as follows:
where i denotes the number of identical attributes in concept A and concept B, prop(A) and prop(B) denote the attribute sets of concept A and concept B respectively, and Size(prop(B)) and Size(prop(A)) denote the number of attributes of concept B and concept A respectively; proceed to step (1.5).
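The attribute-matching procedure of steps (1.4.1) to (1.4.6) can be illustrated with a short Python sketch. It rests on several assumptions that are not in the source text: the function names, the `node_distance` and `pair_depth` callbacks (which stand in for a WordNet lookup), the choice to return 0 when the Sim_word denominator is zero, and the reading of the traversal in which the scan over prop(B) restarts for every attribute of prop(A). The trained conditional probabilities V_1 to V_4, the adjustment factors α and β, and the final Sim_concept formula of step (1.4.7) are taken as given or left out, since their formulas are not reproduced in the text.

```python
from typing import Callable, Hashable, Sequence, Tuple


def word_pair_features(
    senses_a: Sequence[Hashable],
    senses_b: Sequence[Hashable],
    node_distance: Callable[[Hashable, Hashable], int],
    pair_depth: Callable[[Hashable, Hashable], int],
) -> Tuple[int, int]:
    """Return (L, D): distance and depth of the closest sense pair.

    Raises ValueError if either sense list is empty.
    """
    candidates = (
        (node_distance(a, b), pair_depth(a, b))
        for a in senses_a
        for b in senses_b
    )
    # Pick the pair with the shortest node distance; its depth becomes D.
    return min(candidates, key=lambda pair: pair[0])


def sim_word(v1: float, v2: float, v3: float, v4: float,
             alpha: float, beta: float) -> float:
    """Sim_word = (alpha*V1*V2) / (alpha*V1*V2 + beta*V3*V4)."""
    numerator = alpha * v1 * v2
    denominator = numerator + beta * v3 * v4
    if denominator == 0:
        return 0.0  # assumption: treat an undefined ratio as no similarity
    return numerator / denominator


def count_same_attributes(
    props_a: Sequence[str],
    props_b: Sequence[str],
    attr_sim: Callable[[str, str], float],
    eta: float,
) -> int:
    """Count attributes of A that match some attribute of B (sim >= eta)."""
    i = 0
    for prop_a in props_a:
        for prop_b in props_b:
            if attr_sim(prop_a, prop_b) >= eta:
                i += 1      # same attribute found
                break       # move on to the next attribute of A
    return i
```

In a fuller pipeline, `attr_sim` would wrap `word_pair_features`, the probability lookup, and `sim_word`; here the pieces are kept separate so each can be checked on its own.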
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810939188.4A CN109359289B (en) | 2018-08-17 | 2018-08-17 | Web service function similarity measurement method based on ontology |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109359289A (en) | 2019-02-19 |
CN109359289B (en) | 2023-01-31 |
Family
ID=65350095
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810939188.4A Active CN109359289B (en) | 2018-08-17 | 2018-08-17 | Web service function similarity measurement method based on ontology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109359289B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7457531B2 (en) * | 2020-02-28 | 2024-03-28 | SCREEN Holdings Co., Ltd. | Similarity calculation device, similarity calculation program, and similarity calculation method |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105404619A (en) * | 2015-09-08 | 2016-03-16 | South China University of Technology | Similarity based semantic Web service clustering labeling method |
Non-Patent Citations (6)
Title |
---|
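A Semantic Similarity Measure for Semantic Web Services; Jeffrey Hau et al.; WWW2005; 20050514; pp. 1-9 * |
An Information-Theoretic Definition of Similarity; Dekang Lin; Proceedings of the Fifteenth International Conference on Machine Learning; 19980731; pp. 296-304 * |
A Web service matching method based on semantic similarity; Zhang Liang; Information Science; 20160228; Vol. 34, No. 2; pp. 21-23 * |
Semantic Web service matching based on ontology concept set similarity; Yang Jia et al.; Computer Technology and Development; 20120831; Vol. 22, No. 2; pp. 56-59 * |
Semantic Web service discovery method based on concept similarity calculation; Xu Dezhi et al.; Computing Technology and Automation; 20100630; Vol. 29, No. 2; pp. 98-101 * |
Web service clustering method oriented to the global social service network; Lu Jiawei; Computer Science; 20180331; Vol. 45, No. 3; pp. 206-214 * |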
Also Published As
Publication number | Publication date |
---|---|
CN109359289A (en) | 2019-02-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||