CN109359289B - Web service function similarity measurement method based on ontology - Google Patents

Web service function similarity measurement method based on ontology

Info

Publication number
CN109359289B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810939188.4A
Other languages
Chinese (zh)
Other versions
CN109359289A (en)
Inventor
陆佳炜
卢成炳
吴涵
周焕
徐俊
肖刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201810939188.4A
Publication of CN109359289A
Application granted
Publication of CN109359289B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An ontology-based method for measuring the functional similarity of Web services comprises the following steps: first, calculating the semantic similarity between two concepts A and B in a domain ontology; second, using the concept similarity calculation of the first step, calculating the input similarity Sim_input of service S_1 and service S_2; third, using the concept similarity calculation of the first step, calculating the output similarity Sim_output of service S_1 and service S_2; fourth, combining the service input similarity Sim_input and the service output similarity Sim_output obtained in the second and third steps to calculate the functional similarity FunctionalSim(S_1, S_2) of service S_1 and service S_2. The invention can reasonably measure the functional similarity between services and improve the service clustering effect.

Description

Web service function similarity measurement method based on ontology
Technical Field
The invention relates to the field of Web service evolution, in particular to a Web service function similarity measurement method based on an ontology.
Background
A Web service is a software system designed to support machine-to-machine interaction across a network. There are currently two main types of Web services: those based on SOAP and those based on REST. The two differ in the interface they use. SOAP-based Web services pass messages through a SOAP interface and describe the service with the Web Services Description Language (WSDL). WSDL gives Web service providers a protocol- and encoding-independent description mechanism: it is an XML vocabulary that describes services accessible on the network and maps them to a collection of communication endpoints capable of exchanging messages. Web services with REST interfaces use the generic HTTP methods (GET, DELETE, POST, and PUT) to describe, publish, and use the relevant resources.
Current research aims to provide semantic descriptions of Web services by means of conceptualized knowledge called ontologies. An ontology is a vocabulary that describes a set of concepts within a domain (a particular subject area or area of knowledge) and the relationships between those concepts. It is used to reason about attributes within the domain, or to define the domain itself. In the context of Web services, ontologies play an important role as a means of providing semantic descriptions. Richer service descriptions have driven the development of semantic Web services, which are described in a machine-understandable manner. Semantic Web services are expected to have a significant impact on areas such as e-commerce and application integration, because they enable dynamic, extensible, and efficient collaboration between different systems and organizations.
As Web services continue to develop, the services on the Internet must evolve continuously to adapt to changes in the environment and in user requirements. Web service evolution has therefore become one of the important research topics in the field of service computing. Moreover, because Web services are an important technology for building software services, enabling software systems to run adaptively and supporting the dynamic evolution of services has significant research and practical value.
Web service evolution generally refers to the process by which a service changes after it has been released and put into operation, in order to adapt to environmental changes and keep meeting user requirements. Because Web services are dynamic, heterogeneous, autonomous, and distributed, and because most services integrated by a system come from different organizations, Web service evolution faces more challenges than traditional software evolution.
In service evolution research, many researchers have proposed improvements to methods for measuring semantic similarity between services. Ana G. Maguitman, of the computer science department of the School of Informatics at Indiana University, USA, introduced graph theory into the calculation of semantic similarity and proposed a corresponding calculation method. Dekang Lin, of the University of Manitoba, Canada, proposed that the calculation of semantic similarity should consider not only the amount of information shared between concepts but also the amount of information in which they differ. Yuhua Li et al., of Manchester Metropolitan University, also proposed a semantic similarity measure, one that takes into account the influence of concept density on semantic similarity.
Disclosure of Invention
Web service evolution generally uses service clustering to narrow the search space of service samples, so that service matching is performed within a particular cluster rather than across a large pool containing many unrelated services. Calculating the similarity between objects is usually an important step of a clustering algorithm. When calculating the similarity between the attributes of ontology concepts, the invention takes into account that the same attribute may have different names, and calculates the similarity between attribute names with a word semantic similarity measure based on a naive Bayes model. It also strengthens the influence of a difference in parameter count on the service input similarity through the ratio of that difference to the number of parameters. The functional similarity between services can thus be measured reasonably, which improves the service clustering effect.
To solve the above technical problem, the invention adopts the following technical solution:
An ontology-based method for measuring the functional similarity of Web services comprises the following steps:
First step: calculate the semantic similarity between two concepts A and B in a domain ontology, as follows:
Step (1.1): if concepts A and B are identical, or they are declared as equivalent classes, the similarity Sim_concept of concepts A and B is 1; otherwise, perform step (1.2);
Step (1.2): if concept A is directly or indirectly a subclass of concept B, the similarity Sim_concept of concepts A and B is calculated as follows:
Figure BDA0001768618510000021
where prop(A) and prop(B) denote the attribute sets of concept A and concept B, respectively, and Size(prop(B)) and Size(prop(A)) denote the numbers of attributes of concept B and concept A, respectively; otherwise, perform step (1.3);
Step (1.3): if concept B is directly or indirectly a subclass of concept A, the similarity Sim_concept of concepts A and B is calculated as follows:
Figure BDA0001768618510000031
otherwise, perform step (1.4);
Step (1.4): if concepts A and B have no parent-child relationship but directly or indirectly share a common parent concept C, a word semantic similarity measure based on a naive Bayes model is used. First, all attributes of concept A and concept B are traversed, and features are extracted from their attribute names by the ComputeFeature function. Next, the conditional probability distributions and adjustment factors obtained from sample training are used to calculate the similarity Sim_word between concept attributes. Sim_word is then compared with the similarity decision factor η to decide whether the two attributes are the same attribute, and the matches are counted. Finally, the similarity Sim_concept of concepts A and B is calculated;
Step (1.5): if the relationship between concepts A and B meets none of the above conditions, the similarity Sim_concept of concepts A and B is set to 0;
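The case analysis of steps (1.1) to (1.5) can be sketched as follows. This is a minimal illustration, not the patented implementation: the ontology is assumed to be a single-inheritance hierarchy stored as a child-to-parent dictionary, and the names `ancestors`, `relation_case`, `PARENT` and `EQUIVALENT` are hypothetical. The sketch only decides which step applies; the formulas of steps (1.2), (1.3) and (1.4) are given in the figures and are not reproduced here.

```python
from typing import Dict, FrozenSet, Optional, Set


def ancestors(concept: str, parent: Dict[str, Optional[str]]) -> Set[str]:
    """Return every direct or indirect superclass of `concept`."""
    found: Set[str] = set()
    current = parent.get(concept)
    while current is not None:
        found.add(current)
        current = parent.get(current)
    return found


def relation_case(a: str, b: str,
                  parent: Dict[str, Optional[str]],
                  equivalent: Set[FrozenSet[str]] = frozenset()) -> str:
    """Return the number of the step that applies to concepts a and b."""
    if a == b or frozenset((a, b)) in equivalent:
        return "1.1"  # identical or declared equivalent classes
    if b in ancestors(a, parent):
        return "1.2"  # A is a direct or indirect subclass of B
    if a in ancestors(b, parent):
        return "1.3"  # B is a direct or indirect subclass of A
    if ancestors(a, parent) & ancestors(b, parent):
        return "1.4"  # no parent-child relation, but a common parent concept
    return "1.5"      # none of the above: similarity is 0


# Hypothetical toy ontology (assumption, not from the patent).
PARENT = {"Vehicle": None, "Car": "Vehicle", "SportsCar": "Car",
          "Bicycle": "Vehicle", "Person": None}
EQUIVALENT = {frozenset(("Car", "Automobile"))}

print(relation_case("SportsCar", "Bicycle", PARENT, EQUIVALENT))
```

In a real OWL ontology every class shares the top concept, so an implementation would presumably have to exclude that root when testing for a common parent; the toy hierarchy above avoids the question by using two separate roots.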
Second step: using the concept similarity calculation of the first step, calculate the input similarity Sim_input of service S_1 and service S_2, as follows:
Step (2.1): create and initialize the maximum-matching array InSim of service input parameter similarities, then perform step (2.2);
Step (2.2): subtract the number of input parameters of service S_2 from the number of input parameters of service S_1 to obtain the parameter count difference d, then perform step (2.3);
Step (2.3): if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise, set service S_2 as S_short and service S_1 as S_long; then perform step (2.4);
Step (2.4): traverse the input parameters of S_long; if the traversal is complete, go to step (2.8); otherwise, take the next input parameter long_i from S_long and perform step (2.5);
Step (2.5): traverse the input parameters of S_short; if the traversal is complete, return to step (2.4); otherwise, take the next input parameter short_j from S_short and perform step (2.6);
Step (2.6): calculate the similarity Sim_ij of parameter long_i and parameter short_j with the concept similarity calculation of the first step, then perform step (2.7);
Step (2.7): compare Sim_ij with InSim[i]; if Sim_ij is greater than InSim[i], set InSim[i] to Sim_ij; otherwise InSim[i] keeps its original value; return to step (2.5);
Step (2.8): calculate the input similarity Sim_input of service S_1 and service S_2 as follows:
Figure BDA0001768618510000032
where Size(S_long.Input) and Size(S_short.Input) denote the number of input parameters of service S_long and of service S_short, respectively, |d| denotes the difference between the numbers of input parameters of the two services, and InSim is the maximum-matching array of input parameter similarities;
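Steps (2.1) to (2.7) amount to a nested loop that records, for each parameter of the longer list, its best match in the shorter list. A minimal sketch follows, assuming InSim is initialized to 0 (the text does not state the initial value) and that the concept similarity of the first step is available as a callable. The names `max_match_array`, `toy_sim` and `TABLE` are hypothetical, and the closing formula of step (2.8) is left out because it is only available as a figure.

```python
from typing import Callable, List, Sequence, Tuple


def max_match_array(params_1: Sequence[str],
                    params_2: Sequence[str],
                    concept_sim: Callable[[str, str], float]
                    ) -> Tuple[List[float], int]:
    """Build the maximum-matching array and return it with the difference d."""
    d = len(params_1) - len(params_2)             # step (2.2)
    if d <= 0:                                    # step (2.3)
        short, long_ = params_1, params_2
    else:
        short, long_ = params_2, params_1
    in_sim = [0.0] * len(long_)                   # step (2.1), assumed init 0
    for i, long_i in enumerate(long_):            # step (2.4)
        for short_j in short:                     # step (2.5)
            sim_ij = concept_sim(long_i, short_j)  # step (2.6)
            if sim_ij > in_sim[i]:                # step (2.7)
                in_sim[i] = sim_ij
    return in_sim, d


# Hypothetical concept similarities standing in for the first step.
TABLE = {frozenset(("Town", "City")): 0.8}


def toy_sim(x: str, y: str) -> float:
    return 1.0 if x == y else TABLE.get(frozenset((x, y)), 0.0)


in_sim, d = max_match_array(["City", "Date"], ["Town", "Date", "Price"], toy_sim)
print(in_sim, d)  # [0.8, 1.0, 0.0] -1
```

The same routine applies unchanged to output parameters, since the third step mirrors the second with OutSim in place of InSim.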
Third step: using the concept similarity calculation of the first step, calculate the output similarity Sim_output of service S_1 and service S_2, as follows:
Step (3.1): create and initialize the maximum-matching array OutSim of service output parameter similarities, then perform step (3.2);
Step (3.2): subtract the number of output parameters of service S_2 from the number of output parameters of service S_1 to obtain the parameter count difference d, then perform step (3.3);
Step (3.3): if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise, set service S_2 as S_short and service S_1 as S_long; then perform step (3.4);
Step (3.4): traverse the output parameters of S_long; if the traversal is complete, go to step (3.8); otherwise, take the next output parameter long_i from S_long and perform step (3.5);
Step (3.5): traverse the output parameters of S_short; if the traversal is complete, return to step (3.4); otherwise, take the next output parameter short_j from S_short and perform step (3.6);
Step (3.6): calculate the similarity Sim_ij of parameter long_i and parameter short_j with the concept similarity calculation of the first step, then perform step (3.7);
Step (3.7): compare Sim_ij with OutSim[i]; if Sim_ij is greater than OutSim[i], set OutSim[i] to Sim_ij; otherwise OutSim[i] keeps its original value; return to step (3.5);
Step (3.8): calculate the output similarity Sim_output of service S_1 and service S_2 as follows:
Figure BDA0001768618510000041
where Size(S_long.Output) and Size(S_short.Output) denote the number of output parameters of service S_long and of service S_short, respectively, |d| denotes the difference between the numbers of output parameters of the two services, and OutSim is the maximum-matching array of output parameter similarities;
Fourth step: combine the service input similarity Sim_input obtained in the second step and the service output similarity Sim_output obtained in the third step to calculate the functional similarity FunctionalSim(S_1, S_2) of service S_1 and service S_2, as follows:
FunctionalSim(S_1, S_2) = w_1 × Sim_input + w_2 × Sim_output, where the weights w_1 and w_2 are real numbers between 0 and 1 that sum to 1; they express the importance the service consumer attaches to input similarity and to output similarity.
Further, step (1.4) comprises the following steps:
Step (1.4.1): set a variable i that denotes the number of attributes concept A and concept B have in common, with initial value 0, then perform step (1.4.2);
Step (1.4.2): if the traversal of the attribute set prop(A) of concept A is complete, perform step (1.4.7); otherwise, take the next attribute prop(A)_j from prop(A), remove it from prop(A), and perform step (1.4.3);
Step (1.4.3): if the traversal of the attribute set prop(B) of concept B is complete, return to step (1.4.2); otherwise, take the next attribute prop(B)_k from prop(B), remove it from prop(B), and perform step (1.4.4);
Step (1.4.4): based on the naive Bayes model and the WordNet English dictionary, extract features from the attribute names of prop(A)_j and prop(B)_k with the ComputeFeature function to obtain L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k), as follows:
Calculate the word senses of each attribute name. Each word corresponds to one or more senses, so each word pair corresponds to one or more sense pairs. Among all sense pairs of a word pair, the shortest distance between semantic nodes is defined as the word pair distance L(prop(A)_j, prop(B)_k), and the depth of the sense pair with that shortest distance is defined as the word pair depth D(prop(A)_j, prop(B)_k). Given that attribute name prop(A)_j appears in the synonym sets of semantic nodes v_j1, v_j2, …, v_jn and attribute name prop(B)_k appears in the synonym sets of semantic nodes v_k1, v_k2, …, v_km, the distance and depth of prop(A)_j and prop(B)_k are calculated as follows:
Figure BDA0001768618510000051
Figure BDA0001768618510000052
where L(v_ja, v_kb) denotes the distance between semantic node v_ja and semantic node v_kb, and D(v_ja, v_kb) denotes the depth of the semantic pair (v_ja, v_kb);
Further, mean functions LW(i) and DW(o) are generated from the training set of the naive Bayes model and are then used to calculate the conditional probability distributions P(L(prop(A)_j, prop(B)_k) | C) and P(D(prop(A)_j, prop(B)_k) | C), where C is the word classification with value range {U, N}, U meaning "consistent" and N meaning "inconsistent". Finally, the adjustment factors α and β are calculated as follows:
Figure BDA0001768618510000053
Figure RE-GDA0001836032230000061
then perform step (1.4.5);
Step (1.4.5): based on the naive Bayes model, substitute the features L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k) of the ontology concept attributes into the conditional probability distributions obtained in step (1.4.4), and extract in turn the conditional probabilities V_1 = P(L(prop(A)_j, prop(B)_k) = i | C = U), V_2 = P(D(prop(A)_j, prop(B)_k) = o | C = U), V_3 = P(L(prop(A)_j, prop(B)_k) = i | C = N) and V_4 = P(D(prop(A)_j, prop(B)_k) = o | C = N). Finally, using the adjustment factors α and β from step (1.4.4), calculate the similarity Sim_word between prop(A)_j and prop(B)_k as follows:
Sim_word(prop(A)_j, prop(B)_k) = (α V_1 × V_2) / (α V_1 × V_2 + β V_3 × V_4); then perform step (1.4.6);
Step (1.4.6): if Sim_word is greater than or equal to the similarity decision factor η, prop(A)_j and prop(B)_k are the same attribute; add 1 to variable i and return to step (1.4.2); otherwise, return to step (1.4.3);
Step (1.4.7): calculate the similarity Sim_concept of concepts A and B as follows:
Figure BDA0001768618510000062
where i denotes the number of attributes concept A and concept B have in common, prop(A) and prop(B) denote the attribute sets of concept A and concept B, respectively, and Size(prop(B)) and Size(prop(A)) denote the numbers of attributes of concept B and concept A, respectively; then perform step (1.5).
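The feature extraction of step (1.4.4) can be illustrated from its textual definition: the word pair distance is the smallest node distance over all sense pairs, and the word pair depth is the depth of the sense pair that attains it. The sketch below assumes that node distance and pair depth are supplied as callables; how the depth of a sense pair is measured in the dictionary is not specified in the text, so the lookup tables `DIST` and `DEPTH`, the sense labels, and the function name `word_pair_features` are all hypothetical.

```python
from itertools import product
from typing import Callable, Sequence, Tuple


def word_pair_features(senses_a: Sequence[str],
                       senses_b: Sequence[str],
                       node_distance: Callable[[str, str], int],
                       pair_depth: Callable[[str, str], int]
                       ) -> Tuple[int, int]:
    """Return (L, D): distance and depth of the closest sense pair."""
    # Pick the sense pair with the shortest node distance; on a tie,
    # min() keeps the first pair encountered.
    best = min(product(senses_a, senses_b),
               key=lambda pair: node_distance(*pair))
    return node_distance(*best), pair_depth(*best)


# Hypothetical distances and depths for two senses of "bank" against "shore".
DIST = {("bank#river", "shore"): 1, ("bank#finance", "shore"): 7}
DEPTH = {("bank#river", "shore"): 5, ("bank#finance", "shore"): 2}

L, D = word_pair_features(["bank#river", "bank#finance"], ["shore"],
                          lambda v, w: DIST[(v, w)],
                          lambda v, w: DEPTH[(v, w)])
print(L, D)  # 1 5
```

With a real dictionary the two callables would be backed by the sense graph rather than by fixed tables.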
The advantage of the method is that, when calculating the similarity between the attributes of ontology concepts, it takes into account that the same attribute may have different names and calculates the similarity between attribute names with a word semantic similarity measure based on a naive Bayes model, which to some extent avoids the errors caused by directly testing attribute names for equality. In addition, when calculating the input (output) similarity, the algorithm considers the parameter count difference d and strengthens the influence of this difference on the service input (output) similarity through the ratio of the difference to the number of parameters, so that the functional similarity between services can be measured reasonably.
Detailed Description
The present invention is further explained below.
The ontology-based Web service functional similarity measurement method considers the domain ontology concepts of the inputs and outputs of services; matching between inputs (or outputs) mainly means matching the concepts associated with them. To calculate the similarity of two concepts A and B, the relationship between the two concepts in the domain ontology must be considered.
A domain ontology is a specialized ontology that describes the knowledge of a given domain. The "domain" is defined according to the needs of the ontology builder and may be a subject area, a combination of several areas, or a small part of one area. If two concepts in a domain ontology have different names but the same set of individuals, they are called equivalent classes.
The measurement method comprises the following steps:
First step: calculate the semantic similarity between two concepts A and B in a domain ontology, as follows:
Step (1.1): if concepts A and B are identical, or they are declared as equivalent classes, the similarity Sim_concept of concepts A and B is 1; otherwise, perform step (1.2);
Step (1.2): if concept A is directly or indirectly a subclass of concept B, the similarity Sim_concept of concepts A and B is calculated as follows:
Figure BDA0001768618510000071
where prop(A) and prop(B) denote the attribute sets of concept A and concept B, respectively, and Size(prop(B)) and Size(prop(A)) denote the numbers of attributes of concept B and concept A, respectively; otherwise, perform step (1.3);
Step (1.3): if concept B is directly or indirectly a subclass of concept A, the similarity Sim_concept of concepts A and B is calculated as follows:
Figure BDA0001768618510000072
otherwise, perform step (1.4);
Step (1.4): if concepts A and B have no parent-child relationship but directly or indirectly share a common parent concept C, a word semantic similarity measure based on a naive Bayes model is used. First, all attributes of concept A and concept B are traversed, and features are extracted from their attribute names by the ComputeFeature function. Next, the conditional probability distributions and adjustment factors obtained from sample training are used to calculate the similarity Sim_word between concept attributes. Sim_word is then compared with the similarity decision factor η to decide whether the two attributes are the same attribute, and the matches are counted. Finally, the similarity Sim_concept of concepts A and B is calculated. The steps are as follows:
Step (1.4.1): set a variable i that denotes the number of attributes concept A and concept B have in common, with initial value 0, then perform step (1.4.2);
Step (1.4.2): if the traversal of the attribute set prop(A) of concept A is complete, perform step (1.4.7); otherwise, take the next attribute prop(A)_j from prop(A), remove it from prop(A), and perform step (1.4.3);
Step (1.4.3): if the traversal of the attribute set prop(B) of concept B is complete, return to step (1.4.2); otherwise, take the next attribute prop(B)_k from prop(B), remove it from prop(B), and perform step (1.4.4);
Step (1.4.4): based on the naive Bayes model and the WordNet English dictionary, extract features from the attribute names of prop(A)_j and prop(B)_k with the ComputeFeature function to obtain L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k). The naive Bayes model is one of the two most widely used classification models, and WordNet is an English dictionary created and maintained by the Cognitive Science Laboratory of Princeton University under the direction of psychology professor George A. Miller. The calculation proceeds as follows:
Calculate the word senses of each attribute name. Each word corresponds to one or more senses, so each word pair corresponds to one or more sense pairs. Among all sense pairs of a word pair, the shortest distance between semantic nodes is defined as the word pair distance L(prop(A)_j, prop(B)_k), and the depth of the sense pair with that shortest distance is defined as the word pair depth D(prop(A)_j, prop(B)_k). Given that attribute name prop(A)_j appears in the synonym sets of semantic nodes v_j1, v_j2, …, v_jn and attribute name prop(B)_k appears in the synonym sets of semantic nodes v_k1, v_k2, …, v_km, the distance and depth of prop(A)_j and prop(B)_k are calculated as follows:
Figure BDA0001768618510000081
Figure BDA0001768618510000082
where L(v_ja, v_kb) denotes the distance between semantic node v_ja and semantic node v_kb, and D(v_ja, v_kb) denotes the depth of the semantic pair (v_ja, v_kb).
Further, mean functions LW(i) and DW(o) are generated from the training set of the naive Bayes model and are then used to calculate the conditional probability distributions P(L(prop(A)_j, prop(B)_k) | C) and P(D(prop(A)_j, prop(B)_k) | C), where C is the word classification with value range {U, N}, U meaning "consistent" and N meaning "inconsistent". Finally, the adjustment factors α and β are calculated as follows:
Figure BDA0001768618510000083
Figure RE-GDA0001836032230000084
then perform step (1.4.5);
Step (1.4.5): based on the naive Bayes model, substitute the features L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k) of the ontology concept attributes into the conditional probability distributions obtained in step (1.4.4), and extract in turn the conditional probabilities V_1 = P(L(prop(A)_j, prop(B)_k) = i | C = U), V_2 = P(D(prop(A)_j, prop(B)_k) = o | C = U), V_3 = P(L(prop(A)_j, prop(B)_k) = i | C = N) and V_4 = P(D(prop(A)_j, prop(B)_k) = o | C = N). Finally, using the adjustment factors α and β from step (1.4.4), calculate the similarity Sim_word between prop(A)_j and prop(B)_k as follows:
Sim_word(prop(A)_j, prop(B)_k) = (α V_1 × V_2) / (α V_1 × V_2 + β V_3 × V_4); then perform step (1.4.6);
Step (1.4.6): if Sim_word is greater than or equal to the similarity decision factor η, prop(A)_j and prop(B)_k are the same attribute; add 1 to variable i and return to step (1.4.2); otherwise, return to step (1.4.3);
Step (1.4.7): calculate the similarity Sim_concept of concepts A and B as follows:
Figure BDA0001768618510000091
where i denotes the number of attributes concept A and concept B have in common, prop(A) and prop(B) denote the attribute sets of concept A and concept B, respectively, and Size(prop(B)) and Size(prop(A)) denote the numbers of attributes of concept B and concept A, respectively; then perform step (1.5);
Step (1.5): if the relationship between concepts A and B meets none of the above conditions, the similarity Sim_concept of concepts A and B is set to 0;
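The Sim_word formula of step (1.4.5) and the counting loop of steps (1.4.1) to (1.4.6) can be sketched as below. The probabilities V_1 to V_4 and the factors α and β come from the trained model, so the numbers used here are placeholders. The loop reflects one reading of the steps, and this is an assumption: the scan of prop(B) restarts for every attribute of A, and only a matched attribute is removed from prop(B). All function and variable names are hypothetical.

```python
from typing import Callable, Sequence


def sim_word(v1: float, v2: float, v3: float, v4: float,
             alpha: float, beta: float) -> float:
    """Sim_word = (a*V1*V2) / (a*V1*V2 + b*V3*V4), as in step (1.4.5)."""
    consistent = alpha * v1 * v2
    inconsistent = beta * v3 * v4
    # Raises ZeroDivisionError if both terms are 0; the text does not
    # say how that case is handled.
    return consistent / (consistent + inconsistent)


def count_same_attributes(props_a: Sequence[str],
                          props_b: Sequence[str],
                          word_sim: Callable[[str, str], float],
                          eta: float) -> int:
    """Count attribute pairs whose name similarity reaches eta."""
    remaining_b = list(props_b)
    i = 0                                        # step (1.4.1)
    for prop_a in props_a:                       # step (1.4.2)
        for prop_b in remaining_b:               # step (1.4.3)
            if word_sim(prop_a, prop_b) >= eta:  # step (1.4.6)
                i += 1
                remaining_b.remove(prop_b)       # a matched name is used once
                break
    return i


# Hypothetical name similarities standing in for the trained model.
TOY_SIM = {frozenset(("price", "cost")): 0.9,
           frozenset(("name", "title")): 0.7}


def toy_word_sim(x: str, y: str) -> float:
    return 1.0 if x == y else TOY_SIM.get(frozenset((x, y)), 0.0)


same = count_same_attributes(["price", "cost", "name"],
                             ["cost", "title"], toy_word_sim, eta=0.8)
print(same)  # 1
print(sim_word(0.6, 0.5, 0.1, 0.2, alpha=1.0, beta=1.0))
```

Lowering η to 0.7 in this toy example also accepts the pair ("name", "title"), which shows how the decision factor controls the count i that feeds the formula of step (1.4.7).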
Second step: using the concept similarity calculation of the first step, calculate the input similarity Sim_input of service S_1 and service S_2, as follows:
Step (2.1): create and initialize the maximum-matching array InSim of service input parameter similarities, then perform step (2.2);
Step (2.2): subtract the number of input parameters of service S_2 from the number of input parameters of service S_1 to obtain the parameter count difference d, then perform step (2.3);
Step (2.3): if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise, set service S_2 as S_short and service S_1 as S_long; then perform step (2.4);
Step (2.4): traverse the input parameters of S_long; if the traversal is complete, go to step (2.8); otherwise, take the next input parameter long_i from S_long and perform step (2.5);
Step (2.5): traverse the input parameters of S_short; if the traversal is complete, return to step (2.4); otherwise, take the next input parameter short_j from S_short and perform step (2.6);
Step (2.6): calculate the similarity Sim_ij of parameter long_i and parameter short_j with the concept similarity calculation of the first step, then perform step (2.7);
Step (2.7): compare Sim_ij with InSim[i]; if Sim_ij is greater than InSim[i], set InSim[i] to Sim_ij; otherwise InSim[i] keeps its original value; return to step (2.5);
Step (2.8): calculate the input similarity Sim_input of service S_1 and service S_2 as follows:
Figure BDA0001768618510000101
where Size(S_long.Input) and Size(S_short.Input) denote the number of input parameters of service S_long and of service S_short, respectively, |d| denotes the difference between the numbers of input parameters of the two services, and InSim is the maximum-matching array of input parameter similarities;
Third step: using the concept similarity calculation method of the first step, the output similarity Sim_output of service S_1 and service S_2 is calculated as follows:
Step (3.1): create the maximum-matching array OutSim for service output parameter similarity, initialize it, and proceed to step (3.2);
Step (3.2): subtract the number of output parameters of service S_2 from the number of output parameters of service S_1 to obtain the parameter count difference d, and proceed to step (3.3);
Step (3.3): if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise, set service S_2 as S_short and service S_1 as S_long. Proceed to step (3.4);
Step (3.4): traverse S_long. If the traversal is complete, go to step (3.8); otherwise, take the next output parameter long_i from S_long and proceed to step (3.5);
Step (3.5): traverse S_short. If the traversal is complete, return to step (3.4); otherwise, take the next output parameter short_j from S_short and proceed to step (3.6);
Step (3.6): using the concept similarity calculation method of the first step, calculate the similarity Sim_ij between parameter long_i and parameter short_j, then proceed to step (3.7);
Step (3.7): compare Sim_ij with OutSim[i]. If Sim_ij is greater than OutSim[i], set OutSim[i] to Sim_ij; otherwise OutSim[i] keeps its original value. Return to step (3.5);
Step (3.8): calculate the output similarity Sim_output of service S_1 and service S_2. The calculation formula is as follows:
Figure BDA0001768618510000102
where Size(S_long.Output) and Size(S_short.Output) respectively denote the number of output parameters of service S_long and of service S_short, |d| denotes the difference between the numbers of output parameters of the two services, and OutSim is the maximum-matching array of output parameter similarities;
Fourth step: combine the service input similarity Sim_input obtained in the second step and the service output similarity Sim_output obtained in the third step to calculate the functional similarity FunctionalSim(S_1, S_2) of service S_1 and service S_2. The calculation formula is as follows:
FunctionalSim(S_1, S_2) = w_1 × Sim_input + w_2 × Sim_output
where the weights w_1 and w_2 are real numbers between 0 and 1 that sum to 1. They represent the importance the service consumer assigns to input similarity and to output similarity, respectively. By default, w_1 and w_2 are both set to 0.5.
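As a small illustration of the fourth step, the weighted combination can be written as a short Python function. The function name `functional_sim` and the explicit validation of the weights are assumptions added for the sketch; only the formula and the 0.5 defaults come from the text.

```python
def functional_sim(
    sim_input: float,
    sim_output: float,
    w1: float = 0.5,
    w2: float = 0.5,
) -> float:
    """FunctionalSim(S_1, S_2) = w_1 * Sim_input + w_2 * Sim_output."""
    # The text requires both weights to lie in [0, 1] and to sum to 1.
    if not (0.0 <= w1 <= 1.0 and 0.0 <= w2 <= 1.0):
        raise ValueError("weights must lie between 0 and 1")
    if abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w1 * sim_input + w2 * sim_output


# Default weighting treats input and output similarity as equally important.
print(functional_sim(0.8, 0.6))
# A consumer who cares more about matching inputs could raise w1.
print(functional_sim(0.8, 0.6, w1=0.7, w2=0.3))
```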

Claims (2)

1. A Web service function similarity measurement method based on an ontology, characterized by comprising the following steps:
a first step of calculating the semantic similarity between two concepts A and B in a domain ontology, the process being as follows:
Step (1.1): if concepts A and B are identical or are declared equivalent classes, the similarity Sim_concept of concepts A and B is 1; otherwise, proceed to step (1.2);
Step (1.2): if concept A is directly or indirectly a subclass of concept B, the similarity Sim_concept of concepts A and B is calculated by the following formula:
Figure FDA0003874115050000011
where prop(A) and prop(B) respectively denote the attribute sets of concept A and concept B, and Size(prop(B)) and Size(prop(A)) respectively denote the numbers of attributes of concept B and concept A; otherwise, proceed to step (1.3);
Step (1.3): if concept B is directly or indirectly a subclass of concept A, the similarity Sim_concept of concepts A and B is calculated by the following formula:
Figure FDA0003874115050000012
otherwise, proceed to step (1.4);
Step (1.4): if concept A and concept B have no parent-child relationship but directly or indirectly share a common parent concept C, a word semantic similarity measurement method based on a naive Bayes model is adopted: first, the attributes of concept A and concept B are traversed and features are extracted from their attribute names by a computeFeature function; then the conditional probability distribution list and the adjustment factors obtained from sample training are used to calculate the similarity Sim_word between concept attributes; Sim_word is compared with the similarity determination factor η to decide whether two attributes are the same attribute, and the matches are counted; finally, the similarity Sim_concept of concepts A and B is calculated;
Step (1.5): if the relationship between concept A and concept B satisfies none of the above conditions, the similarity Sim_concept of concepts A and B is set to 0;
a second step of calculating, using the concept similarity calculation method of the first step, the input similarity Sim_input of service S_1 and service S_2, the calculation method comprising the following steps:
Step (2.1): create the maximum-matching array InSim for service input parameter similarity, initialize it, and proceed to step (2.2);
Step (2.2): subtract the number of input parameters of service S_2 from the number of input parameters of service S_1 to obtain the parameter count difference d, and proceed to step (2.3);
Step (2.3): if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise, set service S_2 as S_short and service S_1 as S_long; proceed to step (2.4);
Step (2.4): traverse S_long; if the traversal is complete, go to step (2.8); otherwise, take the next input parameter long_i from S_long and proceed to step (2.5);
Step (2.5): traverse S_short; if the traversal is complete, return to step (2.4); otherwise, take the next input parameter short_j from S_short and proceed to step (2.6);
Step (2.6): using the concept similarity calculation method of the first step, calculate the similarity Sim_ij between parameter long_i and parameter short_j, then proceed to step (2.7);
Step (2.7): compare Sim_ij with InSim[i]; if Sim_ij is greater than InSim[i], set InSim[i] to Sim_ij; otherwise InSim[i] keeps its original value; return to step (2.5);
Step (2.8): calculate the input similarity Sim_input of service S_1 and service S_2 by the following formula:
Figure FDA0003874115050000021
where Size(S_long.Input) and Size(S_short.Input) respectively denote the number of input parameters of service S_long and of service S_short, |d| denotes the difference between the numbers of input parameters of the two services, and InSim is the maximum-matching array of input parameter similarities;
a third step of calculating, using the concept similarity calculation method of the first step, the output similarity Sim_output of service S_1 and service S_2, the calculation method comprising the following steps:
Step (3.1): create the maximum-matching array OutSim for service output parameter similarity, initialize it, and proceed to step (3.2);
Step (3.2): subtract the number of output parameters of service S_2 from the number of output parameters of service S_1 to obtain the parameter count difference d, and proceed to step (3.3);
Step (3.3): if d is less than or equal to 0, set service S_1 as S_short and service S_2 as S_long; otherwise, set service S_2 as S_short and service S_1 as S_long; proceed to step (3.4);
Step (3.4): traverse S_long; if the traversal is complete, go to step (3.8); otherwise, take the next output parameter long_i from S_long and proceed to step (3.5);
Step (3.5): traverse S_short; if the traversal is complete, return to step (3.4); otherwise, take the next output parameter short_j from S_short and proceed to step (3.6);
Step (3.6): using the concept similarity calculation method of the first step, calculate the similarity Sim*_ij between parameter long_i and parameter short_j, then proceed to step (3.7);
Step (3.7): compare Sim*_ij with OutSim[i]; if Sim*_ij is greater than OutSim[i], set OutSim[i] to Sim*_ij; otherwise OutSim[i] keeps its original value; return to step (3.5);
Step (3.8): calculate the output similarity Sim_output of service S_1 and service S_2 by the following formula:
Figure FDA0003874115050000022
where Size(S_long.Output) and Size(S_short.Output) respectively denote the number of output parameters of service S_long and of service S_short, |d| denotes the difference between the numbers of output parameters of the two services, and OutSim is the maximum-matching array of output parameter similarities;
a fourth step of combining the service input similarity Sim_input obtained in the second step and the service output similarity Sim_output obtained in the third step to calculate the functional similarity FunctionalSim(S_1, S_2) of service S_1 and service S_2 by the following formula:
FunctionalSim(S_1, S_2) = w_1 × Sim_input + w_2 × Sim_output, where the weights w_1 and w_2 are real numbers between 0 and 1 that sum to 1 and represent the importance the service consumer assigns to input similarity and to output similarity, respectively.
2. The method of claim 1, wherein step (1.4) comprises the following steps:
Step (1.4.1): set a variable i representing the number of identical attributes in concept A and concept B, with an initial value of 0, and proceed to step (1.4.2);
Step (1.4.2): if the traversal of the attribute set prop(A) of concept A is complete, proceed to step (1.4.7); otherwise, take the next attribute prop(A)_j out of prop(A), remove it from prop(A), and proceed to step (1.4.3);
Step (1.4.3): if the traversal of the attribute set prop(B) of concept B is complete, return to step (1.4.2); otherwise, take the next attribute prop(B)_k out of prop(B), remove it from prop(B), and proceed to step (1.4.4);
Step (1.4.4): based on the naive Bayes model and the WordNet English dictionary, extract features from the attribute names of prop(A)_j and prop(B)_k with the computeFeature function to obtain L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k), the process being as follows:
calculate the word semantics of each attribute name, where each word corresponds to one or more semantics and each word pair corresponds to one or more semantic pairs; the shortest semantic node distance among all semantic pairs corresponding to a word pair is defined as the word pair distance L(prop(A)_j, prop(B)_k), and the depth of the semantic pair with the shortest semantic node distance is defined as the word pair depth D(prop(A)_j, prop(B)_k); given that attribute name prop(A)_j appears in the synonym sets of semantic nodes v_j1, v_j2, …, v_jn and attribute name prop(B)_k appears in the synonym sets of semantic nodes v_k1, v_k2, …, v_km, the distance and depth formulas for prop(A)_j and prop(B)_k are as follows:
Figure FDA0003874115050000031
Figure FDA0003874115050000032
where L(v_ja, v_kb) denotes the distance between semantic node v_ja and semantic node v_kb, and D(v_ja, v_kb) denotes the depth of the semantic pair (v_ja, v_kb);
following the principle of the naive Bayes model, generate the mean functions LW(i) and DW(o) from a training set, and use them to calculate the conditional probability distribution lists P(L(prop(A)_j, prop(B)_k) | C) and P(D(prop(A)_j, prop(B)_k) | C), where C is the word-pair class with value range {U, N}, U denoting "consistent" and N denoting "inconsistent"; finally, calculate the adjustment factors α and β by the following formulas:
Figure FDA0003874115050000033
Figure FDA0003874115050000041
Step (1.4.5): based on the naive Bayes model, substitute the features L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k) of the ontology concept attributes into the conditional probability distribution lists obtained in step (1.4.4), and extract in turn the conditional probabilities V_1 = P(L(prop(A)_j, prop(B)_k) = i | C = U), V_2 = P(D(prop(A)_j, prop(B)_k) = o | C = U), V_3 = P(L(prop(A)_j, prop(B)_k) = i | C = N) and V_4 = P(D(prop(A)_j, prop(B)_k) = o | C = N); finally, use the adjustment factors α and β from step (1.4.4) to calculate the similarity Sim_word between prop(A)_j and prop(B)_k by the following formula:
Sim_word(prop(A)_j, prop(B)_k) = (α × V_1 × V_2) / (α × V_1 × V_2 + β × V_3 × V_4); proceed to step (1.4.6);
Step (1.4.6): if Sim_word is greater than or equal to the similarity determination factor η, prop(A)_j and prop(B)_k are the same attribute; add 1 to the variable i and return to step (1.4.2); otherwise, return to step (1.4.3);
Step (1.4.7): calculate the similarity Sim_concept of concepts A and B by the following formula:
Figure FDA0003874115050000042
where i denotes the number of identical attributes in concept A and concept B, prop(A) and prop(B) respectively denote the attribute sets of concept A and concept B, and Size(prop(B)) and Size(prop(A)) respectively denote the numbers of attributes of concept B and concept A; proceed to step (1.5).
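The attribute-matching procedure of claim 2 can be illustrated with the following Python sketch. It is not the patented implementation: the function names, the `node_distance`, `pair_depth` and `attr_sim` callables, the first-found tie-breaking rule, and the 0.0 fallback for a zero denominator are all assumptions introduced for illustration. The counting loop follows the control flow of steps (1.4.2), (1.4.3) and (1.4.6) but iterates over the attribute sets without removing elements. The final Sim_concept formula of step (1.4.7) is omitted because it is given only as an image.

```python
from typing import Callable, Hashable, Sequence, Tuple


def word_pair_features(
    nodes_a: Sequence[Hashable],
    nodes_b: Sequence[Hashable],
    node_distance: Callable[[Hashable, Hashable], float],
    pair_depth: Callable[[Hashable, Hashable], float],
) -> Tuple[float, float]:
    """Return (L, D): the shortest distance over all semantic pairs,
    and the depth of the pair that attains it (first pair wins on ties)."""
    pairs = [(va, vb) for va in nodes_a for vb in nodes_b]
    best_a, best_b = min(pairs, key=lambda p: node_distance(p[0], p[1]))
    return node_distance(best_a, best_b), pair_depth(best_a, best_b)


def sim_word(v1: float, v2: float, v3: float, v4: float,
             alpha: float, beta: float) -> float:
    """Sim_word = (alpha*V1*V2) / (alpha*V1*V2 + beta*V3*V4)."""
    numerator = alpha * v1 * v2
    denominator = numerator + beta * v3 * v4
    # Returning 0.0 for a zero denominator is an assumption, not from the text.
    return numerator / denominator if denominator else 0.0


def count_same_attributes(
    props_a: Sequence[str],
    props_b: Sequence[str],
    attr_sim: Callable[[str, str], float],
    eta: float,
) -> int:
    """Count attributes of A that match some attribute of B (the variable i)."""
    i = 0
    for prop_a in props_a:                       # step (1.4.2)
        for prop_b in props_b:                   # step (1.4.3)
            if attr_sim(prop_a, prop_b) >= eta:  # step (1.4.6)
                i += 1
                break                            # move on to the next attribute of A
    return i


# Toy usage with made-up probabilities and factors.
print(sim_word(0.6, 0.5, 0.2, 0.5, alpha=1.0, beta=1.0))
```

In this sketch `attr_sim` would wrap feature extraction, the probability lookup, and `sim_word`; keeping it as a parameter makes the counting loop testable without a trained model or a WordNet installation.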
CN201810939188.4A 2018-08-17 2018-08-17 Web service function similarity measurement method based on ontology Active CN109359289B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810939188.4A CN109359289B (en) 2018-08-17 2018-08-17 Web service function similarity measurement method based on ontology

Publications (2)

Publication Number Publication Date
CN109359289A CN109359289A (en) 2019-02-19
CN109359289B true CN109359289B (en) 2023-01-31

Family

ID=65350095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810939188.4A Active CN109359289B (en) 2018-08-17 2018-08-17 Web service function similarity measurement method based on ontology

Country Status (1)

Country Link
CN (1) CN109359289B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7457531B2 (en) * 2020-02-28 2024-03-28 株式会社Screenホールディングス Similarity calculation device, similarity calculation program, and similarity calculation method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105404619A (en) * 2015-09-08 2016-03-16 华南理工大学 Similarity based semantic Web service clustering labeling method

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
A Semantic Similarity Measure for Semantic Web Services;Jeffrey Hau等;《WWW2005》;20050514;第1-9页 *
An Information-Theoretic Definition of Similarity;Dekang Lin;《Proceedings of the Fifteenth International Conference on Machine Learning》;19980731;第296-304页 *
一种基于语义相似度的Web服务匹配方法;张亮;《情报科学》;20160228;第34卷(第2期);第21-23页 *
基于本体概念集合相似度的语义Web 服务匹配;杨佳等;《计算机技术与发展》;20120831;第22卷(第2期);第56-59页 *
基于概念相似度计算的语义Web 服务发现方法;徐德智等;《计算技术与自动化》;20100630;第29卷(第2期);第98-101页 *
面向全局社交服务网的Web服务聚类方法;陆佳炜;《计算机科学》;20180331;第45卷(第3期);第206-214页 *

Also Published As

Publication number Publication date
CN109359289A (en) 2019-02-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant