CN109359289B - Web service function similarity measurement method based on ontology

Info

Publication number: CN109359289B
Application number: CN201810939188.4A
Authority: CN
Inventors: 陆佳炜; 卢成炳; 吴涵; 周焕; 徐俊; 肖刚
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2018-08-17
Filing date: 2018-08-17
Publication date: 2023-01-31
Anticipated expiration: 2038-08-17
Also published as: CN109359289A

Abstract

An ontology-based method for measuring the functional similarity of Web services comprises the following steps: first, calculating the semantic similarity between two concepts A and B in a domain ontology; second, using the concept similarity calculation method of the first step to calculate the input similarity Sim_input of service S₁ and service S₂; third, using the concept similarity calculation method of the first step to calculate the output similarity Sim_output of service S₁ and service S₂; fourth, combining the service input similarity Sim_input and the service output similarity Sim_output obtained in the second and third steps to calculate the functional similarity FunctionalSim(S₁, S₂) of service S₁ and service S₂. The invention can reasonably measure the functional similarity between services and improve the service clustering effect.

Description

Web service function similarity measurement method based on ontology

Technical Field

The invention relates to the field of Web service evolution, in particular to a Web service function similarity measurement method based on an ontology.

Background

A Web service is a software system designed to support machine-to-machine interaction over a network. There are currently two main types of Web services: those based on SOAP and those based on REST. The two differ in the interfaces they use. SOAP-based Web services pass messages through the SOAP interface and are described with the Web Services Description Language (WSDL), an XML vocabulary that gives Web service providers a protocol-independent and encoding-independent mechanism for describing services accessible on the network and mapping them to a collection of communication endpoints capable of exchanging messages. Web services with REST interfaces use the generic HTTP methods (GET, DELETE, POST, and PUT) to describe, publish, and use the relevant resources.

Current research focuses on providing semantic descriptions of Web services by means of conceptualized knowledge called ontologies. An ontology is a vocabulary that describes a set of concepts within a domain (a particular subject area or area of knowledge) and the relationships between those concepts. It is used to reason about the properties of a domain or to define the domain itself. In the context of Web services, ontologies play an important role as a way to provide semantic descriptions of services. Enriching Web service descriptions in this way has driven the development of semantic Web services, which are described in a machine-understandable manner. Semantic Web services are expected to have a significant impact on areas such as e-commerce and application integration, because they enable dynamic, extensible, and efficient collaboration between different systems and organizations.

As Web services continue to develop, the Web services on the Internet must evolve continuously to adapt to changes in the environment and in user requirements. Web service evolution has therefore become an important research topic in the field of service computing. At the same time, because Web services are a key technology for building software services, work on enabling software systems to operate adaptively and on supporting the dynamic evolution of services has significant research and practical value.

Web service evolution generally refers to the process of changing a service after it has been released and put into operation, so that it adapts to environmental changes and continues to meet user requirements. Because Web services are dynamic, heterogeneous, autonomous, and distributed, and because most of the services integrated into a system come from different organizations, Web service evolution faces more challenges than traditional software evolution.

In service evolution research, many researchers have proposed improvements to methods for measuring semantic similarity between services. Ana G. Maguitman of the Department of Computer Science, School of Informatics, Indiana University, USA, introduced graph theory into the calculation of semantic similarity and proposed a calculation method. Dekang Lin of the University of Manitoba, Canada, proposed that the calculation of semantic similarity should consider not only the amount of information shared between concepts but also the amount of information in which they differ. Yuhua Li et al. of Manchester Metropolitan University also proposed a method for measuring semantic similarity that takes into account the effect of concept density on semantic similarity.

Disclosure of Invention

Web service evolution generally uses service clustering to narrow the search space of service samples, so that service matching can be performed within a particular cluster rather than across a large pool containing many unrelated services. Calculating the similarity between objects is, in general, an important step of a clustering algorithm. When calculating the similarity between the attributes of ontology concepts, the invention takes into account the case in which the same attribute is given different names, and calculates the similarity between attribute names with a word semantic similarity measurement method based on a naive Bayes model. It also strengthens the influence of a difference in parameter counts on the service input similarity through the ratio of the parameter count difference to the parameter count. As a result, the functional similarity between services can be measured reasonably and the service clustering effect is improved.

To solve the above technical problem, the invention adopts the following technical solution:

An ontology-based method for measuring the functional similarity of Web services comprises the following steps:

First step: calculate the semantic similarity between two concepts A and B in a domain ontology, as follows:

Step (1.1): if concepts A and B are identical, or are declared to be equivalent classes, the similarity Sim_concept of concepts A and B is 1; otherwise, go to step (1.2);

Step (1.2): if concept A is directly or indirectly a subclass of concept B, the similarity Sim_concept of concepts A and B is calculated as follows:

where prop(A) and prop(B) denote the attribute sets of concept A and concept B, and Size(prop(B)) and Size(prop(A)) denote the numbers of attributes of concept B and concept A, respectively; otherwise, go to step (1.3);

Step (1.3): if concept B is directly or indirectly a subclass of concept A, the similarity Sim_concept of concepts A and B is calculated as follows:

otherwise, go to step (1.4);

Step (1.4): if concepts A and B have no parent-child relationship but directly or indirectly share a common parent concept C, a word semantic similarity measurement method based on a naive Bayes model is used. First, all attributes of concept A and concept B are traversed, and features are extracted from their attribute names with the computeFeature function. Next, the conditional probability distribution lists and adjustment factors obtained from sample training are used to calculate the similarity Sim_word between concept attributes. Sim_word is then compared with the similarity judgment factor η to decide whether the two attributes are the same attribute, and the matches are counted. Finally, the similarity Sim_concept of concepts A and B is calculated;

Step (1.5): if the relationship between concept A and concept B satisfies none of the above conditions, the similarity Sim_concept of concepts A and B is set to 0;
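The case analysis of steps (1.1) to (1.5) can be sketched as a small dispatcher. This is a minimal illustration, not the patented implementation: the `Ontology` class, its dictionary layout, and the `name_overlap` placeholder are hypothetical. The formulas for steps (1.2) and (1.3) are not reproduced in this text, so the ratio of attribute counts used below is an assumption (it presumes a subclass inherits all attributes of its parent), and step (1.4) is delegated to a caller-supplied function.

```python
from typing import Callable, Dict, Iterable, Set, Tuple


class Ontology:
    """Toy in-memory domain ontology (hypothetical structure, not an OWL API)."""

    def __init__(self,
                 parents: Dict[str, Set[str]],
                 props: Dict[str, Set[str]],
                 equivalents: Iterable[Tuple[str, str]] = ()):
        self.parents = parents      # concept -> direct parent concepts
        self.props = props          # concept -> attribute names (inherited ones included)
        self.equivalents = {frozenset(pair) for pair in equivalents}

    def ancestors(self, concept: str) -> Set[str]:
        """All direct and indirect parent concepts of `concept`."""
        seen: Set[str] = set()
        stack = list(self.parents.get(concept, ()))
        while stack:
            parent = stack.pop()
            if parent not in seen:
                seen.add(parent)
                stack.extend(self.parents.get(parent, ()))
        return seen


def name_overlap(props_a: Set[str], props_b: Set[str]) -> float:
    """Placeholder for step (1.4): exact-name overlap ratio (an assumption)."""
    union = props_a | props_b
    return len(props_a & props_b) / len(union) if union else 0.0


def sim_concept(onto: Ontology, a: str, b: str,
                sibling_sim: Callable[[Set[str], Set[str]], float] = name_overlap) -> float:
    # Step (1.1): identical or declared-equivalent concepts.
    if a == b or frozenset((a, b)) in onto.equivalents:
        return 1.0
    anc_a, anc_b = onto.ancestors(a), onto.ancestors(b)
    props_a, props_b = onto.props.get(a, set()), onto.props.get(b, set())
    # Step (1.2): A is a (direct or indirect) subclass of B. ASSUMED formula.
    if b in anc_a:
        return len(props_b) / len(props_a) if props_a else 0.0
    # Step (1.3): B is a (direct or indirect) subclass of A. ASSUMED formula.
    if a in anc_b:
        return len(props_a) / len(props_b) if props_b else 0.0
    # Step (1.4): no parent-child relation, but a common parent concept exists.
    if anc_a & anc_b:
        return sibling_sim(props_a, props_b)
    # Step (1.5): none of the above.
    return 0.0
```

Passing the step (1.4) logic in as `sibling_sim` keeps the dispatcher independent of how attribute names are compared, so the naive Bayes based measure can replace the exact-name placeholder without changing the other branches.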

Second step: using the concept similarity calculation method of the first step, calculate the input similarity Sim_input of service S₁ and service S₂, as follows:

Step (2.1): create and initialize the maximum-matching array InSim of service input parameter similarities, then go to step (2.2);

Step (2.2): subtract the number of input parameters of service S₂ from the number of input parameters of service S₁ to obtain the parameter count difference d, then go to step (2.3);

Step (2.3): if d is less than or equal to 0, set service S₁ as S_short and service S₂ as S_long; otherwise, set service S₂ as S_short and service S₁ as S_long; then go to step (2.4);

Step (2.4): traverse the input parameters of S_long; if the traversal is complete, go to step (2.8); otherwise, take the next input parameter long_i from S_long and go to step (2.5);

Step (2.5): traverse the input parameters of S_short; if the traversal is complete, return to step (2.4); otherwise, take the next input parameter short_j from S_short and go to step (2.6);

Step (2.6): calculate the similarity Sim_ij of parameter long_i and parameter short_j using the concept similarity calculation method of the first step, then go to step (2.7);

Step (2.7): compare Sim_ij with InSim[i]; if Sim_ij is greater than InSim[i], set InSim[i] to Sim_ij; otherwise, InSim[i] keeps its original value; then return to step (2.5);

Step (2.8): calculate the input similarity Sim_input of service S₁ and service S₂ as follows:

where Size(S_long.Input) and Size(S_short.Input) denote the number of input parameters of service S_long and of service S_short, respectively, |d| denotes the difference between the numbers of input parameters of the two services, and InSim is the maximum-matching array of input parameter similarities;
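Steps (2.1) to (2.7) amount to a nested loop that records, for each parameter of the longer service, its best match among the parameters of the shorter one. The sketch below covers only those steps. The function names and the `exact_match` similarity are hypothetical, and the step (2.8) aggregation is left out on purpose because its formula is not reproduced in this text.

```python
from typing import Callable, List, Sequence, Tuple

Similarity = Callable[[str, str], float]


def exact_match(a: str, b: str) -> float:
    """Stand-in for the first-step concept similarity (an assumption)."""
    return 1.0 if a == b else 0.0


def match_inputs(s1_inputs: Sequence[str],
                 s2_inputs: Sequence[str],
                 sim: Similarity) -> Tuple[List[float], int]:
    """Return the maximum-matching array InSim and the count difference d."""
    # Step (2.2): parameter count difference.
    d = len(s1_inputs) - len(s2_inputs)
    # Step (2.3): decide which service is S_short and which is S_long.
    if d <= 0:
        s_short, s_long = s1_inputs, s2_inputs
    else:
        s_short, s_long = s2_inputs, s1_inputs
    # Step (2.1): InSim has one slot per parameter of S_long, initialised to 0.
    in_sim = [0.0] * len(s_long)
    for i, long_i in enumerate(s_long):        # step (2.4)
        for short_j in s_short:                # step (2.5)
            sim_ij = sim(long_i, short_j)      # step (2.6)
            if sim_ij > in_sim[i]:             # step (2.7)
                in_sim[i] = sim_ij
    return in_sim, d
```

The same routine applies unchanged to output parameters in the third step; only the parameter lists differ.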

Third step: using the concept similarity calculation method of the first step, calculate the output similarity Sim_output of service S₁ and service S₂, as follows:

Step (3.1): create and initialize the maximum-matching array OutSim of service output parameter similarities, then go to step (3.2);

Step (3.2): subtract the number of output parameters of service S₂ from the number of output parameters of service S₁ to obtain the parameter count difference d, then go to step (3.3);

Step (3.3): if d is less than or equal to 0, set service S₁ as S_short and service S₂ as S_long; otherwise, set service S₂ as S_short and service S₁ as S_long; then go to step (3.4);

Step (3.4): traverse the output parameters of S_long; if the traversal is complete, go to step (3.8); otherwise, take the next output parameter long_i from S_long and go to step (3.5);

Step (3.5): traverse the output parameters of S_short; if the traversal is complete, return to step (3.4); otherwise, take the next output parameter short_j from S_short and go to step (3.6);

Step (3.6): calculate the similarity Sim_ij of parameter long_i and parameter short_j using the concept similarity calculation method of the first step, then go to step (3.7);

Step (3.7): compare Sim_ij with OutSim[i]; if Sim_ij is greater than OutSim[i], set OutSim[i] to Sim_ij; otherwise, OutSim[i] keeps its original value; then return to step (3.5);

Step (3.8): calculate the output similarity Sim_output of service S₁ and service S₂ as follows:

where Size(S_long.Output) and Size(S_short.Output) denote the number of output parameters of service S_long and of service S_short, respectively, |d| denotes the difference between the numbers of output parameters of the two services, and OutSim is the maximum-matching array of output parameter similarities;

Fourth step: combining the service input similarity Sim_input and the service output similarity Sim_output obtained in the second and third steps, calculate the functional similarity FunctionalSim(S₁, S₂) of service S₁ and service S₂ as follows:

FunctionalSim(S₁, S₂) = w₁ × Sim_input + w₂ × Sim_output, where the weights w₁ and w₂ are real numbers between 0 and 1 that sum to 1 and express the importance the service consumer attaches to input similarity and output similarity, respectively.
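The weighted combination of the fourth step is simple enough to state directly in code. The sketch below is illustrative: the function name and the validation tolerance are assumptions, while the default of 0.5 for both weights follows the default given in the detailed description.

```python
def functional_sim(sim_input: float, sim_output: float,
                   w1: float = 0.5, w2: float = 0.5) -> float:
    """FunctionalSim(S1, S2) = w1 * Sim_input + w2 * Sim_output."""
    # The weights must each lie in [0, 1] and sum to 1 (tolerance is an assumption).
    if not (0.0 <= w1 <= 1.0 and 0.0 <= w2 <= 1.0):
        raise ValueError("weights must lie between 0 and 1")
    if abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w1 * sim_input + w2 * sim_output
```

For example, with Sim_input = 0.8 and Sim_output = 0.6, equal weights give 0.7, whereas a consumer who cares mostly about inputs (w₁ = 0.75, w₂ = 0.25) obtains 0.75.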

Further, step (1.4) comprises the following steps:

Step (1.4.1): set a variable i to represent the number of attributes that are the same in concept A and concept B, with an initial value of 0, then go to step (1.4.2);

Step (1.4.2): if the traversal of the attribute set prop(A) of concept A is complete, go to step (1.4.7); otherwise, take the next attribute prop(A)_j out of prop(A), remove it from prop(A), and go to step (1.4.3);

Step (1.4.3): if the traversal of the attribute set prop(B) of concept B is complete, return to step (1.4.2); otherwise, take the next attribute prop(B)_k out of prop(B), remove it from prop(B), and go to step (1.4.4);

Step (1.4.4): based on the naive Bayes model and the WordNet English dictionary, extract features from the attribute names of prop(A)_j and prop(B)_k with the computeFeature function to obtain L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k), as follows:

Determine the word semantics of each attribute name. Each word corresponds to one or more senses, so each word pair corresponds to one or more semantic pairs. Among all semantic pairs corresponding to the word pair, the shortest semantic node distance is defined as the word pair distance L(prop(A)_j, prop(B)_k), and the depth of the semantic pair with the shortest semantic node distance is defined as the word pair depth D(prop(A)_j, prop(B)_k). Given that attribute name prop(A)_j appears in the synonym sets of semantic nodes v_j1, v_j2, …, v_jn and attribute name prop(B)_k appears in the synonym sets of semantic nodes v_k1, v_k2, …, v_km, the distance and depth of prop(A)_j and prop(B)_k are calculated as follows:

where L(v_ja, v_kb) denotes the distance between semantic node v_ja and semantic node v_kb, and D(v_ja, v_kb) denotes the depth of the semantic pair (v_ja, v_kb);

Further, mean functions LW(i) and DW(o) are generated from the training set of the naive Bayes model and are then used to calculate the conditional probability distribution lists P(L(prop(A)_j, prop(B)_k) | C) and P(D(prop(A)_j, prop(B)_k) | C), where C is the word-pair class with value range {U, N}, U meaning "consistent" and N meaning "inconsistent". Finally, the adjustment factors α and β are calculated as follows:

then go to step (1.4.5);
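The feature extraction of step (1.4.4) selects, among all semantic pairs of a word pair, the pair with the shortest node distance and reports that distance together with the depth of that same pair. The sketch below illustrates this on a tiny hand-written hypernym tree rather than on WordNet itself. The tree, the sense table, and the definitions used for node distance (edge count through the lowest common ancestor) and pair depth (depth of that ancestor, root at 0) are all assumptions made for illustration; the text does not define them.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

# Hypothetical toy hypernym tree (child -> parent); NOT real WordNet data.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "object": "entity", "vehicle": "object", "car": "vehicle", "bicycle": "vehicle",
    "location": "entity", "bank_river": "location",
    "institution": "entity", "bank_finance": "institution",
}

# Hypothetical word -> semantic nodes (one entry per sense of the word).
SENSES: Dict[str, List[str]] = {
    "car": ["car"], "auto": ["car"], "bike": ["bicycle"],
    "bank": ["bank_river", "bank_finance"], "shore": ["bank_river"],
}


def path_to_root(node: str) -> List[str]:
    """Nodes from `node` up to the root, inclusive."""
    path = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def node_distance_and_depth(u: str, v: str) -> Tuple[int, int]:
    """Edge distance between u and v, and depth of their lowest common ancestor."""
    steps_from_v = {n: k for k, n in enumerate(path_to_root(v))}
    for steps_from_u, n in enumerate(path_to_root(u)):
        if n in steps_from_v:
            return steps_from_u + steps_from_v[n], len(path_to_root(n)) - 1
    raise ValueError("nodes are not in the same tree")


def compute_feature(word_a: str, word_b: str) -> Tuple[int, int]:
    """Return (L, D): shortest sense-pair distance and the depth of that pair."""
    best: Optional[Tuple[int, int]] = None
    for u, v in product(SENSES[word_a], SENSES[word_b]):
        dist, depth = node_distance_and_depth(u, v)
        if best is None or dist < best[0]:
            best = (dist, depth)
    if best is None:
        raise ValueError("one of the words has no senses")
    return best
```

With this toy data, `compute_feature("bank", "shore")` picks the river sense of "bank", since that semantic pair has distance 0, and ignores the financial sense, whose distance to "shore" is 4.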

Step (1.4.5): based on the naive Bayes model, look up the features L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k) of the ontology concept attributes in the conditional probability distribution lists obtained in step (1.4.4), and extract in turn the conditional probabilities V₁ = P(L(prop(A)_j, prop(B)_k) = i | C = U), V₂ = P(D(prop(A)_j, prop(B)_k) = o | C = U), V₃ = P(L(prop(A)_j, prop(B)_k) = i | C = N), and V₄ = P(D(prop(A)_j, prop(B)_k) = o | C = N). Finally, using the adjustment factors α and β from step (1.4.4), calculate the similarity Sim_word between prop(A)_j and prop(B)_k as follows:

Sim_word(prop(A)_j, prop(B)_k) = (αV₁ × V₂) / (αV₁ × V₂ + βV₃ × V₄); then go to step (1.4.6);
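The Sim_word formula can be sketched as a lookup in two conditional probability tables followed by the ratio given above. In this illustration the table layout (nested dictionaries keyed by class "U" or "N" and then by feature value), the handling of unseen feature values, and all numbers in the usage example are hypothetical; only the final formula comes from the text.

```python
from typing import Dict

ProbTable = Dict[str, Dict[int, float]]  # class ("U" or "N") -> feature value -> probability


def sim_word(distance: int, depth: int,
             p_distance: ProbTable, p_depth: ProbTable,
             alpha: float, beta: float) -> float:
    """Sim_word = (alpha*V1*V2) / (alpha*V1*V2 + beta*V3*V4)."""
    v1 = p_distance["U"].get(distance, 0.0)  # P(L = i | C = U)
    v2 = p_depth["U"].get(depth, 0.0)        # P(D = o | C = U)
    v3 = p_distance["N"].get(distance, 0.0)  # P(L = i | C = N)
    v4 = p_depth["N"].get(depth, 0.0)        # P(D = o | C = N)
    numerator = alpha * v1 * v2
    denominator = numerator + beta * v3 * v4
    # Unseen feature values give 0/0; returning 0.0 here is an assumption.
    return numerator / denominator if denominator else 0.0


def is_same_attribute(score: float, eta: float) -> bool:
    """Step (1.4.6) test: same attribute when Sim_word >= eta."""
    return score >= eta
```

As a usage example with made-up tables, `sim_word(0, 3, {"U": {0: 0.6}, "N": {0: 0.1}}, {"U": {3: 0.5}, "N": {3: 0.2}}, 0.5, 0.5)` evaluates 0.15 / (0.15 + 0.01), about 0.94, which would count as the same attribute for a judgment factor such as η = 0.8.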

Step (1.4.6): if Sim_word is greater than or equal to the similarity judgment factor η, prop(A)_j and prop(B)_k are the same attribute; add 1 to the variable i and return to step (1.4.2); otherwise, return to step (1.4.3);

Step (1.4.7): calculate the similarity Sim_concept of concepts A and B as follows:

where i denotes the number of attributes that are the same in concept A and concept B, prop(A) and prop(B) denote the attribute sets of concept A and concept B, and Size(prop(B)) and Size(prop(A)) denote the numbers of attributes of concept B and concept A, respectively; then go to step (1.5).
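The counting loop of steps (1.4.1) to (1.4.6) can be sketched as follows. This is one possible reading of the steps, stated as an assumption: each attribute of A is compared with the attributes of B that have not yet been matched, and a B attribute is removed only once it has been matched. The function name and the toy similarity in the usage note are hypothetical, and the final formula of step (1.4.7) is not reproduced in this text, so only the count i is returned.

```python
from typing import Callable, Sequence


def count_same_attributes(props_a: Sequence[str],
                          props_b: Sequence[str],
                          word_sim: Callable[[str, str], float],
                          eta: float) -> int:
    """Count attribute pairs judged to be the same (the variable i)."""
    remaining_b = list(props_b)   # B attributes not yet matched
    i = 0                         # step (1.4.1)
    for prop_a in props_a:        # step (1.4.2)
        for k, prop_b in enumerate(remaining_b):   # step (1.4.3)
            # Steps (1.4.4)-(1.4.5) are hidden behind word_sim.
            if word_sim(prop_a, prop_b) >= eta:    # step (1.4.6)
                i += 1
                del remaining_b[k]  # a matched B attribute is not reused
                break               # continue with the next attribute of A
    return i
```

With a toy similarity that scores "price" against "cost" at 0.9, identical names (ignoring case) at 1.0, and everything else at 0.1, the attribute lists ["price", "name", "color"] and ["cost", "Name"] give i = 2 for η = 0.8 and i = 1 for η = 0.95.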

The advantages of the method are as follows. When calculating the similarity between the attributes of ontology concepts, it takes into account the case in which the same attribute is given different names, and it calculates the similarity between attribute names with a word semantic similarity measurement method based on a naive Bayes model, which to some extent avoids the errors caused by deciding sameness purely on whether attribute names are equal. In addition, when calculating the input (output) similarity, the algorithm takes the parameter count difference d into account and strengthens the influence of that difference on the service input (output) similarity through the ratio of the parameter count difference to the parameter count, so that the functional similarity between services can be measured reasonably.

Detailed Description

The present invention is further explained below.

The ontology-based method for measuring Web service functional similarity considers the domain ontology concepts of the inputs and outputs of services; matching between inputs (outputs) mainly means matching the concepts associated with those inputs (outputs). To calculate the similarity of two concepts A and B, the relationship between the two concepts in the domain ontology must be considered.

A domain ontology is a specialized ontology that describes the knowledge of a given domain. The "domain" is chosen according to the needs of the ontology builder and may be a subject area, a combination of several areas, or a narrow part of a single area. If two concepts in a domain ontology have different names but the same set of individuals, they are called equivalent classes.

The measuring method comprises the following steps:

Step (1.2): if concept A is directly or indirectly a subclass of concept B, the similarity Sim_concept of concepts A and B is calculated as follows:

otherwise, go to step (1.4);

Step (1.4): if concepts A and B have no parent-child relationship but directly or indirectly share a common parent concept C, a word semantic similarity measurement method based on a naive Bayes model is used. First, all attributes of concept A and concept B are traversed, and features are extracted from their attribute names with the computeFeature function. Next, the conditional probability distribution lists and adjustment factors obtained from sample training are used to calculate the similarity Sim_word between concept attributes. Sim_word is then compared with the similarity judgment factor η to decide whether the two attributes are the same attribute, and the matches are counted. Finally, the similarity Sim_concept of concepts A and B is calculated. The procedure is as follows:

Step (1.4.4): based on the naive Bayes model and the WordNet English dictionary, extract features from the attribute names of prop(A)_j and prop(B)_k with the computeFeature function to obtain L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k). The naive Bayes model is one of the two most widely used classification models, and WordNet is an English dictionary created and maintained by the Cognitive Science Laboratory of Princeton University under the direction of psychology professor George A. Miller. The specific calculation process is as follows:

Determine the word semantics of each attribute name. Since each word corresponds to one or more senses, each word pair corresponds to one or more semantic pairs. Among all semantic pairs corresponding to the word pair, the shortest semantic node distance is defined as the word pair distance L(prop(A)_j, prop(B)_k), and the depth of the semantic pair with the shortest semantic node distance is defined as the word pair depth D(prop(A)_j, prop(B)_k). Given that attribute name prop(A)_j appears in the synonym sets of semantic nodes v_j1, v_j2, …, v_jn and attribute name prop(B)_k appears in the synonym sets of semantic nodes v_k1, v_k2, …, v_km, the distance and depth of prop(A)_j and prop(B)_k are calculated as follows:

where L(v_ja, v_kb) denotes the distance between semantic node v_ja and semantic node v_kb, and D(v_ja, v_kb) denotes the depth of the semantic pair (v_ja, v_kb).

Further, mean functions LW(i) and DW(o) are generated from the training set of the naive Bayes model and are then used to calculate the conditional probability distribution lists P(L(prop(A)_j, prop(B)_k) | C) and P(D(prop(A)_j, prop(B)_k) | C), where C is the word-pair class with value range {U, N}, U meaning "consistent" and N meaning "inconsistent". Finally, the adjustment factors α and β are calculated as follows:

then go to step (1.4.5);
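The prose definition of the word pair distance and depth in step (1.4.4) can be written compactly. The following is a reconstruction from that wording, not the patent's own typeset formula; the index pair (a*, b*), which marks the semantic pair with the shortest node distance, is notation introduced here for illustration.

```latex
\[
\begin{aligned}
(a^{*}, b^{*}) &= \operatorname*{arg\,min}_{1 \le a \le n,\; 1 \le b \le m} L(v_{ja}, v_{kb}) \\
L(\mathrm{prop}(A)_j, \mathrm{prop}(B)_k) &= L(v_{ja^{*}}, v_{kb^{*}}) = \min_{a,\,b} L(v_{ja}, v_{kb}) \\
D(\mathrm{prop}(A)_j, \mathrm{prop}(B)_k) &= D(v_{ja^{*}}, v_{kb^{*}})
\end{aligned}
\]
```

Note that D is not minimized or maximized on its own: it is read off the same semantic pair that attains the minimum distance.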

Step (1.4.5): based on the naive Bayes model, look up the features L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k) of the ontology concept attributes in the conditional probability distribution lists obtained in step (1.4.4), and extract in turn the conditional probabilities V₁ = P(L(prop(A)_j, prop(B)_k) = i | C = U), V₂ = P(D(prop(A)_j, prop(B)_k) = o | C = U), V₃ = P(L(prop(A)_j, prop(B)_k) = i | C = N), and V₄ = P(D(prop(A)_j, prop(B)_k) = o | C = N). Finally, using the adjustment factors α and β from step (1.4.4), calculate the similarity Sim_word between prop(A)_j and prop(B)_k as follows:

Sim_word(prop(A)_j, prop(B)_k) = (αV₁ × V₂) / (αV₁ × V₂ + βV₃ × V₄)

Step (1.4.6): if Sim_word is greater than or equal to the similarity judgment factor η, prop(A)_j and prop(B)_k are the same attribute; add 1 to the variable i and return to step (1.4.2); otherwise, return to step (1.4.3);

where i denotes the number of attributes that are the same in concept A and concept B, prop(A) and prop(B) denote the attribute sets of concept A and concept B, and Size(prop(B)) and Size(prop(A)) denote the numbers of attributes of concept B and concept A, respectively; then go to step (1.5);

Second step: using the concept similarity calculation method of the first step, calculate the input similarity Sim_input of service S₁ and service S₂, as follows:

Step (2.2): subtract the number of input parameters of service S₂ from the number of input parameters of service S₁ to obtain the parameter count difference d, then go to step (2.3);

Step (2.4): traverse the input parameters of S_long; if the traversal is complete, go to step (2.8); otherwise, take the next input parameter long_i from S_long and go to step (2.5);

Step (2.6): calculate the similarity Sim_ij of parameter long_i and parameter short_j using the concept similarity calculation method of the first step, then go to step (2.7);

Step (2.7): compare Sim_ij with InSim[i]; if Sim_ij is greater than InSim[i], set InSim[i] to Sim_ij; otherwise, InSim[i] keeps its original value; then return to step (2.5);

where Size(S_long.Input) and Size(S_short.Input) denote the number of input parameters of service S_long and of service S_short, respectively, |d| denotes the difference between the numbers of input parameters of the two services, and InSim is the maximum-matching array of input parameter similarities;

Third step: using the concept similarity calculation method of the first step, calculate the output similarity Sim_output of service S₁ and service S₂, as follows:

Step (3.1): create and initialize the maximum-matching array OutSim of service output parameter similarities, then go to step (3.2);

Step (3.2): subtract the number of output parameters of service S₂ from the number of output parameters of service S₁ to obtain the parameter count difference d, then go to step (3.3);

Step (3.3): if d is less than or equal to 0, set service S₁ as S_short and service S₂ as S_long; otherwise, set service S₂ as S_short and service S₁ as S_long; then go to step (3.4);

Step (3.5): traverse the output parameters of S_short; if the traversal is complete, return to step (3.4); otherwise, take the next output parameter short_j from S_short and go to step (3.6);

Step (3.6): calculate the similarity Sim_ij of parameter long_i and parameter short_j using the concept similarity calculation method of the first step, then go to step (3.7);

Step (3.8): calculate the output similarity Sim_output of service S₁ and service S₂ as follows:

where Size(S_long.Output) and Size(S_short.Output) denote the number of output parameters of service S_long and of service S_short, respectively, |d| denotes the difference between the numbers of output parameters of the two services, and OutSim is the maximum-matching array of output parameter similarities;

FunctionalSim(S₁, S₂) = w₁ × Sim_input + w₂ × Sim_output, where the weights w₁ and w₂ are real numbers between 0 and 1 that sum to 1. They express the importance the service consumer attaches to input similarity and output similarity, respectively. By default, w₁ and w₂ are both set to 0.5.

Claims

1. A Web service function similarity measurement method based on an ontology is characterized by comprising the following steps:

otherwise, go to step (1.4);

Step (1.4): if concepts A and B have no parent-child relationship but directly or indirectly share a common parent concept C, a word semantic similarity measurement method based on a naive Bayes model is used: first, the attributes of concept A and concept B are traversed, and features are extracted from their attribute names with the computeFeature function; next, the conditional probability distribution lists and adjustment factors obtained from sample training are used to calculate the similarity Sim_word between concept attributes; Sim_word is then compared with the similarity judgment factor η to decide whether the two attributes are the same attribute, and the matches are counted; finally, the similarity Sim_concept of concepts A and B is calculated;

Step (1.5): if the relationship between concept A and concept B satisfies none of the above conditions, the similarity Sim_concept of concepts A and B is set to 0;

Second step: using the concept similarity calculation method of the first step, calculate the input similarity Sim_input of service S₁ and service S₂, as follows:

Step (2.1): create and initialize the maximum-matching array InSim of service input parameter similarities, then go to step (2.2);

where Size(S_long.Input) and Size(S_short.Input) denote the number of input parameters of service S_long and of service S_short, respectively, |d| denotes the difference between the numbers of input parameters of the two services, and InSim is the maximum-matching array of input parameter similarities;

Step (3.4): traverse the output parameters of S_long; if the traversal is complete, go to step (3.8); otherwise, take the next output parameter long_i from S_long and go to step (3.5);

Step (3.5): traverse the output parameters of S_short; if the traversal is complete, return to step (3.4); otherwise, take the next output parameter short_j from S_short and go to step (3.6);

Step (3.6): calculate the similarity Sim_ij of parameter long_i and parameter short_j using the concept similarity calculation method of the first step, then go to step (3.7);

Step (3.7): compare Sim_ij with OutSim[i]; if Sim_ij is greater than OutSim[i], set OutSim[i] to Sim_ij; otherwise, OutSim[i] keeps its original value; then return to step (3.5);

where Size(S_long.Output) and Size(S_short.Output) denote the number of output parameters of service S_long and of service S_short, respectively, |d| denotes the difference between the numbers of output parameters of the two services, and OutSim is the maximum-matching array of output parameter similarities;

Fourth step: combining the service input similarity Sim_input and the service output similarity Sim_output obtained in the second and third steps, calculate the functional similarity FunctionalSim(S₁, S₂) of service S₁ and service S₂ as follows:

FunctionalSim(S₁, S₂) = w₁ × Sim_input + w₂ × Sim_output, where the weights w₁ and w₂ are real numbers between 0 and 1 that sum to 1 and express the importance the service consumer attaches to input similarity and output similarity, respectively.

2. The method of claim 1, wherein the step (1.4) comprises the following steps:

Step (1.4.2): if the traversal of the attribute set prop(A) of concept A is complete, go to step (1.4.7); otherwise, take the next attribute prop(A)_j out of prop(A), remove it from prop(A), and go to step (1.4.3);

Step (1.4.4): based on the naive Bayes model and the WordNet English dictionary, extract features from the attribute names of prop(A)_j and prop(B)_k with the computeFeature function to obtain L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k), as follows:

determine the word semantics of each attribute name, where each word corresponds to one or more senses and each word pair therefore corresponds to one or more semantic pairs; among all semantic pairs corresponding to the word pair, the shortest semantic node distance is defined as the word pair distance L(prop(A)_j, prop(B)_k), and the depth of the semantic pair with the shortest semantic node distance is defined as the word pair depth D(prop(A)_j, prop(B)_k); given that attribute name prop(A)_j appears in the synonym sets of semantic nodes v_j1, v_j2, …, v_jn and attribute name prop(B)_k appears in the synonym sets of semantic nodes v_k1, v_k2, …, v_km, the distance and depth of prop(A)_j and prop(B)_k are calculated as follows:

where L(v_ja, v_kb) denotes the distance between semantic node v_ja and semantic node v_kb, and D(v_ja, v_kb) denotes the depth of the semantic pair (v_ja, v_kb);

mean functions LW(i) and DW(o) are generated from the training set of the naive Bayes model and are then used to calculate the conditional probability distribution lists P(L(prop(A)_j, prop(B)_k) | C) and P(D(prop(A)_j, prop(B)_k) | C), where C is the word-pair class with value range {U, N}, U meaning "consistent" and N meaning "inconsistent"; finally, the adjustment factors α and β are calculated as follows:

Step (1.4.5): based on the naive Bayes model, look up the features L(prop(A)_j, prop(B)_k) and D(prop(A)_j, prop(B)_k) of the ontology concept attributes in the conditional probability distribution lists obtained in step (1.4.4), and extract in turn the conditional probabilities V₁ = P(L(prop(A)_j, prop(B)_k) = i | C = U), V₂ = P(D(prop(A)_j, prop(B)_k) = o | C = U), V₃ = P(L(prop(A)_j, prop(B)_k) = i | C = N), and V₄ = P(D(prop(A)_j, prop(B)_k) = o | C = N); finally, using the adjustment factors α and β from step (1.4.4), calculate the similarity Sim_word between prop(A)_j and prop(B)_k as follows:

Sim_word(prop(A)_j, prop(B)_k) = (αV₁ × V₂) / (αV₁ × V₂ + βV₃ × V₄)

Step (1.4.6): if Sim_word is greater than or equal to the similarity judgment factor η, prop(A)_j and prop(B)_k are the same attribute; add 1 to the variable i and return to step (1.4.2); otherwise, return to step (1.4.3);

where i denotes the number of attributes that are the same in concept A and concept B, prop(A) and prop(B) denote the attribute sets of concept A and concept B, and Size(prop(B)) and Size(prop(A)) denote the numbers of attributes of concept B and concept A, respectively; then go to step (1.5).