CN111797196A - Service discovery method combining attention mechanism LSTM and neural topic model - Google Patents


Info

Publication number
CN111797196A
CN111797196A (application CN202010483308.1A)
Authority
CN
China
Prior art keywords
service description
query request
vector
service
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010483308.1A
Other languages
Chinese (zh)
Other versions
CN111797196B (en)
Inventor
李兵
姚力
王健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202010483308.1A priority Critical patent/CN111797196B/en
Publication of CN111797196A publication Critical patent/CN111797196A/en
Application granted granted Critical
Publication of CN111797196B publication Critical patent/CN111797196B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention discloses a service discovery method combining an attention-mechanism LSTM and a neural topic model, comprising the following steps: 1: extracting natural language words from the tags of the Web service description language and preprocessing them; 2: performing semantic enhancement on the keywords extracted in step 1 and processing them through a neural topic model to obtain the topic information of the description; 3: embedding the key vocabulary from step 1 using pre-trained word vectors; 4: on the basis of steps 2 and 3, performing feature extraction with a bidirectional LSTM combined with an attention mechanism; 5: on the basis of step 4, computing the similarity between the feature vectors of the query request and the service description, and finding the k services with the highest similarity in the registered service library. The beneficial effects of the invention are: by processing the semantic information present in Web service descriptions and enhancing it with external information, services meeting the functional requirements of users are found among a large number of registered services based on the semantics of the user query.

Description

Service discovery method combining attention mechanism LSTM and neural topic model
Technical Field
The invention relates to the technical field of computers, in particular to a service discovery method based on the combination of an attention mechanism LSTM and a neural topic model.
Background
Service-oriented architectures have spawned a new paradigm of software development and integration in which system functions are packaged as loosely coupled, interoperable services. To meet the high interoperability and flexibility requirements of modern software application development, more and more Web services and cloud services are therefore being developed. The proliferation of Web services brings convenience to developers, but also makes it difficult to quickly select appropriate services that meet user needs from a large-scale service registry.
In existing service registries, Web services are mostly described in the Web Services Description Language (WSDL). The keywords that can be extracted from such descriptions are very limited in number and semantically sparse, and are difficult to compose into reasonable natural sentences. As a result, the keyword-matching approach employed by most service search engines may retrieve irrelevant services or miss relevant ones; two classes of methods have been proposed to improve on it. The first class annotates services and queries with domain ontologies and performs service matching using ontology reasoning. However, constructing such ontologies and semantically annotating Web services is time-consuming and difficult to apply in practice. The other class uses machine learning techniques for service matching: these methods obtain the topic distributions of service descriptions and user query texts through an LDA model, and combine word vectors with the topic model to alleviate the semantic sparsity of service descriptions.
The inventor of the present application finds that the method of the prior art has at least the following technical problems in the process of implementing the present invention:
some machine learning and deep learning methods from the natural language processing field have made progress in service clustering, service recommendation and the like, but because complex deep learning models and methods require a large number of sentences containing contextual information as training corpora, these methods remain difficult to apply directly to the service discovery field.
Therefore, the method in the prior art has the technical problem of poor service discovery effect.
Disclosure of Invention
The invention provides a service discovery method based on attention mechanism LSTM and neural topic model combination, which is used for solving or at least partially solving the technical problem of poor service discovery effect of the method in the prior art.
In order to solve the technical problem, the invention provides a service discovery method based on the combination of attention-based LSTM and neural topic model, which comprises the following steps:
s1: respectively extracting keywords from the Web service description and the query request, and preprocessing the extracted service description keywords and query request keywords;
s2: performing semantic enhancement on the preprocessed service description keywords and processing the preprocessed service description keywords by a neural topic model to obtain topic information of service description, and performing semantic enhancement on the preprocessed query request keywords and processing the preprocessed query request keywords by the neural topic model to obtain topic information of the query request;
s3: converting the preprocessed service description keywords and query request keywords into vectorization forms by a word embedding technology, and obtaining a word vector matrix of the service description and a word vector matrix of the query request;
s4: performing feature extraction on service description by combining a bidirectional LSTM of an attention mechanism based on the topic information of the service description and a word vector matrix of the service description to obtain a semantic feature vector of the service description, and performing feature extraction on a query request by combining the bidirectional LSTM of the attention mechanism based on the topic information of the query request and the word vector matrix of the query request to obtain the semantic feature vector of the query request;
s5: calculating the similarity between the semantic feature vector of the query request and the semantic feature vectors of the service descriptions, and finding the k services with the highest similarity to the query request in the registered service library, wherein k is a positive integer.
In one embodiment, S1 specifically includes:
s1.1: extracting keywords from the Web service description and the query request respectively, and extracting natural language words as service description keywords and query request keywords;
s1.2: and performing word segmentation, stop word removal and word shape reduction on the extracted natural language words.
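The preprocessing in S1.2 can be sketched as follows. The stopword list and lemma map here are tiny illustrative stand-ins (a real pipeline would use a full resource such as NLTK's stopword list and WordNet lemmatizer):

```python
import re

# Minimal illustrative resources; names and contents are assumptions.
STOPWORDS = {"a", "an", "the", "of", "for", "and", "to", "in"}
LEMMAS = {"queries": "query", "services": "service", "mapped": "map"}

def preprocess(text):
    """Tokenize, drop stopwords, and crudely lemmatize a description."""
    tokens = re.findall(r"[a-z]+", text.lower())           # word segmentation
    tokens = [t for t in tokens if t not in STOPWORDS]     # stop-word removal
    return [LEMMAS.get(t, t) for t in tokens]              # lemmatization

print(preprocess("Queries for the mapping of Web services"))
# → ['query', 'mapping', 'web', 'service']
```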
In one embodiment, S2 specifically includes:
s2.1: querying the terms corresponding to the extracted service description keywords and query request keywords on a preset encyclopedia website, and appending the first paragraph of each term's definition to the extracted keywords as semantic-enhancement content;
s2.2: performing bag-of-words processing on the service description and query request description information subjected to semantic enhancement to obtain bag-of-words vectors of the service description and bag-of-words vectors of the query request;
s2.3: taking the bag-of-words vector of the service description and the bag-of-words vector of the query request as input to the neural topic model, obtaining the reparameterization parameters through a multilayer perceptron, and normalizing the reparameterized result with softmax to obtain the topic information of the service description and the topic information of the query request.
In one embodiment, S3 specifically includes:
searching word vectors of corresponding words for the service description keywords preprocessed in the step S1 through a pre-trained word vector model, and splicing the vectors of all words in the service description into a word vector matrix corresponding to the service description;
and finding the word vectors of the corresponding words through the pre-trained word vector model for the query request keywords preprocessed in the step S1, and splicing the vectors of all the words in the query request into a word vector matrix corresponding to the query request.
In one embodiment, S4 specifically includes:
s4.1: using bidirectional LSTM to extract sequence characteristics of the word vector matrix of the service description and the word vector matrix of the query request obtained in S3, and obtaining context vectors corresponding to each word in the word vector matrix of the service description and the word vector matrix of the query request;
s4.2: based on the extracted context vectors, the topic information of the service description and the topic information of the query request, obtaining a correlation coefficient for each word via the attention mechanism's fully connected layer, activation function and normalization, wherein each word is assigned a weight representing the correlation between that word and the overall topic distribution of the description;
s4.3: and based on the correlation coefficient of each vocabulary obtained in the S4.2, performing weighted summation on the context vector matrix obtained in the S4.1, and taking the weighted summation result as the semantic feature vector of the service description and the semantic feature vector of the query request.
In one embodiment, S5 specifically includes:
calculating the cosine similarity between the semantic feature vector of the query request and the semantic feature vectors of the service descriptions, sorting the results, and returning the service descriptions whose similarity meets a preset condition as the services that satisfy the functional requirements of the user query.
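The ranking step can be sketched as below; the service names and vectors are invented toy data:

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def top_k_services(query_vec, service_vecs, k):
    """Rank registered services by cosine similarity to the query vector."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in service_vecs.items()]
    scored.sort(key=lambda p: p[1], reverse=True)
    return scored[:k]

services = {"weather": [1.0, 0.0], "maps": [0.6, 0.8], "mail": [0.0, 1.0]}
print(top_k_services([1.0, 0.2], services, k=2))  # weather first, then maps
```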
In one embodiment, in S2.3 the topic information of the service description is obtained by feeding the bag-of-words vector of the service description into a multilayer perceptron, using the following formulas:

π = relu(W_π · X_BoW + b_π)
μ = relu(W_μ · π + b_μ)
log σ = relu(W_σ · π + b_σ)

where X_BoW is the bag-of-words representation of the semantically enhanced service description, π is an intermediate variable used to compute the subsequent μ and σ, μ and σ are respectively the mean and standard deviation used in the subsequent reparameterization, W_* are the parameter matrices of the multilayer perceptron, and b_* are its bias terms. The relu activation function is:

relu(x) = max(0, x)

The reparameterization procedure used in S2.3 is as follows:

u ~ N(0, I)
z = u * σ + μ

that is, a multivariate random variable u is sampled from the standard normal distribution, then scaled and shifted by the computed standard deviation σ and mean μ to obtain the vector z, which is normalized by a softmax function into the topic distribution of the service description:

θ = softmax(relu(W_θ · z + b_θ))

where W_θ and b_θ are a multilayer-perceptron parameter matrix and bias term respectively, and θ is the normalized topic distribution of the current input text.
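A minimal NumPy sketch of this inference network, with toy dimensions and randomly initialized (untrained) weights — all sizes and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda x: np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

V, H, K = 6, 4, 3                      # vocab size, hidden size, topics (toy)
W_pi, b_pi = 0.1 * rng.normal(size=(H, V)), np.zeros(H)
W_mu, b_mu = 0.1 * rng.normal(size=(K, H)), np.zeros(K)
W_sg, b_sg = 0.1 * rng.normal(size=(K, H)), np.zeros(K)
W_th, b_th = 0.1 * rng.normal(size=(K, K)), np.zeros(K)

def topic_distribution(x_bow):
    """MLP -> (mu, log sigma) -> reparameterize -> softmax topic distribution."""
    pi = relu(W_pi @ x_bow + b_pi)
    mu = relu(W_mu @ pi + b_mu)
    log_sigma = relu(W_sg @ pi + b_sg)
    u = rng.standard_normal(K)              # u ~ N(0, I)
    z = u * np.exp(log_sigma) + mu          # reparameterization trick
    return softmax(relu(W_th @ z + b_th))   # theta

theta = topic_distribution(np.array([1.0, 0.0, 2.0, 0.0, 1.0, 0.0]))
print(theta)  # a K-dimensional distribution summing to 1
```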
In one embodiment, S4.1 performs feature extraction on the word vector matrix from S3 using a bidirectional LSTM, with the following formulas:

f_t = sig(W_f · [h_{t-1}, x_t] + b_f)
i_t = sig(W_i · [h_{t-1}, x_t] + b_i)
c̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)
C_t = f_t ⊙ C_{t-1} + i_t ⊙ c̃_t
o_t = sig(W_o · [h_{t-1}, x_t] + b_o)
h_t = o_t ⊙ tanh(C_t)

For time step t, the input is the embedded representation x_t of the corresponding word, W_* denotes a parameter matrix and b_* a bias term. The hidden state of the current time step is h_t and its cell state is C_t; the hidden state of the previous time step is h_{t-1} and its cell state is C_{t-1}. f_t is the forgetting coefficient applied to the previous cell state C_{t-1}, c̃_t is the candidate long-term state representing the input of the current time step, i_t is the coefficient applied to the current input c̃_t, and o_t is the coefficient with which the current cell state C_t is output as the hidden state h_t. sig denotes the sigmoid function and ⊙ denotes element-wise multiplication; the sigmoid and tanh functions are computed as:

sig(x) = 1 / (1 + e^{-x})
tanh(x) = (e^x - e^{-x}) / (e^x + e^{-x})

In a bidirectional LSTM, each time step outputs two hidden vectors, one for the forward direction and one for the backward direction, which are concatenated into the context vector corresponding to the current word:

H_s = [h_t^f ; h_t^b]

where h_t^f is the forward hidden vector, h_t^b is the backward hidden vector, and H_s is the context vector corresponding to the current word.
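The recurrence above can be sketched in NumPy as a single fused-gate step (a common implementation choice: the per-gate weight matrices of the formulas are stacked into one matrix here; all sizes are toy assumptions):

```python
import numpy as np

sig = lambda x: 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; W maps [h_prev; x_t] to the four gate pre-activations."""
    hx = np.concatenate([h_prev, x_t])
    f, i, g, o = np.split(W @ hx + b, 4)   # gate pre-activations
    f, i, o = sig(f), sig(i), sig(o)       # forget, input, output gates
    c_t = f * c_prev + i * np.tanh(g)      # cell state update
    h_t = o * np.tanh(c_t)                 # hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
D, H = 3, 2                                # embedding and hidden sizes (toy)
W = 0.1 * rng.normal(size=(4 * H, H + D))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(5, D)):        # left-to-right pass over 5 words
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape)
```

A bidirectional LSTM would additionally run a second pass right-to-left with separate weights and concatenate the two hidden vectors of each word into its context vector H_s.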
In one embodiment, the attention mechanism employed in S4.2 is expressed as:

a_s = W_a · tanh(W_θ · θ_s + W_h · H_s)
A_s = softmax(a_s)

where W_* denotes a parameter matrix, H_s is the output matrix of context vectors produced by the bidirectional LSTM for the current text, a_s is the vector of per-word weights computed from the topic distribution θ_s of the current text, and A_s is the normalized weight vector.

In S4.3, the weights and context vectors are combined by weighted summation to obtain the semantic feature vector of the current document:

O_s = Σ_i A_{s,i} · H_{s,i}

where O_s denotes the semantic feature vector of the current document, which may be the semantic feature vector of a service description or the semantic feature vector of a query request.

In one embodiment, the similarity between a service description and a user query is computed as:

sim(s, q) = cosine(O_s, O_q) = (O_s · O_q) / (‖O_s‖ ‖O_q‖)

where O_s and O_q respectively denote the extracted semantic feature vector of the service description and that of the query request, and cosine is the function computing the cosine similarity of two vectors.
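A NumPy sketch of the topic-conditioned attention pooling, under the assumption that the topic term W_θ·θ_s is broadcast across the per-word rows of H_s (toy shapes, random untrained weights):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H_s, theta_s, W_a, W_th, W_h):
    """Topic-conditioned attention over per-word context vectors (rows of H_s)."""
    # a_s = W_a · tanh(W_theta·theta_s + W_h·H_s); topic term broadcast per word
    a = np.tanh(theta_s @ W_th.T + H_s @ W_h.T) @ W_a   # one score per word
    A = softmax(a)                                      # normalized weights A_s
    return A @ H_s                                      # weighted sum -> O_s

rng = np.random.default_rng(1)
n, d, k = 4, 3, 2                      # words, context dim, topics (toy sizes)
H_s = rng.normal(size=(n, d))          # stand-in for bi-LSTM outputs
theta = softmax(rng.normal(size=k))    # stand-in topic distribution
W_a, W_th, W_h = rng.normal(size=d), rng.normal(size=(d, k)), rng.normal(size=(d, d))
O_s = attention_pool(H_s, theta, W_a, W_th, W_h)
print(O_s.shape)  # semantic feature vector of dimension d
```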
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
the invention provides a service discovery method combining an LSTM and a neural topic model based on an attention mechanism, which comprises the steps of firstly, respectively extracting keywords of Web service description and a query request, and preprocessing the extracted service description keywords and query request keywords; then, performing semantic enhancement on the preprocessed keywords and processing the keywords by a neural topic model to obtain topic information of service description and topic information of a query request; secondly, converting the preprocessed keywords into a vectorization form by a word embedding technology, and obtaining a word vector matrix; then, based on the topic information and the word vector matrix, performing feature extraction on the current description by combining a bidirectional LSTM of an attention mechanism to obtain a semantic feature vector of the service description and a semantic feature vector of the query request; and finally, calculating the similarity of the semantic feature vector of the query request and the semantic feature vector of the service description, and finding out k services with the highest similarity to the query request from the registration service library.
Compared with the prior art, the method can enhance the semantics of the original service description with sparse semantics and the user query request, further judge the matching degree of the service and the user query based on the semantic similarity, find the service meeting the functional requirements of the user in a large number of registration service libraries, improve the service discovery effect and improve the accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a general framework diagram of the service discovery method of the present invention incorporating attention-driven LSTM and semantically enhanced neural topic models;
FIG. 2 is a schematic diagram of a neural topic model of the present invention;
FIG. 3 is a schematic diagram of the structure of an LSTM incorporating the attention mechanism of the present invention;
FIG. 4 is a graph of model performance for different numbers of topics for the model in an embodiment of the present invention;
FIG. 5 is a graph comparing the performance of other mainstream model methods in an embodiment of the invention.
Detailed Description
The invention provides a service discovery method combining attention mechanism LSTM and a neural topic model. The service description with sparse semantics and the user query can be subjected to semantic enhancement, the matching degree of the service and the user query is judged according to semantic similarity, and the service capable of meeting the functional requirements of the user is found in a large number of registration service libraries.
The technical scheme of the invention is as follows:
1: extracting natural language words from the tags of the Web service description language and preprocessing them; 2: performing semantic enhancement on the keywords extracted in step 1 and processing them through a neural topic model to obtain the topic information of the description; 3: embedding the key vocabulary from step 1 using word vectors trained on a large-scale data set; 4: after steps 2 and 3 are completed, performing feature extraction with a bidirectional LSTM combined with an attention mechanism (the query request is processed with the same method as steps 1-4); 5: after step 4 is completed, computing the similarity of the feature vectors of the query request and the service description, and finding the k services with the highest similarity in the registered service library.
The invention has the beneficial effects that: by processing semantic information existing in the description of the Web service and using external information to enhance the semantic information, services meeting the functional requirements of users are found in a large number of registration services based on the semantics of user queries.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment provides a service discovery method based on the combination of attention mechanism LSTM and neural topic model, which comprises the following steps:
s1: respectively extracting keywords from the Web service description and the query request, and preprocessing the extracted service description keywords and query request keywords;
s2: performing semantic enhancement on the preprocessed service description keywords and processing the preprocessed service description keywords by a neural topic model to obtain topic information of service description, and performing semantic enhancement on the preprocessed query request keywords and processing the preprocessed query request keywords by the neural topic model to obtain topic information of the query request;
s3: converting the preprocessed service description keywords and query request keywords into vectorization forms by a word embedding technology, and obtaining a word vector matrix of the service description and a word vector matrix of the query request;
s4: performing feature extraction on service description by combining a bidirectional LSTM of an attention mechanism based on the topic information of the service description and a word vector matrix of the service description to obtain a semantic feature vector of the service description, and performing feature extraction on a query request by combining the bidirectional LSTM of the attention mechanism based on the topic information of the query request and the word vector matrix of the query request to obtain the semantic feature vector of the query request;
s5: calculating the similarity between the semantic feature vector of the query request and the semantic feature vectors of the service descriptions, and finding the k services with the highest similarity to the query request in the registered service library, wherein k is a positive integer.
Specifically, for Web service description and a query request of a user, the method of steps 1-4 is adopted for processing to obtain a corresponding semantic feature vector.
S2 first performs semantic enhancement on the extracted keywords, for example, the interpretations of related keywords can be searched from encyclopedic websites as semantic enhancement information, and then obtains the topic information of the service description and query request through a pre-constructed neural topic model.
S4 is to extract the feature of the word vector matrix through a bidirectional LSTM, then to sum the outputs of the bidirectional LSTM by weighting according to the described topic distribution and attention mechanism obtained in S2, and finally to obtain the corresponding semantic feature vector.
In S5, the service closest to the query request is selected according to the similarity between the semantic feature vectors.
In one embodiment, S1 specifically includes:
s1.1: extracting keywords from the Web service description and the query request respectively, and extracting natural language words as service description keywords and query request keywords;
s1.2: and performing word segmentation, stop word removal and word shape reduction on the extracted natural language words.
In one embodiment, S2 specifically includes:
s2.1: querying the terms corresponding to the extracted service description keywords and query request keywords on a preset encyclopedia website, and appending the first paragraph of each term's definition to the extracted keywords as semantic-enhancement content;
s2.2: performing bag-of-words processing on the service description and query request description information subjected to semantic enhancement to obtain bag-of-words vectors of the service description and bag-of-words vectors of the query request;
s2.3: taking the bag-of-words vector of the service description and the bag-of-words vector of the query request as input to the neural topic model, obtaining the reparameterization parameters through a multilayer perceptron, and normalizing the reparameterized result with softmax to obtain the topic information of the service description and the topic information of the query request.
Specifically, for the Web service description, according to the extracted keywords, the corresponding vocabulary entry is inquired from the encyclopedic website, and semantic enhancement is performed. The processing for the query request is similar.
In one embodiment, S3 specifically includes:
searching word vectors of corresponding words for the service description keywords preprocessed in the step S1 through a pre-trained word vector model, and splicing the vectors of all words in the service description into a word vector matrix corresponding to the service description;
and finding the word vectors of the corresponding words through the pre-trained word vector model for the query request keywords preprocessed in the step S1, and splicing the vectors of all the words in the query request into a word vector matrix corresponding to the query request.
Specifically, the pre-trained word vector model refers to a word vector model trained on large-scale natural language data, and the word vector model can output corresponding word vectors by inputting words and phrases.
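A toy sketch of this lookup, with an invented three-word embedding table standing in for a pre-trained model such as GloVe (all entries are illustrative):

```python
import numpy as np

# Hypothetical pre-trained embedding table; real systems would load actual vectors.
EMBEDDINGS = {
    "weather": np.array([0.9, 0.1, 0.0]),
    "forecast": np.array([0.8, 0.2, 0.1]),
    "map": np.array([0.1, 0.9, 0.3]),
}

def word_vector_matrix(keywords, dim=3):
    """Stack each keyword's vector into one matrix; unknown words get zeros."""
    rows = [EMBEDDINGS.get(w, np.zeros(dim)) for w in keywords]
    return np.vstack(rows)

M = word_vector_matrix(["weather", "forecast", "unknown"])
print(M.shape)  # one row per keyword
```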
In one embodiment, S4 specifically includes:
s4.1: using bidirectional LSTM to extract sequence characteristics of the word vector matrix of the service description and the word vector matrix of the query request obtained in S3, and obtaining context vectors corresponding to each word in the word vector matrix of the service description and the word vector matrix of the query request;
s4.2: based on the extracted context vectors, the topic information of the service description and the topic information of the query request, obtaining a correlation coefficient for each word via the attention mechanism's fully connected layer, activation function and normalization, wherein each word is assigned a weight representing the correlation between that word and the overall topic distribution of the description;
s4.3: and based on the correlation coefficient of each vocabulary obtained in the S4.2, performing weighted summation on the context vector matrix obtained in the S4.1, and taking the weighted summation result as the semantic feature vector of the service description and the semantic feature vector of the query request.
Specifically, after bi-directional LSTM processing, each vocabulary (keyword) gets a corresponding context vector, e.g., a keyword extracted from a service description has a corresponding context vector, and similarly, a keyword in a query request also has a context vector.
Taking the service description as an example, the context vectors and the topic information corresponding to the extracted service description keywords are used as input, and the attention mechanism processes them through a fully connected layer, an activation function and normalization to obtain the correlation coefficients, i.e. the correlation between each current word (keyword) and the overall topic distribution of the service description. The corresponding semantic feature vector is then obtained as the weighted sum of the correlation coefficients and the context vectors. The query request is processed with similar steps, which are not described in detail here.
In one embodiment, S5 specifically includes:
calculating cosine similarity of the semantic feature vectors of the query request and the semantic feature vectors of the service description, sequencing, and returning the service description with the similarity meeting the preset condition as the service meeting the functional requirements in the user query.
Services with high similarity are returned as the services that best meet, semantically, the functional requirements of the user query.
In one embodiment, in S2.3, the topic information of the service description is obtained by inputting the bag vector of the service description, and the input is processed by using the multi-layer perceptron, using the following formula:
π=relu(Wπ·XBoW+bπ),
μ=relu(Wμ·π+bμ),
logσ=relu(Wσ·π+bσ),
wherein, the bag-of-words form of the service description after semantic enhancement is expressed as XBoWPi is an intermediate variable used for calculating subsequent mu and sigma, the mu and the sigma are respectively a mean value and a standard deviation in the subsequent heavy parameter process, W represents a parameter matrix of the multilayer perceptron, b*Representing the bias term of the multi-layer perceptron, the formula of the relu activation function is as follows:
relu(x)=max(0,x)
the formula for the reparameterisation procedure used in S2.3 is as follows:
u=Drawu~N(0,1)
z=u*σ+μ
the multivariate random variable u which is subjected to the standard normal distribution is obtained by sampling from the standard normal distribution, the mean value mu and the standard deviation sigma which are obtained by calculation are used for carrying out scaling and translation to obtain a vector z, and a softmax function is used for carrying out normalization output to be the theme distribution of the service description, wherein the formula is described as follows:
θ=softmax(relu(Wθ·z+bθ))
wherein Wθ and bθ are respectively a multi-layer perceptron parameter matrix and bias term, and θ is the normalized topic distribution of the current input text.
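The encoder and reparameterization steps above can be sketched in Python as follows. This is an illustrative sketch only: the dimensions, the random initialization, and the variable names are hypothetical stand-ins for a trained model's learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
V, H, K = 1000, 256, 130      # vocabulary size, hidden size, topic number (all hypothetical)

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical (randomly initialized) multi-layer perceptron parameters.
W_pi, b_pi = rng.normal(size=(H, V)) * 0.01, np.zeros(H)
W_mu, b_mu = rng.normal(size=(K, H)) * 0.01, np.zeros(K)
W_sig, b_sig = rng.normal(size=(K, H)) * 0.01, np.zeros(K)
W_th, b_th = rng.normal(size=(K, K)) * 0.01, np.zeros(K)

def topic_distribution(x_bow):
    pi = relu(W_pi @ x_bow + b_pi)          # intermediate variable pi
    mu = relu(W_mu @ pi + b_mu)             # mean
    log_sigma = relu(W_sig @ pi + b_sig)    # log standard deviation
    u = rng.standard_normal(K)              # u ~ N(0, 1)
    z = u * np.exp(log_sigma) + mu          # reparameterization: scale and translate
    return softmax(relu(W_th @ z + b_th))   # normalized topic distribution theta

x_bow = rng.integers(0, 3, size=V).astype(float)  # toy bag-of-words vector
theta = topic_distribution(x_bow)
```

The reparameterization keeps sampling differentiable with respect to μ and σ, which is what lets the topic model be trained end to end.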
In one embodiment, S4.1 performs feature extraction on the word vector matrix in S3 using bi-directional LSTM, using the following formula:
ft=sig(Wf·[ht-1,xt]+bf)
it=sig(Wi·[ht-1,xt]+bi)
C̃t=tanh(WC·[ht-1,xt]+bC)
Ct=ft⊙Ct-1+it⊙C̃t
ot=sig(Wo[ht-1,xt]+bo)
ht=ot⊙tanh(Ct)
for time step t, its input is the embedded representation xt of the corresponding word; W* represents a parameter matrix and b* a bias term; the hidden state of the current time step is ht and its cell state is Ct; the hidden state of the previous time step is ht-1 and its cell state is Ct-1; ft represents the forgetting coefficient applied to the previous cell state Ct-1; C̃t represents the candidate cell state computed from the current input; it is the coefficient applied to the current candidate state C̃t; ot is the coefficient with which the current cell state Ct is output as the hidden state ht; sig denotes the sigmoid function and ⊙ denotes element-wise multiplication. The calculation formulas of the sigmoid function and the tanh function are as follows:
sig(x)=1/(1+e^(-x))
tanh(x)=(e^x−e^(-x))/(e^x+e^(-x))
in bi-directional LSTM, each time step outputs two hidden vectors containing forward and backward directions, which are spliced into a context vector corresponding to the current word:
Hs=[→ht;←ht]
wherein →ht represents the forward hidden vector, ←ht represents the backward hidden vector, and Hs represents the context vector corresponding to the current word.
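A minimal numpy sketch of the bidirectional LSTM pass described above: one cell step applies the gate formulas, the sequence is run forward and backward, and the two hidden vectors per word are concatenated. Sizes and weights are hypothetical stand-ins for trained values.

```python
import numpy as np

rng = np.random.default_rng(1)
D, H = 300, 128                 # word-vector dimension, hidden size (hypothetical)

def sig(x):
    return 1.0 / (1.0 + np.exp(-x))

# One weight matrix / bias per gate (f, i, candidate c, o), acting on [h_{t-1}, x_t].
W = {g: rng.normal(size=(H, H + D)) * 0.05 for g in "fico"}
b = {g: np.zeros(H) for g in "fico"}

def lstm_step(h_prev, c_prev, x_t):
    hx = np.concatenate([h_prev, x_t])
    f = sig(W["f"] @ hx + b["f"])            # forget gate
    i = sig(W["i"] @ hx + b["i"])            # input gate
    c_tilde = np.tanh(W["c"] @ hx + b["c"])  # candidate cell state
    c = f * c_prev + i * c_tilde             # new cell state
    o = sig(W["o"] @ hx + b["o"])            # output gate
    h = o * np.tanh(c)                       # hidden state
    return h, c

# Bidirectional pass over a toy 5-word sequence, then per-word concatenation.
X = rng.normal(size=(5, D))
h_f, c_f = np.zeros(H), np.zeros(H)
h_b, c_b = np.zeros(H), np.zeros(H)
fwd, bwd = [], []
for t in range(5):
    h_f, c_f = lstm_step(h_f, c_f, X[t])
    fwd.append(h_f)
for t in reversed(range(5)):
    h_b, c_b = lstm_step(h_b, c_b, X[t])
    bwd.append(h_b)
bwd.reverse()
H_s = np.stack([np.concatenate([f_, b_]) for f_, b_ in zip(fwd, bwd)])
```

Each row of `H_s` is the 2H-dimensional context vector of one word, combining left and right context.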
In one embodiment, the attention mechanism employed at S4.2 is expressed as:
as=Wa·tanh(Wθ·θs+Wh·Hs)
As=softmax(as)
wherein W* is a parameter matrix, Hs is the output matrix corresponding to the current text after bidirectional LSTM processing, i.e., the context vectors, as is the vector of per-word weights computed from the current text's topic distribution θs, and As is the normalized weight vector.
In S4.3, the context vectors are weighted by these weights and summed to obtain the semantic feature vector of the current service description document:
Os=Σt As,t·Hs,t
wherein Os represents the semantic feature vector of the current document, which may be either the semantic feature vector of a service description or that of a query request.
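The topic-aware attention of S4.2 and S4.3 can be sketched as follows; the parameter matrices are random, hypothetical stand-ins for trained weights, following the score-then-softmax-then-weighted-sum structure above.

```python
import numpy as np

rng = np.random.default_rng(2)
K, H, T = 130, 256, 5           # topics, context-vector dim, sequence length (hypothetical)

W_a = rng.normal(size=(1, 64)) * 0.1       # hypothetical projection sizes
W_theta = rng.normal(size=(64, K)) * 0.1
W_h = rng.normal(size=(64, H)) * 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(theta_s, H_s):
    # One scalar score per word: a_t = W_a · tanh(W_theta·theta_s + W_h·h_t)
    a = np.array([(W_a @ np.tanh(W_theta @ theta_s + W_h @ h)).item()
                  for h in H_s])
    A = softmax(a)                # normalized per-word weights A_s
    return A, A @ H_s             # weights, feature vector O_s = sum_t A_t * h_t

theta_s = softmax(rng.normal(size=K))    # toy topic distribution
H_s = rng.normal(size=(T, H))            # toy context vectors
A, O_s = attend(theta_s, H_s)
```

Words whose context vectors align with the document's topic distribution receive larger weights and so dominate the pooled feature vector.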
In one embodiment, the formula for calculating the similarity of the service description and the user query is:
sim(Os,Oq)=cosine(Os,Oq)=(Os·Oq)/(‖Os‖‖Oq‖)
wherein Os and Oq respectively represent the extracted semantic feature vector of the service description and the semantic feature vector of the query request, and cosine is the function calculating the cosine similarity of the two vectors. The following is a specific embodiment of performing service discovery by applying the method of the present invention, taking service discovery on the public WSDL-described service data set SAWSDL-TC as an example and describing the implementation process of the present invention in detail with reference to the accompanying drawings.
The service data set contains 1080 services and 42 user requests; for each request, the related services are manually labeled with graded relevance, where 3 indicates that the service is most relevant to the request and 0 indicates that the service is not relevant.
FIG. 1 depicts the process of calculating semantic similarity between a user request and a service description. In the method of the present invention, the same model parameters are used for the user request and the service description, and the same processing flow and model are used to extract feature vectors from both.
First, step 1 is executed: keywords describing the service are extracted from the name attributes of the different tags of the WSDL file, and preprocessing including word segmentation, stop-word removal and lemmatization is carried out.
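The preprocessing of step 1 can be sketched as below. The stop-word set and lemma table are tiny illustrative stand-ins for real resources (e.g. a full stop-word list and a WordNet-style lemmatizer), and the camel-case splitting reflects how names typically appear in WSDL attributes.

```python
import re

STOPWORDS = {"the", "a", "an", "of", "for", "and", "to", "in"}   # toy stop-word list
LEMMAS = {"services": "service", "prices": "price", "getting": "get"}  # toy lemma table

def preprocess(text):
    # Split camelCase names common in WSDL name attributes, then lowercase.
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    tokens = re.findall(r"[a-zA-Z]+", text.lower())
    tokens = [t for t in tokens if t not in STOPWORDS]   # stop-word removal
    return [LEMMAS.get(t, t) for t in tokens]            # lemmatization

print(preprocess("getCityHotelPricesService"))
# -> ['get', 'city', 'hotel', 'price', 'service']
```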
Then step 2 is executed: each word in the service is looked up in Wikipedia, the first paragraph of the corresponding page is added to the service description, and the result is processed into a bag-of-words vector. FIG. 2 shows the structure of the neural topic model used in the present invention; each position of the bag-of-words vector represents one vocabulary word, and its value is the number of times the word appears. The neural topic model is first trained on the data set of all 1080 semantically enhanced service descriptions; the process is unsupervised, requiring the model to generate a topic distribution θ from the input bag of words and to regenerate the bag-of-words vector from θ.
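Building the bag-of-words vector over a fixed vocabulary can be sketched as follows; the five-word vocabulary is a toy example, whereas the model above is trained over the full vocabulary of the 1080 enhanced descriptions.

```python
from collections import Counter

# Each vector position counts occurrences of one vocabulary word
# in the semantically enhanced description.
vocab = ["hotel", "price", "city", "book", "service"]   # toy vocabulary

def bow_vector(tokens):
    counts = Counter(tokens)
    return [counts.get(w, 0) for w in vocab]

tokens = ["hotel", "price", "hotel", "city", "service"]
print(bow_vector(tokens))   # -> [2, 1, 1, 0, 1]
```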
Then, using the trained neural topic model, the topic distribution θ of the service description or user query is obtained from the input bag-of-words vector.
Then, step 3 is executed: the vocabulary extracted in step 1 is embedded using the 300-dimensional word vectors pre-trained on the GoogleNews data set, obtaining a matrix representation of the service description or user query.
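Step 3 amounts to an embedding lookup plus stacking, as sketched below; the embedding table here is random rather than the pre-trained GoogleNews word2vec vectors, and the zero-vector fallback for out-of-vocabulary words is an assumption, not specified by the source.

```python
import numpy as np

rng = np.random.default_rng(4)
D = 300                                  # GoogleNews word2vec dimensionality
embeddings = {w: rng.normal(size=D) for w in ["hotel", "price", "city"]}  # toy table
UNK = np.zeros(D)                        # assumed fallback for unknown words

def embed(tokens):
    # One row per token: the document's word-vector matrix.
    return np.stack([embeddings.get(t, UNK) for t in tokens])

M = embed(["hotel", "price", "unknownword"])
```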
Then, step 4 is executed. As shown in FIG. 3, an LSTM serves as the encoder, receiving the matrix representation from step 3 as input and outputting the matrix Hs of context vectors, one per word. The attention mechanism then combines the topic distribution θs obtained in step 2 with the context matrix Hs; after a fully connected layer and a tanh activation function, the correlation coefficient between each word and the overall topic distribution of the description is obtained, and normalization yields As. Using As as weights, the vectors in Hs are weighted and summed to obtain the feature vector Os of the service description or user query (query request).
Finally, the similarity between the feature vector Oq of the user request and the feature vectors of the 1080 service descriptions is calculated, and the services with the highest similarity among the 1080 values are selected and returned to the user.
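The final matching step, cosine similarity against every registered service followed by top-k selection, can be sketched as follows with random stand-in vectors:

```python
import numpy as np

def cosine(a, b):
    # cosine(a, b) = (a·b) / (||a|| ||b||)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k_services(o_q, service_vectors, k=5):
    sims = [cosine(o_q, o_s) for o_s in service_vectors]
    order = np.argsort(sims)[::-1]           # descending similarity
    return [(int(i), sims[i]) for i in order[:k]]

rng = np.random.default_rng(3)
o_q = rng.normal(size=64)                    # toy query feature vector
services = rng.normal(size=(20, 64))         # toy service feature vectors
results = top_k_services(o_q, services, k=3)
```

Each result pairs a service index with its similarity score, ready to be mapped back to service descriptions.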
FIG. 4 compares the final performance of the model for different numbers of topics (topic number). When the number of topics of the neural topic model is set to 130, the model as a whole achieves the best results, so 130 is chosen as the number of topics on this data set. Table 1 gives the evaluation of the model's predictions on the SAWSDL-TC data set when the k services with the highest similarity are returned.
AENTM is compared with mainstream methods including LDA (Latent Dirichlet Allocation), LSTM (Long Short-Term Memory), Lucene, CNN (Convolutional Neural Networks), WMD (Word Mover's Distance) and Doc2Vec on the SAWSDL-TC data set. FIG. 5 compares the Precision, Recall, NDCG and F1 of each model; when the k most similar services are returned, the method of the present invention outperforms the comparison methods on all four indexes. The formulas of the four indexes are as follows:
Precision=TP/(TP+FP)
Recall=TP/(TP+FN)
F1=2·Precision·Recall/(Precision+Recall)
DCG@k=Σi=1..k(2^reli−1)/log2(i+1)
NDCG@k=DCG@k/IDCG@k
where TP is the number of positive samples appearing in the returned k most similar services, FP is the number of non-positive samples among the k services, and FN is the number of positive samples not appearing in the k services. In the NDCG calculation, reli represents the true relevance label of the i-th service, and IDCG@k is the DCG@k of the ideal ranking.
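The four measures can be sketched as below. The DCG variant with (2^rel − 1) gain and log2(i+1) discount is one common formulation, assumed here since the exact formulas were rendered as images in the original; TP, FP and FN follow the definitions above.

```python
import numpy as np

def precision_recall_f1(returned_relevant, k, total_relevant):
    tp = returned_relevant            # positives among the top k
    fp = k - tp                       # non-positives among the top k
    fn = total_relevant - tp          # positives missed by the top k
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

def dcg(rels):
    # Assumed common formulation: gain (2^rel - 1), discount log2(position + 1).
    return sum((2 ** rel - 1) / np.log2(i + 2) for i, rel in enumerate(rels))

def ndcg(rels):
    ideal = dcg(sorted(rels, reverse=True))   # IDCG: best possible ordering
    return dcg(rels) / ideal if ideal else 0.0

p, r, f1 = precision_recall_f1(returned_relevant=4, k=5, total_relevant=8)
n = ndcg([3, 2, 3, 0, 1])   # graded 0-3 relevance labels of the returned list
```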
Table 2 compares the AENTM of the present method with the above mainstream methods in response time, i.e., the time for the model to return the k most similar services for each user query. The model of the present invention outperforms the compared mainstream methods on the above indexes without significantly increasing response time.
TABLE 1 evaluation of model prediction on SAWSDL-TC
TABLE 2 comparison of mean response times for various models on SAWSDL-TC
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (10)

1. A service discovery method based on attention mechanism LSTM and neural topic model combination is characterized by comprising the following steps:
s1: respectively extracting keywords from the Web service description and the query request, and preprocessing the extracted service description keywords and query request keywords;
s2: performing semantic enhancement on the preprocessed service description keywords and processing the preprocessed service description keywords by a neural topic model to obtain topic information of service description, and performing semantic enhancement on the preprocessed query request keywords and processing the preprocessed query request keywords by the neural topic model to obtain topic information of the query request;
s3: converting the preprocessed service description keywords and query request keywords into vectorization forms by a word embedding technology, and obtaining a word vector matrix of the service description and a word vector matrix of the query request;
s4: performing feature extraction on service description by combining a bidirectional LSTM of an attention mechanism based on the topic information of the service description and a word vector matrix of the service description to obtain a semantic feature vector of the service description, and performing feature extraction on a query request by combining the bidirectional LSTM of the attention mechanism based on the topic information of the query request and the word vector matrix of the query request to obtain the semantic feature vector of the query request;
s5: calculating similarity of the semantic feature vector of the query request and the semantic feature vector of the service description, and finding k services with highest similarity to the query request from the registration service library, wherein k is a positive integer greater than 0.
2. The service discovery method of claim 1, wherein S1 specifically comprises:
s1.1: extracting keywords from the Web service description and the query request respectively, and extracting natural language words as service description keywords and query request keywords;
s1.2: and performing word segmentation, stop-word removal and lemmatization on the extracted natural language words.
3. The method of claim 1, wherein S2 specifically comprises:
s2.1: querying terms corresponding to the extracted service description keywords and query request keywords from a preset encyclopedia website, and selecting the first paragraph paraphrasing each term as semantic-enhancement content to be added to the extracted keywords;
s2.2: performing bag-of-words processing on the service description and query request description information subjected to semantic enhancement to obtain bag-of-words vectors of the service description and bag-of-words vectors of the query request;
s2.3: and taking the bag-of-words vector of the service description and the bag-of-words vector of the query request as the input of the neural topic model, obtaining reparameterization parameters through the processing of a multi-layer perceptron, and normalizing the reparameterization result through softmax to obtain the topic information of the service description and the topic information of the query request.
4. The service discovery method of claim 1, wherein S3 specifically comprises:
searching word vectors of corresponding words for the service description keywords preprocessed in the step S1 through a pre-trained word vector model, and splicing the vectors of all words in the service description into a word vector matrix corresponding to the service description;
and finding the word vectors of the corresponding words through the pre-trained word vector model for the query request keywords preprocessed in the step S1, and splicing the vectors of all the words in the query request into a word vector matrix corresponding to the query request.
5. The service discovery method of claim 1, wherein S4 specifically comprises:
s4.1: using bidirectional LSTM to extract sequence characteristics of the word vector matrix of the service description and the word vector matrix of the query request obtained in S3, and obtaining context vectors corresponding to each word in the word vector matrix of the service description and the word vector matrix of the query request;
s4.2: obtaining a correlation coefficient for each word through a fully connected layer, an activation function and normalization processing by an attention mechanism, based on the extracted context vectors, the topic information of the service description and the topic information of the query request, wherein each word corresponds to a weight representing the correlation between the current word and the overall topic distribution of the description;
s4.3: and based on the correlation coefficient of each vocabulary obtained in the S4.2, performing weighted summation on the context vector matrix obtained in the S4.1, and taking the weighted summation result as the semantic feature vector of the service description and the semantic feature vector of the query request.
6. The service discovery method of claim 1, wherein S5 specifically comprises:
calculating cosine similarity of the semantic feature vectors of the query request and the semantic feature vectors of the service description, sequencing, and returning the service description with the similarity meeting the preset condition as the service meeting the functional requirements in the user query.
7. A service discovery method according to claim 3, characterized in that in S2.3 the topic information of the service description is obtained by inputting the bag-of-words vector of the service description, and the input is processed by the multi-layer perceptron, using the following formulas:
π=relu(Wπ·XBoW+bπ),
μ=relu(Wμ·π+bμ),
logσ=relu(Wσ·π+bσ),
wherein the semantically enhanced service description in bag-of-words form is denoted XBoW; π is an intermediate variable used to compute the subsequent μ and σ; μ and σ are respectively the mean and standard deviation used in the subsequent reparameterization step; W* represents a parameter matrix of the multi-layer perceptron and b* represents its bias term. The formula of the relu activation function is as follows:
relu(x)=max(0,x)
the formula for the reparameterisation procedure used in S2.3 is as follows:
u ~ N(0,1)
z=u*σ+μ
The multivariate random variable u, which follows the standard normal distribution, is obtained by sampling from that distribution; the computed mean μ and standard deviation σ are used to scale and translate it, yielding the vector z, which is then normalized with a softmax function and output as the topic distribution of the service description. The formula is as follows:
θ=softmax(relu(Wθ·z+bθ))
wherein Wθ and bθ are respectively a multi-layer perceptron parameter matrix and bias term, and θ is the normalized topic distribution of the current input text.
8. The service discovery method of claim 5 wherein S4.1 performs feature extraction on the word vector matrix in S3 using bi-directional LSTM, using the following formula:
ft=sig(Wf·[ht-1,xt]+bf)
it=sig(Wi·[ht-1,xt]+bi)
C̃t=tanh(WC·[ht-1,xt]+bC)
Ct=ft⊙Ct-1+it⊙C̃t
ot=sig(Wo[ht-1,xt]+bo)
ht=ot⊙tanh(Ct)
for time step t, its input is the embedded representation xt of the corresponding word; W* represents a parameter matrix and b* a bias term; the hidden state of the current time step is ht and its cell state is Ct; the hidden state of the previous time step is ht-1 and its cell state is Ct-1; ft represents the forgetting coefficient applied to the previous cell state Ct-1; C̃t represents the candidate cell state computed from the current input; it is the coefficient applied to the current candidate state C̃t; ot is the coefficient with which the current cell state Ct is output as the hidden state ht; sig denotes the sigmoid function and ⊙ denotes element-wise multiplication. The calculation formulas of the sigmoid function and the tanh function are as follows:
sig(x)=1/(1+e^(-x))
tanh(x)=(e^x−e^(-x))/(e^x+e^(-x))
in bi-directional LSTM, each time step outputs two hidden vectors containing forward and backward directions, which are spliced into a context vector corresponding to the current word:
Hs=[→ht;←ht]
wherein →ht represents the forward hidden vector, ←ht represents the backward hidden vector, and Hs represents the context vector corresponding to the current word.
9. The service discovery method of claim 5 wherein the attention mechanism employed at S4.2 is expressed as:
as=Wa·tanh(Wθ·θs+Wh·Hs)
As=softmax(as)
wherein W* is a parameter matrix, Hs is the output matrix corresponding to the current text after bidirectional LSTM processing, i.e., the context vectors, as is the vector of per-word weights computed from the current text's topic distribution θs, and As is the normalized weight vector,
and in S4.3, the context vectors are weighted by these weights and summed to obtain the semantic feature vector of the current service description document:
Os=Σt As,t·Hs,t
wherein Os represents the semantic feature vector of the current document, which may be either the semantic feature vector of a service description or that of a query request.
10. The service discovery method of claim 6, wherein the formula for calculating the similarity between the service description and the user query is:
sim(Os,Oq)=cosine(Os,Oq)=(Os·Oq)/(‖Os‖‖Oq‖)
wherein Os and Oq respectively represent the extracted semantic feature vector of the service description and the semantic feature vector of the query request, and the similarity is calculated according to the cosine similarity of the two vectors.
CN202010483308.1A 2020-06-01 2020-06-01 Service discovery method combining attention mechanism LSTM and neural topic model Active CN111797196B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010483308.1A CN111797196B (en) 2020-06-01 2020-06-01 Service discovery method combining attention mechanism LSTM and neural topic model


Publications (2)

Publication Number Publication Date
CN111797196A true CN111797196A (en) 2020-10-20
CN111797196B CN111797196B (en) 2021-11-02

Family

ID=72806617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010483308.1A Active CN111797196B (en) 2020-06-01 2020-06-01 Service discovery method combining attention mechanism LSTM and neural topic model

Country Status (1)

Country Link
CN (1) CN111797196B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112447300A (en) * 2020-11-27 2021-03-05 平安科技(深圳)有限公司 Medical query method and device based on graph neural network, computer equipment and storage medium
CN112486467A (en) * 2020-11-27 2021-03-12 武汉大学 Interactive service recommendation method based on dual interaction relation and attention mechanism
CN112966551A (en) * 2021-01-29 2021-06-15 湖南科技学院 Method and device for acquiring video frame description information and electronic equipment
CN113535928A (en) * 2021-08-05 2021-10-22 陕西师范大学 Service discovery method and system of long-term and short-term memory network based on attention mechanism
CN113761934A (en) * 2021-07-29 2021-12-07 华为技术有限公司 Word vector representation method based on self-attention mechanism and self-attention model
CN113988002A (en) * 2021-11-15 2022-01-28 天津大学 Approximate attention system and method based on neural clustering method
CN115951883A (en) * 2023-03-15 2023-04-11 日照市德衡信息技术有限公司 Service component management system and method of distributed micro-service architecture
CN117574918A (en) * 2024-01-15 2024-02-20 青岛冠成软件有限公司 Intelligent interaction method based on LSTM
CN117574918B (en) * 2024-01-15 2024-05-03 青岛冠成软件有限公司 Intelligent interaction method based on LSTM

Citations (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2923979A1 (en) * 2012-09-14 2014-03-20 Interaxon Inc. Systems and methods for collecting, analyzing, and sharing bio-signal and non-bio-signal data
CN106528858A (en) * 2016-11-29 2017-03-22 北京百度网讯科技有限公司 Lyrics generating method and device
EP3156949A2 (en) * 2015-10-16 2017-04-19 Baidu USA LLC Systems and methods for human inspired simple question answering (hisqa)
CN106650943A (en) * 2016-10-28 2017-05-10 北京百度网讯科技有限公司 Auxiliary writing method and apparatus based on artificial intelligence
CN107066449A (en) * 2017-05-09 2017-08-18 北京京东尚科信息技术有限公司 Information-pushing method and device
US9767557B1 (en) * 2016-06-23 2017-09-19 Siemens Healthcare Gmbh Method and system for vascular disease detection using recurrent neural networks
CN107391623A (en) * 2017-07-07 2017-11-24 中国人民大学 A kind of knowledge mapping embedding grammar for merging more background knowledges
CN107608956A (en) * 2017-09-05 2018-01-19 广东石油化工学院 A kind of reader's mood forecast of distribution algorithm based on CNN GRNN
CN108021616A (en) * 2017-11-06 2018-05-11 大连理工大学 A kind of community's question and answer expert recommendation method based on Recognition with Recurrent Neural Network
CN108446275A (en) * 2018-03-21 2018-08-24 北京理工大学 Long text emotional orientation analytical method based on attention bilayer LSTM
CN108595629A (en) * 2018-04-24 2018-09-28 北京慧闻科技发展有限公司 Data processing method and the application of system are selected for answer
WO2018188240A1 (en) * 2017-04-10 2018-10-18 北京大学深圳研究生院 Cross-media retrieval method based on deep semantic space
CN108681562A (en) * 2018-04-26 2018-10-19 第四范式(北京)技术有限公司 Category classification method and system and Classification Neural training method and device
CN108804495A (en) * 2018-04-02 2018-11-13 华南理工大学 A kind of Method for Automatic Text Summarization semantic based on enhancing
CN108829719A (en) * 2018-05-07 2018-11-16 中国科学院合肥物质科学研究院 The non-true class quiz answers selection method of one kind and system
CN108874782A (en) * 2018-06-29 2018-11-23 北京寻领科技有限公司 A kind of more wheel dialogue management methods of level attention LSTM and knowledge mapping
CN108984524A (en) * 2018-07-05 2018-12-11 北京理工大学 A kind of title generation method based on variation neural network topic model
US10185895B1 (en) * 2017-03-23 2019-01-22 Gopro, Inc. Systems and methods for classifying activities captured within images
US10402658B2 (en) * 2016-11-03 2019-09-03 Nec Corporation Video retrieval system using adaptive spatiotemporal convolution feature representation with dynamic abstraction for video to language translation
CN110362817A (en) * 2019-06-04 2019-10-22 中国科学院信息工程研究所 A kind of viewpoint proneness analysis method and system towards product attribute
CN110399565A (en) * 2019-07-29 2019-11-01 北京理工大学 Based on when null cycle attention mechanism recurrent neural network point of interest recommended method
CN110570226A (en) * 2019-07-10 2019-12-13 杭州电子科技大学 scoring prediction method combining topic model and heterogeneous information network
CN110737769A (en) * 2019-10-21 2020-01-31 南京信息工程大学 pre-training text abstract generation method based on neural topic memory
CN110852108A (en) * 2019-11-11 2020-02-28 中山大学 Joint training method, apparatus and medium for entity recognition and entity disambiguation


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YINGCHENG CAO ET AL.: "Web Services Classification with Topical Attention Based Bi-LSTM", Collaborative Computing: Networking, Applications and Worksharing *
王浩然 (Wang Haoran): "Research on Short-Text Topic Modeling Based on Word Vectors", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112447300B (en) * 2020-11-27 2024-02-09 平安科技(深圳)有限公司 Medical query method and device based on graph neural network, computer equipment and storage medium
CN112486467A (en) * 2020-11-27 2021-03-12 武汉大学 Interactive service recommendation method based on dual interaction relation and attention mechanism
WO2021213160A1 (en) * 2020-11-27 2021-10-28 平安科技(深圳)有限公司 Medical query method and apparatus based on graph neural network, and computer device and storage medium
CN112447300A (en) * 2020-11-27 2021-03-05 平安科技(深圳)有限公司 Medical query method and device based on graph neural network, computer equipment and storage medium
CN112486467B (en) * 2020-11-27 2022-04-29 武汉大学 Interactive service recommendation method based on dual interaction relation and attention mechanism
CN112966551A (en) * 2021-01-29 2021-06-15 湖南科技学院 Method and device for acquiring video frame description information and electronic equipment
CN113761934A (en) * 2021-07-29 2021-12-07 华为技术有限公司 Word vector representation method based on self-attention mechanism and self-attention model
CN113761934B (en) * 2021-07-29 2023-03-31 华为技术有限公司 Word vector representation method based on self-attention mechanism and self-attention model
CN113535928A (en) * 2021-08-05 2021-10-22 陕西师范大学 Service discovery method and system of long-term and short-term memory network based on attention mechanism
CN113988002A (en) * 2021-11-15 2022-01-28 天津大学 Approximate attention system and method based on neural clustering method
CN115951883A (en) * 2023-03-15 2023-04-11 日照市德衡信息技术有限公司 Service component management system and method of distributed micro-service architecture
CN117574918A (en) * 2024-01-15 2024-02-20 青岛冠成软件有限公司 Intelligent interaction method based on LSTM
CN117574918B (en) * 2024-01-15 2024-05-03 青岛冠成软件有限公司 Intelligent interaction method based on LSTM

Also Published As

Publication number Publication date
CN111797196B (en) 2021-11-02

Similar Documents

Publication Publication Date Title
CN111797196B (en) Service discovery method combining attention mechanism LSTM and neural topic model
Shi et al. Functional and contextual attention-based LSTM for service recommendation in mashup creation
Alami et al. Enhancing unsupervised neural networks based text summarization with word embedding and ensemble learning
CN111581401A (en) Local citation recommendation system and method based on depth correlation matching
CN112231569B (en) News recommendation method, device, computer equipment and storage medium
CN113392209B (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN112232053A (en) Text similarity calculation system, method and storage medium based on multi-keyword pair matching
Kim et al. Applying a convolutional neural network to legal question answering
CN114329225A (en) Search method, device, equipment and storage medium based on search statement
US20230177097A1 (en) Multi-phase training of machine learning models for search ranking
Rathi et al. The importance of Term Weighting in semantic understanding of text: A review of techniques
Kim et al. A convolutional neural network in legal question answering
Johnson et al. A detailed review on word embedding techniques with emphasis on word2vec
Glenn et al. Emotion classification of Indonesian tweets using bidirectional LSTM
VeeraSekharReddy et al. An attention based bi-LSTM DenseNet model for named entity recognition in english texts
Chan et al. Applying and optimizing NLP model with CARU
Li et al. Siamese bert architecture model with attention mechanism for textual semantic similarity
Limbasiya et al. Semantic textual similarity and factorization machine model for retrieval of question-answering
Zhang et al. Combining the attention network and semantic representation for Chinese verb metaphor identification
Suneera et al. A bert-based question representation for improved question retrieval in community question answering systems
Abbas et al. A deep learning approach for context-aware citation recommendation using rhetorical zone classification and similarity to overcome cold-start problem
Viji et al. A hybrid approach of Poisson distribution LDA with deep Siamese Bi-LSTM and GRU model for semantic similarity prediction for text data
Chung et al. Sentence model based subword embeddings for a dialog system
CN113569124A (en) Medical title matching method, device, equipment and storage medium
Pan Automatic Keyword Extraction Algorithm for Chinese Text based on Word Clustering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant