CN116150669A - Mashup service multi-label classification method based on double-manifold regularized width learning - Google Patents


Info

Publication number: CN116150669A
Application number: CN202211542741.3A
Authority: CN (China)
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 曹志英, 陈思源, 王凯月, 张秀国, 张德珍
Assignee (original and current): Dalian Maritime University
Application filed by Dalian Maritime University; priority to CN202211542741.3A

Classifications

    • G06F40/205 Handling natural language data; natural language analysis; parsing
    • G06F17/13 Complex mathematical operations for solving equations, e.g. general mathematical optimization problems; differential equations
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention provides a Mashup service multi-label classification method based on double-manifold regularized width learning, which mainly comprises the following steps: extracting features from the preprocessed Mashup description documents with a latent Dirichlet allocation topic model; linearly mapping the Mashup description document topic feature matrix into n groups of feature nodes; passing the feature nodes through an activation function to generate enhancement nodes; concatenating the feature nodes and the enhancement nodes into enhanced feature nodes that serve as the input of the model; constructing the objective function of the Mashup service multi-label classification model based on double-manifold regularized width learning; solving the objective function by the least squares method to obtain the weight matrix of the double-manifold regularized width learning network; and acquiring the description document of a test Mashup service and feeding it into the trained model to predict the multi-label classification result. The invention improves the width learning model with double manifold regularization and uses the improved BLS model to perform Mashup service multi-label classification.

Description

Mashup service multi-label classification method based on double-manifold regularized width learning
Technical Field
The invention relates to the technical field of multi-label classification of Mashup services, in particular to a Mashup service multi-label classification method based on double-manifold regularized width learning.
Background
In the age of information explosion, service computing has seen rapid growth in the number and variety of new services such as API (Application Programming Interface) services and Mashup services. As an aggregation model, a Mashup service allows software developers to integrate multiple publicly released Web APIs, producing reusable, lightweight, user-centered applications with composite functionality, thereby overcoming the limitation of single-function APIs. However, for a Mashup developer, selecting the few services that match a complex functional description from a vast number of Web services is difficult and time consuming. For example, suppose a developer wants to create a Mashup service that "can locate restaurants on a map and show what other people think of the restaurants". From this functional description it can be inferred that, if service filtering is restricted to the "map" and "social" categories, sub-services meeting the demand are very likely to be found. Therefore, analysing the functional description document of the intended Mashup and identifying the service categories it involves greatly reduces the Web service search space and makes the services within each category cluster easier to use, so that follow-up tasks such as service discovery, service composition and service recommendation become more reasonable and efficient.
Multi-label text classification is a classical problem in natural language processing: one piece of text can correspond to several labels; for example, a passage describing smart healthcare belongs both to the medical class and to the science-and-technology class. A Mashup functional description document is written in natural language, and its composite functionality corresponds to several different service-category labels. For service registrants, a Mashup service multi-label classification algorithm labels the service categories of a Mashup quickly, avoiding the tedium of manual labelling and the ambiguity caused by language gaps. Moreover, because a service system faces problems such as the failure or migration of Mashup sub-services, category labels let developers replace a service within an identified category, shrinking the service search space and improving the efficiency of service search and recommendation. For service invokers, the labels reveal the main functions of each Mashup at a glance, so services can be filtered conveniently and Mashup services meeting their needs screened quickly.
Currently, multi-label classification methods fall into two major categories: algorithm adaptation (Algorithm Adaptation) and problem transformation (Problem Transformation).
Problem transformation methods convert the multi-label classification problem into several single-label classification problems; by restructuring the multi-label data set, existing traditional single-label classification algorithms can process the data directly, i.e. the data is adapted to the algorithm. Common transformation methods are binary relevance BR (Binary Relevance) and classifier chains CC (Classifier Chain). The BR algorithm assigns a binary classifier to each label, decomposing the multi-label problem into several independent binary problems; each classifier is trained independently, and together they yield the predicted label set of an unknown sample. The method is intuitive and efficient, but it ignores the correlation between labels, which reduces classification accuracy. The CC algorithm addresses the lack of label dependency caused by BR's label-independence assumption: it links multiple binary classifiers one after another, and each classifier is trained on the input data together with the outputs of all preceding classifiers on the chain. However, the classifiers on the chain are ordered randomly, so the relationships between labels are computed in a random order; moreover, if an earlier classifier predicts poorly, its errors continue to propagate as the chain extends.
Algorithm adaptation methods extend traditional single-label classification algorithms so that the improved algorithm can solve the multi-label classification problem directly, i.e. the algorithm is designed around the characteristics of the data and processes the multi-label data set directly. Common algorithms are ML-KNN (Multi-Label K-Nearest Neighbor), Rank-SVM (Ranking Support Vector Machine) and ML-ELM (Multi-Label Extreme Learning Machine). ML-KNN uses the K-nearest-neighbour algorithm to obtain the label distribution of neighbouring data, counts the number of neighbours carrying each label, and finally infers the labels of an unknown sample by maximising the posterior probability. Rank-SVM extends the original SVM with a multi-label ranking loss function and takes the corresponding margin as a constraint to handle multi-label classification. Both adaptive algorithms are simple to implement and unaffected by class imbalance, but by their nature their computational complexity is high. To reduce the time cost, the extreme-learning-machine-based multi-label classification algorithm ML-ELM was proposed and is widely applied: the network is designed as a single-hidden-layer feed-forward neural network, the weights between the hidden layer and the output layer are solved with the Moore-Penrose generalised inverse, and each label is binary-classified at the output layer through a sigmoid function. Its structure is simple and its parameter solution globally unique, but the hidden layer exploits only a single linear or nonlinear view of the data, so its generalisation performance is limited.
With the development of deep learning, the multi-label classification problem has advanced rapidly through deep neural networks such as the deep belief network DBN (Deep Belief Network), the recurrent neural network RNN (Recurrent Neural Network) and the text convolutional neural network TextCNN (Text Convolutional Neural Network). To extract deeper data features, these networks generally deepen the network to achieve good approximation results, but adding layers multiplies the per-layer neuron parameters and demands a great deal of manual tuning. Furthermore, because these networks update their weights by back propagation BP (Back Propagation) to minimise the error between actual and target outputs, updating multi-layer parameters by layer-wise gradient descent can cause gradient explosion or vanishing, and the model easily falls into local optima. Width learning (Broad Learning System, BLS) instead fits the target by widening, rather than deepening, the network. Its overall structure resembles the ELM: three layers — an input layer, a hidden (mapping) layer and an output layer — with full connection between layers; the output weights are solved in one shot via a pseudo-inverse matrix, avoiding the layer-by-layer iterative parameter solving that falls into local optima, as well as gradient explosion or vanishing. Thanks to its fast, convenient parameter solving and simple network structure, BLS has been widely used in image classification, pattern recognition, anomaly detection and other fields. BLS generates feature nodes by random linear mapping of the original input data, then generates enhancement nodes from the feature nodes by nonlinear mapping; both kinds of nodes together form the input of the final network and are directly and fully connected to the output layer, and the model parameters are learned by solving the pseudo-inverse matrix between the input nodes and the output labels. However, this method has the following drawbacks: random mapping tends to give the data a nonlinear distribution in the new space that is hard to predict; meanwhile, solving the target parameters only by minimising the training error cannot mine the nonlinear geometric structure of the mapped data, which ultimately hurts classification accuracy.
Disclosure of Invention
In view of the defects of the prior art, the invention provides a Mashup service multi-label classification method based on double-manifold regularized width learning, which improves the width learning model with double manifold regularization and uses the improved BLS model to realise topic-to-service-category label matching for Mashup service multi-label classification.
The invention adopts the following technical means:
a Mashup service multi-label classification method based on double-flow regularized width learning comprises the following steps:
acquiring a Mashup service data set, wherein the Mashup service data set comprises a Mashup service name, a description document, an API service called by each Mashup service, an API service name and a class of the API service name, and preprocessing the description document and a binary output label of the Mashup service;
extracting features of the preprocessed Mashup description document by using a hidden dirichlet distribution theme model, and further generating a Mashup description document theme distribution vector;
taking the theme distribution vector of the Mashup description document as original input data of width learning, and respectively and linearly mapping the theme feature matrix of the Mashup description document into n groups of feature nodes by using a linear transformation function and n groups of randomly mapped parameter matrices;
processing the characteristic nodes through a nonlinear activation function so as to generate enhanced nodes;
performing splicing processing on the feature nodes and the enhancement nodes to generate enhancement feature nodes, and taking the enhancement feature nodes as input of a Mashup service multi-label classification model based on double-flow regularization width learning;
constructing an objective function of a Mashup service multi-label classification model based on double-flow regularization width learning by utilizing a manifold regularization idea;
solving the objective function by using a least square method, obtaining a weight matrix of a double manifold regularized width learning network, and completing training of a Mashup service multi-label classification model based on double manifold regularized width learning;
the method comprises the steps of obtaining a description document of a Mashup service for testing, obtaining a theme distribution vector matrix of the Mashup service for testing through text preprocessing and theme feature extraction, mapping the theme distribution vector matrix to generate enhanced feature nodes, obtaining final input of a test sample, sending the final input to a trained Mashup service multi-label classification model based on double-flow regularization width learning, and predicting classification results of the Mashup service.
Further, the step of preprocessing the description documents and the binary output labels of the Mashup services includes:
converting all letters in the description documents to lower case;
removing punctuation and meaningless special symbols from the description documents;
removing stop words from the description documents using the corpus in the nltk package;
converting inflected word forms to their dictionary base forms using the WordNetLemmatizer tool in the nltk package;
applying 0-1 conversion to the crawled Mashup call records to generate the Mashup-service-category call matrix $Y=(y_{ij})_{MN\times CN}$, where MN is the number of Mashup services, CN is the number of service categories, $y_{ij}=1$ means that Mashup service $M_i$ calls some service in category j, and $y_{ij}=0$ means that no service in category j is called.
Further, extracting features from the preprocessed Mashup description documents with the latent Dirichlet allocation topic model to generate the Mashup description document topic feature matrix comprises the following steps:
taking the Mashup functional description document set as the corpus of the latent Dirichlet allocation topic model;
training the latent Dirichlet allocation topic model to obtain the topic set $\{K_1, K_2, \dots, K_T\}$, where T is the number of topics set in the latent Dirichlet allocation topic model; each Mashup description document corresponds to a probability distribution over $\{K_1, K_2, \dots, K_T\}$, so the topic features of the i-th Mashup description document are represented as a $1\times T$ vector $\theta_i$, giving the topic distribution matrix of the whole corpus $\Theta \in \mathbb{R}^{MN\times T}$.
Further, the feature nodes are obtained according to the following formulas:

$$Z^n \equiv [Z_1, \dots, Z_n] \tag{1}$$

where $Z_i \in \mathbb{R}^{MN\times k}$ denotes the i-th group of feature nodes, containing k feature nodes, and $n\times k$ is the total number of feature nodes:

$$Z_i = \phi\big(\Theta W_{e_i} + \beta_{e_i}\big), \quad i = 1, \dots, n \tag{2}$$

where $\phi(\cdot)$ is a linear mapping function, and $W_{e_i}$ and $\beta_{e_i}$ are the weight coefficient matrix and bias matrix of the i-th group of feature nodes, generated by random mapping and following a normal distribution.
Further, the enhancement nodes are obtained according to the following formula:

$$H^m = \xi\big(Z^n W_h + \beta_h\big) \tag{3}$$

where m is the number of enhancement nodes, generated from the feature nodes in one step; $W_h$ and $\beta_h$ are the weight coefficient matrix and bias matrix of the enhancement nodes, generated by random mapping, following a normal distribution and then orthogonalized; and $\xi(\cdot)$ is a nonlinear activation function.
Further, the enhanced feature nodes are obtained according to the following formula:

$$X = [Z^n \,|\, H^m] \tag{4}$$

where $X \in \mathbb{R}^{MN\times (nk+m)}$ is the enhanced feature node matrix; its rows correspond to the samples and its columns to the total number of network nodes.

Thus, the input and output of the Mashup service multi-label classification model based on double-manifold regularized width learning are related by the following formula:

$$Y = X\beta \tag{5}$$

where Y is the binary multi-label service-category vector output by the Mashup service multi-label classification model based on double-manifold regularized width learning, and $\beta$ is the weight matrix that the width learning network needs to learn.
Further, fusing double manifold regularization to construct the objective function of the Mashup service multi-label classification model based on double-manifold regularized width learning includes:

constructing the data manifold regularization term of the enhanced feature node matrix X, whose data manifold regularization constraint is:

$$\min \frac{1}{2}\sum_{i=1}^{MN}\sum_{j=1}^{MN} W_{ij}\,\|y_i - y_j\|^2 \tag{6}$$

where $x_{i,:}$ and $x_{j,:}$ denote the i-th and j-th rows (training samples) of the enhanced feature node matrix X, $y_i$ and $y_j$ are the embeddings in the low-dimensional label space corresponding to the high-dimensional samples $x_{i,:}$ and $x_{j,:}$, and $W_{ij}$ is the similarity between samples $x_{i,:}$ and $x_{j,:}$: the K-nearest-neighbour algorithm determines the p nearest neighbour samples of each sample, and a Gaussian kernel computes the similarity, giving the sample similarity matrix

$$W = (W_{ij})_{MN\times MN} \tag{7}$$

with $W_{ij}$ calculated according to the following formula:

$$W_{ij} = \begin{cases}\exp\!\left(-\dfrac{\|x_{i,:}-x_{j,:}\|^2}{2t^2}\right), & x_{j,:}\in N_p(x_{i,:})\\[2mm] 0, & \text{otherwise}\end{cases} \tag{8}$$

where $N_p(x_{i,:})$ denotes the p nearest neighbours of sample $x_{i,:}$, and t is the bandwidth parameter of the Gaussian kernel, controlling the local range of action of the function;

constructing the feature manifold regularization term of the enhanced feature node matrix X, specifically building the similarity matrix S between features:

$$S = (S_{ij})_{d\times d} \tag{9}$$

where $x_{:,i}$ and $x_{:,j}$ are the i-th and j-th columns of X, representing sample features; the feature manifold regularization constraint of X, which minimises the distance between the weights $\beta_i$ and $\beta_j$ corresponding to mutually similar data features $x_{:,i}$ and $x_{:,j}$, can be expressed as:

$$\min \frac{1}{2}\sum_{i=1}^{d}\sum_{j=1}^{d} S_{ij}\,\|\beta_i - \beta_j\|^2 \tag{10}$$

The objective function of the Mashup service multi-label classification model based on double-manifold regularized width learning is:

$$\min_{\beta}\; \|X\beta - Y\|_2^2 + \lambda\|\beta\|_2^2 + \frac{C_1}{2}\sum_{i,j} W_{ij}\|y_i - y_j\|^2 + \frac{C_2}{2}\sum_{i,j} S_{ij}\|\beta_i - \beta_j\|^2 \tag{11}$$

where $\|\cdot\|_2^2$ is the squared 2-norm, $\lambda$ is the L2 regularization factor of width learning, balancing empirical and structural risk, $C_1$ is the penalty factor of the data manifold regularization, and $C_2$ the penalty factor of the feature manifold regularization;

simplifying the objective function of the Mashup service multi-label classification model based on double-manifold regularized width learning gives:

$$\min_{\beta}\; \|X\beta - Y\|_2^2 + \lambda\|\beta\|_2^2 + C_1\,\mathrm{trace}\big(\beta^T X^T L_1 X \beta\big) + C_2\,\mathrm{trace}\big(\beta^T L_2 \beta\big) \tag{12}$$

where trace(·) denotes the trace of a matrix, and $L_1$ and $L_2$ are the Laplacian matrices of the data and of the features, obtained from the data similarity matrix W and the feature similarity matrix S respectively.
Further, solving the objective function by the least squares method to obtain the weight matrix of the double-manifold regularized width learning network and complete the training of the Mashup service multi-label classification model based on double-manifold regularized width learning includes:

differentiating the whole objective function with respect to $\beta$ and setting the derivative to 0, which gives:

$$\beta = \big(X^T X + \lambda I + C_1 X^T L_1 X + C_2 L_2\big)^{-1} X^T Y \tag{13}$$

where $\beta$ is the weight matrix of the double-manifold regularized width learning network.
Further, the classification result of the Mashup service is predicted according to the following formula:

$$\mathrm{Pred} = X^{*}\beta \tag{14}$$

where Pred is the predicted classification result of the Mashup service, $\beta$ is the weight matrix of the double-manifold regularized width learning network, and $X^{*}$ is the final input of the test sample.
Compared with the prior art, the invention has the following advantages:
The invention identifies and remedies the problems that arise when original width learning is used for Mashup service multi-label classification. Constraints on the data manifold and the feature manifold are added to the original width learning model, making full use of the row and column information of the input sample matrix and preserving the local geometric structure of both data and features, so that the solved width learning model parameters are more accurate and the classification performance of the model improves. Experiments show that the proposed Mashup service multi-label classification method based on improved double-manifold regularized width learning outperforms original width learning and has clear advantages over other commonly used multi-label classification algorithms.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the embodiments or in the description of the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention; a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of the Mashup service multi-label classification method based on double-manifold regularized width learning.
FIG. 2 is a diagram of the MLMS-DMBLS network model of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
Aiming at the problems that in width learning the data exhibit a hard-to-predict nonlinear distribution in the new space, and that solving the target parameters only by minimising the training error limits the final classification accuracy, the invention improves the width learning model with the manifold learning idea and performs Mashup service multi-label classification with the improved width learning model. Manifold learning assumes that some data are embedded on a low-dimensional manifold in a high-dimensional space; its goal is to mine the nonlinear geometric structure of the data while mapping the high-dimensional data into the low-dimensional space, preserving the original local relationships of neighbouring points. Manifold regularization uses manifold learning to keep the original geometric distribution of the data in the new feature space: sample points that are very close in the original data space remain very close in the projected label space, thereby mitigating the loss of classification accuracy caused by the random mapping in BLS, which cannot preserve the geometric distribution of the feature-space data. Mashup service description data are text data, which have been shown to conform to the manifold assumption above. The invention exploits the feature attributes of Mashup service description data and introduces a local manifold structure constraint on the features, on the principle that feature vectors in a similarity relation should have similar corresponding weight parameters; at the same time, the double manifold regularization constraints on data and features are integrated into the objective function of width learning, making maximal use of the structural characteristics of Mashup service descriptions, mining the local geometric structure of samples and features, and improving the accuracy of Mashup service multi-label classification with width learning.
The invention provides a Mashup service multi-label classification method based on double-manifold regularized width learning, MLMS-DMBLS (Multi-label Classification of Mashup Services Based on Double Manifold Broad Learning System), whose flow is shown in FIG. 1. The model takes the topic vectors of the Mashup description documents as input and outputs the matching probability of each service-category label; the width learning model is improved with double manifold regularization, and the improved BLS model realises topic-to-service-category label matching for Mashup service multi-label classification. The method specifically comprises the following steps:
(1) Training phase
Step one: mashup describes document and tag preprocessing.
The method comprises the steps of firstly crawling a real Mashup service data set from a Programmable Web platform, wherein the real Mashup service data set comprises Mashup service names, description documents, respectively called API services, API service names, description documents and belonging types. Assuming that the number of Mashup services is MN, the number of service categories is CN.
In order to acquire related theme characteristics of Mashup service, text preprocessing is required to be carried out on a description document, and the detailed process is as follows:
1) Convert all letters to lower case;
2) Remove punctuation marks and meaningless special symbols, such as ',', '.' and ''s';
3) Use the corpus in the nltk package to remove stop words that occur frequently in English but carry no real meaning, such as 'of', 'by', 'about' and 'under';
4) Use the WordNetLemmatizer tool in the nltk package to convert inflected word forms to their dictionary base forms — e.g. 'does', 'did' and 'done' are all restored to 'do', and 'men' becomes 'man' — to facilitate subsequent semantic analysis by the computer.
In order to obtain the binary output labels of the MLMS-DMBLS, the Mashup call records are converted to 0-1 form, generating the Mashup-service-category call matrix $Y=(y_{ij})_{MN\times CN}$, where MN is the number of Mashup services, CN is the number of service categories, $y_{ij}=1$ means that Mashup service $M_i$ calls some service in category j, and $y_{ij}=0$ means that no service in category j is called.
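By way of illustration, the preprocessing and label construction described above can be sketched in Python as follows; the names `preprocess`, `call_matrix` and `call_records` are assumptions for illustration, and the NLTK resources 'stopwords' and 'wordnet' must be downloaded beforehand:

```python
import re
import numpy as np
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

STOP_WORDS = set(stopwords.words("english"))
LEMMATIZER = WordNetLemmatizer()

def preprocess(doc):
    """Steps 1)-4): lower-case, strip symbols, remove stop words, lemmatize."""
    doc = doc.lower()                                        # 1) lower-case all letters
    doc = re.sub(r"[^a-z\s]", " ", doc)                      # 2) drop punctuation / special symbols
    tokens = [w for w in doc.split() if w not in STOP_WORDS] # 3) remove stop words
    # 4) restore inflected forms to a base form (pos="v" also reduces 'does'/'did' to 'do')
    return [LEMMATIZER.lemmatize(LEMMATIZER.lemmatize(w), pos="v") for w in tokens]

def call_matrix(call_records, MN, CN):
    """0-1 call matrix Y; call_records is an iterable of (mashup_index, category_index)."""
    Y = np.zeros((MN, CN), dtype=int)
    for i, j in call_records:
        Y[i, j] = 1   # y_ij = 1: Mashup M_i calls some service in category j
    return Y
```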
Step two: mashup description document theme feature extraction based on LDA model.
In order to fully understand the semantic features of Mashup service function description documents and map the semantic features into different service type labels, the latent topic vectors of the preprocessed Mashup service description document description are mined by using a hidden dirichlet distribution LDA (LatentDirichletAllocation) topic model. One Mashup service typically corresponds to multiple abstract topics, and if a service topic vector weights over a topic above a threshold, it can be inferred that a portion of the functionality of the service will correspond to a similar class of service to that topic. Taking the Mashup function description document set as a corpus of a topic model, training the corpus by using an LDA model to obtain a topic set { K } 1 ,K 2 ,…,K T Where T represents the number of topics set in the LDA model. Each Mashup description document corresponds to a theme set { K } 1 ,K 2 ,…,K T A probability distribution over the i-th Mashup description document, the subject features of the i-th Mashup description document may be represented as a 1×t vector
Figure BDA0003978463330000103
Then the entire corpusThe topic distribution vector can be represented as a two-dimensional matrix
Figure BDA0003978463330000101
Step three: and generating characteristic nodes.
The original data input by the width learning is a two-dimensional matrix theta formed by Mashup description document theme vectors, and the theta is respectively and linearly mapped into n groups of characteristic nodes Z by utilizing a linear transformation function and n groups of randomly mapped parameter matrices n The method comprises the following steps:
Z n ≡[Z 1 ,…,Z n ] (1)
wherein ,Zi Representing the ith set of feature nodes, including k feature nodes.
Figure BDA0003978463330000102
n x k is the total number of feature nodes, subsequently denoted by nk.
Figure BDA0003978463330000111
Phi (·) is a linear mapping function,
Figure BDA0003978463330000112
and />
Figure BDA0003978463330000113
The weight coefficient matrix and the bias matrix which are the i-th group of characteristic nodes are generated through random mapping and accord with normal distribution.
Step four: an enhanced node is generated.
Acquiring nonlinear characteristics of data through an activation function by using characteristic nodes in formula (1) to generate enhanced nodes H m The method comprises the following steps:
H m =ξ(Z n ·W hh ) (3)
wherein m is the number of enhancement nodes, generated at one time by the feature nodes,
Figure BDA0003978463330000114
and />
Figure BDA0003978463330000115
The weight coefficient matrix and the bias matrix which are respectively used as the enhancement nodes are generated through random mapping and accord with normal distribution, and then orthogonalization is carried out on the weight coefficient matrix and the bias matrix, and ζ (°) is a nonlinear activation function.
Step five: and splicing the characteristic nodes and the enhancement nodes.
Node Z of the feature n And reinforcing node H m The resulting enhanced features node X, see equation (4), is combined and input into the MLMS-DMBLS.
X=[Z n |H m ] (4)
Figure BDA0003978463330000116
The rows of the matrix represent the number of samples and the columns represent the total number of network nodes. The network output of MLMS-DMBLS is a binary multi-tag vector Y of the service class, which can be expressed as:
Y=Xβ (5)
beta is a weight matrix that the breadth-learning network needs to learn. The network model of the MLMS-DMBLS is shown in FIG. 2.
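Steps three to five can be sketched as follows under stated assumptions: $\phi(\cdot)$ is taken as the identity map, $\xi(\cdot)$ as tanh, and nk ≥ m so that the QR orthogonalization is well defined; the function name `build_enhanced_features` is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def build_enhanced_features(Theta, n, k, m):
    """Map the MN x T topic matrix Theta into X = [Z^n | H^m] per eqs. (1)-(4)."""
    MN, T = Theta.shape
    groups = []
    for _ in range(n):                        # n groups of k feature nodes, eq. (2)
        W_e = rng.standard_normal((T, k))     # random weight matrix, normal distribution
        b_e = rng.standard_normal((MN, k))    # random bias matrix
        groups.append(Theta @ W_e + b_e)      # phi(.) taken here as the identity map
    Z = np.hstack(groups)                     # Z^n, shape MN x nk, eq. (1)

    W_h = rng.standard_normal((n * k, m))
    W_h, _ = np.linalg.qr(W_h)                # orthogonalize (assumes nk >= m)
    b_h = rng.standard_normal((MN, m))
    H = np.tanh(Z @ W_h + b_h)                # enhancement nodes H^m, eq. (3)
    return np.hstack([Z, H])                  # X = [Z^n | H^m], eq. (4)
```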
Step six: and (5) fusing double-flow regularization to construct an objective function of the MLMS-DMBLS.
The enhanced feature nodes generated by mapping the Mashup description document theme vector matrix through the steps III, IV and V are
Figure BDA0003978463330000117
Each row represents a training sample, and the total number of the rows is MN; each column represents a node, and for convenience of description, let d=nk+m, d represents the total number of nodes, and represents d-dimensional features of the enhanced feature node X. Let x be i,: (1.ltoreq.i.ltoreq.MN) represents the ith row, X, of the enhanced feature node X :,j (1. Ltoreq.j.ltoreq.d) represents the j-th column of the enhancement feature node X.
1) Constructing the data manifold regularization term of the enhanced feature node matrix X
A data manifold regularization constraint is added to the objective function of the original width-learning-based Mashup service multi-label classification model so that samples keep their local geometric structure when embedded from the high-dimensional space into the low-dimensional space: sample points that are originally very close remain very close in the new low-dimensional projection space after feature mapping. Under this assumption, the data manifold regularization constraint of X minimises the distance between the labels $y_i$ and $y_j$ corresponding to mutually similar data samples $x_{i,:}$ and $x_{j,:}$, shown in formula (6) as:

$$\min \frac{1}{2}\sum_{i=1}^{MN}\sum_{j=1}^{MN} W_{ij}\,\|y_i - y_j\|^2 \tag{6}$$

where $y_i$ and $y_j$ are the embeddings in the low-dimensional label space corresponding to the high-dimensional samples $x_{i,:}$ and $x_{j,:}$, and $W_{ij}$ is the similarity between samples $x_{i,:}$ and $x_{j,:}$: the K-nearest-neighbour algorithm determines the p nearest neighbour samples of each sample, and a Gaussian kernel computes the similarity between a sample and its neighbours, yielding the sample similarity matrix

$$W = (W_{ij})_{MN\times MN} \tag{7}$$

with $W_{ij}$ calculated as in formula (8):

$$W_{ij} = \begin{cases}\exp\!\left(-\dfrac{\|x_{i,:}-x_{j,:}\|^2}{2t^2}\right), & x_{j,:}\in N_p(x_{i,:})\\[2mm] 0, & \text{otherwise}\end{cases} \tag{8}$$

where $N_p(x_{i,:})$ denotes the p nearest neighbours of sample $x_{i,:}$, and t is the bandwidth parameter of the Gaussian kernel, controlling the local range of action of the function.
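A minimal sketch of the similarity computation of formula (8), assuming Euclidean distances, a dense similarity matrix and a symmetrized kNN graph:

```python
import numpy as np

def gaussian_knn_similarity(V, p=5, t=1.0):
    """Similarity matrix per eq. (8) over the rows of V (p-NN graph, Gaussian kernel)."""
    n = V.shape[0]
    diff = V[:, None, :] - V[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)  # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:p + 1]      # p nearest neighbours (skip the point itself)
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2.0 * t ** 2))
    return np.maximum(W, W.T)                  # symmetrize the kNN graph
```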
2) Constructing the feature manifold regularization term of the enhanced feature node matrix X
To further strengthen the generalisation performance of width learning, by duality the sample features should also follow the manifold assumption: feature vectors in a similarity relation should have similar corresponding weight parameters, which accords with the general rule. As with data manifold regularization, feature manifold regularization requires the similarity matrix S between features:

$$S = (S_{ij})_{d\times d} \tag{9}$$

and the feature manifold regularization constraint of X, which minimises the distance between the weights $\beta_i$ and $\beta_j$ corresponding to mutually similar data features $x_{:,i}$ and $x_{:,j}$, can be expressed as:

$$\min \frac{1}{2}\sum_{i=1}^{d}\sum_{j=1}^{d} S_{ij}\,\|\beta_i - \beta_j\|^2 \tag{10}$$
3) Constructing the objective function of the MLMS-DMBLS
With data manifold regularization and feature manifold regularization in place, formulas (6) and (10) are fused into the objective function of the original width-learning-based Mashup service multi-label classification model, and the weight coefficients $\beta$ are solved under the double manifold regularization constraints by minimising the error between the predicted result and the actual output. In addition, to prevent over-fitting, the squared norm of the output coefficients is reduced so as to minimise both the empirical risk and the structural risk of the model. The objective function of the MLMS-DMBLS can therefore be described as:

$$\min_{\beta}\; \|X\beta - Y\|_2^2 + \lambda\|\beta\|_2^2 + \frac{C_1}{2}\sum_{i,j} W_{ij}\|y_i - y_j\|^2 + \frac{C_2}{2}\sum_{i,j} S_{ij}\|\beta_i - \beta_j\|^2 \tag{11}$$

where $\|\cdot\|_2^2$ is the squared 2-norm, $\lambda$ is the L2 regularization factor of width learning, balancing empirical and structural risk, $C_1$ is the penalty factor of the data manifold regularization, and $C_2$ the penalty factor of the feature manifold regularization.
4) Simplifying the objective function of the MLMS-DMBLS
To ease the subsequent solution of the MLMS-DMBLS model parameters, the double manifold regularization terms of formula (11) are rewritten by matrix transformation. The non-constant part of the data manifold regularization term simplifies as:

$$\frac{1}{2}\sum_{i,j} W_{ij}\,\|y_i - y_j\|^2 = \mathrm{trace}\big((X\beta)^T L_1 (X\beta)\big) \tag{12}$$

where trace(·) denotes the trace of a matrix, the embedded label matrix of the samples consists of the rows $y_i$ of $X\beta$, and $L_1$ is the data Laplacian matrix, computed from the similarity matrix W according to formula (13):

$$L_1 = D - W \tag{13}$$

D is the degree matrix, obtained from the corresponding similarity matrix as in formula (14):

$$D_{ii} = \sum_{j} W_{ij} \tag{14}$$

Similarly, through a matrix transformation analogous to formula (12), the feature manifold regularization term becomes formula (15):

$$\mathrm{trace}\big(\beta^T L_2 \beta\big) \tag{15}$$

where $L_2$ is the feature Laplacian matrix, obtained from the feature similarity matrix S:

$$L_2 = D - S \tag{16}$$

$$D_{ii} = \sum_{j} S_{ij} \tag{17}$$

Finally, the objective function of the MLMS-DMBLS is:

$$\min_{\beta}\; \|X\beta - Y\|_2^2 + \lambda\|\beta\|_2^2 + C_1\,\mathrm{trace}\big(\beta^T X^T L_1 X \beta\big) + C_2\,\mathrm{trace}\big(\beta^T L_2 \beta\big) \tag{18}$$
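The Laplacians of formulas (13)-(17) follow directly; the usage comments assume the `gaussian_knn_similarity` helper sketched earlier and the enhanced feature matrix X:

```python
import numpy as np

def laplacian(S):
    """Graph Laplacian L = D - S (eqs. (13)/(16)); degree D_ii = sum_j S_ij (eqs. (14)/(17))."""
    return np.diag(S.sum(axis=1)) - S

# Usage with the earlier sketches:
#   W  = gaussian_knn_similarity(X)      # data similarity over rows, eq. (8)
#   S  = gaussian_knn_similarity(X.T)    # feature similarity over columns, eq. (9) analogue
#   L1 = laplacian(W)                    # data Laplacian
#   L2 = laplacian(S)                    # feature Laplacian
```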
step seven: the weighting coefficients of the MLMS-DMBLS are calculated using the least squares method.
For solving the minimum value of beta in the formula (18), the whole needs to be derived and the derivative is 0, so that the beta is sequentially derived by four parts in the formula, and the specific derivation process is as follows:
Figure BDA0003978463330000151
Figure BDA0003978463330000152
/>
Figure BDA0003978463330000153
Figure BDA0003978463330000154
the synthesis (1) (2) (3) (4) part derives beta and makes it equal to 0, namely:
X T Xβ-X T Y+λβ+C 1 X T L 1 Xβ+C 2 L 2 β=0 (19)
the method can be characterized by comprising the following steps:
β=(X T X+λI+C 1 X T L 1 X+C 2 L 2 ) -1 X T Y (20)
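A sketch of the closed-form solution (20); `numpy.linalg.solve` is used instead of an explicit matrix inverse for numerical stability, and the hyper-parameter defaults are assumptions:

```python
import numpy as np

def solve_beta(X, Y, L1, L2, lam=1.0, C1=0.1, C2=0.1):
    """Closed-form weight matrix of eq. (20)."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d) + C1 * (X.T @ L1 @ X) + C2 * L2
    return np.linalg.solve(A, X.T @ Y)   # beta = A^{-1} X^T Y
```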
(2) Test phase
The description documents of part of the Mashup services in the data set are used as the test sample set for the experiments. Text preprocessing and topic feature extraction yield the topic feature vector matrix $\Theta^{*} \in \mathbb{R}^{M\times T}$ of the test Mashup services, where M is the number of test samples. This matrix is mapped to generate enhanced feature nodes, giving the final input $X^{*}$ of the test samples, which is fed into the trained multi-label classification model based on double-manifold regularized width learning; the classification results of the Mashup services are then predicted according to formula (21):

$$\mathrm{Pred} = X^{*}\beta \tag{21}$$

To turn the predicted category-probability vector into a binary multi-label result vector containing only 0s and 1s, a threshold is applied: for the output probability of each label's binary classifier, a prediction above the threshold is marked 1, meaning the service-category label is attached to the Mashup; otherwise it is marked 0, meaning the label is not attached. Taking each threshold in the set {0.1, 0.2, …, 0.8, 0.9} in turn and running repeated experiments, the threshold 0.5 was finally determined to give the best classification result. The solution process of the Mashup service multi-label classification model based on double-manifold regularized width learning is shown as Algorithm 1.
[Algorithm 1: solution process of the MLMS-DMBLS model — presented as an image in the original publication]
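The test-phase scoring and thresholding of formula (21) reduce to a few lines; the function name is illustrative and the 0.5 default follows the experiment above:

```python
import numpy as np

def predict_labels(X_test, beta, threshold=0.5):
    """Score test samples per eq. (21) and binarize at the chosen threshold."""
    scores = X_test @ beta                   # Pred = X* beta
    return (scores > threshold).astype(int)  # 1: attach the service-category label
```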
In order to verify the effectiveness of the proposed method, a series of experiments were performed on a real-world service data set, and five evaluation indexes suitable for multi-label classification were selected to measure performance. The evaluation indexes and the specific experiments are described in detail below.
(1) Evaluation index
For the multi-label classification problem, traditional classification indexes such as accuracy and recall cannot evaluate the results directly. To verify the beneficial effects of the proposed method, performance is evaluated with 5 general multi-label classification indexes: Hamming loss HL (Hamming Loss), one-error OE (One-error), coverage CV (Coverage), ranking loss RL (Ranking Loss) and average precision AP (Average Precision). Their formal definitions are given below, where m is the number of test samples, $\mathcal{Y}$ is the label set, $Y_i$ and $\bar{Y}_i$ denote the labels of sample $x_i$ marked 1 and 0 respectively, $h(x_i)$ is the predicted label set, $f(\cdot,\cdot)$ is the prediction function corresponding to the output probability vector $p^{*}$, and $\mathrm{rank}_f(x_i, y)$ is the position of label y when all labels are sorted by descending score.

$$\mathrm{HL} = \frac{1}{m}\sum_{i=1}^{m}\frac{1}{|\mathcal{Y}|}\,\big|h(x_i)\,\Delta\, Y_i\big|$$

Hamming loss evaluates how often a sample label is misclassified, where $\Delta$ denotes the symmetric difference (an exclusive-or operation), counting the real labels that differ from the predicted labels. The smaller HL, the better the model performance.

$$\mathrm{OE} = \frac{1}{m}\sum_{i=1}^{m}\Big[\arg\max_{y}\, f(x_i, y) \notin Y_i\Big]$$

One-error reflects how often the highest-probability predicted label is not correctly classified, focusing mainly on whether the most relevant label is predicted correctly. The smaller OE, the better the model performance.

$$\mathrm{CV} = \frac{1}{m}\sum_{i=1}^{m}\Big(\max_{y\in Y_i} \mathrm{rank}_f(x_i, y) - 1\Big)$$

Coverage reflects how many steps, on average, one must go down the descending ranking of predicted labels to cover all relevant labels. The smaller CV, the better the model performance.

$$\mathrm{RL} = \frac{1}{m}\sum_{i=1}^{m}\frac{\big|\{(y', y'') : f(x_i, y') \le f(x_i, y''),\; y'\in Y_i,\; y''\in \bar{Y}_i\}\big|}{|Y_i|\,|\bar{Y}_i|}$$

Ranking loss evaluates the average fraction of reversely ordered label pairs. The smaller RL, the better the model performance.

$$\mathrm{AP} = \frac{1}{m}\sum_{i=1}^{m}\frac{1}{|Y_i|}\sum_{y\in Y_i}\frac{\big|\{y'\in Y_i : \mathrm{rank}_f(x_i, y') \le \mathrm{rank}_f(x_i, y)\}\big|}{\mathrm{rank}_f(x_i, y)}$$

Average precision evaluates the mean fraction of correct labels ranked higher than a given label $y \in Y_i$, where $\mathrm{rank}_f(x, y') \le \mathrm{rank}_f(x, y)$ means that label y′ is ranked before label y. The larger AP, the better the model performance.
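For reference, four of the five indexes are available in scikit-learn; one-error is computed directly. This sketch assumes `scores` holds the real-valued label scores $X^{*}\beta$ and `Y_true` the 0/1 ground-truth matrix:

```python
import numpy as np
from sklearn.metrics import (hamming_loss, coverage_error, label_ranking_loss,
                             label_ranking_average_precision_score)

def evaluate(Y_true, scores, threshold=0.5):
    """Compute HL, OE, CV, RL and AP for a batch of test samples."""
    Y_pred = (scores > threshold).astype(int)
    top = scores.argmax(axis=1)                        # highest-scoring label per sample
    one_error = np.mean(Y_true[np.arange(len(top)), top] == 0)
    return {
        "HL": hamming_loss(Y_true, Y_pred),
        "OE": one_error,
        "CV": coverage_error(Y_true, scores) - 1,      # sklearn counts ranks from 1
        "RL": label_ranking_loss(Y_true, scores),
        "AP": label_ranking_average_precision_score(Y_true, scores),
    }
```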
(2) Data set selection and experimental setup
The data set adopted in the invention comes from data published by the Key Laboratory of Knowledge Processing and Networked Manufacturing of Hunan University of Science and Technology; it is real service data crawled from the ProgrammableWeb.com website, comprising 6206 Mashup services, 12919 Web APIs, the description documents of the services, the categories of the Web APIs, and so on. To evaluate the effectiveness of the proposed method, the performance of the algorithms is measured comprehensively with the five common multi-label evaluation indexes above, abbreviated HL↓, OE↓, CV↓, RL↓ and AP↑, where ↓ means a lower value is better and ↑ means a higher value is better. The comparison experiments have two parts. The first part is an ablation comparison, in which the proposed MLMS-DMBLS is compared with Mashup multi-label classification using manifold-regularized width learning M-BLS (Manifold Broad Learning System) and plain width learning BLS. The second part is a conventional comparison, in which MLMS-DMBLS is compared with the original extreme learning machine ELM, RELM (an ELM with an L2,1 regularization term), RMLDM (an extreme learning machine with data and feature manifold regularization terms) and the classical multi-label classification algorithm ML-KNN.
For the proposed MLMS-DMBLS method, the numbers of width learning feature nodes and enhancement nodes are determined by cyclic search with a fixed step via the grid search method; the regularization factors $\lambda$, $C_1$ and $C_2$ take values in $\{2^{-20}, 2^{-19}, \dots, 2^{19}, 2^{20}\}$; the number of topics T takes values in {10, 20, …, 100}. After repeated experiments, the detailed parameter settings of the invention are shown in Table 1.
Table 1. Model parameter settings [table presented as an image in the original publication]
To evaluate the model, 80% of the data set is used as the training set and 20% as the test set for verifying model performance. To reduce random experimental error, each algorithm is run 10 times under the same parameter settings, the average of each evaluation index is taken as the final experimental result, and the best result is marked in bold.
(3) Comparative experiments
The invention constrains the objective function of the width learning model with the manifold learning idea, thereby improving the model and its performance. To verify the effectiveness of this improvement, ablation experiments were performed; the test results are shown in Table 2.
Table 2. Ablation test results [table presented as an image in the original publication]
As Table 2 shows, across repeated experiments the proposed method reaches the best level on every index, indicating that the MLMS-DMBLS method has better classification performance and stability. The method introduces manifold regularization and uses the double constraints of the data manifold and the feature manifold so that, after feature mapping, the input text features keep the geometric distribution of the original samples and feature vectors in a similarity relation have similar weight parameters, which improves the classification performance of the model.
Meanwhile, to verify the effectiveness of the proposed method, it is compared with other multi-label classification algorithms on Mashup service multi-label classification; the results are shown in Table 3.
Table 3. Test results of the classification algorithms [table presented as an image in the original publication]
It can be seen that the invention achieves performance improvements of varying percentages on every evaluation index. On HL↓ the improvement over the second-ranked ML-KNN method is small — the loss drops by only 0.03% — but the effect on OE↓ and AP↑ is marked: the one-error falls by 19.56% and the average precision rises by 18.69%. So although ML-KNN approaches the proposed method on Hamming loss and coverage, the proposed method still dominates on the other indexes. On OE↓, MLMS-DMBLS shows no obvious difference from RMLDM, but it is 5.71% and 2.25% lower than the other two extreme-learning-machine methods ELM and RELM respectively. On AP↑, the precision is 2.49% higher than that of the second-ranked method RMLDM. Meanwhile, compared with the three extreme-learning-machine methods, the loss reductions on CV↓ and RL↓ are especially notable, indicating that with feature nodes and enhancement nodes acting together, the feature extraction ability of width learning clearly exceeds that of the extreme learning machine — above all under the double-manifold regularization constraints, where the overall performance exceeds that of the other multi-label classification models. In addition, the index results of ELM, RELM and RMLDM show that adding suitable constraints to a model's objective function lets it learn better parameters and improves classification performance to some extent, which accords with the conclusion drawn when analysing the ablation results of Table 2.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (9)

1. A Mashup service multi-label classification method based on double-manifold regularized width learning, characterized by comprising the following steps:
acquiring a Mashup service data set comprising Mashup service names, description documents, the API services called by each Mashup service, API service names and their categories, and preprocessing the description documents and the binary output labels of the Mashup services;
extracting features from the preprocessed Mashup description documents with a latent Dirichlet allocation topic model to generate Mashup description document topic distribution vectors;
taking the Mashup description document topic distribution vectors as the original input data of width learning, and linearly mapping the Mashup description document topic feature matrix into n groups of feature nodes using a linear transformation function and n randomly generated parameter matrices;
processing the feature nodes with a nonlinear activation function to generate enhancement nodes;
concatenating the feature nodes and the enhancement nodes to generate enhanced feature nodes, which serve as the input of the Mashup service multi-label classification model based on double-manifold regularized width learning;
constructing the objective function of the Mashup service multi-label classification model based on double-manifold regularized width learning using the manifold regularization idea;
solving the objective function by the least squares method to obtain the weight matrix of the double-manifold regularized width learning network, completing the training of the Mashup service multi-label classification model based on double-manifold regularized width learning;
acquiring the description document of a Mashup service for testing, obtaining its topic distribution vector matrix through text preprocessing and topic feature extraction, mapping that matrix to generate enhanced feature nodes and obtain the final input of the test sample, and feeding it into the trained Mashup service multi-label classification model based on double-manifold regularized width learning to predict the classification result of the Mashup service.
2. The Mashup service multi-label classification method based on double-manifold regularized width learning of claim 1, wherein the step of preprocessing the description documents and the binary output labels of the Mashup services comprises:
converting all letters in the description documents to lower case;
removing punctuation and meaningless special symbols from the description documents;
removing stop words from the description documents using the corpus in the nltk package;
converting inflected word forms to their dictionary base forms using the WordNetLemmatizer tool in the nltk package;
applying 0-1 conversion to the crawled Mashup call records to generate the Mashup-service-category call matrix $Y=(y_{ij})_{MN\times CN}$, where MN is the number of Mashup services, CN is the number of service categories, $y_{ij}=1$ means that Mashup service $M_i$ calls some service in category j, and $y_{ij}=0$ means that no service in category j is called.
3. The Mashup service multi-label classification method based on double-manifold regularized width learning of claim 1, wherein extracting features from the preprocessed Mashup description documents with the latent Dirichlet allocation topic model to generate the Mashup description document topic feature matrix comprises:
taking the Mashup functional description document set as the corpus of the latent Dirichlet allocation topic model;
training the latent Dirichlet allocation topic model to obtain the topic set $\{K_1, K_2, \dots, K_T\}$, where T is the number of topics set in the latent Dirichlet allocation topic model; each Mashup description document corresponds to a probability distribution over $\{K_1, K_2, \dots, K_T\}$, so the topic features of the i-th Mashup description document are represented as a $1\times T$ vector $\theta_i$, giving the topic distribution matrix of the whole corpus $\Theta \in \mathbb{R}^{MN\times T}$.
4. The Mashup service multi-label classification method based on double-manifold regularized width learning of claim 1, wherein the feature nodes are obtained according to the following formulas:

$$Z^n \equiv [Z_1, \dots, Z_n] \tag{1}$$

where $Z_i \in \mathbb{R}^{MN\times k}$ denotes the i-th group of feature nodes, containing k feature nodes, and $n\times k$ is the total number of feature nodes:

$$Z_i = \phi\big(\Theta W_{e_i} + \beta_{e_i}\big), \quad i = 1, \dots, n \tag{2}$$

where $\phi(\cdot)$ is a linear mapping function, and $W_{e_i}$ and $\beta_{e_i}$ are the weight coefficient matrix and bias matrix of the i-th group of feature nodes, generated by random mapping and following a normal distribution.
5. The Mashup service multi-label classification method based on double-flow regularized width learning as claimed in claim 1, wherein the enhancement nodes are obtained according to the following formula:
$$H^m = \xi(Z^n W_h + \beta_h) \quad (3)$$
where m is the number of enhancement nodes, which are generated from the feature nodes in a single pass; $W_h$ and $\beta_h$ are, respectively, the weight coefficient matrix and the bias matrix of the enhancement nodes, generated by random mapping, following a normal distribution, and then orthogonalized; and $\xi(\cdot)$ is a nonlinear activation function (see the sketch after this claim).
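For formula (3), one plausible reading (an assumption, since the orthogonalization step survives only as prose) is to orthonormalize the random weights with a QR decomposition and use tanh as the nonlinear activation ξ:

```python
import numpy as np

def enhancement_nodes(Z: np.ndarray, m: int, seed: int = 1) -> np.ndarray:
    rng = np.random.default_rng(seed)
    W_h = rng.normal(size=(Z.shape[1], m))   # random normal weights
    b_h = rng.normal(size=(1, m))            # random normal bias
    W_h, _ = np.linalg.qr(W_h)               # orthonormal columns (needs n*k >= m)
    return np.tanh(Z @ W_h + b_h)            # H^m = xi(Z^n W_h + beta_h)
```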
6. The Mashup service multi-label classification method based on double-flow regularized width learning of claim 1, wherein the enhanced feature nodes are obtained according to the following formula:
$$X = [Z^n \mid H^m] \quad (4)$$
where $X \in \mathbb{R}^{MN \times (nk + m)}$ is the enhanced feature node matrix, whose rows correspond to the samples and whose columns correspond to the total number of network nodes;
the input and output of the Mashup service multi-label classification model based on double-flow regularized width learning can therefore be related by the following formula:
$$Y = X\beta \quad (5)$$
where Y is the binary multi-label service-category matrix output by the Mashup service multi-label classification model based on double-flow regularized width learning, and β is the weight matrix to be learned by the width learning network (a concatenation sketch follows this claim).
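Formulas (4)-(5) amount to a column-wise concatenation followed by a linear read-out; as a point of comparison for formula (13) below, dropping both manifold terms would reduce β to an ordinary ridge-regression solution:

```python
import numpy as np

def enhanced_input(Z: np.ndarray, H: np.ndarray) -> np.ndarray:
    return np.hstack([Z, H])  # X = [Z^n | H^m]

def ridge_weights(X: np.ndarray, Y: np.ndarray, lam: float) -> np.ndarray:
    # Width-learning baseline for Y = X beta with only L2 regularization.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)
```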
7. The Mashup service multi-label classification method based on double-flow regularized width learning of claim 1, wherein constructing the objective function of the Mashup service multi-label classification model based on double-flow regularized width learning by fusing the data and feature manifold regularizations comprises the following steps:
constructing the data manifold regularization term of the enhanced feature node matrix X, whose constraint is:
$$\min_{\beta} \; \frac{1}{2} \sum_{i,j} W_{ij} \, \lVert y_i - y_j \rVert_2^2 \quad (6)$$
where $x_{i,:}$ and $x_{j,:}$ are the i-th and j-th rows of X (training samples), $y_i$ and $y_j$ are the embeddings of $x_{i,:}$ and $x_{j,:}$ in the low-dimensional label space corresponding to samples of the high-dimensional space, and $W_{ij}$ is the similarity between $x_{i,:}$ and $x_{j,:}$: the p nearest neighbor samples are determined by the K-nearest-neighbor algorithm, and the similarity is computed with the Gaussian kernel function
$$k(x_{i,:}, x_{j,:}) = \exp\!\left(-\frac{\lVert x_{i,:} - x_{j,:} \rVert_2^2}{t}\right), \quad (7)$$
so that $W_{ij}$ is calculated according to the following formula:
$$W_{ij} = \begin{cases} \exp\!\left(-\dfrac{\lVert x_{i,:} - x_{j,:} \rVert_2^2}{t}\right), & x_{j,:} \in N_p(x_{i,:}) \\ 0, & \text{otherwise} \end{cases} \quad (8)$$
where $N_p(x_{i,:})$ denotes the p nearest neighbors of sample $x_{i,:}$, and t is the bandwidth parameter of the Gaussian kernel function, controlling its local scope of action;
constructing the feature manifold regularization term of the enhanced feature node matrix X, which specifically comprises building the similarity matrix S between features:
$$S_{ij} = \exp\!\left(-\frac{\lVert x_{:,i} - x_{:,j} \rVert_2^2}{t}\right) \quad (9)$$
where $x_{:,i}$ and $x_{:,j}$ are the i-th and j-th columns of X, representing sample features;
the feature manifold regularization constraint of X then requires mutually similar data features $x_{:,i}$ and $x_{:,j}$ to have close corresponding weights $\beta_i$ and $\beta_j$, which can be expressed as:
$$\min_{\beta} \; \frac{1}{2} \sum_{i,j} S_{ij} \, \lVert \beta_i - \beta_j \rVert_2^2 \quad (10)$$
the objective function of the Mashup service multi-label classification model based on double-flow regularized width learning is then constructed as:
$$\min_{\beta} \; \lVert X\beta - Y \rVert_2^2 + \lambda \lVert \beta \rVert_2^2 + \frac{C_1}{2} \sum_{i,j} W_{ij} \, \lVert x_{i,:}\beta - x_{j,:}\beta \rVert_2^2 + \frac{C_2}{2} \sum_{i,j} S_{ij} \, \lVert \beta_i - \beta_j \rVert_2^2 \quad (11)$$
where $\lVert \cdot \rVert_2$ is the 2-norm, λ is the L2 regularization factor of the width learning, used to balance the empirical risk and the structural risk, $C_1$ is the penalty factor of the data manifold regularization, and $C_2$ is the penalty factor of the feature manifold regularization;
simplifying the objective function of the Mashup service multi-label classification model based on double-flow regularized width learning gives:
$$\min_{\beta} \; \lVert X\beta - Y \rVert_2^2 + \lambda \lVert \beta \rVert_2^2 + C_1 \, \mathrm{trace}\!\left(\beta^{T} X^{T} L_1 X \beta\right) + C_2 \, \mathrm{trace}\!\left(\beta^{T} L_2 \beta\right) \quad (12)$$
where trace(·) denotes the trace of a matrix, and $L_1$ and $L_2$ are the Laplacian matrices of the data and the features, obtained from the data similarity matrix W and the feature similarity matrix S, respectively (a Laplacian-construction sketch follows this claim).
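A sketch of the two Laplacians in formula (12), assuming scikit-learn's kneighbors_graph for the p-nearest-neighbor search and the Gaussian kernel of formula (7); passing X builds the data graph for $L_1$ and passing X.T builds the feature graph for $L_2$ (the dense pairwise distances limit this to moderate sizes):

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def gaussian_knn_laplacian(M: np.ndarray, p: int, t: float) -> np.ndarray:
    # Rows of M are the objects being compared (samples for L_1, features for L_2).
    knn = kneighbors_graph(M, n_neighbors=p, mode="connectivity").toarray()
    knn = np.maximum(knn, knn.T)  # symmetrize the neighborhood relation
    sq = ((M[:, None, :] - M[None, :, :]) ** 2).sum(axis=-1)
    W = knn * np.exp(-sq / t)     # Gaussian similarities on the kNN graph
    return np.diag(W.sum(axis=1)) - W  # L = D - W
```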
8. The Mashup service multi-label classification method based on double-flow regularized width learning of claim 7, wherein solving the objective function by the least square method to obtain the weight matrix of the double-flow regularized width learning network, thereby completing the training of the Mashup service multi-label classification model based on double-flow regularized width learning, comprises:
taking the derivative of the objective function with respect to β and setting the derivative to 0, which yields:
$$\beta = \left(X^{T}X + \lambda I + C_1 X^{T} L_1 X + C_2 L_2\right)^{-1} X^{T} Y \quad (13)$$
where β is the weight matrix of the double-flow regularized width learning network (see the sketch after this claim).
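Formula (13) translates directly into code; a linear solve is used instead of an explicit matrix inverse for numerical stability:

```python
import numpy as np

def dual_manifold_weights(X, Y, L1, L2, lam, C1, C2):
    # beta = (X^T X + lam*I + C1 * X^T L1 X + C2 * L2)^(-1) X^T Y
    A = X.T @ X + lam * np.eye(X.shape[1]) + C1 * (X.T @ L1 @ X) + C2 * L2
    return np.linalg.solve(A, X.T @ Y)
```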
9. The Mashup service multi-label classification method based on double-flow regularized width learning of claim 8, wherein the classification result of the Mashup service is predicted according to the following formula:
$$\mathrm{Pred} = X^{*}\beta \quad (14)$$
where Pred is the predicted classification result of the Mashup service, β is the weight matrix of the double-flow regularized width learning network, and $X^{*}$ is the final input of the test samples (see the sketch below).
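Prediction is a single matrix product; converting the real-valued scores of formula (14) into binary labels requires a thresholding rule that the claim leaves open, so the 0.5 cut-off below is an assumption:

```python
import numpy as np

def predict(X_test: np.ndarray, beta: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    scores = X_test @ beta                     # Pred = X* beta
    return (scores >= threshold).astype(int)   # binary multi-label output
```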
CN202211542741.3A 2022-12-02 2022-12-02 Mashup service multi-label classification method based on double-flow regularized width learning Pending CN116150669A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211542741.3A CN116150669A (en) 2022-12-02 2022-12-02 Mashup service multi-label classification method based on double-flow regularized width learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211542741.3A CN116150669A (en) 2022-12-02 2022-12-02 Mashup service multi-label classification method based on double-flow regularized width learning

Publications (1)

Publication Number Publication Date
CN116150669A true CN116150669A (en) 2023-05-23

Family

ID=86349753

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211542741.3A Pending CN116150669A (en) 2022-12-02 2022-12-02 Mashup service multi-label classification method based on double-flow regularized width learning

Country Status (1)

Country Link
CN (1) CN116150669A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117808104A (en) * 2024-02-29 2024-04-02 南京邮电大学 Viewpoint mining method based on self-supervision expression learning and oriented to hot topics
CN117808104B (en) * 2024-02-29 2024-04-30 南京邮电大学 Viewpoint mining method based on self-supervision expression learning and oriented to hot topics

Similar Documents

Publication Publication Date Title
CN112015868B (en) Question-answering method based on knowledge graph completion
CN113344615B (en) Marketing campaign prediction method based on GBDT and DL fusion model
CN113343125B (en) Academic accurate recommendation-oriented heterogeneous scientific research information integration method and system
CN112800344B (en) Deep neural network-based movie recommendation method
Wang et al. Graph neural networks: Self-supervised learning
CN105046323B (en) Regularization-based RBF network multi-label classification method
CN115422369B (en) Knowledge graph completion method and device based on improved TextRank
CN116150669A (en) Mashup service multi-label classification method based on double-flow regularized width learning
Vedavathi et al. SentiWordNet ontology and deep neural network based collaborative filtering technique for course recommendation in an e-learning platform
Zhang et al. ELM-MC: multi-label classification framework based on extreme learning machine
Kalifullah et al. Retracted: Graph‐based content matching for web of things through heuristic boost algorithm
Hu et al. Learning knowledge graph embedding with a bi-directional relation encoding network and a convolutional autoencoder decoding network
Nia et al. A framework for a large-scale B2B recommender system
Bahrami et al. Automatic image annotation using an evolutionary algorithm (IAGA)
Wang et al. Multi‐feedback Pairwise Ranking via Adversarial Training for Recommender
KR20230156242A (en) Method for providing meaning search service through semantic analysis
Priyanka et al. DeepSkillNER: an automatic screening and ranking of resumes using hybrid deep learning and enhanced spectral clustering approach
Zhao et al. Few-Shot Relation Extraction With Automatically Generated Prompts
Kumar et al. A Recommendation System & Their Performance Metrics using several ML Algorithms
Yasin et al. Enhanced CRNN-Based Optimal Web Page Classification and Improved Tunicate Swarm Algorithm-Based Re-Ranking
Paulus et al. Collaborative Filtering Recommender System for Semantic Model Refinement
He et al. An ensemble classification framework based on latent factor analysis
Helskyaho et al. Introduction to Machine Learning
Monika et al. Data pre-processing and customized onto-graph construction for knowledge extraction in healthcare domain of semantic web
Zhao et al. Model-based feature selection for neural networks: A mixed-integer programming approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination