CN116094977A - Deep learning method of service Qos prediction based on time perception feature-oriented optimization - Google Patents


Publication number
CN116094977A
Authority
CN
China
Prior art keywords: service, time, user, feature, formula
Prior art date
Legal status: Pending
Application number
CN202211437946.5A
Other languages
Chinese (zh)
Inventor
张佩云
潘朝君
陈禹同
黄文君
陶帅
陈健
谢荣见
Current Assignee
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology
Priority to CN202211437946.5A
Publication of CN116094977A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/14: Network analysis or design
    • H04L 41/142: Network analysis or design using statistical or mathematical methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent


Abstract

The invention discloses a deep learning method for service QoS prediction based on time-aware feature-oriented optimization, which provides a deep learning service QoS prediction model that further improves prediction accuracy. In the model, a deep gated recurrent network (DGRN) is first designed to extract time-aware user/service features through time perception; next, a generative adversarial network (GRGAN) is designed to perform feature optimization, and the discriminator in the GRGAN evaluates the optimized features in order to train the generator in the GRGAN; finally, the optimized features are used to predict QoS values, improving prediction accuracy.

Description

Deep learning method for service QoS prediction based on time perception feature oriented optimization
Technical Field
The invention relates to the field of service computing, and in particular to a deep learning method for service QoS prediction based on time-aware feature-oriented optimization.
Background
Service computing typically uses QoS to describe non-functional attributes of a service, such as throughput, response time, and cost. Since different users differ in network status, geographical location, and personal preferences, the QoS observed when different users invoke the same service will also differ. Because the number of services is too large, a user cannot invoke all Web services, so the QoS of many services is unknown. Services a user may desire can be recommended by predicting the user's QoS values for services that have not been invoked. Accurately predicting service QoS values therefore plays a critical role in service recommendation.
At present, traditional collaborative filtering methods have difficulty capturing the time-varying characteristics of service QoS, and traditional time-aware deep learning service prediction methods suffer from large numbers of network parameters, long training times, and a risk of over-fitting during feature extraction when acquiring time-aware user/service features. To reduce the number of network parameters, researchers have proposed GRU-based methods, but their network structures can be further optimized. Moreover, existing service QoS prediction methods perform QoS prediction on a sparse QoS matrix, and because a sparse QoS matrix provides little information, the predictive capability of such models is limited to a certain extent.
Disclosure of Invention
To solve the above-mentioned deficiencies in the background art, it is an object of the present invention to provide a deep learning method for service QoS prediction based on time-aware feature-oriented optimization.
The aim of the invention can be achieved by the following technical scheme: a deep learning method for service QoS prediction based on time-aware feature-oriented optimization includes the following steps:
decomposing the user-service original QoS matrix on each time slice by probability matrix factorization to obtain the user and service latent feature matrices on that time slice, and thereby the latent feature vectors of users and services on the time slice;
forming the obtained latent feature vectors of a user into the temporal feature set of the user, and the obtained latent feature vectors of a service into the temporal feature set of the service;
using a deep gated recurrent network (DGRN) as a generator, inputting the temporal feature sets of a user and a service into the DGRN respectively, and outputting the time-aware user feature of the user and the time-aware service feature of the service respectively;
performing probability matrix factorization on the QoS matrix training set to obtain reference features of users and services;
using several fully connected layers as a discriminator, inputting the obtained time-aware service feature into the discriminator to obtain a first service discrimination result, computing the generator loss from the first service discrimination result, and training the generator network parameters with the generator loss;
inputting the obtained time-aware service feature and the reference feature of the service into the discriminator respectively to obtain a second and a third service discrimination result, computing the discriminator loss from the second and third service discrimination results, and training the discriminator network parameters with the discriminator loss;
the process of inputting the obtained time-aware user feature into the discriminator is the same as the process of inputting the time-aware service feature into the discriminator;
and obtaining the final time-aware features of the user and the service with the trained generator, and taking the dot product of the final time-aware user feature and the final time-aware service feature to obtain the final QoS prediction value.
Preferably, the process of obtaining the latent feature vectors of users and services on the time slices comprises the following steps:
Let the time slices be [τ-M, τ-1], and let Q^t ∈ R^(m×n) be the user-service original QoS matrix on the t-th time slice, where m and n are the numbers of users and services, respectively. Performing probability matrix factorization on Q^t yields the latent feature matrices of users and services on the t-th time slice, U^t ∈ R^(d×m) and S^t ∈ R^(d×n), where each column U_i^t of U^t represents the latent feature vector of user i on the t-th time slice, each column S_j^t of S^t represents the latent feature vector of service j on the t-th time slice, and d is the dimension of the user/service latent feature vectors;
Assume that the QoS value Q_ij^t of user i for service j on the t-th time slice is determined by the inner product of U_i^t and S_j^t, namely:
p(Q_ij^t | U_i^t, S_j^t, σ²) = N(Q_ij^t | (U_i^t)^T S_j^t, σ²)  (1)
In formula (1), Q_ij^t is the QoS value of user i for service j on the t-th time slice, N(·) denotes the normal distribution, U^t and S^t are the latent feature matrices of users and services on the t-th time slice, T is the transpose symbol, and σ² is the variance of the original QoS matrix. The conditional probability of the user-service original QoS matrix is:
p(Q^t | U^t, S^t, σ²) = ∏_{i=1..m} ∏_{j=1..n} [N(Q_ij^t | (U_i^t)^T S_j^t, σ²)]^(I_ij)  (2)
In formula (2), I_ij is an indicator function and Q^t is the user-service original QoS matrix on the t-th time slice; if Q_ij^t is known, the value of the indicator function is 1, otherwise it is 0. Assume that U_i^t and S_j^t also obey normal distributions, namely:
p(U^t | σ_U²) = ∏_{i=1..m} N(U_i^t | 0, σ_U² I)  (3)
p(S^t | σ_S²) = ∏_{j=1..n} N(S_j^t | 0, σ_S² I)  (4)
In formula (3) and formula (4), σ_U and σ_S are the standard deviations of the user latent feature vectors and the service latent feature vectors.
Since the posterior probability is proportional to the prior probability multiplied by the likelihood, the posterior probability of U^t, S^t is obtained as shown in formula (5):
p(U^t, S^t | Q^t, σ², σ_U², σ_S²) ∝ p(Q^t | U^t, S^t, σ²) p(U^t | σ_U²) p(S^t | σ_S²)  (5)
Taking the logarithm of both sides of formula (5) gives formula (6):
ln p(U^t, S^t | Q^t, σ², σ_U², σ_S²) = -(1/(2σ²)) Σ_i Σ_j I_ij (Q_ij^t - (U_i^t)^T S_j^t)² - (1/(2σ_U²)) Σ_i ||U_i^t||² - (1/(2σ_S²)) Σ_j ||S_j^t||² + C  (6)
where C is a constant independent of U^t and S^t. Maximizing the posterior probability is equivalent to minimizing the objective function shown in formula (7):
E = (1/2) Σ_{i=1..m} Σ_{j=1..n} I_ij (Q_ij^t - (U_i^t)^T S_j^t)² + (λ_U/2) Σ_i ||U_i^t||² + (λ_S/2) Σ_j ||S_j^t||²  (7)
In formula (7), E is the objective function, λ_U = σ²/σ_U², and λ_S = σ²/σ_S². U_i^t and S_j^t are updated by gradient descent, as shown in formula (8) and formula (9):
U_i^t ← U_i^t - η ∂E/∂U_i^t  (8)
S_j^t ← S_j^t - η ∂E/∂S_j^t  (9)
In formula (8) and formula (9), η is the learning rate controlling the gradient descent speed during iteration.
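The gradient-descent updates of formulas (7)-(9) can be sketched in NumPy. This is a minimal illustration, not the patent's implementation: the function name `pmf`, the hyperparameter values, and the initialization scale are all assumptions; `lam_u` and `lam_s` stand in for λ_U and λ_S, and `mask` plays the role of the indicator function I_ij.

```python
import numpy as np

def pmf(Q, mask, d=8, lam_u=0.1, lam_s=0.1, eta=0.01, iters=200, seed=0):
    """Factorize a sparse QoS matrix Q (m x n) into latent matrices
    U (d x m) and S (d x n) by gradient descent on the objective of
    formula (7); mask holds I_ij (1 where Q_ij is known)."""
    rng = np.random.default_rng(seed)
    m, n = Q.shape
    U = 0.1 * rng.standard_normal((d, m))
    S = 0.1 * rng.standard_normal((d, n))
    for _ in range(iters):
        R = mask * (Q - U.T @ S)          # I_ij * (Q_ij - U_i^T S_j)
        # Gradients of formula (7), applied as in formulas (8) and (9)
        grad_U = -(S @ R.T) + lam_u * U
        grad_S = -(U @ R) + lam_s * S
        U -= eta * grad_U
        S -= eta * grad_S
    return U, S
```

Each column of the returned `U` and `S` is a latent feature vector of one user or service on the given time slice.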
Preferably, after obtaining the user/service latent feature vectors on the M time slices [τ-M, τ-1] by probability matrix factorization, the latent feature vectors of user i on the M time slices form the temporal feature set of user i, denoted U_i = {U_i^(τ-M), ..., U_i^(τ-1)}; the latent feature vectors of service j form the temporal feature set of service j, denoted S_j = {S_j^(τ-M), ..., S_j^(τ-1)}.
Preferably, the deep gated recurrent network DGRN is constructed as follows:
Given that user i invokes service j on the time slices [τ-M, τ-1], the temporal feature sets of user i and service j are fully exploited and subjected to time perception. Taking the service temporal feature set S_j = {S_j^(τ-M), ..., S_j^(τ-1)} of service j on the time slices [τ-M, τ-1] as input, the DGRN structure is as follows:
The DGRN structure consists of M deep gated recurrent units (DGRU). S_j^(τ-M+k-1) is the input of the k-th DGRU, denoted DGRU_k, i.e., the latent feature vector of service j on the (τ-M+k-1)-th time slice; h_(k-1) is the hidden state passed from the previous node and contains the related information of that node. Combining S_j^(τ-M+k-1) and h_(k-1), the DGRU obtains the hidden state h_k passed to the next node. The hidden state h_M output by the M-th node is taken as the final result, and the initial value h_0 of the hidden state is set to 0.
Preferably, using the constructed DGRN, the process of inputting the temporal feature sets of the user and the service into the DGRN respectively and outputting the time-aware user feature of the user and the time-aware service feature of the service comprises the following steps:
From the hidden state h_(k-1) passed by the (k-1)-th node and the input of node k, i.e., S_j^(τ-M+k-1), two gating states are obtained. The reset gate is shown in formula (10):
r_k = sigmoid(r'_k × γ_r + α_r)  (10)
In formula (10), r'_k = ReLU(h_(k-1) × γ_hr + S_j^(τ-M+k-1) × γ_xr + α_rr); γ_hr, γ_xr, γ_r are weight matrices, and α_rr, α_r are bias matrices;
Let z_k be the update gate of node k, as shown in formula (11):
z_k = sigmoid(z'_k × γ_z + α_z)  (11)
In formula (11), z'_k = ReLU(h_(k-1) × γ_hz + S_j^(τ-M+k-1) × γ_xz + α_zz); γ_hz, γ_xz, γ_z are weight matrices, and α_zz, α_z are bias matrices;
After the two gating signals r_k and z_k are obtained, the reset gate is used to obtain the reset data h_(k-1) ⊙ r_k, which is combined with the input S_j^(τ-M+k-1) to obtain the candidate state h̃_k containing the information of the current node, as shown in formula (12):
h̃_k = tanh(h'_k × γ_h + α_h)  (12)
In formula (12), h'_k = ReLU(S_j^(τ-M+k-1) × γ_xh + (h_(k-1) ⊙ r_k) × γ_hh + α_hh); γ_xh, γ_hh, γ_h are weight matrices, and α_hh, α_h are bias matrices;
The update gate is used to obtain the hidden state h_k of node k, which is fed into the (k+1)-th node, as shown in formula (13):
h_k = (1 - z_k) ⊙ h_(k-1) + z_k ⊙ h̃_k  (13)
After M nodes, the output result is the time-aware service feature S̃_j^τ of service j. The time-aware user feature Ũ_i^τ of user i is obtained in the same way as the time-aware service feature of service j.
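The gate computations of formulas (10)-(13) and the chaining of M units can be sketched as follows. This is a minimal NumPy illustration under assumptions: the dictionary keys `g_*`/`a_*` stand in for the γ/α weight and bias matrices, and the function names `dgru_cell`/`dgrn` are invented for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def dgru_cell(x, h_prev, p):
    """One DGRU step following formulas (10)-(13); p maps names to the
    gamma_* weight matrices and alpha_* bias vectors."""
    r = sigmoid(relu(h_prev @ p["g_hr"] + x @ p["g_xr"] + p["a_rr"]) @ p["g_r"] + p["a_r"])       # reset gate, (10)
    z = sigmoid(relu(h_prev @ p["g_hz"] + x @ p["g_xz"] + p["a_zz"]) @ p["g_z"] + p["a_z"])       # update gate, (11)
    h_cand = np.tanh(relu(x @ p["g_xh"] + (h_prev * r) @ p["g_hh"] + p["a_hh"]) @ p["g_h"] + p["a_h"])  # candidate, (12)
    return (1.0 - z) * h_prev + z * h_cand                                                        # new hidden state, (13)

def dgrn(seq, p, d):
    """Chain M DGRU units over a temporal feature set; h_0 = 0, and the
    hidden state h_M after the last unit is the time-aware feature."""
    h = np.zeros(d)
    for x in seq:
        h = dgru_cell(x, h, p)
    return h
```

Because the candidate state passes through tanh and formula (13) interpolates between h_(k-1) and the candidate, every component of the output stays in (-1, 1) when h_0 = 0.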
Preferably, the process of calculating the generator loss and the discriminator loss comprises:
Generator loss:
generating time-aware service features with the generator;
inputting the time-aware service features into the discriminator to obtain a first service discrimination result ε_1;
using the first service discrimination result ε_1 to calculate the generator loss L_G = MSE(ε_1);
training the generator network parameters with the generator loss L_G.
Discriminator loss:
inputting the reference service features into the discriminator to obtain a second service discrimination result ε_2;
using the second service discrimination result ε_2 to calculate the loss L_D1 = MSE(ε_2);
inputting the time-aware service features into the discriminator to obtain a third service discrimination result ε_3;
using the third service discrimination result ε_3 to calculate the loss L_D2 = MSE(ε_3);
calculating the discriminator loss L_D = L_D1 + L_D2;
training the discriminator network parameters with the discriminator loss L_D.
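The loss bookkeeping above can be sketched as follows. Note the target labels are an assumption of this sketch: the document specifies only that each loss is an MSE of a discrimination result, so the conventional adversarial targets (label 1 for reference-like features, label 0 for generated features) are used here for illustration.

```python
import numpy as np

def mse(pred, label):
    """Mean squared error between discrimination results and a target label."""
    pred = np.asarray(pred, dtype=float)
    return float(np.mean((pred - label) ** 2))

def generator_loss(d_out_generated):
    """L_G = MSE(eps_1): the generator improves when the discriminator scores
    its time-aware features like reference features (assumed target 1)."""
    return mse(d_out_generated, 1.0)

def discriminator_loss(d_out_reference, d_out_generated):
    """L_D = L_D1 + L_D2: push reference features toward the assumed target 1
    and generated features toward the assumed target 0."""
    return mse(d_out_reference, 1.0) + mse(d_out_generated, 0.0)
```

In training, the two losses are computed on the same batch of services and used to update the generator and discriminator parameters in turn.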
Preferably, the mean squared error MSE is used as the loss function when training the network parameters, and is calculated as follows:
MSE = (1/n) Σ_{j=1..n} (ε_j - ε̂_j)²  (14)
In formula (14), n is the number of services input during training, ε_j is the discrimination result of the discriminator for service j, and ε̂_j is the target label of the discrimination result during training, where k ∈ {1, 2, 3} indexes the three kinds of discrimination results;
Training is performed by gradient descent, as shown in formulas (15) to (18). First, the parameters in the generator network are trained with the generator loss according to formula (15) and formula (16):
γ_G ← γ_G - λ ∂L_G/∂γ_G  (15)
α_G ← α_G - λ ∂L_G/∂α_G  (16)
In formula (15) and formula (16), λ is the learning rate controlling the gradient descent speed during iteration, γ_G is the set of all weight matrices in the generator network, and α_G is the set of all bias matrices in the generator network. Then the parameters in the discriminator network are trained with the discriminator loss according to formula (17) and formula (18):
γ_D ← γ_D - λ ∂L_D/∂γ_D  (17)
In formula (17), γ_D is the set of all weight matrices in the discriminator network;
α_D ← α_D - λ ∂L_D/∂α_D  (18)
In formula (18), α_D is the set of all bias matrices in the discriminator network.
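The updates of formulas (15)-(18) are plain gradient-descent steps over a network's sets of weight and bias matrices. A minimal sketch (the function name `gd_step` and the toy quadratic loss are assumptions for illustration; in the model the gradients come from backpropagating L_G or L_D):

```python
import numpy as np

def gd_step(params, grads, lam):
    """One gradient-descent update of a parameter set, as in formulas
    (15)-(18): every weight/bias matrix moves against its gradient
    scaled by the learning rate lam."""
    return {k: params[k] - lam * grads[k] for k in params}

# Toy check on a 1-D quadratic loss L(w) = (w - 3)^2, with dL/dw = 2(w - 3):
params = {"w": np.array(0.0)}
for _ in range(100):
    grads = {"w": 2.0 * (params["w"] - 3.0)}
    params = gd_step(params, grads, lam=0.1)
```

After repeated steps the parameter converges to the minimizer w = 3, mirroring how γ_G, α_G, γ_D, α_D are iteratively refined.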
Preferably, the process of obtaining the final QoS prediction value comprises the following steps:
The temporal feature sets of user i and service j on the time slices [τ-M, τ-1] are input into the DGRN respectively, feature optimization is performed through time perception, and the final time-aware user feature Ũ_i^τ of user i and the final time-aware service feature S̃_j^τ of service j are output respectively. The dot product of Ũ_i^τ and S̃_j^τ is the QoS prediction value Q̂_ij^τ of user i for service j on time slice τ, calculated as shown in formula (19):
Q̂_ij^τ = Ũ_i^τ · S̃_j^τ  (19)
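The prediction step of formula (19) reduces to a single dot product of the two final time-aware feature vectors; a minimal sketch (function name assumed):

```python
import numpy as np

def predict_qos(u_feat, s_feat):
    """Formula (19): the predicted QoS of user i for service j on time
    slice tau is the dot product of their final time-aware features."""
    return float(np.dot(u_feat, s_feat))
```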
preferably, an apparatus comprises:
one or more processors;
a memory for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the deep learning method for service QoS prediction based on time-aware feature-oriented optimization as described above.
Preferably, a storage medium containing computer-executable instructions, which, when executed by a computer processor, perform the deep learning method for service QoS prediction based on time-aware feature-oriented optimization as described above.
The invention has the following beneficial effects:
A deep learning service QoS prediction model based on time-aware feature-oriented optimization is provided, which further improves prediction accuracy. In the model, a DGRN is first designed to extract time-aware user/service features through time perception; next, a GRGAN is designed to perform feature optimization, and the discriminator in the GRGAN evaluates the optimized features in order to train the generator in the GRGAN; finally, the optimized features are used to predict QoS values, improving prediction accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below; it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort:
FIG. 1 is a schematic diagram of the structure of the present invention;
FIG. 2 is a schematic diagram of the structure of DGRU_k of the present invention;
FIG. 3 is a schematic diagram of a time-aware based user/service feature extraction process of the present invention;
FIG. 4 is a schematic diagram of a GRGAN network architecture of the present invention with service j as an example;
FIG. 5 is a schematic diagram of the effect of the original feature vector dimensions of the user/service of the present invention on prediction accuracy;
FIG. 6 is a schematic diagram of the effect of the number of discriminator layers on the prediction accuracy of the present invention;
FIG. 7 is a schematic diagram of the effect of the time slice length on prediction accuracy of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, the deep learning method of service QoS prediction based on time-aware feature-oriented optimization includes the following steps:
decomposing the user-service original QoS matrix on each time slice by probability matrix factorization to obtain the user and service latent feature matrices on that time slice, and thereby the latent feature vectors of users and services on the time slice;
forming the obtained latent feature vectors of a user into the temporal feature set of the user, and the obtained latent feature vectors of a service into the temporal feature set of the service;
using a deep gated recurrent network (DGRN) as a generator, inputting the temporal feature sets of a user and a service into the DGRN respectively, and outputting the time-aware user feature of the user and the time-aware service feature of the service respectively;
performing probability matrix factorization on the QoS matrix training set to obtain reference features of users and services;
using several fully connected layers as a discriminator, inputting the obtained time-aware service feature into the discriminator to obtain a first service discrimination result, computing the generator loss from the first service discrimination result, and training the generator network parameters with the generator loss;
inputting the obtained time-aware service feature and the reference feature of the service into the discriminator respectively to obtain a second and a third service discrimination result, computing the discriminator loss from the second and third service discrimination results, and training the discriminator network parameters with the discriminator loss;
the process of inputting the obtained time-aware user feature into the discriminator is the same as the process of inputting the time-aware service feature into the discriminator;
and obtaining the final time-aware features of the user and the service with the trained generator, and taking the dot product of the final time-aware user feature and the final time-aware service feature to obtain the final QoS prediction value.
1) Probability matrix decomposition
To achieve temporal feature extraction, the user/service latent feature vectors on the time slices [τ-M, τ-1] first need to be initialized by probabilistic matrix factorization (PMF), which yields the user/service temporal feature sets; these are then input into the deep gated recurrent network for user/service feature optimization, which outputs time-aware user/service features.
Probability matrix factorization is used to obtain the latent feature vectors of users and services on the time slices [τ-M, τ-1]: two matrices are factorized from the user-service original QoS matrix, representing the latent features of users and services respectively, so that multiplying the factorized matrices recovers the original matrix. In practice, however, the QoS matrix is very sparse, i.e., only a few users' QoS values for a few services are present in the matrix.
Let Q^t ∈ R^(m×n) be the user-service original QoS matrix on the t-th time slice, where m and n are the numbers of users and services, respectively. Performing probability matrix factorization on Q^t yields the latent feature matrices of users and services on the t-th time slice, U^t ∈ R^(d×m) and S^t ∈ R^(d×n), where each column U_i^t of U^t represents the latent feature vector of user i on the t-th time slice, each column S_j^t of S^t represents the latent feature vector of service j on the t-th time slice, and d is the dimension of the user/service latent feature vectors. The PMF assumes that the QoS value Q_ij^t of user i for service j on the t-th time slice is determined by the inner product of U_i^t and S_j^t, namely:
p(Q_ij^t | U_i^t, S_j^t, σ²) = N(Q_ij^t | (U_i^t)^T S_j^t, σ²)  (1)
In formula (1), N(·) denotes the normal distribution. The conditional probability of the user-service original QoS matrix is:
p(Q^t | U^t, S^t, σ²) = ∏_{i=1..m} ∏_{j=1..n} [N(Q_ij^t | (U_i^t)^T S_j^t, σ²)]^(I_ij)  (2)
In formula (2), I_ij is an indicator function: if Q_ij^t is known, its value is 1, otherwise 0. Assume that U_i^t and S_j^t also obey normal distributions, namely:
p(U^t | σ_U²) = ∏_{i=1..m} N(U_i^t | 0, σ_U² I)  (3)
p(S^t | σ_S²) = ∏_{j=1..n} N(S_j^t | 0, σ_S² I)  (4)
Since the posterior probability is proportional to the prior probability multiplied by the likelihood, the posterior probability of U^t, S^t is obtained as shown in formula (5):
p(U^t, S^t | Q^t, σ², σ_U², σ_S²) ∝ p(Q^t | U^t, S^t, σ²) p(U^t | σ_U²) p(S^t | σ_S²)  (5)
Taking the logarithm of both sides of formula (5) gives formula (6):
ln p(U^t, S^t | Q^t, σ², σ_U², σ_S²) = -(1/(2σ²)) Σ_i Σ_j I_ij (Q_ij^t - (U_i^t)^T S_j^t)² - (1/(2σ_U²)) Σ_i ||U_i^t||² - (1/(2σ_S²)) Σ_j ||S_j^t||² + C  (6)
where C is a constant independent of U^t and S^t. Maximizing the posterior probability is equivalent to minimizing the objective function shown in formula (7):
E = (1/2) Σ_{i=1..m} Σ_{j=1..n} I_ij (Q_ij^t - (U_i^t)^T S_j^t)² + (λ_U/2) Σ_i ||U_i^t||² + (λ_S/2) Σ_j ||S_j^t||²  (7)
In formula (7), λ_U = σ²/σ_U² and λ_S = σ²/σ_S². U_i^t and S_j^t are updated by gradient descent, as shown in formula (8) and formula (9):
U_i^t ← U_i^t - η ∂E/∂U_i^t  (8)
S_j^t ← S_j^t - η ∂E/∂S_j^t  (9)
In formula (8) and formula (9), η is the learning rate controlling the gradient descent speed during iteration.
After obtaining the user/service latent feature vectors on the time slices [τ-M, τ-1] by probability matrix factorization, the latent feature vectors of user i on the M time slices form the temporal feature set of user i, denoted U_i = {U_i^(τ-M), ..., U_i^(τ-1)}; the latent feature vectors of service j form the temporal feature set of service j, denoted S_j = {S_j^(τ-M), ..., S_j^(τ-1)}.
2) Deep gated recurrent network design
A deep gated recurrent network structure is designed. Given that user i invokes service j on the time slices [τ-M, τ-1], the temporal feature sets of user i and service j are fully exploited and subjected to time perception. The structure of the DGRN is illustrated by taking the service temporal feature set S_j = {S_j^(τ-M), ..., S_j^(τ-1)} of service j on the time slices [τ-M, τ-1] as input.
The structure consists of M deep gated recurrent units (Deep Gate Recurrent Unit, DGRU). S_j^(τ-M+k-1) is the input of the k-th DGRU (denoted DGRU_k), i.e., the latent feature vector of service j on the (τ-M+k-1)-th time slice. h_(k-1) is the hidden state passed from the previous node and contains the related information of that node. Combining S_j^(τ-M+k-1) and h_(k-1), the DGRU obtains the hidden state h_k passed to the next node. The hidden state h_M output by the M-th node is taken as the final result. The initial value h_0 of the hidden state is set to 0. The DGRN structure for extracting the user feature, with the latent feature vectors of user i on the time slices [τ-M, τ-1] as input, is similar to the above. The specific structure of DGRU_k is shown in FIG. 2.
In FIG. 2, ⊕ denotes addition and ⊙ denotes the Hadamard product of matrices, i.e., the element-wise multiplication of two same-shaped matrices. ReLU is the rectified linear unit (Rectified Linear Unit, ReLU), i.e., ReLU(x) = max(0, x). σ is the sigmoid function, i.e., sigmoid(x) = 1/(1 + e^(-x)). tanh is the hyperbolic tangent function, i.e., tanh(x) = (e^x - e^(-x))/(e^x + e^(-x)). FC is a fully connected operation. In FIG. 2, the forward propagation of the DGRU proceeds as follows:
a) From the hidden state h_(k-1) passed by the (k-1)-th node and the input of node k (i.e., S_j^(τ-M+k-1)), two gating states are obtained. The reset gate is shown in formula (10):
r_k = sigmoid(r'_k × γ_r + α_r)  (10)
In formula (10), r'_k = ReLU(h_(k-1) × γ_hr + S_j^(τ-M+k-1) × γ_xr + α_rr); γ_hr, γ_xr, γ_r are weight matrices, and α_rr, α_r are bias matrices.
Let z_k be the update gate of node k, as shown in formula (11):
z_k = sigmoid(z'_k × γ_z + α_z)  (11)
In formula (11), z'_k = ReLU(h_(k-1) × γ_hz + S_j^(τ-M+k-1) × γ_xz + α_zz); γ_hz, γ_xz, γ_z are weight matrices, and α_zz, α_z are bias matrices.
b) As shown in FIG. 2, after the two gating signals r_k and z_k are obtained, the reset gate is used to obtain the reset data h_(k-1) ⊙ r_k, which is combined with the input S_j^(τ-M+k-1) to obtain the candidate state h̃_k containing the information of the current node, as shown in formula (12):
h̃_k = tanh(h'_k × γ_h + α_h)  (12)
In formula (12), h'_k = ReLU(S_j^(τ-M+k-1) × γ_xh + (h_(k-1) ⊙ r_k) × γ_hh + α_hh); γ_xh, γ_hh, γ_h are weight matrices, and α_hh, α_h are bias matrices.
c) The update gate is used to obtain the hidden state h_k of node k, which is fed into the (k+1)-th node, as shown in formula (13):
h_k = (1 - z_k) ⊙ h_(k-1) + z_k ⊙ h̃_k  (13)
d) After M nodes, the final output is the time-aware service feature S̃_j^τ of service j. The time-aware user feature Ũ_i^τ of user i is obtained similarly.
3) Time-aware user/service feature extraction
The time-aware user/service feature extraction process is shown in FIG. 3. The user temporal feature set U_i of user i and the service temporal feature set S_j of service j on the time slices [τ-M, τ-1] are input into two DGRNs respectively for feature optimization, which then output the time-aware user feature Ũ_i^τ of user i and the time-aware service feature S̃_j^τ of service j, respectively.
A. Deep learning service QoS prediction model design, training and prediction based on time perception feature oriented optimization
1) Deep learning service QoS prediction model design and training based on time perception feature-oriented optimization
The GRGAN mainly comprises two parts, namely a generator and a discriminator. Since the network structure used by the user and the service is the same, a network structure diagram of the GRGAN is given as an example of the service j, as shown in fig. 4.
In FIG. 4, the DGRN is first used as the generator of the generative adversarial network: the time feature set of service j is input into the generator to obtain the time-aware service feature of service j. Next, the reference service feature of service j and the time-aware service feature of service j are each fed into the discriminator, and the respective discrimination results are used to judge how well the features have been optimized.
The specific calculation process of the GRGAN loss is as follows:
a) Generator loss
(a) Generate a time-aware service feature with the generator.
(b) Input the time-aware service feature into the discriminator to obtain the discrimination result ε_1.
(c) Use ε_1 to calculate the generator loss L_G = MSE(ε_1).
(d) Train the generator network parameters with L_G.
b) Discriminator loss
(a) Input the reference service feature into the discriminator to obtain the discrimination result ε_2.
(b) Use ε_2 to calculate the loss L_D1 = MSE(ε_2).
(c) Input the time-aware service feature into the discriminator to obtain the discrimination result ε_3.
(d) Use ε_3 to calculate the loss L_D2 = MSE(ε_3).
(e) Calculate the discriminator loss L_D = L_D1 + L_D2.
(f) Train the discriminator network parameters with L_D.
In the parameter training process of the model, the mean square error (Mean Square Error, MSE) is adopted as the loss function. Taking the service training input as an example, the MSE is calculated as shown in formula (14):

MSE(ε_k) = (1/n) ∑_{j=1}^{n} (ε_{k,j} − ŷ_j)²    (14)

In formula (14), n is the number of services input during training, ε_{k,j} is the discrimination result of the discriminator for service j, ŷ_j is the result label used during training, and k ∈ {1, 2, 3}; the user case is handled in the same way.
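For concreteness, the MSE of formula (14) and the loss bookkeeping in steps a) and b) can be sketched as follows. The discriminator scores are made-up numbers, and the label convention (1 for reference features, 0 for generated features, with the generator targeting 1) is an assumption consistent with standard least-squares GAN training, not stated explicitly in the text:

```python
import numpy as np

def mse(eps, labels):
    """Formula (14): mean squared error between discrimination results and their labels."""
    eps = np.asarray(eps, dtype=float)
    labels = np.asarray(labels, dtype=float)
    return float(np.mean((eps - labels) ** 2))

# Hypothetical discriminator scores for n = 2 services:
eps1 = [0.6, 0.8]    # scores of generator-produced (time-aware) features
eps2 = [0.9, 0.95]   # scores of reference features
eps3 = [0.6, 0.8]    # generated features, scored again for the discriminator update

L_G  = mse(eps1, [1.0, 1.0])   # generator loss: generator wants "real" verdicts
L_D1 = mse(eps2, [1.0, 1.0])   # reference features labeled real
L_D2 = mse(eps3, [0.0, 0.0])   # generated features labeled fake
L_D  = L_D1 + L_D2             # total discriminator loss
```

With these toy scores, L_G = 0.1 and L_D = 0.50625, so the discriminator still separates the two feature sources fairly well.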
The GRGAN uses the generator and the discriminator for overall network optimization. While the DGRN learns feature optimization, the generator aims to produce features of high enough quality to fool the discriminator, while the discriminator aims to separate the optimized features from the reference features as far as possible. In this process, the network parameters of the GRGAN are updated by minimizing the two losses.
To optimize the parameters in the network, the GRGAN is trained by stochastic gradient descent to minimize the two loss functions, as shown in formulas (15) to (18). First, the generator parameters are trained with the generator loss according to formulas (15) and (16):

γ_G = γ_G − λ · ∂L_G/∂γ_G    (15)

α_G = α_G − λ · ∂L_G/∂α_G    (16)

In formulas (15) and (16), λ is the learning rate controlling the gradient descent speed during iteration, γ_G is the set of all weight matrices in the generator network, and α_G is the set of all bias matrices in the generator network. The discriminator parameters are then trained with the discriminator loss according to formulas (17) and (18):

γ_D = γ_D − λ · ∂L_D/∂γ_D    (17)

In formula (17), γ_D is the set of all weight matrices in the discriminator network.

α_D = α_D − λ · ∂L_D/∂α_D    (18)

In formula (18), α_D is the set of all bias matrices in the discriminator network.
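Each of formulas (15) through (18) is a plain stochastic-gradient-descent update. A minimal sketch for a single weight matrix follows; the quadratic toy loss L = ½‖γ‖² and the learning rate value are assumptions for illustration only:

```python
import numpy as np

def sgd_step(param, grad, lam):
    """One update in the style of formulas (15)-(18): param <- param - lambda * dL/dparam."""
    return param - lam * grad

# Toy loss L = 0.5 * ||gamma||^2, whose gradient with respect to gamma is gamma itself.
gamma = np.array([[1.0, -2.0], [0.5, 3.0]])
lam = 0.1  # learning rate (lambda in the patent's notation)
for _ in range(10):
    gamma = sgd_step(gamma, gamma, lam)
# Each entry shrinks by a factor of 0.9 per step, so after 10 steps gamma = 0.9**10 * gamma_0.
```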
2) Prediction with the deep learning service QoS prediction model based on time-aware feature-oriented optimization

The prediction structure of the model is shown in FIG. 1. After the DGRN has learned the feature optimization capability, the time feature vector sets of user i and service j on the time slices [τ−M, τ−1] are respectively input into two DGRNs, feature optimization is performed through time perception, and the time-aware user feature f_i^u of user i and the time-aware service feature f_j^s of service j are respectively output. Their dot product is the QoS predicted value Q̂_ij^τ of user i for service j on time slice τ, calculated as shown in formula (19):

Q̂_ij^τ = f_i^u · f_j^s    (19)
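The prediction step of formula (19) is just an inner product of the two optimized feature vectors; in this sketch the feature values are made-up numbers:

```python
import numpy as np

# Hypothetical time-aware features of user i and service j on time slice tau:
f_user = np.array([0.4, 1.2, 0.1])
f_service = np.array([0.5, 0.3, 2.0])

# Formula (19): the QoS predicted value is the dot product of the two features.
qos_pred = float(np.dot(f_user, f_service))  # 0.4*0.5 + 1.2*0.3 + 0.1*2.0 = 0.76
```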
A. Parameter Settings and Compared Methods

It should be further noted that, in a specific implementation, the experimental parameter settings are as shown in Table 1.

TABLE 1 Parameter settings

In IV-D, the parameters that optimize the model performance are determined experimentally within the parameter ranges set above, with the initial values of all parameters in the network determined using random numbers. Model performance is evaluated in IV-F using the optimal parameters. To evaluate the model herein, it is compared with the following typical methods:
1) UPCC (User-based CF using Pearson correlation coefficient) [22]: the method is a collaborative filtering algorithm based on users, calculates the similarity of users by using PCC, and predicts the missing QoS value by using the data about the similarity of other users.
2) IPCC (Item-based CF using Pearson correlation coefficient) [23]: the method is a collaborative filtering algorithm based on services, calculates service similarity by using PCC, and predicts missing QoS values by using data about the similarity of other services.
3) WSRec (Web service recommender system) [24]: the method is a hybrid cooperation algorithm, which combines UPCC and IPCC to predict missing QoS values.
4) K-SLOPE [25]: this method uses K-means to exclude less similar users, while employing the Slope One algorithm to predict missing QoS values over time.
5) NTF (Non-negative tensor factorization) [26]: the method is a generalized tensor decomposition model, considers the difference of QoS values of different times, and replaces a user-service matrix in matrix decomposition with a user-service-time interaction relationship.
6) RTF (Recurrent tensor factorization) [28]: the method is a time-aware service prediction framework based on deep learning, and combines tensor decomposition and deep learning for time-aware service recommendation.
7) TF-KMP (Temporal QoS forecasting with collaborative filtering QoS prediction based on K-means classification) [29]: the method combines TF and KMP to predict missing QoS values, where TF (Temporal QoS forecasting based on historical values) takes the average of historical QoS values as the predicted QoS value, and KMP (Collaborative filtering QoS prediction based on K-means classification) is a collaborative filtering QoS prediction method based on K-means clustering.
8) TUIPCC (Time-aware user-service item-based CF using Pearson correlation coefficient) [7]: the method is based on collaborative filtering, qoS is divided into a plurality of time slices, and then a time-aware similarity calculation mechanism is utilized to select similar users (or services) to predict missing QoS.
9) RNCF (Recurrent neural network based collaborative filtering) [18]: the method adds a multi-layer GRU structure to a neural collaborative filtering framework and shares calling records across different time slices to predict QoS; it is a collaborative-filtering-based prediction method.
B. Parameter Setting Experiments

To analyze the effect of different parameter settings on the prediction accuracy of the model herein, the following experiments were performed at matrix densities of 5%, 10%, 15% and 20% to determine the optimal values of the model parameters.

1) Original user/service feature vector dimension l: the original user/service feature vector dimension represents the number of individual features of users and services decomposed from the user-service original matrix Q, determining how many individual features are used to make QoS predictions. To study the effect of l on the prediction accuracy of the model herein, L was set to 3 and l was set to 50, 60, 70, 80, 90 and 100.

2) Number of discriminator layers L: to study the effect of L on prediction accuracy, the matrix density was set to 15%, l was set to 50, M was set to 3, and L was set to 2, 3, 4, 5 and 6.

3) Time slice length M: to study the effect of the time slice length on QoS prediction accuracy, l was set to 60, L was set to 3, and M was set to 4, 5, 6, 7, 8, 9 and 10.
1) Original user/service feature vector dimension l: the experimental results on the effect of l on the prediction accuracy of the model herein are shown in FIG. 5.

FIG. 5. Influence of the original user/service feature vector dimension on prediction accuracy

As shown in FIG. 5, as l increases, the prediction accuracy of QoS increases, and once l exceeds 60 it begins to decrease. This is because higher dimensions allow more features to be mined, so the model learns features more effectively and achieves higher prediction accuracy; as the dimension increases further, however, overfitting easily occurs and accuracy drops. As the matrix density increases from 5% to 20%, the prediction accuracy improves significantly, because with a denser user-service original QoS matrix, probability matrix decomposition can obtain the user/service latent feature vectors more accurately, thereby improving the prediction accuracy of the model. Therefore, l is taken as 60 in the following comparative experiments.
2) Number of discriminator layers L: the influence of the number of layers L of the discriminator on the prediction accuracy of the model is studied, and the experimental result is shown in fig. 6.
As shown in fig. 6, when L <3, as L increases, the prediction accuracy increases. This is because deeper networks can better learn the inherent links in the sample. However, when L >3, the prediction accuracy decreases with increasing L, since the risk of overfitting increases as the network deepens. L was therefore set to 3 in the following comparative experiments.
3) Time slice length M: the effect of the time slice length on QoS prediction accuracy was studied, and the experimental results are shown in fig. 7.
As shown in FIG. 7, when M < 8, the prediction accuracy increases as M increases. This is because the temporal information provides more features to optimize the prediction results. However, when M > 8, the prediction accuracy decreases as M increases, because the risk of overfitting grows with the time slice length. M is therefore set to 8 in the following comparative experiments.
C. Performance Comparison

The MAE and RMSE of the ten methods were compared at four different matrix densities (5%, 10%, 15% and 20%). Table 2 gives the MAE and RMSE of the response time for the different prediction methods, and Table 3 gives the MAE and RMSE of the throughput for the different prediction methods.
Table 2 response time comparison
Table 3 throughput comparison
Table 2 gives the MAE and RMSE of the response time for the ten methods. At the four matrix densities, the proposed model outperforms the other nine methods by 0.69% to 11.97%.

Table 3 gives the MAE and RMSE of the throughput for the ten methods. The throughput prediction accuracy of the proposed model is 0.20% to 10.75% higher than that of the other methods. As can be seen from Tables 2 and 3, the DGRN-based model achieves the highest prediction accuracy among the compared methods for both QoS attributes (response time and throughput), and achieves the best results at the different matrix densities. As the matrix density increases, the prediction accuracy increases, because more data can be used to train a more accurate model, thereby improving the service QoS prediction accuracy.
Based on the same inventive concept, the present invention also provides a computer apparatus comprising: one or more processors, and a memory for storing one or more computer programs; the program includes program instructions, and the processor is configured to execute the program instructions stored in the memory. The processor may be a central processing unit (Central Processing Unit, CPU), or another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The processor is the computational and control core of the terminal for implementing one or more instructions, in particular for loading and executing one or more instructions within a computer storage medium to implement the methods described above.
It should be further noted that, based on the same inventive concept, the present invention also provides a computer storage medium having a computer program stored thereon which, when executed by a processor, performs the above method. The storage medium may take the form of any combination of one or more computer-readable media. A computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing has shown and described the basic principles, principal features, and advantages of the present disclosure. It will be understood by those skilled in the art that the present disclosure is not limited to the embodiments described above; the foregoing embodiments and description merely illustrate the principles of the disclosure, and various changes and modifications may be made without departing from the spirit and scope of the disclosure, which is defined by the appended claims.

Claims (10)

1. The deep learning method of service QoS prediction based on time perception feature oriented optimization is characterized by comprising the following steps:
decomposing the user-service original QoS matrix on the time slice by using a probability matrix decomposition method to obtain the user and service latent feature matrix on the time slice, thereby obtaining the latent feature vectors of the user and service on the time slice;
the obtained potential feature vectors of the user form a time feature set of the user, and the obtained potential feature vectors of the service form a time feature set of the service;
taking a deep gating recurrent network DGRN as a generator, respectively inputting the time feature sets of the user and the service into the DGRN, and respectively outputting the time-aware user feature of the user and the time-aware service feature of the service;
performing probability matrix decomposition on the QoS matrix training set to obtain reference user characteristics of users and services;
using a plurality of full connection layers as a discriminator, inputting the obtained service characteristics based on time perception into the discriminator to obtain a first service discrimination result, calculating generator loss by using the first service discrimination result, and training generator network parameters by using the generator loss;
respectively inputting the obtained service characteristics based on time perception and the reference user characteristics of the service into a discriminator to obtain a second service discriminating result and a third service discriminating result, calculating discriminator losses by using the second service discriminating result and the third service discriminating result, and training network parameters of the discriminator by using the discriminator losses;
the process of inputting the obtained user characteristics based on time perception into the discriminator is the same as the process of inputting the service characteristics based on time perception into the discriminator;
and obtaining the time-aware features of the final user and the final service by using the trained generator, and performing dot multiplication on the time-aware feature of the final user and the time-aware feature of the final service to obtain the final QoS predicted value.
2. The deep learning method of service QoS prediction based on time-aware feature-oriented optimization of claim 1, wherein the process of obtaining the latent feature vectors of the users and services on the time slices comprises the following steps:

setting the time slices as [τ−M, τ−1], and letting Q^t be the user-service original QoS matrix on the t-th time slice, where m and n are the numbers of users and services, respectively; performing probability matrix decomposition on Q^t to obtain the latent feature matrices U^t and S^t of the users and services on the t-th time slice, respectively, wherein each column U_i^t of U^t represents the latent feature vector of user i on the t-th time slice, each column S_j^t of S^t represents the latent feature vector of service j on the t-th time slice, and d is the dimension of the user/service latent feature vectors;

setting the QoS value Q_ij^t of user i for service j on the t-th time slice to be determined by the inner product of U_i^t and S_j^t, namely:

Q_ij^t ~ N((U_i^t)^T S_j^t, σ²)    (1)

In formula (1), Q_ij^t is the QoS value of user i for service j on the t-th time slice, N() represents the normal distribution, U^t and S^t are the latent feature matrices of the users and services on the t-th time slice, T is the transpose symbol, and σ² is the variance of the original QoS matrix; the conditional probability of the user-service original QoS matrix is:

p(Q^t | U^t, S^t, σ²) = ∏_{i=1}^{m} ∏_{j=1}^{n} [N(Q_ij^t | (U_i^t)^T S_j^t, σ²)]^{I_ij}    (2)

In formula (2), I_ij is an indicator function, and Q^t is the user-service original QoS matrix on the t-th time slice; if Q_ij^t is known, the value of the indicator function is 1, otherwise it is 0; U_i^t and S_j^t also obey normal distributions, namely:

p(U^t | σ_U²) = ∏_{i=1}^{m} N(U_i^t | 0, σ_U² I)    (3)

p(S^t | σ_S²) = ∏_{j=1}^{n} N(S_j^t | 0, σ_S² I)    (4)

In formula (3) and formula (4), σ_U and σ_S are the standard deviations of the user latent feature vectors and the service latent feature vectors;

since the posterior probability is proportional to the prior probability multiplied by the likelihood probability, the posterior probability of U^t and S^t can be obtained, as shown in formula (5):

p(U^t, S^t | Q^t, σ², σ_U², σ_S²) ∝ p(Q^t | U^t, S^t, σ²) p(U^t | σ_U²) p(S^t | σ_S²)    (5)

taking the logarithm of both sides of formula (5) yields formula (6):

ln p(U^t, S^t | Q^t, σ², σ_U², σ_S²) = −(1/(2σ²)) ∑_{i=1}^{m} ∑_{j=1}^{n} I_ij (Q_ij^t − (U_i^t)^T S_j^t)² − (1/(2σ_U²)) ∑_{i=1}^{m} ||U_i^t||² − (1/(2σ_S²)) ∑_{j=1}^{n} ||S_j^t||² + C    (6)

where C is a constant independent of the parameters; maximizing the posterior probability is equivalent to maximizing the objective function shown in formula (7):

E = −(1/2) ∑_{i=1}^{m} ∑_{j=1}^{n} I_ij (Q_ij^t − (U_i^t)^T S_j^t)² − (λ_U/2) ∑_{i=1}^{m} ||U_i^t||² − (λ_S/2) ∑_{j=1}^{n} ||S_j^t||²    (7)

In formula (7), E is the objective function whose maximization is equivalent to maximizing the posterior probability, λ_U = σ²/σ_U² and λ_S = σ²/σ_S²; U_i^t and S_j^t are updated by gradient steps, as shown in formula (8) and formula (9):

U_i^t ← U_i^t + η · ∂E/∂U_i^t    (8)

S_j^t ← S_j^t + η · ∂E/∂S_j^t    (9)

In formula (8) and formula (9), η is the learning rate controlling the gradient descent speed during iteration.
3. The deep learning method of service QoS prediction based on time-aware feature-oriented optimization according to claim 2, wherein after the user/service latent feature vectors on the time slices [τ−M, τ−1] are obtained by probability matrix decomposition, the M latent feature vectors of user i form the time feature set of user i, namely {U_i^{τ−M}, U_i^{τ−M+1}, …, U_i^{τ−1}}, and the latent feature vectors of service j form the time feature set of service j, namely {S_j^{τ−M}, S_j^{τ−M+1}, …, S_j^{τ−1}}.
4. The deep learning method of service QoS prediction based on time-aware feature-oriented optimization according to claim 1, wherein the deep gating recurrent network DGRN is constructed as follows:

given that user i calls service j on the time slices [τ−M, τ−1], the time feature sets of user i and service j are fully utilized and subjected to time perception; taking the service time feature set of service j on the time slices [τ−M, τ−1] as input, the DGRN structure is:

h_k = DGRU_k(h_{k−1}, S_j^{τ−M+k−1}), k = 1, 2, …, M

wherein the DGRN consists of M deep gating recurrent units DGRU; S_j^{τ−M+k−1}, the input of the k-th DGRU (denoted DGRU_k), is the latent feature vector of service j on the (τ−M+k−1)-th time slice; h_{k−1} is the hidden state transferred from the previous node and contains the relevant information of that node; combining S_j^{τ−M+k−1} and h_{k−1}, the DGRU obtains the hidden state h_k transferred to the next node; the hidden state h_M output by the M-th node is taken as the final result, and the initial value h_0 of the hidden state is set to 0.
5. The deep learning method of service QoS prediction based on time-aware feature-oriented optimization according to claim 4, wherein the process of respectively inputting the time feature sets of the user and the service into the DGRN and respectively outputting the time-aware user feature of the user and the time-aware service feature of the service comprises the following steps:

two gating states are obtained from the hidden state h_{k−1} transferred from the (k−1)-th node and the input of node k, namely S_j^{τ−M+k−1} (written x_k below); the reset gate is shown in formula (10):

r_k = sigmoid(r′_k × γ_r + α_r)    (10)

In formula (10), r′_k = ReLU(h_{k−1} × γ_hr + x_k × γ_xr + α_rr), where γ_hr, γ_xr and γ_r are weight matrices, and α_rr and α_r are bias matrices;

letting z_k be the update gate of node k, as shown in formula (11):

z_k = sigmoid(z′_k × γ_z + α_z)    (11)

In formula (11), z′_k = ReLU(h_{k−1} × γ_hz + x_k × γ_xz + α_zz), where γ_hz, γ_xz and γ_z are weight matrices, and α_zz and α_z are bias matrices;

after the two gating signals r_k and z_k are obtained, the reset gate is applied to h_{k−1} to obtain the reset data h_{k−1} ⊙ r_k, which is then combined with x_k to obtain the candidate state h̃_k containing the information of the current node, as shown in formula (12):

h̃_k = tanh(h̃′_k × γ_h + α_h)    (12)

In formula (12), h̃′_k = ReLU((h_{k−1} ⊙ r_k) × γ_hh + x_k × γ_xh + α_hh), where γ_xh, γ_hh and γ_h are weight matrices, and α_hh and α_h are bias matrices;

the update gate is used to obtain the hidden state h_k of node k, and h_k is fed into the (k+1)-th node, as shown in formula (13):

h_k = (1 − z_k) ⊙ h_{k−1} + z_k ⊙ h̃_k    (13)

after M nodes, the output h_M is taken as the time-aware service feature of service j; the time-aware user feature of user i is obtained in the same way as the time-aware service feature of service j.
6. The deep learning method of service QoS prediction based on time-aware feature-oriented optimization of claim 1, wherein the process of calculating the generator loss and the discriminator loss comprises:

generator loss:
generating a time-aware service feature with the generator;
inputting the time-aware service feature into the discriminator to obtain a first service discrimination result ε_1;
using ε_1 to calculate the generator loss L_G = MSE(ε_1);
training the generator network parameters with L_G;

discriminator loss:
inputting the reference service feature into the discriminator to obtain a second service discrimination result ε_2;
using ε_2 to calculate the loss L_D1 = MSE(ε_2);
inputting the time-aware service feature into the discriminator to obtain a third service discrimination result ε_3;
using ε_3 to calculate the loss L_D2 = MSE(ε_3);
calculating the discriminator loss L_D = L_D1 + L_D2;
training the discriminator network parameters with L_D.
7. The deep learning method of service QoS prediction based on time-aware feature-oriented optimization according to claim 6, wherein the training process of the network parameters adopts the mean square error MSE as the loss function, and the MSE is calculated as follows:

MSE(ε_k) = (1/n) ∑_{j=1}^{n} (ε_{k,j} − ŷ_j)²    (14)

In formula (14), n is the number of services input during training, ε_{k,j} is the discrimination result of the discriminator for service j, ŷ_j is the result label used during training, and k ∈ {1, 2, 3}; the user case is handled in the same way;

training is performed by gradient descent, as shown in formulas (15) to (18); first, the generator parameters are trained with the generator loss according to formulas (15) and (16):

γ_G = γ_G − λ · ∂L_G/∂γ_G    (15)

α_G = α_G − λ · ∂L_G/∂α_G    (16)

In formula (15) and formula (16), λ is the learning rate controlling the gradient descent speed during iteration, γ_G is the set of all weight matrices in the generator network, and α_G is the set of all bias matrices in the generator network; the discriminator parameters are then trained with the discriminator loss according to formula (17) and formula (18):

γ_D = γ_D − λ · ∂L_D/∂γ_D    (17)

In formula (17), γ_D is the set of all weight matrices in the discriminator network;

α_D = α_D − λ · ∂L_D/∂α_D    (18)

In formula (18), α_D is the set of all bias matrices in the discriminator network.
8. The deep learning method of service QoS prediction based on time-aware feature-oriented optimization of claim 1, wherein the process of obtaining the final QoS predicted value comprises the following steps:

the time feature vector sets of user i and service j on the time slices [τ−M, τ−1] are respectively input into the DGRNs, feature optimization is performed through time perception, and the time-aware user feature f_i^u of the final user i and the time-aware service feature f_j^s of the final service j are respectively output; the product of f_i^u and f_j^s is the QoS predicted value Q̂_ij^τ of user i for service j on time slice τ, calculated as shown in formula (19):

Q̂_ij^τ = f_i^u · f_j^s    (19)
9. an apparatus, comprising:
one or more processors;
a memory for storing one or more programs;
when one or more of the programs are executed by one or more of the processors, the one or more of the processors implement the deep learning method of service QoS prediction based on time-aware feature-oriented optimization as claimed in any one of claims 1-8.
10. A storage medium containing computer executable instructions which, when executed by a computer processor, are for performing a deep learning method of service QoS prediction based on time-aware feature-oriented optimization as claimed in any one of claims 1-8.
CN202211437946.5A 2022-11-16 2022-11-16 Deep learning method of service Qos prediction based on time perception feature-oriented optimization Pending CN116094977A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211437946.5A CN116094977A (en) 2022-11-16 2022-11-16 Deep learning method of service Qos prediction based on time perception feature-oriented optimization


Publications (1)

Publication Number Publication Date
CN116094977A true CN116094977A (en) 2023-05-09

Family

ID=86199881

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211437946.5A Pending CN116094977A (en) 2022-11-16 2022-11-16 Deep learning method of service Qos prediction based on time perception feature-oriented optimization

Country Status (1)

Country Link
CN (1) CN116094977A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117292274A (en) * 2023-11-22 2023-12-26 成都信息工程大学 Hyperspectral wet image classification method based on zero-order learning of deep semantic dictionary
CN117292274B (en) * 2023-11-22 2024-01-30 成都信息工程大学 Hyperspectral wet image classification method based on zero-order learning of deep semantic dictionary


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination