CN114757391B - Network data space design and application method oriented to service quality prediction - Google Patents

Network data space design and application method oriented to service quality prediction

Info

Publication number
CN114757391B
Authority
CN
China
Prior art keywords
network
service
training sample
convolution
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210264626.8A
Other languages
Chinese (zh)
Other versions
CN114757391A (en
Inventor
鄢萌
王子梁
张小洪
吴云松
杨丹
付春雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202210264626.8A priority Critical patent/CN114757391B/en
Publication of CN114757391A publication Critical patent/CN114757391A/en
Application granted granted Critical
Publication of CN114757391B publication Critical patent/CN114757391B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395Quality analysis or management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Software Systems (AREA)
  • Educational Administration (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a network data space design and application method oriented to quality of service (QoS) prediction, which consists of two parts: potential network space learning and QoS prediction. The first part learns three potential network spaces (a user potential network space, a service potential network space, and an interaction potential network space). The second part first samples features from the three potential network spaces and then fuses the different features through three different convolution networks to realize QoS prediction. Through its two innovative components, the method (LNSL) generates a potential network space for each user and service at different locations and achieves high-precision QoS prediction.

Description

Network data space design and application method oriented to service quality prediction
Technical Field
The invention relates to a network space learning method, in particular to a network data space design and application method oriented to service quality prediction.
Background
With the advent of the everything-as-a-service era, how to recommend high-quality services to users has become critical. The main idea of service recommendation is to predict the QoS values (e.g., response time and throughput) of services that the user has not invoked and then recommend the best-quality service to the user. In addition, some micro-service resource optimization methods require a high-precision QoS prediction module. However, due to cost and privacy concerns, the network space information that has a significant impact on QoS is often difficult to capture. For example, the network space state (e.g., network speed, delay) of the user and of the network provider hosting the service can greatly affect the quality of the service.
The main limitation of previous QoS prediction methods is that they lack a way to mine the missing network space information of different users or services. In existing QoS prediction methods, one core idea is collaborative filtering that uses location information to find similar users and services, where the known information is the location of the service and of the calling user. Previous studies generally conclude that users and services in geographically close locations are likely to obtain or provide similar quality of service. The problem is that users or services close to each other do not necessarily have similar network status. As shown in fig. 1, users and services obtain network access from their home operators, and two services or users in close proximity may be served by two different providers. CF-based methods may therefore draw false conclusions: two users or services may be physically very close while their machines sit in different networks and are very far apart in terms of network distance. That is, neighbors in physical location do not necessarily belong to the same network.
Existing QoS prediction methods fall into two main categories: collaborative filtering (CF) based methods and deep learning based methods. The most common approach is a collaborative filtering model that learns the QoS of a target service from similar users or services. CF-based QoS prediction methods can in turn be divided into memory-based and model-based methods. Memory-based approaches treat QoS (e.g., RT, TP) or attributes (e.g., distance) as a measure of similarity between users or services; the key step is to calculate the similarity of the target object. Such methods include user-based (e.g., UPCC), service-based (e.g., IPCC), and combined user-service (e.g., UIPCC) methods. To further improve prediction accuracy, a number of model-based methods have been applied, such as matrix factorization (MF), an effective recommender technique that has been widely used in QoS prediction in recent years. CF methods often fail to achieve satisfactory prediction accuracy because of missing information and data sparsity.
Deep neural networks have become an effective QoS prediction approach; they capture nonlinear relationships by taking the characteristics of objects into account. Location information is relatively easy to obtain, and a great deal of research has used it to improve the accuracy of QoS prediction. Time-slice information is also widely used in QoS prediction. Complex models improve prediction accuracy to some extent, but they require many informative features as prediction conditions. Although location information raises prediction accuracy, existing deep neural network methods still cannot avoid wrongly grouping users or services that belong to different network spaces into similar groups.
LNSL is motivated by work in other areas, such as models based on variational auto-encoders (VAE) and prediction methods based on latent factors. The main difference is that LNSL learns different potential feature spaces by grouping features, in order to fill in missing network information, and does so without pre-training.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention aims to solve the technical problems that: how to provide a method with high QoS value prediction accuracy.
In order to solve the technical problems, the invention adopts the following technical scheme: a network data space design and application method oriented to service quality prediction includes the following steps:
S1: constructing a training set, wherein each training sample in the training set comprises user characteristics, service characteristics, interaction characteristics and a real QoS value, defined as follows:
The user characteristic U is denoted as U = (u_id, u_lon, u_lat), where u_id represents the user ID, u_lon represents the longitude of the user, and u_lat represents the latitude of the user;
The service characteristic S is denoted as S = (s_id, s_lon, s_lat), where s_id represents the service ID, s_lon represents the longitude of the service, and s_lat represents the latitude of the service;
The interaction characteristic C is denoted as C = (u_lon, u_lat, s_lon, s_lat), consisting of the longitude and latitude of the user and the longitude and latitude of the service;
S2: constructing a prediction neural network, wherein the prediction neural network comprises prior networks, a feature sampling layer, an object-aware convolution layer, a known-information-aware convolution layer, a full-information-aware convolution layer and two fully connected layers arranged in sequence;
S21: the detailed network structure of the prior network is as follows:
E_i = Squeeze_[1](Mean_[1,2](f^[2](X_i))) (1)
where E_i represents the output of the i-th prior network, f^[2] represents a one-dimensional convolution with a 2×1 kernel, Mean_[1,2] computes the average of the tensor along the given axes, [1,2] denoting the first and second dimensions of the tensor, Squeeze_[1] deletes the first dimension, and X_i represents the input combined feature of the i-th prior network;
The prior network is used to estimate the prior probability of a given combined feature; the prior probability distribution is modeled as an axis-aligned Gaussian distribution parameterized by the expectation μ and the variance σ;
The parameters μi and σi are calculated through the prior network, and the calculation formulas are as follows:
μi=fμ(Ei;Wμ) (2)
σi=fσ(Ei;Wσ) (3)
Where μ i represents the expectation of the ith a priori network, σ i represents the variance of the ith a priori network, f μ represents the process of calculating the expectation by E i, W μ represents the weight of μ, f σ represents the process of calculating the variance by E i, and W σ represents the weight of σ;
Inputting the user characteristic U of a training sample into the prior network with i=1 yields the user potential network space prior distribution P1 with parameters μ1 and σ1; inputting U of the training sample together with the QoS value corresponding to the training sample into the prior network with i=2 yields the user potential network space posterior distribution Pr1 with parameters μ2 and σ2;
Inputting the service characteristic S of the training sample into the prior network with i=3 yields the service potential network space prior distribution P2 with parameters μ3 and σ3; inputting S of the training sample together with the QoS value corresponding to the training sample into the prior network with i=4 yields the service potential network space posterior distribution Pr2 with parameters μ4 and σ4;
Inputting the interaction characteristic C of the training sample into the prior network with i=5 yields the interaction potential network space prior distribution P3 with parameters μ5 and σ5; inputting C of the training sample together with the QoS value corresponding to the training sample into the prior network with i=6 yields the interaction potential network space posterior distribution Pr3 with parameters μ6 and σ6;
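For illustration, the following is a minimal Keras sketch of one prior network as described by equations (1)–(3); the class name PriorNetwork, the latent dimension and the softplus activation used to keep σ positive are assumptions not fixed by the text.

```python
import tensorflow as tf

class PriorNetwork(tf.keras.layers.Layer):
    """Sketch of one prior network: Conv1D -> Mean over axes [1,2] -> Squeeze -> (mu, sigma).

    Illustrative only; the latent dimension and activations are assumptions.
    """
    def __init__(self, latent_dim=8, **kwargs):
        super().__init__(**kwargs)
        # f^[2]: one-dimensional convolution with a 2x1 kernel (equation (1))
        self.conv = tf.keras.layers.Conv1D(filters=latent_dim, kernel_size=2)
        # f_mu and f_sigma: heads producing the Gaussian parameters (equations (2)-(3))
        self.f_mu = tf.keras.layers.Dense(latent_dim)
        self.f_sigma = tf.keras.layers.Dense(latent_dim, activation="softplus")  # keeps sigma positive

    def call(self, x_i):
        # x_i: batched combined feature of shape (batch, length, dim);
        # axes [1,2] of the text are read here as the two non-batch axes.
        h = self.conv(x_i)
        e_i = tf.reduce_mean(h, axis=[1, 2], keepdims=True)   # Mean_[1,2]
        e_i = tf.squeeze(e_i, axis=1)                         # Squeeze_[1]
        return self.f_mu(e_i), self.f_sigma(e_i)              # mu_i, sigma_i
```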
S22: the feature sampling layer
The feature Z1 of the training sample is sampled from P1 as in equation (4):
Z1~P1(·|U)=Ν(μ1,diag(σ1)) (4)
the feature Z2 of the training sample is sampled from P2 as in equation (5):
Z2~P2(·|S)=Ν(μ2,diag(σ2)) (5)
The feature Z3 of the training sample is sampled from P3 as in equation (6):
Z3~P3(·|C)=Ν(μ3,diag(σ3)) (6)
the three samples are concatenated as the final characterization of the training sample:
Z=U⊕Z1⊕S⊕Z2⊕C⊕Z3 (7)
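A minimal sketch of the feature sampling step of equations (4)–(7), assuming the reparameterization trick is used so that sampling stays differentiable; the helper name sample_latent and the toy parameter values are illustrative.

```python
import tensorflow as tf

def sample_latent(mu, sigma):
    """Draw Z ~ N(mu, diag(sigma)) with the reparameterization trick; sigma is the variance."""
    eps = tf.random.normal(tf.shape(mu))
    return mu + tf.sqrt(sigma) * eps

# Toy usage with placeholder parameters of the user potential network space:
mu_p1, sigma_p1 = tf.zeros([1, 4]), tf.ones([1, 4])
Z1 = sample_latent(mu_p1, sigma_p1)     # equation (4)
# Z2 and Z3 are drawn in the same way from the service and interaction priors, and the
# final representation is the concatenation of equation (7):
#   Z = tf.concat([U, Z1, S, Z2, C, Z3], axis=-1)
```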
S23: respectively inputting the final characteristic representation Z of the training sample into the object-aware convolution layer, the known-information-aware convolution layer and the full-information-aware convolution layer, and then passing sequentially through the two fully connected layers to obtain the predicted QoS value of the training sample;
S24: defining a loss function of the prediction neural network; when the value of the loss function no longer changes, training ends and the trained prediction neural network is obtained; otherwise, the parameters of the prediction neural network are updated and a training sample is input to continue training;
S3: and for one sample to be predicted, obtaining user characteristics, service characteristics and interaction characteristics of the sample to be predicted by adopting the method of S1, setting the real QoS value of the sample to be predicted as a random array, and then inputting the sample to be predicted into a trained prediction neural network to obtain the QoS predicted value corresponding to the sample to be predicted.
Preferably, in the step S1, the acquired longitude of the user and the longitude of the service are each increased by 180, mapping the longitude range from [-180, 180] to [0, 360]; the acquired latitude of the user and the latitude of the service are each increased by 90, mapping the latitude range from [-90, 90] to [0, 180].
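A small sketch of the coordinate shift described above; rounding to integers is an assumption made because the embedding layer used later accepts only integer inputs.

```python
def shift_coordinates(lon, lat):
    """Map longitude from [-180, 180] to [0, 360] and latitude from [-90, 90] to [0, 180]."""
    return int(round(lon + 180)), int(round(lat + 90))

# Example: a location at longitude 106.55 and latitude 29.56 becomes (287, 120).
print(shift_coordinates(106.55, 29.56))
```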
Preferably, the specific process of the object-aware convolution layer in S23 is as follows: the convolution kernel size is 6×1 with stride 8, and the object-aware convolution performs two convolution operations;
X1=f(Z;w1) (8)
where w1 is a one-dimensional convolution kernel of size 6×1 and X1 represents the output of the object-aware convolution layer.
Preferably, the specific process of the known-information-aware convolution layer in S23 is as follows: the convolution kernel size is 3×1 with stride 11, and the known-information-aware convolution performs two convolution operations;
X2=f(Z;w2) (9)
where w2 is a one-dimensional convolution kernel of size 3×1, and X2 represents the output of the known-information-aware convolution layer.
Preferably, the specific process of the full-information-aware convolution layer in S23 is as follows: the convolution kernel size is 20×1, and only one convolution operation is executed;
X3=f(Z;w3) (10)
where w3 is a one-dimensional convolution kernel of size 20×1, and X3 represents the output of the full-information-aware convolution layer.
Preferably, the specific process of the two fully connected layers in S23 is as follows:
x=F(X1)⊕F(X2)⊕F(X3) (11)
ŷ=F_[16,8,1](x) (12)
where ⊕ indicates the concatenation operation, F indicates the flattening operation of the Keras Flatten layer, F_[16,8,1] indicates the fully connected layers, [16,8,1] indicates the numbers of neurons in those layers, and ŷ represents the QoS predicted value.
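The three convolution branches and the prediction head of equations (8)–(12) can be sketched as follows, treating Z as a length-20 one-channel sequence; the filter count of 1 per branch and the ReLU activations are assumptions, since the text fixes only the kernel sizes, strides and the [16,8,1] layer widths.

```python
import tensorflow as tf

def build_prediction_head(seq_len=20):
    """Object-aware, known-information-aware and full-information-aware convolutions
    followed by the [16, 8, 1] fully connected layers (equations (8)-(12))."""
    z = tf.keras.Input(shape=(seq_len, 1), name="Z")
    x1 = tf.keras.layers.Conv1D(1, kernel_size=6, strides=8)(z)    # object-aware: two windows
    x2 = tf.keras.layers.Conv1D(1, kernel_size=3, strides=11)(z)   # known-information-aware: two windows
    x3 = tf.keras.layers.Conv1D(1, kernel_size=20)(z)              # full-information-aware: one window
    x = tf.keras.layers.Concatenate()(
        [tf.keras.layers.Flatten()(b) for b in (x1, x2, x3)])      # equation (11)
    for units in (16, 8, 1):                                       # F_[16,8,1], equation (12)
        x = tf.keras.layers.Dense(units, activation="relu" if units > 1 else None)(x)
    return tf.keras.Model(z, x, name="qos_prediction_head")

model = build_prediction_head()
model.summary()
```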
Preferably, the loss function defined in S24 is as defined in formula (13):
Loss=Hloss+KLloss (13)
where H_loss is the Huber loss of formula (14) between the real QoS value and the predicted value: y represents the real QoS value of the training sample, ŷ represents the QoS value predicted by the prediction neural network for the training sample, and δ and β are constant terms;
KLloss=Dkl(Pr1||P1)+Dkl(Pr2||P2)+Dkl(Pr3||P3) (15)
where the three KL divergence terms D_kl(Pr_i||P_i), i=1,2,3, measure the distance between each posterior distribution Pr_i and its corresponding prior distribution P_i.
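Since each Pr_i and P_i is modeled as an axis-aligned Gaussian, the KL divergence terms in formula (15) take the standard closed form for diagonal Gaussians, given here for reference (σ denotes variance components and d the dimension of the potential network space):

```latex
D_{kl}\left(\mathrm{Pr}_i \,\|\, P_i\right)
  = \frac{1}{2}\sum_{k=1}^{d}\left(
      \frac{\sigma_{\mathrm{Pr}_i,k}}{\sigma_{P_i,k}}
    + \frac{\left(\mu_{\mathrm{Pr}_i,k}-\mu_{P_i,k}\right)^{2}}{\sigma_{P_i,k}}
    - 1
    + \ln\frac{\sigma_{P_i,k}}{\sigma_{\mathrm{Pr}_i,k}}
    \right)
```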
Compared with the prior art, the invention has at least the following advantages:
1. The method can realize high-precision QoS prediction: the invention provides knowledge of the network space at each specific location so as to make up for the deficiencies of prior methods.
2. The method of the invention provides a deep model for location-aware network space mining and makes the learned space as close to the real network space as possible. The former is realized by a convolutional neural network and a probability model, and the latter is realized by introducing a joint loss function over the feature space.
3. Extensive experiments were performed on two real-world datasets to evaluate the effectiveness of the method of the present invention, which was compared to the 10 most advanced baselines. Experimental results show that the method is superior to all baselines in terms of QoS prediction accuracy.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The present invention will be described in further detail below.
The invention provides a network data space design and application method oriented to service quality prediction (the latent network-space learning method, LNSL), which consists of two parts: potential network space learning and QoS prediction. The first part learns three potential network spaces (a user potential network space, a service potential network space, and an interaction potential network space). The second part first samples features from the three potential network spaces and then fuses the different features through three different convolution networks to realize QoS prediction. Through its two innovative components, LNSL generates a potential network space for each user and service at different locations and achieves high-precision QoS prediction.
How to mine the network space information of users and services on the basis of known information is the major technical challenge. In the QoS prediction problem, the only known information is the location information of users and services and the QoS history. Our main observation is that when a user accesses a service in another network state through his own network, the network information of each location is hidden in the real QoS record. Throughout this process, the network condition of the user, the network condition of the service, and the interaction state of the two networks all affect the quality of service. Therefore, we merge the three location features to build prior distributions of the different network spaces. We then fit three posterior distributions by fusing the real QoS information with the location features, and minimize the Kullback-Leibler (KL) distance between each corresponding prior and posterior distribution. In this way, the method is able to mine potential network space information from real QoS call records.
Referring to fig. 1, a network data space design and application method oriented to service quality prediction includes the following steps:
S1: constructing a training set, wherein each training sample in the training set comprises user characteristics, service characteristics, interaction characteristics and a real QoS value, defined as follows:
The user characteristic U is denoted as U = (u_id, u_lon, u_lat), where u_id represents the user ID, u_lon represents the longitude of the user, and u_lat represents the latitude of the user;
The service characteristic S is denoted as S = (s_id, s_lon, s_lat), where s_id represents the service ID, s_lon represents the longitude of the service, and s_lat represents the latitude of the service;
The interaction characteristic C is denoted as C = (u_lon, u_lat, s_lon, s_lat), consisting of the longitude and latitude of the user and the longitude and latitude of the service;
S2: constructing a prediction neural network, wherein the prediction neural network comprises prior networks, a feature sampling layer, an object-aware convolution layer, a known-information-aware convolution layer, a full-information-aware convolution layer and two fully connected layers arranged in sequence;
S21: the detailed network structure of the prior network is as follows:
E_i = Squeeze_[1](Mean_[1,2](f^[2](X_i))) (1)
where E_i represents the output of the i-th prior network, f^[2] represents a one-dimensional convolution with a 2×1 kernel, Mean_[1,2] computes the average of the tensor along the given axes, [1,2] denoting the first and second dimensions of the tensor, Squeeze_[1] deletes the first dimension, and X_i represents the input combined feature of the i-th prior network;
The prior network is used to estimate the prior probability of a given combined feature; the prior probability distribution is modeled as an axis-aligned Gaussian distribution parameterized by the expectation μ and the variance σ;
The parameters μi and σi are calculated through the prior network, and the calculation formulas are as follows:
μi=fμ(Ei;Wμ) (2)
σi=fσ(Ei;Wσ) (3)
Where μ i represents the expectation of the ith a priori network, σ i represents the variance of the ith a priori network, f μ represents the process of calculating the expectation by E i, W μ represents the weight of μ, f σ represents the process of calculating the variance by E i, and W σ represents the weight of σ;
Inputting the user characteristic U of a training sample into the prior network with i=1 yields the user potential network space prior distribution P1 with parameters μ1 and σ1; inputting U of the training sample together with the QoS value corresponding to the training sample into the prior network with i=2 yields the user potential network space posterior distribution Pr1 with parameters μ2 and σ2;
Inputting the service characteristic S of the training sample into the prior network with i=3 yields the service potential network space prior distribution P2 with parameters μ3 and σ3; inputting S of the training sample together with the QoS value corresponding to the training sample into the prior network with i=4 yields the service potential network space posterior distribution Pr2 with parameters μ4 and σ4;
Inputting the interaction characteristic C of the training sample into the prior network with i=5 yields the interaction potential network space prior distribution P3 with parameters μ5 and σ5; inputting C of the training sample together with the QoS value corresponding to the training sample into the prior network with i=6 yields the interaction potential network space posterior distribution Pr3 with parameters μ6 and σ6;
S22: the feature sampling layer
The feature Z1 of the training sample is sampled from P1 as in equation (4):
Z1~P1(·|U)=Ν(μ1,diag(σ1)) (4)
the feature Z2 of the training sample is sampled from P2 as in equation (5):
Z2~P2(·|S)=Ν(μ2,diag(σ2)) (5)
The feature Z3 of the training sample is sampled from P3 as in equation (6):
Z3~P3(·|C)=Ν(μ3,diag(σ3)) (6)
the three samples are concatenated as the final characterization of the training sample:
Z=U⊕Z1⊕S⊕Z2⊕C⊕Z3 (7)
S23: respectively inputting the final characteristic representation Z of the training sample into the object-aware convolution layer, the known-information-aware convolution layer and the full-information-aware convolution layer, and then passing sequentially through the two fully connected layers to obtain the predicted QoS value of the training sample;
S24: defining a loss function of the prediction neural network; when the value of the loss function no longer changes, training ends and the trained prediction neural network is obtained; otherwise, the parameters of the prediction neural network are updated and a training sample is input to continue training;
S3: and for one sample to be predicted, obtaining user characteristics, service characteristics and interaction characteristics of the sample to be predicted by adopting the method of S1, setting the real QoS value of the sample to be predicted as a random array, and then inputting the sample to be predicted into a trained prediction neural network to obtain the QoS predicted value corresponding to the sample to be predicted.
Feature embedding. In order for the neural network to learn additional data characteristics, the user identifier, user location information, service identifier, service location information, and real response time are input into the Keras embedding layer, which can be viewed as a special fully connected layer without a bias term. Specifically, the embedding layer one-hot encodes the input, generating a zero vector of a specified dimension whose i-th position is set to 1. Because the embedding layer accepts only non-negative integer input data, we use conversion operations to turn the location information and RT information into non-negative integers: the acquired longitude of the user and the longitude of the service are each increased by 180, mapping the longitude range from [-180, 180] to [0, 360], and the acquired latitude of the user and the latitude of the service are each increased by 90, mapping the latitude range from [-90, 90] to [0, 180].
Similar to natural language processing, where dense vectors are used to represent words or documents, this operation maps each categorical feature to a high-dimensional dense embedding vector.
The feature vectors are combined into three different combined features, namely user features, service features and interaction features, according to the physical meaning of the features. The user characteristics consist of a user ID, user longitude and latitude. The service characteristics consist of a service ID, service longitude and latitude. The interaction characteristics include user longitude, latitude, and service longitude, latitude.
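A minimal sketch of this embedding and grouping step, assuming Keras Embedding layers; the embedding dimension and vocabulary sizes are assumptions (the ID vocabularies below follow the 339 users and 5,825 services of the data set described later).

```python
import tensorflow as tf

EMB_DIM = 8                                           # embedding dimension (assumption)
emb_uid = tf.keras.layers.Embedding(340, EMB_DIM)     # 339 users in the data set
emb_sid = tf.keras.layers.Embedding(5826, EMB_DIM)    # 5,825 services in the data set
emb_lon = tf.keras.layers.Embedding(361, EMB_DIM)     # shifted longitude in [0, 360]
emb_lat = tf.keras.layers.Embedding(181, EMB_DIM)     # shifted latitude in [0, 180]

def combined_features(u_id, u_lon, u_lat, s_id, s_lon, s_lat):
    """Group embeddings into the user feature U, service feature S and interaction feature C."""
    U = tf.concat([emb_uid(u_id), emb_lon(u_lon), emb_lat(u_lat)], axis=-1)
    S = tf.concat([emb_sid(s_id), emb_lon(s_lon), emb_lat(s_lat)], axis=-1)
    C = tf.concat([emb_lon(u_lon), emb_lat(u_lat), emb_lon(s_lon), emb_lat(s_lat)], axis=-1)
    return U, S, C
```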
Prior distributions of the potential network spaces. The central components (P1, P2, P3) of the inventive architecture are three one-dimensional prior distributions; each position in such a space encodes a variant. The probabilities of these variants for a given combined feature are estimated by a "prior network" parameterized by weights W, and these prior probability distributions are modeled as axis-aligned Gaussian distributions.
Posterior distributions of the network spaces. In the potential network space learning module, we build three different network space prior distributions from the known location features. During training, we fit the posterior distributions Pr_i using the real QoS values together with the known location features. The prior distribution P_i built from the known features is pushed closer to the posterior distribution Pr_i by minimizing the KL divergence between Pr_i and P_i. The posterior networks share the structure of the prior networks: the i-th posterior distribution Pr_i is obtained by passing the labeled combined feature through the corresponding prior network, where U_r denotes the user information with the real label, S_r denotes the service information with the real label, C_r denotes the interaction information with the real label, and the real QoS values form the true label vector.
Specifically, the invention designs three different convolution kernels to realize category-aware perception of the features.
1) Object-aware convolution. Object-Con considers the status of a user or service separately by generating object features; it attends to the states of the user object and the service object themselves to judge the quality of service. The convolution kernel size is 6×1 with stride 8, and the object-aware convolution performs two convolution operations: the first merges all user features to obtain the user object feature, and the second merges all service features to obtain the service object feature.
X1=f(Z;w1) (8)
where w1 is a one-dimensional convolution kernel of size 6×1 and X1 represents the output of the object-aware convolution layer.
2) Known-information-aware convolution. The convolution kernel size is 3×1 with stride 11, and the known-information-aware convolution performs two convolution operations: the first fuses all known user features to obtain the known user fusion feature, and the second fuses all known service features to obtain the known service fusion feature.
X2=f(Z;w2) (9)
where w2 is a one-dimensional convolution kernel of size 3×1 and X2 represents the output of the known-information-aware convolution layer.
3) Full-information-aware convolution. All-Con performs only one convolution operation, which fuses all the features to obtain the fused feature.
X3=f(Z;w3) (10)
where w3 is a one-dimensional convolution kernel of size 20×1 and X3 represents the output of the full-information-aware convolution layer.
Prediction network. We combine the fusion features and the sampled features and pass them through the convolution layers; finally, QoS prediction is realized through a fully connected network. The specific steps are as follows:
x=F(X1)⊕F(X2)⊕F(X3) (11)
ŷ=F_[16,8,1](x) (12)
where ⊕ represents the concatenation operation, F represents the flattening operation of the Keras Flatten layer, F_[16,8,1] represents the fully connected layers, [16,8,1] represents the numbers of neurons in those layers, and ŷ represents the predicted value.
The Huber loss is a loss function used in robust regression and is less sensitive to outliers in the data. Therefore, the loss function defined in S24 is as in equation (13):
Loss=Hloss+KLloss (13)
where H_loss is the Huber loss of formula (14) between the real QoS value and the predicted value: y represents the real QoS value of the training sample, ŷ represents the QoS value predicted by the prediction neural network for the training sample, δ is a constant term with a default value of 0.5, and β is a constant term.
KLloss=Dkl(Pr1||P1)+Dkl(Pr2||P2)+Dkl(Pr3||P3) (15)
where the three KL divergence terms D_kl(Pr_i||P_i), i=1,2,3, measure the distance between each posterior distribution Pr_i and its corresponding prior distribution P_i.
During training, the KL loss measures the distances between the three prior distributions P_i and the corresponding posterior distributions Pr_i. During testing, we feed a random array (e.g., an array of all 1s) in place of the real QoS value; since the network parameters are not updated at test time, Pr_i does not affect the test results.
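A minimal sketch of the joint training loss of formulas (13)–(15), assuming the Keras Huber loss with δ = 0.5 and a hand-written Gaussian KL term; the role of the β constant is not specified in the text and is therefore not modeled here.

```python
import tensorflow as tf

huber = tf.keras.losses.Huber(delta=0.5)     # H_loss with the stated default delta

def gaussian_kl(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, diag(var_q)) || N(mu_p, diag(var_p)) ) for axis-aligned Gaussians."""
    return 0.5 * tf.reduce_sum(
        var_q / var_p + tf.square(mu_q - mu_p) / var_p - 1.0 + tf.math.log(var_p / var_q),
        axis=-1)

def total_loss(y_true, y_pred, priors, posteriors):
    """Loss = H_loss + KL_loss (formula (13)); priors/posteriors are lists of (mu, var) pairs."""
    h_loss = huber(y_true, y_pred)
    kl_loss = tf.add_n([tf.reduce_mean(gaussian_kl(mu_q, var_q, mu_p, var_p))
                        for (mu_p, var_p), (mu_q, var_q) in zip(priors, posteriors)])
    return h_loss + kl_loss
```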
Experimental verification:
1.1 Data set
We experimented with two reference data sets of Web service QoS data collected by the WS-DREAM system. This is a large-scale real-world Web service data set containing 1,974,675 QoS records collected from 339 users over 5,825 services, together with the location information of the users and services. Here, the QoS data set takes the form of a user-service matrix, in which the row index represents a user identifier, the column index represents a service identifier, and each value in the matrix represents the corresponding response time (RT).
Outliers are eliminated when calculating the MAE and the root mean square error. We have no real labels for outliers, so we need to detect them from the dataset. In the present invention, we use the same detection method and parameter settings as in prior work, detecting outliers with the iForest (isolation forest) method. iForest detects outliers based on the isolation concept without using any distance or density metric, which makes the algorithm efficient and robust. iForest computes an outlier score in [0, 1] for each record; the larger the score, the more likely the record is an outlier. Based on these scores, the proportion of outliers can be set flexibly; here it is set to 0.1.
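A sketch of the described outlier filtering using scikit-learn's IsolationForest with the stated contamination ratio of 0.1; treating each observed record as a numeric row is an assumption.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def filter_outliers(qos_records, contamination=0.1, seed=0):
    """Drop the records that IsolationForest labels as outliers (-1) and keep inliers (+1)."""
    records = np.asarray(qos_records, dtype=float)   # e.g. rows of [user_id, service_id, rt]
    labels = IsolationForest(contamination=contamination,
                             random_state=seed).fit_predict(records)
    return records[labels == 1]
```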
1.2 Evaluation index
In the QoS prediction problem, an important evaluation criterion is prediction accuracy. In most studies, two indicators are used to measure accuracy. The first metric is the mean absolute error (MAE):
MAE = (1/N) Σ_{i,j} |r_{i,j} − r̂_{i,j}|
where r_{i,j} is the actual QoS value, r̂_{i,j} is the predicted value of the QoS attribute, and N is the number of QoS records. The second metric is the root mean square error (RMSE):
RMSE = sqrt( (1/N) Σ_{i,j} (r_{i,j} − r̂_{i,j})² )
the lower the values of these two indices, the more accurate our predictions are.
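For reference, the two metrics can be computed directly from the observed and predicted QoS values; a minimal numpy sketch:

```python
import numpy as np

def mae(r_true, r_pred):
    """Mean absolute error over the N evaluated QoS records."""
    return np.mean(np.abs(np.asarray(r_true) - np.asarray(r_pred)))

def rmse(r_true, r_pred):
    """Root mean square error over the N evaluated QoS records."""
    return np.sqrt(np.mean((np.asarray(r_true) - np.asarray(r_pred)) ** 2))
```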
For ease of description, we compare the method LNSL of the present invention with the following six methods:
IPCC uses the idea of the item-based collaborative filtering algorithm to calculate the similarity between two services.
UPCC uses similar user behavior information to predict QoS.
UIPCC is a hybrid collaborative filtering method that combines the results of UPCC and IPCC.
HMF is also a model-based collaborative filtering method, which uses location information to cluster users and services, and then combines local matrix factorization prediction and global matrix factorization prediction.
LDCF is an advanced deep learning method with location awareness for QoS prediction.
CMF is an outlier-resilient QoS prediction method that uses the Cauchy loss to measure the difference between observed and predicted QoS values.
For all methods, outliers are removed when calculating MAE and RMSE. Furthermore, in the experiments, we run each method 10 times and report the average results for a fair comparison.
To evaluate the performance of the method of the present invention, we selected six baselines for comparison, including conventional algorithms and the most advanced methods. We performed experiments at different data densities on the two data sets, both with and without outliers.
As shown in Table 1, LNSL outperforms the comparison methods in all cases. Among all comparison methods, CMF is the latest outlier-optimization-based method and LDCF is a location-aware deep learning method. CMF performs better than the other comparison methods on the data set without outliers; on the data set with outliers, LDCF is more accurate than the other comparison methods because it adds distance information to the deep learning model. On the RT data set with outliers, LNSL improves the MAE by 22.82% and the RMSE by 5.34% on average compared with these two baseline methods. On the RT data set without outliers, LNSL improves the MAE by 50.62% and the RMSE by 22.43% on average.
Table 1 shows a comparison of the predicted results of the method of the present invention and the prior art
The experimental results show that the method achieves the lowest MAE and RMSE in all cases, which means the proposed method performs better than the other baselines. The accuracy advantage of the method is especially significant when the known information in the dataset contains no outliers.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered by the scope of the claims of the present invention.

Claims (7)

1. A network data space design and application method oriented to service quality prediction, characterized by comprising the following steps:
S1: constructing a training set, wherein each training sample in the training set comprises user characteristics, service characteristics, interaction characteristics and a real QoS value, defined as follows:
The user characteristic U is denoted as U = (u_id, u_lon, u_lat), where u_id represents the user ID, u_lon represents the longitude of the user, and u_lat represents the latitude of the user;
The service characteristic S is denoted as S = (s_id, s_lon, s_lat), where s_id represents the service ID, s_lon represents the longitude of the service, and s_lat represents the latitude of the service;
The interaction characteristic C is denoted as C = (u_lon, u_lat, s_lon, s_lat), consisting of the longitude and latitude of the user and the longitude and latitude of the service;
S2: constructing a prediction neural network, wherein the prediction neural network comprises prior networks, a feature sampling layer, an object-aware convolution layer, a known-information-aware convolution layer, a full-information-aware convolution layer and two fully connected layers arranged in sequence;
S21: the detailed network structure of the prior network is as follows:
E_i = Squeeze_[1](Mean_[1,2](f^[2](X_i))) (1)
where E_i represents the output of the i-th prior network, f^[2] represents a one-dimensional convolution, Mean_[1,2] computes the average of the tensor along the given axes, [1,2] denoting the first and second dimensions of the tensor, Squeeze_[1] deletes the first dimension, and X_i represents the input combined feature of the i-th prior network;
The prior network is used to estimate the prior probability of a given combined feature; the prior probability distribution is modeled as an axis-aligned Gaussian distribution parameterized by the expectation μ and the variance σ;
The parameters μi and σi are calculated through the prior network, and the calculation formulas are as follows:
μi=fμ(Ei;Wμ) (2)
σi=fσ(Ei;Wσ) (3)
Where μ i represents the expectation of the ith a priori network, σ i represents the variance of the ith a priori network, f μ represents the process of calculating the expectation by E i, W μ represents the weight of μ, f σ represents the process of calculating the variance by E i, and W σ represents the weight of σ;
Inputting the user characteristic U of a training sample into the prior network with i=1 yields the user potential network space prior distribution P1 with parameters μ1 and σ1; inputting U of the training sample together with the QoS value corresponding to the training sample into the prior network with i=2 yields the user potential network space posterior distribution Pr1 with parameters μ2 and σ2;
Inputting the service characteristic S of the training sample into the prior network with i=3 yields the service potential network space prior distribution P2 with parameters μ3 and σ3; inputting S of the training sample together with the QoS value corresponding to the training sample into the prior network with i=4 yields the service potential network space posterior distribution Pr2 with parameters μ4 and σ4;
Inputting the interaction characteristic C of the training sample into the prior network with i=5 yields the interaction potential network space prior distribution P3 with parameters μ5 and σ5; inputting C of the training sample together with the QoS value corresponding to the training sample into the prior network with i=6 yields the interaction potential network space posterior distribution Pr3 with parameters μ6 and σ6;
S22: the feature sampling layer
The feature Z1 of the training sample is sampled from P1 as in equation (4):
Z1~P1(·|U)=Ν(μ1,diag(σ1)) (4)
the feature Z2 of the training sample is sampled from P2 as in equation (5):
Z2~P2(·|S)=Ν(μ2,diag(σ2)) (5)
The feature Z3 of the training sample is sampled from P3 as in equation (6):
Z3~P3(·|C)=Ν(μ3,diag(σ3)) (6)
the three samples are concatenated as the final characteristic representation of the training sample:
Z=U⊕Z1⊕S⊕Z2⊕C⊕Z3 (7)
S23: respectively inputting the final characteristic representation Z of the training sample into the object-aware convolution layer, the known-information-aware convolution layer and the full-information-aware convolution layer, and then passing sequentially through the two fully connected layers to obtain the predicted QoS value of the training sample;
S24: defining a loss function of the prediction neural network; when the value of the loss function no longer changes, training ends and the trained prediction neural network is obtained; otherwise, the parameters of the prediction neural network are updated and a training sample is input to continue training;
S3: and for one sample to be predicted, obtaining user characteristics, service characteristics and interaction characteristics of the sample to be predicted by adopting the method of S1, setting the real QoS value of the sample to be predicted as a random array, and then inputting the sample to be predicted into a trained prediction neural network to obtain the QoS predicted value corresponding to the sample to be predicted.
2. The quality of service prediction oriented network data space design and application method of claim 1, wherein: the acquired longitude of the user and the longitude of the service are each increased by 180, mapping the longitude range from [-180, 180] to [0, 360]; the acquired latitude of the user and the latitude of the service are each increased by 90, mapping the latitude range from [-90, 90] to [0, 180].
3. A network data space design and application method for quality of service prediction according to claim 1 or 2, characterized in that: the specific process of the object-aware convolution layer in S23 is as follows: the convolution kernel size is 6×1 with stride 8, and the object-aware convolution performs two convolution operations;
X1=f(Z;w1) (8)
where w1 is a one-dimensional convolution kernel of size 6×1 and X1 represents the output of the object-aware convolution layer.
4. The quality of service prediction oriented network data space design and application method of claim 3, wherein: the specific process of the known-information-aware convolution layer in S23 is as follows: the convolution kernel size is 3×1 with stride 11, and the known-information-aware convolution performs two convolution operations;
X2=f(Z;w2) (9)
where w2 is a one-dimensional convolution kernel of size 3×1, and X2 represents the output of the known-information-aware convolution layer.
5. The quality of service prediction oriented network data space design and application method of claim 4, wherein: the specific process of the full-information-aware convolution layer in S23 is as follows: the convolution kernel size is 20×1, and only one convolution operation is executed;
X3=f(Z;w3) (10)
where w3 is a one-dimensional convolution kernel of size 20×1, and X3 represents the output of the full-information-aware convolution layer.
6. The quality of service prediction oriented network data space design and application method of claim 5, wherein: the specific process of the two fully connected layers in S23 is as follows:
x=F(X1)⊕F(X2)⊕F(X3) (11)
ŷ=F_[16,8,1](x) (12)
where ⊕ represents the concatenation operation, F represents the flattening operation of the Keras Flatten layer, F_[16,8,1] represents the fully connected layers, [16,8,1] represents the numbers of neurons in those layers, and ŷ represents the QoS predicted value.
7. The quality of service prediction oriented network data space design and application method of claim 6, wherein: the loss function defined in S24 is as in equation (13):
Loss=Hloss+KLloss (13)
where H_loss is the Huber loss of formula (14) between the real QoS value and the predicted value: y represents the real QoS value of the training sample, ŷ represents the QoS value predicted by the prediction neural network for the training sample, and δ and β are constant terms;
KLloss=Dkl(Pr1||P1)+Dkl(Pr2||P2)+Dkl(Pr3||P3) (15)
where the three KL divergence terms D_kl(Pr_i||P_i), i=1,2,3, measure the distance between each posterior distribution Pr_i and its corresponding prior distribution P_i.
CN202210264626.8A 2022-03-17 2022-03-17 Network data space design and application method oriented to service quality prediction Active CN114757391B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210264626.8A CN114757391B (en) 2022-03-17 2022-03-17 Network data space design and application method oriented to service quality prediction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210264626.8A CN114757391B (en) 2022-03-17 2022-03-17 Network data space design and application method oriented to service quality prediction

Publications (2)

Publication Number Publication Date
CN114757391A CN114757391A (en) 2022-07-15
CN114757391B true CN114757391B (en) 2024-05-03

Family

ID=82326748

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210264626.8A Active CN114757391B (en) 2022-03-17 2022-03-17 Network data space design and application method oriented to service quality prediction

Country Status (1)

Country Link
CN (1) CN114757391B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117274616B (en) * 2023-09-26 2024-03-29 南京信息工程大学 Multi-feature fusion deep learning service QoS prediction system and prediction method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111368888A (en) * 2020-02-25 2020-07-03 重庆邮电大学 Service function chain fault diagnosis method based on deep dynamic Bayesian network
CN111541570A (en) * 2020-04-22 2020-08-14 北京交通大学 Cloud service QoS prediction method based on multi-source feature learning
WO2021186158A1 (en) * 2020-03-17 2021-09-23 The University Court Of The University Of Edinburgh A distributed network traffic data decomposition method
CN113743675A (en) * 2021-09-13 2021-12-03 南京信息工程大学 Cloud service QoS deep learning prediction model
CN113810954A (en) * 2021-09-08 2021-12-17 国网宁夏电力有限公司信息通信公司 Virtual resource dynamic expansion and contraction method based on flow prediction and deep reinforcement learning
CN114117945A (en) * 2022-01-26 2022-03-01 南京信息工程大学 Deep learning cloud service QoS prediction method based on user-service interaction graph

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111368888A (en) * 2020-02-25 2020-07-03 重庆邮电大学 Service function chain fault diagnosis method based on deep dynamic Bayesian network
WO2021186158A1 (en) * 2020-03-17 2021-09-23 The University Court Of The University Of Edinburgh A distributed network traffic data decomposition method
CN111541570A (en) * 2020-04-22 2020-08-14 北京交通大学 Cloud service QoS prediction method based on multi-source feature learning
CN113810954A (en) * 2021-09-08 2021-12-17 国网宁夏电力有限公司信息通信公司 Virtual resource dynamic expansion and contraction method based on flow prediction and deep reinforcement learning
CN113743675A (en) * 2021-09-13 2021-12-03 南京信息工程大学 Cloud service QoS deep learning prediction model
CN114117945A (en) * 2022-01-26 2022-03-01 南京信息工程大学 Deep learning cloud service QoS prediction method based on user-service interaction graph

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HSA-Net: Hidden-State-Aware Networks for High-Precision QoS Prediction;Ziliang Wang 等;《IEEE Transactions on parallel and distributed systems》;20210910;第33卷(第6期);第1421-1435页 *
Adaptive wireless network selection optimization mechanism based on reinforcement learning and QoE demand functions; Gao Jixun; Ma Xiaoyu; Computer Applications and Software; 2015-07-15 (No. 07); pp. 143-147 *
Quality assessment of network packet-loss video based on changes in visual attention; Feng Xin et al.; Acta Automatica Sinica; 2011-11-15; Vol. 37 (No. 11); pp. 1322-1331 *

Also Published As

Publication number Publication date
CN114757391A (en) 2022-07-15

Similar Documents

Publication Publication Date Title
CN110462604B (en) Data processing system and method based on device use associated internet device
Yang et al. A location-based factorization machine model for web service QoS prediction
Zhang et al. Temporal QoS-aware web service recommendation via non-negative tensor factorization
CN109062962B (en) Weather information fused gated cyclic neural network interest point recommendation method
CN111797321B (en) Personalized knowledge recommendation method and system for different scenes
CN109471982B (en) Web service recommendation method based on QoS (quality of service) perception of user and service clustering
CN113743675B (en) Construction method and system of cloud service QoS deep learning prediction model
Tang et al. Reputation-aware data fusion and malicious participant detection in mobile crowdsensing
CN117061322A (en) Internet of things flow pool management method and system
CN114757391B (en) Network data space design and application method oriented to service quality prediction
CN109657725B (en) Service quality prediction method and system based on complex space-time context awareness
CN114221991B (en) Session recommendation feedback processing method based on big data and deep learning service system
CN114445121A (en) Advertisement click rate prediction model construction and advertisement click rate prediction method
Yang et al. Gated graph convolutional network based on spatio-temporal semi-variogram for link prediction in dynamic complex network
CN117271899A (en) Interest point recommendation method based on space-time perception
CN116805039A (en) Feature screening method, device, computer equipment and data disturbance method
CN117194742A (en) Industrial software component recommendation method and system
CN112992367B (en) Smart medical interaction method based on big data and smart medical cloud computing system
CN114510627A (en) Object pushing method and device, electronic equipment and storage medium
Shen et al. Long-term multivariate time series forecasting in data centers based on multi-factor separation evolutionary spatial–temporal graph neural networks
Zhao et al. Sparse Bayesian Tensor Completion for Data Recovery in Intelligent IoT Systems
CN116050640B (en) Short-time passenger flow prediction method of multi-mode traffic system based on self-adaptive multi-graph convolution
CN117274616B (en) Multi-feature fusion deep learning service QoS prediction system and prediction method
CN114219516B (en) Information flow session recommendation method based on big data and deep learning service system
Wu et al. Enhancing Recommendation Capabilities Using Multi-Head Attention-Based Federated Knowledge Distillation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant