CN107679415A - Model-based collaborative filtering method for privacy-preserving collaborative Web service quality prediction - Google Patents

Model-based collaborative filtering method for privacy-preserving collaborative Web service quality prediction

Info

Publication number
CN107679415A
CN107679415A
Authority
CN
China
Prior art keywords
qos
user
service
value
privacy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710875789.9A
Other languages
Chinese (zh)
Inventor
毛睿
李荣华
陆敏华
王毅
罗秋明
商烁
刘刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University filed Critical Shenzhen University
Priority to CN201710875789.9A priority Critical patent/CN107679415A/en
Priority to PCT/CN2017/113485 priority patent/WO2019056572A1/en
Publication of CN107679415A publication Critical patent/CN107679415A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6263Protecting personal data, e.g. for financial or medical purposes during internet communication, e.g. revealing personal data from cookies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003Managing SLA; Interaction between SLA and QoS
    • H04L41/5009Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/04Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14Network analysis or design
    • H04L41/142Network analysis or design using statistical or mathematical methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a model-based collaborative filtering method for privacy-preserving collaborative Web service quality prediction, comprising the following steps: first, data collection: each user locally collects quality-of-service (QoS) values; second, data masquerading: the QoS values are disguised; third, model-based collaborative filtering is applied to the masqueraded QoS values; fourth, prediction: results are predicted from the collaboratively filtered QoS values. The invention introduces differential privacy into the collaborative Web service QoS prediction framework for the first time, allowing users to obtain maximum privacy protection while preserving the availability of the data. Experimental results show that the method provides secure and accurate collaborative Web service QoS prediction, and that the model-based collaborative filtering method has an advantage in capturing the latent structure of QoS data.

Description

Model-based collaborative filtering method for privacy protection collaborative Web service quality prediction
Technical Field
The invention belongs to the field of computers, relates to a collaborative filtering method for privacy protection, and in particular to a model-based collaborative filtering method for privacy-preserving collaborative Web service quality prediction.
Background
Quality of service (QoS) is widely used to describe the non-functional characteristics of Web services. QoS-based selection, composition, and recommendation of Web services have been widely discussed in recent papers. The premise of these methods is that accurate QoS values of the Web services are always available, but obtaining an accurate QoS value is not an easy task. On the one hand, the QoS values published by service providers or third-party communities are inaccurate for service users because they are affected by the uncertain Internet environment. On the other hand, it is impractical for a service user to directly evaluate the QoS of all available services due to time, cost and other resource constraints. To address this issue, a promising approach is personalized collaborative Web service QoS prediction. The basic idea is that users with similar characteristics tend to observe similar QoS values for the same service; therefore, when the QoS value a particular user would observe for a Web service needs to be predicted, the values observed by users with similar characteristics can be substituted.
In this way, different users are also typically given different QoS prediction values for the same service, and the final prediction value is actually dependent on their particular context. Based on these provided QoS values, various techniques have been employed to improve quality, particularly the accuracy of the predictions.
Collaborative Web services QoS prediction has become an important tool for generating accurate personalized QoS. Although much effort has been made in research to improve the accuracy of collaborative QoS predictions, there is insufficient effort to protect user privacy in this process. In fact, the observed QoS values may be sensitive information, and thus users may be reluctant to share them with others. For example, the observed response time fed back by a user is typically dependent on her location, indicating that the user's location can be inferred from the QoS information she provides. Therefore, one problem is whether the recommendation system can make an accurate personalized QoS prediction for the user on the premise of protecting the privacy of the user.
Homomorphic encryption, which allows computation on ciphertext, is a straightforward way to achieve privacy. However, all these operations not only require significant computational costs, but also require continuous communication between the parties, even without considering the difficulties of applying some complex computations into the cryptographic domain. Therefore, it is not feasible to deal with our problem by using homomorphic encryption.
Another technique, random perturbation, proposed by Polat et al., claims that accurate recommendations can still be obtained while randomness drawn from a specific distribution is added to the original data to prevent information leakage. However, the range of the randomness α is chosen empirically and there is no provable privacy guarantee. Moreover, for applications that cluster the perturbed data, an adversary can infer the user's private data with up to 70% accuracy.
Thus, while the privacy preserving method of random perturbation is insecure, it motivates us to design a lightweight and provable random perturbation. Specifically, a privacy protection QoS prediction model and a differential privacy model are developed for a user, the model can powerfully protect privacy data and has provable privacy guarantee, and the model is the most advanced privacy data state protection technology at present. Differential privacy has attracted extensive attention because it is intended to provide an effective method to minimize noise added to the original data.
Despite the widespread interest in differential privacy, its application to QoS prediction remains quite limited. Two notable works are McSherry and Mironov [reference 1: F. McSherry and I. Mironov. Differentially private recommender systems. KDD 2009] and Machanavajjhala et al. [reference 2]. Machanavajjhala et al. studied privacy protection for personalized social recommendations, which are based entirely on the user's social graph. With differential privacy, sensitive links in a social graph can be effectively protected, meaning that an attacker cannot infer the presence of a single link in the graph by passively observing the recommendations. However, quality recommendations can only be achieved with weak privacy parameters, or only for a small fraction of users. McSherry and Mironov [reference 1] applied differential privacy to collaborative filtering [R. M. Bell and Y. Koren. Scalable collaborative filtering with jointly derived neighborhood interpolation weights. ICDM 2007], which is a common solution for recommendation systems. They split the recommendation algorithm into two parts: a learning stage, executed under a differential privacy guarantee, and an individual recommendation stage, which uses the learning result for individual prediction. Unlike the work in references 1 and 2, which focuses on privacy guarantees for data distribution rather than knowledge learning, the present invention explores methods other than those studied in reference 1, such as a latent factor model.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a model-based collaborative filtering method for privacy protection collaborative Web service quality prediction. Differential privacy is introduced into a collaborative Web service QoS prediction framework for the first time, and a user can obtain the maximum privacy protection by ensuring the availability of data. Experimental results show that the method provides safe and accurate QoS prediction of the collaborative Web service, and the model-based collaborative filtering method has superiority in capturing the potential structure of QoS data.
In order to solve the technical problem, the invention provides a model-based collaborative filtering method for privacy protection collaborative Web service quality prediction, which comprises the following steps:
the first step, data collection: each user locally collects a quality of service value, i.e. a QoS value;
step two, data disguising: disguising a quality of service value;
thirdly, cooperatively filtering the service quality value after disguising based on the model;
fourthly, predicting the result: and predicting the result according to the service quality value after collaborative filtering.
As a preferred technical solution of the present invention, in the second step, the data masquerading is performed as follows:
Let r_ui denote the quality-of-service (QoS) value collected by user u for web service i, and let r_u denote the entire vector of QoS values evaluated by user u; similarly, I_ui and I_u denote the binary element and vector indicating whether a QoS value is present, and R_u denotes the disguised data. ε-differential privacy for each user u is achieved by the following equation:
R_ui = r_ui + Laplace(Δf/ε)
where ε is the privacy parameter controlling the privacy level, and Δf is defined as the maximum difference between QoS values, i.e.:
Δf = max(r_ui − r_uj)
where r_ui is the QoS value collected by user u for web service i and r_uj is the QoS value collected by user u for web service j.
The meaning of Laplace() is given as follows. If the probability density function of a random variable x is
f(x | μ, b) = (1/(2b)) · exp(−|x − μ|/b),
then x has a Laplace(μ, b) distribution, where μ and b are the location parameter and the scale parameter, respectively. Letting μ = 0, the distribution can be regarded as a symmetric exponential distribution with standard deviation √2·b. To add noise that obeys the Laplace distribution, let b = Δf/ε; the generated noise is denoted Laplace(Δf/ε).
As a preferred technical scheme of the invention, in the second step, the data masquerading is achieved by randomly perturbing the original data; the randomness should ensure that sensitive information, including the QoS values of each individual user, cannot be derived from the perturbed data; when the number of users is very large, the aggregated information of the users can still be estimated with high accuracy.
As a preferred solution of the invention, in the second step, the privacy parameter ε is given by each user; by using differential privacy, the random number added to the observed QoS value is the minimum value that maintains a reasonable accuracy for the specified privacy level.
As a preferred technical solution of the present invention, in the third step, the model-based collaborative filtering method is specifically as follows: matrix factorization (MF) is used. Assume a sparse matrix Q_{n×m} represents the observed QoS values of n users and m services, where each element q_ij reflects the QoS value of user i using service j. Using the input matrix Q_{n×m}, MF aims to factorize Q_{n×m} into two matrices of lower dimension d: a user factor matrix U_{n×d} and a service factor matrix V_{m×d}. Then, the null elements in Q_{n×m} can be approximated by the product of U and V, i.e., the unknown QoS value q'_ij is estimated by q'_ij = U_i · V_j^T.
MF is usually converted into an optimization problem, and a locally optimal solution is obtained by iteration. The objective function (or loss function) of MF is defined as:
L = Σ_{I_ij=1} (q_ij − U_i · V_j^T)^2 + λ (||U_i||^2 + ||V_j||^2)
The first part, Σ (q_ij − U_i · V_j^T)^2, is the squared difference between the existing QoS matrix and the prediction matrix, computed only over the elements that have been evaluated by users; the second part, λ (||U_i||^2 + ||V_j||^2), is a regularization term added to handle overfitting caused by input sparsity. By solving this optimization, the user factor matrix U_{n×d} and the service factor matrix V_{m×d} are finally obtained. The problem is solved using stochastic gradient descent (SGD), whose iterative equations are as follows:
e_ij = q_ij − U_i · V_j^T
U_i ← U_i + γ (e_ij · V_j − λ' · U_i)
V_j ← V_j + γ (e_ij · U_i − λ' · V_j)
where γ is the learning rate and λ' is the regularization coefficient. The choice of these two parameters significantly affects the result: when γ is large, the iteration diverges rather than converges. To obtain convergence, γ is empirically set to 0.001 and λ' is empirically set to 0.01, although longer training time is required. The iteration terminates when the value of the objective function falls below a certain threshold.
As a preferred technical solution of the present invention, in the fourth step, the prediction is performed as follows: after the QoS value of a certain service is obtained through collaborative filtering, the QoS values of other users for the same service are retrieved and the user with the closest value is selected; the two users are considered to have similar interests, a similar recommendation is made based on that user, and the corresponding value of that similar user is adopted as the prediction result for the current user.
Compared with the prior art, the invention has the following beneficial effects. The invention discloses a model-based collaborative filtering method for privacy-preserving collaborative Web service quality prediction, which provides a privacy-preserving collaborative QoS prediction framework that protects the private data of users while retaining the ability to generate accurate QoS predictions. The invention introduces differential privacy, a rigorous and provable privacy-protection technique, as preprocessing for QoS data prediction. The proposed method is implemented on the basis of a general technique known as the Laplace mechanism, and extensive experiments are performed to study its performance on a real dataset. The privacy-accuracy trade-off is evaluated under different conditions, and the results show that, under some constraints, the method can achieve better performance than the baseline. The invention mainly has the following advantages:
1. for the method proposed by the present invention, the privacy-preserving algorithm can be parameterized and used to match the prediction to its non-private analogues. Although there are some specialized analytical requirements, the method itself is relatively straightforward and readily available.
2. By integrating privacy protection into the application, users retain unrestricted access to their own original data, while what is released at the output is far less than the entire data set and meets the privacy criteria.
3. The present invention tests the method with a real dataset. The result shows that the prediction accuracy of the disguised data of the invention is very close to that of the private data of the user.
4. Experimental results prove that the model-based collaborative filtering method for privacy protection collaborative Web service quality prediction has superiority in capturing the potential structure of QoS data.
Drawings
The invention is further illustrated with reference to the following figures and examples.
FIG. 1 is a schematic flow diagram of a model-based collaborative filtering method for privacy-preserving collaborative Web quality of service prediction according to the present invention.
Fig. 2 is a schematic diagram of a privacy preserving collaborative QoS prediction model.
FIG. 3 is a schematic diagram illustrating the comparison of privacy and accuracy between the QoS prediction based on differential privacy and the original method under different privacy in the experiment of the present invention; fig. 3 (a) represents the response time, and fig. 3 (b) represents the total time.
FIG. 4 is a schematic diagram of the comparison of the impact of service between differential privacy based QoS prediction and the original method under different privacy in the experiment of the present invention; fig. 4 (a) represents the response time, and fig. 4 (b) represents the total time.
FIG. 5 is a schematic diagram showing a comparison of user impact between differential privacy based QoS prediction and original methods under different privacy in the experiment of the present invention; fig. 5 (a) represents the response time, and fig. 5 (b) represents the total time.
FIG. 6 is a graph showing the results of the comparison of the precision at different densities between the QoS prediction based on differential privacy and the original method under different privacy conditions in the experiment of the present invention; fig. 6 (a) represents the response time, and fig. 6 (b) represents the total time.
Detailed Description
The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic views illustrating only the basic structure of the present invention in a schematic manner, and thus show only the constitution related to the present invention.
1. System model and problem definition
1. Differential privacy
It is necessary to distinguish between differential privacy and conventional cryptographic systems. Differential privacy gives a strictly quantitative definition of privacy leakage under a very strict attack model, and demonstrates that: based on the idea of differential privacy, users can maximally obtain privacy protection and ensure the usability of data. The method has the following advantages: although the data is distorted, the noise required for the disturbance is independent of the data size. We can achieve a high level of privacy protection by adding a very small amount of noise. Although many privacy preserving methods have been proposed, such as k-anonymity and l-diversity, differential privacy is still considered to be the most stringent and robust privacy preserving model on its solid mathematical basis.
2.1 Security definition under differential privacy
There are two prerequisites for differential privacy. One is that the output of any computation (e.g., SUM) should not be affected by operations like inserting or deleting records. Another is that it gives a strictly quantitative definition of the privacy disclosure under a very strict attack model: an attacker cannot distinguish records with a probability greater than epsilon even if she knows the entire data set except the target. The formula is defined as follows:
definition 1: (epsilon-differential privacy) if D1 and D2 differ by at most one element for all datasets and all S e Range (K), then the random function K gives epsilon-differential privacy,
d is a database of rows, D1 is a subset of D2, and the larger data set D2 contains exactly one additional row. The probability space Pr [ ] in any case is on the coin flip of K. Privacy parameters ε >0 are public, with smaller ε yielding stronger privacy guarantees.
Since differential privacy is defined probabilistically, any method to achieve it must be randomized. Some of these methods rely on the addition of controlled noise, such as the Laplace mechanism [C. Dwork, F. McSherry, K. Nissim and A. Smith. Calibrating noise to sensitivity in private data analysis. TCC 2006: 265-284]. Others, such as the exponential mechanism and posterior sampling, sample from a problem-dependent distribution. We explain the construction in detail in the following section.
2.2 Laplace mechanism and global sensitivity
In addition to the definition of differential privacy, Dwork et al. [C. Dwork, F. McSherry, K. Nissim and A. Smith. Calibrating noise to sensitivity in private data analysis. TCC 2006: 265-284] showed that differential privacy can be achieved by adding random noise that obeys a Laplace distribution. If the probability density function of a random variable is
f(x | μ, b) = (1/(2b)) · exp(−|x − μ|/b),
the random variable has a Laplace(μ, b) distribution, where μ and b are the location parameter and the scale parameter, respectively. For simplicity, we assume μ = 0, so the distribution can be regarded as a symmetric exponential distribution with standard deviation √2·b.
To add noise that obeys the Laplace distribution, let b = Δf/ε; the generated noise is denoted
Laplace(Δf/ε)
Here, Δf is the global sensitivity, defined below, and ε is the privacy parameter controlling the privacy level. As the equation shows, the added noise is proportional to Δf and inversely proportional to ε.
Definition 2 (global sensitivity): For f: D → R^d, the L_k-sensitivity of f is defined as
Δf = max_{D1,D2} ||f(D1) − f(D2)||_k
for all D1, D2 differing in at most one element, where ||·||_k denotes the L_k norm.
3.1 System model
[S. Zhang, J. Ford and F. Makedon. Deriving Private Information from Randomly Perturbed Ratings. SDM 2006] has shown that random perturbation is unsafe because private data can be inferred by clustering techniques, but the system model proposed in [J. Zhu, P. He, Z. Zheng and M. R. Lyu. A Privacy-Preserving QoS Prediction Framework for Web Service Recommendation. ICWS 2015] is mature and applicable to many scenarios; therefore, this model is applied here. As shown in Fig. 2, each user (USER1, USER2, ..., etc.) invokes services, collects QoS values locally, and masquerades the QoS values she observes, and then sends all the masqueraded QoS values to the server. The QoS values can then be uploaded securely because the server cannot derive any personally sensitive information from the masqueraded data. However, the data masquerading scheme should still allow the server to perform collaborative filtering (neighborhood-based or model-based) on the masqueraded data. Based on the predicted QoS values (QoS prediction), the server may run various applications, such as QoS-based selection, composition, and recommendation.
Data masquerading is a key component of privacy protection collaborative Web service QoS prediction. The basic idea of data masquerading is to randomly disturb the original data among these attributes:
a) The randomness should ensure that sensitive information cannot be derived from the perturbed data (e.g., the QoS values of each individual user);
b) Although personal information is limited, when the number of users is very large, the aggregate information of the users can be evaluated with high accuracy.
This property is useful for calculations based on aggregated information. Without knowing the exact values of the individual data items, we can still get meaningful results, since the required aggregated information can be estimated from the perturbed data.
Another important point of our approach is the trade-off between accuracy and privacy. The more randomness is added, the larger the gap between the masqueraded data and the original data, which provides a higher level of privacy protection. Conversely, the less randomness, the more apparent the data characteristics; for context-based computations this means the results are more accurate. Balancing accuracy and privacy is an open question. In the present invention, privacy is parameterized as ε and is given by each user. By using differential privacy, the random number added to the observed QoS value is the minimum value that maintains a reasonable accuracy for the specified privacy level.
2. The invention relates to a collaborative filtering method based on a model for privacy protection collaborative Web service quality prediction
As shown in fig. 1, the model-based collaborative filtering method for privacy protection collaborative Web service quality prediction according to the present invention includes the following steps:
first step, data collection: each user locally collects a quality of service value, i.e. a QoS value;
step two, data disguising: disguising a quality of service value;
thirdly, cooperatively filtering the service quality value after disguising based on the model;
fourthly, predicting the result: and predicting the result according to the service quality value after collaborative filtering.
The data disguising in the second step adopts the following method:
Let r_ui denote the quality-of-service (QoS) value collected by user u for web service i, and let r_u denote the entire vector of QoS values evaluated by user u; similarly, I_ui and I_u denote the binary element and vector indicating whether a QoS value is present, and R_u denotes the disguised data. ε-differential privacy for each user u is achieved by the following equation:
R_ui = r_ui + Laplace(Δf/ε)
where ε is the privacy parameter controlling the privacy level, and Δf is defined as the maximum difference between QoS values, i.e.:
Δf = max(r_ui − r_uj)
where r_ui is the QoS value collected by user u for web service i and r_uj is the QoS value collected by user u for web service j.
The meaning of Laplace() is given as follows. If the probability density function of a random variable x is
f(x | μ, b) = (1/(2b)) · exp(−|x − μ|/b),
then x has a Laplace(μ, b) distribution, where μ and b are the location parameter and the scale parameter, respectively. Letting μ = 0, the distribution can be regarded as a symmetric exponential distribution with standard deviation √2·b. To add noise that obeys the Laplace distribution, let b = Δf/ε; the generated noise is denoted Laplace(Δf/ε).
The privacy parameter ε is given by each user; by using differential privacy, the random number added to the observed QoS value is the minimum value that maintains a reasonable accuracy for the specified privacy level.
The data masquerading is achieved by randomly perturbing the original data. The randomness should ensure that sensitive information, including the QoS values of each individual user, cannot be derived from the perturbed data; when the number of users is very large, the aggregated information of the users can still be estimated with high accuracy.
Differential privacy based on data masquerading:
we use r ui To represent the QoS value, r, collected by user u for web service i u Represents the entire vector of QoS values evaluated by user u, and similarly, I ui And I u Respectively, a binary element and a vector indicating whether a QoS value exists. c. C u =|I u Is the number of QoS values evaluated by user u. In our discussion, differential privacy is a key technology for data masquerading. Laplace mechanism [ c.dwork, f.mcsherry, k.nissm and a.smith.calibration noise to sensitivity in private data analysis. Tcc 2006]Epsilon-differential privacy is obtained by increasing the noise of the laplacian distribution.
Definition 3: (Laplace mechanism [ C.Dwork.Diemential privacy. Encyclopedia of Cryptography and Security.2011:338-340.]) Giving a function: g = D → R d The following calculation maintains epsilon-differential privacy
X=g(x)+Laplace(Δf/ε)
Where epsilon is a privacy parameter for exploiting privacy, and a smaller epsilon provides a stronger privacy guarantee. Δ f is the de-global sensitivity. Here, we use L 1 Norm calculation Δ f:
for simplicity, ε -differential privacy for each user u is achieved by the following equation:
R ui =r ui +Laplace(Δf/ε)
where Δ f is defined as the maximum difference between the QoS values, i.e.:
Δf=max(r ui -r uj )
after masquerading, all users send the masquerading QoS value R to the server u Random retention of the original data r ui Of the sensitive information. However, we can still estimate the aggregate information of the users. Thus, R can be directly accessed independently ui To perform QoS prediction.
The third step is the model-based collaborative filtering. Collaborative filtering (CF) is a well-established technique employed by most modern recommendation systems. In QoS prediction for collaborative Web services, each user provides the observed QoS values of the services she has used to the recommendation system. Based on the collected QoS values, the recommendation system can predict the QoS of all available services for the user through suitable algorithms. The more QoS values a user provides, the higher the prediction accuracy. In the invention, model-based collaborative filtering is adopted; the specific method is as follows:
matrix Factorization (MF) [ Z.Zheng, H.Ma, M.R.Lyu and I.King.QoS-aware web service communication by collaborative filtering. TSC 2011,4 (2): 140-152] is a typical solution for model-based collaborative filtering, and the accuracy of prediction can be effectively improved by researching potential factors of a model.
Assume a sparse matrix Q_{n×m} represents the observed QoS values of n users and m services, where each element q_ij reflects the QoS value of user i using service j. Using the input matrix Q_{n×m}, MF aims to factorize Q_{n×m} into two matrices of lower dimension d: a user factor matrix U_{n×d} and a service factor matrix V_{m×d}. Then, the null elements in Q_{n×m} can be approximated by the product of U and V, i.e., the unknown QoS value q'_ij is estimated by q'_ij = U_i · V_j^T.
MF is usually converted into an optimization problem, and a locally optimal solution is obtained by iteration. The objective function (or loss function) of MF is defined as:
L = Σ_{I_ij=1} (q_ij − U_i · V_j^T)^2 + λ (||U_i||^2 + ||V_j||^2)
The first part, Σ (q_ij − U_i · V_j^T)^2, is the squared difference between the existing QoS matrix and the prediction matrix, computed only over the elements that have been evaluated by users. The second part, λ (||U_i||^2 + ||V_j||^2), is a regularization term added to handle overfitting caused by input sparsity. By solving this optimization, we finally obtain the user factor matrix U_{n×d} and the service factor matrix V_{m×d}.
Alternating Least Squares (ALS) and Stochastic Gradient Descent (SGD) are two common methods for solving this optimization problem. Since ALS is more involved and requires computing matrix inverses, we use SGD. The iterative equations of SGD are as follows:
e_ij = q_ij − U_i · V_j^T
U_i ← U_i + γ (e_ij · V_j − λ' · U_i)
V_j ← V_j + γ (e_ij · U_i − λ' · V_j)
where γ is the learning rate and λ' is the regularization coefficient. The choice of these two parameters significantly affects the result: when γ is large, the result diverges rather than converges. To achieve convergence, we empirically set γ to 0.001, although longer training time is required; likewise, λ' is empirically set to 0.01.
In the first iteration, U and V are set randomly, but an appropriate initialization makes the subsequent computation converge faster. Therefore, we initialize U and V around the average of all observed QoS values. The iteration terminates when the value of the objective function falls below a certain threshold.
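A compact sketch of the model-based collaborative filtering step (matrix factorization trained with SGD) as just described; the helper name, the stopping rule on the change of the objective, and the small random jitter in the initialization are illustrative assumptions:

```python
import numpy as np

def mf_sgd(Q, d=20, gamma=0.001, lam=0.01, max_iter=200, tol=1e-4):
    """Factorize a sparse QoS matrix Q (np.nan = missing) into U (n x d) and V (m x d)."""
    n, m = Q.shape
    mask = ~np.isnan(Q)
    mean = np.nanmean(Q)
    # Initialize U and V around the average of the observed QoS values,
    # so that U[i] @ V[j] is roughly `mean` at the start.
    init = np.sqrt(mean / d)
    U = init + 0.01 * np.random.randn(n, d)
    V = init + 0.01 * np.random.randn(m, d)
    rows, cols = np.where(mask)
    prev_loss = np.inf
    for _ in range(max_iter):
        for i, j in zip(rows, cols):
            e_ij = Q[i, j] - U[i] @ V[j]                # prediction error for one entry
            U[i] += gamma * (e_ij * V[j] - lam * U[i])  # SGD update of user factors
            V[j] += gamma * (e_ij * U[i] - lam * V[j])  # SGD update of service factors
        pred = U @ V.T
        loss = np.sum((Q[mask] - pred[mask]) ** 2) + lam * (np.sum(U ** 2) + np.sum(V ** 2))
        if abs(prev_loss - loss) < tol:                 # simple stopping rule
            break
        prev_loss = loss
    return U, V

# Unknown QoS values are then estimated as q'_ij ~ U[i] @ V[j].
```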
In the fourth step, after the QoS value of a certain service is obtained through collaborative filtering, the QoS values of other users for the same service are retrieved and the user with the closest value is selected, indicating that the two users have similar interests; a similar recommendation is made based on that user, and the corresponding value of that similar user is adopted as the prediction result for the current user.
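An illustrative reading of this prediction step in code (the function and argument names are assumptions; P is the filled-in QoS matrix U·Vᵀ produced by the previous step):

```python
import numpy as np

def predict_from_closest_user(P, u, ref_service, target_service):
    """Predict user u's QoS for target_service via the user closest to u on ref_service."""
    diffs = np.abs(P[:, ref_service] - P[u, ref_service])
    diffs[u] = np.inf                   # exclude user u herself from the search
    closest = int(np.argmin(diffs))     # the user with the most similar QoS value
    return P[closest, target_service]   # adopt that user's value as the prediction
```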
3. Experiment of
In this section, we performed three series of experiments on real datasets to evaluate our privacy preserving QoS prediction framework. The first series of experiments investigated the balance between privacy and accuracy when using the proposed method. Two additional series of experiments investigated some important data features, including the effect of size and density on the performance of our method.
TABLE 1 data set statistics
3.1 Experimental configuration
We first note that [Z. Zheng, Y. Zhang and M. R. Lyu. Investigating QoS of Real-World Web Services. TSC 2014, 7(1): 32-39; Z. Zheng, Y. Zhang and M. R. Lyu. Distributed QoS Evaluation for Real-World Web Services. ICWS 2010: 83-90] introduces a real Web service QoS dataset comprising the QoS values of 5,825 real Web services observed by 339 users. This dataset is very useful for studying the accuracy of QoS prediction. From the dataset, we focus on two representative QoS attributes: response time (RT) and total time (TP). Table 1 describes the statistics of the dataset; AVE and STD are the mean and standard deviation, respectively, and density refers to the ratio of observed data to all data. More details of the dataset may be found in the above references.
We use cross-validation to train and evaluate QoS predictions. The data set is relatively complete, but in practice, due to limited time and resources, users usually call only a few services, and the data density is generally below 10%. To simulate this sparsity in our experiments, we randomly removed entries from the complete data set, leaving only a smaller density of historical QoS values as our training set. The deleted data is used as a test set for accuracy evaluation.
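A small sketch of this sparsity simulation, assuming the complete data is held in a NumPy matrix; the helper name and the fixed seed are assumptions for reproducibility:

```python
import numpy as np

def split_by_density(Q_full, density, seed=0):
    """Keep a `density` fraction of entries as the training matrix (np.nan elsewhere);
    the removed entries form the test set used for accuracy evaluation."""
    rng = np.random.default_rng(seed)
    keep = rng.random(Q_full.shape) < density   # True = entry stays in the training set
    Q_train = np.where(keep, Q_full, np.nan)
    test_idx = np.where(~keep)                  # positions of the held-out entries
    return Q_train, test_idx, Q_full[test_idx]
```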
Then, we run the QoS prediction algorithms on the training set and predict the test set. We implemented and evaluated four algorithms. UIPCC, proposed in [Z. Zheng, H. Ma, M. R. Lyu and I. King. WSRec: A Collaborative Filtering Based Web Service Recommender System. ICWS 2009], is a representative implementation of neighborhood-based collaborative filtering, and MF, introduced in [Z. Zheng, H. Ma, M. R. Lyu and I. King. QoS-aware web service recommendation by collaborative filtering. TSC 2011, 4(2): 140-152], is an implementation of model-based collaborative filtering. LUIPCC and LMF are the two corresponding privacy-integrated approaches implemented with the Laplace mechanism.
To quantify the accuracy of QoS prediction, we take the Root Mean Square Error (RMSE), a widely used metric in related work (e.g. [A. Berlioz, A. Friedman, M. A. Kaafar, R. Boreli and S. Berkovsky. Applying Differential Privacy to Matrix Factorization. RecSys 2015]):
RMSE = sqrt( (1/|R|) · Σ_{q_ui ∈ R} (q'_ui − q_ui)^2 )
where R consists of all values in the test set that need to be predicted, and |R| is the number of elements in R; q'_ui is the predicted value and q_ui is the corresponding value in the test set. In general, a smaller RMSE indicates a better prediction result.
Note that the default parameter settings are shown in Table 2. We empirically select the parameters of UIPCC and MF. By default, ε is set to 0.5, which provides sufficient privacy.
TABLE 2 Parameter settings
UIPCC: k = 20, λ = 0.1
MF: d = 20, γ = 0.001, λ' = 0.01
Laplace: ε = 0.5
3.2 privacy and accuracy
Fig. 3 compares our differential-privacy-based QoS prediction with the original methods under different privacy levels, for RT and TP respectively. By introducing differential privacy into QoS prediction, a user can obtain privacy protection, but users adopting our approach do need to consider the balance between privacy and accuracy. On the one hand, a user can obtain stronger privacy protection by adding more Laplace noise, which inevitably reduces the utility of the data. At the other extreme, a user can achieve 100% accuracy without adding any Laplace noise. To study how the accuracy varies, we ran the QoS prediction algorithms on the training set and predicted the test set. The privacy parameter ε is incremented in steps of 0.5 over the range 0.5 to 4. We can observe that the RMSE of both LUIPCC and LMF drops as ε increases: a larger ε implies a looser privacy constraint, the utility of the data is less limited, and the user therefore obtains better accuracy. It is also worth noting that when ε becomes large in Fig. 3 (e.g., greater than 2.0), our privacy-preserving methods LUIPCC and LMF achieve almost the same or even higher accuracy than UIPCC. In particular, when ε is greater than 4, the prediction accuracy of LMF is better than that of UIPCC. Furthermore, we also find that MF is superior to UIPCC, which demonstrates the superiority of the model-based approach in capturing the underlying structure of the QoS data. Another fact worth noting is that although a recent work [J. Zhu, P. He, Z. Zheng and M. R. Lyu. A Privacy-Preserving QoS Prediction Framework for Web Service Recommendation. ICWS 2015] claims better performance than the original algorithms (UIPCC and MF), the randomness it adds to prevent information leakage is not large enough, and an adversary can accurately infer the user's private data by applying clustering [S. Zhang, J. Ford and F. Makedon. Deriving Private Information from Randomly Perturbed Ratings. SDM 2006].
In summary, our differential-privacy-based algorithms can provide privacy-preserving QoS prediction with parameterized privacy. The results show that, under loose privacy constraints, predictions made from the masqueraded data are very close to those made from the users' private data.
3.3 Impact of data size
To evaluate the impact of data size, we designed experiments that vary the number of services and the number of users, respectively. In Fig. 4, the number of users is fixed at 339 and the number of services varies from 1000 to 5000 in steps of 1000, where the services are randomly selected from the original dataset. The other parameter settings of the experiment are shown in Table 2. We performed the same experimental setup in Fig. 5, where all 5825 services are included and the number of users varies.
It is clear that both the number of services and the number of users have a positive impact on the accuracy of the algorithm, which means that the more data is given, the better the prediction is. In other words, with more data we can provide better accuracy.
Another finding is that the trends of the original algorithms and of our differential-privacy-based algorithms are the same, e.g., the trend of UIPCC matches that of LUIPCC and the trend of MF matches that of LMF, although the accuracy differs considerably across data sizes. This means that the noise required for data masquerading is independent of the data size, so the user can obtain a high level of privacy protection by adding a very small amount of noise.
3.4 Effect of Density
In addition to data size, the density, denoted θ, is also a major factor in algorithm performance. Figure 6 shows the accuracy comparison at different densities. Although the effect of density on the original algorithms is not significant, it does have a significant effect on our differential-privacy-based algorithms: higher-density datasets perform better. This result means that density is also a key factor determining the performance of the differential privacy method. More importantly, as the density becomes larger, the gap between the traditional approach and our differential-privacy-based approach becomes smaller and smaller. More specifically, when the density is set to 5% in Fig. 6, the gap between LUIPCC and UIPCC is 5; as the density increases to 30%, the gap between LUIPCC and UIPCC decreases to 1. Users are therefore advised to use a higher-density dataset to make the prediction closer to the original result.
5. Conclusion
The invention introduces differential privacy into a collaborative Web service QoS prediction framework for the first time. Differential privacy gives a strict quantitative definition of privacy leakage under very strict constraints. Based on the idea of differential privacy, the user can obtain maximum privacy protection by ensuring the availability of data. Experimental results show that the system and the method provide safe and accurate QoS prediction of the cooperative Web service.
In light of the foregoing description of the preferred embodiment of the present invention, many modifications and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The technical scope of the present invention is not limited to the content of the specification, and must be determined according to the scope of the claims.

Claims (6)

1. A collaborative filtering method based on a model for privacy protection collaborative Web service quality prediction is characterized by comprising the following steps:
the first step, data collection: each user locally collects quality of service values, i.e. QoS values;
step two, data disguising: disguising a quality of service value;
thirdly, cooperatively filtering the service quality value after disguising based on the model;
fourthly, predicting the result: and predicting the result according to the service quality value after collaborative filtering.
2. The method of claim 1, wherein in the second step, the data masquerading is performed as follows:
Let r_ui denote the quality-of-service (QoS) value collected by user u for web service i, and let r_u denote the entire vector of QoS values evaluated by user u; similarly, I_ui and I_u denote the binary element and vector indicating whether a QoS value is present, and R_u denotes the disguised data; ε-differential privacy for each user u is achieved by the following equation:
R_ui = r_ui + Laplace(Δf/ε)
where ε is the privacy parameter controlling the privacy level, and Δf is defined as the maximum difference between QoS values, i.e.:
Δf = max(r_ui − r_uj)
where r_ui is the QoS value collected by user u for web service i and r_uj is the QoS value collected by user u for web service j;
and the meaning of Laplace() is given as follows: if the probability density function of a random variable x is
f(x | μ, b) = (1/(2b)) · exp(−|x − μ|/b),
then x has a Laplace(μ, b) distribution, where μ and b are the location parameter and the scale parameter, respectively; letting μ = 0, the distribution can be regarded as a symmetric exponential distribution with standard deviation √2·b; to add noise that obeys the Laplace distribution, let b = Δf/ε, and the generated noise is denoted Laplace(Δf/ε).
3. The method of claim 1, wherein in the second step, the data masquerading achieves the purpose of masquerading data by randomly disturbing the original data; randomness should ensure that sensitive information cannot be derived from perturbed data, including quality of service values for each individual user; when the number of users is very large, the aggregated information of the users can still be evaluated with high accuracy.
4. A method according to claim 2, characterized in that in the second step, the privacy parameter ε is given by each user, and by using differential privacy, the random number added to the observed QoS value is the minimum value that maintains a reasonable accuracy for the specified privacy level.
5. The method according to claim 1, wherein in the third step, the model-based collaborative filtering method is specifically as follows: matrix factorization (MF) is used; assume a sparse matrix Q_{n×m} represents the observed QoS values of n users and m services, where each element q_ij reflects the QoS value of user i using service j; using the input matrix Q_{n×m}, MF aims to factorize Q_{n×m} into two matrices of lower dimension d: a user factor matrix U_{n×d} and a service factor matrix V_{m×d}; then, the null elements in Q_{n×m} can be approximated by the product of U and V, i.e., the unknown QoS value q'_ij is estimated by q'_ij = U_i · V_j^T;
MF is usually converted into an optimization problem, and a locally optimal solution is obtained by iteration; the objective function or loss function of MF is defined as:
L = Σ_{I_ij=1} (q_ij − U_i · V_j^T)^2 + λ (||U_i||^2 + ||V_j||^2)
the first part is the squared difference between the existing QoS matrix and the prediction matrix, computed only over the elements that have been evaluated by users; the second part, λ (||U_i||^2 + ||V_j||^2), is a regularization term added to handle overfitting caused by input sparsity; by solving the optimization, the user factor matrix U_{n×d} and the service factor matrix V_{m×d} are finally obtained; the problem is solved using stochastic gradient descent (SGD), whose iterative equations are as follows:
e_ij = q_ij − U_i · V_j^T
U_i ← U_i + γ (e_ij · V_j − λ' · U_i)
V_j ← V_j + γ (e_ij · U_i − λ' · V_j)
where γ is the learning rate and λ' is the regularization coefficient; the choice of the two parameters significantly affects the result, which diverges rather than converges when γ is large; to obtain convergence, γ is empirically set to 0.001 and likewise λ' is empirically set to 0.01, although longer training time is required; the iteration terminates when the value of the objective function falls below a certain threshold.
6. The method according to claim 1, wherein in the fourth step, the prediction is specifically: after the QoS value of a certain service is obtained through collaborative filtering, the QoS values of other users for the same service are retrieved and the user with the closest value is selected, the two users being considered to have similar interests; a similar recommendation is made based on that user, and the corresponding value of that similar user is adopted as the prediction result for the current user.
CN201710875789.9A 2017-09-25 2017-09-25 Model-based collaborative filtering method for privacy-preserving collaborative Web service quality prediction Pending CN107679415A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710875789.9A CN107679415A (en) 2017-09-25 2017-09-25 Model-based collaborative filtering method for privacy-preserving collaborative Web service quality prediction
PCT/CN2017/113485 WO2019056572A1 (en) 2017-09-25 2017-11-29 Model-based collaborative filtering method for collaborative web quality-of-service prediction for privacy protection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710875789.9A CN107679415A (en) 2017-09-25 2017-09-25 Model-based collaborative filtering method for privacy-preserving collaborative Web service quality prediction

Publications (1)

Publication Number Publication Date
CN107679415A true CN107679415A (en) 2018-02-09

Family

ID=61137965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710875789.9A Pending CN107679415A (en) 2017-09-25 2017-09-25 Model-based collaborative filtering method for privacy-preserving collaborative Web service quality prediction

Country Status (2)

Country Link
CN (1) CN107679415A (en)
WO (1) WO2019056572A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109257217A (en) * 2018-09-19 2019-01-22 河海大学 Web service QoS prediction technique based on secret protection under mobile peripheral surroundings
CN109376549A (en) * 2018-10-25 2019-02-22 广州电力交易中心有限责任公司 A kind of electricity transaction big data dissemination method based on difference secret protection
CN110232151A (en) * 2019-05-22 2019-09-13 温州大学 A kind of construction method of the QoS prediction model of mixing probability distribution detection
CN112328912A (en) * 2020-11-03 2021-02-05 重庆大学 QoS prediction method utilizing location awareness
CN112700067A (en) * 2021-01-14 2021-04-23 安徽师范大学 Method and system for predicting service quality under unreliable mobile edge environment
CN113364621A (en) * 2021-06-04 2021-09-07 浙江大学 Service quality prediction method under service network environment
CN114462707A (en) * 2022-02-10 2022-05-10 湖南科技大学 Web service multidimensional QoS (quality of service) joint prediction method based on feature depth fusion
CN116595254A (en) * 2023-05-18 2023-08-15 杭州绿城信息技术有限公司 Data privacy and service recommendation method in smart city

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113554201A (en) * 2020-04-23 2021-10-26 山东大学 Grading prediction system and method based on hyper-network and federal learning
CN111783032B (en) * 2020-06-30 2023-07-04 北京百度网讯科技有限公司 Method, device, equipment and storage medium for obtaining target space-time observation data
CN112948364A (en) * 2021-02-08 2021-06-11 国网上海市电力公司 Electric energy metering data recovery method based on collaborative fitting
CN112926088B (en) * 2021-03-18 2024-03-19 之江实验室 Federal learning privacy policy selection method based on game theory
CN113420421B (en) * 2021-05-28 2023-07-25 西安邮电大学 QoS prediction method based on time sequence regularized tensor decomposition in mobile edge calculation
CN113486257B (en) * 2021-07-01 2023-07-11 湖北工业大学 Coordinated filtering convolutional neural network recommendation system and method based on countermeasure matrix decomposition
CN113536145A (en) * 2021-07-05 2021-10-22 安徽大学 Transformer collaborative filtering method based on spatial perception
CN113674066A (en) * 2021-09-06 2021-11-19 中国银行股份有限公司 Recommendation method and system for mobile banking financing products
CN113674036A (en) * 2021-09-06 2021-11-19 中国银行股份有限公司 Recommendation method and system for mobile banking advertisements
CN114117203B (en) * 2021-11-08 2024-06-11 长沙理工大学 Short video recommendation method, system and equipment
CN114398538B (en) * 2021-12-08 2024-02-06 西安电子科技大学 Cross-domain recommendation method and system for privacy protection, storage medium and computer equipment
CN115277521B (en) * 2022-06-06 2023-10-03 浙大城市学院 Multi-view-based dynamic prediction method for QoS (quality of service) of Internet of things
CN115190474B (en) * 2022-07-21 2024-05-03 河海大学 Privacy perception computing task unloading method for marine observation monitoring sensor network
CN115828312B (en) * 2023-02-17 2023-06-16 浙江浙能数字科技有限公司 Privacy protection method and system for social network of power user
CN116299172B (en) * 2023-02-27 2023-12-01 河南大学 Accurate positioning method and device based on privacy protection quality assessment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103684850A (en) * 2013-11-25 2014-03-26 浙江大学 Service neighborhood based Web Service quality prediction method
CN103840985A (en) * 2014-02-28 2014-06-04 浙江大学 Web service quality prediction method and device based on user neighborhoods

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014031551A1 (en) * 2012-08-20 2014-02-27 Thomson Licensing A method and apparatus for privacy-preserving data mapping under a privacy-accuracy trade-off
CN104050267B (en) * 2014-06-23 2017-10-03 中国科学院软件研究所 The personalized recommendation method and system of privacy of user protection are met based on correlation rule
CN107092837A (en) * 2017-04-25 2017-08-25 华中科技大学 A kind of Mining Frequent Itemsets and system for supporting difference privacy

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103684850A (en) * 2013-11-25 2014-03-26 浙江大学 Service neighborhood based Web Service quality prediction method
CN103840985A (en) * 2014-02-28 2014-06-04 浙江大学 Web service quality prediction method and device based on user neighborhoods

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王彤 (Wang Tong): "Research on collaborative filtering recommendation technology based on differential privacy protection", China Master's Theses Full-text Database, Information Science and Technology (Monthly) *
鲜征征 等 (Xian Zhengzheng et al.): "Research on the application of differential privacy protection in recommender systems", Application Research of Computers *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109257217A (en) * 2018-09-19 2019-01-22 河海大学 Web service QoS prediction technique based on secret protection under mobile peripheral surroundings
CN109376549A (en) * 2018-10-25 2019-02-22 广州电力交易中心有限责任公司 A kind of electricity transaction big data dissemination method based on difference secret protection
CN109376549B (en) * 2018-10-25 2021-09-10 广州电力交易中心有限责任公司 Electric power transaction big data publishing method based on differential privacy protection
CN110232151B (en) * 2019-05-22 2022-07-15 温州大学 Construction method of QoS (quality of service) prediction model for mixed probability distribution detection
CN110232151A (en) * 2019-05-22 2019-09-13 温州大学 A kind of construction method of the QoS prediction model of mixing probability distribution detection
CN112328912A (en) * 2020-11-03 2021-02-05 重庆大学 QoS prediction method utilizing location awareness
CN112328912B (en) * 2020-11-03 2023-05-19 重庆大学 QoS prediction method using location awareness
CN112700067A (en) * 2021-01-14 2021-04-23 安徽师范大学 Method and system for predicting service quality under unreliable mobile edge environment
CN112700067B (en) * 2021-01-14 2024-05-24 安徽师范大学 Method and system for predicting service quality in unreliable mobile edge environment
CN113364621B (en) * 2021-06-04 2022-07-26 浙江大学 Service quality prediction method under service network environment
WO2022252251A1 (en) * 2021-06-04 2022-12-08 浙江大学 Method for predicting quality of service in service network environment
CN113364621A (en) * 2021-06-04 2021-09-07 浙江大学 Service quality prediction method under service network environment
US11985042B2 (en) 2021-06-04 2024-05-14 Zhejiang University Method for predicting Quality of Service in service network environment
CN114462707A (en) * 2022-02-10 2022-05-10 湖南科技大学 Web service multidimensional QoS (quality of service) joint prediction method based on feature depth fusion
CN114462707B (en) * 2022-02-10 2023-11-17 湖南科技大学 Web service multidimensional QoS joint prediction method based on feature depth fusion
CN116595254A (en) * 2023-05-18 2023-08-15 杭州绿城信息技术有限公司 Data privacy and service recommendation method in smart city
CN116595254B (en) * 2023-05-18 2023-12-12 杭州绿城信息技术有限公司 Data privacy and service recommendation method in smart city

Also Published As

Publication number Publication date
WO2019056572A1 (en) 2019-03-28

Similar Documents

Publication Publication Date Title
CN107679415A (en) Model-based collaborative filtering method for privacy-preserving collaborative Web service quality prediction
CN107659444A (en) Differential privacy prediction system and method for privacy-preserving collaborative Web service quality
CN107609421A (en) Neighborhood-based collaborative filtering method for privacy-preserving collaborative Web service quality prediction
Sun et al. LDP-FL: Practical private aggregation in federated learning with local differential privacy
Zhao et al. Privacy-preserving blockchain-based federated learning for IoT devices
CN111866869B (en) Federal learning indoor positioning privacy protection method for edge calculation
Ruzafa-Alcázar et al. Intrusion detection based on privacy-preserving federated learning for the industrial IoT
Ganti et al. PoolView: stream privacy for grassroots participatory sensing
Liu et al. Differential private collaborative Web services QoS prediction
Singh et al. Differentially-private federated neural architecture search
Chen et al. Privacy and fairness in Federated learning: on the perspective of Tradeoff
JP2016511891A (en) Privacy against sabotage attacks on large data
Liu et al. Hybrid differential privacy based federated learning for Internet of Things
Liu et al. Face image publication based on differential privacy
Fu et al. On the practicality of differential privacy in federated learning by tuning iteration times
Wei et al. Dpis: An enhanced mechanism for differentially private sgd with importance sampling
Mireshghallah et al. A principled approach to learning stochastic representations for privacy in deep neural inference
Firdaus et al. A secure federated learning framework using blockchain and differential privacy
Cui et al. Boosting accuracy of differentially private federated learning in industrial IoT with sparse responses
Galli et al. Group privacy for personalized federated learning
Wang et al. Protecting data privacy in federated learning combining differential privacy and weak encryption
Jiang et al. Differential privacy in privacy-preserving big data and learning: Challenge and opportunity
Liu et al. Privacy-preserving collaborative web services QoS prediction via differential privacy
Pàmies-Estrems et al. Lifelogging protection scheme for internet-based personal assistants
Zhou et al. Exploring the Practicality of Differentially Private Federated Learning: A Local Iteration Tuning Approach

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180209

WD01 Invention patent application deemed withdrawn after publication