CN110825955A

CN110825955A - A Distributed Differential Privacy Recommendation Method Based on Location Services

Info

Publication number: CN110825955A
Application number: CN201910567967.0A
Authority: CN
Inventors: 郑孝遥; 汪祥舜; 朱德义; 孙丽萍; 俞庆英; 汪小寒; 罗永龙
Original assignee: Anhui Normal University
Current assignee: Anhui Normal University
Priority date: 2019-06-27
Filing date: 2019-06-27
Publication date: 2020-02-21
Anticipated expiration: 2039-06-27
Also published as: CN110825955B

Abstract

The invention discloses a distributed differential privacy recommendation method based on location services. It addresses two problems: traditional recommendation systems are poorly suited to location-based recommendation services, and they expose users to privacy leakage. The method uses a distributed privacy-preserving recommendation framework and differential privacy theory to design a singular value decomposition recommendation algorithm based on the distributed framework, and uses an order-preserving encryption function to protect the location requested by the user, thereby achieving privacy protection.

Description

A Distributed Differential Privacy Recommendation Method Based on Location Services

Technical Field

The present invention relates to the fields of recommendation systems and privacy protection, and in particular to a distributed differential privacy recommendation method based on location services.

Background

With the rapid development of the mobile Internet and smart terminal technology, location-based services (LBS) have been widely studied and applied. Mobile users can now use the GPS in their smart terminals to determine their geographic location and, by sending that location to an LBS provider, request personalized services; the most common are point-of-interest recommendation and map navigation.

To obtain personalized services, users must provide their location to the service provider. The service provider also computes each user's preferences from the user's historical consumption records and, from a large number of items, recommends those that are of potential interest to the user and satisfy the user's location constraints. In this process users face two privacy threats: leakage of their geographic location, and leakage of their preference information.

At present, privacy protection methods for location-based recommendation systems fall into three types: generalization, data perturbation, and encryption. Generalizing the user's location information offers low security against new types of attack; data perturbation methods offer insufficient protection; and homomorphic encryption algorithms have high computational complexity, which makes recommendation inefficient on large-scale data sets.

Summary of the Invention

Traditional recommendation systems are poorly suited to location-based recommendation services and also face privacy leakage. To address this, the present invention implements a distributed privacy-preserving recommendation framework, uses differential privacy theory to design a singular value decomposition recommendation algorithm based on the distributed framework, and uses an order-preserving encryption function to protect the location requested by the user.

To achieve the above purpose, the technical solution adopted by the present invention is a distributed differential privacy recommendation method based on location services, comprising the following steps:

Step S11: build a distributed recommendation system architecture that protects the privacy of historical rating data and location information;

Step S12: the distributed recommendation system architecture uses a cloud computing service model, in which each user's rating information is processed with distributed protection and then stored in recommendation servers in the cloud;

Step S13: add noise to achieve differential privacy protection;

Step S14: implement order-preserving encryption through the four functions Gen, Der, Enc and Cmp;

Step S15: the user side executes the unconstrained random slicing algorithm;

Step S16: the user side executes the constrained hierarchical random slicing algorithm;

Step S17: send the rating slices to each distributed recommendation server; in the second stage, execute the user's recommendation request;

Step S18: execute the input-perturbation stochastic gradient descent algorithm to obtain the privacy-protected user and item latent feature matrices P_k^{m×f} and Q_k^{n×f};

Step S19: apply the privacy protection model on the location server side to protect the privacy of location request services.

In step S11, the distributed recommendation system architecture is built mainly on the singular value decomposition method, with the model given by formula 1:

where Test denotes the training set of user u's ratings of item i, p_u and q_i denote the latent factor feature vectors of the user and the item, Ψ denotes the objective function, r denotes the predicted rating variable, p denotes the user latent factor variable, q denotes the item latent factor variable, T denotes matrix transposition, and λ denotes the regularization parameter.
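The image for formula 1 is not reproduced in this text. A regularized squared-error objective of the kind commonly used for SVD-style recommenders, and consistent with the symbols defined above, would read as follows. This is an assumed reconstruction, and the exact form in the filing may differ.

```latex
\Psi = \min_{p,\,q} \sum_{(u,i) \in \mathrm{Test}}
  \left( r_{ui} - p_u^{T} q_i \right)^2
  + \lambda \left( \lVert p_u \rVert^2 + \lVert q_i \rVert^2 \right)
```

Here r_{ui} is the rating of user u for item i, and the second term penalizes large latent vectors to limit overfitting.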

In step S12, the distributed recommendation system architecture operates as follows:

1) User u_i first gives a rating r_ij to the recommended item poi_j after consuming it, then executes the random slicing algorithm, which divides the rating into K parts according to the number of distributed recommendation servers, adds differential-privacy noise to each part, and sends one part to each recommendation server;

2) After distributed recommendation server k receives the rating slice data, it periodically executes the gradient descent algorithm on the objective function in formula 1 to update the latent factor feature vectors of users and items, using formula 2:

3) When user u_i requests the point-of-interest recommendation service, the user obtains their geographic coordinates (x_i, y_i) through smart terminal positioning, then sets a request interval (x_i - Δx_i1, x_i + Δx_i2), (y_i - Δy_i1, y_i + Δy_i2) according to the desired request range and sends it to the location server. The location server matches the interval against the geographic locations of the recommended items, selects the items that meet the user's request, and sends a score prediction request to the distributed recommendation servers;

4) After receiving the request from the location server, each distributed recommendation server calculates the predicted score from the user and item latent factor feature vectors, using formula 3:

Each distributed recommendation server sends its predicted slice score to the user, and the user computes the final predicted score from the slice scores received.
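The prediction and recombination step can be sketched as follows. Formula 3 and the user-side formula are not reproduced in this text, so the sketch assumes that each server predicts a slice score as the dot product of its latent vectors and that the user recombines slice scores by summation. The function names and the sample vectors are hypothetical.

```python
def slice_prediction(p_u, q_i):
    """Predicted slice score on one server (assumed: dot product of latent vectors)."""
    return sum(a * b for a, b in zip(p_u, q_i))


def combine_slices(slice_scores):
    """User-side recombination (assumed: plain sum of the K slice scores)."""
    return sum(slice_scores)


# Hypothetical latent vectors (p_u, q_i) held by K = 3 servers for one user/item pair.
servers = [
    ([0.5, 1.0], [1.0, 0.5]),
    ([1.0, 0.0], [1.5, 2.0]),
    ([0.5, 0.5], [1.0, 2.0]),
]
slice_scores = [slice_prediction(p, q) for p, q in servers]
predicted = combine_slices(slice_scores)  # 1.0 + 1.5 + 1.5 = 4.0
```

No single server sees more than its own slice score; only the user holds the recombined value.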

In step S14, the functions are defined as follows:

Gen function: given a security parameter k and a range parameter n, with k∈N and n∈N, Gen takes k and n as input and outputs an encryption parameter param and a master key mkey, where (param, mkey) = Gen(k, n);

Enc function: given the parameter param and the master key mkey, and a plaintext num as input, the function outputs the ciphertext ciph, where ciph = Enc(param, mkey, num);

Der function: given the parameter param and the master key mkey, and a plaintext num as input, the function generates a token, where token = Der(param, mkey, num);

Cmp function: given the parameter param, two ciphertexts ciph and ciph′, and the token, the function outputs a value in {-1, 0, 1}, that is, Cmp(param, ciph, ciph′, token) ∈ {-1, 0, 1};

Given the ciphertexts ciph = Enc(param, mkey, num) and ciph′ = Enc(param, mkey, num′), the two values can be compared in secret using the Cmp function;

In step S15, the random slicing algorithm follows the unconstrained principle: based on the number K of distributed recommendation servers, it randomly divides the rating r_ij into K parts and sends them to the corresponding distributed recommendation servers (DRS).

In step S16, the random slicing algorithm follows the proportional-constraint principle: based on the number K of distributed recommendation servers, it divides the rating r_ij into K parts according to proportions set by the user and sends them to the corresponding DRS.

In step S13, noise is first added to the rating data according to the Laplace mechanism: the global sensitivity of a rating is Δr = r_max - r_min, so the added noise is Laplace(Δr/ε). The random slicing algorithm is then executed on the user side, and after the rating slices are sent to the DRS, each DRS obtains a user-item slice rating matrix that satisfies:

Step S19 is processed as follows:

1) User u_i first generates the security parameters k and n and uses the Gen function to generate the encryption parameter param and the comparison key mkey. The user then encrypts the request range (x_i - Δx_i1, x_i + Δx_i2), (y_i - Δy_i1, y_i + Δy_i2) to obtain Enc(x_i - Δx_i1, x_i + Δx_i2), Enc(y_i - Δy_i1, y_i + Δy_i2), Der(x_i - Δx_i1, x_i + Δx_i2) and Der(y_i - Δy_i1, y_i + Δy_i2), and sends this encrypted data together with param and mkey to the location server (LBSS);

2) After receiving the user's location request, the location server runs the comparable-encryption protocol to filter points of interest. The location server traverses all points of interest, examines the geographic coordinates (lon_j, lat_j) of each point of interest poi_j, and adds those that satisfy the filter conditions to the candidate set R_P. The specific comparison conditions are as follows:

The location server sends the identifiers of the points of interest in the candidate set R_P to the DRS and requests that the DRS perform predictive recommendation;

3) After each DRS receives the prediction request from the location server, it computes its predicted slice scores and sends them to the user (RU);

4) After receiving the scores from the recommendation servers, the user combines them and selects the Top-N recommendation results with the highest scores.

The present invention proposes a distributed differential privacy recommendation method based on location services. Taking the overall trade-off into account, it provides strong privacy protection while maintaining recommendation performance, and it contributes to both academic research and practical applications.

Description of the Drawings

The content of each drawing in this specification is briefly described below:

FIG. 1 is a flowchart of the distributed differential privacy recommendation method based on location services disclosed in an embodiment of the present invention;

FIG. 2 is an architecture diagram of the distributed system disclosed in an embodiment of the present invention;

FIG. 3 shows the experimental data for Beijing hotels from Ctrip disclosed in an embodiment of the present invention;

FIG. 4 shows the experimental data for Beijing food from Dianping disclosed in an embodiment of the present invention.

Detailed Description

The specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings, covering, for example, the shape and structure of each component, the relative positions and connections between the parts, the function and working principle of each part, and the manufacturing process and method of operation and use, to help those skilled in the art gain a more complete, accurate and thorough understanding of the inventive concept and technical solution.

The present invention proposes a distributed privacy-preserving recommendation framework. To prevent leakage of users' historical rating data and location information, it uses a distributed recommendation system architecture to protect both kinds of information: under a cloud computing service model, each user's rating information is processed with distributed protection and then stored in recommendation servers in the cloud. Using differential privacy theory, a singular value decomposition recommendation algorithm is designed on top of the distributed framework, and an order-preserving encryption function protects the location requested by the user. The invention uses a comparable encryption scheme, implemented by the four functions Gen, Der, Enc and Cmp, which obtains the query result in a single round of interaction while keeping the user's location secure. The invention also proposes an unconstrained random slicing algorithm and a constrained hierarchical random slicing algorithm, and verifies the performance of each slicing algorithm experimentally.

To further improve the security of the distributed privacy protection framework, the invention adds differential privacy noise on top of the random slicing algorithm, so that good privacy protection is achieved even if the distributed recommendation servers collude. The input-perturbation stochastic gradient descent algorithm yields the privacy-protected user and item latent feature matrices P_k^{m×f} and Q_k^{n×f}, and a comparable encryption scheme is used between the user and the location server.

As shown in FIG. 1, the specific implementation is as follows:

Step S11: to prevent leakage of users' historical rating data and location information, the present invention uses a distributed recommendation system architecture to protect both kinds of information.

Step S12: the distributed structure uses a cloud computing service model, in which each user's rating information is processed with distributed protection and then stored in recommendation servers in the cloud.

Step S13: add noise to achieve differential privacy protection.

Step S14: order-preserving encryption is a query encryption scheme that answers range queries without revealing the queried values. The present invention uses a comparable encryption scheme, implemented by the four functions Gen, Der, Enc and Cmp, which obtains the query result in a single round of interaction while keeping the user's location secure.

Step S15: the user side executes the unconstrained random slicing algorithm.

Step S16: the user side executes the constrained hierarchical random slicing algorithm.

Step S17: send the rating slices to each distributed recommendation server; in the second stage, execute the user's recommendation request.

Step S18: execute the input-perturbation stochastic gradient descent algorithm to obtain the privacy-protected user and item latent feature matrices P_k^{m×f} and Q_k^{n×f}.

Step S19: apply the privacy protection model on the location server side to protect the privacy of location request services, then analyze and verify the results.

The distributed privacy-preserving recommendation framework in step S11 is based mainly on the singular value decomposition (SVD) method, which processes large-scale data sets efficiently and performs considerably better than traditional collaborative filtering methods. Its model is given by formula (1).

In the above formula, Text denotes the training set of user u's ratings of item i, and p_u and q_i denote the latent factor feature vectors of the user and the item. Ψ is the objective function, whose optimal solution can be found with a gradient descent optimization algorithm.
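To make the role of Ψ concrete, the following sketch evaluates a regularized squared-error objective for given latent vectors. Formula (1) is not reproduced in this text, so the exact form is an assumption (the form commonly used for SVD-style recommenders), and the function names are illustrative.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def objective(ratings, P, Q, lam):
    """Assumed objective: squared error plus an L2 penalty on the latent vectors.

    ratings maps (u, i) to the observed rating r_ui;
    P[u] and Q[i] are the latent vectors p_u and q_i; lam is the parameter lambda.
    """
    total = 0.0
    for (u, i), r in ratings.items():
        err = r - dot(P[u], Q[i])
        total += err ** 2 + lam * (dot(P[u], P[u]) + dot(Q[i], Q[i]))
    return total


# One rating of 3.0, predicted as 2.0: error term 1.0, penalty 0.5 * (2 + 2) = 2.0.
value = objective({(0, 0): 3.0}, [[1.0, 1.0]], [[1.0, 1.0]], 0.5)
```

Gradient descent lowers this value by adjusting P and Q; a larger lam trades fit for smaller latent vectors.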

Based on the system architecture in FIG. 2, the entities operate as follows:

1) User u_i first gives a rating r_ij to the recommended item poi_j after consuming it, then executes the random slicing algorithm, which divides the rating into K parts according to the number of distributed recommendation servers, adds differential-privacy noise to each part, and sends one part to each recommendation server.

2) After distributed recommendation server k receives the rating slice data, it periodically executes the gradient descent algorithm on the objective function in formula (1) to update the latent factor feature vectors of users and items.
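The update formulas are not reproduced in this text. Assuming the regularized squared-error objective that is standard for SVD-style recommenders, one gradient step for an observed rating r_{ui} would take the following form, where γ is a learning rate introduced here for illustration and the constant factor 2 from the derivative is absorbed into γ.

```latex
e_{ui} = r_{ui} - p_u^{T} q_i, \qquad
p_u \leftarrow p_u + \gamma \left( e_{ui}\, q_i - \lambda\, p_u \right), \qquad
q_i \leftarrow q_i + \gamma \left( e_{ui}\, p_u - \lambda\, q_i \right)
```

On server k, r_{ui} would be the slice of the rating that the server received, so each server learns its own pair of latent vectors.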

3) When user u_i requests the point-of-interest recommendation service, the user obtains their geographic coordinates (x_i, y_i) through smart terminal positioning, then sets a request interval (x_i - Δx_i1, x_i + Δx_i2), (y_i - Δy_i1, y_i + Δy_i2) according to the desired request range and sends it to the location server. The location server matches the interval against the geographic locations of the recommended items, selects the items that meet the user's request, and sends a score prediction request to the distributed recommendation servers.

4) After receiving the request from the location server, each distributed recommendation server calculates the predicted score from the user and item latent factor feature vectors:

Each distributed recommendation server sends its predicted slice score to the user, and the user computes the final predicted score from the slice scores received.

Order-preserving encryption (OPE) is a query encryption scheme that answers range queries without revealing the queried values. The present invention uses a comparable encryption scheme, which obtains the query result in a single round of interaction while also guaranteeing the security of the user's location. The scheme is implemented by four functions, Gen, Der, Enc and Cmp, which work as follows:

Gen function: given a security parameter k and a range parameter n, with k∈N and n∈N, Gen takes k and n as input and outputs an encryption parameter param and a master key mkey. That is:

(param, mkey) = Gen(k, n)    (4)

Enc function: given the parameter param and the master key mkey, and a plaintext num as input, the function outputs the ciphertext ciph.

ciph = Enc(param, mkey, num)    (5)

Der function: given the parameter param and the master key mkey, and a plaintext num as input, the function generates a token.

token = Der(param, mkey, num)    (6)

Cmp function: given the parameter param, two ciphertexts ciph and ciph′, and the token, the function outputs a value in {-1, 0, 1}.

Cmp(param, ciph, ciph′, token) ∈ {-1, 0, 1}    (7)

Given the ciphertexts ciph = Enc(param, mkey, num) and ciph′ = Enc(param, mkey, num′), the two values can be compared in secret using the Cmp function.
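The four-function interface can be illustrated with a deliberately simple stand-in. The sketch below is NOT the comparable encryption scheme used by the invention and is not secure: it encrypts with a secret affine map, which preserves order, purely to show how Gen, Enc, Der and Cmp fit together. The meaning of n (an exclusive upper bound on plaintexts), the token format, and the sign convention of the comparison are all assumptions.

```python
import secrets


def gen(k, n):
    """Gen(k, n) -> (param, mkey). Toy version: mkey is a secret slope and offset."""
    param = {"k": k, "n": n}
    slope = secrets.randbelow(2 ** k) + 1  # at least 1, so enc() is strictly increasing
    offset = secrets.randbelow(2 ** k)
    return param, (slope, offset)


def enc(param, mkey, num):
    """Enc(param, mkey, num) -> ciph. Affine map; insecure, for illustration only."""
    if not 0 <= num < param["n"]:
        raise ValueError("num outside the assumed range [0, n)")
    slope, offset = mkey
    return slope * num + offset


def der(param, mkey, num):
    """Der(param, mkey, num) -> token. Toy token: bound to the ciphertext of num."""
    return ("token", enc(param, mkey, num))


def cmp_enc(param, ciph, ciph2, token):
    """Cmp(param, ciph, ciph', token): assumed to return -1, 0 or 1 for num <, ==, > num'."""
    if token != ("token", ciph):
        raise ValueError("token was not derived for ciph")
    return (ciph > ciph2) - (ciph < ciph2)


# Coordinates would first be scaled to integers, e.g. longitude 116.39 -> 11639.
param, mkey = gen(32, 100000)
lower = enc(param, mkey, 11639)
poi = enc(param, mkey, 11640)
result = cmp_enc(param, lower, poi, der(param, mkey, 11639))  # lower bound < poi
```

In a real deployment the comparison would reveal only the ordering of the two plaintexts, and only to a party holding a matching token.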

Further, the distributed privacy-preserving recommendation method has two stages. In the first stage, the slicing algorithm is executed on the user side, and each distributed recommendation server executes the matrix factorization algorithm to update the latent feature factors of users and items. In the second stage, the user's recommendation request is executed. Suppose user u_i gives the rating r_ij to the recommended item poi_j after consuming it; the slicing algorithm is executed on the user side, and the rating slices are then sent to the distributed recommendation servers. The present invention proposes two random slicing algorithms:

Random slicing algorithm 1 follows the unconstrained principle: based on the number K of distributed recommendation servers, it randomly divides the rating r_ij into K parts and sends them to the corresponding DRS. The specific steps are given in Algorithm 1.

Algorithm 1: Unconstrained random slicing algorithm

Step 1. Input the user rating r_ij and the number of slices K.

Step 2. Generate a random number in (0, r_ij) and assign it to the variable r.

Step 3. Compare r and r_ij - r, and select the smaller value as the slice score.

Step 4. Repeat the above steps until the rating r_ij has been divided into K slice scores.
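Algorithm 1 can be sketched as follows. The text does not say how the repetition in Step 4 treats the part already sliced off, so this sketch assumes that each round operates on the remaining, unsliced portion of the rating and that the last slice is whatever remains. The function name is illustrative.

```python
import random


def unconstrained_slices(r_ij, K, rng=None):
    """Split r_ij into K non-negative slice scores whose sum is r_ij."""
    rng = rng or random.Random()
    remaining = r_ij
    slices = []
    for _ in range(K - 1):
        r = rng.uniform(0, remaining)     # Step 2: random number in (0, remaining)
        piece = min(r, remaining - r)     # Step 3: keep the smaller of r and remaining - r
        slices.append(piece)
        remaining -= piece                # assumed: the next round slices what is left
    slices.append(remaining)              # the last slice takes the remainder
    return slices


parts = unconstrained_slices(4.0, 5, random.Random(1))
```

Because each piece is at most half of what remains, every slice stays non-negative and the slices always add back up to the original rating (up to floating-point rounding).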

Random slicing algorithm 2 follows the proportional-constraint principle: based on the number K of distributed recommendation servers, it divides the rating r_ij into K parts according to proportions set by the user and sends them to the corresponding DRS. The user first randomly initializes K proportion parameters {w_1, w_2, ..., w_K} that satisfy the required constraint, keeps them as private information, and uses them in all subsequent slicing. The rating r_ij is then divided according to the proportion parameters. The specific steps are given in Algorithm 2.

Algorithm 2: Constrained random slicing algorithm

Step 2. Generate K random numbers {w_1, w_2, ..., w_K} in (0, 1) that satisfy the required constraint.

Step 3. Multiply each random number in {w_1, w_2, ..., w_K} by the user rating r_ij to obtain K slice scores.
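Algorithm 2 can be sketched as follows. The constraint on the proportion parameters is not reproduced in this text; the sketch assumes that the weights must sum to 1, so that the slices recombine to the original rating. The function names are illustrative.

```python
import random


def make_weights(K, rng=None):
    """Draw K positive random weights and normalize them (assumed constraint: sum = 1)."""
    rng = rng or random.Random()
    raw = [1.0 - rng.random() for _ in range(K)]  # each value lies in (0, 1]
    total = sum(raw)
    return [w / total for w in raw]


def constrained_slices(r_ij, weights):
    """Step 3: multiply each weight by the rating to obtain the K slice scores."""
    return [w * r_ij for w in weights]


weights = make_weights(4, random.Random(7))   # generated once and kept private by the user
parts = constrained_slices(5.0, weights)
```

Unlike Algorithm 1, the same private weights are reused for every rating, so server k always receives the same fixed fraction of each of the user's ratings.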

To further improve the security of the distributed privacy protection framework, the present invention adds a differential privacy method on top of the random slicing algorithm, so that good privacy protection is achieved even if the distributed recommendation servers collude. Noise is first added to the rating data according to the Laplace mechanism: the global sensitivity of a rating is Δr = r_max - r_min, so the added noise is Laplace(Δr/ε). The random slicing algorithm is then executed on the user side, and after the rating slices are sent to the DRS, each DRS obtains a user-item slice rating matrix that satisfies:
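The perturbation step can be sketched with the Python standard library alone. Laplace noise with scale b is drawn here as b times the difference of two independent unit-rate exponential samples, which is a standard identity. The function names are illustrative, and the sketch follows the order stated in this paragraph (noise first, slicing afterwards).

```python
import random


def laplace_noise(scale, rng):
    """Sample from Laplace(0, scale) as the scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def perturb_rating(r_ij, r_min, r_max, epsilon, rng):
    """Add Laplace(delta_r / epsilon) noise, where delta_r = r_max - r_min."""
    sensitivity = r_max - r_min
    return r_ij + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
# Ratings on a 1-to-5 scale give delta_r = 4; with epsilon = 1 the noise scale is 4.
noisy = [perturb_rating(3.0, 1.0, 5.0, 1.0, rng) for _ in range(20000)]
mean = sum(noisy) / len(noisy)
```

A smaller ε means a larger noise scale and stronger privacy, at the cost of less accurate ratings reaching the servers.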

What each DRS actually obtains is a slice matrix with noise added. Let the rating matrix actually obtained by the k-th DRS be R′_k. Algorithm 3 then yields the privacy-protected user and item latent feature matrices P_k^{m×f} and Q_k^{n×f}.

Algorithm 3: Stochastic gradient descent with input perturbation

Step 1. Input the slice rating matrix R′_k with Laplace noise added, the dimension f of the latent factor matrices, the regularization parameter λ, and the maximum rating value r_max.

Step 2. Clamp every entry of the noisy rating matrix R′_k to the range [0, r_max].

Step 3. Using the objective function, perform matrix factorization with the stochastic gradient descent algorithm to compute the user and item feature matrices P_k^{m×f} and Q_k^{n×f}.
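Algorithm 3 can be sketched as follows for one server. The objective function image is not reproduced in this text, so the sketch assumes the standard regularized squared-error objective; the learning rate, epoch count, initialization, and all function names are illustrative choices rather than values from the source.

```python
import random


def clamp_ratings(noisy, r_max):
    """Step 2: force every noisy rating into [0, r_max]."""
    return {key: min(max(r, 0.0), r_max) for key, r in noisy.items()}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mse(ratings, P, Q):
    """Mean squared prediction error over the observed entries."""
    return sum((r - dot(P[u], Q[i])) ** 2 for (u, i), r in ratings.items()) / len(ratings)


def ipsgd(noisy, m, n, f, lam, r_max, lr=0.05, epochs=500, seed=0):
    """Step 3: factorize the clamped slice matrix into P (m x f) and Q (n x f)."""
    rng = random.Random(seed)
    ratings = list(clamp_ratings(noisy, r_max).items())
    P = [[rng.uniform(0.0, 0.1) for _ in range(f)] for _ in range(m)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(f)] for _ in range(n)]
    for _ in range(epochs):
        rng.shuffle(ratings)                      # visit ratings in random order
        for (u, i), r in ratings:
            err = r - dot(P[u], Q[i])
            for d in range(f):
                p_ud, q_id = P[u][d], Q[i][d]
                P[u][d] += lr * (err * q_id - lam * p_ud)
                Q[i][d] += lr * (err * p_ud - lam * q_id)
    return P, Q


# Hypothetical noisy slice ratings for 3 users and 3 items; -0.3 is clamped to 0.
noisy = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 2): 1.5,
         (2, 1): 0.5, (2, 2): 1.0, (2, 0): -0.3}
P, Q = ipsgd(noisy, m=3, n=3, f=2, lam=0.01, r_max=5.0)
```

Because the noise is added to the input ratings before training, the factorization itself needs no further modification; clamping only removes values that no real rating could take.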

In practice, each DRS periodically executes the IPSGD algorithm after receiving users' slices and updates the matrices P_k^{m×f} and Q_k^{n×f}, so that it can predict other slice scores from the user and item latent factor matrices, namely:

The location server mainly stores the geographic coordinates of each point of interest and accepts location service requests from users. To avoid leaking the user's location, a comparable encryption scheme is used between the user and the location server. The privacy protection protocol for the location request service is as follows:

1. (@RU): User u_i first generates the security parameters k and n and uses the Gen function to generate the encryption parameter param and the comparison key mkey. The user then encrypts the request range (x_i - Δx_i1, x_i + Δx_i2), (y_i - Δy_i1, y_i + Δy_i2) to obtain Enc(x_i - Δx_i1, x_i + Δx_i2), Enc(y_i - Δy_i1, y_i + Δy_i2), Der(x_i - Δx_i1, x_i + Δx_i2) and Der(y_i - Δy_i1, y_i + Δy_i2), and sends this encrypted data together with param and mkey to the LBSS.

2. (@LBSS): After receiving the user's location request, the location server runs the comparable-encryption protocol to filter points of interest. The location server traverses all points of interest, examines the geographic coordinates (lon_j, lat_j) of each point of interest poi_j, and adds those that satisfy the filter conditions to the candidate set R_P. The specific comparison conditions are as follows:

The location server sends the identifiers of the points of interest in the candidate set R_P to the DRS and requests that the DRS perform predictive recommendation.

3. (@DRS): After each DRS receives the prediction request from the location server, it computes its predicted slice scores and sends them to the user RU.

4. (@RU): After receiving the scores from the recommendation servers, the user combines them and selects the Top-N recommendation results with the highest scores.
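The filtering and Top-N steps of the protocol can be sketched as follows. The comparison conditions are not reproduced in this text, so the sketch assumes an inclusive rectangle test on plaintext coordinates, standing in for the encrypted Cmp calls, and assumes that the user combines slice scores by summation. All names and sample data are hypothetical.

```python
import heapq


def filter_pois(pois, x_range, y_range):
    """Step 2 (@LBSS): keep points of interest whose coordinates fall in the request range."""
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    return [pid for pid, (lon, lat) in pois.items()
            if x_lo <= lon <= x_hi and y_lo <= lat <= y_hi]


def top_n(slice_scores_by_poi, n):
    """Step 4 (@RU): sum the slice scores per candidate and pick the n best."""
    totals = {pid: sum(scores) for pid, scores in slice_scores_by_poi.items()}
    return heapq.nlargest(n, totals, key=totals.get)


pois = {"a": (116.40, 39.90), "b": (116.41, 39.91), "c": (117.00, 40.50)}
candidates = filter_pois(pois, (116.39, 116.42), (39.89, 39.92))

# Hypothetical slice scores returned by K = 3 servers for each candidate.
slice_scores = {"a": [1.0, 2.0, 0.5], "b": [2.0, 2.0, 0.5]}
best = top_n(slice_scores, 1)
```

In the actual protocol the location server would perform the range test on ciphertexts, so it learns which points of interest qualify without seeing the user's coordinates.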

Finally, the following four algorithms are selected for comparison with the model proposed by the present invention:

(1) UBCF Model: This model uses user-based collaborative filtering to predict user-item ratings and provides no privacy protection.

(2) IBCF Model: This model uses item-based collaborative filtering to predict user-item ratings and provides no privacy protection.

(3) SVD Model: This model uses matrix factorization to obtain the latent factor feature vectors of users and items and predicts user-item ratings from them. It provides no privacy protection.

(4) DP-SVD Model: Building on the SVD recommendation model, this model applies differential privacy by adding Laplace noise to the user-item rating matrix, so that the privacy of user ratings is protected while recommendations are produced. It does not protect the user's geographic location.

(5) DDP-SVD Model: The distributed privacy protection model proposed by the present invention, which protects both the privacy of user ratings and the user's geographic location.

The present invention uses data sets from two well-known Chinese websites for verification and analysis: Beijing hotel data from Ctrip and Beijing restaurant data from Dianping. Both data sets were collected online by web crawlers and include the users' ratings of items (on a scale of 1 to 5) and the geographic coordinates of the items. After cleaning, sparsely rated data were filtered out of the Ctrip hotel data and the Dianping restaurant data, and the data meeting the test requirements of the present invention were selected, as shown in Figures 3 and 4.

In the present invention, first, against the background of mobile Internet location services, it is recognized that traditional recommendation systems are not well suited to location-based recommendation services and also face the problem of privacy leakage. Second, a distributed privacy-preserving recommendation framework is proposed, a singular value decomposition recommendation algorithm based on the distributed framework is designed, and an order-preserving encryption function is used to protect the location requested by the user. Finally, differential privacy theory is incorporated, which achieves privacy protection while maintaining good recommendation quality. The invention improves the privacy protection capability and achieves good overall performance.

Those skilled in the art will further appreciate that the execution steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.

The present invention has been described above by way of example with reference to the accompanying drawings. The specific implementation of the present invention is evidently not limited to the manner described above. Any insubstantial improvement made using the method concept and technical solution of the present invention, or any direct application of the concept and technical solution of the present invention to other situations without improvement, falls within the protection scope of the present invention.

Claims

1. A distributed differential privacy recommendation method based on location service, characterized in that it comprises the following steps:

Step S11, forming a distributed recommendation system architecture to perform privacy protection on historical scoring data and location privacy information;

Step S12, the formed distributed recommendation system architecture uses the cloud computing service mode, and the user's rating information is stored in the recommendation servers of each cloud after being processed by distributed protection;

Step S13, adding noise to realize differential privacy protection;

Step S14, realizing order-preserving encryption through the four functions Gen, Der, Enc and Cmp;

Step S15, the user terminal executes an unconstrained random slicing algorithm;

Step S16, the user terminal executes a constrained hierarchical random slicing algorithm;

Step S17, sending the score shards to each distributed recommendation server, and executing the user's recommendation request in the second stage;

Step S18, executing the input-perturbation stochastic gradient descent algorithm to obtain the privacy-protected user and item latent feature vector matrices P_k^(m×f) and Q_k^(n×f);

Step S19, the privacy protection model on the location server side implements the privacy protection of the location request service.

2. The distributed differential privacy recommendation method based on location service according to claim 1, wherein: in the step S11, the distributed recommendation system architecture is mainly constructed based on the singular value decomposition method, and the model is given by formula 1:

where Test represents the training set of user u's ratings of item i, p_u and q_i represent the latent factor feature vectors of the user and the item, ‖·‖_F represents the Frobenius norm, and Ψ represents a template function.

3. The distributed differential privacy recommendation method based on location service according to claim 1 or 2, characterized in that: in the step S12, the distributed recommendation system architecture operation process comprises:

1) First, the user u_i gives a score r_ij to the recommended item poi_j after consumption, and then executes the random slicing algorithm to divide the score into K parts according to the number of distributed recommendation servers;

2) After the distributed recommendation server k receives the score shard data, it periodically executes the gradient descent algorithm according to the objective function in formula 1 to update the latent factor feature vectors of the users and the items, using formula 2:

3) When the user u_i requests the point-of-interest recommendation service, the user obtains his or her own geographic coordinates (x_i, y_i) through the positioning function of the smart terminal, and then, according to the required request range, sends the location request intervals (x_i-Δx_i1, x_i+Δx_i2) and (y_i-Δy_i1, y_i+Δy_i2) to the location server; the location server selects the recommended items that satisfy the user's request by matching the geographic locations of the recommended items, and sends a rating prediction request to the distributed recommendation servers;

4. The distributed differential privacy recommendation method based on location service according to claim 3, characterized in that: in the step S14, the functions are expressed as follows:

Gen function: given a security parameter k and a range parameter n, with k∈N and n∈N, Gen takes k and n as input and outputs an encryption parameter param and a master key mkey, where (param, mkey) = Gen(k, n);

Enc function: given the parameter param and the master key mkey, and a plaintext num as input, the function outputs the ciphertext ciph, where ciph = Enc(param, mkey, num);

Der function: given the parameter param and the master key mkey, and a plaintext num as input, the function generates the token token, where token = Der(param, mkey, num);

Cmp function: given the parameter param, two ciphertexts ciph and ciph′, and the token token, the function outputs a value in {-1, 0, 1}, that is, Cmp(param, ciph, ciph′, token) ∈ {-1, 0, 1};

Given the ciphertexts ciph = Enc(param, mkey, num) and ciph′ = Enc(param, mkey, num′), the secret comparison can be realized by the Cmp function;
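The interface of the four functions can be illustrated with the toy sketch below. It is only a stand-in and is NOT the comparable encryption scheme referred to in the claim, nor is it secure: Enc is assumed to be a secret, strictly increasing affine map, the token is assumed to be simply the toy ciphertext of the number it was derived from, and the range parameter n is assumed to mean that plaintexts lie in [0, 2^n). All of these choices are illustrative assumptions, as are the lowercase function names.

```python
import secrets
from typing import Dict, Tuple


def gen(k: int, n: int) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Gen: from security parameter k and range parameter n, output (param, mkey)."""
    param = {"k": k, "n": n}
    # Toy master key: a positive slope and an offset (assumption, not secure).
    mkey = {"a": secrets.randbits(k) + 1, "b": secrets.randbits(k)}
    return param, mkey


def enc(param: Dict[str, int], mkey: Dict[str, int], num: int) -> int:
    """Enc: toy strictly increasing map, so ciphertext order equals plaintext order."""
    if not 0 <= num < 2 ** param["n"]:
        raise ValueError("num is outside the range allowed by n")
    return mkey["a"] * num + mkey["b"]


def der(param: Dict[str, int], mkey: Dict[str, int], num: int) -> int:
    """Der: toy token, here just the toy ciphertext of num (assumption)."""
    return enc(param, mkey, num)


def cmp(param: Dict[str, int], ciph: int, ciph2: int, token: int) -> int:
    """Cmp: return -1, 0 or 1; the token must belong to one of the two ciphertexts."""
    if token not in (ciph, ciph2):
        raise ValueError("token does not match either ciphertext")
    return (ciph > ciph2) - (ciph < ciph2)
```

Because geographic coordinates are real numbers, a caller of such an interface would first convert them to non-negative integers, for example by fixed-point scaling; that conversion is likewise an assumption and is not stated in the claim.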

5. The distributed differential privacy recommendation method based on location service according to claim 4, characterized in that: in the step S15, the random sharding algorithm adopts the unconstrained principle: according to the number K of distributed recommendation servers, the score r_ij is randomly divided into K parts, which are sent to the corresponding DRS.

6. The distributed differential privacy recommendation method based on location service according to claim 5, wherein in the step S16, the random sharding algorithm adopts the principle of equal ratio constraint: according to the number K of distributed recommendation servers, the score r_ij is divided into K parts according to the ratio set by the user, and the parts are sent to the corresponding DRS.
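The two sharding principles of claims 5 and 6 can be illustrated with the following minimal sketch. It assumes additive sharding, that is, that the K shards of a score sum back to the score; the claims do not reproduce the algorithms themselves, so the sampling range used for the unconstrained shards and the function names unconstrained_shards and ratio_shards are hypothetical.

```python
import random
from typing import List, Sequence


def unconstrained_shards(r: float, k: int, rng: random.Random) -> List[float]:
    """Split score r into k random shards that sum to r (no bound on any single shard).

    Drawing the first k-1 shards uniformly from [-|r|, |r|] is an illustrative choice.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    bound = abs(r)
    shards = [rng.uniform(-bound, bound) for _ in range(k - 1)]
    shards.append(r - sum(shards))  # the last shard makes the total equal r
    return shards


def ratio_shards(r: float, ratios: Sequence[float]) -> List[float]:
    """Split score r into len(ratios) shards proportional to the user-set ratios."""
    total = sum(ratios)
    if total <= 0:
        raise ValueError("ratios must have a positive sum")
    return [r * w / total for w in ratios]
```

For example, ratio_shards(4.0, [1, 1, 2]) yields the shards 1.0, 1.0 and 2.0, one for each of three recommendation servers.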

7. The distributed differential privacy recommendation method based on location service according to claim 6, wherein in the step S13, noise is first added to the score data according to the Laplace mechanism, wherein the global sensitivity of the score is Δr = r_max - r_min, so that the added noise is Laplace(Δr/ε), and the random sharding algorithm is then executed on the user side; after the score shard data are sent to each DRS, each DRS obtains a user-item shard score matrix and satisfies
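As an illustration of the noise step in claim 7 (not of the elided condition at the end of the claim), the sketch below perturbs a score with Laplace noise of scale Δr/ε and then shards the noisy score on the user side. It uses the standard fact that the difference of two independent exponential variables with rate 1/b follows a Laplace distribution with scale b. Additive sharding and the function names laplace_noise, perturb_score and shard are assumptions made for the sketch.

```python
import random
from typing import List


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two Exp(rate = 1/scale) draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def perturb_score(r: float, r_min: float, r_max: float,
                  epsilon: float, rng: random.Random) -> float:
    """Add Laplace(delta_r / epsilon) noise, with global sensitivity delta_r = r_max - r_min."""
    delta_r = r_max - r_min
    return r + laplace_noise(delta_r / epsilon, rng)


def shard(value: float, k: int, rng: random.Random) -> List[float]:
    """User-side sharding of the noisy score into k additive shards (assumption)."""
    bound = abs(value)
    shards = [rng.uniform(-bound, bound) for _ in range(k - 1)]
    shards.append(value - sum(shards))
    return shards


rng = random.Random(42)
noisy = perturb_score(4.0, r_min=1.0, r_max=5.0, epsilon=1.0, rng=rng)
parts = shard(noisy, 3, rng)  # one shard per distributed recommendation server
```

With scores on a scale of 1 to 5, Δr = 4, so a privacy budget of ε = 1 corresponds to Laplace noise of scale 4. A smaller ε gives stronger privacy at the cost of noisier scores.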

8. The distributed differential privacy recommendation method based on location service according to claim 1, 2, 4, 5, 6 or 7, characterized in that: the processing method of step S19 is as follows:

1) User u_i first generates the security parameters k and n, and uses the Gen function to generate the encryption parameter param and the comparison key mkey; the user then encrypts the request ranges (x_i-Δx_i1, x_i+Δx_i2) and (y_i-Δy_i1, y_i+Δy_i2) to obtain Enc(x_i-Δx_i1, x_i+Δx_i2), Enc(y_i-Δy_i1, y_i+Δy_i2), Der(x_i-Δx_i1, x_i+Δx_i2) and Der(y_i-Δy_i1, y_i+Δy_i2), and sends these encrypted data together with param and mkey to the LBSS;

2) After receiving the user's location request, the location server runs the comparable encryption protocol to filter the points of interest: the location server traverses all points of interest, checks the geographic coordinates (lon_j, lat_j) of each point of interest poi_j, and adds the points of interest that satisfy the filtering conditions to the set to be recommended R_P; the specific comparison conditions are as follows:

The location server sends the numbers of the points of interest in the set to be recommended R_P to the DRS, and requests the DRS to perform prediction and recommendation;

3) After each DRS receives the recommendation prediction request from the location server, it computes the predicted score shards and sends each predicted score shard to the user RU;

4) After the user receives the scores from the recommendation servers, the user computes the final predicted scores from the received shards and selects the Top-N recommendation results with the highest scores.