CN113536095A

CN113536095A - Data recommendation method and device and storage medium

Info

Publication number: CN113536095A
Application number: CN202010279667.5A
Authority: CN
Inventors: 冉鹏; 粟栗; 耿慧拯
Original assignee: China Mobile Communications Group Co Ltd; Research Institute of China Mobile Communication Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; Research Institute of China Mobile Communication Co Ltd
Priority date: 2020-04-10
Filing date: 2020-04-10
Publication date: 2021-10-22

Abstract

The invention discloses a data recommendation method, device and storage medium. The method includes: determining first data to be sent, the first data being behavior data collected locally by a terminal; sending second data to a server, wherein the second data is the first data when it is determined that the first data does not meet a conversion condition, and the second data is perturbed data obtained based on the first data when it is determined that the first data meets the conversion condition; and receiving recommendation data determined and sent by the server according to the second data.

Description

A data recommendation method, device and storage medium

Technical Field

The present invention relates to data privacy technology, and in particular to a data recommendation method, device and storage medium.

Background

A recommendation system usually recommends information and products of interest to a user according to the user's interests and purchasing behavior. Depending on where the recommendation algorithm runs, recommendation systems are divided into server-side recommendation systems, client-side recommendation systems, proxy-server-side recommendation systems, and so on. Existing privacy protection schemes for recommendation systems generally perturb the uploaded or collected data on the client side or the server side by various techniques, or rewrite the recommendation algorithm, in order to protect privacy, for example by using centralized differential privacy.

However, centralized differential privacy requires the data collector to be an honest party that will not act maliciously on the real data uploaded by users. In real scenarios a completely trusted data collector does not exist, and users often do not want vendors to obtain their private data. At the same time, applications such as precision marketing, advertisement placement and personalized recommendation need to mine large amounts of user data to obtain more accurate user profiles and improve user experience. How to achieve accurate recommendation while protecting user privacy is a problem that currently needs to be solved.

Summary of the Invention

In view of this, the main purpose of the present invention is to provide a data recommendation method, device and storage medium.

To achieve the above purpose, the technical solution of the present invention is implemented as follows:

An embodiment of the present invention provides a data recommendation method, the method is applied to a terminal, and the method includes:

determining first data to be sent, the first data being behavior data collected locally by the terminal;

sending second data to a server, wherein the second data is the first data when it is determined that the first data does not meet a conversion condition, and the second data is perturbed data obtained based on the first data when it is determined that the first data meets the conversion condition;

receiving recommendation data determined and sent by the server according to the second data.

In the above solution, the method further includes: judging whether the first data meets the conversion condition;

the judging whether the first data meets the conversion condition includes:

determining a preset privacy parameter, the privacy parameter being associated with a preset degree of privacy protection;

determining a perturbation probability value according to the preset privacy parameter;

performing a binary randomized response on the first data according to the perturbation probability value to obtain a response result, the response result indicating whether the first data is to be converted;

when the response result indicates that the first data is not to be converted, the first data does not meet the conversion condition;

when the response result indicates that the first data is to be converted, the first data meets the conversion condition.

In the above solution, the first data includes: at least one parameter and a value corresponding to each of the at least one parameter;

converting the first data includes:

performing a multi-value randomized response on the value corresponding to each of the at least one parameter to obtain a randomized response result corresponding to each of the at least one parameter;

obtaining the second data according to the randomized response result corresponding to each of the at least one parameter.

In the above solution, the performing a multi-value randomized response on the value corresponding to each of the at least one parameter to obtain a randomized response result corresponding to each of the at least one parameter includes:

performing, according to a preset privacy parameter, a multi-value randomized response on the value corresponding to each of the at least one parameter to obtain the randomized response result corresponding to each of the at least one parameter, the privacy parameter being associated with a preset degree of privacy protection.

In the above solution, the method further includes: determining a similarity between the first data and the second data;

before the sending second data to the server, the method further includes:

adding a label to the second data according to the similarity, the label indicating whether recommendation data determined based on the second data is to be adopted;

after the receiving recommendation data determined and sent by the server according to the second data, the method further includes:

determining the label of the second data corresponding to the recommendation data;

determining a recommendation result according to the label of the second data corresponding to the recommendation data, the recommendation result indicating whether a recommendation is made according to the recommendation data.

An embodiment of the present invention provides a data recommendation device, the device including a first processing module, a second processing module and a third processing module, wherein:

the first processing module is configured to determine first data to be sent, the first data being behavior data collected locally by a terminal;

the second processing module is configured to send second data to a server, wherein the second data is the first data when it is determined that the first data does not meet a conversion condition, and the second data is perturbed data obtained based on the first data when it is determined that the first data meets the conversion condition;

the third processing module is configured to receive recommendation data determined and sent by the server according to the second data.

In the above solution, the second processing module is configured to determine a preset privacy parameter, the privacy parameter being associated with a preset degree of privacy protection;

the second processing module is configured to perform a multi-value randomized response on the value corresponding to each of the at least one parameter to obtain a randomized response result corresponding to each of the at least one parameter;

In the above solution, the second processing module is configured to perform, according to the preset privacy parameter, a multi-value randomized response on the value corresponding to each of the at least one parameter to obtain the randomized response result corresponding to each of the at least one parameter, the privacy parameter being associated with a preset degree of privacy protection.

In the above solution, the second processing module is further configured to determine a similarity between the first data and the second data;

and the second processing module is further configured to add, before the second data is sent to the server, a label to the second data according to the similarity, the label indicating whether recommendation data determined based on the second data is to be adopted;

the third processing module is further configured to determine, after receiving the recommendation data determined and sent by the server according to the second data, the label of the second data corresponding to the recommendation data;

An embodiment of the present invention provides a data recommendation device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above data recommendation method when executing the program.

An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the above data recommendation method.

With the data recommendation method, device and storage medium provided by the embodiments of the present invention, first data to be sent is determined, the first data being behavior data collected locally by a terminal; second data is sent to a server, wherein the second data is the first data when it is determined that the first data does not meet a conversion condition, and the second data is perturbed data obtained based on the first data when it is determined that the first data meets the conversion condition; and recommendation data determined and sent by the server according to the second data is received. In this way, by applying local differential privacy (LDP) processing to the uploaded data, leakage of users' private data by an untrusted data collector can be resisted.

Description of Drawings

FIG. 1 is a schematic flowchart of a data recommendation method provided by an embodiment of the present invention;

FIG. 2 is a schematic flowchart of another data recommendation method provided by an embodiment of the present invention;

FIG. 3 is a schematic flowchart of still another data recommendation method provided by an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a data recommendation device provided by an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a data recommendation system provided by an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of another data recommendation device provided by an embodiment of the present invention.

Detailed Description

Before the present invention is described in further detail with reference to the embodiments, privacy protection and the related techniques for making recommendations based on privacy-protected data are described first.

The rapid development of the Internet industry has made data sharing convenient and fast, but the resulting risk of privacy leakage is rising day by day, and the continuous escalation of network attack methods likewise places higher demands on the theory and technology of privacy protection. With the arrival of the big data era, vendors are paying more and more attention to user data. Existing research shows that attackers can mine users' private information from massive data rather than obtaining it directly by accessing the data, which leaves traditional encryption and access control techniques unable to resist this type of attack. There are three main approaches to privacy protection: data distortion, data encryption and access control. Current privacy protection techniques combine several of these approaches. For example, k-anonymity, l-diversity and t-closeness play a certain role in resisting homogeneity attacks, background knowledge attacks and similarity attacks; these techniques all depend on the attacker's background knowledge, yet none of them makes reasonable assumptions about the attack model. As another example, the differential privacy model proposed by C. Dwork et al. provides a higher level of security for personal information: it does not depend on how much background knowledge the attacker has, and it protects privacy in data analysis by randomizing the data, for example by introducing noise.

A recommendation system usually recommends information and products of interest to a user according to the user's interests and purchasing behavior. Depending on where the recommendation algorithm runs, recommendation systems are divided into server-side recommendation systems, client-side recommendation systems, proxy-server-side recommendation systems, and so on. Existing privacy protection schemes for recommendation systems generally perturb the uploaded or collected data on the client side or the server side by various techniques, or rewrite the recommendation algorithm, in order to protect privacy. The prior art proposes recommendation systems based on differential privacy. For example, differential privacy has been introduced into a K-nearest-neighbor recommendation algorithm, where neighbors are selected privately under the differential privacy framework and recommendations are made accordingly; this method can effectively resist attacks based on similar users. Arnaud et al. proposed a differentially private matrix factorization method: in a recommendation algorithm based on matrix factorization, the method introduces noise satisfying the differential privacy condition into the user rating data and into the stochastic gradient descent process, and can resist attacks on the server side to a certain extent. Shen et al. proposed a recommendation system that applies differential privacy on the client side, using public data to calculate the magnitude of the perturbation applied to user data so that the perturbed user data remains usable.

However, the above methods use a centralized differential privacy model, which rests on a key assumption: the data collector is an honest party that will not act maliciously on the real data uploaded by users. In real scenarios a completely trusted data collector does not exist, and users often do not want vendors to obtain their private data. Applications such as precision marketing, advertisement placement and personalized recommendation, on the other hand, need to mine large amounts of user data to obtain more accurate user profiles and improve user experience.

For a recommendation system, an important question in architecture research is where user information is collected and where the user profile is stored: on the server, on the client, or on a proxy server between the two. When the recommendation algorithm is implemented on the server or on a proxy server, the security of users' private data cannot be guaranteed. Both the administrators of the recommendation system and people who break into it can easily obtain the user data stored on the server. Because users' personal data is highly valuable, some of the people who have access to it will sell it or use it for illegal purposes. A client-based recommendation system, in contrast, has difficulty obtaining the data of other users and therefore the user profiles, which makes collaborative recommendation strategies hard to implement and often requires more complex recommendation algorithms.

In view of the above problems, in the solution provided by the embodiments of the present invention, first data to be sent is determined, the first data being behavior data collected locally by a terminal; second data is sent to a server, wherein the second data is the first data when it is determined that the first data does not meet a conversion condition, and the second data is perturbed data obtained based on the first data when it is determined that the first data meets the conversion condition; and recommendation data determined and sent by the server according to the second data is received.

The present invention is described in further detail below with reference to the embodiments.

FIG. 1 is a schematic flowchart of a data recommendation method provided by an embodiment of the present invention. As shown in FIG. 1, the data recommendation method is applied to a terminal (such as a mobile phone, a tablet computer, a personal computer or a notebook computer), and the method includes:

Step 101: determining first data to be sent, the first data being behavior data collected locally by the terminal;

Step 102: sending second data to a server, wherein the second data is the first data when it is determined that the first data does not meet a conversion condition, and the second data is perturbed data obtained based on the first data when it is determined that the first data meets the conversion condition;

Step 103: receiving recommendation data determined and sent by the server according to the second data.
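Steps 101 to 103 can be sketched as a single terminal-side routine. This is only an illustration under stated assumptions: the function name `run_recommendation_round` and the three callables passed to it (`meets_conversion_condition`, `perturb`, `send_to_server`) are hypothetical placeholders and do not come from the patent text.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def run_recommendation_round(
    first_data: Sequence[float],
    meets_conversion_condition: Callable[[Sequence[float]], bool],
    perturb: Callable[[Sequence[float]], Vector],
    send_to_server: Callable[[Vector], list],
) -> list:
    """Hypothetical sketch of steps 101-103 on the terminal side."""
    # Step 101: first_data is the behavior data collected locally.
    # Step 102: choose the second data according to the conversion condition.
    if meets_conversion_condition(first_data):
        second_data = perturb(first_data)   # perturbed data
    else:
        second_data = list(first_data)      # the first data itself
    # Step 103: upload the second data and receive the recommendation data.
    return send_to_server(second_data)
```

The server only ever sees `second_data`, so it cannot tell from a single upload whether the record is real or perturbed.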

In an embodiment, the method further includes: judging whether the first data meets the conversion condition;

when the response result indicates that the first data is not to be converted, the first data does not meet the conversion condition; accordingly, the first data can be used as the second data;

when the response result indicates that the first data is to be converted, the first data meets the conversion condition; accordingly, the perturbed data obtained based on the first data can be used as the second data.

Specifically, whether the first data meets the conversion condition can be determined by performing a binary randomized response on the first data according to the perturbation probability value and examining the response result. That is, when the response result indicates that the first data is not to be converted, it is determined that the first data does not meet the conversion condition; when the response result indicates that the first data is to be converted, it is determined that the first data meets the conversion condition.

Here, the perturbation probability value is a probability value that indicates whether privacy protection is applied to the first data to be uploaded.

The binary randomized response is described in detail below.

In the binary randomized response, the user's real data, that is, the real first data described above, is uploaded with probability p, and perturbed data (that is, distorted data) of the same form as the real data is uploaded with probability 1-p (that is, the perturbation probability value). Here, the binary randomized response satisfies the following formula (1):
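Formula (1) is not reproduced in this text. Based on the surrounding description it presumably takes the following form, where X is the first data, Y is the uploaded second data, and X' denotes a perturbed record of the same form as X (X' is a symbol introduced here only for illustration):

```latex
\Pr[Y = X] = p, \qquad \Pr[Y = X'] = 1 - p \tag{1}
```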

That is, when the response result indicates that the first data is not to be converted, which happens with probability p, the first data is used as the second data;

when the response result indicates that the first data is to be converted, which happens with probability 1-p, the perturbed data obtained by converting the first data is used as the second data.

Here, because the randomized response technique is used, it is difficult for the server to tell whether the user has uploaded real data (that is, the first data) or perturbed data, and the terminal side can deny, with probability 1-p, that the data it uploaded is real.
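The binary decision described above can be sketched as a single biased coin flip. The function name `binary_random_response` and the use of Python's `random.Random` are assumptions made for illustration; a production implementation would more likely draw from a cryptographically secure source.

```python
import random


def binary_random_response(p_truth: float, rng: random.Random) -> bool:
    """Return True when the first data should be converted (perturbed).

    p_truth is the probability p of uploading the real first data, so the
    function returns True with the perturbation probability 1 - p.
    """
    if not 0.0 <= p_truth <= 1.0:
        raise ValueError("p_truth must lie in [0, 1]")
    # rng.random() is uniform on [0, 1), so this is True with probability 1 - p.
    return rng.random() >= p_truth
```

For example, with `p_truth = 0.75` roughly one upload in four is converted, and any individual upload can plausibly be claimed to be perturbed.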

It should be noted that, in the binary randomized response technique, the relationship between the probability p of giving the true answer (here, uploading the real first data) and the preset privacy parameter ε of local differential privacy satisfies the following formula (2):
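Formula (2) is not reproduced in this text. For binary randomized response under ε-local differential privacy the standard relation is the one below, and it is assumed here to be what formula (2) states; it yields a likelihood ratio p / (1 - p) = e^ε between the two possible outputs:

```latex
p = \frac{e^{\varepsilon}}{1 + e^{\varepsilon}} \tag{2}
```

Under this relation p tends to 1/2 as ε approaches 0 and tends to 1 as ε grows, so a smaller ε means the real data is uploaded less often.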

When the user chooses a higher degree of privacy protection, the privacy budget parameter ε of local differential privacy is smaller and, correspondingly, the probability p that the terminal uploads real data to the server is lower.

That is, the determining a perturbation probability value according to the preset privacy parameter may include:

querying, according to the preset privacy parameter, a correspondence between privacy parameters and perturbation probability values, and determining the perturbation probability value corresponding to the preset privacy parameter.

The correspondence between privacy parameters and perturbation probability values may be preset by a developer and stored on the server, and automatically obtained from the server by the terminal when it determines the perturbation probability value; alternatively, it may be stored in the client loaded on the terminal. This is not limited here.

Here, the correspondence between privacy parameters and perturbation probability values can be set according to formula (2) above.

Specifically, the privacy parameter is associated with a preset degree of privacy protection, and the preset degree of privacy protection can be set by the user through the human-computer interaction interface of the terminal. Different degrees of protection correspond to different privacy parameters. For example, the terminal (more precisely, the client loaded on the terminal) provides selection buttons corresponding to a first-level, a second-level and a third-level degree of protection, where the first-level degree of protection is higher than the second-level degree and the second-level degree is higher than the third-level degree. Correspondingly, the privacy parameter for the first-level degree of protection is smaller than that for the second-level degree, and the privacy parameter for the second-level degree is smaller than that for the third-level degree. The user selects a degree of privacy protection according to his or her own needs, and the terminal then determines the selected degree of privacy protection and the corresponding privacy parameter.

Here, before the data recommendation method is enabled, the user can also choose whether to use privacy protection at all. For example, when the user selects the degree of privacy protection for historical data on the terminal, the terminal can provide, in addition to the first-level, second-level and third-level degrees of protection described above, an "off" button; selecting the "off" button indicates that privacy protection is not used.
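The lookup from protection level to perturbation probability might be sketched as follows. The ε values in `EPSILON_BY_LEVEL` are hypothetical (the patent does not give concrete values), and `truth_probability` assumes the standard binary randomized response relation p = e^ε / (1 + e^ε), since formula (2) is not reproduced in this text. Passing `None` models the "off" button.

```python
import math
from typing import Optional

# Hypothetical privacy parameters: a higher protection level (level 1 is the
# strongest) maps to a smaller epsilon. The concrete numbers are illustrative.
EPSILON_BY_LEVEL = {1: 0.5, 2: 1.0, 3: 2.0}


def truth_probability(epsilon: float) -> float:
    """Probability p of uploading the real data (assumed standard relation)."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def perturbation_probability(level: Optional[int]) -> float:
    """Perturbation probability 1 - p for a protection level; None means off."""
    if level is None:
        return 0.0  # privacy protection disabled: always upload the real data
    return 1.0 - truth_probability(EPSILON_BY_LEVEL[level])
```

A table precomputed this way could be shipped with the client or fetched from the server, matching the two storage options described above.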

In an embodiment, the first data includes: at least one parameter and a value corresponding to each of the at least one parameter;

converting the first data includes:

By converting the first data in the above manner, perturbed data is obtained based on the first data, and the obtained perturbed data is used as the second data.

Specifically, the performing a multi-value randomized response on the value corresponding to each of the at least one parameter to obtain a randomized response result corresponding to each of the at least one parameter includes:

Specifically, the multi-value randomized response means performing a randomized response on the value corresponding to each parameter according to the preset privacy parameter.

For example, the first data includes at least one product (that is, the at least one parameter) and a rating corresponding to each product (that is, the value corresponding to each parameter). The first data can be recorded as a vector, denoted X = (x_1, x_2, …, x_i, …, x_n), where x_i represents the user's rating of the i-th product. Generally, a rating of 0 indicates that the user has not yet used the product or has not rated it.

Performing a multi-value randomized response on the rating corresponding to each product includes:

When the randomized response technique decides that the real data is not to be uploaded to the server this time, the terminal performs one multi-value randomized response on each element x_i of the vector X. Specifically, assuming that a rating can take the values 1, 2, …, k, that is, k levels in total, the terminal performs the multi-value randomized response on each element x_i of the vector X according to the following formula (3):

where e denotes the natural constant, ε denotes the preset privacy parameter, and k denotes the original value of x_i, that is, the original rating;
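Formula (3) is not reproduced in this text. A common choice for a multi-value randomized response over k categories is generalized (k-ary) randomized response, shown below as an assumption rather than as the patent's own formula. Here k is taken to be the number of rating levels, x_i the original rating and y_i the reported rating:

```latex
\Pr[y_i = v] =
\begin{cases}
\dfrac{e^{\varepsilon}}{e^{\varepsilon} + k - 1}, & v = x_i \\[2ex]
\dfrac{1}{e^{\varepsilon} + k - 1}, & v \neq x_i,\ v \in \{1, 2, \dots, k\}
\end{cases} \tag{3}
```

The k probabilities sum to 1, and the ratio between the probability of keeping the value and that of any other single value is e^ε, which is the ε-local differential privacy bound.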

Specifically, each element x_i of the vector X keeps its original value with one probability and, with the remaining probability, is set to any one of the values 1, 2, …, k that is not equal to x_i (the two probability expressions are not reproduced in this text). The output of the algorithm is denoted y_i. Finally, one piece of perturbed data Y = (y_1, y_2, …, y_i, …, y_n), in which every element has been perturbed, is obtained.
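The per-element perturbation can be sketched with generalized (k-ary) randomized response. This is an assumption: the exact probabilities of the patent's formula are not reproduced in this text, so the sketch keeps a rating with probability e^ε / (e^ε + k - 1) and otherwise picks one of the other k - 1 ratings uniformly. The function names are hypothetical, and leaving a rating of 0 (not rated) untouched is a further assumption that the source does not specify.

```python
import math
import random
from typing import List, Sequence


def k_ary_random_response(x: int, k: int, epsilon: float, rng: random.Random) -> int:
    """Perturb one rating x in 1..k (assumed generalized randomized response)."""
    if k < 2 or not 1 <= x <= k:
        raise ValueError("need k >= 2 and a rating in 1..k")
    keep_prob = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < keep_prob:
        return x  # keep the original rating
    # Otherwise pick uniformly among the k - 1 ratings different from x.
    other = rng.randrange(1, k)          # uniform on 1..k-1
    return other if other < x else other + 1


def perturb_vector(X: Sequence[int], k: int, epsilon: float, rng: random.Random) -> List[int]:
    """Build the perturbed vector Y from X, element by element."""
    # Assumption: a rating of 0 means "not rated" and is passed through as is.
    return [xi if xi == 0 else k_ary_random_response(xi, k, epsilon, rng) for xi in X]
```

With a small ε the reported ratings are close to uniform over 1..k, while a large ε leaves almost every rating unchanged.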

In practical applications, if the perturbed data differs greatly from the original data (that is, the original first data), the recommendation data determined from the perturbed data will not match what the user needs, and still presenting that recommendation data to the user would degrade the user experience. Therefore, it is judged here whether the recommendation data determined from the perturbed data should be recommended to the user.

Based on this, in an embodiment, the method further includes: determining a similarity between the first data and the second data;

Here, after the perturbed data (denoted Y) is obtained through randomized response, the Euclidean distance between the original data (that is, the first data X) and the perturbed data Y (that is, the second data) is calculated to represent the similarity between the original data X and the perturbed data Y. The Euclidean distance is calculated as follows:

这里，所述相似度用于确定是否符合推荐条件，符合推荐条件则使用此次推荐结果，不符合推荐条件则不使用此次推荐结果。所述符合推荐条件指相似度超过预设相似度阈值；相反的，所述不符合推荐条件指相似度不超过预设相似度阈值；所述相似度阈值由开发人员预先设定并保存在服务器中。Here, the similarity is used to determine whether the recommendation condition is met, the recommendation result is used when the recommendation condition is met, and the recommendation result is not used when the recommendation condition is not met. The said meeting the recommendation condition means that the similarity exceeds the preset similarity threshold; on the contrary, the said not meeting the recommendation condition means that the similarity does not exceed the preset similarity threshold; the similarity threshold is preset by the developer and saved in the server middle.

The label may be a number: for example, 1 indicates that the recommendation data determined based on the second data may be used, and 0 indicates that it is not used. The label may also use other characters, such as other numbers or letters; this is not limited here.

The second data carries the label. After receiving the labeled second data, the server determines recommendation data according to the second data and attaches the same label to the recommendation data, so the recommendation data sent to the terminal also carries the label. From the received recommendation data, the terminal can therefore determine the label of the corresponding second data and, according to that label, determine the recommendation result.
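The labeling flow can be sketched as follows. The text measures similarity with the Euclidean distance but states the recommendation condition as a similarity threshold, so this sketch assumes that the condition is met when the distance is at most a hypothetical `max_distance`. The function names and the dictionary layout of the server response are likewise assumptions for illustration.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two equally long score vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def make_label(x, y, max_distance):
    """Return 1 if recommendations based on y may be used, otherwise 0."""
    return 1 if euclidean_distance(x, y) <= max_distance else 0


def server_attach_label(recommendation, label):
    """Server side: echo the label received with the second data."""
    return {"items": recommendation, "label": label}


def client_accepts(response):
    """Terminal side: use the recommendation only if its label is 1."""
    return response["label"] == 1
```

When the real data is uploaded, X and Y are identical, the distance is 0, and the label is always 1.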

Correspondingly, an embodiment of the present invention further provides a data recommendation method applied to a server, which the server uses to make recommendations. FIG. 2 is a schematic flowchart of another data recommendation method provided by an embodiment of the present invention. As shown in FIG. 2, the method includes:

Step 201: Receive second data sent by a terminal. The second data is either first data or disturbance data obtained by converting the first data; the first data is original data collected from the terminal.

Step 202: Determine recommendation data based on the received second data. The recommendation data carries a label indicating whether the recommendation data determined based on the second data is to be adopted.

Step 203: Send the recommendation data to the terminal.

After receiving the second data sent by the terminal, the server can determine the recommendation data according to the second data. The server may use any data recommendation method for this purpose; this is not limited here.

One specific method for determining the recommendation data based on the received second data is provided below. It includes:

Determining the feature of each parameter in the second data and the value corresponding to each parameter, and determining a first vector from them;

Obtaining at least one second vector, where each second vector is obtained based on second data received from another terminal;

Determining the similarity between the first vector and each of the at least one second vector;

Determining, from the at least one second vector, a target second vector whose similarity with the first vector exceeds a preset threshold, where the preset threshold is set in advance by the developer and stored in the server;

Determining the terminal corresponding to the target second vector, and determining the recommendation data based on the data sent by that terminal.

Specifically, determining the recommendation data based on the data sent by the terminal corresponding to the target second vector includes:

Determining at least one first parameter included in the data sent by the terminal corresponding to the target second vector;

Determining at least one second parameter included in the data sent by the terminal corresponding to the first vector;

Filtering the at least one second parameter out of the at least one first parameter, and using the remaining first parameters as the recommendation data.
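The reference method above can be sketched as a small neighbor-based recommender. The text does not name a similarity measure, so cosine similarity is assumed here. Treating a parameter as "included" when its score is non-zero, and the names `cosine_similarity`, `rated_items`, and `recommend`, are also assumptions for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equally long vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rated_items(vector):
    """Indices of the commodities that carry a non-zero score."""
    return {i for i, score in enumerate(vector) if score != 0}


def recommend(first_vector, second_vectors, threshold):
    """Recommend commodities scored by similar terminals but not by the requester."""
    recommended = set()
    for other in second_vectors:
        # A target second vector is one whose similarity exceeds the threshold.
        if cosine_similarity(first_vector, other) > threshold:
            # Filter the requester's own parameters out of the neighbor's parameters.
            recommended |= rated_items(other) - rated_items(first_vector)
    return sorted(recommended)
```

For instance, with a requester vector `[5, 4, 0, 0]`, neighbor vectors `[5, 4, 3, 0]` and `[0, 0, 0, 5]`, and a threshold of 0.8, only the first neighbor qualifies and the commodity at index 2 is recommended.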

The above is only a reference method for determining the recommendation data. Other methods may be used in practice, and the embodiments of the present invention do not limit how the recommendation data is determined.

Determining the recommendation data based on the received second data includes:

Determining the label carried by the second data;

Determining the recommendation data based on the second data, and adding the label to the recommendation data.

Correspondingly, sending the recommendation data to the terminal includes: sending the recommendation data carrying the label to the terminal.

FIG. 3 is a schematic flowchart of still another data recommendation method provided by an embodiment of the present invention. As shown in FIG. 3, the data recommendation method includes:

Step 301: The client applies local differential privacy to randomize the historical data to be uploaded.

Here, the client may be loaded on a device, and the device may be a terminal to which the method shown in FIG. 1 is applied.

Specifically, step 301 includes:

Step 3011: Perform one binary random response on whether to upload the real historical data to the server, obtaining a random response result that indicates whether the real historical data is uploaded. In this binary random response, the real historical data is uploaded with probability p, and disturbance data in the same form as the real historical data is uploaded with probability 1-p. The binary random response satisfies formula (4).

With the random response technique, it is difficult for the server to tell whether the data uploaded by the client is real historical data or disturbance data, and the user can deny having uploaded real historical data with probability 1-p.
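Step 3011 can be sketched as a single biased coin toss. Formula (4) is not shown in this text, so the sketch assumes the common choice p = e^ε / (1 + e^ε). That choice, the names `truth_probability` and `choose_upload`, and the `perturb` callback are assumptions for illustration.

```python
import math
import random


def truth_probability(epsilon):
    """Probability p of uploading the real data (assumed p = e^eps / (1 + e^eps))."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def choose_upload(real_data, perturb, epsilon, rng=None):
    """Return (data_to_upload, is_real) after one binary random response.

    `perturb` is a caller-supplied function that turns the real vector into
    disturbance data of the same form.
    """
    rng = rng or random.Random()
    if rng.random() < truth_probability(epsilon):
        return list(real_data), True
    return perturb(real_data), False
```

The `is_real` flag stays on the client; only the first element of the returned pair would be uploaded.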

In a recommendation algorithm, the historical data uploaded by the client can generally be represented as a commodity-score vector X = (x₁, x₂, …, x_i, …, x_n), where x_i is the user's rating of the i-th commodity. Typically, a score of 0 means that the user has not yet used the commodity or has not rated it.

For example, the historical data may be data about movies the user has watched, books the user has read, or a certain type of item the user has purchased.

Taking movies as the commodity, the score is the user's rating of a movie; that is, the historical data includes at least one movie and the rating of each movie.

For example, the historical data includes: Movie 1: 3 points; Movie 2: 5 points; Movie 3: 9 points; …; Movie N: 6 points.

Step 3012: When the random response technique decides not to upload the real historical data to the server, the client performs one multi-value random response on each element x_i of the vector X. Specifically, assuming that a score can take the values 1, 2, …, k, for a total of k levels, the client performs the multi-value random response on each element x_i of X according to formula (5), in which e is the natural constant (the base of the natural logarithm), ε is the preset privacy parameter, and k is the number of score levels.

Specifically, each element x_i of X keeps its original value with one probability and, with the complementary probability, is set to one of the values in 1, 2, …, k other than x_i; both probabilities are given by formula (5). The output of the algorithm is denoted y_i. The result is the disturbance data vector Y = (y₁, y₂, …, y_i, …, y_n), in which every element has been perturbed.

Specifically, before step 301, the method further includes:

Before the recommendation service is enabled, the user may choose on the client the degree of privacy protection for the historical data, which determines whether local differential privacy is used to protect the historical data and how strong that protection is.

Step 302: The client determines target data and sends the target data to the server.

Here, the target data is the data to be uploaded that results from step 301; it may be the real historical data or the disturbance data obtained by processing the historical data.

Specifically, step 302 further includes:

After the disturbance data Y is obtained through random response, calculating the Euclidean distance between the original historical data X and the disturbance data Y as the similarity between X and Y, where the Euclidean distance is given by formula (6):

$$d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{6}$$

Determining, according to the calculated similarity, the label corresponding to the target data, where the label indicates whether the recommendation data determined based on the target data is to be adopted.

Correspondingly, sending the target data to the server includes:

Sending the target data carrying the label to the server.

Specifically, the historical data, that is, the vector X, is denoted X′ after local differential privacy processing (X′ = X or X′ = Y). Before uploading X′, the client marks it, for example by adding a label indicating whether the recommendation data determined based on X′ is to be adopted. After receiving the recommendation data, the client can determine from the label carried by the recommendation data whether to adopt it (see the method shown in FIG. 1 for details, which are not repeated here).

Alternatively, the client marks X′ before uploading it and records locally the relationship between the uploaded X′ and X. For example, a unique label a is assigned to X. When the uploaded X′ = X, the label a is added to X′. When the uploaded X′ = Y and the similarity between X and Y is higher than the set similarity threshold, a label such as a′ is added to Y, where a′ corresponds to a. If the similarity between Y and X is not higher than the threshold, Y may be left unmarked or marked with other characters to indicate that the recommendation data obtained from Y is not to be used. After receiving the recommendation data, the client compares its label (the label a or a′ of the corresponding X′) with the labels recorded locally, which are the label a of the real historical data X and the label a′ of disturbance data Y that is highly similar to the real historical data. If the label carried by the returned recommendation data is a or a′, it matches a locally recorded label and the recommendation data is used; otherwise the recommendation data is not used.
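The local bookkeeping described in this alternative can be sketched as below. The sketch simplifies the a / a′ scheme: every upload receives a fresh random label, and the client records locally only those labels whose recommendations it is willing to use. The class `TagBook`, its methods, and the use of random hexadecimal labels are assumptions for illustration, not the patent's labeling format.

```python
import uuid


class TagBook:
    """Client-side record of the labels whose recommendations may be used."""

    def __init__(self):
        self._accepted = set()

    def tag_upload(self, is_real, similar_enough):
        """Create a label for one upload X' and record it if it is trustworthy.

        is_real: True when X' is the real historical data X.
        similar_enough: True when X' is disturbance data Y close enough to X.
        """
        tag = uuid.uuid4().hex
        if is_real or similar_enough:
            self._accepted.add(tag)
        return tag

    def accept(self, tag):
        """Use the recommendation only if its label matches a recorded one."""
        return tag in self._accepted
```

Because the label itself carries no meaning in this sketch, the server cannot tell from the label whether the upload was real or perturbed.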

Step 303: The server receives the target data sent by the client and, according to the received target data, determines the recommendation data using a preset recommendation method.

It should be noted that, because the disturbance data has the same form as the real uploaded data, data processed with local differential privacy can still be handled by existing recommendation algorithms. After running the recommendation algorithm, the server returns the recommendation data, carrying the label, to the client.

For the preset recommendation method, refer to the method shown in FIG. 2; details are not repeated here.

Step 304: The server sends the recommendation data and its corresponding label to the client.

Step 305: The client determines the recommendation result according to the recommendation data and its corresponding label.

The local differential privacy and random response techniques involved in the above solution are described further below.

Differential privacy is an important privacy protection technique that has been widely applied in many fields in recent years. It does not require the privacy of the data set as a whole to be guaranteed; instead, it protects the privacy of the individuals in the data set. It distorts the original statistical data, for example by adding random noise, so that a change to any single record in the data set has only a limited effect on the query output. An attacker therefore cannot learn an individual's private information by observing query results, and security is ensured at the cost of some accuracy.

Local differential privacy is a data collection framework that builds on centralized differential privacy. Unlike centralized differential privacy, which assumes a trusted third party, it targets untrusted third-party data collectors.

The protection model under local differential privacy fully considers the possibility that a data collector steals or leaks user privacy during data collection. In this model, each user first privatizes the data and then sends the processed data to the data collector, who computes statistics over the collected data to obtain useful analysis results. Statistical analysis is thus performed on the data while the private information of individuals is not leaked. The formal definition of local differential privacy is as follows.

Given n users, each of whom applies a privacy algorithm M with domain Dom(M) and range Ran(M), if for any two records t and t′ (t, t′ ∈ Dom(M)) the probability that M yields the same output t* (t* ∈ Ran(M)) satisfies the following inequality, then M satisfies ε-local differential privacy:

$$\Pr[M(t) = t^{*}] \le e^{\varepsilon} \cdot \Pr[M(t') = t^{*}]$$
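For a mechanism with finitely many inputs and outputs, the definition can be checked directly. The sketch below assumes the usual form of the inequality, Pr[M(t) = o] ≤ e^ε · Pr[M(t′) = o] for every output o and every pair of inputs t and t′. The function name `satisfies_ldp` and the matrix representation `channel[t][o]` are assumptions for illustration.

```python
import math


def satisfies_ldp(channel, epsilon, tol=1e-12):
    """Check epsilon-local differential privacy for a finite mechanism.

    channel[t][o] is Pr[M(t) = o]; each row is the output distribution
    for one possible input record t.
    """
    bound = math.exp(epsilon)
    n_outputs = len(channel[0])
    for o in range(n_outputs):
        column = [row[o] for row in channel]
        # Compare the probability of output o under every pair of inputs.
        for a in column:
            for b in column:
                if a > bound * b + tol:
                    return False
    return True
```

For example, a binary mechanism that reports the truth with probability 0.75 has the matrix `[[0.75, 0.25], [0.25, 0.75]]`; it passes the check for ε = ln 3 and fails it for ε = 1.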

ε is the privacy budget, which expresses the level of privacy protection. The smaller its value, the more similar the probability distributions of the algorithm's query results on adjacent data sets, and the higher the level of privacy protection. When ε = 0, the data collector cannot distinguish t from t′ at all from the result it receives, and the degree of protection is highest. However, a higher level of privacy protection usually reduces the usability of the data.

As the definition shows, local differential privacy ensures that the algorithm M satisfies ε-local differential privacy by controlling how similar the output results of any two records are. In short, from a given output of the privacy algorithm M, it is almost impossible to infer which record was its input.

Random response is the mainstream perturbation mechanism for local differential privacy. Its main idea is to protect the privacy of the original data by exploiting the uncertainty in responses to sensitive questions. The random response technique consists of two main steps: perturbed statistics and correction.

To introduce the random response technique concretely, consider the following scenario. Suppose there are n users, among whom the true proportion of AIDS patients is π, which is unknown. We want to estimate π. To do so, we pose a sensitive question: "Are you an AIDS patient?" Each user responds, and the answer X_i of the i-th user is "yes" or "no"; for privacy reasons, however, the user does not give the true answer directly. Suppose the user answers with the help of a biased coin that lands heads with probability p and tails with probability 1-p. The user tosses the coin and gives the true answer if it lands heads and the opposite answer if it lands tails.

First, perturbed statistics are collected. Counting the answers of the n users under the above perturbation method yields statistics on the number of AIDS patients. Suppose that n₁ users answer "yes", so that n - n₁ users answer "no". The proportions of users answering "yes" and "no" are then:

$$\Pr(X_i = \text{yes}) = \pi p + (1-\pi)(1-p)$$

$$\Pr(X_i = \text{no}) = (1-\pi)p + \pi(1-p)$$

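From the statistics, the maximum likelihood estimate of the true proportion π of AIDS patients is obtained (for p ≠ 1/2):

$$\hat{\pi} = \frac{p-1}{2p-1} + \frac{n_1}{(2p-1)\,n}$$

Taking the expectation of $\hat{\pi}$ shows that it is an unbiased estimate of π:

$$E(\hat{\pi}) = \frac{p-1}{2p-1} + \frac{E(n_1)}{(2p-1)\,n} = \frac{p - 1 + \pi p + (1-\pi)(1-p)}{2p-1} = \pi$$

This gives the estimated total number of people with HIV:

$$N = \hat{\pi}\,n = \frac{p-1}{2p-1}\,n + \frac{n_1}{2p-1}$$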
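The correction step can be sketched as follows. Since Pr(X_i = yes) = πp + (1-π)(1-p) = π(2p-1) + (1-p), replacing that probability with the observed fraction n₁/n and solving for π gives the estimate π̂ = (p-1)/(2p-1) + n₁/((2p-1)n), valid for p ≠ 1/2. The function names `randomize_answer` and `estimate_proportion` are assumptions for illustration.

```python
import random


def randomize_answer(truth, p, rng):
    """Report the true answer with probability p and the opposite otherwise."""
    return truth if rng.random() < p else not truth


def estimate_proportion(n_yes, n, p):
    """Estimate pi from n_yes 'yes' answers among n randomized answers."""
    if p == 0.5:
        raise ValueError("p = 0.5 carries no information about pi")
    return (p - 1) / (2 * p - 1) + n_yes / ((2 * p - 1) * n)
```

With p = 0.75, 300 "yes" answers out of 1000 correspond to an estimated proportion of 0.1, even though 30% of the reported answers were "yes".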

Now consider the random response technique from the perspective of differential privacy. Suppose a patient has AIDS. When answering the sensitive question "Are you an AIDS patient?", the patient answers "yes" with probability p and "no" with probability 1-p, whereas a person who does not have the disease answers "no" with probability p and "yes" with probability 1-p. Applying the definition of local differential privacy to the two possible true states gives the condition that the random response technique satisfies (for p > 1/2):

$$\frac{\Pr[X_i = \text{yes} \mid \text{patient}]}{\Pr[X_i = \text{yes} \mid \text{non-patient}]} = \frac{p}{1-p} \le e^{\varepsilon}$$

Substituting the probability p of answering truthfully gives the relationship between the privacy budget ε and p:

$$\varepsilon = \ln\frac{p}{1-p}$$

The larger the probability p of answering truthfully, the larger the privacy budget ε, and the lower the degree of protection provided by local differential privacy.
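As a worked illustration of this trend, assume p > 1/2 and the relation ε = ln(p / (1 - p)), which follows from bounding the ratio of the two response probabilities p and 1-p by e^ε. The numeric values of p below are chosen only as examples:

```latex
\varepsilon = \ln\frac{p}{1-p}, \qquad
p = 0.75 \;\Rightarrow\; \varepsilon = \ln 3 \approx 1.10, \qquad
p = 0.9 \;\Rightarrow\; \varepsilon = \ln 9 \approx 2.20 .
```

Raising p from 0.75 to 0.9 doubles ε, so truthful answers become more likely and the protection becomes correspondingly weaker.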

FIG. 4 is a schematic structural diagram of a data recommendation apparatus provided by an embodiment of the present invention. The data recommendation apparatus is applied to a terminal. As shown in FIG. 4, the apparatus includes a first processing module, a second processing module, and a third processing module, wherein,

Specifically, the second processing module is configured to determine a preset privacy parameter; the privacy parameter is associated with a preset degree of privacy protection.

Specifically, the first data includes at least one parameter and the value corresponding to each of the at least one parameter.

Specifically, the second processing module is configured to perform, according to the preset privacy parameter, a multi-value random response on the value corresponding to each of the at least one parameter, obtaining a random response result for each parameter; the privacy parameter is associated with the preset degree of privacy protection.

Specifically, the second processing module is further configured to determine the similarity between the first data and the second data.

It should be noted that, when the data recommendation apparatus provided in the above embodiment implements the corresponding data recommendation method, the division into the program modules described above is only an example. In practice, the processing may be assigned to different program modules as required; that is, the internal structure of the server may be divided into different program modules to complete all or part of the processing described above. In addition, the apparatus provided in the above embodiment and the corresponding method embodiment belong to the same concept; for the specific implementation process, see the method embodiment, which is not repeated here.

FIG. 5 is a schematic structural diagram of a data recommendation system provided by an embodiment of the present invention. The data recommendation system includes a terminal and a server, and the terminal is loaded with a client that can implement the data recommendation method shown in FIG. 1. As shown in FIG. 5, the client sends data to the server, where the data sent is original data or disturbance data. After receiving the data, the server determines recommendation data based on it and sends the recommendation data to the client.

For how the client implements the corresponding data recommendation method, refer to the method shown in FIG. 1; details are not repeated here.

For how the server implements the corresponding data recommendation method, refer to the method shown in FIG. 2; details are not repeated here.

FIG. 6 is a schematic structural diagram of a data recommendation apparatus provided by an embodiment of the present invention. As shown in FIG. 6, the apparatus 60 includes a processor 601 and a memory 602 for storing a computer program that can run on the processor. When running the computer program, the processor 601 performs: determining first data to be sent, the first data being behavior data collected locally by the terminal; sending second data to a server, where the second data is the first data when it is determined that the first data does not meet the conversion condition, and is disturbance data obtained based on the first data when it is determined that the first data meets the conversion condition; and receiving recommendation data determined and sent by the server according to the second data.

In an embodiment, when running the computer program, the processor 601 performs: determining a preset privacy parameter, the privacy parameter being associated with a preset degree of privacy protection; determining a perturbation probability value according to the preset privacy parameter; and performing a binary random response on the first data according to the perturbation probability value to obtain a response result, the response result indicating whether the first data is to be converted.

In an embodiment, when running the computer program, the processor 601 performs: performing a multi-value random response on the value corresponding to each of the at least one parameter, obtaining a random response result for each of the at least one parameter.

In an embodiment, when running the computer program, the processor 601 performs: performing, according to a preset privacy parameter, a multi-value random response on the value corresponding to each of the at least one parameter, obtaining a random response result for each of the at least one parameter; the privacy parameter is associated with a preset degree of privacy protection.

In an embodiment, when running the computer program, the processor 601 performs: determining the similarity between the first data and the second data.

Before sending the second data to the server, the processor performs: adding a label to the second data according to the similarity, the label indicating whether the recommendation data determined based on the second data is to be adopted.

After receiving the recommendation data determined and sent by the server according to the second data, the processor further performs: determining the label of the second data corresponding to the recommendation data; and determining a recommendation result according to that label, the recommendation result indicating whether to recommend according to the recommendation data.

Specifically, the data recommendation apparatus described above performs the method shown in FIG. 1 and belongs to the same concept as the recommendation method embodiment shown in FIG. 1; for its specific implementation process, see the method embodiment, which is not repeated here.

实际应用时，所述装置60还可以包括：至少一个网络接口603。推荐装置60中的各个组件通过总线系统604耦合在一起。可理解，总线系统604用于实现这些组件之间的连接通信。总线系统604除包括数据总线之外，还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见，在图6中将各种总线都标为总线系统604。其中，所述处理器601的个数可以为至少一个。网络接口603用于数据推荐装置60与其他设备之间有线或无线方式的通信。In practical application, the apparatus 60 may further include: at least one network interface 603 . The various components in recommender 60 are coupled together by bus system 604 . It will be appreciated that the bus system 604 is used to implement connection communication between these components. In addition to the data bus, the bus system 604 also includes a power bus, a control bus and a status signal bus. However, for clarity of illustration, the various buses are labeled as bus system 604 in FIG. 6 . The number of the processors 601 may be at least one. The network interface 603 is used for wired or wireless communication between the data recommendation apparatus 60 and other devices.

本发明实施例中的存储器602用于存储各种类型的数据以支持数据推荐装置60的操作。The memory 602 in the embodiment of the present invention is used to store various types of data to support the operation of the data recommendation apparatus 60 .

上述本发明实施例揭示的方法可以应用于处理器601中，或者由处理器601实现。处理器601可能是一种集成电路芯片，具有信号的处理能力。在实现过程中，上述方法的各步骤可以通过处理器601中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器601可以是通用处理器、数字信号处理器(DSP，DiGital Signal Processor)，或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。处理器601可以实现或者执行本发明实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者任何常规的处理器等。结合本发明实施例所公开的方法的步骤，可以直接体现为硬件译码处理器执行完成，或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于存储介质中，该存储介质位于存储器602，处理器601读取存储器602中的信息，结合其硬件完成前述方法的步骤。The methods disclosed in the above embodiments of the present invention may be applied to the processor 601 or implemented by the processor 601 . The processor 601 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above-mentioned method may be completed by an integrated logic circuit of hardware in the processor 601 or an instruction in the form of software. The above-mentioned processor 601 may be a general-purpose processor, a digital signal processor (DSP, DiGital Signal Processor), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like. The processor 601 may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in combination with the embodiments of the present invention can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium, and the storage medium is located in the memory 602, and the processor 601 reads the information in the memory 602, and completes the steps of the foregoing method in combination with its hardware.

In an exemplary embodiment, the data recommendation apparatus 60 may be implemented by one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), general-purpose processors, controllers, microcontrollers (MCU, Micro Controller Unit), microprocessors, or other electronic components, for performing the foregoing method.

An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When run by a processor, the computer program performs: determining first data to be sent, the first data being behavior data collected locally by a terminal; sending second data to a server, wherein, when it is determined that the first data does not satisfy a conversion condition, the second data is the first data, and when it is determined that the first data satisfies the conversion condition, the second data is disturbance data obtained based on the first data; and receiving recommendation data determined and sent by the server according to the second data.

In an embodiment, when run by the processor, the computer program performs: determining a preset privacy parameter, the privacy parameter being associated with a preset privacy protection degree; determining a perturbation probability value according to the preset privacy parameter; and performing a binary random response on the first data according to the perturbation probability value to obtain a response result, the response result indicating whether the first data is to be converted.
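The binary random response step can be illustrated with a minimal Python sketch. This passage does not state how the perturbation probability value is derived from the privacy parameter, so the mapping below (keep the data unchanged with probability 1 / (1 + e^(-ε)), as in classical randomized response) is an assumption, and the names `perturbation_probability` and `should_convert` are hypothetical.

```python
import math
import random


def perturbation_probability(epsilon: float) -> float:
    """Assumed mapping from the privacy parameter to the probability of
    leaving the first data unconverted: e^eps / (1 + e^eps), written in a
    form that does not overflow for large non-negative eps."""
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    return 1.0 / (1.0 + math.exp(-epsilon))


def should_convert(epsilon: float, rng: random.Random) -> bool:
    """Binary random response: True means the first data satisfies the
    conversion condition and is to be replaced by disturbance data."""
    keep_prob = perturbation_probability(epsilon)
    # rng.random() is uniform on [0, 1); convert when the draw falls
    # outside the "keep" region.
    return rng.random() >= keep_prob


if __name__ == "__main__":
    rng = random.Random(2020)
    for eps in (0.0, 1.0, 4.0):
        converted = sum(should_convert(eps, rng) for _ in range(10_000))
        print(f"epsilon={eps}: converted {converted} of 10000 records")
```

Under this assumed mapping, a smaller privacy parameter makes conversion more likely (stronger protection), and a larger one leaves most records unchanged.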

In an embodiment, when run by the processor, the computer program performs: performing a multi-value random response on the value corresponding to each parameter in the at least one parameter, to obtain a random response result corresponding to each parameter in the at least one parameter.

In an embodiment, when run by the processor, the computer program performs: performing, according to a preset privacy parameter, a multi-value random response on the value corresponding to each parameter in the at least one parameter, to obtain a random response result corresponding to each parameter in the at least one parameter; the privacy parameter is associated with a preset privacy protection degree.
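A multi-value random response over the parameters of the first data might look like the following Python sketch. The formula is not given in this passage; the sketch assumes the common k-ary randomized response, in which the true value is reported with probability e^ε / (e^ε + k - 1) and any other value of the domain is reported uniformly otherwise. The value domains, the example record, and the function names are hypothetical.

```python
import math
import random
from typing import Dict, Hashable, Mapping, Sequence


def k_ary_random_response(value: Hashable, domain: Sequence[Hashable],
                          epsilon: float, rng: random.Random) -> Hashable:
    """Report `value` with probability e^eps / (e^eps + k - 1) (assumed
    formula); otherwise report a different value drawn uniformly."""
    if value not in domain:
        raise ValueError("value must belong to its domain")
    k = len(domain)
    if k == 1:
        return value  # nothing to randomize over
    # Equivalent to e^eps / (e^eps + k - 1), without overflow for large eps.
    p_true = 1.0 / (1.0 + (k - 1) * math.exp(-epsilon))
    if rng.random() < p_true:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)


def perturb_record(first_data: Mapping[str, Hashable],
                   domains: Mapping[str, Sequence[Hashable]],
                   epsilon: float, rng: random.Random) -> Dict[str, Hashable]:
    """Apply the multi-value random response to every parameter and collect
    the per-parameter results into the second data."""
    return {name: k_ary_random_response(value, domains[name], epsilon, rng)
            for name, value in first_data.items()}


if __name__ == "__main__":
    # Hypothetical behavior record and value domains.
    domains = {"category": ["news", "music", "video", "games"],
               "rating": [1, 2, 3, 4, 5]}
    first = {"category": "music", "rating": 4}
    print(perturb_record(first, domains, epsilon=1.0, rng=random.Random(7)))
```

In this sketch the same privacy parameter is applied to each parameter independently; how a total privacy budget would be split across parameters is left open here.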

In an embodiment, when run by the processor, the computer program performs: determining the similarity between the first data and the second data.
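The similarity step, together with the labeling described in claim 5, can be sketched as follows in Python. The similarity measure is not specified in this passage; the sketch assumes the fraction of parameters whose values are unchanged, and the 0.5 threshold and all function names are hypothetical.

```python
from typing import Hashable, Mapping, Optional


def similarity(first: Mapping[str, Hashable],
               second: Mapping[str, Hashable]) -> float:
    """Assumed measure: share of parameters whose value is the same in the
    first data and the second data."""
    if not first:
        return 1.0
    unchanged = sum(1 for name in first if second.get(name) == first[name])
    return unchanged / len(first)


def label_second_data(first: Mapping[str, Hashable],
                      second: Mapping[str, Hashable],
                      threshold: float = 0.5) -> bool:
    """Label kept on the terminal before sending: True means recommendation
    data determined from this second data should be adopted."""
    return similarity(first, second) >= threshold


def recommendation_result(recommendation: object,
                          adopt: bool) -> Optional[object]:
    """After the recommendation data is received, use the stored label to
    decide whether to recommend according to it."""
    return recommendation if adopt else None


if __name__ == "__main__":
    first = {"category": "music", "rating": 4, "hour": 21, "device": "phone"}
    second = {"category": "music", "rating": 4, "hour": 9, "device": "phone"}
    adopt = label_second_data(first, second)
    print(similarity(first, second), adopt,
          recommendation_result(["song-123"], adopt))
```

Keeping the label on the terminal lets the terminal discard recommendations that were computed from heavily perturbed data.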

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or of another form.

The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, each unit may serve as a separate unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.

Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be completed by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a removable storage device, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.

Alternatively, if the integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a removable storage device, a ROM, a RAM, a magnetic disk, or an optical disc.

The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A data recommendation method, wherein the method is applied to a terminal, and the method comprises:

determining first data to be sent, the first data being behavior data collected locally by the terminal;

sending second data to a server; wherein, when it is determined that the first data does not satisfy a conversion condition, the second data is the first data; and when it is determined that the first data satisfies the conversion condition, the second data is disturbance data obtained based on the first data;

receiving recommendation data determined and sent by the server according to the second data.

2. The method according to claim 1, wherein the method further comprises: judging whether the first data satisfies the conversion condition;

wherein the judging whether the first data satisfies the conversion condition comprises:

determining a preset privacy parameter, the privacy parameter being associated with a preset privacy protection degree;

determining a perturbation probability value according to the preset privacy parameter;

performing a binary random response on the first data according to the perturbation probability value to obtain a response result, the response result indicating whether the first data is to be converted;

when the response result indicates that the first data is not to be converted, determining that the first data does not satisfy the conversion condition;

when the response result indicates that the first data is to be converted, determining that the first data satisfies the conversion condition.

3. The method according to claim 2, wherein the first data comprises: at least one parameter and a value corresponding to each parameter in the at least one parameter;

wherein converting the first data comprises:

performing a multi-value random response on the value corresponding to each parameter in the at least one parameter, to obtain a random response result corresponding to each parameter in the at least one parameter;

obtaining the second data according to the random response result corresponding to each parameter in the at least one parameter.

4. The method according to claim 3, wherein the performing a multi-value random response on the value corresponding to each parameter in the at least one parameter, to obtain a random response result corresponding to each parameter in the at least one parameter, comprises:

performing, according to a preset privacy parameter, the multi-value random response on the value corresponding to each parameter in the at least one parameter, to obtain the random response result corresponding to each parameter in the at least one parameter; the privacy parameter being associated with a preset privacy protection degree.

5. The method according to claim 1, wherein the method further comprises: determining the similarity between the first data and the second data;

before the sending the second data to the server, the method further comprises:

adding a label to the second data according to the similarity, the label indicating whether the recommendation data determined based on the second data is to be adopted;

after the receiving the recommendation data determined and sent by the server according to the second data, the method further comprises:

determining the label of the second data corresponding to the recommendation data;

determining a recommendation result according to the label of the second data corresponding to the recommendation data, the recommendation result indicating whether a recommendation is made according to the recommendation data.

6. A data recommendation device, characterized in that the device comprises: a first processing module, a second processing module, and a third processing module; wherein,

The first processing module is used to determine the first data to be sent; the first data is behavior data collected locally by the terminal;

The second processing module is configured to send the second data to the server; wherein, when it is determined that the first data does not satisfy the conversion condition, the second data is the first data; and when it is determined that the first data satisfies the conversion condition, the second data is disturbance data obtained based on the first data;

The third processing module is configured to receive recommendation data determined and sent by the server according to the second data.

7. The apparatus according to claim 6, wherein the second processing module is configured to determine a preset privacy parameter; the privacy parameter is associated with a preset privacy protection degree;

8. The apparatus according to claim 7, wherein the first data comprises: at least one parameter and a value corresponding to each parameter in the at least one parameter;

The second processing module is configured to perform a multi-value random response to the numerical value corresponding to each parameter in the at least one parameter, and obtain a random response result corresponding to each parameter in the at least one parameter;

9. The apparatus according to claim 8, wherein the second processing module is configured to perform, according to a preset privacy parameter, a multi-value random response on the value corresponding to each parameter in the at least one parameter, to obtain a random response result corresponding to each parameter in the at least one parameter; the privacy parameter is associated with a preset privacy protection degree.

10. The apparatus according to claim 6, wherein the second processing module is further configured to determine the similarity between the first data and the second data;

and the second processing module is further configured to add a label to the second data according to the similarity before sending the second data to the server; the label indicates whether the recommendation data determined based on the second data is to be adopted;

The third processing module is further configured to, after receiving the recommendation data determined and sent by the server according to the second data, determine the label of the second data corresponding to the recommendation data;

11. A data recommendation device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the method according to any one of claims 1 to 5.

12. A computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 5 are implemented.