CN109299436A - A preference ranking data collection method satisfying local differential privacy - Google Patents


Publication number
CN109299436A
Authority
CN
China
Prior art keywords
data
ordering
data collection
user terminal
preference
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811079995.XA
Other languages
Chinese (zh)
Other versions
CN109299436B (en)
Inventor
程祥
苏森
杨健宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201811079995.XA
Publication of CN109299436A
Application granted
Publication of CN109299436B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Algebra (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Computer Security & Cryptography (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a preference ranking data collection method that satisfies local differential privacy. Each user terminal converts its preference ranking data using Rule I and Rule II, adds noise to the converted data, and sends the result to the data collection platform. The data collection platform and the user terminals cooperate to run an algorithm that satisfies local differential privacy and thereby complete the construction of the full RI model; preference ranking data are then generated from the constructed model. This method keeps the utility of the collected preference ranking data high while preventing privacy leakage.

Description

A preference ranking data collection method satisfying local differential privacy
Technical field
This application relates to data collection technology, and in particular to a preference ranking data collection method that satisfies local differential privacy.
Background
Preference ranking data are a typical kind of personal data. A user's preference ranking data are the ordering that the user gives to the items of a given item set according to how much the user likes each item. For example, given the item set {cola, baijiu, Sprite, water, beer}, a user whose preference ranking over these five items is <baijiu, cola, beer, Sprite, water> likes baijiu the most and water the least. With the rapid development of information technologies such as the mobile Internet and cloud computing, and the growing popularity of mobile terminals such as smartphones, it has become common for users to share their preference ranking data, through applications on their mobile devices, with various data collectors (for example, service providers) in exchange for personalized services. For service providers, in turn, collecting and analyzing users' preference ranking data is essential for providing a better user experience and creating new revenue opportunities. However, preference ranking data usually contain extremely sensitive personal information, and collecting them directly may cause serious leakage of individual privacy.
Fig. 1 is a schematic diagram of the current scenario for collecting user preference data. The scenario mainly involves two roles: users (i.e., data contributors) and the data collector. Given an item set χ = {x_1, x_2, ..., x_d} consisting of d items, each user u_i (1 ≤ i ≤ n) holds one piece of preference ranking data σ_i = <σ_i(1), σ_i(2), ..., σ_i(d)>, and users are independent of one another. Here σ_i(j) = x_k means that x_k is ranked j-th in σ_i. The data collector uses a data collection platform to collect each user's preference ranking data over the network, obtains a preference ranking data set, and builds a model of the preference ranking data from it. New preference ranking data can be generated from this model. The generated data have the same statistical properties as the users' original preference ranking data, yet the original data are never released directly, which protects user privacy to some extent. The data collector can analyze the resulting preference ranking data model directly, or open the model, or the new preference ranking data it generates, to third parties (for example, research institutions).
As the above shows, when user preference ranking data are collected in this way, users of the model and of the new preference ranking data cannot obtain private user information. Before the preference ranking data model is formed, however, a user's private preference data may still be leaked. Specifically, for each user, three kinds of parties still threaten the user's privacy: 1) the data collector; 2) other users; 3) any potential attacker other than the data collector and other users.
Privacy-preserving preference ranking data collection techniques offer a feasible solution to the individual privacy leakage caused by collecting preference ranking data. Local differential privacy (LDP), proposed in recent years, is a differential privacy technique designed specifically to address the individual privacy leakage caused by data collection. In particular, it requires each data contributor first to add suitable noise to the data it holds and then to send the noisy data to the data collector, thereby protecting the data contributor's privacy.
A few works have studied data collection under local differential privacy. Based on information theory, Duchi et al. proposed a high-dimensional data collection method for mean computation and statistical risk minimization tasks. Extending that method with sampling techniques, a later work proposed a data collection method called Harmony. For each high-dimensional record, Harmony randomly selects one dimension of the data; if that dimension is continuous, it is collected with the method of Duchi et al.; if it is discrete, it is collected with the SH mechanism. To obtain the frequent items of multi-dimensional data, Qin et al. proposed a two-stage data collection method called LDPMiner. In the first stage, the method uses the SH mechanism to determine a preliminary candidate space of frequent items from the noisy data; in the second stage, it uses the RAPPOR mechanism to obtain accurate frequent items. Based on the EM (Expectation Maximization) algorithm, Fanti et al. proposed an extended RAPPOR mechanism. The mechanism assumes that the dimensions of high-dimensional data are mutually independent, collects each dimension with RAPPOR, and feeds the collected data to the EM algorithm to infer the joint distribution of the overall data, which can then be used to generate data resembling the original. When the data dimension is high, however, this mechanism has both high time complexity and slow convergence. To address this problem, Ren et al. combined the EM algorithm with Lasso regression in a new method that substantially improves the efficiency of the extended RAPPOR mechanism.
Local differential privacy algorithms could be applied to preference ranking data directly, as follows: assume a value space consisting of all possible preference rankings, treat each user's preference ranking as one discrete value in that space, and collect the data with a one-dimensional, multi-valued data collection method that satisfies local differential privacy, such as RAPPOR, SH, or OLH. However, the converted data have a huge value space: for a given item set χ = {x_1, x_2, ..., x_d}, the value space has size d!. These algorithms therefore put so much noise into the collected data that the resulting preference ranking data are unusable.
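As a rough illustration of this blow-up, the following Python sketch counts the number of possible rankings of d items, which is the number of values a one-dimensional collector would have to distinguish. The function name `ranking_space_size` is a hypothetical helper introduced here, not part of the application.

```python
import math


def ranking_space_size(d: int) -> int:
    """Number of distinct full rankings of d items, i.e. d! (hypothetical helper)."""
    return math.factorial(d)


if __name__ == "__main__":
    for d in (5, 10, 20):
        # For d = 5 (the five-drink example) there are already 120 rankings.
        print(d, ranking_space_size(d))
```

At d = 20 the value space already exceeds 2 × 10^18, far more values than any realistic number of users could populate.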
Summary of the invention
The application provides a preference ranking data collection method that satisfies local differential privacy. Collecting preference ranking data with this method keeps the utility of the collected data high while preventing privacy leakage.
To achieve the above object, the application adopts the following technical solution:
A preference ranking data collection method satisfying local differential privacy, comprising:
The data collection platform initializes every vector z_j in a first vector set to the zero vector and, for each user terminal u_i, selects a preference item index j from the preset preference item set and sends it to that user terminal; here i is the user terminal index and j is the index of a preference item in the preference item set;
Each user terminal uses its user's own preference ranking data to assign a value to t_i[A_j] for every attribute A_j of an attribute set, forming a tuple t_i; then, according to the preference item index j sent to it by the data collection platform, it generates a value index from t_i[A_j] and sends the value index to the data collection platform. Here A_j is the j-th attribute in the attribute set and t_i[A_j] denotes the value of A_j in t_i; the number of attributes in the tuple equals the number of preference items in the preference ranking data, attributes and preference items are in one-to-one correspondence, and the value of each attribute equals the rank of the corresponding preference item. The value index satisfies the condition [formula missing], where k ∈ {1, 2, ..., |dom(A_j)|}, I(t_i[A_j]) denotes the index of t_i[A_j] in dom(A_j), and dom(A_j) denotes the value space of attribute A_j;
Using the value index sent by each user terminal, the data collection platform adds 1 to the corresponding value in the first vector set;
The data collection platform updates each value z_j[k] of every vector in the first vector set to [formula missing], where ε′ is the preset first privacy budget;
From the first vector set, the data collection platform determines two further sets of distributions [symbols missing]; using the first vector set and these two sets of distributions, it computes the mutual information of every triple in the preference item set, constructs a K-thin chain, and sends the K-thin chain to each user terminal;
The data collection platform initializes every vector z_j′ in a second vector set to the zero vector and, for each user terminal u_i, selects a preference item index j from the preset preference item set and sends it to that user terminal; here i is the user terminal index, j is the index of a preference item in the preference item set, and the indexes selected for different user terminals may be the same or different;
Each user terminal uses its user's own preference ranking data to assign a value to t_i′[A_j′] for every attribute of a second attribute set, forming a tuple t_i′; then, according to the preference item index j sent to it by the data collection platform, it generates a value index from t_i′[A_j′] and sends the value index to the data collection platform. Here the attribute set consists of two subsets, one corresponding to the set of leaf item sets of the K-thin chain and the other to the set of internal item sets of the K-thin chain, and the value index satisfies the condition [formula missing];
Using the value index sent by each user terminal, the data collection platform adds 1 to the corresponding value in the second vector set;
The data collection platform updates each value z_j′[k] of every vector in the second vector set to [formula missing], where ε″ is the preset second privacy budget, ε′ + ε″ = ε, and ε is the overall privacy budget for building the RI model;
The leaf-node distribution information and the internal-node distribution information of the K-thin chain are obtained from the second vector set;
Preference ranking data are generated using the RI model that comprises the leaf-node distribution information and the internal-node distribution information.
Preferably, the method further comprises: generating a specified number of preference ranking records according to the mutual information of the triples and the leaf-node and internal-node distribution information of the K-thin chain.
Preferably, the data collection platform determining [symbol missing] from the first vector set comprises:
For each distribution in [symbol missing], constructing a Lasso regression model [formula missing], in which the response is a column vector of length 2d that stores the information of two distributions [symbols missing], the design matrix is a binary matrix of size 2d × d(d-1), and the coefficient vector is a column vector of length d(d-1) that stores the information of a joint distribution [symbol missing];
Solving the Lasso regression model by least angle regression to estimate the coefficient vector and thereby determine the joint distribution, and then computing the required distribution [symbol missing] from the joint distribution.
Preferably, the data collection platform determining [symbol missing] from the first vector set comprises:
For each distribution in [symbol missing], constructing a Lasso regression model [formula missing], in which the response is a column vector of length (d+2) that stores the information of two distributions [symbols missing], the design matrix is a binary matrix of size (d+2) × 2d, and the coefficient vector is a column vector of length 2d that stores the information of a joint distribution [symbol missing];
Solving the Lasso regression model by least angle regression to estimate the coefficient vector and thereby determine the joint distribution [symbol missing].
Preferably,
As the above technical solution shows, in this application each user terminal converts its preference ranking data using Rule I and Rule II, adds noise to the converted data, and sends the result to the data collection platform. The data collection platform and the user terminals cooperate to run an algorithm that satisfies local differential privacy and complete the construction of the full RI model; the established RI model is then used to generate preference ranking data that satisfy local differential privacy. This method keeps the utility of the collected preference ranking data high while preventing privacy leakage.
Brief description of the drawings
Fig. 1 is a schematic diagram of the current scenario for collecting user preference data;
Fig. 2 is a schematic diagram of an example 2-thin chain;
Fig. 3 is the first performance comparison diagram of this application;
Fig. 4 is the second performance comparison diagram of this application;
Fig. 5 is the third performance comparison diagram of this application.
Detailed description of the embodiments
To make the purpose, technical means, and advantages of the application clearer, the application is described in further detail below with reference to the drawings.
To solve the problem mentioned in the background, namely that the data become unusable when local differential privacy methods are applied directly to preference ranking data, the applicant proposes a preference ranking data algorithm that satisfies local differential privacy (the SAFARI algorithm). Its main idea is that the data collector collects distribution information over a series of small value spaces selected according to the riffle independent model (RI model), uses that distribution information to approximate the overall distribution of the preference ranking data and build a model, and generates preference ranking data from the model. Because SAFARI handles several small value spaces rather than one large value space, it greatly reduces the scale of the noise.
The processing of the application is described in detail below.
Preference ranking data can currently be modeled with the RI model. Exploiting the mutual exclusivity among the dimensions of preference ranking data, the RI model approximates the overall distribution of the data by the product of two kinds of low-dimensional distributions, relative ranking distributions and interleaving distributions, and thus models preference ranking data effectively. The established model is then used to generate new preference ranking data, which protects user privacy.
The structure of the RI model is a binary tree called a K-thin chain. The root node represents the original item set, every other node represents a subset of the original item set, and the item set of each leaf node contains at most K items, where K is a constant. Fig. 2 is an example of a 2-thin chain.
In this example, the original item set {cola, baijiu, Sprite, water, beer} is first divided into two disjoint sub-item-sets that are riffle independent of each other, namely {water} and {cola, baijiu, Sprite, beer}. Because the size of {cola, baijiu, Sprite, beer} exceeds 2 (the value of K), it is further divided into two disjoint, riffle independent sub-item-sets, {cola, Sprite} and {baijiu, beer}.
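The 2-thin chain of this example can be written down as a small binary tree. The Python sketch below is only an illustration under assumptions: the class `Node` and the function `is_k_thin_chain` are hypothetical names, the five drinks are given English names (cola, baijiu, Sprite, water, beer), and the check covers just the two structural properties stated in the text (the two child item sets are disjoint and together make up the parent; every leaf holds at most K items). It does not test riffle independence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of the tree: an item set plus optional left/right children."""
    items: frozenset
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def is_k_thin_chain(node: Node, k: int) -> bool:
    """Check the structural properties only (hypothetical helper)."""
    if node.is_leaf():
        return len(node.items) <= k
    if node.left is None or node.right is None:
        return False  # an internal node must have exactly two children
    disjoint = not (node.left.items & node.right.items)
    covers = (node.left.items | node.right.items) == node.items
    return (disjoint and covers
            and is_k_thin_chain(node.left, k)
            and is_k_thin_chain(node.right, k))


# The tree of Fig. 2 as described in the text.
EXAMPLE_CHAIN = Node(
    frozenset({"cola", "baijiu", "Sprite", "water", "beer"}),
    left=Node(frozenset({"water"})),
    right=Node(
        frozenset({"cola", "baijiu", "Sprite", "beer"}),
        left=Node(frozenset({"cola", "Sprite"})),
        right=Node(frozenset({"baijiu", "beer"})),
    ),
)

if __name__ == "__main__":
    print(is_k_thin_chain(EXAMPLE_CHAIN, 2))  # leaves have at most 2 items
    print(is_k_thin_chain(EXAMPLE_CHAIN, 1))  # two leaves hold 2 items each
```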
The learning process of the RI model comprises two stages, structure learning and parameter learning:
1) Structure learning. First, the mutual information of every triple in the item set is computed. It is defined as follows:
Given an item set {x_1, x_2, ..., x_d}, for any triple [symbol missing] in the item set, whose components are three different items of the item set, the mutual information of the triple is [formula missing],
where [symbol missing] denotes the rank of an item and [symbol missing] is a binary variable: one of its values means that the first of two items is ranked before the second, and the other value means that it is ranked after the second.
Then, according to the mutual information of the triples, a K-thin chain is constructed over the original item set with the anchor point algorithm.
2) Parameter learning. Given the constructed K-thin chain, the distribution of each node is learned in order to approximate the overall distribution of the original preference ranking data set. The distributions of leaf nodes are called relative ranking distributions, and the distributions of internal nodes (including the root node) are called interleaving distributions.
Once the relative ranking distributions and interleaving distributions have been determined by the above process, the modeling of the RI model is complete.
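To make the structure-learning quantity concrete, the sketch below computes an empirical mutual information between the rank of one item and the binary relative order of two other items, which is how the text above describes the triple mutual information. It is a non-private illustration on raw rankings: the function name `triple_mutual_information` and the toy data are assumptions, and in the application the required distributions are instead estimated from noisy reports.

```python
import math
from collections import Counter


def triple_mutual_information(rankings, a, b, c):
    """Empirical I(rank of a ; [b ranked before c]) in nats (hypothetical helper).

    rankings: list of sequences, each a full ranking from most to least liked.
    """
    n = len(rankings)
    joint = Counter()
    for sigma in rankings:
        rank_a = sigma.index(a) + 1                    # 1-based rank of item a
        b_before_c = sigma.index(b) < sigma.index(c)   # binary order variable
        joint[(rank_a, b_before_c)] += 1

    # Marginal counts of the two variables.
    rank_counts, order_counts = Counter(), Counter()
    for (rank_a, order), count in joint.items():
        rank_counts[rank_a] += count
        order_counts[order] += count

    mi = 0.0
    for (rank_a, order), count in joint.items():
        p_xy = count / n
        p_x = rank_counts[rank_a] / n
        p_y = order_counts[order] / n
        mi += p_xy * math.log(p_xy / (p_x * p_y))
    return mi


if __name__ == "__main__":
    # Rank of "a" fully determines whether "b" precedes "c": MI = ln 2.
    toy = [("a", "b", "c"), ("c", "b", "a")]
    print(triple_mutual_information(toy, "a", "b", "c"))
```

A value near zero suggests that the rank of the first item carries no information about the order of the other two, which is the kind of signal a structure-learning step can use when deciding how to split the item set.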
The preference ranking data collection method of this application is based on the RI model and generates preference ranking data from the established RI model. The difference is that, in the modeling process, the acquisition of the triple mutual information and the acquisition of the relative ranking and interleaving distributions both satisfy local differential privacy.
The preference ranking data algorithm satisfying local differential privacy in this application (the SAFARI algorithm) involves two rules, Rule I and Rule II, and the SAFA algorithm, and comprises five stages:
Stage 1
1. Each user converts his or her own preference ranking data according to a conversion rule (denoted Rule I), so that the data collection platform can obtain the distribution information needed to compute the triple mutual information. The content of this distribution information is described in detail later.
Stage 2
1. Using the privacy budget ε′, the data collection platform calls the SAFA algorithm and, with the users' cooperation, collects from the data the users converted in stage 1 the distribution information needed to compute the triple mutual information. Here ε′ characterizes the strength of privacy protection. In the SAFA algorithm, each user adds noise to the converted data before sending them to the data collection platform, which prevents privacy leakage.
2. The data collection platform uses the collected distribution information to compute the mutual information of all triples in the RI model.
3. The data collection platform constructs the K-thin chain.
4. The data collection platform sends the K-thin chain to each user.
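The perturbation and correction formulas of SAFA are not preserved in this text, so the following Python sketch substitutes generalized randomized response (GRR), a standard local differential privacy frequency estimator, purely as an assumed stand-in. It mirrors the outline given here: the user reports a noisy value index for one attribute, the platform counts the reports, and then corrects the counts using the privacy budget. The names `grr_perturb` and `grr_estimate`, and the 0-based indexes, are choices made for this illustration and are not SAFA's actual definition.

```python
import math
import random


def grr_perturb(true_index: int, m: int, epsilon: float, rng: random.Random) -> int:
    """User side: keep the true index with probability p, else report another one.

    m is the size of the attribute's value space (m >= 2), epsilon > 0.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + m - 1)
    if rng.random() < p:
        return true_index
    other = rng.randrange(m - 1)  # uniform over the m - 1 remaining indexes
    return other if other < true_index else other + 1


def grr_estimate(reports, m: int, epsilon: float):
    """Platform side: count the reports, then debias the observed frequencies."""
    p = math.exp(epsilon) / (math.exp(epsilon) + m - 1)
    q = (1.0 - p) / (m - 1)
    counts = [0] * m
    for k in reports:
        counts[k] += 1            # analogous to adding 1 to the reported entry
    n = len(reports)
    return [(c / n - q) / (p - q) for c in counts]


if __name__ == "__main__":
    rng = random.Random(0)
    truth = [0] * 700 + [1] * 200 + [2] * 100   # toy true value indexes
    reports = [grr_perturb(v, 3, 1.0, rng) for v in truth]
    print(grr_estimate(reports, 3, 1.0))        # noisy estimate of (0.7, 0.2, 0.1)
```

Under GRR the corrected frequencies always sum to 1, and a smaller value space m means a larger probability of reporting the truth for the same budget, which is in line with the motivation for collecting over several small value spaces.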
Stage 3
1. Each user converts his or her own preference ranking data according to another conversion rule (denoted Rule II), so that the data collection platform can determine the relative ranking distribution information and the interleaving distribution information of the K-thin chain.
Stage 4
1. Using the privacy budget ε″, the data collector calls the SAFA algorithm and, with the users' cooperation, collects the relative ranking distribution information and interleaving distribution information of the K-thin chain from the data the users converted in stage 3. Here ε″ characterizes the strength of privacy protection. In the SAFA algorithm, each user adds noise to the converted data before sending them to the data collection platform, which prevents privacy leakage.
At this point, with the relative ranking distribution information and interleaving distribution information obtained, the construction of the RI model is complete.
Stage 5
According to the constructed riffle independent model, the data collector generates n new preference ranking records.
The processing in stages 1 to 4 above accomplishes RI modeling of preference ranking data under local differential privacy. The data collection platform can open the completed RI model to third parties; alternatively, to better provide preference ranking data to third parties, it can preferably go on to generate new preference ranking data through stage 5 and open those data to third parties.
In addition, in the above data collection method, local differential privacy processing is carried out in two parts, stage 2 and stage 4. Therefore, given an overall privacy budget ε, ε′ + ε″ = ε; preferably, in practical applications one usually takes [formula missing].
Below we introduce, in turn, Rule I, Rule II, and the SAFA method used in the above data collection method satisfying local differential privacy.
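Stage 5 can be pictured as sampling from the learned tree: draw a relative ranking at each leaf, draw an interleaving at each internal node, and merge bottom-up. The Python sketch below assumes a nested-tuple encoding of the K-thin chain (a leaf is a frozenset, an internal node is a pair (left, right)) and dictionary-based distributions. All names are hypothetical, and the distributions are toy values rather than anything estimated under local differential privacy.

```python
import random


def sample_ranking(node, leaf_dists, inter_dists, rng):
    """Draw one full ranking of the items under `node` (hypothetical helper)."""
    if isinstance(node, frozenset):            # leaf item set
        if len(node) == 1:
            return list(node)                  # a single item has one ranking
        orders, weights = zip(*leaf_dists[node].items())
        return list(rng.choices(orders, weights=weights)[0])

    left, right = node                         # internal item set
    left_rank = iter(sample_ranking(left, leaf_dists, inter_dists, rng))
    right_rank = iter(sample_ranking(right, leaf_dists, inter_dists, rng))
    patterns, weights = zip(*inter_dists[node].items())
    pattern = rng.choices(patterns, weights=weights)[0]
    # "L" takes the next item from the left child's ranking, "R" from the right.
    return [next(left_rank) if side == "L" else next(right_rank) for side in pattern]


# Toy model on the five-drink example; every distribution is degenerate.
INNER = (frozenset({"cola", "Sprite"}), frozenset({"baijiu", "beer"}))
CHAIN = (frozenset({"water"}), INNER)
LEAF_DISTS = {
    frozenset({"cola", "Sprite"}): {("cola", "Sprite"): 1.0},
    frozenset({"baijiu", "beer"}): {("baijiu", "beer"): 1.0},
}
INTER_DISTS = {
    CHAIN: {("R", "R", "R", "R", "L"): 1.0},
    INNER: {("R", "L", "R", "L"): 1.0},
}

if __name__ == "__main__":
    print(sample_ranking(CHAIN, LEAF_DISTS, INTER_DISTS, random.Random(0)))
```

With these degenerate toy distributions the sampler always returns the ranking baijiu, cola, beer, Sprite, water; with estimated distributions, calling it n times would yield n synthetic records.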
Design of Rule I
As described above, Rule I converts the user's preference ranking data, and the converted data are used to compute the mutual information of triples; the design of Rule I must therefore follow the way triple mutual information is computed. Specifically, by the definition of triple mutual information, to compute the mutual information of any possible triple (x_i, x_j, x_k) the data collection platform needs to collect three types of distribution information: [formulas missing].
To accomplish this, an intuitive conversion is to let each user convert his or her preference ranking data so as to provide information about all three types of distributions. In particular, each user converts the preference ranking data into a tuple containing multiple attributes, where each attribute corresponds to one of these distributions.
However, this conversion makes the converted user data contain a large amount of redundancy and increases the conversion complexity, because [formula missing].
In fact, the data collector only needs to collect the distributions of one of the three types and can then derive the information of the other two types from them. Each user therefore only needs to convert the preference ranking data so as to provide the information of the distributions of that one type. Because this type comprises O(d^3) different distributions, the tuple each user produces contains O(d^3) attributes.
Unfortunately, when d is relatively large, the curse of dimensionality means that, under the LDP constraint, such a conversion leaves a large amount of noise in the data the data collector gathers. To solve this problem, we designed Rule I. Under this conversion rule, each user only needs to convert the preference ranking data so as to provide the information of a single type of distribution. The data collection platform collects only the distributions of that type and then estimates the information of the other two types from them with a regression model. In particular, the applicant found that this estimation problem is a sparse linear regression problem, and therefore chose the Lasso regression model, which solves it effectively. The details of Rule I, and how the Lasso regression model is used to estimate the other two types of distributions, are introduced below.
Rule I: Each user terminal u_i first converts the preference ranking data σ_i of its user into a tuple t_i over all attributes of an attribute set. Each attribute A_j in the attribute set corresponds to one item, and the value space of A_j is dom(A_j) = {1, 2, ..., d}. dom(A_j) consists of d possible values, which represent the possible absolute ranks of that item. Then, for each attribute A_j in the attribute set, each user u_i assigns a value to t_i[A_j] according to σ_i.
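The conversion in Rule I amounts to replacing a ranking (items listed from most to least liked) with the vector of each item's rank. A minimal Python sketch, assuming that attribute A_j corresponds to the j-th item of a fixed item order shared by the platform and all terminals; the function name `rule_one` and the English drink names are illustrative assumptions:

```python
def rule_one(sigma, items):
    """Convert ranking sigma into a tuple t with t[j] = rank of items[j] (1-based).

    sigma: the user's ranking, most liked item first.
    items: the fixed item order that defines the attributes A_1..A_d.
    """
    rank_of = {item: position + 1 for position, item in enumerate(sigma)}
    return [rank_of[item] for item in items]


if __name__ == "__main__":
    items = ["cola", "baijiu", "Sprite", "water", "beer"]
    sigma = ["baijiu", "cola", "beer", "Sprite", "water"]
    print(rule_one(sigma, items))  # [2, 1, 4, 5, 3]
```

Each of the d attributes takes one of only d values, so a per-attribute frequency estimate works over a value space of size d rather than d!.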
Distribution estimation using Lasso. Based on the collected distributions, the data collection platform can estimate the information of the other two types of distributions as follows.
First, the data collection platform estimates one of the two remaining types from the collected distributions. In particular, for each distribution of that type, the data collector constructs a Lasso regression model [formula missing], in which:
1) the response is a column vector of length 2d that stores the information of two distributions [symbols missing];
2) the design matrix is a binary matrix of size 2d × d(d-1);
3) the coefficient vector is a column vector of length d(d-1) that stores the information of a joint distribution [symbol missing].
By solving the Lasso regression model with least angle regression, the data collection platform can estimate the coefficient vector and thus obtain the information of the joint distribution. From this joint distribution, the data collection platform can compute the information of the required distribution [symbol missing].
Then, the data collector estimates the last type from the distributions obtained so far. In particular, for each distribution of that type, the data collector constructs a Lasso regression model [formula missing], in which:
1) the response is a column vector of length (d+2) that stores the information of two distributions [symbols missing];
2) the design matrix is a binary matrix of size (d+2) × 2d;
3) the coefficient vector is a column vector of length 2d that stores the information of a joint distribution [symbol missing].
Similarly, by solving this Lasso regression model with least angle regression, the data collector can estimate the coefficient vector and thus obtain the information of the joint distribution.
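For reference, Lasso regression in its generic form solves the problem below. The symbols y (the known vector built from already available distributions), M (the binary matrix), β (the unknown joint-distribution vector), and the regularization weight λ are placeholder names assumed here; the application's own notation is not preserved in this text.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\; \tfrac{1}{2}\,\lVert y - M\beta \rVert_2^2 \;+\; \lambda\,\lVert \beta \rVert_1
```

The ℓ1 penalty drives many entries of β to zero, which suits a sparse linear regression problem, and least angle regression is one standard way to compute Lasso solutions.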
Need exist for explanation a bit, in above-mentioned Rule I,In each attribute AjIt is corresponded with preferences, the two Corresponding relationship is consistent in data collection platform and subscriber terminal side needs.That is, data collection is flat in SAFARI Platform and each user terminal need to guarantee to run identical Rule I.In addition, by above-mentioned processing as it can be seen that in the transformation rule, The number of attributes for including in tuple after each user's conversion is O (d).For attribute setIn each attribute Aj, it Valued space size | dom (Aj) | it is only d, it is evident that this value is much smaller than d!.Gathered by estimation.In each attribute The frequency of any one possible value, data collector can obtainIn distributed intelligence, then estimate accordinglyWith The information of middle distribution.Such processing will not generate mass of redundancy data, so as to effectively improve the effect of user preference data With.
Design of Rule II
After the K-thin chain has been constructed, the data collection platform needs to collect the relative ranking distributions (Relative Ranking Distributions) and the interleaving distributions (Interleaving Distributions) of the chain. Rule II is designed for this purpose. Under this transformation rule, each user terminal only needs to transform its user's preference ranking data once in order to provide information about both types of distributions.
Rule II: each user terminal u_i first transforms its user's preference ranking data σ_i into a tuple t_i that contains all attributes of the attribute set. Specifically, the attribute set consists of two subsets, where:
1) The first subset corresponds to the set of leaf itemsets, excluding a subset of it that consists of the leaf itemsets containing only one item. Because the relative ranking distribution of each leaf itemset in that excluded subset is easy to infer, users are not required to provide information about it.
Each attribute A_j in the first subset corresponds to a leaf itemset l_k. The value space of A_j consists of all relative rankings of l_k. In particular, when K is 1, every leaf itemset contains only one item, so the first subset is empty; here K denotes the maximum number of items contained in a leaf itemset of the chain.
2) The second subset corresponds to the set of internal itemsets.
Each attribute A_j in the second subset corresponds to an internal itemset g_k in the set of internal itemsets. The value space of A_j consists of all interleavings of g_k. Then, for each attribute A_j in the attribute set, each user u_i assigns a value to t_i[A_j] according to σ_i.
It should be noted that, in Rule II above, each attribute A_j in the attribute set corresponds one-to-one with a leaf itemset or an internal itemset, and this correspondence must be consistent between the data collection platform and the user terminals. That is, in SAFARI, the data collection platform and every user terminal must run the same Rule II. In addition, as the above processing shows, under this transformation rule the number of attributes contained in each user's transformed tuple is O(d). For each attribute A_j in the first subset, the maximum size of its value space is K!; the attributes in the second subset likewise have a limited maximum value space. The maximum value space of any attribute in the attribute set is therefore clearly much smaller than d!. By estimating the frequency of every possible value of each attribute in the set, the data collector can obtain the distribution information of the chain. Such processing does not generate a large amount of redundant data, and can therefore effectively improve the utility of the user preference data.
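The two kinds of attribute values in Rule II can be illustrated with a short sketch. It assumes the usual notions from riffle-independent ranking models: the relative ranking of an itemset is the full ranking restricted to that itemset, and the interleaving of an internal itemset records, position by position, which of its two parts each item comes from. The function names, the "L"/"R" labels, and the representation of an internal itemset as a pair of item sets are hypothetical.

```python
def relative_ranking(ranking, itemset):
    """Restrict a full ranking to the items of one leaf itemset."""
    return tuple(item for item in ranking if item in itemset)


def interleaving(ranking, left, right):
    """Record how the items of two parts are interleaved in a ranking.

    Items outside both parts are ignored. Each remaining position is
    labeled "L" if its item belongs to `left`, otherwise "R".
    """
    return tuple(
        "L" if item in left else "R"
        for item in ranking
        if item in left or item in right
    )


sigma = ["a", "d", "b", "c"]
leaf_value = relative_ranking(sigma, {"a", "b"})
internal_value = interleaving(sigma, {"a", "b"}, {"c", "d"})
```

A user terminal would compute one such value per attribute and store it in t_i[A_j].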
SAFA method
In the data collection process of the present application described above, both stage 2 and stage 4 require SAFA processing of the data transformed by the user terminals. SAFA processing is described in detail here.
In order to collect the distribution information needed to build the RI model, the data collection platform needs to estimate, under the condition that LDP is satisfied, the frequency of every possible value of each attribute in the users' transformed tuples.
The data collection platform could complete this task by directly calling the Harmony method, currently the state-of-the-art method for analyzing multi-attribute data under LDP. Specifically, for each attribute A_j in the attribute set, the data collector maps the value space of A_j to a binary matrix Φ_j of size |dom(A_j)| × |dom(A_j)|. Then, for each user u_i, the data collector randomly selects one attribute from the set (say A_r) and calls the SH algorithm [11] to collect the value of A_r in u_i's tuple t_i.
It can be observed that, whether Rule I or Rule II is used, the value spaces of all attributes in the transformed attribute set are much smaller than d!. However, even for attributes with a small value space, the Harmony method still maps the value space of every dimension into a matrix, which causes the collected data to contain unnecessary noise, especially when binary attributes are handled. The literature has pointed out that the generalized randomized response algorithm performs best when the frequencies of a small number of discrete values are estimated. We therefore propose a new LDP algorithm, named Sampling Randomizer for Multiple Attributes (SAFA), which performs frequency estimation on multi-attribute data with small value spaces more accurately while satisfying LDP. The main idea of this method is that each user terminal randomly selects one attribute, perturbs the value of that attribute with the generalized randomized response algorithm, and sends the perturbed result to the data collection platform.
As mentioned above, the SAFA algorithm is needed in stage 2 and stage 4, and it is a process completed cooperatively by the user terminals and the data collection platform. The SAFA algorithm applied in stage 2 and in stage 4 is the same; only the distribution information to be obtained through it differs. The processing of the SAFA algorithm is therefore introduced once below.
The detailed procedure of SAFA is as follows:
1. The data collection platform initializes all vectors in the vector set, i.e., sets every value in each vector to 0. Here, for stage 2, the vector set is the set of distributions collected from the tuples transformed under Rule I; for stage 4, the vector set is the union of the relative ranking distribution set and the interleaving distribution set;
2. For each user terminal u_i, the following operations are performed:
2.1. The data collection platform randomly selects an index j of an attribute in the attribute set obtained by the Rule I or Rule II transformation;
2.2. The data collection platform sends j to u_i;
2.3. u_i generates a noisy value index k ∈ {1, 2, ..., |dom(A_j)|} by perturbing the index of t_i[A_j] with the generalized randomized response algorithm;
2.4. u_i sends the noisy value index to the data collector;
2.5. The data collector increases the entry of z_j at the received index by 1;
After the above operations have been performed for all user terminals, the following operations are performed:
3. For each vector z_j in the vector set, the following processing is performed:
3.1. The data collector sets a first probability;
3.2. The data collector sets a second probability;
3.3. Each value z_j[k] in vector z_j is updated using these two probabilities.
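The SAFA steps above can be sketched as follows. This is a minimal illustration, not the exact procedure of the application: it assumes the perturbation in step 2.3 is standard generalized randomized response, with keep probability p = e^ε / (e^ε + |dom(A_j)| - 1) and switch probability q = 1 / (e^ε + |dom(A_j)| - 1), and it assumes the update in step 3.3 is the standard unbiased estimate (count / n_j - q) / (p - q), where n_j is the number of users sampled for attribute j. The function names, 0-based indices, and the single-process simulation of terminals and platform are all hypothetical.

```python
import math
import random


def grr_perturb(true_index, domain_size, epsilon, rng):
    """Generalized randomized response over indices 0..domain_size-1."""
    p = math.exp(epsilon) / (math.exp(epsilon) + domain_size - 1)
    if rng.random() < p:
        return true_index  # keep the true index with probability p
    # Otherwise report one of the other indices uniformly at random.
    other = rng.randrange(domain_size - 1)
    return other if other < true_index else other + 1


def safa(tuples, domain_sizes, epsilon, rng):
    """Simulate SAFA; requires epsilon > 0. Returns frequency estimates."""
    counts = [[0] * size for size in domain_sizes]   # step 1
    sampled = [0] * len(domain_sizes)
    for t in tuples:                                  # step 2
        j = rng.randrange(len(domain_sizes))          # steps 2.1 and 2.2
        k = grr_perturb(t[j], domain_sizes[j], epsilon, rng)  # 2.3 and 2.4
        counts[j][k] += 1                             # step 2.5
        sampled[j] += 1
    estimates = []
    for j, size in enumerate(domain_sizes):           # step 3
        p = math.exp(epsilon) / (math.exp(epsilon) + size - 1)
        q = 1.0 / (math.exp(epsilon) + size - 1)
        n_j = max(sampled[j], 1)
        estimates.append([(c / n_j - q) / (p - q) for c in counts[j]])
    return estimates


rng = random.Random(0)
data = [(2, 0)] * 200            # every user holds indices (2, 0)
est = safa(data, [3, 2], epsilon=30.0, rng=rng)
```

Because each user reports on only one attribute, the whole privacy budget is spent on a single value, which is the design choice the text motivates for small value spaces.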
The above is the specific processing of the SAFA method. To show that the SAFA method in this application satisfies local differential privacy, a theoretical proof is given below.
Theorem: for any user u_i and privacy budget ε′, SAFA satisfies ε′-LDP.
Proof:
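By the definition of LDP, for any two different tuples t_i and t′_i and any output consisting of a noisy value index together with the attribute index j selected by the data collector, we need to prove that the probability of this output given t_i is at most e^{ε′} times its probability given t′_i.
Because j is selected at random, the probability of selecting j does not depend on the tuple, so the requirement reduces to a bound, referred to as (1), on the ratio of the two perturbation probabilities for attribute A_j.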
We discuss (1) in all four possible cases. Each case is defined by a pair of conditions, and in each of Case 1, Case 2, Case 3 and Case 4 the bound in (1) holds.
In conclusion, the required inequality holds in every case. The theorem is therefore proved.
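The case analysis can be written out as a worked sketch, assuming the perturbation is standard generalized randomized response. The symbols p, q, m and \hat{k} are notation introduced here for illustration and are not taken from the source; I(·) is the value index used in the claims.

```latex
% Assumed GRR probabilities for attribute A_j with m = |dom(A_j)| values
p = \frac{e^{\varepsilon'}}{e^{\varepsilon'} + m - 1}, \qquad
q = \frac{1}{e^{\varepsilon'} + m - 1}, \qquad
\frac{p}{q} = e^{\varepsilon'} .

% Ratio of the probabilities of reporting index \hat{k} for the same j
\frac{\Pr[\hat{k} \mid t_i]}{\Pr[\hat{k} \mid t'_i]} =
\begin{cases}
p/p = 1, & \hat{k} = I(t_i[A_j]) \text{ and } \hat{k} = I(t'_i[A_j]),\\[2pt]
p/q = e^{\varepsilon'}, & \hat{k} = I(t_i[A_j]) \text{ and } \hat{k} \neq I(t'_i[A_j]),\\[2pt]
q/p = e^{-\varepsilon'}, & \hat{k} \neq I(t_i[A_j]) \text{ and } \hat{k} = I(t'_i[A_j]),\\[2pt]
q/q = 1, & \hat{k} \neq I(t_i[A_j]) \text{ and } \hat{k} \neq I(t'_i[A_j]).
\end{cases}
```

In all four cases the ratio is at most e^{ε′}, and the probability of selecting j is the same for both tuples and cancels, which is what ε′-LDP requires under these assumptions.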
The formal analysis above shows that the preference ranking data collection algorithm satisfying local differential privacy in this application (SAFARI) guarantees local differential privacy for every user, while ensuring that the data collected by the data collector has high utility.
Here, the preference ranking data collection method satisfying local differential privacy in this application is summarized as follows:
1. The data collection platform initializes all vectors z_j in the first vector set to zero vectors and, for each user terminal u_i, selects a preference item index j from the preset preference item set and sends it to the corresponding user terminal; here i is the user terminal index and j is the index of a preference item in the preference item set;
2. Each user terminal uses its own user's preference ranking data to assign values to the tuple t_i, which contains all attributes of the attribute set, and, according to the preference item index j sent to it by the data collection platform, uses t_i[A_j] to generate a value index and sends it to the data collection platform; here A_j is the j-th attribute of the attribute set, the number of attributes in the tuple equals the number of preference items in the preference ranking data, the attributes correspond one-to-one with the preference items, and the value of each attribute equals the rank of the corresponding preference item; the value index is generated as in step 2.3 of SAFA;
3. The data collection platform uses the value index sent by each user terminal to increase the corresponding value in the first vector set by 1;
4. The data collection platform updates each value z_j[k] of every vector in the first vector set as in step 3.3 of SAFA, using two probabilities determined by ε′, where ε′ is the preset first privacy budget;
5. From the first vector set, the data collection platform determines two further sets of distributions, uses the first vector set together with these two sets to compute the mutual information of all triples in the preference item set, constructs the K-thin chain, and sends it to each user terminal;
6. The data collection platform initializes all vectors z_j′ in the second vector set to zero vectors and, for each user terminal u_i, selects a preference item index j from the preset preference item set and sends it to the corresponding user terminal; here i is the user terminal index, j is the index of a preference item in the preference item set, and the indexes selected for different user terminals may be the same or different;
7. Each user terminal uses its own user's preference ranking data to assign values to the tuple t_i′, which contains all attributes of the attribute set, and, according to the preference item index j sent to it by the data collection platform, uses t_i′[A_j′] to generate a value index and sends it to the data collection platform; here the attribute set includes two subsets, one corresponding to the set of leaf itemsets of the K-thin chain and the other corresponding to its set of internal itemsets; the value index is generated as in step 2.3 of SAFA;
8. The data collection platform uses the value index sent by each user terminal to increase the corresponding value in the second vector set by 1;
9. The data collection platform updates each value z_j′[k] of every vector in the second vector set as in step 3.3 of SAFA, using two probabilities determined by ε″, where ε″ is the preset second privacy budget, ε′ + ε″ = ε, and ε is the total privacy budget for building the RI model;
10. From the second vector set (that is, the union of the relative ranking distribution set and the interleaving distribution set), the leaf node distribution information and the internal node distribution information of the K-thin chain are obtained;
11. Using the established RI model, new preference ranking data is generated, where the RI model consists of the leaf node distribution information and the internal node distribution information of the K-thin chain.
In the above data collection method, the processing of step 1 and the assignment of t_i[A_j] by the user terminal in step 2 may be performed in any order; likewise, the processing of step 6 and the assignment of t_i′[A_j′] by the user terminal in step 7 may be performed in any order.
Next, a comparison with RAPPOR, SH and OLH shows that the SAFARI method proposed in this application has a clear advantage in the utility of the data collected by the data collection platform. To better illustrate the advantages of the method, the first-order marginal distribution (Q1) and the second-order marginal distribution (Q2) are used to measure the utility of the preference ranking data collected by the four algorithms RAPPOR, SH, OLH and SAFARI. For both the first-order and the second-order marginal distributions, utility is measured by the L1 distance between the marginal distribution of the data generated by each algorithm and the marginal distribution of the original data. The experimental setup is as follows: two real datasets, Sushi and Jester, are used to test the performance of each method. The characteristics of the two datasets are shown in Table 2.
Table 2. Dataset characteristics
Dataset    Number of users    Number of items
Sushi      5,000              3~10
Jester     20,000             3~10
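The utility metric described above can be sketched for the first-order case as follows. The sketch assumes that the first-order marginal records, for each item, the fraction of rankings that place it at each rank, and that the L1 distance is the sum of absolute differences over all entries. Whether the experiments sum or average over entries is not stated in the text, so that choice, like the function names, is an assumption.

```python
def first_order_marginal(rankings, items):
    """Return a d x d table: entry [j][r] is the fraction of rankings
    that place items[j] at position r (0-based)."""
    column = {item: j for j, item in enumerate(items)}
    d = len(items)
    counts = [[0.0] * d for _ in range(d)]
    for ranking in rankings:
        for r, item in enumerate(ranking):
            counts[column[item]][r] += 1
    n = len(rankings)
    return [[c / n for c in row] for row in counts]


def l1_distance(a, b):
    """Sum of absolute entry-wise differences between two tables."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


original = [["a", "b"], ["a", "b"]]
generated = [["a", "b"], ["b", "a"]]
distance = l1_distance(
    first_order_marginal(original, ["a", "b"]),
    first_order_marginal(generated, ["a", "b"]),
)
```

A smaller distance means the generated rankings reproduce the original rank frequencies more closely.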
The performance of the SAFARI method is illustrated below by analyzing the experimental data.
First, the first-order marginal distribution and the second-order marginal distribution are used to measure the performance of the four methods RAPPOR, SH, OLH and SAFARI. The experimental results are shown in Figure 3.
As Figure 3 shows, on the different datasets, as the privacy budget increases, the L1 distance between the first-order and second-order marginal distributions of the data generated by RAPPOR, SH, OLH and SAFARI and the marginal distributions of the original dataset decreases, but the result of the SAFARI algorithm is consistently smaller than those of RAPPOR, SH and OLH. There are two reasons. On the one hand, for the SAFARI algorithm, the K-thin chain makes the accuracy of the distribution information collected by the data collector very robust, so it is less affected by the added noise. On the other hand, the RAPPOR, SH and OLH algorithms introduce a large amount of noise when the privacy budget decreases.
Then, we test the effectiveness of Rule I using the datasets Sushi and Jester. To this end, we compare it with another version of Rule I (denoted Rule I*). In Rule I*, each user transforms his preference ranking data so as to directly provide the information of the distributions being estimated. We let the data collector use the SAFA method to collect distribution information from the data that users transformed under Rule I and under Rule I*, respectively, and report the average L1 distance of the distributions in S_3 that it obtains. The experimental results are shown in Figure 4.
As Figure 4 shows, for the different datasets, Rule I* yields better utility when d is no more than 4. This is because, when d is small, the advantage brought by Lasso regression cannot offset the effect of the information loss. However, when d is relatively large, Rule I yields considerably better results, which demonstrates the superiority of Rule I.
Finally, we test the effectiveness of the SAFA algorithm using the datasets Sushi and Jester. To this end, we compare it with the Harmony method. We let the data collector use SAFA and Harmony, respectively, to collect the distribution information in S_1 from the data that users transformed under Rule I, and report the average L1 distance of the distributions obtained. The experimental results are shown in Figure 5.
As Figure 5 shows, for the different datasets, the distribution information collected with the SAFA method contains less noise. This is because, when the value space of an attribute is small, the process in the Harmony algorithm of mapping the value space of each attribute into a matrix introduces unnecessary noise.
The above tests show that the preference ranking data collection realized by the data collection method of this application can prevent privacy leakage while ensuring that the collected preference ranking data has high utility.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (5)

1. A preference ranking data collection method satisfying local differential privacy, characterized by comprising:
a data collection platform initializing all vectors z_j in a first vector set to zero vectors and, for each user terminal u_i, selecting a preference item index j from a preset preference item set and sending it to the corresponding user terminal; wherein i is the user terminal index and j is the index of a preference item in the preference item set;
each user terminal using its own user's preference ranking data to assign values to a tuple t_i containing all attributes of an attribute set and, according to the preference item index j sent to the user terminal by the data collection platform, using t_i[A_j] to generate a value index and sending it to the data collection platform; wherein A_j is the j-th attribute of the attribute set, t_i[A_j] denotes the value of A_j in t_i, the number of attributes in the tuple equals the number of preference items in the preference ranking data, the attributes correspond one-to-one with the preference items, and the value of each attribute equals the rank of the corresponding preference item; the value index satisfies a condition expressed in terms of I(t_i[A_j]), where I(t_i[A_j]) denotes the index of t_i[A_j] in dom(A_j), and dom(A_j) denotes the value space of attribute A_j;
the data collection platform using the value index sent by each user terminal to increase the corresponding value in the first vector set by 1;
the data collection platform updating each value z_j[k] of every vector in the first vector set using two probabilities determined by ε′, wherein ε′ is a preset first privacy budget;
the data collection platform determining two further sets of distributions from the first vector set, using the first vector set together with these two sets to compute the mutual information of all triples in the preference item set, constructing a K-thin chain, and sending it to each user terminal;
the data collection platform initializing all vectors z_j′ in a second vector set to zero vectors and, for each user terminal u_i, selecting a preference item index j from the preset preference item set and sending it to the corresponding user terminal; wherein i is the user terminal index, j is the index of a preference item in the preference item set, and the preference item indexes selected for different user terminals are the same or different;
each user terminal using its own user's preference ranking data to assign values to a tuple t_i′ containing all attributes of an attribute set and, according to the preference item index j sent to the user terminal by the data collection platform, using t_i′[A_j′] to generate a value index and sending it to the data collection platform; wherein the attribute set includes two subsets, one corresponding to the set of leaf itemsets of the K-thin chain and the other corresponding to the set of internal itemsets of the K-thin chain, and the value index satisfies a corresponding condition;
the data collection platform using the value index sent by each user terminal to increase the corresponding value in the second vector set by 1;
the data collection platform updating each value z_j′[k] of every vector in the second vector set using two probabilities determined by ε″, wherein ε″ is a preset second privacy budget, ε′ + ε″ = ε, and ε is the total privacy budget for building an RI model;
obtaining, from the second vector set, leaf node distribution information and internal node distribution information of the K-thin chain;
generating preference ranking data using the RI model, which comprises the leaf node distribution information and the internal node distribution information.
2. The method according to claim 1, characterized in that the method further comprises: generating a specified quantity of preference ranking data according to the mutual information of the triples, the leaf node distribution information and the internal node distribution information.
3. The method according to claim 1 or 2, characterized in that the determining performed by the data collection platform according to the first vector set comprises:
for each distribution to be determined, constructing a Lasso regression model comprising a column vector of length 2d that stores the information of two distributions, a binary matrix of size 2d × d(d-1), and a column vector of length d(d-1) for storing the information of a joint distribution;
solving the Lasso regression model by the least angle regression method to estimate the column vector of length d(d-1) and determine the joint distribution, and then computing the distribution to be determined from the joint distribution.
4. The method according to claim 1 or 2, characterized in that the determining performed by the data collection platform according to the first vector set comprises:
for each distribution to be determined, constructing a Lasso regression model comprising a column vector of length (d+2) that stores the information of two distributions, a binary matrix of size (d+2) × 2d, and a column vector of length 2d for storing the information of a joint distribution;
solving the Lasso regression model by the least angle regression method to estimate the column vector of length 2d and determine the joint distribution.
5. The method according to claim 1 or 2, characterized in that
CN201811079995.XA 2018-09-17 2018-09-17 Preference sorting data collection method meeting local differential privacy Active CN109299436B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811079995.XA CN109299436B (en) 2018-09-17 2018-09-17 Preference sorting data collection method meeting local differential privacy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811079995.XA CN109299436B (en) 2018-09-17 2018-09-17 Preference sorting data collection method meeting local differential privacy

Publications (2)

Publication Number Publication Date
CN109299436A true CN109299436A (en) 2019-02-01
CN109299436B CN109299436B (en) 2021-10-15

Family

ID=65163261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811079995.XA Active CN109299436B (en) 2018-09-17 2018-09-17 Preference sorting data collection method meeting local differential privacy

Country Status (1)

Country Link
CN (1) CN109299436B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443063A (en) * 2019-06-26 2019-11-12 电子科技大学 The method of the federal deep learning of self adaptive protection privacy
WO2020177484A1 (en) * 2019-03-01 2020-09-10 华南理工大学 Localized difference privacy urban sanitation data report and privacy calculation method
CN111669366A (en) * 2020-04-30 2020-09-15 南京大学 Localized differential private data exchange method and storage medium
WO2020248150A1 (en) * 2019-06-12 2020-12-17 Alibaba Group Holding Limited Method and system for answering multi-dimensional analytical queries under local differential privacy
CN112329056A (en) * 2020-11-03 2021-02-05 石家庄铁道大学 Government affair data sharing-oriented localized differential privacy method
CN112995076A (en) * 2019-12-17 2021-06-18 国家电网有限公司大数据中心 Discrete data frequency estimation method, user side, data center and system
CN113111383A (en) * 2021-04-21 2021-07-13 山东大学 Personalized differential privacy protection method and system for vertically-divided data
CN114091100A (en) * 2021-11-23 2022-02-25 北京邮电大学 Track data collection method and system meeting local differential privacy
WO2022107284A1 (en) * 2020-11-19 2022-05-27 日本電信電話株式会社 Concealment device, concealment method, and program
CN115098931A (en) * 2022-07-20 2022-09-23 江苏艾佳家居用品有限公司 Small sample analysis method for mining personalized requirements of indoor design of user

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140283091A1 (en) * 2013-03-15 2014-09-18 Microsoft Corporation Differentially private linear queries on histograms
CN105740245A (en) * 2014-12-08 2016-07-06 北京邮电大学 Frequent item set mining method
CN106991335A (en) * 2017-02-20 2017-07-28 南京邮电大学 A kind of data publication method based on difference secret protection
US20170316346A1 (en) * 2016-04-28 2017-11-02 Qualcomm Incorporated Differentially private iteratively reweighted least squares
CN107862219A (en) * 2017-11-14 2018-03-30 哈尔滨工业大学深圳研究生院 The guard method of demand privacy in a kind of social networks
CN107871087A (en) * 2017-11-08 2018-04-03 广西师范大学 The personalized difference method for secret protection that high dimensional data is issued under distributed environment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
苏炜航等: "一种基于隐树模型的满足差分隐私的高维数据发布算法", 《小型微型计算机系统》 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020177484A1 (en) * 2019-03-01 2020-09-10 华南理工大学 Localized difference privacy urban sanitation data report and privacy calculation method
WO2020248150A1 (en) * 2019-06-12 2020-12-17 Alibaba Group Holding Limited Method and system for answering multi-dimensional analytical queries under local differential privacy
CN110443063A (en) * 2019-06-26 2019-11-12 电子科技大学 The method of the federal deep learning of self adaptive protection privacy
CN110443063B (en) * 2019-06-26 2023-03-28 电子科技大学 Adaptive privacy-protecting federal deep learning method
CN112995076B (en) * 2019-12-17 2022-09-27 国家电网有限公司大数据中心 Discrete data frequency estimation method, user side, data center and system
CN112995076A (en) * 2019-12-17 2021-06-18 国家电网有限公司大数据中心 Discrete data frequency estimation method, user side, data center and system
CN111669366A (en) * 2020-04-30 2020-09-15 南京大学 Localized differential private data exchange method and storage medium
CN112329056A (en) * 2020-11-03 2021-02-05 石家庄铁道大学 Government affair data sharing-oriented localized differential privacy method
CN112329056B (en) * 2020-11-03 2021-11-02 石家庄铁道大学 Government affair data sharing-oriented localized differential privacy method
WO2022107284A1 (en) * 2020-11-19 2022-05-27 日本電信電話株式会社 Concealment device, concealment method, and program
CN113111383B (en) * 2021-04-21 2022-05-20 山东大学 Personalized differential privacy protection method and system for vertically-divided data
CN113111383A (en) * 2021-04-21 2021-07-13 山东大学 Personalized differential privacy protection method and system for vertically-divided data
CN114091100A (en) * 2021-11-23 2022-02-25 北京邮电大学 Track data collection method and system meeting local differential privacy
CN114091100B (en) * 2021-11-23 2024-05-03 北京邮电大学 Track data collection method and system meeting local differential privacy
CN115098931A (en) * 2022-07-20 2022-09-23 江苏艾佳家居用品有限公司 Small sample analysis method for mining personalized requirements of indoor design of user
CN115098931B (en) * 2022-07-20 2022-12-16 江苏艾佳家居用品有限公司 Small sample analysis method for mining personalized requirements of indoor design of user

Also Published As

Publication number Publication date
CN109299436B (en) 2021-10-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant