CN109299436A - A preference ranking data collection method satisfying local differential privacy - Google Patents
- Publication number
- CN109299436A (application number CN201811079995.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- ordering
- data collection
- user terminal
- preference
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Optimization (AREA)
- Mathematical Analysis (AREA)
- Pure & Applied Mathematics (AREA)
- Computational Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Algebra (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- Computer Security & Cryptography (AREA)
- Medical Informatics (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Operations Research (AREA)
- Probability & Statistics with Applications (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses a preference ranking data collection method that satisfies local differential privacy. The user terminal converts preference ranking data using Rule I and Rule II, adds noise to the converted data, and sends it to the data collection platform. The data collection platform and the user terminals cooperate to run an algorithm that satisfies local differential privacy and to complete construction of the entire RI model, and preference ranking data is then generated from the constructed model. With this method, the collected preference ranking data retains high data utility while privacy leakage is avoided.
Description
Technical field
This application relates to data collection techniques, and in particular to a preference ranking data collection method that satisfies local differential privacy.
Background
Preference ranking data is a typical kind of personal data. For a user, preference ranking data is the ordering the user gives to a given item set according to how much the user likes each item in the set. For example, if the item set is {cola, liquor, Sprite, water, beer} and a user's preference ranking over these five items is <liquor, cola, beer, Sprite, water>, then the user likes liquor most and water least. With the rapid development of information technologies such as the mobile Internet and cloud computing, and the growing popularity of mobile terminals such as smartphones, it has become commonplace for users to share their preference ranking data with various data collectors (for example, service providers) through mobile applications in exchange for personalized services. For service providers, in turn, collecting and analyzing users' preference ranking data is essential to providing a better user experience and creating new revenue opportunities. However, users' preference ranking data usually contains highly sensitive personal information, and collecting it directly can lead to serious leakage of personal privacy.
Fig. 1 is a schematic diagram of the current scenario for collecting user preference data. The scenario mainly involves two roles: users (that is, data contributors) and a data collector. Given an item set $\chi = \{x_1, x_2, \ldots, x_d\}$ consisting of $d$ items, each user $u_i$ ($1 \le i \le n$) holds one preference ranking $\sigma_i = \langle \sigma_i(1), \sigma_i(2), \ldots, \sigma_i(d) \rangle$, and the users are mutually independent. Here $\sigma_i(j) = x_k$ means that $x_k$ is ranked $j$-th in $\sigma_i$. The data collector uses a data collection platform to collect each user's preference ranking data over the network, obtains a preference ranking data set, and builds a model of the preference ranking data. New preference ranking data can be generated from the model. The generated data has the same statistical properties as the users' original data, and because the original data is not released directly, user privacy is protected to some extent. The data collector can analyze the resulting model directly, or release the model or the newly generated data to third parties (for example, research institutions).
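The notation above can be made concrete with a short sketch. It uses the five-item example from the background; the variable and function names are illustrative assumptions and do not come from the application.

```python
# Illustrative item set and one user's preference ranking (names are hypothetical).
items = ["cola", "liquor", "sprite", "water", "beer"]

# sigma[j - 1] holds sigma(j), the item ranked j-th by this user.
sigma = ["liquor", "cola", "beer", "sprite", "water"]


def item_at_rank(ranking, j):
    """Return sigma(j): the item placed at rank j (1-based)."""
    return ranking[j - 1]


def rank_of(ranking, item):
    """Return the rank j (1-based) such that sigma(j) == item."""
    return ranking.index(item) + 1


most_liked = item_at_rank(sigma, 1)
least_liked = item_at_rank(sigma, len(sigma))
beer_rank = rank_of(sigma, "beer")
```

Storing the ranking as a list indexed by rank keeps $\sigma(j)$ a constant-time lookup; the inverse map (item to rank) is what the conversion rules later in the description rely on.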
As the above shows, when user preference ranking data is collected in this way, users of the model and of the newly generated data can be prevented from learning private user information, but the users' private preference data can still be leaked before the model is built. Specifically, for each user, the following three roles threaten the user's privacy: 1) the data collector; 2) other users; 3) any potential attacker other than the data collector and other users.
Privacy-preserving preference ranking data collection techniques offer a feasible solution to the personal privacy leakage caused by collecting preference ranking data. Local differential privacy (LDP), proposed in recent years, is a differential privacy technique designed specifically to address the personal privacy leakage caused by data collection. It requires each data contributor first to add suitable noise to the data they hold and then to send the noisy data to the data collector, which protects the privacy of the data contributor.
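The "add noise locally, then send" idea can be sketched with k-ary randomized response, a standard local differential privacy primitive. This is an assumption made for illustration only: it is not the SAFA algorithm of this application, and the function names are hypothetical.

```python
import math
import random


def perturb(true_index, domain_size, epsilon, rng=random):
    """Report the true index with probability e^eps / (e^eps + k - 1);
    otherwise report one of the other k - 1 indices uniformly at random."""
    p_true = math.exp(epsilon) / (math.exp(epsilon) + domain_size - 1)
    if rng.random() < p_true:
        return true_index
    other = rng.randrange(domain_size - 1)
    # Skip over the true index so that only the other values are returned.
    return other if other < true_index else other + 1


def estimate_frequencies(reports, domain_size, epsilon):
    """Unbiased frequency estimates computed by the collector from noisy reports."""
    n = len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + domain_size - 1)
    q = 1.0 / (math.exp(epsilon) + domain_size - 1)
    counts = [0] * domain_size
    for report in reports:
        counts[report] += 1
    return [(count / n - q) / (p - q) for count in counts]


rng = random.Random(0)
reports = [perturb(2, 5, 1.0, rng) for _ in range(1000)]
estimates = estimate_frequencies(reports, 5, 1.0)
```

The collector never sees a user's true value, only the perturbed report, yet the aggregate frequencies remain estimable. Indices here are 0-based.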
Some existing work studies data collection under local differential privacy. Based on information theory, Duchi et al. proposed a high-dimensional data collection method for mean computation and statistical risk minimization tasks. Extending this method with sampling, a data collection method called Harmony was proposed. For each high-dimensional record, Harmony randomly selects one dimension of the record; if that dimension holds continuous data, it is collected with the method of Duchi et al.; if it holds discrete data, it is collected with the SH mechanism. To obtain the frequent items of multi-dimensional data, Qin et al. proposed a two-stage data collection method called LDPMiner. In the first stage, the method uses the SH mechanism to identify a preliminary candidate space of frequent items from the noisy data; in the second stage, it uses the RAPPOR mechanism to obtain accurate frequent items. Based on the EM (Expectation Maximization) algorithm, Fanti et al. proposed an extended RAPPOR mechanism. The mechanism assumes that the dimensions of the high-dimensional data are mutually independent, collects each dimension with RAPPOR, and feeds the collected data to the EM algorithm to infer the joint distribution of the overall data, which can then be used to generate data resembling the original. However, when the data dimension is high, this mechanism has high time complexity and converges slowly. To address this problem, Ren et al. combined the EM algorithm with Lasso regression in a new method that substantially improves the efficiency of the extended RAPPOR mechanism.
A local differential privacy algorithm can be applied to preference ranking data directly, as follows: assume a value space consisting of all possible preference rankings, treat each user's preference ranking as one discrete value in that space, and collect the data directly with a one-dimensional, multi-valued data collection method that satisfies local differential privacy, such as RAPPOR, SH, or OLH. However, the converted data has a huge value space: for a given item set $\chi = \{x_1, x_2, \ldots, x_d\}$, the value space of the converted data has size $d!$. As a result, these algorithms leave so much noise in the collected data that the resulting preference ranking data is unusable.
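A quick calculation shows how fast the $d!$ value space grows. The second function is an illustrative assumption: it uses k-ary randomized response (not RAPPOR, SH, or OLH, and not the mechanism of this application) only to show how the chance of a truthful report shrinks as the value space grows.

```python
import math


def direct_value_space(d):
    """Number of distinct rankings of d items, i.e. d!."""
    return math.factorial(d)


def truthful_report_probability(domain_size, epsilon):
    """Probability that k-ary randomized response reports the true value
    (illustrative mechanism, assumed here for comparison only)."""
    return math.exp(epsilon) / (math.exp(epsilon) + domain_size - 1)


for d in (5, 10, 20):
    k = direct_value_space(d)
    print(d, k, truthful_report_probability(k, 1.0), truthful_report_probability(d, 1.0))
```

With only five items the direct encoding already has 120 possible values, against 5 possible values when a single item's rank is reported.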
Summary of the invention
The application provides a preference ranking data collection method that satisfies local differential privacy. Preference ranking data collected with this method retains high data utility while privacy leakage is avoided.
To achieve the above object, the application adopts the following technical solution:
A preference ranking data collection method satisfying local differential privacy, comprising:
The data collection platform initializes all vectors $z_j$ in a first vector set to zero vectors and, for each user terminal $u_i$, selects a preference item index $j$ from a preset preference item set and sends it to the corresponding user terminal; where $i$ is the user terminal index and $j$ is the index of a preference item in the preference item set;
Each user terminal uses the user's own preference ranking data to assign values to a tuple $t_i$ over all attributes of an attribute set and, according to the preference item index $j$ sent to that user terminal by the data collection platform, uses $t_i[A_j]$ to generate a value index [symbol missing] and sends it to the data collection platform; where $A_j$ is the $j$-th attribute in the attribute set, $t_i[A_j]$ is the value of $A_j$ in $t_i$, the number of attributes in the tuple equals the number of preference items in the preference ranking data, attributes correspond one-to-one to preference items, and the value of each attribute equals the rank of the corresponding preference item; the value index satisfies the condition [formula missing], with $k \in \{1, 2, \ldots, |\mathrm{dom}(A_j)|\}$, where $I(t_i[A_j])$ denotes the index of $t_i[A_j]$ in $\mathrm{dom}(A_j)$ and $\mathrm{dom}(A_j)$ denotes the value space of attribute $A_j$;
Using the value index sent by each user terminal, the data collection platform adds 1 to the corresponding entry of the first vector set;
The data collection platform updates each entry $z_j[k]$ of every vector in the first vector set to [formula missing], where [formula missing] and $\varepsilon'$ is a preset first privacy budget;
The data collection platform determines [symbol missing] and [symbol missing] from the first vector set, uses the first vector set together with [symbol missing] and [symbol missing] to compute the mutual information [symbol missing] of all triples [symbol missing] in the preference item set, constructs a K-thin chain [symbol missing], and sends it to each user terminal;
The data collection platform initializes all vectors $z_j'$ in a second vector set to zero vectors and, for each user terminal $u_i$, selects a preference item index $j$ from the preset preference item set and sends it to the corresponding user terminal; where $i$ is the user terminal index, $j$ is the index of a preference item in the preference item set, and the preference item indexes selected for different user terminals may be the same or different;
Each user terminal uses the user's own preference ranking data to assign values to a tuple $t_i'$ over all attributes of an attribute set [symbol missing] and, according to the preference item index $j$ sent to that user terminal by the data collection platform, uses $t_i'[A_j']$ to generate a value index [symbol missing] and sends it to the data collection platform; where the attribute set comprises two subsets, one corresponding to the set of leaf item sets of the K-thin chain and the other corresponding to its set of internal item sets, and the value index satisfies the condition [formula missing];
Using the value index sent by each user terminal, the data collection platform adds 1 to the corresponding entry of the second vector set;
The data collection platform updates each entry $z_j'[k]$ of every vector in the second vector set to [formula missing], where [formula missing], $\varepsilon''$ is a preset second privacy budget, $\varepsilon' + \varepsilon'' = \varepsilon$, and $\varepsilon$ is the overall privacy budget for building the RI model;
The distribution information of the leaf nodes and the distribution information of the internal nodes of the K-thin chain are obtained from the second vector set;
Preference ranking data is generated using the RI model that comprises the distribution information of the leaf nodes and the distribution information of the internal nodes.
Preferably, the method further comprises: generating a specified number of preference rankings according to the mutual information of the triples and the distribution information of the leaf nodes and internal nodes of the K-thin chain.
Preferably, the data collection platform determining [symbol missing] from the first vector set comprises:
For each distribution in [symbol missing], constructing a Lasso regression model [formula missing], where [symbol missing] is a column vector of length $2d$ that stores the information of the distributions [symbol missing] and [symbol missing], [symbol missing] is a binary matrix of size $2d \times d(d-1)$, and [symbol missing] is a column vector of length $d(d-1)$ that stores the information of the joint distribution [symbol missing];
Solving the Lasso regression model by least angle regression to estimate [symbol missing] and determine the joint distribution [symbol missing], and then computing [symbol missing] from the joint distribution.
Preferably, the data collection platform determining [symbol missing] from the first vector set comprises:
For each distribution in [symbol missing], constructing a Lasso regression model [formula missing], where [symbol missing] is a column vector of length $(d+2)$ that stores the information of the distributions [symbol missing] and [symbol missing], [symbol missing] is a binary matrix of size $(d+2) \times 2d$, and [symbol missing] is a column vector of length $2d$ that stores the information of the joint distribution [symbol missing];
Solving the Lasso regression model by least angle regression to estimate [symbol missing] and determine the joint distribution [symbol missing].
Preferably, [formula missing].
As can be seen from the above technical solution, in the application the user terminal converts preference ranking data using Rule I and Rule II, adds noise to the converted data, and sends it to the data collection platform. The data collection platform and the user terminals cooperate to run an algorithm that satisfies local differential privacy and to complete construction of the entire RI model, and the constructed RI model is then used to generate preference ranking data that satisfies local differential privacy. With this method, the collected preference ranking data retains high data utility while privacy leakage is avoided.
Brief description of the drawings
Fig. 1 is a schematic diagram of the current scenario for collecting user preference data;
Fig. 2 is a schematic diagram of an example 2-thin chain;
Fig. 3 is the first performance comparison diagram of the application;
Fig. 4 is the second performance comparison diagram of the application;
Fig. 5 is the third performance comparison diagram of the application.
Detailed description
To make the purpose, technical means, and advantages of the application clearer, the application is described in further detail below with reference to the drawings.
The background section noted that data becomes unusable when existing local differential privacy methods are applied to preference ranking data. To solve this problem, the applicant proposes a preference ranking data collection algorithm that satisfies local differential privacy (the SAFARI algorithm). Its main idea is that the data collector collects distribution information over a series of small value spaces selected according to the riffle independent model (RI model), uses the collected distribution information to approximate the overall distribution of the preference ranking data and build a model, and generates preference ranking data from the model. Because the SAFARI algorithm handles multiple small value spaces rather than one large value space, it greatly reduces the scale of the noise.
The processing of the application is described in detail below.
Preference ranking data can currently be modeled with the RI model. Exploiting the mutual exclusivity among the dimensions of preference ranking data, the RI model approximates the overall distribution of the data by the product of two kinds of low-dimensional distributions, relative ranking distributions and interleaving distributions, and thereby models preference ranking data effectively. The constructed model is then used to generate new preference ranking data, which protects user privacy.
The structure of the RI model is a binary tree called a K-thin chain. The root node represents the original item set, every other node represents a subset of the original item set, and the item set of each leaf node contains at most K items, where K is a constant. Fig. 2 shows an example of a 2-thin chain.
In this example, the original item set {cola, liquor, Sprite, water, beer} is first split into two disjoint subsets that are riffle independent of each other, namely {water} and {cola, liquor, Sprite, beer}. Because the subset {cola, liquor, Sprite, beer} contains more than 2 items (the value of K), it is further split into the disjoint, riffle-independent subsets {cola, Sprite} and {liquor, beer}.
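The 2-thin chain of this example can be written down as a small binary tree. The sketch below is illustrative: the class and function names are assumptions, and the check implements only the structural conditions stated above (children are disjoint, together cover the parent, and leaves hold at most K items); it does not test riffle independence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of the chain: an item set plus optional left and right children."""
    items: frozenset
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self):
        return self.left is None and self.right is None


def leaves(node):
    """Collect the leaf nodes from left to right."""
    if node.is_leaf():
        return [node]
    return leaves(node.left) + leaves(node.right)


def is_k_thin(node, k):
    """Check that children partition their parent and leaves hold at most k items."""
    if node.is_leaf():
        return len(node.items) <= k
    disjoint = not (node.left.items & node.right.items)
    covers = (node.left.items | node.right.items) == node.items
    return disjoint and covers and is_k_thin(node.left, k) and is_k_thin(node.right, k)


chain = Node(
    frozenset({"cola", "liquor", "sprite", "water", "beer"}),
    left=Node(frozenset({"water"})),
    right=Node(
        frozenset({"cola", "liquor", "sprite", "beer"}),
        left=Node(frozenset({"cola", "sprite"})),
        right=Node(frozenset({"liquor", "beer"})),
    ),
)
```

Here `is_k_thin(chain, 2)` holds, while the same tree fails the check for K = 1 because two of its leaves contain two items.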
The learning process of the RI model has two stages, structure learning and parameter learning:
1) Structure learning. First, the mutual information of all triples in the item set is computed. It is defined as follows:
Given an item set $\{x_1, x_2, \ldots, x_d\}$, for any triple [symbol missing] in the item set, where [symbol missing], [symbol missing], and [symbol missing] are three distinct items in the set, the mutual information of the triple is [formula missing], where [symbol missing] denotes the rank of item [symbol missing] and [symbol missing] is a binary variable. Specifically, one value of the binary variable means that item [symbol missing] is ranked before item [symbol missing], and the other value means that it is ranked after item [symbol missing].
Then, based on the mutual information of the triples, a K-thin chain is constructed over the original item set with the anchor point algorithm.
2) Parameter learning. Given the constructed K-thin chain, the distribution at each node is learned in order to approximate the overall distribution of the original preference ranking data set. The distribution at a leaf node is called a relative ranking distribution, and the distribution at an internal node (including the root node) is called an interleaving distribution.
Once the relative ranking distributions and interleaving distributions have been determined by this process, the RI model is complete.
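The triple mutual information used in structure learning can be sketched as follows. The formula itself is missing from this text, so the sketch rests on an assumed reading of the surrounding description: the mutual information between the rank of one item and the binary variable "a second item is ranked before a third". The function name and the plain counting estimator (with no privacy noise) are illustrative.

```python
import math
from collections import Counter


def triple_mutual_information(rankings, a, b, c):
    """Empirical mutual information between the rank of item a and the
    binary event 'item b is ranked before item c' (assumed reading)."""
    n = len(rankings)
    joint = Counter()
    for sigma in rankings:
        rank_a = sigma.index(a) + 1
        b_first = sigma.index(b) < sigma.index(c)
        joint[(rank_a, b_first)] += 1
    rank_counts = Counter()
    order_counts = Counter()
    for (rank_a, b_first), count in joint.items():
        rank_counts[rank_a] += count
        order_counts[b_first] += count
    mi = 0.0
    for (rank_a, b_first), count in joint.items():
        p_joint = count / n
        p_rank = rank_counts[rank_a] / n
        p_order = order_counts[b_first] / n
        mi += p_joint * math.log(p_joint / (p_rank * p_order))
    return mi


independent = [["a", "b", "c"], ["a", "c", "b"]]
dependent = [["a", "b", "c"], ["c", "b", "a"]]
mi_independent = triple_mutual_information(independent, "a", "b", "c")
mi_dependent = triple_mutual_information(dependent, "a", "b", "c")
```

In the first data set the rank of `a` never changes, so it carries no information about the order of `b` and `c`; in the second, the rank of `a` determines that order completely, giving a mutual information of $\ln 2$.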
The preference ranking data collection method of the application is likewise based on the RI model and generates preference ranking data from the constructed RI model, except that during modeling both the acquisition of the triple mutual information and the acquisition of the relative ranking distributions and interleaving distributions satisfy local differential privacy.
The preference ranking data collection algorithm satisfying local differential privacy in the application (the SAFARI algorithm) involves two rules, Rule I and Rule II, and the SAFA algorithm, and comprises five stages:
Stage 1
1. Each user converts their own preference ranking data according to a conversion rule (denoted Rule I), so that the data collection platform can obtain the distribution information needed to compute the triple mutual information. This distribution information is described in detail later.
Stage 2
1. Using privacy budget $\varepsilon'$, the data collection platform invokes the SAFA algorithm and, with the users' cooperation, collects from the data converted in stage 1 the distribution information needed to compute the triple mutual information. Here $\varepsilon'$ characterizes the strength of privacy protection. In the SAFA algorithm, each user adds noise to the converted data before sending it to the data collection platform, so that privacy is not revealed.
2. The data collection platform uses the collected distribution information to compute all triple mutual information in the RI model.
3. The data collection platform constructs the K-thin chain.
4. The data collection platform sends the K-thin chain to each user.
Stage 3
1. Each user converts their own preference ranking data according to another conversion rule (denoted Rule II), so that the data collection platform can determine the relative ranking distribution information and the interleaving distribution information of the K-thin chain.
Stage 4
1. Using privacy budget $\varepsilon''$, the data collector invokes the SAFA algorithm and, with the users' cooperation, collects from the data converted in stage 3 the relative ranking distribution information and the interleaving distribution information of the K-thin chain. Here $\varepsilon''$ characterizes the strength of privacy protection. In the SAFA algorithm, each user adds noise to the converted data before sending it to the data collection platform, so that privacy is not revealed.
At this point, with the relative ranking distribution information and the interleaving distribution information obtained, construction of the RI model is complete.
Stage 5
According to the constructed riffle independent model, the data collector generates n new preference rankings.
The processing in stages 1 to 4 achieves RI modeling of preference ranking data under local differential privacy. The data collection platform can release the completed RI model to third parties; alternatively, to better provide preference ranking data to third parties, it can preferably generate new preference ranking data through stage 5 and release that data to third parties.
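Stage 5 can be sketched as sampling a relative ranking at each leaf and an interleaving at each internal node, then merging bottom-up. The distributions below are made-up placeholder numbers for the 2-thin chain of Fig. 2, not learned values, and the function names are hypothetical. An interleaving is written as a string of `L`/`R` markers that says which child supplies the next item.

```python
import random


def riffle_merge(left, right, interleaving):
    """Merge two relative rankings following an interleaving such as 'RLRL'."""
    left_iter, right_iter = iter(left), iter(right)
    return [next(left_iter) if side == "L" else next(right_iter)
            for side in interleaving]


def sample(distribution, rng):
    """Draw one outcome from a dict that maps outcome -> probability."""
    outcomes = list(distribution)
    weights = [distribution[outcome] for outcome in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


# Placeholder distributions (assumed values, for illustration only).
soft_drinks = {("cola", "sprite"): 0.7, ("sprite", "cola"): 0.3}
alcohol = {("liquor", "beer"): 0.6, ("beer", "liquor"): 0.4}
inner_interleaving = {"RLRL": 0.5, "LRLR": 0.3, "LLRR": 0.2}
root_interleaving = {"RRRRL": 0.8, "LRRRR": 0.2}


def generate(rng):
    """Generate one new ranking from the placeholder RI model."""
    drinks = riffle_merge(sample(soft_drinks, rng), sample(alcohol, rng),
                          sample(inner_interleaving, rng))
    return riffle_merge(["water"], drinks, sample(root_interleaving, rng))


new_rankings = [generate(random.Random(seed)) for seed in range(3)]
example = riffle_merge(["cola", "sprite"], ["liquor", "beer"], "RLRL")
```

Merging `["cola", "sprite"]` and `["liquor", "beer"]` with interleaving `RLRL` yields `["liquor", "cola", "beer", "sprite"]`, the order of those four items in the background example.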
In addition, the above data collection method performs the local differential privacy processing in two parts, in stage 2 and in stage 4. Therefore, for an overall privacy budget of $\varepsilon$, $\varepsilon' + \varepsilon'' = \varepsilon$. Preferably, in practical applications, one usually takes [formula missing].
Rule I, Rule II, and the SAFA method used in the above data collection method are introduced in turn below.
Design of Rule I
As described above, Rule I converts the users' preference ranking data, and the converted data is used to compute the mutual information of the triples, so Rule I must be designed around how the triple mutual information is computed. Specifically, by the definition of triple mutual information, to compute the mutual information of any possible triple $(x_i, x_j, x_k)$, the data collection platform needs to collect three types of distribution information: [formula missing]
To accomplish this, an intuitive conversion is to have each user convert their preference ranking data so as to provide information about all three types of distribution. In particular, each user converts their preference ranking data into a tuple of multiple attributes, where each attribute corresponds to one distribution in [symbol missing].
However, this conversion leaves a large amount of redundancy in the converted data and increases the conversion complexity, because [formula missing].
In fact, the data collector only needs to collect the distributions in [symbol missing] and can then derive from them the information of the distributions in [symbol missing] and [symbol missing]. Therefore, each user only needs to convert their preference ranking data so as to provide the information of the distributions in [symbol missing]. Since [symbol missing] contains $O(d^3)$ distinct distributions, the tuple produced by each user has $O(d^3)$ attributes.
Unfortunately, when $d$ is relatively large, the curse of dimensionality means that, under LDP, this conversion leaves a large amount of noise in the data collected by the data collector. To solve this problem, we designed Rule I. Under this conversion rule, each user only needs to convert their preference ranking data so as to provide the information of the distributions in [symbol missing]. The data collection platform only needs to collect the distributions in [symbol missing] and then estimates from them, with regression models, the information of the distributions in [symbol missing] and [symbol missing]. In particular, the applicant found that this estimation problem is a sparse linear regression problem, and therefore chose the Lasso regression model, which solves such problems effectively. The details of Rule I, and how the Lasso regression model is used to estimate the information of the distributions in [symbol missing] and [symbol missing], are described below.
Rule I: Each user terminal $u_i$ first converts the preference ranking $\sigma_i$ of its user into a tuple $t_i$ over all attributes of an attribute set [symbol missing]. Each attribute $A_j$ in the attribute set corresponds to one item [symbol missing], and the value space of $A_j$ is $\mathrm{dom}(A_j) = \{1, 2, \ldots, d\}$. That is, $\mathrm{dom}(A_j)$ consists of $d$ possible values, which are the possible absolute ranks of [symbol missing]. Then, for each attribute $A_j$ in the attribute set, each user $u_i$ assigns $t_i[A_j]$ according to $\sigma_i$.
Distribution estimation with Lasso, determining [symbol missing] and [symbol missing]. Based on the collected distributions in [symbol missing], the data collection platform estimates the information of the distributions in [symbol missing] and [symbol missing] as follows.
First, the data collection platform estimates the distribution information in [symbol missing] from the distributions in [symbol missing]. In particular, for each distribution in [symbol missing], the data collector constructs a Lasso regression model [formula missing], where
1) [symbol missing] is a column vector of length $2d$ that stores the information of the distributions [symbol missing] and [symbol missing];
2) [symbol missing] is a binary matrix of size $2d \times d(d-1)$;
3) [symbol missing] is a column vector of length $d(d-1)$ that stores the information of the joint distribution [symbol missing].
By solving this Lasso regression model with least angle regression, the data collection platform can estimate [symbol missing] and thereby obtain the information of the joint distribution [symbol missing]. From the joint distribution [symbol missing], the data collection platform can compute the information of the distribution [symbol missing].
Then, the data collector estimates the distribution information in [symbol missing] from the distributions in [symbol missing] and [symbol missing]. In particular, for each distribution in [symbol missing], the data collector constructs a Lasso regression model [formula missing], where
1) [symbol missing] is a column vector of length $(d+2)$ that stores the information of the distributions [symbol missing] and [symbol missing];
2) [symbol missing] is a binary matrix of size $(d+2) \times 2d$;
3) [symbol missing] is a column vector of length $2d$ that stores the information of the joint distribution [symbol missing].
Similarly, by solving this Lasso regression model with least angle regression, the data collector can estimate [symbol missing] and thereby obtain the information of the joint distribution [symbol missing].
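The dimensions of the first model suggest one reading, which the sketch below assumes: the length-$2d$ vector stacks the rank distributions of two items, the length-$d(d-1)$ vector is the joint distribution of their two (necessarily different) ranks, and the binary matrix sums joint probabilities into marginals. The symbols are missing from this text, so all names here are hypothetical, and the Lasso solver itself is not shown.

```python
from itertools import permutations


def build_marginalization_matrix(d):
    """Binary matrix of size 2d x d(d-1) that maps a joint distribution over
    rank pairs (r_a, r_b), r_a != r_b, to the two stacked rank marginals."""
    pairs = list(permutations(range(1, d + 1), 2))
    matrix = [[0] * len(pairs) for _ in range(2 * d)]
    for column, (r_a, r_b) in enumerate(pairs):
        matrix[r_a - 1][column] = 1        # row for Pr[rank of first item = r_a]
        matrix[d + r_b - 1][column] = 1    # row for Pr[rank of second item = r_b]
    return matrix, pairs


def multiply(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


d = 3
matrix, pairs = build_marginalization_matrix(d)
uniform_joint = [1.0 / len(pairs)] * len(pairs)
stacked_marginals = multiply(matrix, uniform_joint)
```

For $d > 3$ the matrix has more columns than rows ($d(d-1) > 2d$), so the joint distribution is not determined by the marginals alone; this is consistent with the text's choice of a sparse regression model such as Lasso.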
Note that in Rule I each attribute $A_j$ in the attribute set corresponds one-to-one to a preference item, and this correspondence must be the same on the data collection platform and on the user terminal side. That is, in SAFARI the data collection platform and every user terminal must run the same Rule I. In addition, under this conversion rule the tuple produced by each user has $O(d)$ attributes. For each attribute $A_j$ in the attribute set, the size of its value space, $|\mathrm{dom}(A_j)|$, is only $d$, which is clearly far smaller than $d!$. By estimating the frequency of every possible value of each attribute in the set, the data collector obtains the distribution information in [symbol missing] and then estimates from it the information of the distributions in [symbol missing] and [symbol missing]. This processing does not generate large amounts of redundant data and therefore effectively improves the utility of the user preference data.
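The Rule I conversion described above is small enough to sketch directly: attribute $A_j$ takes the absolute rank of item $x_j$ in the user's ranking. The function name and the fixed item order are illustrative assumptions; the shared order stands in for the requirement that the platform and every terminal run the same Rule I.

```python
def rule_one(sigma, item_order):
    """Convert a ranking into a tuple of absolute ranks, one per item.

    sigma[r - 1] is the item ranked r-th; item_order fixes which item
    each attribute A_j refers to and must be shared by all parties.
    """
    return tuple(sigma.index(item) + 1 for item in item_order)


item_order = ["cola", "liquor", "sprite", "water", "beer"]
sigma = ["liquor", "cola", "beer", "sprite", "water"]
t = rule_one(sigma, item_order)
```

For the background example this gives `(2, 1, 4, 5, 3)`: cola is ranked 2nd, liquor 1st, Sprite 4th, water 5th, and beer 3rd. Each attribute ranges over only $d = 5$ values.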
Design of Rule II
After constructing the K-thin chain, the data collection platform needs to collect its relative ranking distributions and interleaving distributions. Rule II is designed for this purpose. Under this conversion rule, each user terminal only needs to convert the preference ranking data of its user so as to provide information about these two types of distribution.
Rule II: Each user terminal $u_i$ first converts the preference ranking $\sigma_i$ of its user into a tuple $t_i$ over all attributes of an attribute set [symbol missing]. In particular, the attribute set consists of two subsets, [symbol missing] and [symbol missing], where
1) [formula missing]
[symbol missing] corresponds to the set of leaf item sets of the K-thin chain. [symbol missing] is a subset of [symbol missing] that consists of the leaf item sets containing only one item. Because the relative ranking distribution of each leaf item set in [symbol missing] is easily inferred, users do not need to provide information about [symbol missing].
Each attribute $A_j$ in [symbol missing] corresponds to one leaf item set $l_k$ in the set [symbol missing]. The value space of $A_j$ consists of all relative rankings of $l_k$. In particular, when K is 1, every leaf item set contains only one item, and in that case [formula missing], where K denotes the maximum number of items in a leaf item set of the K-thin chain.
2) [formula missing]
Each attribute $A_j$ in [symbol missing] corresponds to one internal item set $g_k$ in the set of internal item sets [symbol missing]. The value space of $A_j$ consists of all interleavings of $g_k$. Then, for each attribute $A_j$ in the attribute set, each user $u_i$ assigns $t_i[A_j]$ according to $\sigma_i$.
One point needs explanation here. In Rule II above, each attribute A_j corresponds one-to-one with a leaf itemset or an internal itemset, and this correspondence must be identical on the data collection platform and on the user terminal side. That is, in SAFARI, the data collection platform and every user terminal must run the same Rule II. In addition, under this conversion rule the tuple produced by each user contains O(d) attributes. For each attribute A_j in the first subset, the value space contains at most K! values; for each attribute A_j in the second subset, the value space is likewise bounded, so the largest value space of any attribute in the set is clearly much smaller than d!. By estimating the frequency of every possible value of each attribute in the set, the data collector obtains the required distribution information. This processing does not generate large amounts of redundant data, and therefore effectively improves the utility of the collected user preference data.
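The two kinds of attribute values in Rule II can be sketched as follows. This is an illustration under assumptions: an internal itemset is taken to be split into a left and a right child itemset, and the "L"/"R" encoding, the function names, and the example items are hypothetical.

```python
from math import comb, factorial


def relative_ranking(ranking, itemset):
    """Items of itemset in the order in which they appear in the full ranking."""
    members = set(itemset)
    return tuple(item for item in ranking if item in members)


def interleaving(ranking, left, right):
    """Pattern showing which child itemset supplies each successive position."""
    left, right = set(left), set(right)
    return "".join(
        "L" if item in left else "R"
        for item in ranking
        if item in left or item in right
    )


sigma = ["c", "a", "e", "d", "b"]   # one user's ranking, best first
leaf_1 = {"a", "b"}                 # leaf itemsets with at most K = 3 items
leaf_2 = {"c", "d", "e"}

value_leaf_1 = relative_ranking(sigma, leaf_1)
value_leaf_2 = relative_ranking(sigma, leaf_2)
value_internal = interleaving(sigma, leaf_1, leaf_2)

# Domain sizes: |l|! relative rankings of a leaf itemset, and
# C(|left| + |right|, |left|) ways to interleave two child itemsets.
size_leaf_2 = factorial(len(leaf_2))
size_internal = comb(len(leaf_1) + len(leaf_2), len(leaf_1))
```

The binomial count is a general property of interleaving two ordered sets and is given only for orientation.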
SAFA method
In the data collection process of this application described above, both stage 2 and stage 4 require SAFA processing of the data converted by the user terminals. SAFA processing is described in detail here.
To collect the distribution information needed to build the RI model, the data collection platform must estimate, while satisfying LDP, the frequency of every possible value of each attribute in the tuples converted by users.
The data collection platform could complete this task by directly calling Harmony, the current state-of-the-art method for analyzing multi-attribute data under LDP. Specifically, for each attribute A_j in the attribute set, the data collector maps the value space of A_j to a binary matrix Φ_j of size |dom(A_j)| × |dom(A_j)|. Then, for each user u_i, the data collector randomly selects one attribute from the set (say A_r) and calls the SH algorithm [11] to collect the value of A_r in the tuple t_i of u_i.
It can be observed that, whether Rule I or Rule II is used, the value space of every attribute in the converted attribute set is much smaller than d!. However, even for attributes with small value spaces, Harmony still maps the value space of each dimension into a matrix, so the collected data contain unnecessary noise, especially when binary attributes are processed. The literature has pointed out that the generalized randomized response algorithm performs best when estimating the frequencies of a small number of discrete values. We therefore propose a new LDP algorithm, named Sampling Randomizer for Multiple Attributes (SAFA), which performs frequency estimation on multi-attribute data with small value spaces more accurately while satisfying LDP. The main idea of the method is that each user terminal randomly selects one attribute, perturbs the value of that attribute with the generalized randomized response algorithm, and sends the perturbed result to the data collection platform.
As mentioned above, the SAFA algorithm is applied in both stage 2 and stage 4. SAFA is a process that the user terminals and the data collection platform complete cooperatively. The SAFA algorithm applied in stage 2 and in stage 4 is the same; only the distribution information to be obtained differs, so the processing is described once below.
The detailed SAFA procedure is as follows:
1. The data collection platform initializes all vectors in the vector set, i.e. sets every value in each vector to 0. For stage 2, the vector set is the distribution set collected in that stage; for stage 4, the vector set is the union of the relative ranking distribution set and the interleaving distribution set;
2. For each user terminal u_i, perform the following operations:
2.1. The data collection platform randomly selects an index j from the attribute set obtained by the Rule I or Rule II conversion;
2.2. The data collection platform sends j to u_i;
2.3. u_i generates a noisy value index that satisfies the perturbation condition, where k ∈ {1, 2, ..., |dom(A_j)|};
2.4. u_i sends the noisy value index to the data collector;
2.5. The data collector increases the corresponding value of z_j by 1;
After the above operations have been executed for all user terminals, continue with the following operation:
3. For each vector z_j in the set, perform the following processing:
3.1. The data collector sets a first probability;
3.2. The data collector sets a second probability;
3.3. Each value z_j[k] of the vector z_j is updated accordingly.
The above is the specific processing of the SAFA method. To show that the SAFA method in this application satisfies local differential privacy, a theoretical proof is given below.
Theorem: for any user u_i and privacy budget ε', SAFA satisfies ε'-LDP.
Proof:
By the definition of LDP, for any two different tuples t_i and t'_i and any reported value index, where j is the attribute index selected by the data collector, we need to prove inequality (1).
Because j is selected at random, the choice of j does not depend on the tuple.
We discuss (1) in all 4 possible cases.
Case 1: if ... and ...
Case 2: if ... and ...
Case 3: if ... and ...
Case 4: if ... and ...
In conclusion, inequality (1) holds. The theorem is therefore proved.
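The four cases can be illustrated with a short worked bound. It assumes the standard generalized randomized response probabilities, which are well known but are an assumption here because the formulas of this proof are not preserved in the text; p is the probability of reporting the true index and q the probability of reporting any particular other index.

```latex
% Assumed generalized randomized response probabilities for attribute A_j
p = \frac{e^{\varepsilon'}}{e^{\varepsilon'} + |dom(A_j)| - 1}, \qquad
q = \frac{1}{e^{\varepsilon'} + |dom(A_j)| - 1}

% For a fixed j, any reported index k and any two tuples t_i, t'_i, each
% probability is p (k is the true index) or q (it is not), giving 4 cases:
\frac{\Pr[\text{report } k \mid t_i[A_j]]}{\Pr[\text{report } k \mid t'_i[A_j]]}
\in \left\{ \frac{p}{p},\ \frac{p}{q},\ \frac{q}{p},\ \frac{q}{q} \right\},
\qquad
\max = \frac{p}{q} = e^{\varepsilon'}
```

Since j is chosen independently of the tuple, the probability of selecting j is the same in the numerator and the denominator and cancels, so the overall ratio is bounded by e^{ε'} under this assumption.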
The formal analysis above shows that the preference ranking data collection algorithm satisfying local differential privacy in this application (SAFARI) guarantees local differential privacy for each user while ensuring that the data collected by the data collector have high utility.
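The SAFA procedure described earlier (sample one attribute per user, perturb its value with generalized randomized response, count the reports, then correct the counts) can be sketched as below. This is a simulation under assumptions: the probabilities p and q are the standard generalized randomized response values, the correction step is the usual unbiased estimator normalized by the number of reports per attribute, and all function names are hypothetical. The exact formulas of the application are not preserved in the text, so this should not be read as its definitive update rule.

```python
import math
import random


def grr_probabilities(domain_size, epsilon):
    """Assumed standard GRR probabilities: keep (p) and any other value (q)."""
    denom = math.exp(epsilon) + domain_size - 1
    return math.exp(epsilon) / denom, 1.0 / denom


def grr_perturb(true_index, domain_size, epsilon, rng):
    """User side: report the true index with probability p, else another one."""
    p, _ = grr_probabilities(domain_size, epsilon)
    if rng.random() < p:
        return true_index
    other = rng.randrange(domain_size - 1)   # uniform over the other values
    return other if other < true_index else other + 1


def safa_collect(tuples, domain_sizes, epsilon, seed=0):
    """Simulate collection; returns estimated value frequencies per attribute."""
    rng = random.Random(seed)
    counts = [[0] * size for size in domain_sizes]   # the vectors z_j
    reports = [0] * len(domain_sizes)
    for t in tuples:
        j = rng.randrange(len(domain_sizes))         # platform samples j
        noisy = grr_perturb(t[j], domain_sizes[j], epsilon, rng)
        counts[j][noisy] += 1
        reports[j] += 1
    estimates = []
    for j, size in enumerate(domain_sizes):
        p, q = grr_probabilities(size, epsilon)
        n_j = reports[j]
        estimates.append(
            [(c / n_j - q) / (p - q) if n_j else 0.0 for c in counts[j]]
        )
    return estimates


# Every simulated user holds the tuple (0, 1, 2) over three 3-valued attributes.
est = safa_collect([[0, 1, 2]] * 3000, [3, 3, 3], epsilon=1.0)
```

Value indexes are 0-based in this sketch. Because each user reports a single attribute, the whole privacy budget of the stage is spent on that one report instead of being divided among all attributes.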
The preference ranking data collection method satisfying local differential privacy described above is summarized as follows:
1. The data collection platform initializes every vector z_j in the first vector set to a zero vector and, for each user terminal u_i, selects a preference item index j from the preset preference item set and sends it to the corresponding user terminal; here i is the user terminal index and j is the index of a preference item in the preference item set;
2. Each user terminal uses the preference ranking data of its own user to assign values t_i[A_j] to a tuple containing all attributes of the attribute set and, according to the preference item index j sent to it by the data collection platform, uses t_i[A_j] to generate a value index, which it sends to the data collection platform; here A_j is the j-th attribute of the attribute set, the number of attributes in the tuple equals the number of preference items in the preference ranking data, attributes correspond one-to-one with preference items, and the value of each attribute equals the rank of the corresponding preference item; the value index satisfies the perturbation condition;
3. Using the value index sent by each user terminal, the data collection platform adds 1 to the corresponding value in the first vector set;
4. The data collection platform updates each value z_j[k] of every vector in the first vector set, where ε' is the preset first privacy budget;
5. From the first vector set, the data collection platform determines the required distributions, uses the first vector set and those distributions to compute the mutual information of all triples in the preference item set, constructs the K-thin chain, and sends it to each user terminal;
6. The data collection platform initializes every vector z_j' in the second vector set to a zero vector and, for each user terminal u_i, selects a preference item index j from the preset preference item set and sends it to the corresponding user terminal; here i is the user terminal index and j is the index of a preference item in the preference item set, and the indexes selected for different user terminals may be the same or different;
7. Each user terminal uses the preference ranking data of its own user to assign values t_i'[A_j'] to a tuple containing all attributes of the attribute set and, according to the preference item index j sent to it by the data collection platform, uses t_i'[A_j'] to generate a value index, which it sends to the data collection platform; here the attribute set comprises two subsets, one corresponding to the set of leaf itemsets of the K-thin chain and the other to its set of internal itemsets, and the value index satisfies the perturbation condition;
8. Using the value index sent by each user terminal, the data collection platform adds 1 to the corresponding value in the second vector set;
9. The data collection platform updates each value z_j'[k] of every vector in the second vector set, where ε'' is the preset second privacy budget, ε' + ε'' = ε, and ε is the overall privacy budget for building the RI model;
10. From the second vector set (namely the union of the relative ranking distribution set and the interleaving distribution set), the leaf node distribution information and the internal node distribution information of the K-thin chain are obtained;
11. New preference ranking data are generated using the established RI model, where the RI model consists of the leaf node distribution information and the internal node distribution information of the K-thin chain.
In the above data collection method, the processing of step 1 and the assignment of t_i[A_j] by the user terminal in step 2 can be executed in any order; likewise, the processing of step 6 and the assignment of t_i'[A_j'] by the user terminal in step 7 can be executed in any order.
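The mutual information computed in step 5 can be sketched once a joint distribution is available. The sketch assumes, as an interpretation that the text does not confirm, that the joint table relates the rank of one item (rows) to a binary relative order indicator of two other items (columns), and that the table is a valid probability table (non-negative, summing to 1); estimates obtained under LDP may need clipping and renormalizing first. The function name is hypothetical.

```python
import math


def mutual_information(joint):
    """Mutual information I(X;Y) in nats for a table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0:   # zero cells contribute nothing
                mi += pxy * math.log(pxy / (px[x] * py[y]))
    return mi


# Two made-up 2 x 2 tables: one independent, one fully dependent.
independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

A value near zero suggests that the rank of the item carries little information about the relative order of the other two items, which is the kind of evidence a structure-learning step can use when grouping items.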
Next, a comparison with RAPPOR, SH and OLH shows that the SAFARI method proposed in this application has a clear advantage in the utility of the data collected by the data collection platform. To better illustrate the advantages of the method, the first-order marginal distribution (Q1) and the second-order marginal distribution (Q2) are used to measure the utility of the preference ranking data collected by the four algorithms RAPPOR, SH, OLH and SAFARI. For both the first-order and the second-order marginals, utility is measured as the L1 distance between the marginal distribution of the data generated by each algorithm and that of the original data. The experimental setup is as follows: two real data sets, Sushi and Jester, are used to test the performance of each method. The characteristics of the two data sets are shown in Table 2.
Table 2. Data set characteristics
Data set | Number of users | Number of items |
---|---|---|
Sushi | 5,000 | 3~10 |
Jester | 20,000 | 3~10 |
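The utility metric described before Table 2 can be sketched for the first-order case as follows. The function names and the two tiny ranking sets are made up for illustration; whether the reported distances are additionally normalized or averaged is not stated, so the sketch uses the plain sum of absolute differences.

```python
def first_order_marginals(rankings, items):
    """m[i][r]: fraction of rankings that place items[i] at position r (0-based)."""
    d = len(items)
    index = {item: i for i, item in enumerate(items)}
    m = [[0.0] * d for _ in range(d)]
    for ranking in rankings:
        for r, item in enumerate(ranking):
            m[index[item]][r] += 1.0 / len(rankings)
    return m


def l1_distance(a, b):
    """Sum of absolute differences between two marginal tables."""
    return sum(
        abs(x - y)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


items = ["a", "b", "c"]
original = [["a", "b", "c"], ["a", "c", "b"]]     # made-up original rankings
generated = [["a", "b", "c"], ["b", "a", "c"]]    # made-up generated rankings

distance = l1_distance(
    first_order_marginals(original, items),
    first_order_marginals(generated, items),
)
```

A smaller distance means that the generated rankings reproduce the item-position statistics of the original data more closely.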
The performance of the SAFARI method is illustrated below by analyzing the experimental data.
First, the first-order and second-order marginal distributions are used to measure the performance of the four methods RAPPOR, SH, OLH and SAFARI. The experimental results are shown in Figure 3.
As Figure 3 shows, on both data sets, as the privacy budget increases, the L1 distance between the first-order and second-order marginals of the data generated by RAPPOR, SH, OLH and SAFARI and the marginals of the original data set decreases, but the result of SAFARI is consistently smaller than those of RAPPOR, SH and OLH. There are two reasons. On the one hand, for SAFARI, the K-thin chain makes the accuracy of the relevant distribution information gathered by the data collector highly robust, so it is less affected by the added noise. On the other hand, RAPPOR, SH and OLH introduce a large amount of noise when the privacy budget decreases.
Next, the effectiveness of Rule I is tested on the Sushi and Jester data sets by comparing it with another version of Rule I, denoted Rule I*. In Rule I*, each user converts his or her preference ranking data so as to provide the required distribution information directly. The data collector uses the SAFA method to collect distribution information from the data that users convert under Rule I and under Rule I* respectively, and the average L1 distance of the resulting distributions in S_3 is reported. The experimental results are shown in Figure 4.
As Figure 4 shows, for both data sets, Rule I* yields better utility when d is no more than 4. This is because, when d is small, the advantage brought by Lasso regression cannot outweigh the impact of the information loss it brings. However, when d is relatively large, Rule I yields considerably better results, which demonstrates the superiority of Rule I.
Finally, the effectiveness of the SAFA algorithm is tested on the Sushi and Jester data sets by comparing it with the Harmony method. The data collector uses SAFA and Harmony respectively to collect the distribution information in S_1 from the data that users convert under Rule I, and the average L1 distance of the resulting distributions is reported. The experimental results are shown in Figure 5.
As Figure 5 shows, for both data sets, the distribution information collected with the SAFA method contains less noise. This is because, when the value space of an attribute is small, the step in the Harmony algorithm that maps the value space of each attribute to a matrix introduces unnecessary noise.
The above tests show that the preference ranking data collection realized by the data collection method of this application can avoid privacy leakage while ensuring that the collected preference ranking data have high utility.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the invention. Any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (5)
1. A preference ranking data collection method satisfying local differential privacy, characterized by comprising:
a data collection platform initializes every vector z_j in a first vector set to a zero vector and, for each user terminal u_i, selects a preference item index j from a preset preference item set and sends it to the corresponding user terminal; where i is the user terminal index and j is the index of a preference item in the preference item set;
each user terminal uses the preference ranking data of its own user to assign values t_i[A_j] to a tuple t_i containing all attributes of an attribute set and, according to the preference item index j sent to that user terminal by the data collection platform, uses t_i[A_j] to generate a value index and sends it to the data collection platform; where A_j is the j-th attribute of the attribute set, t_i[A_j] denotes the value of A_j in t_i, the number of attributes in the tuple equals the number of preference items in the preference ranking data, attributes correspond one-to-one with preference items, and the value of each attribute equals the rank of the corresponding preference item; the value index satisfies a condition in which I(t_i[A_j]) denotes the index of t_i[A_j] in dom(A_j) and dom(A_j) denotes the value space of attribute A_j;
the data collection platform uses the value index sent by each user terminal to add 1 to the corresponding value in the first vector set;
the data collection platform updates each value z_j[k] of every vector in the first vector set, where ε' is a preset first privacy budget;
the data collection platform determines the required distributions from the first vector set, uses the first vector set and those distributions to compute the mutual information of all triples in the preference item set, constructs a K-thin chain, and sends it to each user terminal;
the data collection platform initializes every vector z_j' in a second vector set to a zero vector and, for each user terminal u_i, selects a preference item index j from the preset preference item set and sends it to the corresponding user terminal; where i is the user terminal index and j is the index of a preference item in the preference item set, and the indexes selected for different user terminals are the same or different;
each user terminal uses the preference ranking data of its own user to assign values t_i'[A_j'] to a tuple containing all attributes of an attribute set and, according to the preference item index j sent to that user terminal by the data collection platform, uses t_i'[A_j'] to generate a value index and sends it to the data collection platform; where the attribute set comprises two subsets, one corresponding to the set of leaf itemsets of the K-thin chain and the other to its set of internal itemsets, and the value index satisfies the perturbation condition;
the data collection platform uses the value index sent by each user terminal to add 1 to the corresponding value in the second vector set;
the data collection platform updates each value z_j'[k] of every vector in the second vector set, where ε'' is a preset second privacy budget, ε' + ε'' = ε, and ε is the overall privacy budget for building the RI model;
the leaf node distribution information and the internal node distribution information of the K-thin chain are obtained from the second vector set;
preference ranking data are generated using the RI model comprising the leaf node distribution information and the internal node distribution information.
2. The method according to claim 1, characterized in that the method further comprises: generating a specified quantity of preference ranking data according to the mutual information of the triples and the leaf node distribution information and internal node distribution information of the K-thin chain.
3. The method according to claim 1 or 2, characterized in that the data collection platform determining the first of the required distributions from the first vector set comprises:
for each distribution concerned, constructing a Lasso regression model comprising: a column vector of length 2d, which stores the information of two distributions; a binary matrix of size 2d × d(d-1); and a column vector of length d(d-1), which stores the joint distribution information;
solving the Lasso regression model by the least angle regression method to obtain an estimate and determine the joint distribution, and then computing the required distribution from the joint distribution.
4. The method according to claim 1 or 2, characterized in that the data collection platform determining the second of the required distributions from the first vector set comprises:
for each distribution concerned, constructing a Lasso regression model comprising: a column vector of length (d+2), which stores the information of two distributions; a binary matrix of size (d+2) × 2d; and a column vector of length 2d, which stores the joint distribution information;
solving the Lasso regression model by the least angle regression method to obtain an estimate and determine the joint distribution.
5. The method according to claim 1 or 2, characterized in that
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811079995.XA CN109299436B (en) | 2018-09-17 | 2018-09-17 | Preference sorting data collection method meeting local differential privacy |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109299436A true CN109299436A (en) | 2019-02-01 |
CN109299436B CN109299436B (en) | 2021-10-15 |
Family
ID=65163261
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811079995.XA Active CN109299436B (en) | 2018-09-17 | 2018-09-17 | Preference sorting data collection method meeting local differential privacy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109299436B (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140283091A1 (en) * | 2013-03-15 | 2014-09-18 | Microsoft Corporation | Differentially private linear queries on histograms |
CN105740245A (en) * | 2014-12-08 | 2016-07-06 | 北京邮电大学 | Frequent item set mining method |
US20170316346A1 (en) * | 2016-04-28 | 2017-11-02 | Qualcomm Incorporated | Differentially private iteratively reweighted least squares |
CN106991335A (en) * | 2017-02-20 | 2017-07-28 | 南京邮电大学 | A kind of data publication method based on difference secret protection |
CN107871087A (en) * | 2017-11-08 | 2018-04-03 | 广西师范大学 | The personalized difference method for secret protection that high dimensional data is issued under distributed environment |
CN107862219A (en) * | 2017-11-14 | 2018-03-30 | 哈尔滨工业大学深圳研究生院 | The guard method of demand privacy in a kind of social networks |
Non-Patent Citations (1)
Title |
---|
苏炜航等: "一种基于隐树模型的满足差分隐私的高维数据发布算法", 《小型微型计算机系统》 * |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020177484A1 (en) * | 2019-03-01 | 2020-09-10 | 华南理工大学 | Localized difference privacy urban sanitation data report and privacy calculation method |
WO2020248150A1 (en) * | 2019-06-12 | 2020-12-17 | Alibaba Group Holding Limited | Method and system for answering multi-dimensional analytical queries under local differential privacy |
CN110443063A (en) * | 2019-06-26 | 2019-11-12 | 电子科技大学 | The method of the federal deep learning of self adaptive protection privacy |
CN110443063B (en) * | 2019-06-26 | 2023-03-28 | 电子科技大学 | Adaptive privacy-protecting federal deep learning method |
CN112995076B (en) * | 2019-12-17 | 2022-09-27 | 国家电网有限公司大数据中心 | Discrete data frequency estimation method, user side, data center and system |
CN112995076A (en) * | 2019-12-17 | 2021-06-18 | 国家电网有限公司大数据中心 | Discrete data frequency estimation method, user side, data center and system |
CN111669366A (en) * | 2020-04-30 | 2020-09-15 | 南京大学 | Localized differential private data exchange method and storage medium |
CN112329056A (en) * | 2020-11-03 | 2021-02-05 | 石家庄铁道大学 | Government affair data sharing-oriented localized differential privacy method |
CN112329056B (en) * | 2020-11-03 | 2021-11-02 | 石家庄铁道大学 | Government affair data sharing-oriented localized differential privacy method |
WO2022107284A1 (en) * | 2020-11-19 | 2022-05-27 | 日本電信電話株式会社 | Concealment device, concealment method, and program |
CN113111383B (en) * | 2021-04-21 | 2022-05-20 | 山东大学 | Personalized differential privacy protection method and system for vertically-divided data |
CN113111383A (en) * | 2021-04-21 | 2021-07-13 | 山东大学 | Personalized differential privacy protection method and system for vertically-divided data |
CN114091100A (en) * | 2021-11-23 | 2022-02-25 | 北京邮电大学 | Track data collection method and system meeting local differential privacy |
CN114091100B (en) * | 2021-11-23 | 2024-05-03 | 北京邮电大学 | Track data collection method and system meeting local differential privacy |
CN115098931A (en) * | 2022-07-20 | 2022-09-23 | 江苏艾佳家居用品有限公司 | Small sample analysis method for mining personalized requirements of indoor design of user |
CN115098931B (en) * | 2022-07-20 | 2022-12-16 | 江苏艾佳家居用品有限公司 | Small sample analysis method for mining personalized requirements of indoor design of user |
Also Published As
Publication number | Publication date |
---|---|
CN109299436B (en) | 2021-10-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |