CN115277156B - User identity privacy protection method for resisting neighbor attack in social network - Google Patents
- Publication number
- CN115277156B
- Authority
- CN
- China
- Prior art keywords
- nodes
- user
- node
- neighbor
- degree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/04—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
- H04L63/0407—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the identity of one or more communicating identities is hidden
- H04L63/0421—Anonymous communication, i.e. the party's identifiers are hidden from the other party or parties, e.g. using an anonymizer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
Description
Technical Field
The present invention relates to the field of social network privacy protection, and in particular to a method for protecting user identity privacy in a social network against neighbor attacks.
Background Art
In social networks, users fill in information such as name, occupation, phone number, email address, and ID number, which is stored in a database. Beyond personal information, this data also reflects social relationships. Because it contains a great deal of private user information, anonymization must be applied to protect user privacy before social network data is published.
The naive approach to user privacy protection is to remove users' identities and attributes. However, Backstrom et al. showed that under a 1*-neighbor attack users can still be re-identified, so the naive approach does not protect privacy well. Graph structure modification can protect user privacy effectively: before the data is published, nodes and edges are added to or deleted from the original graph, so that the modified graph (called the anonymous graph) achieves identity privacy or attribute privacy. Protecting privacy with graph modification starts by partitioning the nodes. The accuracy of this partition directly affects how much graph information is lost, and a poor partition can reduce the utility of the graph data, so a more accurate partitioning criterion is needed.
Summary of the Invention
In view of this, the object of the present invention is to provide a user identity privacy protection method that resists neighbor attacks in a social network. By modifying the graph structure so that the modified anonymous graph achieves k-anonymity, the method effectively protects users' identity privacy.
To achieve the above object, the present invention adopts the following technical solution:
A user identity privacy protection method for resisting neighbor attacks in a social network, comprising the following steps:
Step 1) Build a social network model and represent it as a graph $G=(V,E)$, where $V$ is the vertex set, representing the users in the social network, and $E$ is the edge set, representing the relationships between users;
The user nodes are initially partitioned into T clusters according to the metrics $d(v)$ and $lc(v)$, where $d(v)$ is the degree of user node $v$, i.e. the number of users connected to that user in the social network, and $lc(v)$ is the local clustering coefficient of user node $v$, i.e. how tightly the neighbors of $v$ are connected to one another. After the partition, the clusters are sorted in descending order of the maximum node degree in each cluster;
Step 2) Preset a user privacy requirement threshold k. If the number of users in a cluster $C_i$ is less than k, compute the difference between the average degree of that cluster and the average degree of each of its two adjacent clusters $C_{i-1}$ and $C_{i+1}$, and merge the cluster into the adjacent cluster with the smaller difference. Repeat until no cluster has fewer than k users;
Step 3) After cluster merging is complete, split every cluster that has more than 2k user nodes so that the number of users in each cluster falls in [k, 2k). Specifically:
S3-1, for the user nodes in each cluster, sort them in descending order of degree and construct the 1*-neighbor graph of each user node;
S3-2, construct the 1*-neighbor structure feature matrix of each user node (matrix not preserved in this text), whose components represent the degree distribution, inner-degree distribution, outer-degree distribution, and gap-degree distribution of user node $v$ in the social network;
S3-3, compute the structural similarity between any two nodes in the same cluster according to the similarity formula (formula not preserved in this text), whose four terms represent the dissimilarity of the two nodes' degree, inner-degree, outer-degree, and gap-degree distributions, and where $k_1$, $k_2$, $k_3$, $k_4$ are the weights of the respective terms and satisfy $k_1+k_2+k_3+k_4=1$;
S3-4, use the K-means clustering algorithm to partition the nodes into T clusters;
Step 4) Based on the 1*-neighbor graphs of the user nodes in each cluster, compute the similarity between each pair of nodes, construct a weighted bipartite graph from it, compute the graph edit distance on the bipartite graph, and from it find the target graph edit path P;
Step 5) According to the graph edit path P found in Step 4), modify the 1*-neighbor graphs of the nodes in the cluster so that they are isomorphic.
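As a minimal sketch of the two Step 1 metrics, the Python below computes $d(v)$ and $lc(v)$ on an undirected graph stored as a dictionary of neighbor sets. The text does not give a formula for $lc(v)$; the standard local clustering coefficient (links among the neighbors of $v$ divided by the maximum possible number of such links) is assumed here, and the function names and the toy graph are illustrative only.

```python
from itertools import combinations

def degree(adj, v):
    """d(v): number of users directly connected to v."""
    return len(adj[v])

def local_clustering(adj, v):
    """lc(v): fraction of neighbor pairs of v that are themselves connected.

    Assumption: the standard definition 2*links / (d*(d-1)); nodes with
    fewer than two neighbors get 0.0.
    """
    d = len(adj[v])
    if d < 2:
        return 0.0
    links = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
    return 2.0 * links / (d * (d - 1))

# Toy undirected graph (hypothetical data): node -> set of neighbors.
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 5}, 4: {1}, 5: {3}}

metrics = {v: (degree(adj, v), local_clustering(adj, v)) for v in adj}
```

The pair `(d(v), lc(v))` per node is the kind of feature the initial partition into T clusters would operate on; how the partition itself is performed is not specified beyond these two metrics.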
Further, the 1*-neighbor graph is a subgraph of the original graph G, defined as:
$G(v)=(V(v),E(v),D(v))$
where $V(v)$ is the set containing user node $v$ itself and its neighbor nodes, $E(v)$ is the set of edges among the nodes in $V(v)$, i.e. the relationships between neighbors, and $D(v)$ is the set of degrees of all nodes in $V(v)$, i.e. the number of neighbors each of them has in the social network.
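The definition above can be sketched in Python as follows. This is an illustration under assumptions: the graph is undirected and stored as a dictionary of neighbor sets, edges are represented as `frozenset` pairs, and the function name is hypothetical.

```python
def one_star_neighbor_graph(adj, v):
    """Return (V(v), E(v), D(v)) for node v.

    V(v): v together with its direct neighbors.
    E(v): edges of the original graph with both endpoints in V(v).
    D(v): degree in the ORIGINAL graph of every node in V(v).
    """
    nodes = {v} | set(adj[v])
    edges = {frozenset((a, b)) for a in nodes for b in adj[a] if b in nodes}
    degrees = {u: len(adj[u]) for u in nodes}
    return nodes, edges, degrees

# Toy undirected graph (hypothetical data).
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 5}, 4: {1}, 5: {3}}
V, E, D = one_star_neighbor_graph(adj, 1)
```

Note that node 5 is outside $V(1)$, yet it still contributes to the degree of node 3 recorded in $D(1)$; this is what distinguishes the 1*-neighbor graph from a plain induced subgraph.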
Further, Step 2) is specifically:
S2-1, for a cluster with fewer than k nodes, denote it $C_i^1$, where the superscript 1 indicates that the cluster results from the first partition. Compute the average degree of the nodes in $C_i^1$ and the average node degree of each of its two adjacent clusters $C_{i-1}^1$ and $C_{i+1}^1$;
S2-2, if the averages satisfy the comparison formula (formula not preserved in this text), add $C_i^1$ to one adjacent cluster; otherwise add it to the other, so that in either case it joins the adjacent cluster whose average degree is closer to its own;
S2-3, repeat the above steps until no cluster has fewer than k nodes.
Further, Step 4) is specifically:
S4-1, if the 1*-neighbor graphs of two user nodes contain different numbers of neighbor nodes, add nodes to the graph with fewer neighbor nodes so that the two graphs have the same number of nodes;
S4-2, construct the matching cost matrix of the user nodes, and construct a weighted bipartite graph using the matching costs as edge weights;
S4-3, use the bipartite graph to compute the graph edit distance between user nodes, obtaining the matched nodes and the graph edit path.
Further, Step 5) is specifically:
S5-1, construct the adjacency matrix of graph G, denoted $A=(a_{ij})_{n\times n}$, where $a_{ij}=1$ when there is an edge between nodes $v_i$ and $v_j$, and $a_{ij}=0$ otherwise;
S5-2, compute $A^2$ and $A^3$ and the matrices derived from them (the conditions applied to their entries are not preserved in this text);
S5-3, for each user node $v$ in the social network, use the matching node $u$ obtained in S4-3 to compute the degree change that $v$ requires. Sort the required degree changes of all user nodes in descending order to obtain the degree modification sequence $D_M$, where $d_v$ denotes the number of neighbors of user node $v$;
S5-4, modify the graph structure according to $D_M$.
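Entry $(i,j)$ of $A^k$ counts the walks of length k from $v_i$ to $v_j$, which is presumably why S5-2 computes $A^2$ and $A^3$: they locate the two-hop and three-hop nodes used later in S5-4-1. The exact thresholding is not preserved in this text, so the sketch below is an assumption: a node is treated as two-hop if it is not adjacent but reachable by a length-2 walk, and as three-hop if it is reachable by a length-3 walk but by nothing shorter.

```python
def matmul(X, Y):
    """Plain-Python product of two square matrices."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def hop_candidates(A):
    """Return (two_hop, three_hop): for each node index, the list of node
    indices first reachable in exactly 2 or exactly 3 steps."""
    n = len(A)
    A2 = matmul(A, A)
    A3 = matmul(A2, A)
    two_hop = {i: [j for j in range(n)
                   if j != i and A[i][j] == 0 and A2[i][j] > 0]
               for i in range(n)}
    three_hop = {i: [j for j in range(n)
                     if j != i and A[i][j] == 0 and A2[i][j] == 0 and A3[i][j] > 0]
                 for i in range(n)}
    return two_hop, three_hop

# Path graph 0 - 1 - 2 - 3 (hypothetical data).
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
two_hop, three_hop = hop_candidates(A)
```

For large social graphs a dense matrix product is costly; a breadth-first search limited to depth three yields the same candidate sets.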
Further, S3-2 is specifically:
S3-2-1, compute the degree distribution of the neighbor nodes in the 1*-neighbor graph $G(v)$ of user node $v$ (formula not preserved in this text), where the degree of a neighbor $v_i$ is the number of its neighbors in the original graph G, and $N(v)$ is the set of all neighbors of user node $v$;
S3-2-2, compute the inner-degree distribution of the neighbor nodes in $G(v)$ (formula not preserved in this text), where the inner degree of $v_i$ is the number of its neighbors inside the 1*-neighbor graph $G(v)$;
S3-2-3, compute the outer-degree distribution of the neighbor nodes in $G(v)$ (formula not preserved in this text), where the outer degree of $v_i$ is the number of its neighbors outside the 1*-neighbor graph $G(v)$;
S3-2-4, compute the gap-degree distribution of the neighbor nodes in $G(v)$ (formula not preserved in this text);
S3-2-5, record the feature matrix of each user node in the social network (matrix not preserved in this text), whose size depends on the number of neighbors of user node $v$ in the social network.
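The three quantities that S3-2-1 to S3-2-3 define in words can be computed directly, as in the sketch below. It returns raw counts per neighbor rather than normalized distributions, because the normalization formulas are not preserved; the gap degree is omitted for the same reason. The function name and sample graph are illustrative.

```python
def neighbor_degree_features(adj, v):
    """For each neighbor u of v, return (degree, inner degree, outer degree).

    degree:       neighbors of u in the original graph.
    inner degree: neighbors of u that lie inside V(v) = {v} + N(v).
    outer degree: neighbors of u that lie outside V(v).
    """
    region = {v} | set(adj[v])
    features = {}
    for u in sorted(adj[v]):
        inner = len(adj[u] & region)
        outer = len(adj[u] - region)
        features[u] = (len(adj[u]), inner, outer)
    return features

# Toy undirected graph (hypothetical data): node -> set of neighbors.
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 5}, 4: {1}, 5: {3}}
features = neighbor_degree_features(adj, 1)
```

By construction the inner and outer degrees of each neighbor sum to its degree, which is a quick sanity check on any implementation.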
Further, S3-3 is specifically:
S3-3-1, for user nodes $v$ and $u$ in the same cluster, use the Jensen-Shannon (JS) divergence to compute the dissimilarity of their degree, inner-degree, outer-degree, and gap-degree distributions. The JS divergence is defined as:
$JS(P\|Q)=\frac{1}{2}KL(P\|M)+\frac{1}{2}KL(Q\|M)$, with $M=\frac{1}{2}(P+Q)$ and $KL(P\|Q)=\sum_{i=1}^{t}p_i\log\frac{p_i}{q_i}$, where $P=\{p_1,p_2,\dots,p_t\}$ and $Q=\{q_1,q_2,\dots,q_t\}$ are two probability distributions on the same probability space.
S3-3-2, compute the similarity vector of user nodes $v$ and $u$ from the four divergences (vector not preserved in this text); the similarity of user nodes $u$ and $v$ is then obtained by weighting its components with $k_1$, $k_2$, $k_3$, $k_4$, where $k_1+k_2+k_3+k_4=1$.
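A short sketch of the JS divergence used in S3-3-1 follows. The base-2 logarithm is an assumption (it bounds the divergence to [0, 1]). The helper `weighted_divergence` is also an assumption: the patent's exact combination formula is not preserved, so it only illustrates a convex combination with weights $k_1$ to $k_4$ and should not be read as the patent's similarity formula.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; 0*log(0) is taken as 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence between two distributions of equal length."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def weighted_divergence(divergences, weights):
    """Hypothetical combination: weighted sum of the four divergences."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * d for w, d in zip(weights, divergences))

# Two hypothetical degree distributions over the same three bins.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
d = js_divergence(p, q)
```

Unlike the KL divergence, the JS divergence is symmetric and always finite, which makes it usable when one node's distribution has empty bins that the other's does not.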
Further, S4-2 is specifically:
S4-2-1, for any pair of vertices $v$ and $u$ in the same cluster, let $G(v)=(V_1,E_1)$ and $G(u)=(V_2,E_2)$ be their 1*-neighbor graphs. For each node $v_i\in G(v)$, compute its matching cost $c_{ij}$ with every node of $G(u)$ (cost formula not preserved in this text);
S4-2-2, construct the cost matrix $C=(c_{ij})$;
S4-2-3, construct the weighted bipartite graph B, in which $V_1$ and $V_2$ are the two vertex sets, each containing the same number x of nodes, the edge set connects the nodes of $V_1$ to the nodes of $V_2$, and the edge weight matrix is $W=(w_{ij})$ with $w_{ij}=c_{ij}$.
Further, S4-3 is specifically:
S4-3-1, select the nodes of maximum degree as the matching seed node pair;
S4-3-2, use the Monte Carlo method to find the optimal matching of the bipartite graph B;
S4-3-3, find the graph edit path corresponding to the optimal matching, $P=\{v_1\to u_{t_1}, v_2\to u_{t_2}, \dots, v_x\to u_{t_x}\}$, where $u_{t_1}, u_{t_2}, \dots, u_{t_x}$ are the matching nodes of $v_1, v_2, \dots, v_x$ respectively.
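To make S4-1 through S4-3 concrete, the sketch below pads the smaller node set, builds a cost matrix, and finds a minimum-cost matching with the corresponding edit path. Two substitutions should be noted plainly: the cost $c_{ij}$ is assumed to be the absolute degree difference (the patent's cost formula is not preserved), and an exhaustive search over permutations stands in for the Monte Carlo method of S4-3-2, which keeps the example deterministic but only suits very small x.

```python
from itertools import permutations

def match_cost_matrix(deg_v, deg_u):
    """Cost matrix between two neighbor-degree lists, padded to equal size.

    Assumptions: padding nodes have degree 0, and c_ij = |d_i - d_j|.
    """
    x = max(len(deg_v), len(deg_u))
    a = sorted(deg_v, reverse=True) + [0] * (x - len(deg_v))
    b = sorted(deg_u, reverse=True) + [0] * (x - len(deg_u))
    return [[abs(ai - bj) for bj in b] for ai in a]

def optimal_matching(cost):
    """Exhaustive minimum-cost perfect matching (stand-in for Monte Carlo)."""
    x = len(cost)
    best = min(permutations(range(x)),
               key=lambda p: sum(cost[i][p[i]] for i in range(x)))
    total = sum(cost[i][best[i]] for i in range(x))
    return list(best), total

# Hypothetical neighbor degrees of two nodes v and u in the same cluster.
cost = match_cost_matrix([3, 2, 1], [3, 1])
assignment, total = optimal_matching(cost)
# Edit path P as pairs (index of v_i, index of its matched u_{t_i}).
edit_path = list(enumerate(assignment))
```

With these inputs the padded cost matrix is `[[0, 2, 3], [1, 1, 2], [2, 0, 1]]` and the cheapest matching has total cost 2.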
Further, S5-4 is specifically:
S5-4-1, if the degree modification value of user node $v_i$ indicates that $v_i$ needs additional edges, search the two-hop and then the three-hop neighbors of $v_i$ for nodes that also need additional edges, and connect $v_i$ to them. If fewer edges than required can be added this way, add fake nodes and connect them to $v_i$ until the total number of added edges equals the required number. Specifically:
S5-4-1-1, search the two-hop nodes of $v_i$ for a node whose degree needs to be increased, denoted $v_j$; if one exists, add an edge between $v_i$ and $v_j$;
S5-4-1-2, if no two-hop node needs its degree increased, search the three-hop nodes for such a node, denoted $v_j$; if one exists, add an edge between $v_i$ and $v_j$;
S5-4-1-3, repeat the above steps until $v_i$ needs no more edges or no two-hop or three-hop node needs its degree increased;
S5-4-1-4, if $v_i$ needs no more edges, terminate;
S5-4-1-5, if $v_i$ still needs edges and no two-hop or three-hop node needs its degree increased, add the corresponding number of fake nodes and connect them to $v_i$;
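The edge-addition procedure S5-4-1 can be sketched as follows. The representation is an assumption: `need[v]` holds how many edges node v still has to gain, candidates at distance two are tried before those at distance three, ties are broken by the smallest node id, and fake nodes are labeled with a hypothetical `("fake", v, n)` tuple.

```python
def nodes_at_distance(adj, src, dist):
    """Nodes whose shortest-path distance from src is exactly dist (BFS layers)."""
    seen = {src}
    frontier = {src}
    for _ in range(dist):
        frontier = {w for u in frontier for w in adj[u]} - seen
        seen |= frontier
    return frontier

def add_edges_for(adj, need, v):
    """Add edges to v until need[v] reaches 0; return the number of fake nodes."""
    fake_count = 0
    while need[v] > 0:
        partner = None
        for dist in (2, 3):  # two-hop candidates first, then three-hop
            cands = sorted(u for u in nodes_at_distance(adj, v, dist)
                           if need.get(u, 0) > 0)
            if cands:
                partner = cands[0]
                break
        if partner is None:
            # No real candidate left: fall back to a fake node.
            partner = ("fake", v, fake_count)
            fake_count += 1
            adj[partner] = set()
        else:
            need[partner] -= 1
        adj[v].add(partner)
        adj[partner].add(v)
        need[v] -= 1
    return fake_count

# Path graph 0 - 1 - 2 - 3; node 0 needs two more edges, node 2 needs one.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
need = {0: 2, 1: 0, 2: 1, 3: 0}
fakes = add_edges_for(adj, need, 0)
```

In this run node 0 is first linked to its two-hop node 2, which also needed an edge; no further real candidate exists, so the second edge goes to one fake node.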
S5-4-2, if the degree modification value of user node $v_i$ indicates that $v_i$ needs edges deleted, look among its neighbors for nodes that also need edges deleted and delete the edges between $v_i$ and them, stopping when the number of deleted edges equals the required number. If too few edges can be deleted this way, delete the remaining edges incident to $v_i$ in ascending order of edge betweenness centrality until the total number of deleted edges equals the required number. Specifically:
S5-4-2-1, search the neighbors of $v_i$ for nodes whose degree needs to be reduced, add them to a candidate set CS, and sort CS in descending order of node degree;
S5-4-2-2, delete the edges between $v_i$ and the nodes in CS one by one;
S5-4-2-3, if the required number of edges has been deleted, terminate;
S5-4-2-4, otherwise, delete the remaining edges incident to $v_i$ in ascending order of edge betweenness centrality until the required number has been deleted. The edge betweenness centrality of an edge is the ratio of the number of shortest paths between all pairs of users in the social network that pass through the edge to the total number of shortest paths between all pairs of nodes in the network.
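The edge-deletion procedure S5-4-2 might look like the sketch below. Assumptions: a negative `need[v]` means v must lose that many edges, and the edge betweenness values are supplied as a precomputed mapping keyed by `frozenset` pairs (they could come from any standard implementation, for example `networkx.edge_betweenness_centrality`); the numbers used here are made up for illustration.

```python
def delete_edges_for(adj, need, v, edge_betweenness):
    """Delete edges at v until need[v] rises to 0."""

    def drop(u):
        adj[v].discard(u)
        adj[u].discard(v)
        need[v] += 1

    # S5-4-2-1 / S5-4-2-2: neighbors that must also lose edges, highest degree first.
    cs = sorted((u for u in adj[v] if need.get(u, 0) < 0),
                key=lambda u: len(adj[u]), reverse=True)
    for u in cs:
        if need[v] == 0:
            break
        drop(u)
        need[u] += 1

    # S5-4-2-4: remaining incident edges, lowest edge betweenness first.
    rest = sorted(adj[v], key=lambda u: edge_betweenness[frozenset((v, u))])
    for u in rest:
        if need[v] == 0:
            break
        drop(u)

# Node 0 must lose two edges, node 1 must lose one (hypothetical data).
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
need = {0: -2, 1: -1, 2: 0, 3: 0}
eb = {frozenset((0, 1)): 0.2, frozenset((0, 2)): 0.1, frozenset((0, 3)): 0.5}
delete_edges_for(adj, need, 0, eb)
```

Removing low-betweenness edges first is a utility-preserving choice: such edges carry few shortest paths, so deleting them disturbs distances in the published graph the least.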
Compared with the prior art, the present invention has the following beneficial effects:
When the graph data of a social network is subjected to a 1*-neighbor attack, the present invention uses graph modification to protect user identity privacy. The 1*-neighbor graphs within the same cluster are modified according to the graph edit distance so that they become probabilistically indistinguishable. User identity privacy in the social network is thereby protected while the utility of the graph data is improved. The user identity privacy protection method for resisting 1*-neighbor attacks in a social network provided by the present invention is therefore well suited to practical application and wider adoption.
Brief Description of the Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the original karate graph and the 1*-neighbor graph of the node labeled 1 in that graph, according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the bipartite graph in an embodiment of the present invention.
Detailed Description
The present invention is further described below with reference to the accompanying drawings and embodiments.
Referring to FIG. 1, the present invention provides a user identity privacy protection method for resisting 1*-neighbor attacks in a social network, comprising the following steps:
Step 1) For a given graph $G=(V,E)$, partition the nodes into several clusters according to the metrics $d(v)$ and $lc(v)$, where $d(v)$ and $lc(v)$ are the degree and the local clustering coefficient of node $v$, respectively. After the partition, sort the clusters in descending order of the maximum node degree in each cluster;
Step 2) After this coarse partition, some clusters contain fewer nodes than the given privacy requirement k. Each such cluster is merged into whichever of its two adjacent clusters has an average degree closer to its own, so that no cluster is smaller than k;
Step 2) is specifically:
S2-1, for a cluster with fewer than k nodes, denote it $C_i^1$, where the superscript 1 indicates that the cluster results from the first partition. Compute the average degree of the nodes in $C_i^1$ and the average node degree of each of its two adjacent clusters $C_{i-1}^1$ and $C_{i+1}^1$;
S2-2, if the averages satisfy the comparison formula (formula not preserved in this text), add $C_i^1$ to one adjacent cluster; otherwise add it to the other, so that in either case it joins the adjacent cluster whose average degree is closer to its own;
S2-3, repeat the above steps until no cluster has fewer than k nodes.
Step 3) After cluster merging is complete, some clusters contain more than 2k nodes. These are split so that the size of every cluster falls in [k, 2k);
Step 3) is specifically:
S3-1, for the user nodes in each cluster, sort them in descending order of degree and construct the 1*-neighbor graph of each user node;
S3-2, construct the 1*-neighbor structure feature matrix of each user node (matrix not preserved in this text), whose components represent the degree distribution, inner-degree distribution, outer-degree distribution, and gap-degree distribution of user node $v$ in the social network;
S3-2-1, compute the degree distribution of the neighbor nodes in the 1*-neighbor graph $G(v)$ of user node $v$ (formula not preserved in this text), where the degree of a neighbor $v_i$ is the number of its neighbors in the original graph G, and $N(v)$ is the set of all neighbors of user node $v$;
S3-2-2, compute the inner-degree distribution of the neighbor nodes in $G(v)$ (formula not preserved in this text), where the inner degree of $v_i$ is the number of its neighbors inside the 1*-neighbor graph $G(v)$;
S3-2-3, compute the outer-degree distribution of the neighbor nodes in $G(v)$ (formula not preserved in this text), where the outer degree of $v_i$ is the number of its neighbors outside the 1*-neighbor graph $G(v)$;
S3-2-4, compute the gap-degree distribution of the neighbor nodes in $G(v)$ (formula not preserved in this text);
S3-2-5, record the feature matrix of each user node in the social network (matrix not preserved in this text), whose size depends on the number of neighbors of user node $v$ in the social network.
S3-3, compute the structural similarity between any two nodes in the same cluster according to the similarity formula (formula not preserved in this text), whose four terms represent the dissimilarity of the two nodes' degree, inner-degree, outer-degree, and gap-degree distributions, and where $k_1$, $k_2$, $k_3$, $k_4$ are the weights of the respective terms and satisfy $k_1+k_2+k_3+k_4=1$;
S3-3-1, for user nodes $v$ and $u$ in the same cluster, use the Jensen-Shannon (JS) divergence to compute the dissimilarity of their degree, inner-degree, outer-degree, and gap-degree distributions. The JS divergence is defined as $JS(P\|Q)=\frac{1}{2}KL(P\|M)+\frac{1}{2}KL(Q\|M)$, with $M=\frac{1}{2}(P+Q)$ and $KL(P\|Q)=\sum_{i=1}^{t}p_i\log\frac{p_i}{q_i}$, where $P=\{p_1,p_2,\dots,p_t\}$ and $Q=\{q_1,q_2,\dots,q_t\}$ are two probability distributions on the same probability space.
S3-3-2, compute the similarity vector of user nodes $v$ and $u$ from the four divergences (vector not preserved in this text); the similarity of user nodes $u$ and $v$ is then obtained by weighting its components with $k_1$, $k_2$, $k_3$, $k_4$, where $k_1+k_2+k_3+k_4=1$.
S3-4, use the K-means clustering algorithm to partition the nodes into T clusters.
Step 4) Based on the 1*-neighbor graphs of the nodes in each cluster, compute the similarity between nodes, construct a weighted bipartite graph, compute the graph edit distance on the bipartite graph, and find the graph edit path P;
Step 4) is specifically:
S4-1, if the 1*-neighbor graphs of two user nodes contain different numbers of neighbor nodes, add nodes to the graph with fewer neighbor nodes so that the two graphs have the same number of nodes;
S4-2, construct the matching cost matrix of the nodes, and construct a weighted bipartite graph using the matching costs as edge weights;
S4-2-1, for any pair of vertices $v$ and $u$ in the same cluster, let $G(v)=(V_1,E_1)$ and $G(u)=(V_2,E_2)$ be their 1*-neighbor graphs. For each node $v_i\in G(v)$, compute its matching cost $c_{ij}$ with every node of $G(u)$ (cost formula not preserved in this text);
S4-2-2, construct the cost matrix $C=(c_{ij})$;
S4-2-3, construct the weighted bipartite graph B, in which $V_1$ and $V_2$ are the two vertex sets, each containing the same number x of nodes, the edge set connects the nodes of $V_1$ to the nodes of $V_2$, and the edge weight matrix is $W=(w_{ij})$ with $w_{ij}=c_{ij}$.
S4-3, use the bipartite graph to compute the graph edit distance between nodes, obtaining the matched nodes and the graph edit path.
S4-3-1, select the nodes of maximum degree as the matching seed node pair;
S4-3-2, use the Monte Carlo method to find the optimal matching of the bipartite graph B;
S4-3-3, find the graph edit path corresponding to the optimal matching, $P=\{v_1\to u_{t_1}, v_2\to u_{t_2}, \dots, v_x\to u_{t_x}\}$, where $u_{t_1}, u_{t_2}, \dots, u_{t_x}$ are the matching nodes of $v_1, v_2, \dots, v_x$ respectively.
步骤5),根据步骤4)找到的图编辑路径P,修改簇中节点的1*-邻居图,使得他们同构。Step 5), based on the graph editing path P found in step 4), modify the 1*-neighbor graphs of the nodes in the cluster so that they are isomorphic.
步骤5)方法为:Step 5) The method is:
S5-1,构造图G的邻接矩阵记为A=(aij)n×n,其中当节点vi和vj间存在边时, aij=1,否则,aij=0;S5-1, construct the adjacency matrix of graph G, denoted as A = (a ij ) n × n , where a ij = 1 when there is an edge between nodes v i and v j , otherwise, a ij = 0;
S5-2,计算A2及A3,及若则令若则令计算 S5-2, calculate A 2 and A 3 , and like Then like Then calculate
S5-3,对于社交网络中每个用户节点v,根据S4-3计算得到的匹配节点u,计算出节点v需要修改的度并记为 将社交网络中每个用户节点需修改的度按照降序排列,得到的度修改序列记为其中, dv表示用户节点v的邻居个数;S5-3, for each user node v in the social network, according to the matching node u calculated in S4-3, calculate the degree of node v that needs to be modified and record it as Arrange the degree of each user node in the social network that needs to be modified in descending order, and the degree modification sequence is recorded as Among them, d v represents the number of neighbors of user node v;
S5-4,按照DM修改图结构。S5-4, modify the graph structure according to D M.
S5-4-1,若表示用户节点vi需增加条边,则分别在在节点的两跳和三跳邻居节点间寻找需要增加边的节点,并连边,若连边数量小于则添加假节点并与vi连边最终使得连边总数等于 S5-4-1, if Indicates that user node v i needs to increase edges, then find the nodes that need to add edges between the two-hop and three-hop neighbor nodes of the node and connect them. If the number of connected edges is less than Then add a fake node and connect it to v i so that the total number of connected edges equals
S5-4-1-1,在节点vi的两跳节点搜寻需要增加度的节点,预设为鼸i,若则在vi和vj之间增加一条边, S5-4-1-1, search for the node whose degree needs to be increased in the two-hop nodes of node v i , which is preset as 鼸i . If Then add an edge between vi and vj ,
S5-4-1-2,若不存在需要增加度的两跳节点,则在三跳节点中搜寻需要增加度的节点,预设为鼸i,若则在vi和vj之间增加一条边, S5-4-1-2, if there is no two-hop node that needs to increase the degree, search for the node that needs to increase the degree among the three-hop nodes, and preset it as 鼸i . Then add an edge between vi and vj ,
S5-4-1-3,重复上述步骤,直到或者不存在需要增加度的两跳及三跳节点;S5-4-1-3, repeat the above steps until Or there are no two-hop or three-hop nodes that need to increase the degree;
S5-4-1-4,若终止步骤;S5-4-1-4, if Termination step;
S5-4-1-5,若且不存在需要增加度的两跳及三跳节点,则增加相应个数的假节点,并与vi相连。S5-4-1-5, if If there are no two-hop or three-hop nodes whose degree needs to be increased, then a corresponding number of fake nodes are added and connected to vi .
S5-4-2: If user node v_i needs to lose edges, look among its neighbors for nodes that also need to lose edges and delete the edges between v_i and those neighbors, stopping when the number of deleted edges equals the required number. If too few edges can be deleted this way, delete the remaining edges incident to v_i in ascending order of edge betweenness centrality until the total number of deleted edges equals the required number.
S5-4-2-1: Search the neighbors of v_i for nodes whose degree needs to decrease, add them to a candidate set CS, and sort CS in descending order of node degree.
S5-4-2-2: Delete, in that order, the edges between v_i and the nodes in CS.
S5-4-2-3: If v_i has lost the required number of edges, terminate.
S5-4-2-4: If v_i still needs to lose edges, delete its remaining incident edges in ascending order of edge betweenness centrality until the required number has been deleted. The edge betweenness centrality of an edge is the ratio of the number of shortest paths between all users in the social network that pass through that edge to the number of shortest paths between all nodes in the network.
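The edge-deletion steps S5-4-2-1 to S5-4-2-4 can be sketched in the same style. Again this is only an illustration under stated assumptions: `adj` is an undirected adjacency mapping, `need` is a hypothetical mapping in which a negative value means the node must lose that many edges, and the function names are introduced here. Edge betweenness follows the ratio form given in the text (shortest paths through the edge divided by all shortest paths), which is not necessarily the per-pair normalised form that graph libraries compute. Computing the scores once, before the fallback deletions, is a simplification of this sketch.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, s):
    """Return (dist, sigma): hop distances and shortest-path counts from s."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def edge_betweenness(adj, edges):
    """Ratio of shortest paths crossing each edge to all shortest paths."""
    info = {n: bfs_counts(adj, n) for n in adj}
    through = {e: 0 for e in edges}
    total = 0
    for s, t in combinations(sorted(adj, key=str), 2):
        (dist_s, sig_s), (dist_t, sig_t) = info[s], info[t]
        if t not in dist_s:
            continue  # s and t are disconnected
        total += sig_s[t]
        for (u, w) in edges:
            for a, b in ((u, w), (w, u)):
                if a in dist_s and b in dist_t and dist_s[a] + 1 + dist_t[b] == dist_s[t]:
                    through[(u, w)] += sig_s[a] * sig_t[b]
    return {e: (through[e] / total if total else 0.0) for e in edges}


def remove_edges(adj, need, v):
    """Lower the degree of v until need[v] reaches 0 (need[v] < 0 on entry)."""
    def drop(u):
        adj[v].discard(u)
        adj[u].discard(v)
        need[v] += 1

    # Candidate set CS: neighbours that must also lose edges, highest degree first.
    cs = sorted((u for u in adj[v] if need.get(u, 0) < 0),
                key=lambda u: (-len(adj[u]), str(u)))
    for u in cs:
        if need[v] == 0:
            break
        drop(u)
        need[u] += 1
    if need[v] < 0:
        # Fallback: remaining incident edges, lowest edge betweenness first.
        scores = edge_betweenness(adj, [(v, u) for u in adj[v]])
        for (_, u) in sorted(scores, key=lambda e: (scores[e], str(e[1]))):
            if need[v] == 0:
                break
            drop(u)
```

Removing low-betweenness edges first targets the edges that carry the fewest shortest paths, which is consistent with the stated goal of limiting structural change to the graph.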
The above describes only preferred embodiments of the present invention. All equivalent changes and modifications made within the scope of the claims of the present invention fall within the scope of the present invention.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210867729.3A CN115277156B (en) | 2022-07-22 | 2022-07-22 | User identity privacy protection method for resisting neighbor attack in social network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115277156A (en) | 2022-11-01 |
CN115277156B (en) | 2023-05-23 |
Family
ID=83768681
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210867729.3A Active CN115277156B (en) | 2022-07-22 | 2022-07-22 | User identity privacy protection method for resisting neighbor attack in social network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115277156B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111723399A (en) * | 2020-06-15 | 2020-09-29 | 内蒙古科技大学 | A privacy-preserving method for directed graphs in large-scale social networks based on k-kernels |
CN113706326A (en) * | 2021-08-31 | 2021-11-26 | 福建师范大学 | Mobile social network diagram modification method based on matrix operation |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8965409B2 (en) * | 2006-03-17 | 2015-02-24 | Fatdoor, Inc. | User-generated community publication in an online neighborhood social network |
US8790182B2 (en) * | 2011-05-27 | 2014-07-29 | Zynga Inc. | Collaborative diplomacy mechanics |
US9292884B2 (en) * | 2013-07-10 | 2016-03-22 | Facebook, Inc. | Network-aware product rollout in online social networks |
CN106354886B (en) * | 2016-10-18 | 2019-05-28 | 南京邮电大学 | The method of potential neighbor relational graph screening nearest-neighbors is utilized in recommender system |
CN114401136B (en) * | 2022-01-14 | 2023-05-05 | 天津大学 | Rapid anomaly detection method for multiple attribute networks |
CN114692205A (en) * | 2022-04-19 | 2022-07-01 | 辽宁工业大学 | Graph anonymization method for privacy protection of weighted social network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||