CN111767567A - Social information security management method

Info

Publication number: CN111767567A
Application number: CN202010577101.0A
Authority: CN
Inventors: 杨良斌; 于腊梅
Original assignee: University of International Relations
Current assignee: University of International Relations
Priority date: 2020-06-22
Filing date: 2020-06-22
Publication date: 2020-10-13

Abstract

The invention discloses a social information security management method in the technical field of information security. The method comprises: obtaining a seed node set by determining the aggregation density of all nodes and generating the seed nodes from the set of nodes with high aggregation density, where the outer loop runs over the number of clusters and the inner loop runs over the high-aggregation-density set; computing a distance matrix; and allocating nodes, where the outer loop runs over the unallocated nodes and the inner loop runs over the clusters, after which the nodes in each cluster are anonymized. Clustering is based on user information and social relations: all nodes in the social network are clustered, according to the distances between them, into super points that each contain at least k nodes, and the super points are then anonymized. The anonymized super points can effectively resist privacy attacks that use node attribute privacy, subgraph structure and similar information as background knowledge, while information loss is reduced and data utility is improved.

Description

Social information security management method

Technical Field

The invention relates to the technical field of information security, in particular to a social information security management method.

Background

With the rapid development of multimedia and mobile social networking technologies, social multimedia data is distributed ever more efficiently, digitized information can be transmitted over the network conveniently and quickly in many forms, and multimedia communication has gradually become an important means of information exchange in daily life. At the same time, however, a series of problems has arisen, including abuse, illegal copying, piracy, plagiarism and misappropriation of multimedia content.

The rapid development of internet technology promotes the rise of various social network platforms. The social network moves the life of people and the connection between people to the internet, so that a large amount of information is accumulated, the information reflects social laws to a certain extent, and a class of data with important research significance and application value is formed.

At present, a number of protection technologies have emerged for social network privacy protection. The simplest to implement hides the user's identity information and leaves all other information unprocessed. Although this protects the user's personal privacy to some extent, a malicious party who has background knowledge of the target person's social relationships can still recognize the individual, so the user's privacy is leaked.

An effective solution to the problems in the related art has not been proposed yet.

Disclosure of Invention

Aiming at the problems in the related art, the invention provides a social information security management method to overcome the technical problems in the prior related art.

The technical scheme of the invention is realized as follows:

The social information security management method comprises the following steps:

acquiring a seed node set: determining the aggregation density of all nodes and generating the seed node set from the set of nodes with high aggregation density, wherein the outer loop runs over the number of clusters and the inner loop runs over the high-aggregation-density set;

computing a distance matrix;

and allocating nodes, wherein the outer loop runs over the unallocated nodes and the inner loop runs over the clusters, and the nodes in each cluster are then anonymized.
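The allocation step can be sketched as the nested loop described above. This is only an illustrative sketch: the text specifies the loop structure (unallocated nodes outside, clusters inside) but not the assignment rule, so assigning each node to the cluster whose seed is nearest is an assumption, as are the names `allocate_nodes`, `dist` and `seeds`.

```python
def allocate_nodes(dist, seeds):
    """Assign every non-seed node to a cluster.

    dist  -- square matrix (list of lists) of pairwise node distances
    seeds -- list of node indices, one seed per cluster
    ASSUMPTION: a node joins the cluster whose seed is nearest; the
    source text only fixes the outer/inner loop structure.
    """
    clusters = {s: [s] for s in seeds}
    unallocated = [v for v in range(len(dist)) if v not in clusters]
    for v in unallocated:                  # outer loop: unallocated nodes
        best_seed, best_d = None, float("inf")
        for s in seeds:                    # inner loop: clusters
            if dist[v][s] < best_d:
                best_seed, best_d = s, dist[v][s]
        clusters[best_seed].append(v)
    return clusters
```

A complete implementation would additionally have to guarantee that every cluster ends up with at least k nodes, which this sketch does not enforce.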

Further, a high-aggregation-density region is obtained, giving a set High;

the node with the maximum aggregation density is selected as the first initial central node, seed1;

the points in the set High that are farthest from seed1 are selected to form a set, and the point with the highest density in that set is selected as seed2;
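The selection of seed1 and seed2 can be sketched as follows. The text does not say how the aggregation density is computed, how "high" is decided, or how many farthest points form the candidate set, so the precomputed `density` list and the parameters `high_threshold` and `far_count` are hypothetical inputs introduced only for illustration.

```python
def select_seeds(density, dist, high_threshold, far_count):
    """Pick the first two seed nodes.

    density        -- aggregation density per node (assumed precomputed)
    dist           -- square matrix of pairwise node distances
    high_threshold -- ASSUMPTION: cutoff defining the set High
    far_count      -- ASSUMPTION: size of the "farthest from seed1" set
    """
    # Set High: nodes whose aggregation density is at least the cutoff.
    high = [v for v, d in enumerate(density) if d >= high_threshold]
    # seed1: the node with the maximum aggregation density.
    seed1 = max(high, key=lambda v: density[v])
    # Candidates: the points of High that are farthest from seed1.
    others = [v for v in high if v != seed1]
    farthest = sorted(others, key=lambda v: dist[seed1][v], reverse=True)[:far_count]
    # seed2: the densest point among those candidates.
    seed2 = max(farthest, key=lambda v: density[v])
    return seed1, seed2
```

Choosing seed2 among distant but dense points keeps the initial centers well separated while avoiding sparse outliers as cluster centers.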

the initial seed node is represented as:

Further, the method comprises: given a social network $G = (V, E, A)$, clustering is performed according to the similarity computed between nodes so that the number of nodes in each cluster is greater than or equal to $k$. Clustering divides all nodes of the social network, that is, the point set $V$, into a cluster set $S_{clt} = \{clt_1, clt_2, \ldots, clt_s\}$ with $\bigcup clt_i = V$ and $clt_i \cap clt_j = \varnothing$ for $i, j \in \{1, 2, \ldots, n\}$, $i \neq j$. The anonymized social network is $G_{ano} = (V_{ano}, E_{ano}, A_{ano})$, where $V_{ano} = \{v_{clt_1}, v_{clt_2}, \ldots, v_{clt_s}\}$, each $v_{clt_i}$ is an anonymized node of the anonymized network, and $E_{ano} \subseteq V_{ano} \times V_{ano}$ with $(v_{clt_i}, v_{clt_j}) \in E_{ano}$ for $v_{clt_i}, v_{clt_j} \in V_{ano}$.
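A minimal sketch of turning a clustering into the anonymized network of super points is given below. The text does not state how the anonymized edge set is derived from the original edges, so connecting two super points whenever at least one original edge crosses between their clusters is an assumption; the function names are hypothetical.

```python
def satisfies_k(clusters, k):
    """True if every cluster contains at least k nodes."""
    return all(len(members) >= k for members in clusters)


def build_super_point_graph(edges, clusters):
    """Collapse each cluster into one super point.

    edges    -- iterable of (u, v) pairs from the original network
    clusters -- list of lists of node ids; cluster i becomes super point i
    ASSUMPTION: two super points are linked if any original edge
    joins a member of one cluster to a member of the other.
    """
    node_to_cluster = {v: c for c, members in enumerate(clusters) for v in members}
    super_edges = set()
    for u, v in edges:
        cu, cv = node_to_cluster[u], node_to_cluster[v]
        if cu != cv:                       # edges inside a cluster are hidden
            super_edges.add((min(cu, cv), max(cu, cv)))
    return list(range(len(clusters))), super_edges
```

Because each super point stands for at least k original nodes, an observer who only sees the collapsed graph cannot single out one member of a super point from its structure alone.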

Further, a set of social network information is predetermined, comprising users' personal information and social relationship information. It is described by a labeled, undirected, unweighted graph, denoted $G = (V, E, A)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of points in the social network and $v_i$ ($i = 1, 2, \ldots, n$) represents any user in the social network; $E = \{(v_i, v_j) \mid i \neq j,\ 1 \le i, j \le n\}$ is the edge set of the social network, where $(v_i, v_j)$ represents the social relationship between users $v_i$ and $v_j$; and $A = \{A_1, A_2, \ldots, A_n\}$ is the attribute set of the social network, that is, the set of users' personal information, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$).
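The graph model above maps naturally onto a small data structure. The sketch below is one possible representation, not something prescribed by the text; the class name `SocialNetwork` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SocialNetwork:
    """Labeled, undirected, unweighted graph G = (V, E, A)."""
    nodes: list        # V: user identifiers v_1 .. v_n
    edges: set         # E: frozenset({v_i, v_j}) per social relationship
    attributes: dict   # A: node -> tuple (a_i1, ..., a_im) of personal data

    def add_edge(self, u, v):
        if u == v:
            raise ValueError("self-relations are excluded (i != j)")
        # A frozenset makes (u, v) and (v, u) the same undirected edge.
        self.edges.add(frozenset((u, v)))

    def has_edge(self, u, v):
        return frozenset((u, v)) in self.edges
```

Storing each edge as a frozenset captures the undirected, unweighted nature of the graph: the pair has no orientation and carries no weight.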

Further, the structural information feature distance is expressed as:

in the social network $G = (V, E, A)$ with node set $V = \{v_1, v_2, \ldots, v_n\}$, the neighbor relation of any node $v_i$ may be represented as $Neibor = (nb_{i1}, nb_{i2}, \ldots, nb_{in})$; if a social relation exists between $v_i$ and $v_j$, that is, the edge $(v_i, v_j) \in E$ with $i \neq j$, then $nb_{ij} = 0$, otherwise $nb_{ij} = 1$; the structural feature distance between nodes is:
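The distance formula itself is not reproduced in this text. As a minimal sketch, the code below builds the neighbor vectors exactly as defined above (0 for an existing edge, 1 otherwise) and then, purely as an assumption standing in for the missing formula, compares two vectors by their normalized Hamming distance. The function names are hypothetical.

```python
def neighbor_vector(i, n, edges):
    """Neighbor relation of node i: nb_ij = 0 if (v_i, v_j) is an edge, else 1.

    edges -- set of frozenset({u, v}) pairs over node ids 0 .. n-1
    """
    return [0 if (i != j and frozenset((i, j)) in edges) else 1 for j in range(n)]


def structural_distance(i, j, n, edges):
    """ASSUMPTION: fraction of positions where the two neighbor vectors differ.

    This normalized Hamming distance is a placeholder, not the
    formula of the source document.
    """
    a = neighbor_vector(i, n, edges)
    b = neighbor_vector(j, n, edges)
    return sum(x != y for x, y in zip(a, b)) / n
```

Under this placeholder, two users with identical contact lists are at structural distance 0, which is the behavior a structure-based clustering would want.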

Further, the personal information feature distance is defined in the social network $G = (V, E, A)$, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$), and is expressed as:

for continuous numerical attributes, the information loss is calculated as the difference between the two values;

for discrete data, the distance is 0 when the two attribute values are equal and 1 when they are not equal.
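The two per-attribute rules can be sketched as below. The rules themselves (difference for continuous values, 0/1 for discrete values) come from the text; dividing a continuous difference by the attribute's value range and averaging over the m attributes are assumptions added so that the result stays between 0 and 1. All names are hypothetical.

```python
def attribute_distance(x, y, value_range=None):
    """Distance between two values of one attribute.

    Continuous (numeric) values: absolute difference, optionally divided
    by value_range (ASSUMPTION: normalization is not in the source).
    Discrete values: 0 if equal, 1 otherwise.
    """
    if isinstance(x, (int, float)) and isinstance(y, (int, float)):
        diff = abs(x - y)
        return diff / value_range if value_range else diff
    return 0.0 if x == y else 1.0


def personal_distance(a_i, a_j, ranges):
    """ASSUMPTION: mean of the per-attribute distances over m attributes.

    a_i, a_j -- attribute sequences of two nodes, same length m
    ranges   -- per-attribute value range, or None for discrete attributes
    """
    total = sum(attribute_distance(x, y, r) for x, y, r in zip(a_i, a_j, ranges))
    return total / len(a_i)
```

For example, two users aged 30 and 40 (assumed age range 50) living in different cities would be at distance (0.2 + 1) / 2 = 0.6 under these assumptions.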

Further, the distance between two nodes is measured by combining the nodes' structural information feature distance (SIFD) and personal information feature distance (PIFD) into a comprehensive distance (CD) through a parameter $\alpha$; in the clustering algorithm, nodes that are closer to one another are aggregated into the same super point. The comprehensive distance is expressed as:

$CD = \alpha \times SIFD + (1 - \alpha) \times PIFD$.
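The weighted combination and the resulting distance matrix can be sketched as follows. The formula is the one given above; restricting the parameter to the interval [0, 1] is an assumption (the text does not state its range), and the function names are hypothetical.

```python
def combined_distance(sifd, pifd, alpha):
    """CD = alpha * SIFD + (1 - alpha) * PIFD."""
    # ASSUMPTION: alpha is a weight in [0, 1]; the source gives no range.
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * sifd + (1.0 - alpha) * pifd


def distance_matrix(sifd, pifd, alpha):
    """Element-wise comprehensive distance for two n x n matrices.

    sifd, pifd -- square lists of lists holding the structural and
                  personal information feature distances per node pair
    """
    n = len(sifd)
    return [
        [combined_distance(sifd[i][j], pifd[i][j], alpha) for j in range(n)]
        for i in range(n)
    ]
```

With a weight of 1 only the social structure matters, with a weight of 0 only the personal attributes matter, and intermediate values trade the two off.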

The beneficial effects of the invention are as follows:

The invention clusters on the basis of user information and social relations: all nodes in the social network are clustered, according to the distances between them, into super points that each contain at least k nodes, and the super points are then anonymized. The anonymized super points can effectively resist privacy attacks that use node attribute privacy, subgraph structure and similar information as background knowledge, so that an attacker cannot identify a user with a probability greater than 1/k. In addition, the algorithm for selecting the initial nodes and the method for calculating the distance between nodes during clustering are optimized according to the characteristics of the clustering algorithm and of social networks, which effectively reduces information loss and improves data utility.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart illustrating a social information security management method according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.

According to an embodiment of the invention, a social information security management method is provided.

As shown in Fig. 1, the social information security management method according to the embodiment of the present invention includes the following steps:

computing a distance matrix;

With this scheme, clustering is based on user information and social relations: all nodes in the social network are clustered, according to the distances between them, into super points that each contain at least k nodes, and the super points are then anonymized. The anonymized super points can effectively resist privacy attacks that use node attribute privacy, subgraph structure and similar information as background knowledge, so that an attacker cannot identify a user with a probability greater than 1/k. The algorithm for selecting the initial nodes and the method for calculating the distance between nodes during clustering are optimized according to the characteristics of the clustering algorithm and of social networks, which also effectively reduces information loss and improves data utility.

A high-aggregation-density region is obtained, giving a set High;

the initial seed node is represented as:

Wherein the method further comprises anonymizing the social network: given a social network $G = (V, E, A)$, clustering is performed according to the similarity computed between nodes so that the number of nodes in each cluster is greater than or equal to $k$; clustering divides all nodes of the social network, that is, the point set $V$, into a cluster set $S_{clt} = \{clt_1, clt_2, \ldots, clt_s\}$, with $\bigcup clt_i = V$,

Wherein a set of social network information is predetermined, comprising users' personal information and social relationship information. It is described by a labeled, undirected, unweighted graph, denoted $G = (V, E, A)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of points in the social network and $v_i$ ($i = 1, 2, \ldots, n$) denotes any user in the social network; $E = \{(v_i, v_j) \mid i \neq j,\ 1 \le i, j \le n\}$ is the edge set of the social network, where $(v_i, v_j)$ represents the social relationship between users $v_i$ and $v_j$; and $A = \{A_1, A_2, \ldots, A_n\}$ is the attribute set of the social network, that is, the set of users' personal information, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$).

Wherein, the structural information characteristic distance is expressed as:

the personal information feature distance is defined in the social network $G = (V, E, A)$, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$), and is expressed as:

The distance between two nodes is measured by combining the nodes' structural information feature distance (SIFD) and personal information feature distance (PIFD) into a comprehensive distance (CD) through a parameter $\alpha$; in the clustering algorithm, nodes that are closer to one another are aggregated into the same super point. The comprehensive distance is expressed as:

$CD = \alpha \times SIFD + (1 - \alpha) \times PIFD$.

In summary, the above technical solution of the present invention achieves the following effects:

1) establishing a social network model of an actual problem according to the actual situation of the social network, clustering on a social network graph, and completing data anonymization processing to form an anonymous social network;

2) according to the characteristics of the social network, the distance in social relationships and the distance in user information between network nodes are quantified separately, and the distances between network nodes and the super points formed by clustering are calculated in order to perform the clustering;

3) in view of the characteristics of social networks and clustering algorithms, and of the fact that different users have different privacy-protection requirements, the clustering algorithm is optimized by combining the clustering coefficient and the node density, so that data privacy is protected to different degrees.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. The social information security management method is characterized by comprising the following steps:

computing a distance matrix;

2. The social information security management method of claim 1,

a high-aggregation-density region is obtained, giving a set High;

the initial seed node is represented as:

3. The method of claim 2, further comprising anonymizing the social network: given a social network $G = (V, E, A)$, clustering is performed according to the similarity computed between nodes so that the number of nodes in each cluster is greater than or equal to $k$; clustering divides all nodes of the social network, that is, the point set $V$, into a cluster set $S_{clt} = \{clt_1, clt_2, \ldots, clt_s\}$, with $\bigcup clt_i = V$,

4. The social information security management method according to claim 3, wherein a set of social network information is predetermined, comprising users' personal information and social relationship information, and is described by a labeled undirected graph, denoted $G = (V, E, A)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of points in the social network and $v_i$ ($i = 1, 2, \ldots, n$) denotes any user in the social network; $E = \{(v_i, v_j) \mid i \neq j,\ 1 \le i, j \le n\}$ is the edge set of the social network, where $(v_i, v_j)$ represents the social relationship between users $v_i$ and $v_j$; and $A = \{A_1, A_2, \ldots, A_n\}$ is the attribute set of the social network, that is, the set of users' personal information, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$).

5. The method for managing social information security as claimed in claim 4, wherein the distance of the structural information characteristic is expressed as:

6. The social information security management method of claim 5, wherein the personal information feature distance is defined in the social network $G = (V, E, A)$, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$), and is expressed as:

7. The social information security management method of claim 1, wherein the distance between two nodes is measured by combining the nodes' structural information feature distance (SIFD) and personal information feature distance (PIFD) into a comprehensive distance (CD) through a parameter $\alpha$, and in the clustering algorithm nodes that are closer to one another are aggregated into the same super point, the comprehensive distance being expressed as:

$CD = \alpha \times SIFD + (1 - \alpha) \times PIFD$.