CN111767567A - Social information security management method - Google Patents
- Publication number
- CN111767567A (application number CN202010577101.0A)
- Authority
- CN
- China
- Prior art keywords
- nodes
- social
- node
- distance
- social network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
- G06F21/6254—Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Business, Economics & Management (AREA)
- General Engineering & Computer Science (AREA)
- Bioethics (AREA)
- Computer Security & Cryptography (AREA)
- Human Resources & Organizations (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Economics (AREA)
- Computer Hardware Design (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a social information security management method in the technical field of information security. The method comprises: obtaining a seed node set by determining the aggregation density of all nodes and generating the seed node set from the set of nodes with high aggregation density, wherein the outer loop runs over the number of clusters and the inner loop runs over the set with high aggregation density; computing a distance matrix; and allocating nodes, wherein the outer loop runs over the unallocated nodes and the inner loop runs over the clusters, after which the nodes in each cluster are anonymized. Clustering is based on user information and social relations: according to the distances between nodes, all nodes in the social network are clustered into super points that each contain at least k nodes, and the super points are anonymized. The anonymized super points can effectively resist privacy attacks that use node attribute privacy, sub-graph structures and the like as background knowledge, which effectively reduces information loss and improves data utility.
Description
Technical Field
The invention relates to the technical field of information security, in particular to a social information security management method.
Background
With the rapid development of multimedia and mobile social networking technologies, social multimedia data is distributed more and more efficiently, digitized information can be conveniently and quickly transmitted on the network in different forms, and multimedia communication gradually becomes an important means for information exchange in daily life of people. But at the same time, a series of problems of abuse, illegal copying, piracy, plagiarism and appropriation of multimedia occur.
The rapid development of internet technology promotes the rise of various social network platforms. The social network moves the life of people and the connection between people to the internet, so that a large amount of information is accumulated, the information reflects social laws to a certain extent, and a class of data with important research significance and application value is formed.
At present, a plurality of protection technologies have emerged for the problem of social network privacy protection, and the simplest method in technical implementation is to hide user identity information and not process other information. Although the technology protects the personal privacy of the user within a certain range, the identity of the individual can be still recognized by a malicious person through the background knowledge of the social network relationship of the target person, and the privacy of the user is leaked.
An effective solution to the problems in the related art has not been proposed yet.
Disclosure of Invention
Aiming at the problems in the related art, the invention provides a social information security management method to overcome the technical problems in the prior related art.
The technical scheme of the invention is realized as follows:
The social information security management method comprises the following steps:
acquiring a seed node set: determining the aggregation density of all nodes and generating the seed node set from the set of nodes with high aggregation density, wherein the outer loop runs over the number of clusters and the inner loop runs over the set with high aggregation density;
computing a distance matrix;
and allocating nodes, wherein the outer loop runs over the unallocated nodes and the inner loop runs over the clusters, and anonymizing the nodes in each cluster.
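The allocation step can be sketched as follows. This is only an illustration under stated assumptions: the distance matrix `dist` and the seed list are taken as given, each unallocated node joins the cluster whose seed is nearest, and the function name `allocate_nodes` is hypothetical. The text does not say how the minimum cluster size k is enforced during allocation, so the sketch does not enforce it.

```python
from typing import Dict, List, Sequence


def allocate_nodes(dist: Sequence[Sequence[float]], seeds: List[int]) -> Dict[int, List[int]]:
    """Assign every non-seed node to the cluster whose seed is nearest (assumed rule)."""
    clusters: Dict[int, List[int]] = {seed: [seed] for seed in seeds}
    unallocated = [v for v in range(len(dist)) if v not in clusters]
    for v in unallocated:                  # outer loop: unallocated nodes
        best_seed, best_d = seeds[0], float("inf")
        for seed in seeds:                 # inner loop: clusters
            if dist[v][seed] < best_d:
                best_seed, best_d = seed, dist[v][seed]
        clusters[best_seed].append(v)
    return clusters


# Toy symmetric distance matrix for four nodes (illustrative values only).
dist = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
]
clusters = allocate_nodes(dist, seeds=[0, 2])
```

With these values, node 1 joins the cluster seeded by node 0 and node 3 joins the cluster seeded by node 2.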
Further, the high-aggregation-density region is obtained to give a set High;
the node with the maximum aggregation density is selected as the first initial central node seed1;
the points in the set High that are farthest from seed1 are selected to form a set, and the point with the highest density in that set is selected as seed2;
the initial seed node is represented as:
Further, given a social network $G = (V, E, A)$, clustering is performed according to the computed similarity between nodes so that each cluster contains at least $k$ nodes. Clustering partitions the point set $V$ of the social network into a cluster set $S_{clt} = \{clt_1, clt_2, \ldots, clt_s\}$ with $\bigcup_i clt_i = V$ and $clt_i \cap clt_j = \emptyset$ for $i, j \in \{1, 2, \ldots, s\}$, $i \neq j$. The anonymized social network is $G_{ano} = (V_{ano}, E_{ano}, A_{ano})$, where $V_{ano} = \{v_{clt_1}, v_{clt_2}, \ldots, v_{clt_s}\}$, each $v_{clt_i}$ is an anonymized node of the anonymized network, and $E_{ano} \subseteq V_{ano} \times V_{ano}$, with $(v_{clt_i}, v_{clt_j}) \in E_{ano}$ for $v_{clt_i}, v_{clt_j} \in V_{ano}$.
Further, a social network information set is predetermined, comprising the users' personal information and social relationship information. It is described by a labeled, undirected, unweighted graph $G = (V, E, A)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of points in the social network and $v_i$ ($i = 1, 2, \ldots, n$) represents any user in the social network; $E = \{(v_i, v_j) \mid i \neq j,\ 1 \le i, j \le n\}$ is the edge set of the social network, where $(v_i, v_j)$ represents the social relationship between users $v_i$ and $v_j$; and $A = \{A_1, A_2, \ldots, A_n\}$ is the attribute set of the social network, that is, the set of user personal information, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$).
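One possible in-memory representation of the graph $G = (V, E, A)$ is sketched below. The class name `SocialNetwork` and its method names are hypothetical and chosen only for illustration; edges are stored as unordered pairs because the graph is undirected and unweighted.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set, Tuple


@dataclass
class SocialNetwork:
    """Labeled, undirected, unweighted graph G = (V, E, A)."""
    nodes: Set[str] = field(default_factory=set)               # V: users
    edges: Set[FrozenSet[str]] = field(default_factory=set)    # E: unordered pairs
    attrs: Dict[str, Tuple] = field(default_factory=dict)      # A: attribute sequence per user

    def add_user(self, user: str, attributes: Tuple) -> None:
        self.nodes.add(user)
        self.attrs[user] = tuple(attributes)

    def add_relation(self, u: str, v: str) -> None:
        if u == v:
            raise ValueError("an edge needs two distinct users (i != j)")
        if u not in self.nodes or v not in self.nodes:
            raise KeyError("both users must be added before relating them")
        self.edges.add(frozenset((u, v)))

    def related(self, u: str, v: str) -> bool:
        return frozenset((u, v)) in self.edges


g = SocialNetwork()
g.add_user("v1", (30, "teacher"))
g.add_user("v2", (40, "teacher"))
g.add_user("v3", (25, "engineer"))
g.add_relation("v1", "v2")
```

Because each edge is a `frozenset`, `g.related("v2", "v1")` and `g.related("v1", "v2")` give the same answer, which matches the undirected model.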
Further, the structural information feature distance is expressed as follows:
in the social network $G = (V, E, A)$ with node set $V = \{v_1, v_2, \ldots, v_n\}$, the neighbor relation of any node $v_i$ may be represented as $Neibor = (nb_{i1}, nb_{i2}, \ldots, nb_{in})$; if a social relation exists between $v_i$ and $v_j$, that is, the edge $(v_i, v_j) \in E$ with $i \neq j$, then $nb_{ij} = 0$, otherwise $nb_{ij} = 1$; the structural feature distance between nodes is:
Further, the personal information feature distance is expressed as follows: in the social network $G = (V, E, A)$, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$), it is expressed as:
for continuous numerical attributes, the information loss is calculated as the difference between the two values;
for discrete data, the information loss is calculated as a distance of 0 when the two attribute values are equal and 1 when they are not.
Further, the distance between two nodes is measured by combining the structural information feature distance (SIFD) and the personal information feature distance (PIFD) of the nodes into a comprehensive distance (CD) through a parameter $\alpha$; in the clustering algorithm, nodes that are closer together are aggregated into the same super point. The comprehensive distance is expressed as:
$CD = \alpha \times SIFD + (1 - \alpha) \times PIFD$.
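As a worked illustration of the weighted combination, the sketch below applies the formula entry by entry to two small matrices to obtain a distance matrix. The function names are hypothetical, and restricting the weight to the interval [0, 1] is an assumption; the text does not state its range.

```python
from typing import List, Sequence


def composite_distance(sifd: float, pifd: float, alpha: float) -> float:
    """CD = alpha * SIFD + (1 - alpha) * PIFD."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return alpha * sifd + (1.0 - alpha) * pifd


def composite_matrix(
    sifd_m: Sequence[Sequence[float]],
    pifd_m: Sequence[Sequence[float]],
    alpha: float,
) -> List[List[float]]:
    """Combine two distance matrices of the same shape entry by entry."""
    return [
        [composite_distance(s, p, alpha) for s, p in zip(s_row, p_row)]
        for s_row, p_row in zip(sifd_m, pifd_m)
    ]


# Illustrative values only: two nodes with SIFD 0.5 and PIFD 0.4.
sifd_m = [[0.0, 0.5], [0.5, 0.0]]
pifd_m = [[0.0, 0.4], [0.4, 0.0]]
cd = composite_matrix(sifd_m, pifd_m, alpha=0.6)
```

Here the off-diagonal entry is 0.6 × 0.5 + 0.4 × 0.4 = 0.46. A larger weight makes the clustering follow the social structure more closely, while a smaller one favors similarity of personal attributes.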
The beneficial effects of the invention are as follows:
The invention clusters on the basis of user information and social relations: according to the distances between nodes, all nodes in the social network are clustered into super points that each contain at least k nodes, and the super points are anonymized. The anonymized super points can effectively resist privacy attacks that use node attribute privacy, sub-graph structures and the like as background knowledge, so that an attacker cannot identify a user with probability greater than 1/k. In addition, the initial-node selection algorithm and the node-distance calculation used in clustering are optimized according to the characteristics of the clustering algorithm and of social networks, which effectively reduces information loss and improves data utility.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart illustrating a social information security management method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
According to an embodiment of the invention, a social information security management method is provided.
As shown in fig. 1, the social information security management method according to the embodiment of the present invention includes the following steps:
acquiring a seed node set: determining the aggregation density of all nodes and generating the seed node set from the set of nodes with high aggregation density, wherein the outer loop runs over the number of clusters and the inner loop runs over the set with high aggregation density;
computing a distance matrix;
and allocating nodes, wherein the outer loop runs over the unallocated nodes and the inner loop runs over the clusters, and anonymizing the nodes in each cluster.
By means of the above scheme, clustering is performed on the basis of user information and social relations: according to the distances between nodes, all nodes in the social network are clustered into super points that each contain at least k nodes, and the super points are anonymized. The anonymized super points can effectively resist privacy attacks that use node attribute privacy, sub-graph structures and the like as background knowledge, so that an attacker cannot identify a user with probability greater than 1/k. The initial-node selection algorithm and the node-distance calculation used in clustering are optimized according to the characteristics of the clustering algorithm and of social networks, which at the same time effectively reduces information loss and improves data utility.
The high-aggregation-density region is obtained to give a set High;
the node with the maximum aggregation density is selected as the first initial central node seed1;
the points in the set High that are farthest from seed1 are selected to form a set, and the point with the highest density in that set is selected as seed2;
the initial seed node is represented as:
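The expression for the initial seed node does not survive in this text, so the following is only a rough sketch of the selection steps described above, under several assumptions: the aggregation density of each node is supplied as input (its definition is not given here), the set High is formed with a hypothetical threshold `high_threshold`, and for seeds after the second the distance to the nearest already-chosen seed is used, which generalizes the seed1/seed2 rule.

```python
import math
from typing import List, Sequence


def select_seeds(
    density: Sequence[float],
    dist: Sequence[Sequence[float]],
    num_clusters: int,
    high_threshold: float,
) -> List[int]:
    """Pick seed nodes from the high-density set High (assumed thresholding)."""
    high = [v for v in range(len(density)) if density[v] >= high_threshold]
    if len(high) < num_clusters:
        raise ValueError("not enough high-density nodes for the requested clusters")
    # seed1: the node with the maximum aggregation density.
    seeds = [max(high, key=lambda v: density[v])]
    while len(seeds) < num_clusters:               # outer loop: clusters
        remaining = [v for v in high if v not in seeds]
        # inner loop: nodes of High, measured against the chosen seeds
        gap = {v: min(dist[v][s] for s in seeds) for v in remaining}
        farthest = max(gap.values())
        far_set = [v for v in remaining if math.isclose(gap[v], farthest)]
        # Among the farthest points, take the one with the highest density.
        seeds.append(max(far_set, key=lambda v: density[v]))
    return seeds


# Illustrative values: nodes 2 and 3 are equally far from node 0.
density = [0.9, 0.8, 0.6, 0.7]
dist = [
    [0.0, 0.2, 0.9, 0.9],
    [0.2, 0.0, 0.7, 0.8],
    [0.9, 0.7, 0.0, 0.1],
    [0.9, 0.8, 0.1, 0.0],
]
seeds = select_seeds(density, dist, num_clusters=2, high_threshold=0.5)
```

In this toy input node 0 becomes seed1; nodes 2 and 3 tie as the farthest points, and node 3 is chosen as seed2 because its density is higher.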
The method further comprises anonymizing the social network: given a social network $G = (V, E, A)$, clustering is performed according to the computed similarity between nodes so that each cluster contains at least $k$ nodes. Clustering partitions the point set $V$ of the social network into a cluster set $S_{clt} = \{clt_1, clt_2, \ldots, clt_s\}$ with $\bigcup_i clt_i = V$ and $clt_i \cap clt_j = \emptyset$ for $i, j \in \{1, 2, \ldots, s\}$, $i \neq j$. The anonymized social network is $G_{ano} = (V_{ano}, E_{ano}, A_{ano})$, where $V_{ano} = \{v_{clt_1}, v_{clt_2}, \ldots, v_{clt_s}\}$, each $v_{clt_i}$ is an anonymized node of the anonymized network, and $E_{ano} \subseteq V_{ano} \times V_{ano}$, with $(v_{clt_i}, v_{clt_j}) \in E_{ano}$ for $v_{clt_i}, v_{clt_j} \in V_{ano}$.
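Collapsing clusters into super points can be sketched as below. This is an illustration under assumptions, not the patented procedure: each super point is summarized only by its member count, a super edge is added whenever any original edge crosses two clusters, and the generalization of attributes is left out because the text does not describe it. The function name `anonymize` is hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def anonymize(
    clusters: Dict[str, List[str]],
    edges: Iterable[Tuple[str, str]],
    k: int,
) -> Tuple[Dict[str, int], Set[FrozenSet[str]]]:
    """Replace every cluster by one super point and lift edges between clusters."""
    for cid, members in clusters.items():
        if len(members) < k:
            raise ValueError(f"cluster {cid} has fewer than k={k} nodes")
    owner = {v: cid for cid, members in clusters.items() for v in members}
    # Each super point only exposes how many users it hides (assumed summary).
    super_nodes = {cid: len(members) for cid, members in clusters.items()}
    super_edges: Set[FrozenSet[str]] = set()
    for u, v in edges:
        cu, cv = owner[u], owner[v]
        if cu != cv:  # edges inside one cluster disappear into the super point
            super_edges.add(frozenset((cu, cv)))
    return super_nodes, super_edges


clusters = {"clt1": ["v1", "v2"], "clt2": ["v3", "v4"]}
edges = [("v1", "v2"), ("v2", "v3"), ("v3", "v4")]
super_nodes, super_edges = anonymize(clusters, edges, k=2)
```

Since every super point hides at least k users, an observer who locates a target's super point still cannot single out the target with probability above 1/k, which is the guarantee the description relies on.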
A social network information set is predetermined, comprising the users' personal information and social relationship information. It is described by a labeled, undirected, unweighted graph $G = (V, E, A)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of points in the social network and $v_i$ ($i = 1, 2, \ldots, n$) represents any user in the social network; $E = \{(v_i, v_j) \mid i \neq j,\ 1 \le i, j \le n\}$ is the edge set of the social network, where $(v_i, v_j)$ represents the social relationship between users $v_i$ and $v_j$; and $A = \{A_1, A_2, \ldots, A_n\}$ is the attribute set of the social network, that is, the set of user personal information, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$).
The structural information feature distance is expressed as follows:
in the social network $G = (V, E, A)$ with node set $V = \{v_1, v_2, \ldots, v_n\}$, the neighbor relation of any node $v_i$ may be represented as $Neibor = (nb_{i1}, nb_{i2}, \ldots, nb_{in})$; if a social relation exists between $v_i$ and $v_j$, that is, the edge $(v_i, v_j) \in E$ with $i \neq j$, then $nb_{ij} = 0$, otherwise $nb_{ij} = 1$; the structural feature distance between nodes is:
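The distance formula itself does not survive in this text. As an illustration only, the sketch below builds the neighbor vectors with the encoding stated above (0 where an edge exists, 1 otherwise) and compares them with a normalized Hamming distance. The Hamming choice is an assumption, not the patent's formula; the function names are hypothetical.

```python
from typing import FrozenSet, List, Set


def neighbor_vector(node: int, n: int, edges: Set[FrozenSet[int]]) -> List[int]:
    """nb_ij = 0 if an edge (v_i, v_j) exists, otherwise 1 (encoding from the text)."""
    return [0 if frozenset((node, j)) in edges else 1 for j in range(n)]


def structural_distance(nb_i: List[int], nb_j: List[int]) -> float:
    """Assumed measure: fraction of positions where the two neighbor vectors differ."""
    if len(nb_i) != len(nb_j) or not nb_i:
        raise ValueError("neighbor vectors must be non-empty and of equal length")
    return sum(a != b for a, b in zip(nb_i, nb_j)) / len(nb_i)


# Path graph 0 - 1 - 2 - 3 (illustrative).
edges = {frozenset((0, 1)), frozenset((1, 2)), frozenset((2, 3))}
nb0 = neighbor_vector(0, 4, edges)
nb1 = neighbor_vector(1, 4, edges)
nb2 = neighbor_vector(2, 4, edges)
d02 = structural_distance(nb0, nb2)
d01 = structural_distance(nb0, nb1)
```

Nodes 0 and 2 share neighbor 1 and differ only in their relation to node 3, so their distance is 0.25, while nodes 0 and 1 differ in three of four positions and get 0.75. A Hamming-style measure is unaffected by whether an edge is coded as 0 or 1.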
The personal information feature distance is expressed as follows: in the social network $G = (V, E, A)$, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$), it is expressed as:
for continuous numerical attributes, the information loss is calculated as the difference between the two values;
for discrete data, the information loss is calculated as a distance of 0 when the two attribute values are equal and 1 when they are not.
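The two per-attribute rules can be sketched as follows. The aggregate formula is not reproduced in this text, so averaging over the m attributes is an assumption, as is dividing each numerical difference by a per-attribute range so that it is comparable with the 0/1 discrete distance. The function name and the `ranges` parameter are hypothetical.

```python
from typing import Optional, Sequence


def _is_numeric(x: object) -> bool:
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def personal_distance(
    a_i: Sequence[object],
    a_j: Sequence[object],
    ranges: Sequence[Optional[float]],
) -> float:
    """Average per-attribute distance between two attribute sequences (assumed aggregate)."""
    if not a_i or len(a_i) != len(a_j) or len(a_i) != len(ranges):
        raise ValueError("attribute sequences and ranges must share a non-zero length")
    total = 0.0
    for x, y, r in zip(a_i, a_j, ranges):
        if _is_numeric(x) and _is_numeric(y):
            # Continuous attribute: difference of the two values,
            # scaled by the attribute's range (scaling is an assumption).
            total += abs(x - y) / r if r else 0.0
        else:
            # Discrete attribute: 0 when equal, 1 when different.
            total += 0.0 if x == y else 1.0
    return total / len(a_i)


# Illustrative users: (age, occupation, city); age range assumed to be 50.
a1 = (30, "teacher", "Beijing")
a2 = (40, "teacher", "Shanghai")
d = personal_distance(a1, a2, ranges=(50, None, None))
```

Here the age contributes 10/50 = 0.2, the occupation 0 and the city 1, giving an average of 0.4.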
The distance between two nodes is measured by combining the structural information feature distance (SIFD) and the personal information feature distance (PIFD) of the nodes into a comprehensive distance (CD) through a parameter $\alpha$; in the clustering algorithm, nodes that are closer together are aggregated into the same super point. The comprehensive distance is expressed as:
$CD = \alpha \times SIFD + (1 - \alpha) \times PIFD$.
In summary, the above technical solution of the present invention achieves the following effects:
1) a social network model of the actual problem is established according to the actual situation of the social network, clustering is performed on the social network graph, and data anonymization is completed to form an anonymous social network;
2) according to the characteristics of the social network, the social relationships between network nodes and the distances between user information are each quantified, and the distance between a network node and a super point formed by clustering is calculated in order to perform clustering;
3) in view of the characteristics of social networks and clustering algorithms, and of the fact that different users have different privacy protection requirements, the clustering algorithm is optimized by combining the clustering coefficient and the node density, so that data privacy is protected to different degrees.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (7)
1. A social information security management method, characterized by comprising the following steps:
acquiring a seed node set: determining the aggregation density of all nodes and generating the seed node set from the set of nodes with high aggregation density, wherein the outer loop runs over the number of clusters and the inner loop runs over the set with high aggregation density;
computing a distance matrix;
and allocating nodes, wherein the outer loop runs over the unallocated nodes and the inner loop runs over the clusters, and anonymizing the nodes in each cluster.
2. The social information security management method of claim 1,
the high-aggregation-density region is obtained to give a set High;
the node with the maximum aggregation density is selected as the first initial central node seed1;
the points in the set High that are farthest from seed1 are selected to form a set, and the point with the highest density in that set is selected as seed2;
the initial seed node is represented as:
3. The method of claim 2, further comprising anonymizing the social network: given a social network $G = (V, E, A)$, clustering is performed according to the computed similarity between nodes so that each cluster contains at least $k$ nodes; clustering partitions the point set $V$ of the social network into a cluster set $S_{clt} = \{clt_1, clt_2, \ldots, clt_s\}$ with $\bigcup_i clt_i = V$ and $clt_i \cap clt_j = \emptyset$ for $i, j \in \{1, 2, \ldots, s\}$, $i \neq j$; the anonymized social network is $G_{ano} = (V_{ano}, E_{ano}, A_{ano})$, where $V_{ano} = \{v_{clt_1}, v_{clt_2}, \ldots, v_{clt_s}\}$, each $v_{clt_i}$ is an anonymized node of the anonymized network, and $E_{ano} \subseteq V_{ano} \times V_{ano}$, with $(v_{clt_i}, v_{clt_j}) \in E_{ano}$ for $v_{clt_i}, v_{clt_j} \in V_{ano}$.
4. The social information security management method of claim 3, wherein a social network information set is predetermined, comprising the users' personal information and social relationship information, and is described by a labeled undirected graph $G = (V, E, A)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of points in the social network and $v_i$ ($i = 1, 2, \ldots, n$) represents any user in the social network; $E = \{(v_i, v_j) \mid i \neq j,\ 1 \le i, j \le n\}$ is the edge set of the social network, where $(v_i, v_j)$ represents the social relationship between users $v_i$ and $v_j$; and $A = \{A_1, A_2, \ldots, A_n\}$ is the attribute set of the social network, that is, the set of user personal information, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$).
5. The social information security management method of claim 4, wherein the structural information feature distance is expressed as follows:
in the social network $G = (V, E, A)$ with node set $V = \{v_1, v_2, \ldots, v_n\}$, the neighbor relation of any node $v_i$ may be represented as $Neibor = (nb_{i1}, nb_{i2}, \ldots, nb_{in})$; if a social relation exists between $v_i$ and $v_j$, that is, the edge $(v_i, v_j) \in E$ with $i \neq j$, then $nb_{ij} = 0$, otherwise $nb_{ij} = 1$; the structural feature distance between nodes is:
6. The social information security management method of claim 5, wherein the personal information feature distance is expressed as follows: in the social network $G = (V, E, A)$, where $A_i = (a_{i1}, a_{i2}, \ldots, a_{im})$ is the $m$-dimensional attribute sequence of node $v_i$ ($i = 1, 2, \ldots, n$), it is expressed as:
for continuous numerical attributes, the information loss is calculated as the difference between the two values;
for discrete data, the information loss is calculated as a distance of 0 when the two attribute values are equal and 1 when they are not.
7. The social information security management method of claim 1, wherein the distance between two nodes is measured by combining the structural information feature distance (SIFD) and the personal information feature distance (PIFD) of the nodes into a composite distance (CD) through a parameter $\alpha$, and in the clustering algorithm nodes that are closer together are aggregated into the same super point, the composite distance being expressed as:
$CD = \alpha \times SIFD + (1 - \alpha) \times PIFD$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010577101.0A CN111767567A (en) | 2020-06-22 | 2020-06-22 | Social information security management method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010577101.0A CN111767567A (en) | 2020-06-22 | 2020-06-22 | Social information security management method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111767567A (en) | 2020-10-13 |
Family
ID=72721725
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010577101.0A Withdrawn CN111767567A (en) | 2020-06-22 | 2020-06-22 | Social information security management method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111767567A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113486396A (en) * | 2021-07-02 | 2021-10-08 | 北京工业大学 | Social network-oriented high-availability K-anonymous data processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WW01 | Invention patent application withdrawn after publication | Application publication date: 20201013 |