CN111767567A - Social information security management method - Google Patents

Publication number
CN111767567A
Authority
CN
China
Prior art keywords
nodes
social
node
distance
social network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010577101.0A
Other languages
Chinese (zh)
Inventor
杨良斌
于腊梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Relations, University of
Original Assignee
International Relations, University of
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-06-22
Filing date
2020-06-22
Publication date
2020-10-13
Application filed by International Relations, University of
Priority to CN202010577101.0A
Publication of CN111767567A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioethics (AREA)
  • Computer Security & Cryptography (AREA)
  • Human Resources & Organizations (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Computer Hardware Design (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a social information security management method in the technical field of information security. The method comprises: obtaining a seed node set by determining the aggregation density of all nodes and generating the seed nodes from the set of nodes with high aggregation density, where the outer loop runs over the number of clusters and the inner loop runs over the high-aggregation-density set; computing a distance matrix; and allocating nodes, where the outer loop runs over the unallocated nodes and the inner loop runs over the clusters, after which the nodes in each cluster are anonymized. Clustering is based on user information and social relations: according to the distance between nodes, all nodes in the social network are clustered into super points that each contain at least k nodes, and the super points are then anonymized. The anonymized super points can prevent privacy attacks that use node attribute privacy, sub-graph structures and the like as background knowledge, while reducing information loss and improving data utility.

Description

Social information security management method
Technical Field
The invention relates to the technical field of information security, in particular to a social information security management method.
Background
With the rapid development of multimedia and mobile social networking technologies, social multimedia data is distributed more and more efficiently, digitized information can be conveniently and quickly transmitted on the network in different forms, and multimedia communication gradually becomes an important means for information exchange in daily life of people. But at the same time, a series of problems of abuse, illegal copying, piracy, plagiarism and appropriation of multimedia occur.
The rapid development of internet technology promotes the rise of various social network platforms. The social network moves the life of people and the connection between people to the internet, so that a large amount of information is accumulated, the information reflects social laws to a certain extent, and a class of data with important research significance and application value is formed.
At present, many protection technologies have emerged for the problem of social network privacy protection. The simplest to implement is to hide the user's identity information and leave all other information unprocessed. Although this protects the user's personal privacy to some extent, a malicious party who has background knowledge of the target person's social network relationships can still identify the individual, so the user's privacy is leaked.
An effective solution to the problems in the related art has not been proposed yet.
Disclosure of Invention
Aiming at the problems in the related art, the invention provides a social information security management method to overcome the technical problems in the prior related art.
The technical scheme of the invention is realized as follows:
the social information security management method comprises the following steps:
obtaining a seed node set: determining the aggregation density of all nodes and generating the seed nodes from the set of nodes with high aggregation density, wherein the outer loop runs over the number of clusters and the inner loop runs over the high-aggregation-density set;
computing a distance matrix;
and allocating nodes, wherein the outer loop runs over the unallocated nodes and the inner loop runs over the clusters, and the nodes in each cluster are then anonymized.
Further, a high-aggregation-density region is obtained to give a set High;
the node with the maximum aggregation density is selected as the first initial central node seed1;
the points in the set High that are farthest from seed1 are collected into a set, and the point with the highest density in that set is selected as seed2;
the initial seed node is represented as:
Figure BDA0002551386760000021
Further, the method comprises anonymizing the social network: given a social network G = (V, E, A), the nodes are clustered according to the computed similarity between nodes so that the number of nodes in each cluster is greater than or equal to k; clustering divides all nodes in the social network, that is, the point set V, into a cluster set Sclt = {clt1, clt2, …, clts} with ∪clti = V,
Figure BDA0002551386760000022
i, j ∈ {1, 2, …, n}, i ≠ j. The anonymized social network is Gano = (Vano, Eano, Aano), wherein Vano = {vclt1, vclt2, …, vclts}, vclti is an anonymized node of the anonymized network, and Eano ⊆ Vano × Vano, with vclti, vcltj ∈ Vano and (vclti, vcltj) ∈ Eano.
Further, a social network information set is predetermined, comprising users' personal information and social relationship information, and is described by a labeled, undirected, unweighted graph denoted G = (V, E, A), where V = {v1, v2, …, vn} is the set of nodes in the social network and vi (i = 1, 2, …, n) represents any user in the social network; E = {(vi, vj) | i ≠ j, 1 ≤ i, j ≤ n} is the edge set of the social network, where (vi, vj) represents the social relationship between users vi and vj; and A = {A1, A2, …, An} is the attribute set of the social network, that is, the set of users' personal information, where Ai = (ai1, ai2, …, aim) is the m-dimensional attribute sequence of node vi (i = 1, 2, …, n).
Further, the structural information feature distance is expressed as follows:
in the social network G = (V, E, A) with node set V = {v1, v2, …, vn}, the neighbor relation of any node vi may be represented as Neibor = (nbi1, nbi2, …, nbin); if a social relation exists between vi and vj, that is, the edge (vi, vj) ∈ E with i ≠ j, then nbij = 0, otherwise nbij = 1; the structural feature distance between nodes is:
Figure BDA0002551386760000031
Further, the personal information feature distance is defined in the social network G = (V, E, A), where Ai = (ai1, ai2, …, aim) is the m-dimensional attribute sequence of node vi (i = 1, 2, …, n), and is expressed as follows:
for continuous numerical attributes, the information loss is the difference between the two attribute values;
for discrete attributes, the distance is 0 when the two attribute values are equal and 1 when they are not.
Further, the distance between two nodes is measured by combining the structural information feature distance (SIFD) and the personal information feature distance (PIFD) of the nodes into a comprehensive distance (CD) through a parameter α; in the clustering algorithm, nodes that are closer together are aggregated into the same super point. The comprehensive distance is expressed as:
CD = α × SIFD + (1 - α) × PIFD.
The invention has the following beneficial effects:
The invention clusters based on user information and social relations: according to the distance between nodes, all nodes in the social network are clustered into super points that each contain at least k nodes, and the super points are anonymized. The anonymized super points can prevent privacy attacks that use node attribute privacy, sub-graph structures and the like as background knowledge, so that an attacker cannot identify a user with a probability greater than 1/k. The selection algorithm for the initial nodes and the calculation method for the node distance in the clustering process are optimized according to the characteristics of the clustering algorithm and of the social network, which reduces information loss and improves data utility.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart illustrating a social information security management method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
According to an embodiment of the invention, a social information security management method is provided.
As shown in fig. 1, the social information security management method according to the embodiment of the present invention includes the following steps:
obtaining a seed node set: determining the aggregation density of all nodes and generating the seed nodes from the set of nodes with high aggregation density, wherein the outer loop runs over the number of clusters and the inner loop runs over the high-aggregation-density set;
computing a distance matrix;
and allocating nodes, wherein the outer loop runs over the unallocated nodes and the inner loop runs over the clusters, and the nodes in each cluster are then anonymized.
With this scheme, clustering is based on user information and social relations: according to the distance between nodes, all nodes in the social network are clustered into super points that each contain at least k nodes, and the super points are anonymized. The anonymized super points can prevent privacy attacks that use node attribute privacy, sub-graph structures and the like as background knowledge, so that an attacker cannot identify a user with a probability greater than 1/k. The selection algorithm for the initial nodes and the calculation method for the node distance in the clustering process are optimized according to the characteristics of the clustering algorithm and of the social network, which reduces information loss and improves data utility.
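As an illustration only, the node allocation step listed above can be sketched in Python. The source fixes only the loop structure (outer loop over unallocated nodes, inner loop over clusters); the function name `allocate_nodes`, the use of the mean distance to current cluster members, and the rule of filling clusters that are still below k first are assumptions made for this sketch.

```python
def allocate_nodes(dist, seeds, k):
    """Assign every non-seed node to a cluster (hypothetical helper).

    dist  : square matrix of pairwise node distances
    seeds : list of seed node indices, one per cluster
    k     : minimum cluster size targeted by the anonymization
    """
    clusters = {s: [s] for s in seeds}
    unassigned = [v for v in range(len(dist)) if v not in clusters]
    for v in unassigned:  # outer loop: unallocated nodes
        # Assumption: clusters that have not yet reached k members are filled first.
        candidates = [s for s in seeds if len(clusters[s]) < k] or list(seeds)
        best, best_d = None, float("inf")
        for s in candidates:  # inner loop: clusters
            # Assumption: node-to-cluster distance is the mean distance to its members.
            d = sum(dist[v][u] for u in clusters[s]) / len(clusters[s])
            if d < best_d:
                best, best_d = s, d
        clusters[best].append(v)
    return list(clusters.values())
```

Under these assumptions every cluster reaches k members only if the network has at least k nodes per seed.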
A high-aggregation-density region is obtained to give a set High;
the node with the maximum aggregation density is selected as the first initial central node seed1;
the points in the set High that are farthest from seed1 are collected into a set, and the point with the highest density in that set is selected as seed2;
the initial seed node is represented as:
Figure BDA0002551386760000041
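The seed selection steps above can be sketched as follows. This is a minimal illustration, not the formula in the omitted figure: the source does not define how the set High is obtained or how many "farthest" points are kept, so the mean-density threshold, the `top_far` parameter, and the use of the minimum distance to already chosen seeds for seeds after seed2 are all assumptions.

```python
def select_seeds(density, dist, num_clusters, top_far=2):
    """Pick seed nodes from the high-density set (hypothetical helper).

    density      : aggregation density per node (computed elsewhere)
    dist         : square matrix of pairwise node distances
    num_clusters : number of seeds to produce
    top_far      : assumed size of the 'farthest points' candidate set
    """
    n = len(density)
    mean_density = sum(density) / n
    # Assumption: High = nodes whose density is at least the mean density.
    high = [v for v in range(n) if density[v] >= mean_density]
    # seed1: the node with the maximum aggregation density.
    seeds = [max(high, key=lambda v: density[v])]
    while len(seeds) < num_clusters:  # outer loop: number of clusters
        candidates = [v for v in high if v not in seeds]  # inner loop: set High
        if not candidates:
            break
        # Rank candidates by distance to the nearest existing seed, farthest first.
        candidates.sort(key=lambda v: min(dist[v][s] for s in seeds), reverse=True)
        farthest = candidates[:top_far]
        # Among the farthest points, take the one with the highest density.
        seeds.append(max(farthest, key=lambda v: density[v]))
    return seeds
```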
The method further comprises anonymizing the social network: given a social network G = (V, E, A), the nodes are clustered according to the computed similarity between nodes so that the number of nodes in each cluster is greater than or equal to k; clustering divides all nodes in the social network, that is, the point set V, into a cluster set Sclt = {clt1, clt2, …, clts} with ∪clti = V,
Figure BDA0002551386760000042
i, j ∈ {1, 2, …, n}, i ≠ j. The anonymized social network is Gano = (Vano, Eano, Aano), wherein Vano = {vclt1, vclt2, …, vclts}, vclti is an anonymized node of the anonymized network, and Eano ⊆ Vano × Vano, with vclti, vcltj ∈ Vano and (vclti, vcltj) ∈ Eano.
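To make the construction of Gano concrete, the following Python sketch collapses each cluster into one super point. The source does not state when two super points are connected or how Aano is generalized; here an edge between super points is assumed to exist when at least one original edge crosses the two clusters, and only the cluster size is recorded per super point. The function name `anonymize` is hypothetical.

```python
def anonymize(edges, clusters):
    """Collapse clusters into super points (hypothetical helper).

    edges    : iterable of (u, v) pairs from the original edge set E
    clusters : list of node lists, a partition of V
    Returns (sizes, super_edges): the member count of each super point and
    the set of edges between super point indices.
    """
    # Map every original node to the index of its cluster.
    owner = {v: i for i, members in enumerate(clusters) for v in members}
    super_edges = set()
    for u, v in edges:
        a, b = owner[u], owner[v]
        if a != b:
            # Assumption: one crossing edge is enough to connect two super points.
            super_edges.add((min(a, b), max(a, b)))
    sizes = [len(members) for members in clusters]
    return sizes, super_edges
```

Edges inside a cluster are dropped in this sketch, so an observer of the output sees only how many users a super point contains and which super points are linked.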
A social network information set is predetermined, comprising users' personal information and social relationship information, and is described by a labeled, undirected, unweighted graph denoted G = (V, E, A), where V = {v1, v2, …, vn} is the set of nodes in the social network and vi (i = 1, 2, …, n) represents any user in the social network; E = {(vi, vj) | i ≠ j, 1 ≤ i, j ≤ n} is the edge set of the social network, where (vi, vj) represents the social relationship between users vi and vj; and A = {A1, A2, …, An} is the attribute set of the social network, that is, the set of users' personal information, where Ai = (ai1, ai2, …, aim) is the m-dimensional attribute sequence of node vi (i = 1, 2, …, n).
The structural information feature distance is expressed as follows:
in the social network G = (V, E, A) with node set V = {v1, v2, …, vn}, the neighbor relation of any node vi may be represented as Neibor = (nbi1, nbi2, …, nbin); if a social relation exists between vi and vj, that is, the edge (vi, vj) ∈ E with i ≠ j, then nbij = 0, otherwise nbij = 1; the structural feature distance between nodes is:
Figure BDA0002551386760000051
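The neighbor vector defined above can be built directly from the edge set. The distance formula itself is in a figure that is not reproduced here, so the sketch below substitutes an assumed measure: the fraction of other nodes on which the two neighbor vectors disagree (a normalized Hamming distance). The function names and the exclusion of positions i and j from the comparison are likewise assumptions.

```python
def neighbor_vector(i, n, edges):
    """Neighbor vector of node i: 0 where an edge (i, j) exists, 1 otherwise.

    edges is a set of frozenset({u, v}) pairs for an undirected graph.
    """
    return [0 if i != j and frozenset((i, j)) in edges else 1 for j in range(n)]


def sifd(i, j, n, edges):
    """Assumed structural distance: share of other nodes where nb_i and nb_j differ."""
    nb_i = neighbor_vector(i, n, edges)
    nb_j = neighbor_vector(j, n, edges)
    others = [t for t in range(n) if t not in (i, j)]
    if not others:
        return 0.0
    return sum(nb_i[t] != nb_j[t] for t in others) / len(others)
```

With this stand-in measure, two users who share exactly the same set of contacts are at structural distance 0.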
The personal information feature distance is defined in the social network G = (V, E, A), where Ai = (ai1, ai2, …, aim) is the m-dimensional attribute sequence of node vi (i = 1, 2, …, n), and is expressed as follows:
for continuous numerical attributes, the information loss is the difference between the two attribute values;
for discrete attributes, the distance is 0 when the two attribute values are equal and 1 when they are not.
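The two per-attribute rules above can be sketched in Python as follows. The source gives only the per-attribute rules; dividing a continuous difference by the attribute's value range and averaging over the m attributes are assumptions added so that the result lies between 0 and 1, and the `ranges` argument is a hypothetical way of marking which attributes are continuous.

```python
def pifd(attrs_a, attrs_b, ranges):
    """Assumed personal information feature distance between two nodes.

    attrs_a, attrs_b : m-dimensional attribute sequences of the two nodes
    ranges           : per attribute, the value range of a continuous
                       attribute, or None for a discrete attribute
    """
    total = 0.0
    for x, y, r in zip(attrs_a, attrs_b, ranges):
        if r is None:
            # Discrete attribute: 0 when equal, 1 when different.
            total += 0.0 if x == y else 1.0
        else:
            # Continuous attribute: difference of the two values,
            # scaled by the range (the scaling is an assumption).
            total += abs(x - y) / r if r else 0.0
    # Assumption: the per-attribute distances are averaged.
    return total / len(attrs_a)
```

For example, two users aged 30 and 40 (range 50) with the same gender and different cities would be at distance (0.2 + 0 + 1) / 3 under these assumptions.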
The distance between two nodes is measured by combining the structural information feature distance (SIFD) and the personal information feature distance (PIFD) of the nodes into a comprehensive distance (CD) through a parameter α; in the clustering algorithm, nodes that are closer together are aggregated into the same super point. The comprehensive distance is expressed as:
CD = α × SIFD + (1 - α) × PIFD.
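The weighted combination above, applied to every node pair, yields the distance matrix used by the clustering steps. A minimal Python sketch follows; the function names are hypothetical, and restricting α to the interval [0, 1] is an assumption, since the source does not state the admissible range.

```python
def composite_distance(sifd, pifd, alpha):
    """CD = alpha * SIFD + (1 - alpha) * PIFD for one pair of nodes."""
    if not 0.0 <= alpha <= 1.0:
        # Assumption: alpha is a weight between 0 and 1.
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * sifd + (1.0 - alpha) * pifd


def distance_matrix(sifd_matrix, pifd_matrix, alpha):
    """Combine two n x n distance matrices entry by entry."""
    n = len(sifd_matrix)
    return [
        [composite_distance(sifd_matrix[i][j], pifd_matrix[i][j], alpha)
         for j in range(n)]
        for i in range(n)
    ]
```

A larger α weights the social structure more heavily, and a smaller α weights the personal attributes more heavily.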
In summary, the above technical solution of the present invention achieves the following effects:
1) a social network model of the actual problem is established according to the actual situation of the social network, clustering is performed on the social network graph, and data anonymization is completed to form an anonymous social network;
2) according to the characteristics of the social network, the social relationship distance and the user information distance between network nodes are quantified separately, and the distance between network nodes and the super points formed by clustering is calculated in order to perform clustering;
3) in view of the characteristics of the social network and the clustering algorithm, and of the fact that different users have different privacy protection requirements, the clustering algorithm is optimized by combining the clustering coefficient and the node density, and data privacy is protected to different degrees.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (7)

1. The social information security management method is characterized by comprising the following steps:
obtaining a seed node set: determining the aggregation density of all nodes and generating the seed nodes from the set of nodes with high aggregation density, wherein the outer loop runs over the number of clusters and the inner loop runs over the high-aggregation-density set;
computing a distance matrix;
and allocating nodes, wherein the outer loop runs over the unallocated nodes and the inner loop runs over the clusters, and the nodes in each cluster are then anonymized.
2. The social information security management method of claim 1, wherein:
a high-aggregation-density region is obtained to give a set High;
the node with the maximum aggregation density is selected as the first initial central node seed1;
the points in the set High that are farthest from seed1 are collected into a set, and the point with the highest density in that set is selected as seed2;
the initial seed node is represented as:
Figure FDA0002551386750000011
3. The method of claim 2, further comprising anonymizing the social network: given a social network G = (V, E, A), the nodes are clustered according to the computed similarity between nodes so that the number of nodes in each cluster is greater than or equal to k; clustering divides all nodes in the social network, that is, the point set V, into a cluster set Sclt = {clt1, clt2, …, clts} with ∪clti = V,
Figure FDA0002551386750000012
i, j ∈ {1, 2, …, n}, i ≠ j; the anonymized social network is Gano = (Vano, Eano, Aano), wherein Vano = {vclt1, vclt2, …, vclts}, vclti is an anonymized node of the anonymized network, and Eano ⊆ Vano × Vano, with vclti, vcltj ∈ Vano and (vclti, vcltj) ∈ Eano.
4. The method for social information security management according to claim 3, wherein a social network information set is predetermined, comprising users' personal information and social relationship information, and is described by a labeled undirected graph denoted G = (V, E, A), where V = {v1, v2, …, vn} is the set of nodes in the social network and vi (i = 1, 2, …, n) denotes any user in the social network; E = {(vi, vj) | i ≠ j, 1 ≤ i, j ≤ n} is the edge set of the social network, where (vi, vj) represents the social relationship between users vi and vj; and A = {A1, A2, …, An} is the attribute set of the social network, that is, the set of users' personal information, where Ai = (ai1, ai2, …, aim) is the m-dimensional attribute sequence of node vi (i = 1, 2, …, n).
5. The method for managing social information security as claimed in claim 4, wherein the structural information feature distance is expressed as follows:
in the social network G = (V, E, A) with node set V = {v1, v2, …, vn}, the neighbor relation of any node vi may be represented as Neibor = (nbi1, nbi2, …, nbin); if a social relation exists between vi and vj, that is, the edge (vi, vj) ∈ E with i ≠ j, then nbij = 0, otherwise nbij = 1; the structural feature distance between nodes is:
Figure FDA0002551386750000021
6. The social information security management method of claim 5, wherein the personal information feature distance is defined in the social network G = (V, E, A), where Ai = (ai1, ai2, …, aim) is the m-dimensional attribute sequence of node vi (i = 1, 2, …, n), and is expressed as follows:
for continuous numerical attributes, the information loss is the difference between the two attribute values;
for discrete attributes, the distance is 0 when the two attribute values are equal and 1 when they are not.
7. The social information security management method of claim 1, wherein the distance between two nodes is measured by combining the structural information feature distance (SIFD) and the personal information feature distance (PIFD) of the nodes into a comprehensive distance (CD) through a parameter α; in the clustering algorithm, nodes that are closer together are aggregated into the same super point; the comprehensive distance is expressed as:
CD = α × SIFD + (1 - α) × PIFD.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010577101.0A CN111767567A (en) 2020-06-22 2020-06-22 Social information security management method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010577101.0A CN111767567A (en) 2020-06-22 2020-06-22 Social information security management method

Publications (1)

Publication Number Publication Date
CN111767567A (en) 2020-10-13

Family

ID=72721725

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010577101.0A Withdrawn CN111767567A (en) 2020-06-22 2020-06-22 Social information security management method

Country Status (1)

Country Link
CN (1) CN111767567A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113486396A (en) * 2021-07-02 2021-10-08 北京工业大学 Social network-oriented high-availability K-anonymous data processing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
Du et al. Big data privacy preserving in multi-access edge computing for heterogeneous Internet of Things
Singh et al. Fuzzy-folded bloom filter-as-a-service for big data storage in the cloud
Langari et al. Combined fuzzy clustering and firefly algorithm for privacy preserving in social networks
Ding et al. Efficient fault-tolerant group recommendation using alpha-beta-core
Wu et al. A survey of privacy-preservation of graphs and social networks
US8976710B2 (en) Methods for discovering and analyzing network topologies and devices thereof
CN104809408B (en) A kind of histogram dissemination method based on difference privacy
Zhang et al. Towards privacy preserving publishing of set-valued data on hybrid cloud
Yin et al. Attribute couplet attacks and privacy preservation in social networks
Kreso et al. Data mining privacy preserving: Research agenda
Sopaoglu et al. A top-down k-anonymization implementation for apache spark
CN111475838A (en) Graph data anonymizing method, device and storage medium based on deep neural network
CN107070932B (en) Anonymous method for preventing label neighbor attack in social network dynamic release
Wu et al. A multi-threshold ant colony system-based sanitization model in shared medical environments
Tai et al. Structural diversity for resisting community identification in published social networks
CN109614521B (en) Efficient privacy protection sub-graph query processing method
Anand et al. Privacy preserving framework using Gaussian mutation based firebug optimization in cloud computing
CN111767567A (en) Social information security management method
Liu et al. Randomized perturbation for privacy-preserving social network data publishing
Liu et al. K‐anonymity against neighborhood attacks in weighted social networks
CN113743496A (en) K-anonymous data processing method and system based on cluster mapping
CN106778352B (en) Multisource privacy protection method for combined release of set value data and social network data
Casas-Roma et al. Evolutionary algorithm for graph anonymization
Dhia Access control in social networks: a reachability-based approach
Petkos et al. Social circle discovery in ego-networks by mining the latent structure of user connections and profile attributes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201013