CN110956553A - Community structure division method based on social network node dual-label propagation algorithm - Google Patents

Info

Publication number
CN110956553A
Authority
CN
China
Prior art keywords
node
social network
label
nodes
dual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911293324.8A
Other languages
Chinese (zh)
Inventor
郑文锋
杨波
尹超
刘珊
曾庆川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201911293324.8A
Publication of CN110956553A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01: Social networking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/901: Indexing; Data structures therefor; Storage structures
    • G06F16/9024: Graphs; Linked lists

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a community structure division method based on a social network node dual-label propagation algorithm. The method first constructs a social network graph and assigns labels and initial membership values to the nodes in the graph, then calculates the weight of each node using a K-shell algorithm, and finally performs dual-label propagation over the social network nodes by iterative updating. By increasing the number of labels per node and introducing membership values, the method takes the social complexity of social network nodes into account, which improves the stability and accuracy of the iteration results and makes the community structure division more accurate.

Description

Community structure division method based on social network node dual-label propagation algorithm
Technical Field
The invention belongs to the technical field of social networks, and particularly relates to a community structure division method based on a social network node dual-label propagation algorithm.
Background
The label propagation algorithm is widely used in social network analysis, where it is often applied to automatically mine community structures from social relationships. Mining the community structure of a network reveals the organizational structure and social functions hidden in the network, as well as attributes shared among community members, such as common hobbies. By studying the relationships among communities, among individuals, and between individuals and communities in a social network, a large amount of valuable information can be mined and applied in many fields.
The existing label propagation algorithm mainly comprises the following steps: (1) In the initial stage, each node in the social network is assigned a unique label L, which serves as the node's initial label value and is typically a string. (2) The labels are then propagated along social relationships (i.e., the edges of the social network) to neighboring nodes over multiple rounds of iteration. In each round, a node decides which label to adopt according to the labels of the nodes it shares an edge with. The basic principle is that the node adopts the label that occurs most often among its neighbors; if several labels tie for the maximum count, one of them is chosen at random. Every node re-determines its label for the round according to this principle, which completes one round of label assignment. (3) When the labels of most nodes no longer change after multiple rounds of iteration, the final result is obtained.
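The three steps of the existing algorithm can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `label_propagation`, the adjacency-map input, the fixed random seed, and the choice to stop only when no label changes in a round (a stricter condition than "most nodes unchanged") are all assumptions made for illustration.

```python
import random
from collections import Counter

def label_propagation(adj, max_rounds=100, seed=0):
    """Classic single-label propagation on an undirected graph.

    adj maps each node to the set of its neighbors.
    Returns a dict mapping each node to its final label.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}            # step (1): unique initial label
    for _ in range(max_rounds):             # step (2): iterate in rounds
        changed = 0
        order = list(adj)
        rng.shuffle(order)
        for v in order:
            if not adj[v]:
                continue                    # an isolated node keeps its label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            new_label = rng.choice(candidates)   # ties are broken at random
            if new_label != labels[v]:
                labels[v] = new_label
                changed += 1
        if changed == 0:                    # step (3): assumed stop rule
            break
    return labels
```

Because ties are resolved randomly, two runs with different seeds can return different partitions of the same graph, which is the instability the background section goes on to criticize.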
In the prior art, the granted patent "201611263101.3" discloses a label propagation method, which comprises: calculating a weight value for each node in a social network; transmitting the label and weight of each node to the receiving nodes connected to it by an edge; and iteratively assigning a new label to each receiving node according to the number of labels it receives and the weight values of the label source nodes, until a preset label propagation end condition is met. However, that patent initializes each node with only a single label and does not fully account for the complexity of real people acting as social nodes, so its results are inaccurate.
In summary, the existing label propagation algorithms produce unstable iteration results mainly for three reasons: first, the label update rule is too simple; second, each node is initialized with only a single label; third, the degree of membership of a label to its node is not considered. As a result, most nodes update their labels by random selection, and the iteration results are unstable and of low accuracy.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a community structure division method based on a social network node dual-label propagation algorithm.
In order to achieve the above object, the present invention provides a community structure division method based on a social network node dual-label propagation algorithm, which is characterized by comprising the following steps:
(1) constructing a social network graph;
reading social network data, and constructing a social network graph with social network users as nodes and relationships among the users as edges;
(2) initializing node labels and membership degrees;
in the social network graph, two labels are assigned to each node $V_i$, and the initial membership value of each of the two labels is set to 0.5, where $i = 1, 2, \dots, n$ and $n$ represents the number of nodes in the social network graph;
(3) calculating the weight of each node by using a K-shell algorithm;
(4) realizing the dual-label propagation of the social network nodes in an iterative updating mode;
(4.1) setting the current iteration number k = 1 and the maximum number of iterations K; setting a threshold P for the percentage of nodes whose labels remain unchanged during a round of propagation;
(4.2) traversing all nodes in the social network graph in random order and updating the dual labels of each node according to the node dual-label propagation rule, wherein the specific updating process is as follows:
(4.2.1) randomly selecting a node $V_i$ that has not yet been updated;
(4.2.2) applying the dual-label propagation rule to all neighbor nodes of node $V_i$;
computing the strength of each label that node $V_i$ receives from its neighbor nodes:
$$S_L(V_i) = \sum_{U \in N_L(V_i)} K_U \cdot \mu_U(L)$$
wherein $S_L(V_i)$ represents the total strength that node $V_i$ receives from all neighbor nodes carrying label name L, $N_L(V_i)$ represents the set of neighbor nodes connected to node $V_i$ that carry label name L, $K_U$ represents the weight value of neighbor node U, and $\mu_U(L)$ represents the membership value of label name L in neighbor node U;
(4.2.3) updating the labels of node $V_i$;
from all the labels received by node $V_i$, the two labels with the greatest strength are selected as the new labels of node $V_i$, and the node is updated accordingly;
let the two selected maximum strength values be
[equation image BDA0002319730780000031, not reproduced in the text]
then the membership values of the two new labels of node $V_i$ are updated as follows:
[equation image BDA0002319730780000032, not reproduced in the text]
(4.2.4) performing the dual-label update on all nodes in the social network graph that have not yet been updated, according to steps (4.2.1)-(4.2.3);
(4.3) after the current round of iteration, counting the percentage of nodes whose label names are unchanged and comparing it with the preset threshold P; if the percentage is greater than P, or the current iteration number has reached the maximum number of iterations K, stopping the iteration and proceeding to step (5); otherwise, setting k = k + 1 and returning to step (4.2) for the next round of iteration;
(5) obtaining the community structure of the social network;
dividing the nodes with the same label into the same community, thereby obtaining the community structure of the social network.
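The stopping test of step (4.3) can be sketched as a small Python function. The name `should_stop`, the representation of a node's two label names as a `frozenset`, and the reading that a node counts as unchanged when both label names are the same regardless of order are illustrative assumptions, not details given in the text.

```python
def should_stop(old_labels, new_labels, k, max_iter, threshold):
    """Return True when the iteration of step (4) should end.

    old_labels / new_labels map each node to a frozenset holding its two
    label names before and after the current round (assumed representation).
    threshold is P as a fraction (for example 0.95), k is the current
    iteration number, and max_iter is the maximum number of iterations K.
    """
    if not new_labels:
        return True                      # nothing to update
    unchanged = sum(1 for v in new_labels if new_labels[v] == old_labels[v])
    fraction = unchanged / len(new_labels)
    # Stop when enough nodes kept their label names or the budget is used up.
    return fraction > threshold or k >= max_iter
```

With P = 0.95, a round in which 96 of 100 nodes keep their label names ends the iteration, while a round in which only 90 do continues unless k has reached K.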
The object of the invention is achieved as follows:
The community structure division method based on a social network node dual-label propagation algorithm of the invention first constructs a social network graph and assigns labels and initial membership values to the nodes in the graph, then calculates the weight of each node using a K-shell algorithm, and finally performs dual-label propagation over the social network nodes by iterative updating. By increasing the number of labels per node and introducing membership values, the method takes the social complexity of social network nodes into account, which improves the stability and accuracy of the iteration results and makes the community structure division more accurate.
Drawings
FIG. 1 is a flow chart of a community structure partitioning method based on a social network node dual-label propagation algorithm according to the present invention;
FIG. 2 is a diagram of an initialized social network.
Detailed Description
Specific embodiments of the present invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note that in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the subject matter of the present invention.
Examples
FIG. 1 is a flow chart of a community structure division method based on a social network node dual-label propagation algorithm.
In this embodiment, as shown in fig. 1, the method for dividing a community structure based on a social network node dual-label propagation algorithm of the present invention includes the following steps:
S1, constructing a social network graph;
reading social network data and constructing a social network graph with social network users as nodes and the relationships among users as edges;
in this embodiment, taking a microblog network as an example, each registered microblog user is a node in the social network, and the mutual follow and comment relationships among users are the edges of the social network.
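As a minimal sketch of step S1, the snippet below builds an undirected adjacency map from a list of user relationships. The function name `build_graph`, the sample user names, and the decision to treat every relationship as an unweighted, undirected edge are assumptions for illustration; the text does not specify a data structure.

```python
from collections import defaultdict

def build_graph(relations):
    """Build an undirected adjacency map from (user_a, user_b) pairs."""
    adj = defaultdict(set)
    for a, b in relations:
        if a == b:
            continue              # ignore a relation of a user with itself
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)

# Hypothetical mutual follow / comment relationships.
relations = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("alice", "carol"),
    ("carol", "dave"),
]
graph = build_graph(relations)
```

Storing neighbors in sets makes duplicate relationships harmless and keeps neighbor lookups cheap in the later propagation steps.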
S2, initializing node labels and membership degrees;
in the social network graph, two labels are assigned to each node $V_i$, and the initial membership values of both labels are set to 0.5, where $i = 1, 2, \dots, n$ and $n$ represents the number of nodes in the social network graph; in this embodiment, the initialized social network graph is shown in FIG. 2.
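Step S2 can be sketched as follows. The text does not say how the two initial label names are chosen (they appear only in equation images), so the naming scheme `(node, 0)` and `(node, 1)` used here is a placeholder assumption, as is the function name `init_dual_labels`.

```python
def init_dual_labels(nodes):
    """Give every node two labels, each with membership value 0.5.

    The label names (node, 0) and (node, 1) are a placeholder naming
    scheme, not the one used in the patent.
    """
    return {v: {(v, 0): 0.5, (v, 1): 0.5} for v in nodes}

state = init_dual_labels(["a", "b", "c"])
```

Each node's state is a small mapping from label name to membership value, so the two memberships of a node always sum to 1 after initialization.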
S3, calculating the weight of each node by using a K-shell algorithm;
the specific process of calculating the weight of each node in the social network with the K-shell algorithm is as follows: first, set k = 1 and compute the current degree of each node in the social network; then delete every node whose degree is less than or equal to k, together with its edges, and mark the weight of each deleted node as k; because deleting edges may reduce the degree of some remaining nodes, recompute the degree of every remaining node and continue deleting nodes whose degree is less than or equal to k, together with their edges, iterating until every remaining node has a degree greater than k; then increase k by 1 and repeat the above steps until every node in the social network has been marked with a k value.
In this embodiment, the weight of each node may alternatively be computed with a network node centrality algorithm, a node degree algorithm, or a random walk algorithm; the specific calculation processes are not described here.
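The peeling procedure of step S3 can be sketched in Python as below. The function name `k_shell_weights` and the adjacency-map input are assumptions; the logic follows the description above, including the consequence that an isolated node (degree 0) receives weight 1.

```python
def k_shell_weights(adj):
    """Assign each node a K-shell weight by repeatedly peeling low-degree nodes.

    adj maps each node to the set of its neighbors (undirected graph).
    """
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}  # working copy
    weight = {}
    k = 1
    while remaining:
        while True:
            # Nodes whose current degree is <= k are deleted in this pass.
            peel = [v for v, nbrs in remaining.items() if len(nbrs) <= k]
            if not peel:
                break             # every remaining node has degree > k
            for v in peel:
                weight[v] = k
                for u in remaining[v]:
                    remaining[u].discard(v)   # drop the edge (u, v)
                del remaining[v]
        k += 1
    return weight

# A triangle 1-2-3 with a pendant node 4 attached to node 3.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
weights = k_shell_weights(example)
```

In the example, node 4 is peeled at k = 1, after which the triangle nodes all have degree 2 and are peeled together at k = 2.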
S4, realizing the dual-label propagation of the social network nodes in an iterative updating mode;
S4.1, setting the current iteration number k = 1 and the maximum number of iterations K; setting the threshold for the percentage of nodes whose labels remain unchanged during a round of propagation to P = 95%;
S4.2, traversing all nodes in the social network graph in random order and updating the dual labels of each node according to the node dual-label propagation rule, wherein the specific updating process is as follows:
S4.2.1, randomly selecting a node $V_i$ that has not yet been updated;
S4.2.2, applying the dual-label propagation rule to all neighbor nodes of node $V_i$;
computing the strength of each label that node $V_i$ receives from its neighbor nodes:
$$S_L(V_i) = \sum_{U \in N_L(V_i)} K_U \cdot \mu_U(L)$$
wherein $S_L(V_i)$ represents the total strength that node $V_i$ receives from all neighbor nodes carrying label name L, $N_L(V_i)$ represents the set of neighbor nodes connected to node $V_i$ that carry label name L, $K_U$ represents the weight value of neighbor node U, and $\mu_U(L)$ represents the membership value of label name L in neighbor node U;
S4.2.3, updating the labels of node $V_i$;
from all the labels received by node $V_i$, the two labels with the greatest strength are selected as the new labels of node $V_i$, and the node is updated accordingly;
let the two selected maximum strength values be
[equation image BDA0002319730780000054, not reproduced in the text]
then the membership values of the two new labels of node $V_i$ are updated as follows:
[equation image BDA0002319730780000055, not reproduced in the text]
S4.2.4, performing the dual-label update on all nodes in the social network graph that have not yet been updated, according to steps S4.2.1-S4.2.3;
S4.3, after the current round of iteration, counting the percentage of nodes whose label names are unchanged and comparing it with the preset threshold P; if the percentage is greater than P, or the current iteration number has reached the maximum number of iterations K, stopping the iteration and proceeding to step S5; otherwise, setting k = k + 1 and returning to step S4.2 for the next round of iteration;
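One round of step S4.2 might be sketched as follows. Several points are assumptions rather than statements from the patent: the strength of a label is taken to be the sum of neighbor weight times neighbor membership, the new memberships are taken to be the two strongest strengths normalized to sum to 1 (the patent's actual update formula is only available as an image), updates are applied asynchronously so that later nodes see earlier updates, and ties between equal strengths are broken by label order. The names `dual_label_round` and `unchanged_fraction` are hypothetical.

```python
import random
from collections import defaultdict

def dual_label_round(adj, weight, state, rng):
    """Run one round of dual-label updates and return the new state.

    adj    : node -> set of neighbor nodes
    weight : node -> K-shell weight of that node
    state  : node -> {label: membership}, two labels per node
    """
    new_state = dict(state)
    order = list(adj)
    rng.shuffle(order)                       # S4.2: random traversal order
    for v in order:
        strength = defaultdict(float)
        for u in adj[v]:
            # Assumed strength rule: sum of neighbor weight * membership.
            for label, mu in new_state[u].items():
                strength[label] += weight[u] * mu
        if len(strength) < 2:
            continue                         # too few labels to pick two
        # S4.2.3: keep the two strongest labels (ties broken by label order).
        top = sorted(strength.items(), key=lambda kv: (-kv[1], repr(kv[0])))[:2]
        total = top[0][1] + top[1][1]
        # Assumption: memberships are the two strengths normalized to sum to 1.
        new_state[v] = {label: s / total for label, s in top}
    return new_state

def unchanged_fraction(old, new):
    """Fraction of nodes whose pair of label names did not change."""
    same = sum(1 for v in new if set(old[v]) == set(new[v]))
    return same / len(new)

adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
weight = {1: 2, 2: 2, 3: 2}
state = {v: {(v, 0): 0.5, (v, 1): 0.5} for v in adj}
after = dual_label_round(adj, weight, state, random.Random(0))
```

The value returned by `unchanged_fraction(state, after)` is what step S4.3 compares against the threshold P.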
S5, dividing the community structure of the social network;
dividing the nodes with the same label into the same community, thereby obtaining the community structure of the social network.
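Step S5 can be sketched as a grouping of nodes by label name. Since each node ends with two labels, the text leaves open whether a node belongs to one or two communities; the sketch below assumes the overlapping reading, in which a node joins the community of each label it holds. The function name `communities_from_labels` and the sample state are hypothetical.

```python
from collections import defaultdict

def communities_from_labels(state):
    """Group nodes by label name; a node with two labels joins two groups."""
    groups = defaultdict(set)
    for node, labels in state.items():
        for label in labels:
            groups[label].add(node)
    return dict(groups)

# Hypothetical final state: node -> {label: membership}.
state = {
    "a": {"x": 0.6, "y": 0.4},
    "b": {"x": 0.7, "y": 0.3},
    "c": {"y": 0.5, "z": 0.5},
}
communities = communities_from_labels(state)
```

A non-overlapping partition could be obtained instead by keeping only the label with the larger membership value for each node before grouping.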
In summary, when dual node labels are propagated in a social network, the label propagation process is divided into three stages: initializing labels, propagating labels, and updating labels. First, a social network graph is constructed. The labels and membership degrees of each node are then initialized according to the social network graph, with two labels assigned to each node and each membership degree set to 0.5. Next, the weight of each node in the social network is calculated. The dual labels of each node are updated according to the node dual-label propagation rule and the dual-label selection rule, and this update step is executed iteratively until a preset dual-label propagation end condition is met. Finally, the community structure of the social network is obtained from the dual labels of the nodes. By introducing dual labels and label membership degrees and by calculating node weights, the method improves the stability and accuracy of the iteration results of label propagation. The method can be applied in fields such as target group mining and precision marketing.
Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be understood that the invention is not limited to the scope of these embodiments. To those skilled in the art, various changes are apparent as long as they fall within the spirit and scope of the present invention as defined and determined by the appended claims, and all inventions that make use of the inventive concept are protected.

Claims (2)

1. A community structure division method based on a social network node dual-label propagation algorithm is characterized by comprising the following steps:
(1) constructing a social network graph;
reading social network data, and constructing a social network graph with social network users as nodes and relationships among the users as edges;
(2) initializing node labels and membership degrees;
in the social network graph, two labels are assigned to each node $V_i$, and the initial membership value of each of the two labels is set to 0.5;
(3) Calculating the weight of each node by using a K-shell algorithm;
(4) realizing the dual-label propagation of the social network nodes in an iterative updating mode;
(4.1) setting the current iteration number k = 1 and the maximum number of iterations K; setting a threshold P for the percentage of nodes whose labels remain unchanged during a round of propagation;
(4.2) traversing all nodes in the social network graph in random order and updating the dual labels of each node according to the node dual-label propagation rule, wherein the specific updating process is as follows:
(4.2.1) randomly selecting a node $V_i$ that has not yet been updated;
(4.2.2) applying the dual-label propagation rule to all neighbor nodes of node $V_i$;
computing the strength of each label that node $V_i$ receives from its neighbor nodes:
$$S_L(V_i) = \sum_{U \in N_L(V_i)} K_U \cdot \mu_U(L)$$
wherein $S_L(V_i)$ represents the total strength that node $V_i$ receives from all neighbor nodes carrying label name L, $N_L(V_i)$ represents the set of neighbor nodes connected to node $V_i$ that carry label name L, $K_U$ represents the weight value of neighbor node U, and $\mu_U(L)$ represents the membership value of label name L in neighbor node U;
(4.2.3) updating the labels of node $V_i$;
(4.2.4) performing the dual-label update on all nodes in the social network graph that have not yet been updated, according to steps (4.2.1)-(4.2.3);
(4.3) after the current round of iteration, counting the percentage of nodes whose label names are unchanged and comparing it with the preset threshold P; if the percentage is greater than P, or the current iteration number has reached the maximum number of iterations K, stopping the iteration and proceeding to step (5); otherwise, setting k = k + 1 and returning to step (4.2) for the next round of iteration;
(5) obtaining the community structure of the social network;
dividing the nodes with the same label into the same community, thereby obtaining the community structure of the social network.
2. The community structure division method based on a social network node dual-label propagation algorithm according to claim 1, wherein the specific process of updating the labels of node $V_i$ is as follows:
from all the labels received by node $V_i$, the two labels with the greatest strength are selected as the new labels of node $V_i$, and the node is updated accordingly;
let the two selected maximum strength values be
[equation image FDA0002319730770000021, not reproduced in the text]
then the membership values of the two new labels of node $V_i$ are updated as follows:
[equation image FDA0002319730770000022, not reproduced in the text]
CN201911293324.8A 2019-12-16 2019-12-16 Community structure division method based on social network node dual-label propagation algorithm Pending CN110956553A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911293324.8A CN110956553A (en) 2019-12-16 2019-12-16 Community structure division method based on social network node dual-label propagation algorithm

Publications (1)

Publication Number Publication Date
CN110956553A true CN110956553A (en) 2020-04-03

Family

ID=69981892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911293324.8A Pending CN110956553A (en) 2019-12-16 2019-12-16 Community structure division method based on social network node dual-label propagation algorithm

Country Status (1)

Country Link
CN (1) CN110956553A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177854A (en) * 2021-04-23 2021-07-27 携程计算机技术(上海)有限公司 Community division method and system, electronic device and storage medium
CN114615090A (en) * 2022-05-10 2022-06-10 富算科技(上海)有限公司 Data processing method, system, device and medium based on cross-domain label propagation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140003425A1 (en) * 2012-06-29 2014-01-02 Futurewei Technologies, Inc. Implementing a Multicast Virtual Private Network by Using Multicast Resource Reservation Protocol-Traffic Engineering
CN103729475A (en) * 2014-01-24 2014-04-16 福州大学 Multi-label propagation discovery method of overlapping communities in social network
CN106022371A (en) * 2016-05-18 2016-10-12 电子科技大学 Community discovery algorithm based on APR algorithm and MAP algorithm
CN106789588A (en) * 2016-12-30 2017-05-31 东软集团股份有限公司 Label transmission method and device
CN109255054A (en) * 2017-07-14 2019-01-22 元素征信有限责任公司 A kind of community discovery algorithm in enterprise's map based on relationship weight

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200403