CN110956553A - Community structure division method based on social network node dual-label propagation algorithm - Google Patents
- Publication number
- CN110956553A (application number CN201911293324.8A)
- Authority
- CN
- China
- Prior art keywords
- node
- social network
- label
- nodes
- dual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Economics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a community structure division method based on a social network node dual-label propagation algorithm. The method first constructs a social network graph and assigns labels and initial membership values to the nodes in the graph, then calculates the weight of each node using the K-shell algorithm, and finally realizes dual-label propagation over the social network nodes by iterative updating. By increasing the number of labels per node and introducing membership values, the method takes the social complexity of social network nodes into account, which can improve the stability and accuracy of the iteration results and make the community structure division more accurate.
Description
Technical Field
The invention belongs to the technical field of social networks, and particularly relates to a community structure division method based on a social network node dual-label propagation algorithm.
Background
The label propagation algorithm is an algorithm widely adopted in the field of social network analysis, and is often used for automatically mining community structures in social relationships. By mining the community structure in the network, the organization structure information and social functions hidden in the network and interesting attributes hidden among community members, such as common hobbies and the like, can be found. By researching the relationships among communities, individuals and the relationships between individuals and communities in the social network, a large amount of valuable information can be mined, and the method can be applied to many fields.
The existing label propagation algorithm mainly comprises the following steps: (1) In the initial stage, each node in the social network is assigned a unique label L as its initial label value, usually a value such as a character string. (2) The labels are then propagated along social relationships (i.e., edges in the social network) to neighboring nodes through multiple rounds of iterative computation. In one round of iteration, each node decides which label it should take in that round according to the labels of the nodes it shares an edge with. The basic principle is that the node adopts the label held by the largest number of its neighbor nodes; if several labels are tied for the largest count, one of them is assigned at random. Every node re-determines its new label for the round according to this principle, which completes one round of label assignment. (3) When the labels of most nodes no longer change after multiple rounds of iteration, the final result is obtained.
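The three steps of the existing algorithm can be sketched in Python as follows. This is a minimal illustration, not the implementation of any cited patent: the function name `label_propagation`, the adjacency-dictionary input, the fixed random seed, and the stricter stopping rule (stop only when no label changed in a round, rather than when "most" labels are stable) are all assumptions made for the example.

```python
import random
from collections import Counter


def label_propagation(adj, max_rounds=100, seed=0):
    """Classic single-label propagation on an undirected graph.

    adj maps every node to the set of its neighbours (assumed symmetric).
    Returns a dict mapping each node to its final label.
    """
    rng = random.Random(seed)
    # Step (1): every node starts with a unique label (here, its own id).
    labels = {v: v for v in adj}
    for _ in range(max_rounds):
        order = list(adj)
        rng.shuffle(order)              # visit nodes in random order
        changed = 0
        for v in order:
            if not adj[v]:
                continue                # an isolated node keeps its label
            # Step (2): count the labels held by the neighbours.
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            new_label = rng.choice(candidates)   # ties are broken at random
            if new_label != labels[v]:
                labels[v] = new_label
                changed += 1
        # Step (3), simplified: stop once a whole round changes nothing.
        if changed == 0:
            break
    return labels
```

Because ties are resolved randomly, two runs with different seeds can produce different partitions of the same graph, which is the instability the following paragraphs discuss.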
In the prior art, the granted patent "201611263101.3" discloses a label propagation method that includes: calculating a weight value for each node in a social network; transmitting the label and weight of each node to the receiving nodes connected to it by an edge; and iteratively assigning a new label to each receiving node according to the number of labels it has received and the weight values of the label source nodes, until a preset label propagation end condition is met. However, that patent only considers initializing each node with a single label and does not fully account for the complexity of people as social nodes, so the result is inaccurate.
In summary, the main factors that make the iteration results of the existing label propagation algorithm unstable are: first, the label update principle is too simple; second, each node is initialized with a single label; third, the degree of membership of a label to a node is not considered. As a result, most nodes update their labels by random selection, and the iteration result is unstable and of low accuracy.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a community structure division method based on a social network node dual-label propagation algorithm.
In order to achieve the above object, the present invention provides a community structure division method based on a social network node dual-label propagation algorithm, which is characterized by comprising the following steps:
(1) constructing a social network graph;
reading social network data, and constructing a social network graph with social network users as nodes and relationships among the users as edges;
(2) initializing and assigning node labels and membership degrees;
in the social network graph, two labels are allocated to each node $V_i$ and an initial membership value is set for each of the two labels, wherein i = 1, 2, … indexes the nodes and runs up to the number of nodes in the social network graph;
(3) calculating the weight of each node by using a K-shell algorithm;
(4) realizing the dual-label propagation of the social network nodes in an iterative updating mode;
(4.1) setting the initial iteration number k = 1 and the maximum iteration number K; setting a threshold P for the percentage of nodes whose labels remain unchanged during a propagation round;
(4.2) randomly traversing all nodes in the social network graph, and updating the node double labels according to the node double label propagation rule, wherein the specific updating process is as follows:
(4.2.1) randomly selecting a node $V_i$ that has not yet been updated;
(4.2.2) applying the dual-label propagation rule to all neighbor nodes of node $V_i$;
computing the strength of each label that node $V_i$ receives from its neighbor nodes:
$$S_L(V_i) = \sum_{U \in N_L(V_i)} K_U \cdot M_L(U)$$
wherein $S_L(V_i)$ represents the total strength that node $V_i$ receives for label name L from all of its neighbor nodes, $N_L(V_i)$ represents the set of neighbor nodes connected to node $V_i$ that hold label name L, $K_U$ represents the weight value of neighbor node U, and $M_L(U)$ represents the membership value of label name L in neighbor node U;
(4.2.3) updating the labels of node $V_i$;
from all the labels received by node $V_i$, selecting the two labels with the greatest strength as the new labels of node $V_i$ and updating the node accordingly;
taking the strength values of the two selected labels, the membership values of the two new labels of node $V_i$ are then updated from these two strengths (the update formula is not reproduced in this text);
(4.2.4) performing the dual-label update on all nodes in the social network graph that have not yet been updated, according to the method of steps (4.2.1) to (4.2.3);
(4.3) after the current round of iteration is finished, counting the percentage of nodes whose label names are unchanged and comparing it with the preset threshold P; if the percentage is greater than P, or the current iteration number k has reached the maximum iteration number K, stopping the iteration and proceeding to step (5); otherwise, setting k = k + 1 and returning to step (4.2) for the next round of iteration;
(5) obtaining the community structure of the social network;
dividing the nodes with the same label into the same community, thereby obtaining the community structure of the social network.
The object of the invention is achieved as follows:
The invention relates to a community structure division method based on a social network node dual-label propagation algorithm. The method first constructs a social network graph and assigns labels and initial membership values to the nodes in the graph, then calculates the weight of each node using the K-shell algorithm, and finally realizes dual-label propagation over the social network nodes by iterative updating. By increasing the number of labels per node and introducing membership values, the method takes the social complexity of social network nodes into account, which can improve the stability and accuracy of the iteration results and make the community structure division more accurate.
Drawings
FIG. 1 is a flow chart of a community structure partitioning method based on a social network node dual-label propagation algorithm according to the present invention;
FIG. 2 is a diagram of an initialized social network.
Detailed Description
Embodiments of the present invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note that detailed descriptions of known functions and designs are omitted where they would obscure the subject matter of the present invention.
Examples
FIG. 1 is a flow chart of a community structure division method based on a social network node dual-label propagation algorithm.
In this embodiment, as shown in fig. 1, the method for dividing a community structure based on a social network node dual-label propagation algorithm of the present invention includes the following steps:
S1, constructing a social network graph;
reading social network data, and constructing a social network graph with social network users as nodes and relationships among the users as edges;
in this embodiment, taking a microblog network as an example, each registered microblog user is a node in the social network, and the mutual follow and comment relationships among users are the edges of the social network.
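As an illustration of step S1, the sketch below builds an undirected graph from a list of user pairs. The function name `build_graph`, the edge-list input format, and the decision to ignore self-loops and duplicate pairs are assumptions for this example; the source does not specify how the social network data is stored.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected social network graph as an adjacency dictionary.

    edges is an iterable of (user_a, user_b) pairs, one per relationship
    (for example a mutual follow or a comment). Returns a dict mapping
    each user to the set of users it is connected to.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue            # assumption: self-relationships are ignored
        adj[u].add(v)           # store the edge in both directions
        adj[v].add(u)
    return dict(adj)


# Example: three users, where one relationship is reported twice.
graph = build_graph([("a", "b"), ("b", "c"), ("a", "b")])
```

Storing neighbours in sets makes duplicate relationships harmless and keeps neighbour lookups cheap during the propagation rounds.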
S2, initializing and assigning node labels and membership degrees;
in the social network graph, two labels are allocated to each node $V_i$ and an initial membership value is set for each of the two labels; the initial membership values are all 0.5, wherein i = 1, 2, … indexes the nodes and runs up to the number of nodes in the social network graph; in the present embodiment, the initialized social network graph is shown in fig. 2.
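Step S2 can be sketched as below. The text states only that every node receives two labels with membership 0.5 each; how the two label names are chosen is not recoverable from this text, so the naming scheme here (the node id with suffixes `_a` and `_b`) and the function name `init_dual_labels` are purely hypothetical.

```python
def init_dual_labels(nodes):
    """Give every node two labels, each with initial membership 0.5.

    Returns a dict: node -> {label_name: membership}.
    The label names "<node>_a" and "<node>_b" are a hypothetical choice.
    """
    return {v: {f"{v}_a": 0.5, f"{v}_b": 0.5} for v in nodes}


labels = init_dual_labels(["u1", "u2", "u3"])
```

Keeping the memberships of a node in one small dictionary makes it easy to check that they always sum to 1.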
S3, calculating the weight of each node by using a K-shell algorithm;
the specific process of calculating the weight of each node in the social network with the K-shell algorithm is as follows: first, set k = 1 and compute the current degree of each node in the social network; then delete all nodes whose degree is less than or equal to k, together with their edges, and mark the weight of the deleted nodes as k; since deleting edges may reduce the degree of some of the remaining nodes, the degree of each node in the social network is computed again, and nodes whose degree is less than or equal to k are again deleted together with their edges; this loop is repeated until the degrees of all remaining nodes are greater than k; then k is increased by 1 and the above steps are repeated until every node in the social network has been marked with a k value.
In this embodiment, a network node centrality algorithm, a node degree algorithm, and a random walk algorithm may also be used to calculate the weight of each node, and the specific calculation process is not described herein again.
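The K-shell weighting of step S3 can be sketched as follows. The function name `k_shell` and the adjacency-dictionary input are assumptions for this example; the graph is assumed to be undirected, with every node present as a key. The sketch follows the procedure as described in the text, so an isolated node is marked with k = 1.

```python
def k_shell(adj):
    """Assign a K-shell weight to every node by iterative peeling.

    adj maps each node to the set of its neighbours (assumed symmetric).
    Returns a dict: node -> k value used as the node weight.
    """
    # Work on a copy so the caller's graph is left untouched.
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    shell = {}
    k = 1
    while remaining:
        while True:
            # Nodes whose current degree is <= k are peeled in this pass.
            peel = [v for v, nbrs in remaining.items() if len(nbrs) <= k]
            if not peel:
                break           # all remaining degrees are now > k
            for v in peel:
                shell[v] = k
                # Delete the node and its edges; neighbour degrees drop.
                for u in remaining.pop(v):
                    if u in remaining:
                        remaining[u].discard(v)
        k += 1
    return shell


# A triangle (0, 1, 2) with one pendant node 3 attached to node 0.
weights = k_shell({0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}})
```

In this example the pendant node is peeled at k = 1 and the triangle at k = 2, so nodes in the denser core receive the larger weight.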
S4, realizing the dual-label propagation of the social network nodes in an iterative updating mode;
S4.1, setting the initial iteration number k = 1 and the maximum iteration number K; setting the threshold P for the percentage of nodes whose labels remain unchanged during a propagation round to P = 95%;
S4.2, randomly traversing all nodes in the social network graph and updating the node dual labels according to the node dual-label propagation rule, wherein the specific updating process is as follows:
S4.2.1, randomly selecting a node $V_i$ that has not yet been updated;
S4.2.2, applying the dual-label propagation rule to all neighbor nodes of node $V_i$;
computing the strength of each label that node $V_i$ receives from its neighbor nodes:
$$S_L(V_i) = \sum_{U \in N_L(V_i)} K_U \cdot M_L(U)$$
wherein $S_L(V_i)$ represents the total strength that node $V_i$ receives for label name L from all of its neighbor nodes, $N_L(V_i)$ represents the set of neighbor nodes connected to node $V_i$ that hold label name L, $K_U$ represents the weight value of neighbor node U, and $M_L(U)$ represents the membership value of label name L in neighbor node U;
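The strength computation of step S4.2.2 can be sketched as below, reading the definitions above as: for each label name, sum, over the neighbours that hold that label, the neighbour's weight multiplied by the neighbour's membership value for the label. The function name `label_strengths` and the dictionary-based data layout are assumptions for this example.

```python
from collections import defaultdict


def label_strengths(v, adj, weight, membership):
    """Total strength of every label that node v receives from its neighbours.

    adj:        node -> set of neighbour nodes
    weight:     node -> weight value (for example its K-shell value)
    membership: node -> {label_name: membership value}
    Returns {label_name: total strength received by v}.
    """
    strength = defaultdict(float)
    for u in adj[v]:
        for label, m in membership[u].items():
            # Each neighbour contributes weight * membership for its labels.
            strength[label] += weight[u] * m
    return dict(strength)


# Node 0 has two neighbours; both hold label "x".
received = label_strengths(
    0,
    {0: {1, 2}},
    {1: 2, 2: 1},
    {1: {"x": 0.5, "y": 0.5}, 2: {"x": 0.25, "z": 0.75}},
)
# received == {"x": 1.25, "y": 1.0, "z": 0.75}
```

A label held by several heavy neighbours therefore outweighs a label held by a single light one, even when the light neighbour is more strongly attached to its label.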
S4.2.3, updating the labels of node $V_i$;
from all the labels received by node $V_i$, selecting the two labels with the greatest strength as the new labels of node $V_i$ and updating the node accordingly;
taking the strength values of the two selected labels, the membership values of the two new labels of node $V_i$ are then updated from these two strengths (the update formula is not reproduced in this text);
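The label selection of step S4.2.3 can be sketched as below. Selecting the two strongest labels follows the text; the membership update is an assumption, because the formula itself is not available here. This sketch normalises the two strengths so that the new memberships sum to 1. The function name `update_dual_labels`, the deterministic tie-break by label name, and the handling of nodes that receive fewer than two labels are likewise assumptions.

```python
def update_dual_labels(strengths):
    """Pick the two strongest labels and derive their membership values.

    strengths: {label_name: total strength received by the node}
    Returns {label_name: membership} for at most two labels, or None when
    the node received no usable label (the caller then keeps the old ones).
    """
    if not strengths:
        return None
    # Sort by descending strength; break ties by label name (assumption).
    ranked = sorted(strengths.items(), key=lambda kv: (-kv[1], str(kv[0])))
    top = ranked[:2]
    total = sum(s for _, s in top)
    if total <= 0:
        return None
    # ASSUMPTION: membership = strength / (sum of the two top strengths).
    return {label: s / total for label, s in top}


new_labels = update_dual_labels({"x": 1.25, "y": 1.0, "z": 0.75})
```

With this normalisation the stronger label always ends up with the larger membership, and a node that receives a single label holds it with membership 1.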
S4.2.4, performing the dual-label update on all nodes in the social network graph that have not yet been updated, according to the method of steps S4.2.1 to S4.2.3;
S4.3, after the current round of iteration is finished, counting the percentage of nodes whose label names are unchanged and comparing it with the preset threshold P; if the percentage is greater than P, or the current iteration number k has reached the maximum iteration number K, stopping the iteration and proceeding to step S5; otherwise, setting k = k + 1 and returning to step S4.2 for the next round of iteration;
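The iteration control of steps S4.1 to S4.3 can be sketched as below. To keep the example focused on the stopping rule, the per-node update is passed in as a callable; the names `run_propagation` and `update_node`, the default value of K, and the fixed random seed are assumptions for this example.

```python
import random


def run_propagation(nodes, labels, update_node, P=0.95, K=100, seed=0):
    """Repeat update rounds until enough nodes are stable or K is reached.

    labels:      node -> {label_name: membership}; updated in place
    update_node: callable (node, labels) -> new label dict, or None to
                 keep the node's current labels
    Returns (labels, number of rounds executed).
    """
    nodes = list(nodes)
    if not nodes:
        return labels, 0
    rng = random.Random(seed)
    k = 1                                   # S4.1: initial iteration number
    while True:
        order = nodes[:]
        rng.shuffle(order)                  # S4.2: random traversal
        unchanged = 0
        for v in order:
            new = update_node(v, labels)
            # Only label names are compared, not membership values.
            if new is None or set(new) == set(labels[v]):
                unchanged += 1
            if new is not None:
                labels[v] = new
        # S4.3: stop if the stable share exceeds P or k has reached K.
        if unchanged / len(nodes) > P or k >= K:
            return labels, k
        k += 1
```

The cap K guarantees termination even on graphs where the labels keep oscillating and the stable share never exceeds P.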
S5, obtaining the community structure of the social network;
dividing the nodes with the same label into the same community, thereby obtaining the community structure of the social network.
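Step S5 can be sketched as a simple grouping by label name. The text says only that nodes with the same label fall into the same community; treating a node that holds two labels as a member of both communities is an assumption of this sketch, as is the function name `communities_from_labels`.

```python
from collections import defaultdict


def communities_from_labels(labels):
    """Group nodes into communities by the label names they hold.

    labels: node -> {label_name: membership}
    Returns {label_name: set of nodes holding that label}.
    ASSUMPTION: a node with two labels is placed in both communities.
    """
    groups = defaultdict(set)
    for node, node_labels in labels.items():
        for name in node_labels:
            groups[name].add(node)
    return dict(groups)


communities = communities_from_labels(
    {1: {"x": 0.6, "y": 0.4}, 2: {"x": 1.0}, 3: {"y": 0.7, "z": 0.3}}
)
# communities == {"x": {1, 2}, "y": {1, 3}, "z": {3}}
```

If a strict partition is wanted instead, one option would be to keep only the label with the larger membership for each node before grouping.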
In summary, when node dual labels are propagated in a social network, the label propagation process is divided into three stages: initializing labels, propagating labels, and updating labels. First, a social network graph is constructed; the labels and membership degrees of each node are then initialized according to the social network graph, with two labels assigned to each node, each with a membership degree of 0.5; the weight of each node in the social network is calculated; the node dual labels are updated according to the node dual-label propagation rule and the dual-label selection rule; the dual-label update step is executed iteratively until a preset dual-label propagation end condition is met; and the community structure of the social network is obtained from the node dual labels. By introducing dual labels and label membership degrees and by calculating node weights, the method improves the stability and accuracy of the iteration results of label propagation. The method can be applied to fields such as target group mining and precision marketing.
Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be understood that the invention is not limited to the scope of these embodiments. To those skilled in the art, various changes are apparent as long as they remain within the spirit and scope of the invention as defined by the appended claims, and everything that makes use of the inventive concept is protected.
Claims (2)
1. A community structure division method based on a social network node dual-label propagation algorithm is characterized by comprising the following steps:
(1) constructing a social network graph;
reading social network data, and constructing a social network graph with social network users as nodes and relationships among the users as edges;
(2) initializing and assigning node labels and membership degrees;
(3) Calculating the weight of each node by using a K-shell algorithm;
(4) realizing the dual-label propagation of the social network nodes in an iterative updating mode;
(4.1) setting the initial iteration number k = 1 and the maximum iteration number K; setting a threshold P for the percentage of nodes whose labels remain unchanged during a propagation round;
(4.2) randomly traversing all nodes in the social network graph, and updating the node double labels according to the node double label propagation rule, wherein the specific updating process is as follows:
(4.2.1) randomly selecting a node $V_i$ that has not yet been updated;
(4.2.2) applying the dual-label propagation rule to all neighbor nodes of node $V_i$;
computing the strength of each label that node $V_i$ receives from its neighbor nodes:
$$S_L(V_i) = \sum_{U \in N_L(V_i)} K_U \cdot M_L(U)$$
wherein $S_L(V_i)$ represents the total strength that node $V_i$ receives for label name L from all of its neighbor nodes, $N_L(V_i)$ represents the set of neighbor nodes connected to node $V_i$ that hold label name L, $K_U$ represents the weight value of neighbor node U, and $M_L(U)$ represents the membership value of label name L in neighbor node U;
(4.2.3) updating the labels of node $V_i$;
(4.2.4) performing the dual-label update on all nodes in the social network graph that have not yet been updated, according to the method of steps (4.2.1) to (4.2.3);
(4.3) after the current round of iteration is finished, counting the percentage of nodes whose label names are unchanged and comparing it with the preset threshold P; if the percentage is greater than P, or the current iteration number k has reached the maximum iteration number K, stopping the iteration and proceeding to step (5); otherwise, setting k = k + 1 and returning to step (4.2) for the next round of iteration;
(5) obtaining the community structure of the social network;
dividing the nodes with the same label into the same community, thereby obtaining the community structure of the social network.
2. The community structure division method based on a social network node dual-label propagation algorithm according to claim 1, wherein the specific process of updating the labels of node $V_i$ is as follows:
from all the labels received by node $V_i$, selecting the two labels with the greatest strength as the new labels of node $V_i$ and updating the node accordingly;
taking the strength values of the two selected labels, the membership values of the two new labels of node $V_i$ are then updated from these two strengths (the update formula is not reproduced in this text).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911293324.8A CN110956553A (en) | 2019-12-16 | 2019-12-16 | Community structure division method based on social network node dual-label propagation algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911293324.8A CN110956553A (en) | 2019-12-16 | 2019-12-16 | Community structure division method based on social network node dual-label propagation algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110956553A (en) | 2020-04-03 |
Family
ID=69981892
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911293324.8A Pending CN110956553A (en) | 2019-12-16 | 2019-12-16 | Community structure division method based on social network node dual-label propagation algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110956553A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113177854A (en) * | 2021-04-23 | 2021-07-27 | 携程计算机技术(上海)有限公司 | Community division method and system, electronic device and storage medium |
CN114615090A (en) * | 2022-05-10 | 2022-06-10 | 富算科技(上海)有限公司 | Data processing method, system, device and medium based on cross-domain label propagation |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140003425A1 (en) * | 2012-06-29 | 2014-01-02 | Futurewei Technologies, Inc. | Implementing a Multicast Virtual Private Network by Using Multicast Resource Reservation Protocol-Traffic Engineering |
CN103729475A (en) * | 2014-01-24 | 2014-04-16 | 福州大学 | Multi-label propagation discovery method of overlapping communities in social network |
CN106022371A (en) * | 2016-05-18 | 2016-10-12 | 电子科技大学 | Community discovery algorithm based on APR algorithm and MAP algorithm |
CN106789588A (en) * | 2016-12-30 | 2017-05-31 | 东软集团股份有限公司 | Label transmission method and device |
CN109255054A (en) * | 2017-07-14 | 2019-01-22 | 元素征信有限责任公司 | A kind of community discovery algorithm in enterprise's map based on relationship weight |
-
2019
- 2019-12-16 CN CN201911293324.8A patent/CN110956553A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103678671B (en) | A kind of dynamic community detection method in social networks | |
CN108492201B (en) | Social network influence maximization method based on community structure | |
CN104598605B (en) | A kind of user force appraisal procedure in social networks | |
US10157239B2 (en) | Finding common neighbors between two nodes in a graph | |
US7685141B2 (en) | Connection sub-graphs in entity relationship graphs | |
CN103678669A (en) | Evaluating system and method for community influence in social network | |
Balaprakash et al. | Estimation-based ant colony optimization and local search for the probabilistic traveling salesman problem | |
CN107391542A (en) | A kind of open source software community expert recommendation method based on document knowledge collection of illustrative plates | |
CN110719106B (en) | Social network graph compression method and system based on node classification and sorting | |
US9361403B2 (en) | Efficiently counting triangles in a graph | |
CN110956553A (en) | Community structure division method based on social network node dual-label propagation algorithm | |
CN110838072A (en) | Social network influence maximization method and system based on community discovery | |
CN104484365B (en) | In a kind of multi-source heterogeneous online community network between network principal social relationships Forecasting Methodology and system | |
Petersen et al. | Tree shift topological entropy | |
CN104700311B (en) | A kind of neighborhood in community network follows community discovery method | |
US20210357764A1 (en) | Reducing errors introduced by model updates | |
CN112085293A (en) | Method and device for training interactive prediction model and predicting interactive object | |
CN110659394A (en) | Recommendation method based on two-way proximity | |
CN112966054A (en) | Enterprise graph node relation-based ethnic group division method and computer equipment | |
Hasan et al. | Graphettes: Constant-time determination of graphlet and orbit identity including (possibly disconnected) graphlets up to size 8 | |
US20230289618A1 (en) | Performing knowledge graph embedding using a prediction model | |
CN109120431A (en) | The method, apparatus and terminal device that propagating source selects in complex network | |
CN114547439A (en) | Service optimization method based on big data and artificial intelligence and electronic commerce AI system | |
CN114417177A (en) | Label propagation overlapping community discovery method based on node comprehensive influence | |
CN107291860B (en) | Seed user determination method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20200403 |