CN108492201B - Social network influence maximization method based on community structure - Google Patents

Info

Publication number
CN108492201B
Authority
CN
China
Prior art keywords
node
community
influence
network
communities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810269184.XA
Other languages
Chinese (zh)
Other versions
CN108492201A (en)
Inventor
仇丽青
于金凤
范鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University of Science and Technology
Original Assignee
Shandong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University of Science and Technology filed Critical Shandong University of Science and Technology
Priority to CN201810269184.XA priority Critical patent/CN108492201B/en
Publication of CN108492201A publication Critical patent/CN108492201A/en
Application granted granted Critical
Publication of CN108492201B publication Critical patent/CN108492201B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a social network influence maximization method based on community structure. The method proceeds as follows: (1) dividing communities to form a candidate node set: the network is divided into communities, the core nodes and boundary nodes in the network are identified, and these nodes form the candidate node set; (2) selecting nodes heuristically: for each node in the candidate node set, the potential influence is evaluated from the degree of the node, the size of its community, the number of communities it is connected to, and its influence weights, and the nodes with the largest potential influence are heuristically selected to join the seed set; (3) executing a greedy algorithm: the nodes with the largest marginal gain are selected by the greedy algorithm to join the seed set. By analyzing the role of community structure in influence propagation, the method further improves the accuracy and running efficiency of initial seed node mining and effectively addresses the social network influence maximization problem.

Description

Social network influence maximization method based on community structure
Technical Field
The invention relates to the field of social networks, in particular to a social network influence maximization method based on a community structure.
Background
In recent years, with the rise of social networks, more and more social platforms such as Facebook, Twitter and Google+ have attracted wide attention. These platforms act as carriers for social networks, allowing all kinds of information to propagate across them. How to spread information as widely as possible through these social platforms, so that more users accept it, is called the "influence maximization problem". Influence maximization is a hot topic in social network research and has great application value in fields such as marketing, disease propagation and rumor control.
The social network influence maximization problem is how to select the top-k seed nodes for propagation so as to maximize the final propagation range. The problem has been proved to be NP-hard, so there are currently two main types of solutions: greedy algorithms, which achieve a better influence range, and heuristic algorithms, which are more efficient. Because greedy algorithms need long running times and the results of heuristic algorithms are unstable, hybrid algorithms that combine the two are currently a better way to solve the influence maximization problem; such algorithms mainly apply a heuristic algorithm in the first stage and a greedy algorithm in the second stage. These influence maximization algorithms are generally based on two influence propagation models, the linear threshold (LT) model and the independent cascade (IC) model. The independent cascade model is the less stable of the two and requires a large number of simulations. The linear threshold model has advantages that the independent cascade model lacks: its "cumulative characteristic" enables a node to activate a large number of nodes in the subsequent activation process. The specific rule is as follows: each node is initially either active or inactive, and each inactive node is assigned a threshold representing how easily the node can be influenced; a node is activated only if the sum of the influences of all its active neighbors is greater than or equal to its threshold. Each active node can participate in the activation process multiple times, so when an attempt to activate an inactive node fails, the influence of its neighbors continues to accumulate and the likelihood of activation increases.
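As an illustrative, non-limiting sketch, the linear threshold rule described above can be written in Python as follows. The function name `linear_threshold`, the adjacency-dictionary representation and the single shared threshold `theta` are assumptions made for illustration and are not taken from the patent.

```python
def linear_threshold(neighbors, weights, seeds, theta=0.5):
    """Run linear-threshold diffusion until no further node activates.

    neighbors: dict node -> iterable of neighbor nodes
    weights:   dict (v, w) -> influence weight of v on w
    seeds:     iterable of initially active nodes
    theta:     activation threshold shared by all nodes (illustrative assumption)
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for w in neighbors:
            if w in active:
                continue
            # Accumulate the influence of every currently active neighbor of w.
            total = sum(weights.get((v, w), 0.0) for v in neighbors[w] if v in active)
            if total >= theta:
                active.add(w)
                changed = True
    return active
```

Because activation is monotone (an active node never becomes inactive), repeating the sweep until nothing changes reaches the same final active set regardless of the order in which nodes are visited.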
Generally, there are two main indicators for evaluating solutions to the influence maximization problem: the influence range and the execution efficiency. However, most current work does not consider the actual structure of networks. Every network has a community structure, that is, connections within a community are dense and connections between communities are sparse. Analyzing this structure can further improve both the influence range and the execution efficiency: core nodes within a community spread information through the community as quickly as possible, while boundary nodes between communities enlarge the range of information propagation. Identifying these two types of nodes by dividing communities improves execution efficiency and accuracy. Therefore, to cope with the large network scale encountered in influence maximization, the Louvain algorithm is used to divide communities; it is a fast algorithm with high accuracy and can be applied to large-scale networks.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the technical shortcomings of existing influence maximization algorithms, a social network influence maximization method based on community structure is provided, which further improves the influence range of the seed nodes while preserving algorithm efficiency by analyzing the role of community structure in actual information propagation.
The technical scheme of the invention is as follows:
a social network influence maximization method based on community structure, the method comprising the steps of:
(1) constructing a social network graph: G = (V, E), where G represents the social network, V represents the set of nodes, and E represents the set of edges of the network;
(2) dividing communities to generate a candidate node set: first, the Louvain algorithm proposed by Blondel et al. is used to divide the input network G into M communities, namely C = (C_1, C_2, ..., C_M); second, the boundary node set S_boundary and the core node set S_core of each community are found, and their union forms the candidate node set CS, where S_core consists of the top 10% of the nodes in each community by degree centrality;
(3) heuristically selecting nodes: heuristically selecting, from the candidate node set formed in step (2), the
Figure BDA0001612118660000032
nodes with the largest potential influence, adding them to the seed node set S, and executing the activation process of the seed node set S using the linear threshold model to generate an initial active node set A, where k represents the target size of the seed node set S and c represents the heuristic factor;
(4) executing a greedy algorithm: continuing to select, from the candidate node set formed in step (2) and using a greedy algorithm, the
Figure BDA0001612118660000033
nodes with the largest marginal gain, adding them to the seed set S, and at the same time performing activation with the linear threshold model, the newly activated nodes being added to the active node set A.
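As a non-limiting illustration of step (2), the candidate node set can be assembled from a given community partition as sketched below. The function name `candidate_set`, the adjacency-dictionary input, the rule of keeping at least one core node per community, and the rounding of the 10% quota are assumptions made for this sketch; the partition itself is assumed to come from a separate Louvain run.

```python
def candidate_set(adj, communities, core_fraction=0.10):
    """Return (candidates, core, boundary) for a partitioned graph.

    adj:         dict node -> set of neighbor nodes
    communities: list of sets of nodes forming a partition of the graph
    """
    community_of = {v: i for i, members in enumerate(communities) for v in members}
    core, boundary = set(), set()
    for i, members in enumerate(communities):
        # Core nodes: the highest-degree nodes, about 10% of the community
        # (at least one node per community; the rounding rule is an assumption).
        n_core = max(1, round(core_fraction * len(members)))
        ranked = sorted(members, key=lambda v: (-len(adj[v]), str(v)))
        core.update(ranked[:n_core])
        # Boundary nodes: nodes with at least one neighbor in another community.
        boundary.update(
            v for v in members if any(community_of[w] != i for w in adj[v])
        )
    return core | boundary, core, boundary
```

Restricting the later selection stages to this union is what shrinks the seed search space relative to scanning every node in V.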
Further, the specific operation steps of the Louvain algorithm in the step (2) are as follows:
(a) merging communities: taking each node in the network as a community, then determining which neighbor communities are combined based on the modularity gain maximization standard, and repeating the process until the modularity gain is not increased any more, wherein the modularity gain is defined as follows:
Figure BDA0001612118660000031
where Σ_in represents the sum of the weights of all edges inside community C, Σ_tot represents the sum of the weights of all edges incident to community C, k_i,in represents the sum of the weights of all edges from node i to community C, and m represents the total number of edges of the network;
(b) constructing a new network: taking the new community obtained in the step (a) as a new node, constructing a new network and repeatedly executing the step (a);
(c) the two stages are repeated until the modularity gain is not changed any more;
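For reference, the modularity gain obtained by moving an isolated node i into community C, as published by Blondel et al., has the form below. It is given here as an assumption about what the formula figure in step (a) contains. The symbol k_i, the sum of the weights of all edges incident to node i, is not among the definitions given in step (a) and is taken from the published formulation. Some presentations write 2k_{i,in} in the first term, depending on whether Σ_in counts each internal edge once or twice.

```latex
\Delta Q =
  \left[ \frac{\Sigma_{in} + k_{i,in}}{2m}
         - \left( \frac{\Sigma_{tot} + k_i}{2m} \right)^{2} \right]
  - \left[ \frac{\Sigma_{in}}{2m}
         - \left( \frac{\Sigma_{tot}}{2m} \right)^{2}
         - \left( \frac{k_i}{2m} \right)^{2} \right]
```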
further, in step (3) the nodes with the largest potential influence are selected to join the seed node set, and the potential influence of each node is calculated as follows:
(a) for any node v in the graph G, firstly, judging the community attribute of each node, namely a core node or a boundary node, and respectively calculating the community influence of each node based on the community structure;
(b) for a core node, the degree of the node and the size of the community in which the node is located are combined to evaluate its community influence, and the calculation formula is as follows:
CI(v) = (C_D(v) + C_S(v)) / 2
where C_D(v) represents the degree of the node and C_S(v) represents the size of the community in which the node is located;
(c) for a boundary node, the degree of the node, the number of communities directly connected to the node, and the mean size of the node's neighbor communities are combined to evaluate its community influence, and the calculation formula is as follows:
CI(v) = (C_D(v) + C_N(v) + AvgNS(v)) / 3
where C_D(v) represents the degree of the node, C_N(v) represents the number of communities to which the node is directly connected, and AvgNS(v) represents the mean size of the neighbor communities of the node, calculated by the following formula:
Figure BDA0001612118660000041
where |C_i(w)| represents the size of the community in which the neighbor w of node v is located;
(d) in order to make the contribution of each index to the community influence consistent, the indices are normalized, and combining steps (b) and (c), the community influence of each node v in the network is defined as follows:
Figure BDA0001612118660000042
where each index is the result after normalization;
(e) in addition to the community influence obtained in step (d), each node v has a direct influence weight b_vw on its neighbor node w; combining the two, the potential influence of each node v in the network is calculated as follows:
Figure BDA0001612118660000043
where w ∈ neighbor(v),
Figure BDA0001612118660000044
indicating that node w is an inactive neighbor of node v.
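As a non-limiting illustration of steps (b) and (c), the unnormalized community influence can be sketched in Python as follows. Several points are assumptions made for this sketch: the two formulas are read as averages of their terms, AvgNS(v) is taken as the mean size of the distinct communities (other than the node's own) that v is directly linked to, a node that is both a core node and a boundary node is treated as a core node, and the normalization of step (d) and the combination with b_vw in step (e) are omitted because their formulas are given only as figures.

```python
def community_influence(v, adj, community_of, community_size, core_nodes):
    """Unnormalized community influence CI(v), following steps (b) and (c)."""
    degree = len(adj[v])  # C_D(v): degree of node v
    own = community_of[v]
    if v in core_nodes:
        # Core node: average of the degree and the size of the node's own community.
        return (degree + community_size[own]) / 2
    # Boundary node: communities, other than its own, that v is directly linked to.
    linked = {community_of[w] for w in adj[v]} - {own}
    c_n = len(linked)  # C_N(v)
    # AvgNS(v): mean size of those neighboring communities (assumed definition).
    avg_ns = sum(community_size[c] for c in linked) / c_n if c_n else 0.0
    return (degree + c_n + avg_ns) / 3
```

For a star-shaped community of four nodes whose leaf 4 is linked to a second community of three nodes, the hub (degree 3) scores (3 + 4) / 2 and the leaf (degree 2) scores (2 + 1 + 3) / 3 under these assumptions.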
Further, the specific operation steps of the greedy algorithm in the step (4) are as follows:
(a) initialization: initializing a seed node set S;
(b) calculating the marginal gain of each node v: this is the increase in final influence obtained by adding node v to the seed set S, and the calculation formula is as follows:
σ(S ∪ {v}) − σ(S)
where σ(·) represents the influence function;
(c) selecting a seed node: the node with the largest marginal gain is selected and added to the seed set S, and the influence of each node is updated;
(d) repeating step (c) until the target of k seed nodes has been selected.
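As a non-limiting illustration, steps (a) to (d) can be sketched in Python as follows. The function names `spread` and `greedy_seeds`, the adjacency-dictionary input and the shared fixed threshold `theta` are assumptions made for this sketch; with a fixed threshold the linear threshold diffusion is deterministic, so σ(S) is computed by a single simulation.

```python
def spread(adj, weights, seeds, theta=0.5):
    """sigma(S): number of nodes active after linear-threshold diffusion."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for w in adj:
            if w not in active and sum(
                weights.get((v, w), 0.0) for v in adj[w] if v in active
            ) >= theta:
                active.add(w)
                changed = True
    return len(active)


def greedy_seeds(adj, weights, candidates, k, seeds=()):
    """Add the candidate with the largest marginal gain until |S| reaches k."""
    chosen = list(seeds)
    while len(chosen) < k:
        remaining = [v for v in candidates if v not in chosen]
        if not remaining:
            break
        base = spread(adj, weights, chosen)
        # Marginal gain of v: sigma(S ∪ {v}) - sigma(S); ties keep the first candidate.
        best = max(remaining, key=lambda v: spread(adj, weights, chosen + [v]) - base)
        chosen.append(best)
    return chosen
```

Passing the seeds chosen in the heuristic stage through the `seeds` argument lets the greedy stage continue from the existing seed set, as step (4) describes.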
The beneficial effects of the invention are as follows: the social network influence maximization method based on community structure identifies the core nodes and boundary nodes of communities from the community structure of the network to form a candidate node set; then, relying on the cumulative characteristic of the linear threshold model, it combines the community influence and the influence weights of each node to heuristically select the seed nodes with the largest potential influence; finally, a greedy algorithm is applied to select the remaining seed nodes. The method further improves the accuracy and running efficiency of initial seed node mining and effectively addresses the social network influence maximization problem.
Drawings
The invention is further illustrated with reference to the following figures and examples.
FIG. 1 is a flow chart of a social network influence maximization method based on community structure according to the present invention;
FIG. 2 is a graph of the influence range for different values of the heuristic factor c of the present invention when the seed node size is 50;
FIG. 3 is a graph of the running time for different values of the heuristic factor c of the present invention when the seed node size is 50;
FIG. 4 is a graph comparing the influence range of the present invention with existing algorithms on the HepTh social network;
FIG. 5 is a graph comparing the influence range of the present invention with existing algorithms on the BrightKite social network;
FIG. 6 is a graph comparing the influence range of the present invention with existing algorithms on the Epinions social network;
FIG. 7 is a graph comparing the influence range of the present invention with existing algorithms on the Amazon social network;
FIG. 8 is a graph comparing the running time of the present invention with existing algorithms on the four social networks.
Detailed Description
The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic views illustrating only the basic structure of the present invention in a schematic manner, and thus show only the constitution related to the present invention.
Fig. 1 shows a method for maximizing social network influence based on a community structure according to the present invention, which comprises the following specific steps:
Step 1, constructing a social network graph: G = (V, E),
where G represents a social network, V represents a set of nodes, and E represents a set of edges of the network.
Step 2, dividing communities to generate a candidate node set.
First, the Louvain algorithm proposed in the literature is used to divide the input network G into M communities, namely C = (C_1, C_2, ..., C_M); second, the boundary node set S_boundary and the core node set S_core of each community are found, and their union forms the candidate node set CS, where S_core consists of the top 10% of the nodes in each community by degree centrality. The Louvain algorithm in this step comprises the following specific steps:
(2a) merging communities: taking each node in the network as a community, then determining which neighbor communities are combined based on the modularity gain maximization standard, and repeating the process until the modularity gain is not increased any more, wherein the modularity gain is defined as follows:
Figure BDA0001612118660000061
where Σ_in represents the sum of the weights of all edges inside community C, Σ_tot represents the sum of the weights of all edges incident to community C, k_i,in represents the sum of the weights of all edges from node i to community C, and m represents the total number of edges of the network.
(2b) Constructing a new network: the new communities obtained in step (2a) are taken as new nodes, a new network is constructed, and step (2a) is executed again.
(2c) These two phases are repeated until the modularity gain no longer changes.
Step 3, heuristically selecting nodes.
From the candidate node set formed in step 2, heuristically select the
Figure BDA0001612118660000072
nodes with the largest potential influence and add them to the seed node set S, then execute the activation process of the seed node set S using the linear threshold model to generate an initial active node set A, where k represents the target size of the seed node set S and c represents the heuristic factor. The potential influence in this step is calculated as follows:
(3a) for any node v in the graph G, firstly, the community attribute of each node, namely a core node or a boundary node, is judged, and the community influence of each node is calculated respectively based on the community structure.
(3b) For a core node, the degree of the node and the size of the community in which the node is located are combined to evaluate its community influence, and the calculation formula is as follows:
CI(v) = (C_D(v) + C_S(v)) / 2
where C_D(v) represents the degree of the node and C_S(v) represents the size of the community in which the node is located.
(3c) For a boundary node, the degree of the node, the number of communities directly connected to the node, and the mean size of the node's neighbor communities are combined to evaluate its community influence, and the calculation formula is as follows:
CI(v) = (C_D(v) + C_N(v) + AvgNS(v)) / 3
where C_D(v) represents the degree of the node, C_N(v) represents the number of communities to which the node is directly connected, and AvgNS(v) represents the mean size of the neighbor communities of the node, calculated by the following formula:
Figure BDA0001612118660000071
where |C_i(w)| represents the size of the community in which the neighbor w of node v is located.
(3d) In order to make the contribution of each index to the community influence consistent, the indices are normalized, and combining steps (3b) and (3c), the community influence of each node v in the network is defined as follows:
Figure BDA0001612118660000081
where each index is the result after normalization.
(3e) In addition to the community influence obtained in step (3d), each node v has a direct influence weight b_vw on its neighbor node w; combining the two, the potential influence of each node v in the network is calculated as follows:
Figure BDA0001612118660000082
where w ∈ neighbor(v),
Figure BDA0001612118660000083
indicating that node w is an inactive neighbor of node v.
Step 4, executing a greedy algorithm.
Using a greedy algorithm, continue to select from the candidate node set formed in step 2 the
Figure BDA0001612118660000084
nodes with the largest marginal gain and add them to the seed set S, while performing activation with the linear threshold model and adding the newly activated nodes to the active node set A. The greedy algorithm in this step comprises the following specific steps:
(4a) Initialization: the seed node set S is initialized.
(4b) Calculating the marginal gain of each node v: this is the increase in final influence obtained by adding node v to the seed set S, and the calculation formula is as follows:
σ(S ∪ {v}) − σ(S)
where σ(·) represents the influence function.
(4c) Selecting a seed node: the node with the largest marginal gain is selected and added to the seed set S, and the influence of each node is updated.
(4d) Repeating step (4c) until the target of k seed nodes has been selected.
Example:
First, data sets and experimental setup
In this example, four public data sets of different sizes from SNAP are used: the HepTh data set, the BrightKite data set, the Epinions data set and the Amazon data set. The HepTh data set comes from a collaboration network of high-energy physics theory researchers and is an undirected graph. The BrightKite data set is a location-based social network and is an undirected graph. The Epinions data set comes from a trust network; it is a directed graph whose links are formed when members of the Epinions website choose to trust the reviews of certain other members. The Amazon data set comes from the Amazon shopping website; if two products on the website are often purchased together, a link is formed between them, giving a directed graph. The static structural statistics of these four data sets are shown in Table 1.
Table 1: statistical analysis of static structural features of experimental data
Figure BDA0001612118660000091
The threshold of the linear threshold model used in the present invention is usually assigned a value in [0, 1]. To make the results more definite, the present invention uses the classical threshold θ = 0.5 proposed by Kempe et al. The influence weight of the linear threshold model is often set to b_vw = 1/C_D(v), which means that node v contributes equally to each of its neighbors; this does not match real-world situations, so the influence weight is instead set to
Figure BDA0001612118660000092
where C_D(v) represents the degree of node v and N(v) represents the neighbor set of node v.
All simulation experiments in the following examples compare the PHG (Partition-Heuristic-Greedy) method of the present invention with the hybrid HPG method, the Greedy method, the PageRank method, the Degree method and the Random method.
Second, community division performance
To identify key nodes more accurately, the accuracy of community division is particularly important, and for a large-scale social network the running time must also be considered. Taking both into account, the invention selects the Louvain algorithm to divide communities; the algorithm can be applied to large-scale social networks and has high accuracy. The division results are shown in Table 2.
Table 2: results of community discovery
Figure BDA0001612118660000101
Table 2 contains two parameters that characterize the community division: the modularity Q and the parameter u. The modularity Q measures the quality of a network division by comparing the connection density of the actual network with that of a reference network under the same community division; the higher the modularity, the better the division. The modularity values of the four networks in Table 2 range from 0.76 to 0.91, indicating that the Louvain algorithm divides communities with high accuracy. The parameter u (= S_min/S_max) represents the average probability that a node in the network is not in the same community as its neighbor nodes, and it directly determines the strength of the community structure: the smaller the parameter, the stronger the community structure. The parameter values of the four data sets in Table 2 are all below 0.01, which indicates that the communities obtained for the four networks all have strong community structure, which in turn helps to identify key nodes in the communities. From Table 2 and the related analysis, it can be seen that the Louvain algorithm is well suited to large-scale social networks.
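As a non-limiting illustration, the modularity Q of a given division can be computed as sketched below. The patent does not state the formula for Q; the sketch assumes the standard Newman definition for an undirected, unweighted graph, Q = Σ_c [L_c/m − (d_c/(2m))²], where L_c is the number of edges inside community c, d_c is the sum of the degrees of its nodes, and m is the total number of edges. The function name `modularity` and the adjacency-dictionary input are likewise assumptions.

```python
def modularity(adj, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # total number of edges
    q = 0.0
    for members in communities:
        # Each internal edge is seen from both endpoints, hence the division by 2.
        internal = sum(1 for v in members for w in adj[v] if w in members) / 2
        degree_sum = sum(len(adj[v]) for v in members)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q
```

For two triangles joined by a single edge and divided into the two triangles, this gives Q = 5/14, whereas placing all nodes in one community gives Q = 0.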
Third, selection of the heuristic factor c
An appropriate heuristic factor c is selected for each data set by jointly considering the influence spread and the running time. Fig. 2 and Fig. 3 show, respectively, how the influence range and the running time vary with the heuristic factor at a seed node size of 50. Fig. 2 shows that as the heuristic factor increases, the influence range gradually decreases on most data sets, while on the Epinions data set it changes little. Similarly, Fig. 3 shows that the running time gradually shortens as the heuristic factor increases. This is mainly because heuristic algorithms are less effective but more efficient, whereas greedy algorithms are more effective but less efficient. Therefore, to obtain a reasonable running time together with a relatively large influence range, the heuristic factors of the Amazon and BrightKite data sets are both set to 0.4. For the Epinions data set, whose influence range varies little, the heuristic factor is set to 1 on the basis of running time. The HepTh data set is small and its running times differ little, so its heuristic factor is set to 0.2 on the basis of influence range.
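As a non-limiting illustration of how the heuristic factor divides the seed budget, the sketch below assumes that the heuristic stage selects a share c of the k seeds and the greedy stage selects the rest. The exact expressions appear only as formula figures in the text, so the function name `split_budget` and the rounding rule are assumptions.

```python
def split_budget(k, c):
    """Split k seeds into (heuristic count, greedy count) for heuristic factor c."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must lie in [0, 1]")
    n_heuristic = round(c * k)  # rounding rule is an assumption
    return n_heuristic, k - n_heuristic
```

Under this assumption, k = 50 with c = 0.4 gives 20 heuristic seeds and 30 greedy seeds, and c = 1 leaves no seeds for the greedy stage, which is consistent with the shorter running times reported for larger heuristic factors.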
Fourth, range of influence
Fig. 4 to Fig. 7 compare the influence range of the PHG method of the present invention with that of five other algorithms (HPG, Greedy, PageRank, Degree and Random) on the HepTh, BrightKite, Epinions and Amazon data sets, respectively, at seed node sizes from 1 to 50. The four figures show that the Random algorithm performs worst, mainly because it takes no factors into account, whereas the other algorithms take various factors into account to differing degrees. The heuristic algorithms PageRank and Degree perform better than Random but much worse than the remaining algorithms Greedy, HPG and PHG. When the seed set is small, the high-influence algorithms Greedy and HPG have an influence range similar to that of PHG, but PHG performs increasingly better as the seed set grows. For example, at a seed node size of 50 on the Amazon data set, the influence range of PHG is 10.7% and 35.5% higher than that of the Greedy and HPG algorithms, respectively. These results indicate that community structure information plays an important role in information dissemination, so the invention can effectively identify influential nodes.
Fifth, running time
FIG. 8 compares the running time of the present invention with that of the five other algorithms (HPG, Greedy, PageRank, Degree and Random) at a seed node size of 50. The figure shows that the running times of the Random, Degree and PageRank algorithms are relatively short, but these algorithms are unstable and do not spread influence well. Among the remaining algorithms, PHG is more efficient than Greedy and HPG, mainly because the invention forms its candidate node set by dividing communities and searching for the boundary nodes and core nodes, which reduces the seed search space and improves running efficiency.
In conclusion, by analyzing the effect of the community structure in influence propagation, the invention utilizes the community structure information to identify the key nodes in the social network, so that the invention not only improves the influence range, but also further optimizes the operation efficiency, and has excellent effect.
In light of the foregoing description of the preferred embodiment of the present invention, many modifications and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The technical scope of the present invention is not limited to the content of the specification, and must be determined according to the scope of the claims.

Claims (3)

1. A social network influence maximization method based on community structure, the method comprising the steps of:
(1) obtaining product purchase data from a purchasing website, wherein a link is formed between two products on the website if they are frequently purchased together, thereby forming a directed graph, and constructing a social network graph: G = (V, E), where G represents the social network, V represents the set of nodes, and E represents the set of edges of the network;
(2) dividing communities to generate a candidate node set: first, the Louvain algorithm is used to divide the input network G into M communities, namely C = (C_1, C_2, ..., C_M); second, the boundary node set S_boundary and the core node set S_core of each community are found, and their union forms the candidate node set CS, where S_core consists of the top 10% of the nodes in each community by degree centrality;
(3) heuristically selecting nodes: heuristically selecting, from the candidate node set formed in step (2), the
Figure 778576DEST_PATH_IMAGE002
nodes with the largest potential influence, adding them to the seed node set S, and executing the activation process of the seed node set S using the linear threshold model to generate an initial active node set A, where k represents the target size of the seed node set S and c represents the heuristic factor;
(4) executing a greedy algorithm: continuing to select, from the candidate node set formed in step (3) and using a greedy algorithm, the
Figure 139457DEST_PATH_IMAGE004
nodes with the largest marginal gain, adding them to the seed set S, and at the same time performing activation with the linear threshold model, the newly activated nodes being added to the active node set A;
wherein, in step (3), the nodes with the largest potential influence are selected to join the seed node set, and the potential influence of each node is calculated as follows:
(a) for any node v in the graph G, firstly, judging the community attribute of each node, namely a core node or a boundary node, and respectively calculating the community influence of each node based on the community structure;
(b) for the core node, the degree of the node and the number of communities in which the node is located are integrated to evaluate the community influence, and the calculation formula is as follows:
CI(v)=CD(v)+CS(v)/2
wherein, CD(v) Degree of representation of the community, CS(v) Representing the size of the community in which the node is positioned;
(c) for a boundary node, the community influence is evaluated by combining the degree of the node, the number of communities directly connected to the node, and the mean size of the node's neighbor communities, calculated as follows:
$CI(v) = \frac{C_D(v) + C_N(v) + AvgN_S(v)}{3}$
wherein $C_D(v)$ denotes the degree of the node, $C_N(v)$ denotes the number of communities to which the node is directly connected, and $AvgN_S(v)$ denotes the mean size of the node's neighbor communities, calculated as follows:
[formula image: definition of $AvgN_S(v)$]
wherein $|C_i(w)|$ denotes the size of the community in which the neighbor w of node v is located;
(d) so that each index contributes consistently to the community influence, the indices are normalized, and, combining steps (b) and (c), the community influence of each node v in the network is defined as follows:
$$CI(v) = \begin{cases} \dfrac{C_D(v) + C_S(v)}{2}, & v \in S_{core} \\ \dfrac{C_D(v) + C_N(v) + AvgN_S(v)}{3}, & v \in S_{boundary} \end{cases}$$
wherein each index is the normalized result;
(e) in addition to the community influence obtained in step (d), each node v has a direct influence weight $b_{vw}$ on its neighbor node w; combining the two, the potential influence of each node v in the network is calculated as follows:
[formula image: potential influence of node v]
wherein $w \in neighbor(v)$, and [formula image] indicates that node w is an inactive neighbor of node v.
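The community influence of steps (b) to (d) can be sketched in Python as follows. This is a hedged illustration, not the claimed implementation: it reads the two formulas as averages of the normalized indices, and several details are assumptions because the claim does not fix them: each index is normalized by dividing by its maximum over all nodes, "neighbor communities" excludes the node's own community, and a node that is both core and boundary is scored with the boundary formula. The function and variable names are hypothetical. The potential influence of step (e) is not computed, since its formula is not recoverable here.

```python
def community_influence(adj, community_of, core, boundary):
    """Return {node: CI(node)} for every core or boundary node."""
    # Size of every community.
    size = {}
    for cid in community_of.values():
        size[cid] = size.get(cid, 0) + 1

    degree = {v: len(adj[v]) for v in adj}
    own_size = {v: size[community_of[v]] for v in adj}
    # Assumption: neighbour communities exclude the node's own community.
    nbr_comms = {
        v: {community_of[w] for w in adj[v]} - {community_of[v]} for v in adj
    }
    n_comms = {v: len(nbr_comms[v]) for v in adj}
    avg_size = {
        v: sum(size[c] for c in nbr_comms[v]) / len(nbr_comms[v])
        if nbr_comms[v] else 0.0
        for v in adj
    }

    def normalise(values):
        # Assumption: divide by the maximum so every index lies in [0, 1].
        top = max(values.values())
        return {v: (x / top if top else 0.0) for v, x in values.items()}

    cd, cs, cn, avg = map(normalise, (degree, own_size, n_comms, avg_size))

    ci = {}
    for v in adj:
        if v in boundary:      # assumption: boundary formula takes precedence
            ci[v] = (cd[v] + cn[v] + avg[v]) / 3
        elif v in core:
            ci[v] = (cd[v] + cs[v]) / 2
    return ci


edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5), (4, 6), (5, 6)]
adj = {v: set() for v in range(7)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)
community_of = {0: "A", 1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}

ci = community_influence(adj, community_of, core={0, 4}, boundary={3, 4})
print(ci)
```

Only candidate nodes receive a score; nodes that are neither core nor boundary are skipped, matching the restriction of the heuristic phase to the candidate set.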
2. The method for maximizing social network influence based on community structure as claimed in claim 1, wherein the specific operation steps of the Louvain algorithm in the step (2) are as follows:
(a) merging communities: taking each node in the network as a community, then determining which neighbor communities are combined based on the modularity gain maximization standard, and repeating the process until the modularity gain is not increased any more, wherein the modularity gain is defined as follows:
[formula image: modularity gain]
wherein $\Sigma_{in}$ denotes the sum of the weights of all edges inside community C, $\Sigma_{tot}$ denotes the sum of the weights of all edges connected to community C, $k_{i,in}$ denotes the sum of the weights of all edges from node i to community C, and m denotes the total number of edges in the network;
(b) constructing a new network: taking the new community obtained in the step (a) as a new node, constructing a new network and repeatedly executing the step (a);
(c) these two phases are repeated until the modularity gain is no longer changed.
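The merging decision in step (a) can be illustrated with a short Python sketch. Instead of a closed-form gain, it recomputes the modularity of an unweighted, undirected graph before and after a trial move and takes the difference, which is slow but easy to check on small graphs. The function names `modularity` and `best_move` and the toy network are hypothetical, and this is a sketch of a single local move, not a full Louvain implementation.

```python
def modularity(adj, community_of):
    """Q = sum over communities of L_c / m - (D_c / 2m)^2.

    L_c is the number of edges inside community c, D_c the sum of the
    degrees of its nodes, and m the total number of edges.
    """
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    internal, degree_sum = {}, {}
    for v, nbrs in adj.items():
        c = community_of[v]
        degree_sum[c] = degree_sum.get(c, 0) + len(nbrs)
        # Each internal edge is seen from both endpoints, hence the 0.5.
        inside = sum(1 for w in nbrs if community_of[w] == c)
        internal[c] = internal.get(c, 0) + 0.5 * inside
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


def best_move(adj, community_of, node):
    """Return (community, gain) for the neighbour community with the largest
    positive modularity gain; the node stays put (gain 0.0) if none helps."""
    base = modularity(adj, community_of)
    best_c, best_gain = community_of[node], 0.0
    neighbour_comms = {community_of[w] for w in adj[node]} - {community_of[node]}
    for c in sorted(neighbour_comms):
        trial = dict(community_of)
        trial[node] = c
        gain = modularity(adj, trial) - base
        if gain > best_gain:
            best_c, best_gain = c, gain
    return best_c, best_gain


edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5), (4, 6), (5, 6)]
adj = {v: set() for v in range(7)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

# Step (a) starts with every node in its own community.
singletons = {v: v for v in adj}
target, gain = best_move(adj, singletons, 1)
print(target, gain)
```

Starting from singleton communities, node 1 gains more by joining node 2 than by joining the higher-degree node 0, because a merge with a high-degree community is penalised more heavily by the squared degree term.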
3. The method for maximizing social network influence based on community structure as claimed in claim 1, wherein the greedy algorithm in step (4) is specifically operated as follows:
(a) initialization: initializing a seed node set S;
(b) calculating the marginal gain of each node v, i.e. the increase in final influence obtained by adding node v to the seed set S, calculated as follows:
$\sigma(S \cup \{v\}) - \sigma(S)$
wherein $\sigma(\cdot)$ denotes the influence function;
(c) selecting a seed node: selecting the node with the largest influence gain to be added into the seed set S, and updating the influence of each node;
(d) repeating step (c) until k nodes satisfying the target are selected.
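Steps (a) to (d) can be sketched in Python as a greedy loop over a linear threshold diffusion. The sketch makes simplifying assumptions that are not part of the claim: node thresholds are fixed at 0.5 so the run is deterministic (in the usual linear threshold model they are drawn at random and the influence function is estimated by repeated simulation), the influence weight of w on v is taken as 1/degree(v), and all function and variable names are hypothetical.

```python
def lt_spread(adj, weights, thresholds, seeds):
    """Run linear threshold diffusion and return the final active set."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in adj:
            if v in active:
                continue
            # Total influence weight arriving from already active neighbours.
            pressure = sum(weights[(w, v)] for w in adj[v] if w in active)
            if pressure >= thresholds[v]:
                active.add(v)
                changed = True
    return active


def greedy_seeds(adj, weights, thresholds, candidates, k, seeds=()):
    """Extend `seeds` to k nodes, each time adding the candidate with the
    largest marginal gain sigma(S + v) - sigma(S)."""
    seeds = list(seeds)
    while len(seeds) < k:
        base = len(lt_spread(adj, weights, thresholds, seeds))
        best, best_gain = None, -1
        for v in sorted(candidates):
            if v in seeds:
                continue
            gain = len(lt_spread(adj, weights, thresholds, seeds + [v])) - base
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # no candidates left
            break
        seeds.append(best)
    return seeds


edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5), (4, 6), (5, 6)]
adj = {v: set() for v in range(7)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

# Assumption: weight of w on v is 1 / degree(v); thresholds fixed at 0.5.
weights = {(w, v): 1 / len(adj[v]) for v in adj for w in adj[v]}
thresholds = {v: 0.5 for v in adj}

seeds = greedy_seeds(adj, weights, thresholds, candidates={0, 3, 4}, k=2)
active = lt_spread(adj, weights, thresholds, seeds)
print(seeds, len(active))
```

The optional `seeds` argument lets the greedy phase continue from the seed set produced by the heuristic phase, as step (4) of claim 1 describes. Restricting the search to the candidate set is what keeps the number of spread evaluations small compared with a greedy search over all nodes.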
CN201810269184.XA 2018-03-29 2018-03-29 Social network influence maximization method based on community structure Active CN108492201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810269184.XA CN108492201B (en) 2018-03-29 2018-03-29 Social network influence maximization method based on community structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810269184.XA CN108492201B (en) 2018-03-29 2018-03-29 Social network influence maximization method based on community structure

Publications (2)

Publication Number Publication Date
CN108492201A CN108492201A (en) 2018-09-04
CN108492201B true CN108492201B (en) 2022-02-08

Family

ID=63317301

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810269184.XA Active CN108492201B (en) 2018-03-29 2018-03-29 Social network influence maximization method based on community structure

Country Status (1)

Country Link
CN (1) CN108492201B (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109064348B (en) * 2018-09-06 2021-10-08 上海交通大学 Method for locking rumor community and inhibiting rumor propagation in social network
CN109617871B (en) * 2018-12-06 2020-04-14 西安电子科技大学 Network node immunization method based on community structure information and threshold
CN109345158A (en) * 2018-12-19 2019-02-15 重庆百行智能数据科技研究院有限公司 Business risk recognition methods, device and computer readable storage medium
CN109685355A (en) * 2018-12-19 2019-04-26 重庆百行智能数据科技研究院有限公司 Business risk recognition methods, device and computer readable storage medium
CN110222273B (en) * 2019-05-14 2021-08-17 上海交通大学 Business point promotion method and system in social network based on geographic community
CN110162716B (en) * 2019-05-21 2020-12-25 湖南大学 Influence community searching method and system based on community retrieval
CN111988129B (en) * 2019-05-21 2022-07-01 中移(苏州)软件技术有限公司 Influence maximization data set processing method, device and system
CN110263264B (en) * 2019-06-28 2021-04-27 南昌航空大学 Method for acquiring social network key node
CN110796561B (en) * 2019-10-19 2023-04-11 上海大学 Influence maximization method and device based on three-hop velocity attenuation propagation model
CN110750721A (en) * 2019-10-21 2020-02-04 秒针信息技术有限公司 Information pushing method and device, electronic equipment and readable storage medium
CN110855641B (en) * 2019-10-30 2022-07-01 支付宝(杭州)信息技术有限公司 Community attribute information determination method, device and storage medium
CN110992195B (en) * 2019-11-25 2023-04-21 中山大学 Social network high-influence user identification method combined with time factors
CN111275565A (en) * 2020-02-14 2020-06-12 山东科技大学 Social network influence maximization method based on local and global influences
CN111597665B (en) * 2020-05-15 2023-05-23 天津科技大学 Hierarchical network embedding method based on network partition
CN111782969B (en) * 2020-07-06 2023-05-23 桂林电子科技大学 Social network maximum influence node selection method based on geographic area
CN112148989B (en) * 2020-10-16 2021-08-24 重庆理工大学 Social network node influence recommendation system based on local nodes and degree discount
CN112214689A (en) * 2020-10-22 2021-01-12 上海交通大学 Method and system for maximizing influence of group in social network
CN113282744B (en) * 2021-06-07 2022-11-08 南京邮电大学 Literary work character relation visualization analysis method based on node influence measurement
CN113436674B (en) * 2021-06-23 2023-02-17 兰州大学 Incremental community detection method-TSEIA based on TOPSIS seed expansion
CN113284030B (en) * 2021-06-28 2023-05-23 南京信息工程大学 Urban traffic network community division method
CN115049002A (en) * 2022-06-15 2022-09-13 重庆理工大学 Complex network influence node identification method based on reverse generation network
CN115329209A (en) * 2022-07-18 2022-11-11 齐齐哈尔大学 Method for maximizing influence of time sequence social network of improved K-shell
CN115659007B (en) * 2022-09-21 2023-11-14 浙江大学 Dynamic influence propagation seed minimization method based on diversity

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875281A (en) * 2017-03-13 2017-06-20 哈尔滨工程大学 Community network node method for digging based on greedy subgraph

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050154639A1 (en) * 2004-01-09 2005-07-14 Zetmeir Karl D. Business method and model for integrating social networking into electronic auctions and ecommerce venues.
US20070198510A1 (en) * 2006-02-03 2007-08-23 Customerforce.Com Method and system for assigning customer influence ranking scores to internet users

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875281A (en) * 2017-03-13 2017-06-20 哈尔滨工程大学 Community network node method for digging based on greedy subgraph

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
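Improving Louvain Algorithm for Community Detection; Bin Hu et al.; "International Conference on Artificial Intelligence and Engineering Applications (AIEA 2016)"; 20161112; abstract, Section 2.2 *
A New Influence Maximization Algorithm for Social Networks (一种新型的社会网络影响最大化算法); Tian Jiatang (田家堂) et al.; "Chinese Journal of Computers" (计算机学报); 20111015; Vol. 34, No. 10; abstract, Sections 2.1 and 3.1, Algorithm 1 *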

Also Published As

Publication number Publication date
CN108492201A (en) 2018-09-04

Similar Documents

Publication Publication Date Title
CN108492201B (en) Social network influence maximization method based on community structure
Li et al. Conformity-aware influence maximization in online social networks
Yun et al. An efficient algorithm for mining high utility patterns from incremental databases with one database scan
Ju et al. A new algorithm for positive influence maximization in signed networks
Qiu et al. PHG: A three-phase algorithm for influence maximization based on community structure
Lin et al. CK-LPA: Efficient community detection algorithm based on label propagation with community kernel
Pei et al. Efficient collective influence maximization in cascading processes with first-order transitions
Pal et al. Centrality measures, upper bound, and influence maximization in large scale directed social networks
US20120330864A1 (en) Fast personalized page rank on map reduce
Kundu et al. Fuzzy-rough community in social networks
CN103116611A (en) Social network opinion leader identification method
Hong et al. Efficient minimum cost seed selection with theoretical guarantees for competitive influence maximization
CN112052404B (en) Group discovery method, system, equipment and medium of multi-source heterogeneous relation network
CN105117488B (en) A kind of distributed storage RDF data balanced division method based on hybrid hierarchy cluster
CN102262681A (en) Method for identifying key blog sets in blog information spreading
Dey et al. Influence maximization in online social network using different centrality measures as seed node of information propagation
Vega-Oliveros et al. Link prediction based on stochastic information diffusion
Durón Heatmap centrality: a new measure to identify super-spreader nodes in scale-free networks
CN107222410B (en) Method, device, terminal and computer readable storage medium for link prediction
Sheng et al. Research on the influence maximization based on community detection
Gialampoukidis et al. Community detection in complex networks based on DBSCAN* and a Martingale process
Freitas et al. Local partition in rich graphs
Zarei et al. Chaotic memetic algorithm and its application for detecting community structure in complex networks
Li et al. Multi-topical authority sensitive influence maximization with authority based graph pruning and three-stage heuristic optimization
Kumar et al. Opinion leader detection in Asian social networks using modified spider monkey optimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant