
CN111242794A - Method for measuring social network influence

Info

Publication number: CN111242794A
Application number: CN202010066714.8A
Authority: CN
Inventors: 吴晴晴; 周丽华; 黄亚群
Original assignee: Yunnan University YNU
Current assignee: Yunnan University YNU
Priority date: 2020-01-20
Filing date: 2020-01-20
Publication date: 2020-06-05
Also published as: AU2020102905A4

Abstract

The invention discloses a method for measuring social network influence, which is used to select seed nodes effectively. Algorithm 1 gives the pseudocode of the CCIM algorithm. First, the network G(V, E) is divided into M communities by a community detection algorithm; the influence of each node is then calculated and the seed node with the greatest influence is selected. To avoid repeated calculations, the marginal gain is computed incrementally. After a seed node is selected, the overlapping influence is removed and the influence of the remaining nodes is recalculated. Finally, the seed nodes propagate influence through the network under a specific diffusion model to maximize the influence spread.

Description

Method for measuring social network influence

Technical Field

The invention relates to the technical field of internet, in particular to a method for measuring social network influence.

Background

In recent years, the rapid development of internet technology has driven the growth of social networks such as Twitter, microblogs, and WeChat. A social network is a network of intricate relationships between individuals that facilitates the propagation of information among them. The goal of Influence Maximization (IM) is to identify a set of the most influential users so as to maximize the expected number of users ultimately influenced through information diffusion. Because of its wide range of practical applications, such as viral marketing [1,2], rumor control [3,4], and cascade detection [5], influence maximization has attracted considerable attention from researchers.

The IM problem was first proposed by Kempe et al. [6], who proved that it is NP-hard and proposed a greedy algorithm with guaranteed solution accuracy. The traditional greedy algorithm has high time complexity and therefore cannot be applied to large-scale networks. To address this problem, researchers have in recent years proposed a number of approximation algorithms and heuristics, such as simulation-based algorithms [5,7], centrality-based algorithms [8,9,10], path-based algorithms [11,12,13], and community-based algorithms [14,15,16,17,18]. Community-based algorithms typically use the influence of a node within its community to approximate its influence on the entire network.

Community structure [19] is one of the most prominent features of a network. A community is a group of nodes that are densely connected to one another and only sparsely connected to nodes in other communities. Community structure exposes the organizational structure and functional components of the network and describes the network from a mesoscopic perspective. For two nodes in the same community, even if they have only a weak relationship at the microscopic level because of data sparsity, the influence between them is strengthened by the constraint of the community structure. In addition, since the range of influence of one individual is limited, the influence within a community can be used to approximate the influence on the whole network. Because a community is much smaller than the whole network, node influence can be calculated more efficiently while preserving the accuracy of the solution.

Existing community-based IM algorithms, e.g., CoFIM [17] and IMPC [18], have achieved some success. However, these algorithms consider only the number of nodes in a community and ignore the density of edges within it. In the network with community structure depicted in FIG. 1(a), communities C₃ and C₄ have the same number of nodes, but the proportion of edges in C₃ is higher. If only the number of nodes is considered, the two communities have the same influence. However, the more edges a community has, the more likely its nodes are to interact, which may increase the chance of activating an inactive node. Distinguishing between the influence of C₃ and that of C₄ therefore helps measure node influence more accurately.

In addition, the existing methods can be applied only to non-overlapping community structures. In the real world, however, communities often overlap, i.e., a node may belong to several communities. For example, in FIG. 1(b), node v₁ belongs to three communities, among them C₁ and C₄, and therefore has an influence on the nodes in all three communities.

IM is a popular research topic in social network analysis; it aims to find the most influential users in a social network so as to maximize the spread of influence. In recent years, much IM research has exploited small-scale community structures to improve running efficiency. However, existing community-based influence maximization methods consider only the number of nodes in a community and ignore the density of connections between those nodes. Furthermore, existing methods can be applied only to non-overlapping community structures.

Disclosure of Invention

The present invention is directed to solving the above problems and providing a method for measuring social network influence.

The invention realizes the purpose through the following technical scheme:

The invention comprises the following steps:

Problem definition:

Given a network G, the goal is to select a set S of the most influential nodes that maximizes the expected total number of active nodes σ(S) under a particular diffusion model:

S* = arg max_S σ(S) (1)

Influence measurement:

The influence diffusion process of an activated node u is divided into two stages: multi-neighbor propagation and community propagation.

(1) Multi-neighbor influence:

Multi-neighbor propagation involves two steps of influence propagation. First, the influence propagates from node u to N(u); this is the direct influence of u on N(u). The influence then continues to propagate from the active nodes in N(u) to N(N(u)); this is the indirect influence of node u on N(N(u)) through N(u). Under cascade diffusion, an activated node u has little influence on neighbors more than two hops away, and such neighbors are difficult to activate indirectly. Therefore, only the influence of u on its one-hop neighbors N(u) and two-hop neighbors N(N(u)) is considered.

Let p_uv denote the probability that u influences v, and let IN₁(u) and IN₂(u) denote the influence of node u on its one-hop and two-hop neighbors, respectively. IN₁(u) and IN₂(u) are defined by equations (2) [20] and (3).

The multi-neighbor influence of node u, denoted f₁(u), is then approximated as:

f₁(u) = IN₁(u) + IN₂(u) (4)

Since N(u) and N(N(u)) are the direct and indirect neighbors of u, the influence of u on N(u) is direct, while the influence of u on N(N(u)) is indirect. Direct influence is greater than indirect influence, so nodes in N(u) are more likely to be activated by u. If only the influence of the active node u is considered, then after multi-neighbor propagation N(u) contains a greater proportion of active nodes than N(N(u)).

(2) Community influence:

Because an activated node has little influence on communities other than those in BC(u) and NC(u), the influence on all other communities is ignored for simplicity. The influence of the activated nodes in BC(u) is referred to as intra-community influence, and the influence of the activated nodes in NC(u) is referred to as inter-community influence.

The concept of community closeness is defined based on the average shortest distance between nodes in a community, and definitions of intra-community and inter-community influence are then given.

Definition 1 (Community closeness). Let C_i be a community and let d(u, v)_min be the shortest path between nodes u and v, where u, v ∈ C_i. The closeness of community C_i is then defined as:

close(C_i) is the average shortest path between the nodes in community C_i and reflects the density of edge connections in C_i. A smaller value of close(C_i) indicates a greater influence of C_i.

(2.1) Intra-community influence

The intra-community influence measures the influence of the activated nodes in BC(u) on the inactive nodes in BC(u). Let A_i denote the set of activated nodes in community C_i ∈ BC(u); the influence diffuses from A_i to the inactive nodes in C_i.

In the cascade diffusion process, influence decreases as path length increases. Clearly, the shorter the paths between the active nodes in A_i and the inactive nodes in C_i, the more easily the inactive nodes are activated. The closeness of C_i to A_i is denoted cd(u, C_i, A_i) and is defined as:

where d(v, A_i)_min denotes the shortest path from node v to A_i, and |C_i \ (u ∪ A_i)| denotes the number of inactive nodes in community C_i. cd(u, C_i, A_i) is a variant of community closeness that takes A_i into account. The smaller the value of cd(u, C_i, A_i), the more closely the nodes of community C_i are connected to A_i, and therefore the greater the influence of A_i on community C_i.

When influence propagates in a community, the intra-community influence depends not only on the closeness of the community but also on the number of nodes in it: more nodes in the community means that more nodes may be activated. The number of nodes in the community is therefore normalized to give the weight of the community, denoted W_Ci.

Thus, the larger W_Ci and the smaller cd(u, C_i, A_i), the greater the influence of community C_i. The intra-community influence of a node is therefore measured by combining the weight and the closeness of its communities; it is denoted IC₁(u) and is defined by equation (6).

(2.2) Inter-community influence

The inter-community influence measures the influence of the activated nodes in NC(u) on the inactive nodes in NC(u). Because the connections between node u ∈ C_i and a neighbor community C_j ∈ NC(u) are sparse, the number of activated nodes A_j in each C_j is very small, and the influence of A_j on community C_j is therefore relatively small. The propagation of influence from A_j within NC(u) is therefore neglected, and the inter-community influence of node u is estimated from the influence of the community itself. As with the intra-community influence, the influence of community C_j is measured by combining the weight and the closeness of the community; it is denoted I(C_j) and is defined as:

The inter-community influence of node u ∈ C_i is denoted IC₂(u) and can be approximated by equation (8).

The intra-community and inter-community influences are then integrated into the community influence of node u, denoted f₂(u):

f₂(u)＝α·IC₁(u)+β·IC₂(u) (9)

where α and β are the weights of the intra-community and inter-community influence, respectively.

(3) Total influence

The multi-neighbor influence and the community influence are integrated into the total influence of node u in the whole network, denoted f(u), which measures the influence of the node more comprehensively:

f(u) = f₁(u) + f₂(u) = [IN₁(u) + IN₂(u)] + [α·IC₁(u) + β·IC₂(u)] (10)

The invention has the following beneficial effects:

Compared with the prior art, the method for measuring social network influence of the invention selects seed nodes effectively. Algorithm 1 gives the pseudocode of the CCIM algorithm. First, the network G(V, E) is divided into M communities by a community detection algorithm; the influence of each node is then calculated and the seed node with the greatest influence is selected. To avoid repeated calculations, the marginal gain is computed incrementally. After a seed node is selected, the overlapping influence is removed and the influence of the remaining nodes is recalculated. Finally, the seed nodes propagate influence through the network under a specific diffusion model to maximize the influence spread.

Drawings

FIG. 1 shows a network with a community structure: (a) non-overlapping communities; (b) overlapping communities.

FIG. 2 is a schematic diagram of a network divided into 4 communities according to the invention.

Detailed Description

The invention will be further described with reference to the accompanying drawings in which:

First, the symbols used by the method are introduced; second, the problem to be solved is stated; finally, the main method for measuring node influence is described.

1. Notation

Consider a network G(V, E), where V is the set of nodes and E is the set of edges. Suppose the network is divided into M communities, denoted C = {C₁, C₂, ..., C_M}; nodes are densely connected within communities and sparsely connected between communities. Any node u ∈ V may belong to one or more communities. If u ∈ C_i, then C_i is called a community of u. Let BC(u) denote the set of communities to which u belongs, with |BC(u)| ≥ 1. In G, if node v is directly connected to node u, v is called a neighbor, or one-hop neighbor, of u. The neighbors of a node's neighbors are called the two-hop neighbors of the node. The one-hop neighbor set of u is denoted N(u), and the two-hop neighbor set of u is denoted N(N(u)). If u ∈ C_i but its neighbor v ∈ C_j (i ≠ j), then C_j is called a neighbor community of node u. Let NC(u) denote the set of neighbor communities of u, with |NC(u)| ≥ 0.

During information dissemination, each node is in one of two states: active or inactive. An active state indicates that the node has accepted the information it was exposed to and can forward it to its neighbors, while an inactive state means that the node has not accepted the information. When an inactive node is influenced by active nodes, its state changes from inactive to active, but not the reverse. An active node u first attempts to influence its neighbors v ∈ N(u), and each activated neighbor v in turn influences its own neighbors N(v).
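The neighbor sets defined above can be illustrated with a minimal Python sketch. The adjacency dictionary `adj` and the function names are hypothetical, and the text does not say whether N(N(u)) excludes u and its one-hop neighbors; the sketch assumes that it does.

```python
def one_hop(adj, u):
    """N(u): nodes directly connected to u."""
    return set(adj[u])


def two_hop(adj, u):
    """N(N(u)): neighbors of u's neighbors.

    Assumption: u itself and its one-hop neighbors are excluded.
    """
    n1 = one_hop(adj, u)
    n2 = set()
    for v in n1:
        n2.update(adj[v])
    return n2 - n1 - {u}


# Hypothetical toy graph (not taken from the patent figures): the path a-b-c-d.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(one_hop(adj, "a"))
print(two_hop(adj, "a"))
```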

2. Problem definition

Given a network G, our goal is to select a set S of the most influential nodes that maximizes the expected total number of active nodes σ(S) under a particular diffusion model:

S* = arg max_S σ(S) (1)

The key to solving this problem is how to measure the influence of the nodes.

3. Influence measurement

We assume that the influence diffusion process of an active node u is divided into two phases: multi-neighbor propagation and community propagation. Multi-neighbor propagation considers node-to-node influence and reflects influence at the microscopic level. Community propagation considers the influence of nodes on communities and reflects influence at the mesoscopic level. Considering both levels helps measure the influence of nodes in the network accurately from multiple levels and angles.

(1) Multi-neighbor influence

In multi-neighbor propagation, we focus on two steps of influence propagation. First, the influence propagates from node u to N(u); this is the direct influence of u on N(u). The influence then continues to propagate from the active nodes in N(u) to N(N(u)); this is the indirect influence of node u on N(N(u)) through N(u). Under cascade diffusion, an activated node u has little influence on neighbors more than two hops away, and such neighbors are difficult to activate indirectly. The present invention therefore considers only the influence of u on its one-hop neighbors N(u) and two-hop neighbors N(N(u)).

Let p_uv denote the probability that u influences v, and let IN₁(u) and IN₂(u) denote the influence of node u on its one-hop and two-hop neighbors, respectively. IN₁(u) and IN₂(u) are defined by equations (2) [20] and (3).

The multi-neighbor influence of node u, denoted f₁(u), is then approximated as:

f₁(u) = IN₁(u) + IN₂(u) (4)

Since N(u) and N(N(u)) are the direct and indirect neighbors of u, the influence of u on N(u) is direct, while the influence of u on N(N(u)) is indirect. Generally, direct influence is greater than indirect influence, so nodes in N(u) are more likely to be activated by u. If only the influence of the active node u is considered, then after multi-neighbor propagation N(u) contains a greater proportion of active nodes than N(N(u)).

(2) Community influence

After multi-neighbor propagation, some nodes in N(u) and N(N(u)) are activated, and these further influence the remaining inactive nodes in the network. Nodes in N(u) and N(N(u)) may be spread across different communities, so the influence may propagate to different communities. The key issue at this stage is how to compute the propagation of influence from N(u) and N(N(u)) to different communities. The nodes in N(u) may be spread across BC(u) (the set of communities to which u belongs) and NC(u) (the set of neighbor communities of u), while the nodes in N(N(u)) may be spread across BC(u), NC(u), and other communities. For simplicity, we ignore the influence on communities other than those in BC(u) and NC(u). In the present invention, the influence of the activated nodes in BC(u) is referred to as intra-community influence, and the influence of the activated nodes in NC(u) is referred to as inter-community influence.

The influence of a node in a community depends not only on the number of nodes in the community, but also on the relationship between nodes in the community. The more nodes in the community, the larger the influence range; the tighter the relationship between nodes, the greater the impact strength. To quantify the observations, we define the notion of community closeness based on the average shortest distance between nodes in the community. Then, we give a definition of intra-community and inter-community impacts.

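Definition 1 (Community closeness). Let C_i be a community and let d(u, v)_min be the shortest path between nodes u and v, where u, v ∈ C_i. close(C_i) is the average shortest path between the nodes in community C_i and reflects the density of edge connections in C_i. A smaller value of close(C_i) indicates a greater influence of C_i.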
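As an illustration of community closeness, the following Python sketch computes an average shortest-path length inside a community with breadth-first search. The formula itself is not reproduced in this text, so two points are assumptions: the average is taken over all pairs of distinct nodes, and paths are restricted to nodes of the community. The function names and the toy graphs are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source, allowed):
    """Hop distances from source, walking only through nodes in `allowed`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y in allowed and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def community_closeness(adj, community):
    """Average shortest-path length over all pairs of distinct community nodes."""
    members = set(community)
    if len(members) < 2:
        raise ValueError("closeness needs at least two nodes")
    total = 0
    for u in members:
        dist = bfs_distances(adj, u, members)
        if len(dist) != len(members):
            raise ValueError("community is not connected internally")
        total += sum(dist.values())
    # Every unordered pair was counted twice, so divide by the ordered-pair count.
    return total / (len(members) * (len(members) - 1))


# Hypothetical three-node communities (not taken from FIG. 1).
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
path = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
print(community_closeness(triangle, triangle))  # denser community, smaller value
print(community_closeness(path, path))
```

Both toy communities have three nodes, but the denser one yields the smaller closeness value, which is the distinction that node counts alone cannot capture.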

(2.1) Intra-community influence

The intra-community influence measures the influence of the activated nodes in BC(u) on the inactive nodes in BC(u). Let A_i denote the set of activated nodes in community C_i ∈ BC(u); the influence diffuses from A_i to the inactive nodes in C_i.

In the cascade diffusion process, influence decreases as path length increases. Clearly, the shorter the paths between the active nodes in A_i and the inactive nodes in C_i, the more easily the inactive nodes are activated. The closeness of C_i to A_i is denoted cd(u, C_i, A_i) and is defined as:

where d(v, A_i)_min denotes the shortest path from node v to A_i, and |C_i \ (u ∪ A_i)| denotes the number of inactive nodes in community C_i. cd(u, C_i, A_i) is a variant of community closeness that takes A_i into account. The smaller the value of cd(u, C_i, A_i), the more closely the nodes of community C_i are connected to A_i, and therefore the greater the influence of A_i on community C_i.
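The description of cd(u, C_i, A_i) can be sketched in Python as follows. The defining formula is not reproduced in this text; the sketch assumes it is the mean of d(v, A_i)_min over the inactive nodes C_i \ (u ∪ A_i), with paths restricted to the community. The function name and the toy graph are hypothetical.

```python
from collections import deque


def closeness_to_active(adj, community, u, active):
    """Mean distance from each inactive community node to its nearest active node."""
    members = set(community)
    sources = set(active) & members
    inactive = members - sources - {u}
    if not inactive:
        return 0.0  # assumption: nothing is left to activate
    # Multi-source BFS: every node receives its distance to the nearest source.
    dist = {a: 0 for a in sources}
    queue = deque(sources)
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y in members and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    unreachable = inactive - set(dist)
    if unreachable:
        raise ValueError(f"no path from the active set to {sorted(unreachable)}")
    return sum(dist[v] for v in inactive) / len(inactive)


# Hypothetical community: the path u-a-x-y, where u has activated a.
adj = {"u": {"a"}, "a": {"u", "x"}, "x": {"a", "y"}, "y": {"x"}}
print(closeness_to_active(adj, adj, "u", {"a"}))
```

A single multi-source search is used instead of one search per active node, since only the distance to the nearest member of A_i matters.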

When influence propagates in a community, the intra-community influence depends not only on the closeness of the community but also on the number of nodes in it: more nodes in the community means that more nodes may be activated. We therefore normalize the number of nodes in the community to give the weight of the community, denoted W_Ci.

Thus, the larger W_Ci and the smaller cd(u, C_i, A_i), the greater the influence of community C_i. We therefore measure the intra-community influence of a node by combining the weight and the closeness of its communities; it is denoted IC₁(u) and is defined by equation (6).

(2.2) Inter-community influence

The inter-community influence measures the influence of the activated nodes in NC(u) on the inactive nodes in NC(u). Because the connections between node u ∈ C_i and a neighbor community C_j ∈ NC(u) are sparse, the number of activated nodes A_j in each C_j is very small, and the influence of A_j on community C_j is therefore relatively small. To improve computational efficiency, we therefore neglect the propagation of influence from A_j within NC(u) and estimate the inter-community influence of node u from the influence of the community itself. As with the intra-community influence, we measure the influence of community C_j by combining the weight and the closeness of the community; it is denoted I(C_j) and is defined as:

The inter-community influence of node u ∈ C_i is denoted IC₂(u) and can be approximated by equation (8).

The intra-community and inter-community influences are then integrated into the community influence of node u, denoted f₂(u):

f₂(u)＝α·IC₁(u)+β·IC₂(u) (9)

where α and β are the weights of the intra-community and inter-community influence, respectively.

(3) Total influence

The multi-neighbor influence and the community influence are integrated into the total influence of node u in the whole network, denoted f(u), so that the influence of the node is measured more comprehensively:

f(u) = f₁(u) + f₂(u) = [IN₁(u) + IN₂(u)] + [α·IC₁(u) + β·IC₂(u)] (10)

The invention provides a community closeness-based influence maximization (CCIM) algorithm for selecting seed nodes effectively. Algorithm 1 gives the pseudocode of the CCIM algorithm. First, the network G(V, E) is divided into M communities by a community detection algorithm; the influence of each node is then calculated and the seed node with the greatest influence is selected. To avoid repeated calculations, the marginal gain is computed incrementally. After a seed node is selected, the overlapping influence is removed and the influence of the remaining nodes is recalculated. Finally, the seed nodes propagate influence through the network under a specific diffusion model to maximize the influence spread.
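Algorithm 1 is not reproduced in this text. The Python sketch below is only a stand-in for the select-then-recompute loop described above, not the CCIM algorithm itself: it swaps in a simple greedy coverage score, where a candidate's gain is the summed weight of the nodes in its `reach` set that no earlier seed already covers. The names `reach`, `weight`, and `select_seeds` are hypothetical; in CCIM the score would be f(u) from equation (10).

```python
def select_seeds(reach, weight, k):
    """Pick k seeds greedily, discounting influence that overlaps earlier seeds."""
    covered = set()
    seeds = []
    candidates = set(reach)
    for _ in range(min(k, len(candidates))):
        def gain(u):
            # Marginal gain: only nodes not yet covered by a chosen seed count.
            return sum(weight[v] for v in reach[u] - covered)

        best = max(sorted(candidates), key=gain)
        seeds.append(best)
        covered |= reach[best]  # remove the overlapping influence
        candidates.remove(best)
    return seeds


# Hypothetical toy input: b overlaps entirely with a, while d reaches new nodes.
reach = {"a": {"a", "b", "c"}, "b": {"b", "c"}, "d": {"d", "e"}}
weight = {v: 1.0 for v in "abcde"}
print(select_seeds(reach, weight, 2))
```

In the toy input, b and d have the same stand-alone score, but after a is chosen b contributes nothing new, so the second seed is d.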

Example: community closeness-based influence maximization in social networks

The method comprises three main steps: (1) community detection, (2) influence measurement, and (3) the diffusion process. Each step is described in detail below.

(1) Community detection

The network G(V, E) is divided into M communities by a community detection algorithm. In the example of FIG. 2, the network is divided into 4 communities: community C₁ comprises nodes V0, V1, V2, V3, and V4; community C₂ comprises nodes V5, V6, V7, V8, V9, and V10; community C₃ comprises nodes V4, V11, V12, V13, V14, V15, V16, and V17; and community C₄ comprises nodes V4, V18, V19, V20, V21, V22, and V23.
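The overlapping membership in this example can be checked with a short Python sketch. The community contents are those listed above; the function names are hypothetical, and because the edges of FIG. 2 are not listed in this text, the single edge used to illustrate NC(u) is a made-up placeholder. The sketch also assumes that NC(u) excludes the communities in BC(u).

```python
communities = {
    "C1": {f"V{i}" for i in range(0, 5)},
    "C2": {f"V{i}" for i in range(5, 11)},
    "C3": {"V4"} | {f"V{i}" for i in range(11, 18)},
    "C4": {"V4"} | {f"V{i}" for i in range(18, 24)},
}


def belonging_communities(communities, u):
    """BC(u): every community that contains u."""
    return {name for name, members in communities.items() if u in members}


def neighbor_communities(communities, adj, u):
    """NC(u): communities of u's neighbors that u itself does not belong to."""
    bc = belonging_communities(communities, u)
    nc = set()
    for v in adj.get(u, ()):
        nc |= belonging_communities(communities, v)
    return nc - bc


print(sorted(belonging_communities(communities, "V4")))

# Hypothetical edge, for illustration only; FIG. 2's edge list is not given here.
toy_adj = {"V2": {"V6"}, "V6": {"V2"}}
print(sorted(neighbor_communities(communities, toy_adj, "V2")))
```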

(2) Influence measurement

The multi-neighbor impact of each node is calculated according to equation (4), and the results are shown in table 1.

TABLE 1 Multi-neighbor impact

The intra-community influence of each node is calculated according to equation (6), and the results are shown in table 2.

TABLE 2. Intra-community influence

Node	Intra-community influence	Node	Intra-community influence	Node	Intra-community influence
V0	0.000222	V8	0.6002	V16	2.25025
V1	0.000167	V9	0.66689	V17	2.400267
V2	0.0	V10	0.6002	V18	1.33355
V3	0.000167	V11	2.25025	V19	1.000167
V4	3.7435841	V12	2.500278	V20	1.500249
V5	0.6002	V13	2.4002667	V21	1.33355
V6	0.66689	V14	2.00022	V22	1.500249
V7	0.6002	V15	2.400267	V23	1.000167

The inter-community influence of each node is calculated according to equation (8), and the result is shown in table 3.

TABLE 3. Inter-community influence

Node	Inter-community influence	Node	Inter-community influence	Node	Inter-community influence
V0	0.0	V8	0.0	V16	0.840285
V1	0.0	V9	0.0	V17	0.0
V2	1.99095	V10	0.0	V18	1.150959
V3	1.99095	V11	0.840285	V19	1.150958
V4	0.375125	V12	0.0	V20	0.0
V5	0.0	V13	0.0	V21	0.0
V6	1.9910988	V14	0.0	V22	0.0
V7	0.0	V15	0.0	V23	0.0

The total influence of each node is calculated according to equation (10), with α = 0.5 and β = 0.2; the results are shown in Table 4.

TABLE 4. Total influence

Node	Total influence	Node	Total influence	Node	Total influence
V0	1.2719355	V8	1.92509925	V16	3.6074650
V1	1.9445267	V9	2.785825	V17	2.97989375
V2	2.857316	V10	1.92509925	V18	2.7489924
V3	2.067717	V11	3.7046879	V19	3.149440
V4	5.73185	V12	2.521963	V20	2.2045293
V5	1.835814	V13	3.1941795	V21	2.688324
V6	3.629282	V14	2.9167755	V22	2.2045293
V7	1.835814	V15	2.97989375	V23	3.0293676
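The way the tabulated values feed seed selection can be illustrated with a short Python sketch. It reads the weights in the text as α = 0.5 and β = 0.2 (an assumption about the garbled notation), computes the community term α·IC₁ + β·IC₂ of equation (9) for V4 from Tables 2 and 3, and ranks the nodes by the total influence values of Table 4. The variable names are hypothetical.

```python
alpha, beta = 0.5, 0.2  # assumed reading of the weights stated in the text

# Community term of equation (9) for V4, using its Table 2 and Table 3 entries.
ic1_v4, ic2_v4 = 3.7435841, 0.375125
community_part_v4 = alpha * ic1_v4 + beta * ic2_v4

# Total influence f(u) as listed in Table 4.
total = {
    "V0": 1.2719355, "V1": 1.9445267, "V2": 2.857316, "V3": 2.067717,
    "V4": 5.73185, "V5": 1.835814, "V6": 3.629282, "V7": 1.835814,
    "V8": 1.92509925, "V9": 2.785825, "V10": 1.92509925, "V11": 3.7046879,
    "V12": 2.521963, "V13": 3.1941795, "V14": 2.9167755, "V15": 2.97989375,
    "V16": 3.6074650, "V17": 2.97989375, "V18": 2.7489924, "V19": 3.149440,
    "V20": 2.2045293, "V21": 2.688324, "V22": 2.2045293, "V23": 3.0293676,
}

ranked = sorted(total, key=total.get, reverse=True)
print(round(community_part_v4, 6))
print(ranked[:3])  # the first entry is the seed with the greatest total influence
```

V4, the only node that belongs to three communities in this example, ranks first.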

(3) Diffusion process

A node is selected as the seed node, and information diffusion is carried out under the traditional independent cascade (IC) model. In the IC model, each newly activated node has a single chance to activate each of its inactive neighbors, and the propagation probability from u to v is defined as p_uv = 1/k_v, where k_v is the degree of node v. In the experiment, the activation threshold of each node is set to a random number, and, to eliminate the randomness of the result, the influence spread is estimated with 10000 Monte Carlo (MC) simulations. The results are shown in Table 5.

TABLE 5. Influence spread

Node	Influence spread	Node	Influence spread	Node	Influence spread
V0	2.258	V8	2.6415	V16	3.7167
V1	2.933	V9	3.3569	V17	3.0882
V2	3.401	V10	2.6445	V18	3.268
V3	2.9126	V11	3.8922	V19	3.759
V4	5.5802	V12	2.4494	V20	2.659
V5	2.6066	V13	3.2343	V21	3.1741
V6	4.0219	V14	3.1379	V22	2.6506
V7	2.5854	V15	3.1412	V23	3.5672
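The diffusion step can be sketched in Python as a Monte Carlo estimate of the spread under the independent cascade model with p_uv = 1/k_v. This is a minimal illustration and not the code used to produce Table 5: the function names, the fixed random seed, and the star-shaped toy graph are assumptions, and the graph is treated as undirected so that k_v is the length of v's adjacency list.

```python
import random


def simulate_ic(adj, seeds, rng):
    """Run one independent-cascade diffusion and return the number of active nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                # u gets a single chance to activate v, with probability 1 / k_v.
                if v not in active and rng.random() < 1.0 / len(adj[v]):
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_spread(adj, seeds, runs=10000, seed=0):
    """Average the cascade size over `runs` Monte Carlo simulations."""
    rng = random.Random(seed)
    return sum(simulate_ic(adj, seeds, rng) for _ in range(runs)) / runs


# Hypothetical star graph (not FIG. 2): center c with three leaves.
star = {"c": ["l1", "l2", "l3"], "l1": ["c"], "l2": ["c"], "l3": ["c"]}
print(estimate_spread(star, ["c"]))   # every leaf has degree 1, so all four nodes activate
print(estimate_spread(star, ["l1"]))  # expectation is 1 + (1/3) * 3 = 2
```

Counting the seed itself in the spread is also an assumption; the text does not state which convention Table 5 uses.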

References:

1. Domingos, P., Richardson, M.: Mining the network value of customers. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 57-66. KDD, San Francisco (2001).

2. Richardson, M., Domingos, P.: Mining knowledge-sharing sites for viral marketing. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 61-70. KDD, Edmonton (2002).

3. Budak, C., Agrawal, D., Abbadi, A.-E.: Limiting the spread of misinformation in social networks. In: Proceedings of the 20th International Conference on World Wide Web, pp. 665-674. WWW, Hyderabad (2011).

4. He, X., Song, G., Chen, W.: Influence blocking maximization in social networks under the competitive linear threshold model. In: Proceedings of the 12th SIAM International Conference on Data Mining, pp. 463-474. SDM, Anaheim (2011).

5. Leskovec, J., Krause, A., Guestrin, C.: Cost-effective outbreak detection in networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 420-429. ACM, San Jose (2007).

6. Kempe, D., Kleinberg, J., Tardos, E.: Maximizing the spread of influence through a social network. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 137-146. ACM, Washington (2003).

7. Goyal, A., Lu, W., Lakshmanan, L.V.: CELF++: optimizing the greedy algorithm for influence maximization in social networks. In: Proceedings of the 20th International Conference Companion on World Wide Web, pp. 47-48. ACM (2011).

8. Chen, W., Wang, Y., Yang, S.: Efficient influence maximization in social networks. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 199-208. ACM, Paris (2009).

9. Liu, H.-L., Ma, C., Xiang, B.-B.: Identifying multiple influential spreaders based on generalized closeness centrality. Physica A 492, 2237-2248 (2018).

10. Zhu, J., Liu, Y., Yin, X.: A new structure-hole-based algorithm for influence maximization in large online social networks. IEEE Access 5, 23405-23412 (2017).

11. Kim, J., Kim, S.-K., Yu, H.: Scalable and parallelizable processing of influence maximization for large-scale social networks. In: 29th IEEE International Conference on Data Engineering, pp. 266-277. IEEE, Brisbane (2013).

12. Liu, B., Cong, G., Zeng, Y.: Influence spreading path and its application to the time constrained social influence maximization problem and beyond. IEEE Trans. Knowl. Data Eng. 26(8), 1904-1917 (2014).

13. Ko, Y.-Y., Chae, D.-K., Kim, S.-W.: Accurate path-based methods for influence maximization in social networks. In: Proceedings of the 25th International Conference Companion on World Wide Web, pp. 59-60. WWW, Geneva (2016).

14. Galstyan, A., Musoyan, V.: Maximizing influence propagation in networks with community structure. Physical Review E 79(2), 056102 (2009).

15. Cao, T., Wu, X., Wang, S., Hu, X.: OASNET: an optimal allocation approach to influence maximization in modular social networks. In: ACM Symposium on Applied Computing, pp. 1088-1094. SAC, Sierre (2010).

16. Wang, Y., Cong, G., Song, G.: Community-based greedy algorithm for mining top-K influential nodes in mobile social network. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1039-1048. ACM, Washington (2010).

17. Shang, J., Zhou, S., Li, X.: CoFIM: A community-based framework for influence maximization on large-scale networks. Knowledge-Based Systems 117, 88-100 (2017).

18. Shang, J., Wu, H.: IMPC: Influence maximization based on multi-neighbor potential in community networks. Physica A 512, 1085-1103 (2018).

19. Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proc Natl Acad Sci U S A 99(12), 7821-7826 (2002).

20. Wang, Y., Feng, X.: A potential-based node selection strategy for influence maximization in a social network. Lecture Notes in Computer Science 5678, 350-361 (2009).

Lancichinetti, A., Fortunato, S., Radicchi, F.: Benchmark graphs for testing community detection algorithms. Physical Review E 78(4 Pt 2), 046110 (2008).

The foregoing shows and describes the general principles and features of the present invention, together with its advantages. Those skilled in the art will understand that the present invention is not limited to the embodiments described above, which are given in the specification only to illustrate the principle of the invention, and that various changes and modifications may be made without departing from the spirit and scope of the invention; such changes and modifications fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.

Claims

1. A method of measuring social networking influence, comprising the steps of:

Problem definition:

Given a network G, the goal is to select a set S of the most influential nodes that maximizes the expected total number of active nodes σ(S) under a specific diffusion model:

S* = arg max_S σ(S) (1)

influence measurement:

(1) multi-neighbor impact:

f₁(u)＝IN₁(u)+IN₂(u) (4)

(2) community influence:

(2.1) Intra-community influence

The intra-community influence measures the influence of the activated nodes in BC(u) on the inactive nodes in BC(u). Let A_i denote the set of activated nodes in community C_i ∈ BC(u); the influence diffuses from A_i to the inactive nodes in C_i.

Here d(v, A_i)_min denotes the shortest path from node v to A_i, and |C_i \ (u ∪ A_i)| denotes the number of inactive nodes in community C_i. cd(u, C_i, A_i), the closeness of C_i to A_i, is a variant of community closeness that takes A_i into account. The smaller the value of cd(u, C_i, A_i), the more closely the nodes of community C_i are connected to A_i, and therefore the greater the influence of A_i on community C_i.

When influence propagates in a community, the intra-community influence depends not only on the closeness of the community but also on the number of nodes in it: more nodes in the community means that more nodes may be activated. The number of nodes in the community is therefore normalized to give the weight of the community, denoted W_Ci. Thus, the larger W_Ci and the smaller cd(u, C_i, A_i), the greater the influence of community C_i. The intra-community influence of a node is therefore measured by combining the weight and the closeness of its communities; it is denoted IC₁(u) and is defined by equation (6).

(2.2) Inter-community influence

The inter-community influence measures the influence of the activated nodes in NC(u) on the inactive nodes in NC(u). Because the connections between node u ∈ C_i and a neighbor community C_j ∈ NC(u) are sparse, the number of activated nodes A_j in each C_j is very small, and the influence of A_j on community C_j is therefore relatively small. The propagation of influence from A_j within NC(u) is therefore neglected, and the inter-community influence of node u is estimated from the influence of the community itself. As with the intra-community influence, the influence of community C_j is measured by combining the weight and the closeness of the community; it is denoted I(C_j) and is defined as:

The inter-community influence of node u ∈ C_i is denoted IC₂(u) and can be approximated by equation (8).

The intra-community and inter-community influences are then integrated into the community influence of node u, denoted f₂(u):

f₂(u)＝α·IC₁(u)+β·IC₂(u) (9)

(3) Total influence

The multi-neighbor influence and the community influence are integrated into the total influence of node u in the whole network, denoted f(u), so that the influence of the node is measured more comprehensively:

f(u) = f₁(u) + f₂(u) = [IN₁(u) + IN₂(u)] + [α·IC₁(u) + β·IC₂(u)] (10).