Summary of the invention
To address the above-mentioned deficiencies in the prior art, the present invention proposes a community discovery method based on clique random walk.
The idea behind the invention is as follows. Maximal cliques are first found in the social network, and the community to which each maximal clique belongs is then determined. Replacing the maximal cliques in each community with all the social network nodes they contain overcomes the inability of the prior art to find overlapping communities, so the invention is able to discover overlapping communities. Changing the value of the multi-scale parameter yields different community attraction values and therefore community structures at different levels, which overcomes the inability of the prior art to obtain community structures at different levels. Calculating the random walk probability from each maximal clique to each community, and from it the community to which each maximal clique belongs, yields the social network nodes in each community; this overcomes the instability of the communities obtained by the prior art and improves the stability of the community discovery result.
The invention is implemented through the following steps:
(1) Find the maximal cliques in the social network.
(2) Process redundant maximal cliques and isolated nodes:
2a) Select any one maximal clique from the maximal cliques found;
2b) Compare the number of nodes in each maximal clique with the number of nodes in the selected maximal clique, and find the maximal cliques that have more nodes than the selected maximal clique;
2c) Using the containment degree formula, calculate the containment degree of each maximal clique that has more nodes than the selected maximal clique, relative to the selected maximal clique;
2d) Judge whether the containment degree is greater than 0.75; if so, delete the selected maximal clique from the set of maximal cliques found; otherwise, retain the selected maximal clique and perform step 2e);
2e) Judge whether all the maximal cliques in the social network have been processed; if so, perform step 2f); otherwise, perform step 2a);
2f) Traverse all the maximal cliques and find the isolated nodes that are not included in any maximal clique;
2g) Add each isolated node to the maximal clique that contains the largest number of that node's neighbor nodes;
(3) Construct the maximal clique network:
3a) Select any two maximal cliques and, using the overlap degree formula, calculate the overlap degree of the two selected maximal cliques;
3b) Judge whether the overlap degree of every pair of maximal cliques has been calculated; if so, perform step 3c); otherwise, perform step 3a);
3c) Create an edge between every two maximal cliques whose overlap degree is greater than 0.2, thereby forming the maximal clique network;
(4) Initialize the closeness matrix:
4a) Select any two maximal cliques and, using the closeness formula, calculate the closeness of the two selected maximal cliques;
4b) Judge whether the closeness of every pair of maximal cliques has been calculated; if so, perform step (5); otherwise, perform step 4a);
(5) Find the community to which each maximal clique belongs:
5a) Construct one initial community for each maximal clique, so that each initial community contains exactly one maximal clique;
5b) Select any one maximal clique and obtain the set of communities to which its neighbor maximal cliques belong;
5c) Calculate the community attraction that each community in that set exerts on the selected maximal clique;
5d) Calculate the random walk probability from the selected maximal clique to each community in that set, and obtain the maximum random walk probability;
5e) Move the selected maximal clique from its own community to the community corresponding to the maximum random walk probability, which gives the community number of the selected maximal clique in the current iteration;
5f) Judge whether a community number has been obtained for every maximal clique; if so, perform step 5g); otherwise, perform step 5b);
5g) For all maximal cliques, judge whether the community number of each is identical to the one calculated in the previous iteration; if so, perform step (6); otherwise, perform step 5b);
(6) Obtain the social network nodes in each community:
Replace the maximal cliques in each community with all the social network nodes contained in those maximal cliques, which gives the social network nodes in each community.
Compared with the prior art, the present invention has the following advantages:
First, when obtaining the social network nodes in each community, the invention first finds the maximal cliques in the social network, then finds the community to which each maximal clique belongs, and finally replaces the maximal cliques in each community with all the social network nodes they contain. This overcomes the inability of the prior art to find overlapping communities and gives the invention the ability to discover overlapping communities.
Second, when calculating the community attraction, the invention obtains different community attraction values by changing the value of the multi-scale parameter, and thereby obtains community structures at different levels. This overcomes the inability of the prior art to obtain community structures at different levels.
Third, when calculating the random walk probability, the invention first finds the maximal cliques in the social network, then calculates the random walk probability from each maximal clique to each community and finds the community to which each maximal clique belongs, thereby obtaining the social network nodes in each community. This overcomes the instability of the communities obtained by the prior art and improves the stability of the community discovery result.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings.
With reference to Fig. 1, the invention is implemented through the following steps:
Step 1: find the maximal cliques in the social network.
Sub-step 1: select any node v in the social network, assign the neighbor node set of node v to set Y, and add node v to set X;
Sub-step 2: sort the nodes in set Y by label in ascending order;
Sub-step 3: among the nodes of set Y, take the node with the smallest label as node t, and add node t to set X;
Sub-step 4: for each node in the social network, judge whether all the nodes in set X are in its neighbor node set; if so, add that node to the set of nodes that can form a clique with the nodes in set X; otherwise, do not add it;
Sub-step 5: judge whether the set of nodes that can form a clique with the nodes in set X is empty; if so, perform sub-step 7; otherwise, perform sub-step 6;
Sub-step 6: add node t to set X, reassign the set of nodes that can form a clique with the nodes in set X to set Y, and perform sub-step 2;
Sub-step 7: judge whether the nodes in set X are contained in any other clique; if so, output nothing; otherwise, the nodes in set X form a maximal clique, and the nodes in set X are output;
Sub-step 8: judge whether all the nodes in set Y have been processed; if so, perform sub-step 9; otherwise, perform sub-step 3;
Sub-step 9: judge whether all the nodes in the social network have been processed; if so, perform Step 2; otherwise, perform sub-step 1.
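As an illustration of what Step 1 produces, the following minimal sketch enumerates maximal cliques with the standard Bron-Kerbosch algorithm with pivoting. This is a swapped-in, well-known technique, not the sub-step procedure described above, although both are meant to output the same kind of result. The function name `maximal_cliques`, the adjacency-dictionary input format and the toy graph are assumptions made for the example.

```python
def maximal_cliques(adj):
    """Return all maximal cliques of an undirected graph.

    adj maps each node to the set of its neighbors (assumed input format).
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend r,
        # x: nodes already explored that would also extend r.
        if not p and not x:
            cliques.append(frozenset(r))  # r cannot be extended: it is maximal
            return
        # Pivot on the node with the most neighbors in p to prune branches.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in sorted(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Toy graph (hypothetical): a triangle 1-2-3 with a pendant edge 3-4.
toy_adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
toy_cliques = maximal_cliques(toy_adj)
```

On the toy graph the sketch finds two maximal cliques, the triangle and the pendant edge, which share node 3.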
Step 2: process redundant maximal cliques and isolated nodes.
2a) Select any one maximal clique from the maximal cliques found;
2b) Compare the number of nodes in each maximal clique with the number of nodes in the selected maximal clique, and find the maximal cliques that have more nodes than the selected maximal clique;
2c) Using the containment degree formula, calculate the containment degree of each maximal clique that has more nodes than the selected maximal clique, relative to the selected maximal clique; the containment degree formula is as follows:
where $C(i, j)$ denotes the containment degree of maximal clique $j$ relative to maximal clique $i$, $V_i$ denotes the set of nodes contained in maximal clique $i$, $V_j$ denotes the set of nodes contained in maximal clique $j$, and $\cap$ denotes the intersection of $V_i$ and $V_j$.
2d) Judge whether the containment degree is greater than 0.75; if so, delete the selected maximal clique from the set of maximal cliques found; otherwise, retain the selected maximal clique and perform step 2e);
2e) Judge whether all the maximal cliques in the social network have been processed; if so, perform step 2f); otherwise, perform step 2a);
2f) Traverse all the maximal cliques and find the isolated nodes that are not included in any maximal clique;
2g) Add each isolated node to the maximal clique that contains the largest number of that node's neighbor nodes;
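Steps 2a) to 2g) can be sketched as follows. The containment degree used here, the share of the selected clique's nodes that also lie in the larger clique, is an assumption made for illustration; the text only states that the formula involves the intersection of $V_i$ and $V_j$. The function names and the toy cliques are likewise hypothetical.

```python
from itertools import combinations


def containment(v_i, v_j):
    # ASSUMED form: |V_i ∩ V_j| / |V_i|, where i is the selected clique.
    return len(v_i & v_j) / len(v_i)


def prune_cliques(cliques, threshold=0.75):
    """Delete each clique that is largely contained in a larger remaining clique."""
    remaining = list(cliques)
    for selected in list(cliques):
        larger = [c for c in remaining if len(c) > len(selected)]
        if any(containment(selected, c) > threshold for c in larger):
            remaining.remove(selected)
    return remaining


def attach_uncovered(adj, cliques):
    """Add every node left in no clique to the clique holding most of its neighbors."""
    cliques = [set(c) for c in cliques]
    covered = set().union(*cliques)
    for node in sorted(set(adj) - covered):
        best = max(cliques, key=lambda c: len(adj[node] & c))
        best.add(node)
    return cliques


def adjacency_from_cliques(cliques):
    # Helper for the toy example only: every pair inside a clique is an edge.
    adj = {}
    for clique in cliques:
        for u, v in combinations(sorted(clique), 2):
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj


toy_cliques = [frozenset({1, 2, 3, 4, 5, 6}), frozenset({2, 3, 4, 5, 7}), frozenset({6, 8})]
toy_adj = adjacency_from_cliques(toy_cliques)
kept = prune_cliques(toy_cliques)
final_cliques = attach_uncovered(toy_adj, kept)
```

In the toy data the second clique shares four of its five nodes with the first (0.8 > 0.75) and is deleted, which leaves node 7 uncovered; node 7 is then added to the first clique, which contains all four of its neighbors.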
Step 3: construct the maximal clique network.
3a) Select any two maximal cliques and, using the overlap degree formula, calculate the overlap degree of the two selected maximal cliques; the overlap degree formula is as follows:
$$\delta(d, c) = \frac{|V_d \cap V_c|}{\min(|V_d|, |V_c|)}$$
where $\delta(d, c)$ denotes the overlap degree of maximal clique $d$ and maximal clique $c$, $V_d$ denotes the set of nodes contained in maximal clique $d$, $V_c$ denotes the set of nodes contained in maximal clique $c$, $\cap$ denotes the intersection of $V_d$ and $V_c$, and $\min$ takes the smaller of $|V_d|$ and $|V_c|$.
3b) Judge whether the overlap degree of every pair of maximal cliques has been calculated; if so, perform step 3c); otherwise, perform step 3a);
3c) Create an edge between every two maximal cliques whose overlap degree is greater than 0.2, thereby forming the maximal clique network;
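Step 3 can be sketched as below, taking the overlap degree as the size of the intersection divided by the size of the smaller clique, which is how the symbol definitions above read. Representing the clique network as a dictionary from clique index to the set of neighboring clique indices is an assumption of the example, as are the function names and toy data.

```python
from itertools import combinations


def overlap(v_d, v_c):
    # Shared nodes relative to the smaller of the two cliques.
    return len(v_d & v_c) / min(len(v_d), len(v_c))


def build_clique_network(cliques, threshold=0.2):
    """Link clique indices whose overlap degree exceeds the threshold."""
    neighbors = {i: set() for i in range(len(cliques))}
    for i, j in combinations(range(len(cliques)), 2):
        if overlap(cliques[i], cliques[j]) > threshold:
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors


toy_cliques = [{1, 2, 3}, {3, 4, 5}, {5, 6, 7, 8, 9, 10}]
toy_network = build_clique_network(toy_cliques)
```

Here cliques 0 and 1 share one of three nodes and so do cliques 1 and 2 (overlap 1/3 in both cases), so each pair is linked, while cliques 0 and 2 share nothing and stay unconnected.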
Step 4: initialize the closeness matrix.
4a) Select any two maximal cliques and, using the closeness formula, calculate the closeness of the two selected maximal cliques; the closeness formula is as follows:
$$m_{pq} = 1 + |\Gamma_p \cap \Gamma_q|$$
where $m_{pq}$ denotes the closeness of maximal clique $p$ and maximal clique $q$, $\Gamma_p$ denotes the set of neighbor maximal cliques of maximal clique $p$, $\Gamma_q$ denotes the set of neighbor maximal cliques of maximal clique $q$, and $\cap$ denotes the intersection of $\Gamma_p$ and $\Gamma_q$.
4b) Judge whether the closeness of every pair of maximal cliques has been calculated; if so, perform Step 5; otherwise, perform step 4a);
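The closeness matrix of Step 4 follows directly from the clique network: the closeness of two cliques is one plus the number of neighbor cliques they have in common. The sketch below stores the matrix as a dictionary keyed by ordered pairs of clique indices; that layout, the function name and the toy network are assumptions of the example.

```python
def closeness_matrix(neighbors):
    """m[(p, q)] = 1 + number of neighbor cliques shared by p and q."""
    ids = sorted(neighbors)
    return {
        (p, q): 1 + len(neighbors[p] & neighbors[q])
        for p in ids
        for q in ids
        if p != q
    }


# Toy clique network (hypothetical): clique index -> neighboring clique indices.
toy_network = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
m = closeness_matrix(toy_network)
```

Cliques 0 and 1 share the neighbor 2, so their closeness is 2; cliques 2 and 3 are adjacent but share no neighbor, so their closeness is 1.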
Step 5: find the community to which each maximal clique belongs.
5a) Construct one initial community for each maximal clique, so that each initial community contains exactly one maximal clique;
5b) Select any one maximal clique and obtain the set of communities to which its neighbor maximal cliques belong;
5c) Calculate the community attraction that each community in that set exerts on the selected maximal clique; the community attraction formula is as follows:
where $A(a, k)$ denotes the community attraction of community $k$ to maximal clique $a$; $s$ denotes a neighbor maximal clique of maximal clique $a$ within community $k$; $\Gamma_a$ denotes the set of neighbor maximal cliques of maximal clique $a$; $R_k$ denotes community $k$; $\cap$ denotes the intersection of $\Gamma_a$ and $R_k$; $m_{as}$ denotes the closeness of maximal clique $a$ and maximal clique $s$, and the sum over $s$ adds up the closeness between maximal clique $a$ and each of its neighbor maximal cliques in community $R_k$; $f$ and $e$ denote two different neighbor maximal cliques of maximal clique $a$ within community $k$, $m_{fe}$ denotes the closeness of maximal clique $f$ and maximal clique $e$, and the sum over $f$ and $e$ adds up the closeness of every pair of neighbor maximal cliques of maximal clique $a$ in community $R_k$; $b$ and $w$ denote two different maximal cliques in community $k$, $m_{bw}$ denotes the closeness of maximal clique $b$ and maximal clique $w$, and the sum over $b$ and $w$ adds up the closeness of every pair of maximal cliques in community $R_k$; $t$ denotes the multi-scale parameter, and changing the value of $t$ yields community structures at different levels.
5d) Calculate the random walk probability from the selected maximal clique to each community in the set of communities to which its neighbor maximal cliques belong, and obtain the maximum random walk probability; the random walk probability formula is as follows:
$$X(a, k) = \frac{A(a, k)}{\sum_{R_h \in R'_a} A(a, h)}$$
where $X(a, k)$ denotes the random walk probability from maximal clique $a$ to community $k$, $A(a, k)$ denotes the community attraction of community $k$ to maximal clique $a$, $R_h$ denotes a community to which a neighbor maximal clique of maximal clique $a$ belongs, $R'_a$ denotes the set of communities to which the neighbor maximal cliques of maximal clique $a$ belong, $A(a, h)$ denotes the community attraction of community $h$ to maximal clique $a$, and $\sum$ adds up the community attraction exerted on maximal clique $a$ by each community in $R'_a$.
5e) Move the selected maximal clique from its own community to the community corresponding to the maximum random walk probability, which gives the community number of the selected maximal clique in the current iteration;
5f) Judge whether a community number has been obtained for every maximal clique; if so, perform step 5g); otherwise, perform step 5b);
5g) For all maximal cliques, judge whether the community number of each is identical to the one calculated in the previous iteration; if so, perform Step 6; otherwise, perform step 5b);
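The iteration in steps 5a) to 5g) can be sketched as follows. The random walk probability is taken as the attraction of a community divided by the sum of the attractions of all candidate communities, as the symbol definitions above describe. The community attraction itself is passed in as a function: the stand-in used in the demo simply sums the closeness between a clique and its neighbor cliques in the community, ignores the multi-scale parameter t, and is NOT the attraction formula of the invention. All function names, the `max_iter` cap and the toy network are assumptions of the example.

```python
def closeness(neighbors, p, q):
    # m_pq = 1 + number of neighbor cliques shared by p and q
    return 1 + len(neighbors[p] & neighbors[q])


def make_standin_attraction(neighbors):
    # Stand-in only: sum of closeness to the neighbor cliques inside the community.
    def attraction(a, members):
        return sum(closeness(neighbors, a, s) for s in neighbors[a] & members)
    return attraction


def move_probabilities(a, label, neighbors, attraction):
    """X(a, k) for every community k that holds a neighbor clique of a."""
    candidates = {label[s] for s in neighbors[a]}
    scores = {}
    for k in candidates:
        members = {c for c, lab in label.items() if lab == k}
        scores[k] = attraction(a, members)
    total = sum(scores.values())
    return {k: score / total for k, score in scores.items()}


def find_communities(neighbors, attraction, max_iter=100):
    label = {a: a for a in neighbors}      # 5a) one community per clique
    for _ in range(max_iter):
        previous = dict(label)
        for a in sorted(neighbors):        # 5b) visit each clique in turn
            if not neighbors[a]:
                continue                   # no neighbor cliques: nothing to join
            probs = move_probabilities(a, label, neighbors, attraction)
            # 5e) move to the most probable community (smallest number on ties)
            label[a] = max(sorted(probs), key=probs.get)
        if label == previous:              # 5g) numbers unchanged: stop
            break
    return label


# Toy clique network (hypothetical): two triangles of cliques joined by edge 2-3.
toy_network = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
toy_attraction = make_standin_attraction(toy_network)
toy_labels = find_communities(toy_network, toy_attraction)
```

With the stand-in attraction, the two triangles of cliques settle into two separate communities. Swapping in a different attraction function changes only the `attraction` argument, which is where the multi-scale parameter would enter.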
Step 6: obtain the social network nodes in each community.
Replace the maximal cliques in each community with all the social network nodes contained in those maximal cliques, which gives the social network nodes in each community.
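Step 6 amounts to taking, for each community, the union of the node sets of its maximal cliques. Because one node can belong to maximal cliques in different communities, overlapping nodes appear naturally. The sketch below assumes community numbers are held in a dictionary from clique index to community number; the names and toy data are hypothetical.

```python
from collections import Counter


def expand_communities(label, cliques):
    """Replace each clique in a community by the nodes it contains."""
    communities = {}
    for clique_index, community in label.items():
        communities.setdefault(community, set()).update(cliques[clique_index])
    return communities


def overlapping_nodes(communities):
    """Nodes that end up in more than one community."""
    counts = Counter(node for nodes in communities.values() for node in nodes)
    return {node for node, count in counts.items() if count > 1}


toy_cliques = [{1, 2, 3}, {2, 3, 4}, {4, 5, 6}, {5, 6, 7}]
toy_labels = {0: 0, 1: 0, 2: 2, 3: 2}
toy_communities = expand_communities(toy_labels, toy_cliques)
toy_overlap = overlapping_nodes(toy_communities)
```

In the toy data node 4 lies in a clique of each community, so it is reported as an overlapping node.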
The effect of the present invention is further described through the following simulation experiment.
1. Simulation conditions:
The simulation was carried out on an Intel(R) Xeon(R) CPU with 4 GB of memory, in the Eclipse 3.2 development environment, with the program implemented in Java. The experiment uses data from a real social network, Zachary's Karate Club network, in which the nodes represent club members and the edges represent social interactions between members. The network contains 34 members and 78 edges in total.
2. Simulation content:
The community discovery method based on clique random walk of the present invention is applied to Zachary's Karate Club network, finally yielding the social network nodes in each community.
3. Analysis of the simulation result:
Fig. 3 shows the simulation result for Zachary's Karate Club network. The numbers in the circles in Fig. 3 denote nodes. Nodes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 17, 18, 20 and 22 belong to community 1, and nodes 3, 9, 15, 16, 19, 21, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33 and 34 belong to community 2. Nodes 3 and 9 are overlapping nodes of community 1 and community 2. It can thus be seen that the present invention obtains the community discovery result of the network.
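The reported result can be checked for internal consistency with a few set operations. The node lists below are copied from the analysis above; the variable names are chosen for the example.

```python
community_1 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 17, 18, 20, 22}
community_2 = {3, 9, 15, 16, 19, 21, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34}

shared_nodes = community_1 & community_2    # nodes in both communities
covered_nodes = community_1 | community_2   # nodes in at least one community
all_members = set(range(1, 35))             # the 34 club members
```

The two communities together cover all 34 members, and their intersection is exactly nodes 3 and 9, in agreement with the overlapping nodes stated above.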