CN103605793A - Heterogeneous social network community detection method based on genetic algorithm - Google Patents
- Publication number: CN103605793A
- Application number: CN201310651893.1A
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
The invention discloses a heterogeneous social network community detection method based on a genetic algorithm. It mainly addresses the problem in the prior art that the accuracy of the detected community structure drops markedly when the social network data and relations are large in scale. The method comprises the following steps: constructing an adjacency matrix describing the heterogeneous social network from the number of nodes in the network and the relations among the nodes; randomly generating symbol-encoded individuals according to the adjacency matrix; evaluating the quality of each individual using the improved modularity density as the fitness function; optimizing the individuals with a genetic algorithm according to their fitness function values; and mapping the optimized individual with the highest fitness function value back onto the corresponding heterogeneous network and decoding it to obtain the community partition. Experimental results show that the method effectively detects the community structure of a heterogeneous social network with high accuracy, and that it can be used for community detection in large-scale heterogeneous social networks.
Description
Technical field
The invention belongs to the field of social network computing, and in particular relates to a community detection method for heterogeneous social networks, which can be used for structural studies of complex social systems and large-scale social networks.
Background art
A social system is a system formed by the economic, political, and cultural relations among people in society; families, political parties, and communities are all social systems at different levels. A social system is a typical complex system and can be treated as a complex network: the entities in the system are abstracted as nodes and the relations between entities as edges between nodes, giving a network made up of nodes and edges. A complex network obtained by abstracting a social system in this way is called a social network.
Community detection is an important research direction in complex networks. In recent years it has attracted wide attention in computer science, biology, sociology, and economics, and has shown practical value. A community in a complex network is a group of nodes that are similar to one another and differ markedly from most other nodes in the network. Community structure means that connections within a community are dense while connections between communities are sparse. The goal of community detection is to find and reveal the intrinsic community structure of a complex network, which helps in understanding and inferring the structure and function of the whole network. Community structure can be applied to practical problems such as protein function identification, metabolic pathway prediction, web community mining, and link prediction.
The social networks usually studied consist of a single type of node, but real social networks are more complex and may contain more than one type of node. A social network containing more than one type of node is called a heterogeneous social network. For example, in a film rating and tagging system, three types of entity (films, tags, and users) make up the whole system. When a user rates a film and attaches a tag to it, the three entities become related and the corresponding nodes are joined by an edge. Different users, different films, and different tags are not related to one another, so nodes of the same type are not joined by edges.
The community detection problem for a heterogeneous social network can be stated as follows. A heterogeneous social network containing k types of entity is represented by a graph G(V, E) consisting of a node set V and an edge set E. The node set V consists of k node subsets V_1, V_2, ..., V_k, one per node type; nodes of different types may be connected, while nodes of the same type are not. In such a network, communities are formed according to the links between nodes of different types, so that nodes of different types are densely connected within a community and sparsely connected between communities.
Most community detection methods in the existing literature study traditional social networks with a single node type. They mainly include graph partitioning methods, spectral methods, and fast algorithms; what these have in common is that they build a similarity matrix and obtain the community partition by solving for eigenvectors. None of them considers networks with several node types. In practice, however, node diversity in social networks is common and cannot be ignored, so traditional methods cannot be applied to community detection in heterogeneous social networks. One existing method does target heterogeneous social networks: multi-relational clustering, which computes similarities from the relations among several types of entity and clusters the entities according to those similarities. Although this method can partition a heterogeneous social network into communities, it has a scalability problem: because such networks have many types, large volumes of data, and complex relations, the accuracy of the detected community structure drops markedly as the data and the number of relations grow.
Summary of the invention
The object of the invention is to address the shortcomings of the prior art described above by proposing a heterogeneous social network community detection method based on a genetic algorithm, so that heterogeneous social network data can be classified or clustered, community functions can be detected and predicted, and a heterogeneous social network community structure of higher accuracy is obtained.
To achieve this object, the invention is implemented in the following steps:
1) Count the number of node types k in the heterogeneous network and the number of nodes of each type n_1, n_2, ..., n_k, giving the total number of nodes n = n_1 + n_2 + ... + n_k; from the number of nodes of each type and the connection information between nodes, build the k-dimensional adjacency matrix A describing the heterogeneous social network, where A has size n_1 × n_2 × ... × n_k;
2) Set the initial population size pn = 50 and, according to the total number of nodes n, randomly generate pn symbol-encoded individuals, which form the initial population p_0; set the crossover probability pc = 0.8, the mutation probability pm = 0.2, the initial generation number g_0 = 1, the maximum number of generations mg = 50, and the current generation number g = g_0, and let the g-th generation parent population p_g equal the initial population p_0, i.e. p_g = p_0;
3) Compute the fitness function value D of each individual in the g-th generation parent population p_g. D is defined in terms of the following quantities: m is the number of distinct gene values in the individual; V_1, V_2, ..., V_m are the node sets of the m communities obtained from the individual; V is the set of all nodes in the network, i.e. V = V_1 ∪ V_2 ∪ ... ∪ V_m; V̄_c is the complement of node set V_c in V, with 1 ≤ c ≤ m; |V_c| is the number of nodes in community V_c; L(V_c) is the number of edges within the c-th community; L(V_c, V̄_c) is the number of edges connecting the c-th community to the other communities in the network; and L(V) is the total number of edges in the network;
4) According to the fitness function value D of each individual in p_g, apply elitist preservation and tournament selection to the g-th generation parent population p_g to obtain the updated g-th generation parent population p_g';
5) Generate a first random number rand_1 and compare it with the crossover probability pc: if rand_1 < pc, go to step 6); otherwise, set the g-th generation offspring population chp_g equal to the updated g-th generation parent population, i.e. chp_g = p_g', and go to step 7);
6) Apply one-way crossover to the updated g-th generation parent population p_g' to obtain the g-th generation offspring population chp_g;
7) Apply the mutation operation to each individual R in the g-th generation offspring population chp_g to obtain the updated g-th generation offspring population chp_g';
8) Set the current generation number g = g + 1 and compare g with the maximum number of generations mg: if g ≤ mg, set the g-th generation parent population p_g equal to the updated (g-1)-th generation offspring population, i.e. p_g = chp_{g-1}', and return to step 3); otherwise, go to step 9);
9) Take the individual with the highest fitness function value in the updated (g-1)-th generation offspring population chp_{g-1}', map it back onto the corresponding heterogeneous network, and decode it to obtain the community partition.
Compared with the prior art, the invention has the following advantages:
1. The invention designs a fitness function, tailored to heterogeneous social networks, that measures the quality of a heterogeneous social network community structure. This solves the problem that the conventional modularity density function cannot measure the quality of heterogeneous network community structure, and it ensures that the community structure obtained with the invention has higher accuracy.
2. The mutation operation designed in the invention has local search capability, which makes it easier for the evolution to escape local optima; this improves the efficiency of evolution and allows the optimal solution to be obtained efficiently.
3. Compared with the existing heterogeneous network community detection method, the invention obtains a more accurate community structure in the test on the user-commodity network.
Experimental results show that the genetic-algorithm-based community detection method proposed by the invention can effectively detect the community structure of heterogeneous networks.
Brief description of the drawings
Fig. 1 is the overall flowchart of the invention;
Fig. 2 is the sub-flowchart of the mutation operation in the invention;
Fig. 3 is the topology of the user-commodity network used in the simulation;
Fig. 4 is a schematic of the adjacency matrix describing the user-commodity network;
Fig. 5 is a schematic of the community structure detected by the invention in the user-commodity network.
Detailed description
To describe the invention clearly, this example uses the user-commodity network topology given in Fig. 3; this does not limit the invention in any way, and the invention is applicable to all heterogeneous social networks.
With reference to Fig. 1, the invention is implemented in the following steps:
Step 1. Count the number of node types k in the heterogeneous network and the number of nodes of each type n_1, n_2, ..., n_k, giving the total number of nodes n = n_1 + n_2 + ... + n_k.
As shown in Fig. 3, the heterogeneous network of this example is a user-commodity network. It has 2 types of node, so the number of node types is k = 2: type-1 nodes represent users and type-2 nodes represent commodities. In the figure, squares represent the 50 type-1 nodes, i.e. n_1 = 50, and circles represent the 50 type-2 nodes, i.e. n_2 = 50, so the network has 100 nodes in total, i.e. n = 100;
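Step 2. Build the k-dimensional adjacency matrix A describing the heterogeneous social network from the number of nodes of each type and the connection information between nodes.
2a) According to the number of nodes of each type n_1, n_2, ..., n_k, set the size of the adjacency matrix A to n_1 × n_2 × ... × n_k;
In the heterogeneous network of this example, the number of type-1 nodes n_1 is 50 and the number of type-2 nodes n_2 is 50, so the 2-dimensional adjacency matrix A has size 50 × 50;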
2b) According to the connection information between nodes, determine the value of each element of the adjacency matrix A:
If the type-1 node v_{i_1}, the type-2 node v_{i_2}, ..., and the type-k node v_{i_k} are mutually connected, these k nodes are joined by an edge and the corresponding element of the adjacency matrix is A(i_1, i_2, ..., i_k) = 1;
If the type-1 node v_{i_1}, the type-2 node v_{i_2}, ..., and the type-k node v_{i_k} are not mutually connected, these k nodes are not joined by an edge and the corresponding element of the adjacency matrix is A(i_1, i_2, ..., i_k) = 0, where 1 ≤ i_1 ≤ n_1, 1 ≤ i_2 ≤ n_2, ..., 1 ≤ i_k ≤ n_k;
In this example, if the i_1-th user in the user-commodity network has a purchase record for the i_2-th commodity, then A(i_1, i_2) = 1; if there is no such purchase record, then A(i_1, i_2) = 0, where 1 ≤ i_1 ≤ 50 and 1 ≤ i_2 ≤ 50, as shown in Fig. 4. In Fig. 4, the horizontal axis gives the type-1 node index and the vertical axis the type-2 node index, and a black dot indicates that the element at the corresponding coordinates is 1.
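The construction of A for the two-type case can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the experiments: the function name `build_adjacency`, the list-of-lists representation, and the toy purchase records are all assumptions made for the example.

```python
def build_adjacency(n_users, n_items, purchases):
    """Return an n_users x n_items 0/1 matrix.

    `purchases` holds 1-based (user index, commodity index) pairs,
    mirroring the 1-based indices i_1 and i_2 used in the text.
    """
    A = [[0] * n_items for _ in range(n_users)]
    for i1, i2 in purchases:
        if not (1 <= i1 <= n_users and 1 <= i2 <= n_items):
            raise ValueError(f"index out of range: {(i1, i2)}")
        A[i1 - 1][i2 - 1] = 1  # A(i1, i2) = 1: user i1 has bought commodity i2
    return A


# Hypothetical purchase records for a toy network of 3 users and 4 commodities
toy_purchases = [(1, 1), (1, 2), (2, 2), (3, 4)]
A = build_adjacency(3, 4, toy_purchases)
for row in A:
    print(row)
```

Entries that never appear in the purchase list stay 0, which corresponds to the "no purchase record" case.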
Step 3. Set the initial population size pn = 50 and, according to the total number of nodes n, randomly generate pn symbol-encoded individuals, which form the initial population p_0.
In a symbol-encoded individual R = (r_1, r_2, ..., r_i, ..., r_n), each gene r_i takes a random integer between 1 and n, where 1 ≤ i ≤ n, and the i-th gene value r_i is the label of the community to which the i-th node v_i belongs.
In the heterogeneous network of this example, symbol encoding produces 50 individuals, and each gene r_i of each individual R = (r_1, r_2, ..., r_i, ..., r_100) takes a random integer between 1 and 100, where 1 ≤ i ≤ 100.
Step 4. Set the crossover probability pc = 0.8, the mutation probability pm = 0.2, the initial generation number g_0 = 1, the maximum number of generations mg = 50, and the current generation number g = g_0, and let the g-th generation parent population p_g equal the initial population p_0, i.e. p_g = p_0;
Step 5. Compute the fitness function value D of each individual in the g-th generation parent population p_g.
5a) From an individual in p_g, obtain the m distinct gene values in the individual and hence the node sets of the m communities, V_1, V_2, ..., V_c, ..., V_m, with 1 ≤ c ≤ m. The set of all nodes in the network is the union of these m node sets, V = V_1 ∪ V_2 ∪ ... ∪ V_m, where ∪ is the set union operator. From the c-th node set V_c and the set of all nodes V, obtain the complement V̄_c of V_c in V;
5b) From the adjacency matrix A, compute the number of edges within the c-th community, L(V_c);
5c) From the adjacency matrix A, compute the number of edges connecting the c-th community to the other communities in the network, L(V_c, V̄_c), counting the edges for which at least one of the k nodes v_{i_1}, v_{i_2}, ..., v_{i_k} lies in the c-th node set V_c and at least one other lies in its complement V̄_c;
5d) From the adjacency matrix A, compute the total number of edges in the network, L(V);
5e) From L(V_c), L(V_c, V̄_c), and L(V) computed above, compute the fitness function value D of each individual in p_g, where |V_c| is the number of nodes in the c-th node set V_c.
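The edge counts of steps 5b) to 5d) can be illustrated for the two-type (user-commodity) case. The sketch below rests on two assumptions. First, it assumes the genes of an individual are ordered with the n_1 user nodes first and the n_2 commodity nodes after them, which the text does not state. Second, it scores an individual with a modularity-density-style stand-in: internal edges minus outgoing edges, divided by community size, summed over communities. It is not the improved fitness function D of the invention, which also involves L(V).

```python
def edge_counts(A, R, n1):
    """Count, per community label, the internal edges and the edges leaving it.

    A  : n1 x n2 0/1 matrix (rows = users, columns = commodities)
    R  : individual; R[i] is the community label of node i
         (ASSUMED order: users first, then commodities)
    """
    internal, cut, size = {}, {}, {}
    for label in R:
        size[label] = size.get(label, 0) + 1
        internal.setdefault(label, 0)
        cut.setdefault(label, 0)
    total = 0  # total number of edges in the network
    for i1, row in enumerate(A):
        for i2, a in enumerate(row):
            if not a:
                continue
            total += 1
            cu, ci = R[i1], R[n1 + i2]
            if cu == ci:
                internal[cu] += 1  # both endpoints in the same community
            else:
                cut[cu] += 1       # the edge leaves the user's community
                cut[ci] += 1       # and also leaves the commodity's community
    return internal, cut, size, total


def density_score(A, R, n1):
    """Stand-in fitness (NOT the patent's D): sum of (internal - cut) / size."""
    internal, cut, size, _ = edge_counts(A, R, n1)
    return sum((internal[c] - cut[c]) / size[c] for c in size)


A = [[1, 0], [0, 1]]                       # toy network: 2 users, 2 commodities
print(density_score(A, [1, 2, 1, 2], 2))   # every edge inside a community
print(density_score(A, [1, 2, 2, 1], 2))   # every edge between communities
```

A partition that keeps connected users and commodities together scores higher than one that separates them, which is the behaviour a fitness function for this problem needs.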
Step 6. According to the fitness function value D of each individual in p_g, apply elitist preservation and tournament selection to the g-th generation parent population p_g to obtain the updated g-th generation parent population p_g'.
In the prior art, the selection operations of genetic algorithms include elitist preservation, roulette-wheel selection, tournament selection, stochastic universal sampling, and truncation selection. Elitist preservation keeps the best individual of the parent population, and tournament selection gives the resulting parent population greater diversity, so the invention uses elitist preservation together with tournament selection.
The details are as follows:
6a) Let the new population pnew be the empty set and perform elitist preservation once, i.e. put the individual with the largest fitness function value in the parent population p_g into pnew;
6b) Perform tournament selection pn-1 times: each time, randomly choose two individuals from the parent population p_g, compare their fitness function values, and put the one with the larger value into pnew;
6c) Let the updated g-th generation parent population p_g' equal the new population pnew, i.e. p_g' = pnew.
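Steps 6a) to 6c) can be sketched as below. The function name `select`, the `fitness` callable, and drawing the two tournament entrants without replacement are assumptions; the text says only that two individuals are chosen at random.

```python
import random


def select(population, fitness, rng=random):
    """Elitist preservation once, then pn - 1 binary tournaments.

    Assumes the population holds at least two individuals.
    """
    pn = len(population)
    scores = [fitness(R) for R in population]

    # 6a) copy the fittest individual into the new population
    best = max(range(pn), key=scores.__getitem__)
    pnew = [list(population[best])]

    # 6b) pn - 1 tournaments between two randomly chosen individuals
    for _ in range(pn - 1):
        a, b = rng.sample(range(pn), 2)
        winner = a if scores[a] >= scores[b] else b
        pnew.append(list(population[winner]))

    # 6c) pnew becomes the updated parent population
    return pnew


# Toy usage with a hypothetical fitness: the value of the single gene
population = [[1], [2], [3]]
updated = select(population, fitness=lambda R: R[0], rng=random.Random(1))
print(updated)
```

Copying each selected individual keeps later crossover and mutation from altering the parent population in place.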
Step 7. Generate a first random number rand_1 and compare it with the crossover probability pc: if rand_1 < pc, go to step 8; otherwise, set the g-th generation offspring population chp_g equal to the updated g-th generation parent population, i.e. chp_g = p_g', and go to step 9.
Step 8. Apply one-way crossover to the updated g-th generation parent population p_g' to obtain the g-th generation offspring population chp_g.
In the prior art, the crossover operations of genetic algorithms include single-point crossover, multi-point crossover, uniform crossover, discrete crossover, and one-way crossover. One-way crossover is better suited to symbol-encoded individuals, so the invention uses it.
One-way crossover proceeds as follows:
8a) Set the iteration counter q = 1 and let the g-th generation offspring population chp_g be the empty set;
8b) Randomly select two individuals R_1 and R_2 from the updated g-th generation parent population p_g', taking R_1 as the source individual and R_2 as the target individual. For example, suppose the total number of nodes is n = 5 and the two individuals selected from p_g' are R_1 = (3, 2, 4, 4, 2) and R_2 = (4, 1, 3, 4, 1), with R_1 as the source individual and R_2 as the target individual;
8c) Randomly choose an integer i between 1 and n and let the target gene value e equal the i-th gene value r^1_i of the source individual R_1, i.e. e = r^1_i. Let the node label set U be the empty set and compare each gene value r^1_j of R_1 with e, where 1 ≤ j ≤ n: if r^1_j = e, put the node label j into U. This gives the set of node labels to be changed, V_e = U. In the example, an integer between 1 and 5 is chosen at random, giving i = 3, so the target gene value equals the 3rd gene value of R_1, i.e. e = r^1_3 = 4. The genes of R_1 equal to e are the 3rd and the 4th, so node labels 3 and 4 are put into U, giving the set of node labels to be changed V_4 = U;
8d) For each node label j in V_e, change the corresponding j-th gene value r^2_j of the target individual R_2 to the target gene value e, i.e. r^2_j = e. This gives the new individual R_2' after crossover, which is put into the g-th generation offspring population chp_g. In the example, for node labels 3 and 4 in V_4, the 3rd and 4th gene values of R_2 are both changed to e, i.e. r^2_3 = e and r^2_4 = e, giving the new individual R_2' = (4, 1, 4, 4, 1), which is put into chp_g;
8e) Set q = q + 1 and compare the iteration counter q with the parent population size pn: if q ≤ pn, return to step 8b); otherwise, stop.
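The worked example in steps 8b) to 8d) can be checked with a short Python sketch. The function name `one_way_crossover` is an assumption, and the position i is passed in explicitly rather than drawn at random so that the example is reproducible.

```python
def one_way_crossover(source, target, i):
    """Copy the community of the source's i-th gene (1-based) onto the target.

    Every node that shares the source's i-th community label receives that
    label in a copy of the target; the target itself is left unchanged.
    """
    e = source[i - 1]                                          # target gene value e
    U = [j for j, r in enumerate(source, start=1) if r == e]   # node labels to change
    child = list(target)
    for j in U:
        child[j - 1] = e
    return child


R1 = [3, 2, 4, 4, 2]   # source individual from the example
R2 = [4, 1, 3, 4, 1]   # target individual from the example
child = one_way_crossover(R1, R2, i=3)
print(child)  # [4, 1, 4, 4, 1]
```

With i = 3 the target gene value is e = 4, nodes 3 and 4 are selected, and the result matches the new individual R_2' in the example.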
Step 9. Apply the mutation operation to each individual R in the g-th generation offspring population chp_g to obtain the updated g-th generation offspring population chp_g'.
With reference to Fig. 2, this step is carried out as follows:
9a) Set the iteration counter q = 1;
9b) Generate a second random number rand_2 and compare it with the mutation probability pm: if rand_2 < pm, go to step 9c); otherwise, go to step 9f);
9c) Let the gene value set L be the empty set. From the adjacency matrix A, find every node v_j connected to the q-th node v_q, and put the gene value r_j of each such node in individual R into L, where 1 ≤ j ≤ n. Set the maximum local modularity max to negative infinity, i.e. max = -∞;
9d) For each gene value r_j in L, suppose the q-th gene value r_q of R were equal to r_j, and compute the local modularity f_{r_j} of community r_j under this assumption. Compare f_{r_j} with max: if f_{r_j} > max, set max = f_{r_j} and record the community label attaining the maximum local modularity as l_q = r_j;
The local modularity f_{r_j} of community r_j is defined in terms of the following quantities: V_{r_j} is the node set of community r_j, where 1 ≤ c ≤ n; d_{i_t} is the degree of node v_{i_t}, i.e. the number of edges incident to v_{i_t}, where 1 ≤ i_t ≤ n_t and 1 ≤ t ≤ k; and |V| is the total number of edges in the network;
9e) Once the local modularity f_{r_j} has been computed for every gene value r_j in L, set the q-th gene value r_q of R to the community label l_q attaining the maximum local modularity, i.e. r_q = l_q, and go to step 9f); otherwise, return to step 9d);
9f) Set q = q + 1 and compare the iteration counter q with the total number of nodes n: if q ≤ n, return to step 9b); otherwise, stop.
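The control flow of steps 9a) to 9f) can be sketched for the two-type case as follows. Because the local modularity formula is not reproduced in this text, the scoring function is passed in as `local_score`, a hypothetical callable; the community-size score used in the usage lines is a placeholder, not the local modularity of the invention. The sketch also assumes that user nodes come first in an individual and that a node without neighbours keeps its gene.

```python
import random


def neighbours(A, n1, q):
    """0-based indices of nodes joined to node q (ASSUMED order: users first)."""
    if q < n1:
        return [n1 + i2 for i2, a in enumerate(A[q]) if a]
    return [i1 for i1, row in enumerate(A) if row[q - n1]]


def mutate(R, A, n1, pm, local_score, rng=random):
    """Neighbour-guided mutation: move a node to its best-scoring neighbour label."""
    R = list(R)
    for q in range(len(R)):                        # 9a), 9f): visit every gene
        if rng.random() >= pm:                     # 9b): mutate with probability pm
            continue
        L = {R[j] for j in neighbours(A, n1, q)}   # 9c): labels of connected nodes
        best_label, best = None, float("-inf")
        for label in sorted(L):                    # 9d): try each candidate label
            trial = R[:q] + [label] + R[q + 1:]
            score = local_score(trial, label)
            if score > best:
                best, best_label = score, label
        if best_label is not None:                 # 9e): keep the best label
            R[q] = best_label
    return R


# Placeholder score (hypothetical): size of the candidate community
A = [[1, 0], [0, 1]]
size_score = lambda trial, label: trial.count(label)
mutated = mutate([1, 2, 3, 4], A, n1=2, pm=1.0, local_score=size_score)
print(mutated)
```

Restricting candidates to the labels of connected nodes is what gives the operator its local search character: a node can only move into a community it already has an edge to.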
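Step 10. Set the current generation number g = g + 1 and compare g with the maximum number of generations mg: if g ≤ mg, set the g-th generation parent population p_g equal to the updated (g-1)-th generation offspring population, i.e. p_g = chp_{g-1}', and return to step 5; otherwise, go to step 11.
Step 11. Take the individual with the highest fitness function value in the updated (g-1)-th generation offspring population chp_{g-1}', map it back onto the corresponding heterogeneous network, and decode it to obtain the community partition.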
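Decoding an individual into communities amounts to grouping node indices by gene value. A minimal sketch for the two-type case follows; the function name `decode`, the `users`/`items` keys, and the users-first gene order are assumptions made for illustration.

```python
from collections import defaultdict


def decode(R, n1):
    """Group 1-based node indices by community label, split by node type."""
    communities = defaultdict(lambda: {"users": [], "items": []})
    for idx, label in enumerate(R):
        if idx < n1:
            communities[label]["users"].append(idx + 1)        # type-1 node
        else:
            communities[label]["items"].append(idx - n1 + 1)   # type-2 node
    return dict(communities)


partition = decode([3, 4, 3, 4], n1=2)
print(partition)
```

The number of keys in the result is the number of communities m, i.e. the number of distinct gene values in the individual.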
The effect of the invention can be verified by the following simulation experiments:
1. Experimental environment and evaluation criterion
Experimental environment: the processor is an Intel(R) Core(TM)2 Duo CPU E6550 at 2.33 GHz, with 1.99 GB of memory and a 120 GB hard disk; the operating system is Microsoft Windows 7 and the programming environment is MATLAB 7.13.
The experiments use normalized mutual information (NMI) as the evaluation criterion for the quality of the community structure detected by each method. NMI is computed from the following quantities: C_0 denotes the true community structure, C_e denotes the community structure detected in the experiment, and H(C) denotes the Shannon entropy of community structure C. If the detected community structure is identical to the true one, NMI takes its maximum value 1; if the two are completely independent, NMI takes its minimum value 0.
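For illustration, an entropy-based NMI can be computed as in the sketch below. It uses one common normalization, 2·I(C_0; C_e) / (H(C_0) + H(C_e)) with I(C_0; C_e) = H(C_0) + H(C_e) - H(C_0, C_e); that this matches the exact formula used in the experiments is an assumption. Function names are illustrative.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a list of community labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(c0, ce):
    """NMI of two labelings of the same nodes (ASSUMED normalization, see above)."""
    h0, he = entropy(c0), entropy(ce)
    joint = entropy(list(zip(c0, ce)))   # entropy of the joint labeling
    if h0 + he == 0:                     # both labelings put all nodes in one community
        return 1.0
    return 2 * (h0 + he - joint) / (h0 + he)


true_labels = [1, 1, 2, 2]
print(nmi(true_labels, [5, 5, 7, 7]))   # same partition under different label names
print(nmi(true_labels, [1, 2, 1, 2]))   # partition independent of the true one
```

The score depends only on how nodes are grouped, not on the label values themselves, so a detected partition does not need to reuse the true labels to reach the maximum.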
2. Experimental content and analysis of results
Simulation 1. The method of the invention is used to detect the community structure of the user-commodity network shown in Fig. 3. The result is shown in Fig. 5, where nodes of different gray levels belong to different communities.
As Fig. 5 shows, 3 communities are detected in the user-commodity heterogeneous social network. Comparing Fig. 5 with Fig. 3 shows that the community structure obtained places densely connected nodes of different types in the same community and sparsely connected nodes of different types in different communities.
Using the evaluation criterion above, the normalized mutual information of the community structure detected by the invention is NMI_1 = 1, i.e. the detected community structure has very high accuracy. The invention is therefore an effective community detection method.
Simulation 2. The Multicomm method proposed by Xutao Li et al. is used to detect communities in the user-commodity network shown in Fig. 3, and the same evaluation criterion gives NMI_2 = 0.9315 for the community structure it detects.
Comparing NMI_2 = 0.9315 obtained by the Multicomm method with NMI_1 = 1 obtained by the invention shows that the community structure detected by the invention is more accurate, and that the invention performs better than the Multicomm method in this experiment.
Claims (6)
1. A heterogeneous social network community detection method based on a genetic algorithm, comprising the following steps:
1) Count the number of node types k in the heterogeneous network and the number of nodes of each type n_1, n_2, ..., n_k, giving the total number of nodes n = n_1 + n_2 + ... + n_k; from the number of nodes of each type and the connection information between nodes, build the k-dimensional adjacency matrix A describing the heterogeneous social network, where A has size n_1 × n_2 × ... × n_k;
2) Set the initial population size pn = 50 and, according to the total number of nodes n, randomly generate pn symbol-encoded individuals, which form the initial population p_0; set the crossover probability pc = 0.8, the mutation probability pm = 0.2, the initial generation number g_0 = 1, the maximum number of generations mg = 50, and the current generation number g = g_0, and let the g-th generation parent population p_g equal the initial population p_0, i.e. p_g = p_0;
3) Compute the fitness function value D of each individual in the g-th generation parent population p_g. D is defined in terms of the following quantities: m is the number of distinct gene values in the individual; V_1, V_2, ..., V_m are the node sets of the m communities obtained from the individual; V is the set of all nodes in the network, i.e. V = V_1 ∪ V_2 ∪ ... ∪ V_m; V̄_c is the complement of node set V_c in V, with 1 ≤ c ≤ m; |V_c| is the number of nodes in community V_c; L(V_c) is the number of edges within the c-th community; L(V_c, V̄_c) is the number of edges connecting the c-th community to the other communities in the network; and L(V) is the total number of edges in the network;
4) According to the fitness function value $D$ of each individual in the $g$-th generation parent population $p_g$, perform an elite retention operation and a tournament selection operation on $p_g$ to obtain the updated $g$-th generation parent population $p_g'$;
5) Randomly generate a first random number $rand_1$ and compare it with the crossover probability $pc$: if $rand_1 < pc$, go to step 6); otherwise, set the $g$-th generation offspring population $chp_g$ equal to the updated $g$-th generation parent population $p_g'$, i.e. $chp_g = p_g'$, and go to step 7);
6) Perform a one-way crossover operation on the updated $g$-th generation parent population $p_g'$ to obtain the $g$-th generation offspring population $chp_g$;
7) Perform a mutation operation on each individual $R$ in the $g$-th generation offspring population $chp_g$ to obtain the updated $g$-th generation offspring population $chp_g'$;
8) Set the current generation number $g = g + 1$ and compare $g$ with the maximum generation number $mg$: if $g \le mg$, let the $g$-th generation parent population $p_g$ equal the updated $(g-1)$-th generation offspring population $chp_{g-1}'$, i.e. $p_g = chp_{g-1}'$, and return to step 3); otherwise, go to step 9);
9) Take the individual with the highest fitness function value in the updated $(g-1)$-th generation offspring population $chp_{g-1}'$, map this individual back onto the corresponding heterogeneous network, and decode it to obtain the detected community structure.
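The control flow of steps 2) to 9) can be sketched as a plain generational loop. This is a minimal illustration, not the patent's implementation: the function name `run_ga` and the callables `fitness`, `select`, `crossover` and `mutate` are hypothetical placeholders that the caller supplies, and only the parameter values (`pn`, `pc`, `pm`, `mg`) come from the claim.

```python
import random

def run_ga(n, fitness, select, crossover, mutate,
           pn=50, pc=0.8, pm=0.2, mg=50, seed=0):
    """Generational loop of steps 2) to 9); the operators are caller-supplied."""
    rng = random.Random(seed)
    # Step 2: pn random individuals, each gene a community label in 1..n.
    parents = [[rng.randint(1, n) for _ in range(n)] for _ in range(pn)]
    g = 1
    while True:
        # Steps 3-4: evaluate fitness, then elite retention + tournament selection.
        scores = [fitness(ind) for ind in parents]
        selected = select(parents, scores, rng)
        # Steps 5-6: with probability pc the whole population is crossed over,
        # otherwise the offspring population is a copy of the selected parents.
        if rng.random() < pc:
            children = crossover(selected, rng)
        else:
            children = [ind[:] for ind in selected]
        # Step 7: mutate every offspring individual.
        children = [mutate(ind, pm, rng) for ind in children]
        # Step 8: advance the generation counter and test the stop condition.
        g += 1
        if g > mg:
            break
        parents = children
    # Step 9: the fittest individual of the last offspring population.
    return max(children, key=fitness)
```

Passing the operators as arguments keeps the loop independent of how the fitness, selection, crossover and mutation are actually defined in the dependent claims.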
2. The heterogeneous social network community detection method based on a genetic algorithm as claimed in claim 1, wherein building the $k$-dimensional adjacency matrix $A$ describing the heterogeneous social network from the number of nodes of each type and the connection information between nodes, as described in step 1), is carried out according to the following rules:
If the type-1 node $v_{i_1}$, the type-2 node $v_{i_2}$, ..., and the type-$k$ node $v_{i_k}$ are mutually connected, then an edge connects these $k$ nodes and the corresponding adjacency matrix element is $A(i_1, i_2, \ldots, i_k) = 1$, where $1 \le i_1 \le n_1$, $1 \le i_2 \le n_2$, ..., $1 \le i_k \le n_k$;
If the type-1 node $v_{i_1}$, the type-2 node $v_{i_2}$, ..., and the type-$k$ node $v_{i_k}$ are not mutually connected, then no edge connects these $k$ nodes and the corresponding adjacency matrix element is $A(i_1, i_2, \ldots, i_k) = 0$.
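The rule above can be illustrated with a small sketch. It assumes the connection information arrives as a list of $k$-tuples of 1-based indices, one index per node type; the names `build_adjacency` and `to_dense` are hypothetical, and a sparse set is used in place of a dense $n_1 \times n_2 \times \cdots \times n_k$ array purely for convenience.

```python
from itertools import product

def build_adjacency(sizes, hyperedges):
    """Return a lookup A(i_1, ..., i_k) -> 0 or 1 for a k-type network.

    sizes      -- [n_1, ..., n_k], the number of nodes of each type
    hyperedges -- iterable of k-tuples (i_1, ..., i_k), 1-based, one index per type
    """
    k = len(sizes)
    present = set()
    for edge in hyperedges:
        if len(edge) != k:
            raise ValueError("each hyperedge needs one index per node type")
        if any(not 1 <= i <= n_t for i, n_t in zip(edge, sizes)):
            raise ValueError(f"index out of range in {edge}")
        present.add(tuple(edge))

    def A(*idx):
        # 1 if the k nodes are mutually connected, 0 otherwise.
        return 1 if tuple(idx) in present else 0

    return A

def to_dense(sizes, A):
    """Materialise A over all n_1 * ... * n_k index tuples as a dict."""
    ranges = [range(1, n_t + 1) for n_t in sizes]
    return {idx: A(*idx) for idx in product(*ranges)}
```

For example, with `sizes = [2, 3, 2]` and the two hyperedges `(1, 2, 1)` and `(2, 3, 2)`, `A(1, 2, 1)` is 1, `A(1, 1, 1)` is 0, and the dense form has 12 entries of which 2 are non-zero.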
3. The heterogeneous social network community detection method based on a genetic algorithm as claimed in claim 1, wherein randomly generating $pn$ individuals using symbol encoding according to the total number of nodes $n$, as described in step 2), means that each gene $r_i$ of each individual $R = (r_1, r_2, \ldots, r_i, \ldots, r_n)$ takes a random integer between 1 and $n$, where $1 \le i \le n$ and the $i$-th gene value $r_i$ is the label of the community to which the $i$-th node $v_i$ belongs.
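A short sketch of this encoding, under the assumption that an individual is stored as a Python list whose position $i$ (1-based) holds the community label of node $v_i$. The helper names `random_individual`, `init_population` and `decode` are illustrative and do not appear in the patent.

```python
import random
from collections import defaultdict

def random_individual(n, rng):
    """One individual R = (r_1, ..., r_n); each gene is an integer in 1..n."""
    return [rng.randint(1, n) for _ in range(n)]

def init_population(pn, n, seed=None):
    """Initial population p_0 of pn random individuals."""
    rng = random.Random(seed)
    return [random_individual(n, rng) for _ in range(pn)]

def decode(individual):
    """Group 1-based node indices by their community label."""
    communities = defaultdict(list)
    for i, label in enumerate(individual, start=1):
        communities[label].append(i)
    return dict(communities)
```

For instance, `decode([3, 3, 1, 1, 3])` returns `{3: [1, 2, 5], 1: [3, 4]}`: nodes 1, 2 and 5 share community 3, and nodes 3 and 4 share community 1. Labels are arbitrary names, so the number of distinct labels in use is the number of communities.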
4. The heterogeneous social network community detection method based on a genetic algorithm as claimed in claim 1, wherein performing the elite retention operation and the tournament selection operation on the $g$-th generation parent population $p_g$, as described in step 4), is carried out according to the following steps:
4a) Set the new population $pnew$ to the empty set and perform the elite retention operation once, i.e. put the individual with the largest fitness function value in the parent population $p_g$ into $pnew$;
4b) Perform the tournament selection operation $pn - 1$ times: each time, randomly choose two individuals from the parent population $p_g$, compare their fitness function values, and put the individual with the larger fitness function value into $pnew$;
4c) Let the updated $g$-th generation parent population $p_g'$ equal the new population $pnew$, i.e. $p_g' = pnew$.
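Steps 4a) to 4c) might be sketched as follows. The function name `select` is hypothetical, fitness values are assumed to be precomputed in a list parallel to the population, and the two tournament contestants are assumed to be distinct, which the claim does not state explicitly.

```python
import random

def select(parents, scores, rng):
    """Elite retention once, then pn - 1 binary tournaments."""
    pn = len(parents)
    # 4a) copy the single fittest parent into the new population.
    best = max(range(pn), key=lambda i: scores[i])
    new_pop = [parents[best][:]]
    # 4b) pn - 1 tournaments between two randomly chosen parents.
    for _ in range(pn - 1):
        a, b = rng.sample(range(pn), 2)
        winner = a if scores[a] >= scores[b] else b
        new_pop.append(parents[winner][:])
    # 4c) the new population replaces the parent population.
    return new_pop
```

The returned population has the same size as the input, and the fittest parent is always its first member. Because the contestants are distinct here, a parent with the strictly lowest fitness can never be selected.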
5. The heterogeneous social network community detection method based on a genetic algorithm as claimed in claim 1, wherein performing the one-way crossover operation on the updated $g$-th generation parent population $p_g'$, as described in step 6), is carried out according to the following steps:
5a) Set the iteration counter $q = 1$ and set the $g$-th generation offspring population $chp_g$ to the empty set;
5b) Randomly select two individuals $R_1$ and $R_2$ from the updated $g$-th generation parent population $p_g'$, taking $R_1$ as the source individual and $R_2$ as the target individual;
5c) Randomly choose an integer $i$ between 1 and $n$, and set the target gene value $e$ equal to the $i$-th gene value $r^1_i$ of the source individual $R_1$, i.e. $e = r^1_i$; set the node label set $U$ to the empty set and compare each gene value $r^1_j$ of the source individual $R_1$ with the target gene value $e$: if $r^1_j = e$, put the node label $j$ into $U$; this gives the set of node labels to be changed, $V_e = U$, where $1 \le j \le n$;
5d) For each node label $j$ in the set $V_e$, change the corresponding $j$-th gene value $r^2_j$ of the target individual $R_2$ to the target gene value $e$, i.e. $r^2_j = e$, to obtain the new individual $R_2'$ after crossover, and put $R_2'$ into the $g$-th generation offspring population $chp_g$;
5e) Set $q = q + 1$ and compare the iteration counter $q$ with the parent population size $pn$: if $q \le pn$, return to step 5b); otherwise, stop.
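The one-way crossover of steps 5a) to 5e) can be sketched as below. The names `one_way_cross` and `crossover` are illustrative, individuals are assumed to be lists of community labels, and the two selected individuals are assumed to be distinct.

```python
import random

def one_way_cross(source, target, i):
    """Steps 5c)-5d): copy the community of source gene i (1-based) onto target."""
    e = source[i - 1]                 # target gene value e
    child = target[:]                 # R_2' starts as a copy of R_2
    for j, label in enumerate(source):
        if label == e:                # node j+1 belongs to community e in R_1
            child[j] = e
    return child

def crossover(parents, rng):
    """Steps 5a), 5b), 5e): build an offspring population of the same size."""
    n = len(parents[0])
    offspring = []
    for _ in range(len(parents)):
        r1, r2 = rng.sample(parents, 2)   # source and target individuals
        i = rng.randint(1, n)
        offspring.append(one_way_cross(r1, r2, i))
    return offspring
```

For example, `one_way_cross([1, 1, 2, 2, 3], [4, 5, 4, 5, 4], 1)` returns `[1, 1, 4, 5, 4]`: community 1 of the source (nodes 1 and 2) is transplanted into the target, and the remaining genes of the target are untouched. Only one of the two parents is modified, which is what makes the operator one-way.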
6. The heterogeneous social network community detection method based on a genetic algorithm as claimed in claim 1, wherein performing the mutation operation on each individual $R$ in the $g$-th generation offspring population $chp_g$, as described in step 7), is carried out according to the following steps:
6a) Set the iteration counter $q = 1$;
6b) Randomly generate a second random number $rand_2$ and compare it with the mutation probability $pm$: if $rand_2 < pm$, go to step 6c); otherwise, go to step 6f);
6c) Set the gene value set $L$ to the empty set; from the adjacency matrix $A$, obtain each node $v_j$ that is connected to the $q$-th node $v_q$, and put the $j$-th gene value $r_j$ corresponding to each node $v_j$ in individual $R$ into the gene value set $L$, where $1 \le j \le n$; set the maximum local modularity $max$ to negative infinity, i.e. $max = -\infty$;
6d) For each gene value $r_j$ in the gene value set $L$, suppose that the $q$-th gene value $r_q$ of individual $R$ equals $r_j$, calculate the local modularity $f_{r_j}$ of the $r_j$-th community under this supposition, and compare $f_{r_j}$ with the maximum local modularity $max$: if $f_{r_j} > max$, set $max = f_{r_j}$ and set the community label that attains the maximum local modularity to $l_q = r_j$;
where the local modularity $f_{r_j}$ of the $r_j$-th community is computed as follows:
[formula image for $f_{r_j}$ not reproduced in this text]
where $V_{r_j}$ is the node set of the $r_j$-th community, with $1 \le c \le n$; $d_{i_t}$ is the degree of node $v_{i_t}$, i.e. the number of edges connected to node $v_{i_t}$, with $1 \le i_t \le n_t$ and $1 \le t \le k$; and $|V|$ is the total number of edges in the network;
6e) If the local modularity $f_{r_j}$ has been calculated for every gene value $r_j$ in the gene value set $L$, set the $q$-th gene value $r_q$ of individual $R$ to the community label $l_q$ that attains the maximum local modularity, i.e. $r_q = l_q$, and go to step 6f); otherwise, return to step 6d);
6f) Set $q = q + 1$ and compare the iteration counter $q$ with the total number of nodes $n$: if $q \le n$, return to step 6b); otherwise, stop.
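The mutation loop of steps 6a) to 6f) can be sketched as follows. Because the local modularity formula is not reproduced in this text, the score is passed in as a caller-supplied function `local_score(individual, label)`; the helper `internal_edge_count` is only a stand-in used to make the example runnable and is not the patent's $f_{r_j}$. The neighbour lists are assumed to have been derived from the adjacency matrix $A$ beforehand, and all names are hypothetical.

```python
import random

def mutate(individual, pm, neighbors, local_score, rng):
    """Neighbour-guided mutation; returns a new individual."""
    R = individual[:]
    n = len(R)
    for q in range(1, n + 1):                     # 6a), 6f): loop over genes
        if rng.random() >= pm:                    # 6b): skip with prob. 1 - pm
            continue
        # 6c): labels currently held by the nodes connected to node q.
        labels = {R[j - 1] for j in neighbors.get(q, ())}
        best_score, best_label = float("-inf"), None
        for label in sorted(labels):              # 6d): try each candidate
            trial = R[:]
            trial[q - 1] = label
            score = local_score(trial, label)
            if score > best_score:
                best_score, best_label = score, label
        if best_label is not None:                # 6e): keep the best label
            R[q - 1] = best_label
    return R

def internal_edge_count(edges):
    """Stand-in score (an assumption): edges with both ends in `label`."""
    def score(individual, label):
        return sum(1 for u, v in edges
                   if individual[u - 1] == label and individual[v - 1] == label)
    return score
```

On the path graph 1-2-3 with `neighbors = {1: [2], 2: [1, 3], 3: [2]}`, mutating `[1, 2, 2]` with `pm = 1.0` yields `[2, 2, 2]`, since node 1 can only adopt the label of its single neighbour. A node without neighbours keeps its label, because the candidate set is then empty.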
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310651893.1A CN103605793A (en) | 2013-12-04 | 2013-12-04 | Heterogeneous social network community detection method based on genetic algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310651893.1A CN103605793A (en) | 2013-12-04 | 2013-12-04 | Heterogeneous social network community detection method based on genetic algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103605793A true CN103605793A (en) | 2014-02-26 |
Family
ID=50124015
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310651893.1A Pending CN103605793A (en) | 2013-12-04 | 2013-12-04 | Heterogeneous social network community detection method based on genetic algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103605793A (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103942308A (en) * | 2014-04-18 | 2014-07-23 | 中国科学院信息工程研究所 | Method and device for detecting large-scale social network communities |
CN103942308B (en) * | 2014-04-18 | 2017-04-05 | 中国科学院信息工程研究所 | The detection method and device of extensive myspace |
CN104199884B (en) * | 2014-08-19 | 2017-09-22 | 东北大学 | A kind of social networks point of observation choosing method preferential based on R coverage rates |
CN104199884A (en) * | 2014-08-19 | 2014-12-10 | 东北大学 | Social networking service viewpoint selection method based on R coverage rate priority |
CN104318306A (en) * | 2014-10-10 | 2015-01-28 | 西安电子科技大学 | Non-negative matrix factorization and evolutionary algorithm optimized parameter based self-adaption overlapping community detection method |
CN104318306B (en) * | 2014-10-10 | 2017-03-15 | 西安电子科技大学 | Self adaptation based on Non-negative Matrix Factorization and evolution algorithm Optimal Parameters overlaps community detection method |
CN104484365A (en) * | 2014-12-05 | 2015-04-01 | 华中科技大学 | Method and system for predicting social relation in multi-source heterogeneous networks |
CN104484365B (en) * | 2014-12-05 | 2017-12-12 | 华中科技大学 | In a kind of multi-source heterogeneous online community network between network principal social relationships Forecasting Methodology and system |
CN109166022A (en) * | 2018-08-01 | 2019-01-08 | 浪潮通用软件有限公司 | Screening technique based on fuzzy neural network and genetic algorithm |
CN109150237A (en) * | 2018-08-15 | 2019-01-04 | 桂林电子科技大学 | A kind of robust multi-user detector design method |
CN109726001A (en) * | 2018-12-29 | 2019-05-07 | 中山大学 | A kind of genetic algorithm for heterogeneous system |
CN110334264A (en) * | 2019-06-27 | 2019-10-15 | 北京邮电大学 | A kind of community detection method and device for isomery dynamic information network |
WO2021227130A1 (en) * | 2020-05-13 | 2021-11-18 | 深圳计算科学研究院 | Heterogeneous network community detection method, device, computer apparatus, and storage medium |
CN112351033A (en) * | 2020-11-06 | 2021-02-09 | 北京石油化工学院 | Deep learning intrusion detection method based on double-population genetic algorithm in industrial control network |
CN112351033B (en) * | 2020-11-06 | 2022-09-13 | 北京石油化工学院 | Deep learning intrusion detection method based on double-population genetic algorithm in industrial control network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103605793A (en) | Heterogeneous social network community detection method based on genetic algorithm | |
Fronza et al. | Failure prediction based on log files using random indexing and support vector machines | |
Liang et al. | Structure of the global virtual carbon network: revealing important sectors and communities for emission reduction | |
CN104731962A (en) | Method and system for friend recommendation based on similar associations in social network | |
CN104134159A (en) | Method for predicting maximum information spreading range on basis of random model | |
CN105868334A (en) | Personalized film recommendation method and system based on feature augmentation | |
CN104200272A (en) | Complex network community mining method based on improved genetic algorithm | |
De Andrés et al. | A HYBRID DEVICE OF SELF ORGANIZING MAPS (SOM) AND MULTIVARIATE ADAPTIVE REGRESSION SPLINES (MARS) FOR THE FORECASTING OF FIRMS'BANKRUPTCY. | |
Deshpande et al. | PLIT: An alignment-free computational tool for identification of long non-coding RNAs in plant transcriptomic datasets | |
CN104268629A (en) | Complex network community detecting method based on prior information and network inherent information | |
CN102722578B (en) | Unsupervised cluster characteristic selection method based on Laplace regularization | |
Islambekov et al. | Harnessing the power of topological data analysis to detect change points | |
Bi et al. | MM-GNN: Mix-moment graph neural network towards modeling neighborhood feature distribution | |
Atkinson et al. | Cluster detection and clustering with random start forward searches | |
CN101142479A (en) | System, method and computer program for non-binary sequence comparison | |
Wu et al. | Statistical learning techniques for the estimation of lifeline network performance and retrofit selection | |
Li et al. | Adaptive subgraph neural network with reinforced critical structure mining | |
Zhou et al. | Learning to correlate accounts across online social networks: An embedding-based approach | |
Wu et al. | Automatic network clustering via density-constrained optimization with grouping operator | |
Kuang et al. | Coarformer: Transformer for large graph via graph coarsening | |
Wind et al. | Link prediction in weighted networks | |
Jiao et al. | Generative evolutionary anomaly detection in dynamic networks | |
CN104899283A (en) | Frequent sub-graph mining and optimizing method for single uncertain graph | |
CN103678709B (en) | Recommendation system attack detection method based on time series data | |
CN112561599A (en) | Click rate prediction method based on attention network learning and fusing domain feature interaction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20140226 |