CN109242713A - Three-way decision community division method and device based on random-walk boundary region processing - Google Patents
- Publication number
- CN109242713A (application CN201811045237.6A)
- Authority
- CN
- China
- Prior art keywords
- node
- corporations
- division
- boundary region
- granulation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Business, Economics & Management (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Economics (AREA)
- General Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a three-way decision community division method based on random-walk boundary region processing. The method includes: 1) obtaining an abstract network; 2) performing initial granulation and then cluster granulation on the abstract network, so that the abstract network is divided into multiple communities, and taking the structural relationship of the divided communities as a first division result; 3) obtaining the first division result of the abstract network that corresponds to the maximum overlapping-community modularity; 4) dividing all nodes in the boundary region using a random walk algorithm; 5) processing each node in the updated boundary region using the three-way decision method to obtain a second division result, and taking the second division result as the target division result. The invention also discloses a three-way decision community division device based on random-walk boundary region processing. Applying the invention improves the precision of community division, which helps in analyzing and understanding network structure and in optimizing and managing the network.
Description
Technical field
The present invention relates to a community division method and device, and more particularly to a three-way decision community division method and device based on random-walk boundary region processing.
Background
Networks are ubiquitous in real life: interpersonal relationships in social systems, protein interaction networks in biological systems, and the World Wide Web and the Internet among computer networks. Each individual in such a network is connected to other individuals through information exchange channels. As networks have been studied in greater depth, it has been found that many real networks have a community structure, that is, the whole network is made up of several communities. For example, each device in a real network such as a computer network can be abstracted into a node, and the network channel between devices can be abstracted into an edge between nodes. It can then be observed that nodes inside each community are relatively densely connected, while connections between communities are relatively sparse.
To analyze a network, closely connected small community structures need to be found according to the association relationships between nodes; this process is called community division or community discovery. In recent years, as research has deepened, it has been found that overlapping parts often appear during community division, that is, one node may belong to multiple communities. In fact, dividing each overlapping node into a single community makes it easier to discover the regularities within communities and to predict the behavior and function of the network.
At present, common non-overlapping community detection algorithms fall mainly into four categories: hierarchical clustering algorithms, objective function optimization algorithms, network dynamics algorithms, and community detection algorithms based on granular computing. Widely used non-overlapping community division algorithms include the community discovery algorithm based on hierarchical granulation, the GN algorithm (Girvan-Newman algorithm), the NFA algorithm (Newman fast algorithm), and the LPA algorithm (Label Propagation Algorithm). These existing algorithms study non-overlapping community division from different angles and for different applications, and have produced substantial research results.
However, when handling the overlapping part, these algorithms apply only the traditional two-way decision method, that is, they only accept or reject based on the information at hand. Yet a node often ends up in the overlapping part precisely because there is not enough information to determine which community it belongs to; forcing a decision may degrade the final non-overlapping community division result. The prior art therefore has the technical problem that the precision of non-overlapping community division results is not high.
Summary of the invention
The technical problem to be solved by the present invention is to provide a three-way decision community division method and device based on random-walk boundary region processing, so as to improve the precision of non-overlapping community division results.
The present invention solves the above technical problem through the following technical solutions:
An embodiment of the invention provides a three-way decision community division method based on random-walk boundary region processing, the method including:
1) abstracting the acquired network on which community division is to be performed into an abstract network of connected nodes;
2) for each node in the abstract network, according to the connection attribute of the node, performing initial granulation and then cluster granulation on the abstract network, and taking the structural relationship of the abstract network after cluster granulation as a first division result, where the connection attribute includes the connection relationship between the node and other nodes;
3) calculating the overlapping-community modularity corresponding to each granular layer, and obtaining the first division result of the abstract network that corresponds to the maximum overlapping-community modularity, where overlapping communities are communities that share a node, and a granular layer is the first division result corresponding to one of the different preset granulation coefficients;
4) taking the overlapping part of the divided communities in the first division result as the boundary region; dividing all nodes in the boundary region using a random walk algorithm, and updating the first division result and the boundary region, until the number of updates of the first division result and the boundary region reaches a preset number;
5) processing each node in the updated boundary region using the three-way decision method to obtain a second division result, and taking the second division result as the target division result.
Optionally, step 1) includes:
abstracting each individual in the acquired network, the individuals being connected to one another through the network, into a node, and then abstracting the connecting link between individuals into a connecting line between the corresponding nodes.
Optionally, step 2) includes:
A: for each node in the abstract network, taking the node as the central node and obtaining the set of neighbor nodes of the central node, and judging whether the network formed by the central node and the neighbor nodes contains communities whose nodes are connected to one another only through the central node; if so, obtaining each community contained in the network formed by the central node and the neighbor nodes that has no connection to the other communities except through the central node, and taking each such community as a divided community; if not, taking the network formed by the central node and the neighbor nodes as a divided community.
B: for each preset granulation coefficient among a preset number of preset granulation coefficients, and for every pair of connected communities in the set of divided communities, obtaining the granulation coefficient of the pair, and then obtaining the maximum of these granulation coefficients; if the maximum is not less than the preset granulation coefficient, merging the corresponding pair of connected communities into one community, and repeating until the granulation coefficient of every pair of connected communities in the set is less than the preset granulation coefficient; taking the structural relationship of the communities obtained as the first division result.
Optionally, before step 3), the method further includes:
performing de-duplication on the first division result.
Optionally, step 3) includes:
using the formula
$$EQ = \frac{1}{2m}\sum_{r}\sum_{i \in G_r,\; j \in G_r}\frac{1}{o_i\,o_j}\left[A_{ij} - \frac{d_i\,d_j}{2m}\right]$$
to obtain the overlapping-community modularity corresponding to each preset granulation coefficient, where
EQ is the overlapping-community modularity corresponding to each preset granulation coefficient; m is the number of edges in the abstract network; G_r is each divided community in the community division result corresponding to each preset granulation coefficient; i is the serial number of the i-th node; j is the serial number of the j-th node; o_i is the number of divided communities to which the i-th node belongs; o_j is the number of divided communities to which the j-th node belongs; A_ij is the corresponding element of the adjacency matrix of the abstract network; d_i is the degree of the i-th node; d_j is the degree of the j-th node.
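The modularity calculation described here can be illustrated with a minimal Python sketch. It assumes the widely used extended modularity for overlapping communities, EQ = (1/2m) Σ_r Σ_{i,j∈G_r} (1/(o_i·o_j))·[A_ij − d_i·d_j/(2m)], which matches the symbols defined above; the function name `overlapping_modularity`, the 0-based node numbering, and the example network are hypothetical.

```python
def overlapping_modularity(A, communities):
    """Extended modularity EQ for (possibly overlapping) communities.

    A           -- symmetric 0/1 adjacency matrix (list of lists), nodes 0..n-1
    communities -- list of sets of node indices (the divided communities G_r)
    Assumed form: EQ = 1/(2m) * sum_r sum_{i,j in G_r} (A_ij - d_i*d_j/(2m)) / (o_i*o_j)
    """
    n = len(A)
    degree = [sum(row) for row in A]          # d_i
    m = sum(degree) / 2                       # number of edges
    membership = [0] * n                      # o_i: communities containing node i
    for community in communities:
        for v in community:
            membership[v] += 1
    eq = 0.0
    for community in communities:
        for i in community:
            for j in community:
                expected = degree[i] * degree[j] / (2 * m)
                eq += (A[i][j] - expected) / (membership[i] * membership[j])
    return eq / (2 * m)


# Hypothetical network: two disconnected triangles {0,1,2} and {3,4,5}.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
eq_split = overlapping_modularity(A, [{0, 1, 2}, {3, 4, 5}])
eq_single = overlapping_modularity(A, [{0, 1, 2, 3, 4, 5}])
```

When no node belongs to more than one community, every o_i equals 1 and the value reduces to ordinary modularity, so the split into two triangles scores higher than a single all-inclusive community. In the method, the granular layer with the largest EQ would be the one kept.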
Optionally, dividing all nodes in the boundary region using the random walk algorithm includes:
A: obtaining the adjacency matrix corresponding to the abstract network;
B: according to the adjacency matrix, calculating the similarity value between any two connected nodes in the boundary region using the similarity formula [formula image missing], where
S(v_i, v_j) is the similarity value between any two connected nodes in the boundary region; v_i is the i-th node in the boundary region; v_j is the j-th node in the boundary region; min() takes the minimum; Y_iγ is the inner product of the vector formed by all elements of row i of the adjacency matrix and the vector formed by all elements of row γ, with Y_iγ = Σ_t A_it·A_γt; A_it is the element in row i, column t of the adjacency matrix; A_γt is the element in row γ, column t of the adjacency matrix; Σ denotes summation; t is the column index of an element of the adjacency matrix; Y_jγ is the inner product of the vector formed by all elements of row j of the adjacency matrix and the vector formed by all elements of row γ, with Y_jγ = Σ_t A_jt·A_γt; A_jt is the element in row j, column t of the adjacency matrix; max() takes the maximum; γ is a row index of the adjacency matrix;
C: obtaining the similarity value between any two nodes in the boundary region, where, when the two nodes have the same serial number, the similarity value between them is set to 1; when there is a connecting edge between the two nodes, the similarity value between them is calculated using the formula in step B; and when there is no edge between the two nodes, the similarity value between them is set to 0;
D: normalizing the similarity values between any two nodes in the boundary region to obtain a normalized matrix, and using the formula u = (1 − α)·P^T·u_0 + α·d to calculate the probability vector of each node in the boundary region walking to each of the other nodes, where
u is the probability vector of each node in the boundary region walking to each of the other nodes; α is the preset jump probability; P is the normalized matrix; u_0 is the probability vector; d is the transition probability vector; P^T is the transpose of the normalized matrix;
E: taking the probability vector of each node in the boundary region walking to each of the other nodes as the probability vector u_0, and returning to step D, until the probability vector of each node in the boundary region walking to each of the other nodes converges, thereby obtaining the target probability vector of each node in the boundary region walking to each of the other nodes;
F: using the formula [formula image missing] to calculate the probability of each node in the boundary region walking to each divided community in the abstract network, where
P(n→C_j) is the probability of the n-th node in the boundary region walking to the j-th divided community; Avg{ } takes the average; n is the serial number of the node in the boundary region; v_i is the i-th node in divided community C_j; j is the serial number of the divided community; ∀ denotes "for any";
G: for each node in the boundary region, assigning the node to the divided community for which the probability of the node walking to it is the largest, and updating the boundary region, the divided communities, and the first division result containing the structural relationship of the divided communities.
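Steps A to C above can be sketched in Python as follows. The similarity formula itself is not reproduced in the text, so this sketch assumes a generalized Jaccard form, S(v_i, v_j) = Σ_γ min(Y_iγ, Y_jγ) / Σ_γ max(Y_iγ, Y_jγ), chosen only because it uses exactly the min, max, summation, and inner-product terms that are defined; the actual formula may differ. The function names and the example adjacency matrix are hypothetical.

```python
def inner_products(A, i):
    """Y_{i,gamma} = sum_t A[i][t] * A[gamma][t], for every row gamma of A."""
    return [sum(a * b for a, b in zip(A[i], row)) for row in A]


def pair_similarity(A, i, j):
    """ASSUMED form (generalized Jaccard over the Y vectors), not the patent's formula."""
    y_i, y_j = inner_products(A, i), inner_products(A, j)
    numerator = sum(min(a, b) for a, b in zip(y_i, y_j))
    denominator = sum(max(a, b) for a, b in zip(y_i, y_j))
    return numerator / denominator if denominator else 0.0


def similarity_matrix(A, nodes):
    """Step C: 1 on the diagonal, the step-B formula for connected pairs, 0 otherwise."""
    S = []
    for i in nodes:
        row = []
        for j in nodes:
            if i == j:
                row.append(1.0)
            elif A[i][j]:
                row.append(pair_similarity(A, i, j))
            else:
                row.append(0.0)
        S.append(row)
    return S


# Hypothetical network: a triangle 0-1-2 with a pendant node 3 attached to node 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
S = similarity_matrix(A, nodes=[0, 1, 2, 3])
```

Here nodes 0 and 3 share no edge, so their similarity is 0 regardless of the formula chosen, while connected pairs receive a value between 0 and 1.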
An embodiment of the invention also provides a three-way decision community division device based on random-walk boundary region processing, the device including:
a first obtaining module, configured to abstract the acquired network on which community division is to be performed into an abstract network of connected nodes;
a setup module, configured to, for each node in the abstract network and according to the connection attribute of the node, perform initial granulation and then cluster granulation on the abstract network, divide the abstract network into multiple communities, and take the structural relationship of the divided communities as a first division result, where the connection attribute includes the connection relationship between the node and other nodes;
a computing module, configured to calculate the overlapping-community modularity corresponding to each granular layer and obtain the first division result of the abstract network that corresponds to the maximum overlapping-community modularity, where overlapping communities are communities that share a node, and a granular layer is the first division result corresponding to one of the different preset granulation coefficients;
a division module, configured to take the overlapping part of the divided communities in the first division result as the boundary region, divide all nodes in the boundary region using a random walk algorithm, and update the first division result and the boundary region, until the number of updates of the first division result and the boundary region reaches a preset number;
a second obtaining module, configured to process each node in the updated boundary region using the three-way decision method to obtain a second division result, and take the second division result as the target division result.
Optionally, the first obtaining module is further configured to:
abstract each individual in the acquired network, the individuals being connected to one another through the network, into a node, and then abstract the connecting link between individuals into a connecting line between the corresponding nodes.
Optionally, the setup module is further configured to perform:
A: for each node in the abstract network, taking the node as the central node and obtaining the set of neighbor nodes of the central node, and judging whether the network formed by the central node and the neighbor nodes contains communities whose nodes are connected to one another only through the central node; if so, obtaining each community contained in the network formed by the central node and the neighbor nodes that has no connection to the other communities except through the central node, and taking each such community as a divided community; if not, taking the network formed by the central node and the neighbor nodes as a divided community.
B: for each preset granulation coefficient among a preset number of preset granulation coefficients, and for every pair of connected communities in the set of divided communities, obtaining the granulation coefficient of the pair, and then obtaining the maximum of these granulation coefficients; if the maximum is not less than the preset granulation coefficient, merging the corresponding pair of connected communities into one community, and repeating until the granulation coefficient of every pair of connected communities in the set is less than the preset granulation coefficient; taking the structural relationship of the communities obtained as the first division result.
Optionally, the device further includes a de-duplication module, configured to perform de-duplication on the first division result.
Optionally, the computing module is further configured to:
use the formula
$$EQ = \frac{1}{2m}\sum_{r}\sum_{i \in G_r,\; j \in G_r}\frac{1}{o_i\,o_j}\left[A_{ij} - \frac{d_i\,d_j}{2m}\right]$$
to obtain the overlapping-community modularity corresponding to each preset granulation coefficient, where
EQ is the overlapping-community modularity corresponding to each preset granulation coefficient; m is the number of edges in the abstract network; G_r is each divided community in the community division result corresponding to each preset granulation coefficient; i is the serial number of the i-th node; j is the serial number of the j-th node; o_i is the number of divided communities to which the i-th node belongs; o_j is the number of divided communities to which the j-th node belongs; A_ij is the corresponding element of the adjacency matrix of the abstract network; d_i is the degree of the i-th node; d_j is the degree of the j-th node.
Optionally, the division module is further configured to perform:
A: obtaining the adjacency matrix corresponding to the abstract network;
B: according to the adjacency matrix, calculating the similarity value between any two connected nodes in the boundary region using the similarity formula [formula image missing], where
S(v_i, v_j) is the similarity value between any two connected nodes in the boundary region; v_i is the i-th node in the boundary region; v_j is the j-th node in the boundary region; min() takes the minimum; Y_iγ is the inner product of the vector formed by all elements of row i of the adjacency matrix and the vector formed by all elements of row γ, with Y_iγ = Σ_t A_it·A_γt; A_it is the element in row i, column t of the adjacency matrix; A_γt is the element in row γ, column t of the adjacency matrix; Σ denotes summation; t is the column index of an element of the adjacency matrix; Y_jγ is the inner product of the vector formed by all elements of row j of the adjacency matrix and the vector formed by all elements of row γ, with Y_jγ = Σ_t A_jt·A_γt; A_jt is the element in row j, column t of the adjacency matrix; max() takes the maximum; γ is a row index of the adjacency matrix;
C: obtaining the similarity value between any two nodes in the boundary region, where, when the two nodes have the same serial number, the similarity value between them is set to 1; when there is a connecting edge between the two nodes, the similarity value between them is calculated using the formula in step B; and when there is no edge between the two nodes, the similarity value between them is set to 0;
D: normalizing the similarity values between any two nodes in the boundary region to obtain a normalized matrix, and using the formula u = (1 − α)·P^T·u_0 + α·d to calculate the probability vector of each node in the boundary region walking to each of the other nodes, where
u is the probability vector of each node in the boundary region walking to each of the other nodes; α is the preset jump probability; P is the normalized matrix; u_0 is the probability vector; d is the transition probability vector; P^T is the transpose of the normalized matrix;
E: taking the probability vector of each node in the boundary region walking to each of the other nodes as the probability vector u_0, and returning to step D, until the probability vector of each node in the boundary region walking to each of the other nodes converges, thereby obtaining the target probability vector of each node in the boundary region walking to each of the other nodes;
F: using the formula [formula image missing] to calculate the probability of each node in the boundary region walking to each divided community in the abstract network, where
P(n→C_j) is the probability of the n-th node in the boundary region walking to the j-th divided community; Avg{ } takes the average; n is the serial number of the node in the boundary region; v_i is the i-th node in divided community C_j; j is the serial number of the divided community; ∀ denotes "for any";
G: for each node in the boundary region, assigning the node to the divided community for which the probability of the node walking to it is the largest, and updating the boundary region, the divided communities, and the first division result containing the structural relationship of the divided communities.
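Steps D to G of the division module can be illustrated with a short Python sketch. It iterates u = (1 − α)·P^T·u_0 + α·d until the vector stops changing, then assigns the node to the community with the largest average walk probability. Several points are assumptions rather than statements from the text: P is taken to be the row-normalized similarity matrix, d is taken to be the indicator vector of the starting node, the step-F probability is taken to be the plain average of u over the members of C_j (suggested only by the Avg{ } description), and α = 0.15 and the example matrix are arbitrary.

```python
def row_normalize(S):
    """Step D (assumed normalization): divide each row by its sum."""
    P = []
    for row in S:
        total = sum(row)
        P.append([x / total for x in row] if total else [0.0] * len(row))
    return P


def random_walk(P, start, alpha=0.15, tol=1e-10, max_iter=1000):
    """Steps D-E: iterate u = (1 - alpha) * P^T * u0 + alpha * d until convergence."""
    n = len(P)
    d = [1.0 if k == start else 0.0 for k in range(n)]   # assumed restart vector
    u = d[:]
    for _ in range(max_iter):
        nxt = [
            (1 - alpha) * sum(P[i][k] * u[i] for i in range(n)) + alpha * d[k]
            for k in range(n)
        ]
        if max(abs(a - b) for a, b in zip(nxt, u)) < tol:
            return nxt
        u = nxt
    return u


def assign_node(u, communities):
    """Steps F-G (assumed form): average u over each community, pick the largest."""
    probs = [sum(u[v] for v in c) / len(c) for c in communities]
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Hypothetical similarity matrix for three mutually connected boundary nodes.
S = [
    [1.0, 0.6, 0.6],
    [0.6, 1.0, 0.6],
    [0.6, 0.6, 1.0],
]
P = row_normalize(S)
u = random_walk(P, start=0)
best, probs = assign_node(u, communities=[{0, 1}, {2}])
```

Because each row of P sums to 1 and d sums to 1, the entries of u keep summing to 1 across iterations, and the factor (1 − α) makes the iteration a contraction, so it converges for any α in (0, 1].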
Compared with the prior art, the present invention has the following advantages:
With the embodiments of the invention, the three-way decision method defers the decision for nodes in the boundary region for which no decision can yet be made, and decides on those nodes only after sufficient information has been obtained, which reduces decision errors and makes the decisions more reasonable. In addition, the embodiments divide the nodes in the boundary region using a random walk algorithm, which makes the result less likely to fall into a local optimum. Therefore, compared with the prior art, the embodiments can mark off more reasonable non-overlapping communities and effectively improve the precision of community division.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a three-way decision community division method based on random-walk boundary region processing provided by an embodiment of the invention;
Fig. 2 is a schematic structural diagram of the abstract network obtained in an embodiment of the invention;
Fig. 3 is a schematic structural diagram of a three-way decision community division device based on random-walk boundary region processing provided by an embodiment of the invention.
Detailed description of the embodiments
The embodiments of the invention are described in detail below. The embodiments are implemented on the premise of the technical solution of the invention, and detailed implementations and specific operating procedures are given, but the protection scope of the invention is not limited to the following embodiments.
The embodiments of the invention provide a three-way decision community division method and device based on random-walk boundary region processing. The method is introduced first.
Fig. 1 is a schematic flowchart of a three-way decision community division method based on random-walk boundary region processing provided by an embodiment of the invention; Fig. 2 is a schematic structural diagram of the abstract network obtained in an embodiment of the invention. As shown in Fig. 1 and Fig. 2, the method includes:
S101: abstracting the acquired network on which community division is to be performed into an abstract network of connected nodes.
Specifically, each individual in the acquired network, the individuals being connected to one another through the network, can be abstracted into a node, and the connecting link between individuals can then be abstracted into a connecting line between the corresponding nodes.
Illustratively, G = (X, f, T) can be defined to represent a network, where X is the node set of abstract network G, f is the edge set of abstract network G, and T is the topology of abstract network G.
Assume the given undirected, unweighted network G = (X, f, T) shown in Fig. 2, which contains 10 nodes and 17 edges, where X = (1, 2, ..., i, ..., j, ..., 10).
It should be emphasized that an edge here is the connecting line obtained by abstracting the link between two nodes that are directly connected, not a link between nodes that are related only through a third party.
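As an illustration of S101, the following minimal Python sketch turns a list of direct links into the adjacency matrix of an undirected, unweighted network. The function name `build_adjacency` and the four-node edge list are hypothetical; the edge list is not the Fig. 2 network, whose edges are not listed in the text.

```python
def build_adjacency(num_nodes, edges):
    """Return a symmetric 0/1 adjacency matrix for an undirected, unweighted network.

    Nodes are numbered 1..num_nodes as in the text; matrix indices are 0-based.
    Only direct links become edges; indirect (third-party) relations are not added.
    """
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        A[a - 1][b - 1] = 1
        A[b - 1][a - 1] = 1
    return A


# Hypothetical edge list (not the Fig. 2 network).
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
A = build_adjacency(4, edges)
degrees = [sum(row) for row in A]
```

Nodes 1 and 4 are related only through node 3, so the matrix holds no entry for that pair, which is the distinction the text draws between direct edges and third-party links.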
S102: for each node in the abstract network, according to the connection attribute of the node, performing initial granulation and then cluster granulation on the abstract network, and taking the structural relationship of the abstract network after cluster granulation as a first division result, where the connection attribute includes the connection relationship between the node and other nodes.
Specifically, step S102 may include:
A: for each node in the abstract network, taking the node as the central node and obtaining the set of neighbor nodes of the central node, and judging whether the network formed by the central node and the neighbor nodes contains communities whose nodes are connected to one another only through the central node; if so, obtaining each community contained in the network formed by the central node and the neighbor nodes that has no connection to the other communities except through the central node, and taking each such community as a divided community; if not, taking the network formed by the central node and the neighbor nodes as a divided community.
As shown in Fig. 2, assume node 1 is the central node. The set of neighbor nodes of node 1 is N(1) = {2, 3, 4, 5, 6, 9, 10}, so the communities formed with node 1 as the central node are SubC(1) = {SubC1 = {1, 2, 3, 4}, SubC2 = {1, 5, 6, 9, 10}}. When node 3 is the central node, the community constructed is SubC(3) = {SubC1 = {1, 2, 3, 4}}.
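The initial granulation in step A can be sketched in Python as follows: remove the central node from its neighborhood, find the connected groups that remain among the neighbors, and add the central node back to each group. The function name `initial_granulation` and the edge list are hypothetical; the edges were chosen only so that the result matches the SubC(1) and SubC(3) values quoted above, and they are not the actual Fig. 2 edges.

```python
from collections import defaultdict


def initial_granulation(adj, center):
    """Split the neighborhood of `center` into groups that are linked only via `center`.

    adj -- dict mapping each node to the set of its neighbors
    Returns a list of sets; each set is one divided community and contains `center`.
    """
    unvisited = set(adj[center])          # neighbors not yet placed in a group
    groups = []
    while unvisited:
        start = unvisited.pop()
        component = {start}
        stack = [start]
        while stack:                      # depth-first search restricted to neighbors
            node = stack.pop()
            for nxt in adj[node]:
                if nxt in unvisited:
                    unvisited.remove(nxt)
                    component.add(nxt)
                    stack.append(nxt)
        groups.append(component | {center})
    return groups or [{center}]           # an isolated node forms its own community


# Hypothetical edges consistent with N(1) = {2, 3, 4, 5, 6, 9, 10}.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (2, 4),
         (1, 5), (1, 6), (1, 9), (1, 10), (5, 6), (6, 9), (9, 10), (5, 10)]
adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

sub_c1 = sorted(sorted(g) for g in initial_granulation(adj, 1))
sub_c3 = sorted(sorted(g) for g in initial_granulation(adj, 3))
```

With node 1 as the center, the neighbors fall into two groups that touch each other only through node 1, so two communities result; with node 3 as the center, its neighbors 1, 2, and 4 are mutually connected and a single community results.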
B: for each preset granulation coefficient among a preset number of preset granulation coefficients, and for every pair of connected communities in the set of divided communities, obtaining the granulation coefficient of the pair, and then obtaining the maximum of these granulation coefficients; if the maximum is not less than the preset granulation coefficient, merging the corresponding pair of connected communities into one community, and repeating until the granulation coefficient of every pair of connected communities in the set is less than the preset granulation coefficient; taking the structural relationship of the communities obtained as the first division result.
Specifically, for any preset granulation coefficient λ with λ ∈ (0, 1), the granulation coefficient of every two adjacent divided communities from step A is calculated [formula image missing]. The maximum granulation coefficient is then found [formula image missing]. If this maximum is not less than λ, the two corresponding divided communities are merged into one divided community and the division result of the abstract network is updated. These operations are repeated until the granulation coefficient of every two divided communities is less than the preset granulation coefficient λ [formula image missing], at which point the coarsest-grained quotient space is obtained and cluster granulation is complete. Each divided community it contains can also be called a granule. Here m is the serial number of the preset granulation coefficient, i is the serial number of the i-th community, and j is the serial number of the j-th community.
It should be noted that calculating the granulation coefficient of communities is prior art, so the embodiments of the present invention do not describe it further here.
Finally, the abstract network as processed by step B is taken as the first division result for that preset granulation coefficient. It will be understood that each preset granulation coefficient corresponds to one first division result.
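The merge loop of step B can be sketched as below. The text treats the granulation coefficient itself as prior art and does not give its formula, so `overlap_ratio` is a stand-in chosen purely for illustration (shared nodes divided by the size of the smaller community). Treating two communities as "connected" when they share at least one node is likewise an assumption, and `cluster_granulation` is a hypothetical name.

```python
def overlap_ratio(a, b):
    """Stand-in granulation coefficient (an assumption, not the patent's formula)."""
    return len(a & b) / min(len(a), len(b))


def cluster_granulation(granules, lam, coefficient=overlap_ratio):
    """Merge the best-scoring connected pair until every pair scores below `lam`."""
    granules = [set(g) for g in granules]
    while True:
        best, best_pair = -1.0, None
        for i in range(len(granules)):
            for j in range(i + 1, len(granules)):
                # Assumption: communities sharing a node count as connected.
                if granules[i] & granules[j]:
                    score = coefficient(granules[i], granules[j])
                    if score > best:
                        best, best_pair = score, (i, j)
        # Stop when no connected pair is left or the maximum is below lambda.
        if best_pair is None or best < lam:
            return granules
        i, j = best_pair
        merged = granules[i] | granules[j]
        granules = [g for k, g in enumerate(granules) if k not in (i, j)]
        granules.append(merged)


start = [{1, 2, 3, 4}, {1, 2, 3}, {1, 5, 6, 9, 10}]
print(cluster_granulation(start, lam=0.5))
print(cluster_granulation(start, lam=0.2))
```

A larger λ stops merging earlier and leaves finer granules; a smaller λ produces a coarser division, which is why each preset coefficient gives its own first division result.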
Further, in practical applications, the preset granulation coefficients may be obtained as follows: the step size of the preset granulation coefficient is set to 0.01, and a number of preset granulation coefficients are then taken at that step size between the upper limit and the lower limit of the granulation coefficient. The preset granulation coefficients are sorted in descending order, so the difference between two adjacent preset granulation coefficients is 0.01.
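As a small sketch of this enumeration, the function below lists the preset coefficients in descending order. The bounds 0 and 1 follow from λ ∈ (0,1) stated earlier; excluding both end points and rounding to suppress floating-point drift are choices made here, and `preset_coefficients` is an illustrative name.

```python
def preset_coefficients(lower=0.0, upper=1.0, step=0.01):
    """Return the coefficients strictly between the bounds, largest first."""
    count = round((upper - lower) / step)
    # Compute each value from an integer index so rounding errors do not accumulate.
    return [round(upper - k * step, 10) for k in range(1, count)]


lambdas = preset_coefficients()
print(len(lambdas), lambdas[0], lambdas[-1])
```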
Merging the divided communities as described in this step yields a community division result of the abstract network for each preset granulation coefficient, which can improve the precision of the community division of the abstract network. Applying the above embodiment of the present invention therefore helps in analyzing and understanding the structure of the network and makes the network easier to optimize and manage.
S103: calculate the overlapping-community modularity of each granular layer, and obtain the first division result of the abstract network corresponding to the maximum overlapping-community modularity, where overlapping communities are communities that share a node, and each granular layer is the first division result corresponding to a different preset granulation coefficient.
Specifically, the overlapping-community modularity corresponding to each preset granulation coefficient may be obtained using the formula

$$EQ = \frac{1}{2m} \sum_{r} \sum_{i \in G_r,\, j \in G_r} \frac{1}{o_i o_j} \left[ A_{ij} - \frac{d_i d_j}{2m} \right]$$

where EQ is the overlapping-community modularity corresponding to each preset granulation coefficient; m is the number of edges in the abstract network; $G_r$ is each divided community in the community division result corresponding to each preset granulation coefficient; i is the index of the i-th node; j is the index of the j-th node; $o_i$ is the number of divided communities to which the i-th node belongs; $o_j$ is the number of divided communities to which the j-th node belongs; $A_{ij}$ is the corresponding element of the adjacency matrix of the abstract network; $d_i$ is the degree of the i-th node; and $d_j$ is the degree of the j-th node.
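The symbols listed above match the commonly used extended modularity for overlapping communities, EQ = (1/2m) Σ_r Σ_{i,j ∈ G_r} (1/(o_i·o_j))·(A_ij − d_i·d_j/(2m)). Assuming that this is the formula intended, it can be evaluated as in the sketch below; the function name and the example graph (two triangles joined by one edge) are illustrative only.

```python
def overlapping_modularity(edges, communities):
    """Extended modularity EQ for a list of possibly overlapping node sets.

    Assumes an undirected simple graph: each edge is listed once, no self-loops.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    m = len(edges)
    degree = {v: len(nbrs) for v, nbrs in adj.items()}

    # o[v]: number of communities that contain node v.
    o = {}
    for community in communities:
        for v in community:
            o[v] = o.get(v, 0) + 1

    total = 0.0
    for community in communities:
        for i in community:
            for j in community:
                a_ij = 1.0 if j in adj.get(i, ()) else 0.0
                expected = degree.get(i, 0) * degree.get(j, 0) / (2 * m)
                total += (a_ij - expected) / (o[i] * o[j])
    return total / (2 * m)


edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
print(overlapping_modularity(edges, [{1, 2, 3}, {4, 5, 6}]))
```

To select the granular layer as in S103, the same function can be evaluated on each first division result and the one with the largest value kept, for example with `max(layers, key=lambda layer: overlapping_modularity(edges, layer))`, where `layers` is a hypothetical list of division results.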
It will be understood that the granular layers are the first division results obtained, for the multiple preset granulation coefficients, after the cluster granulation operation performed under each preset granulation coefficient.
S104: take the overlapping part of the divided communities in the first division result as the boundary region, and divide all nodes in the boundary region using a random walk algorithm, updating the first division result and the boundary region until the number of updates of the first division result and the boundary region reaches a preset number.
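Taking "the overlapping part" to mean the nodes that belong to more than one divided community (an interpretation, consistent with the definition of overlapping communities in S103), the boundary region can be extracted as in this short sketch; `boundary_region` is an illustrative name.

```python
from collections import Counter


def boundary_region(communities):
    """Return the nodes that appear in two or more divided communities."""
    counts = Counter(v for community in communities for v in set(community))
    return {v for v, k in counts.items() if k > 1}


print(boundary_region([{1, 2, 3, 4}, {1, 5, 6, 9, 10}]))
```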
Specifically, dividing all nodes in the boundary region using the random walk algorithm comprises:
A: obtain the adjacency matrix corresponding to the abstract network. If the element in row i, column j of the adjacency matrix is 1, there is an edge between the i-th node and the j-th node in the abstract network; if that element is 0, there is no edge between the i-th node and the j-th node.
B: according to the adjacency matrix, calculate by formula the similarity value between any two connected nodes in the boundary region, where
$S(v_i, v_j)$ is the similarity value between any two connected nodes in the boundary region; $v_i$ is the i-th node in the boundary region; $v_j$ is the j-th node in the boundary region; min() is the minimum function; $Y_{i\gamma}$ is the inner product of the vector formed by all elements of row i of the adjacency matrix and the vector formed by all elements of row γ, with $Y_{i\gamma} = \sum_t A_{it} A_{\gamma t}$; $A_{it}$ is the element in row i, column t of the adjacency matrix; $A_{\gamma t}$ is the element in row γ, column t; ∑ denotes summation; t is the column index of the element in the adjacency matrix; $Y_{j\gamma}$ is the inner product of the vector formed by all elements of row j of the adjacency matrix and the vector formed by all elements of row γ, with $Y_{j\gamma} = \sum_t A_{jt} A_{\gamma t}$; $A_{jt}$ is the element in row j, column t of the adjacency matrix; max() is the maximum function; and γ is the row index in the adjacency matrix.
Illustratively, the similarity values between connected nodes obtained in step B may be collected into a corresponding similarity matrix.
C: obtain the similarity value between any two nodes in the boundary region, where: when the two nodes have the same index, the similarity value between them is set to 1; when there is an edge between the two nodes, the similarity value between them is calculated using the formula in step B; when there is no edge between the two nodes, the similarity value between them is set to 0;
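Steps B and C can be sketched as follows. The inner products Y and the three cases of step C follow the text. The similarity formula itself is not reproduced in the text, so the ratio used here, the sum over γ of min(Y_iγ, Y_jγ) divided by the sum over γ of max(Y_iγ, Y_jγ), is only one form consistent with the listed symbols and should be read as an assumption. For brevity the sketch also covers all nodes rather than only the boundary region.

```python
def inner_products(A):
    """Y[i][g] = sum_t A[i][t] * A[g][t] (for a 0/1 matrix: shared neighbours)."""
    n = len(A)
    return [[sum(A[i][t] * A[g][t] for t in range(n)) for g in range(n)]
            for i in range(n)]


def similarity_matrix(A):
    """Similarity per step C: 1 on the diagonal, 0 without an edge, formula otherwise."""
    n = len(A)
    Y = inner_products(A)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                S[i][j] = 1.0
            elif A[i][j]:
                # Assumed formula: min/max ratio of the two inner-product rows.
                num = sum(min(Y[i][g], Y[j][g]) for g in range(n))
                den = sum(max(Y[i][g], Y[j][g]) for g in range(n))
                S[i][j] = num / den if den else 0.0
    return S


triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
for row in similarity_matrix(triangle):
    print(row)
```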
D: normalize the similarity values between any two nodes in the boundary region to obtain a normalized matrix, and use the formula $u = (1-\alpha)\,P^{T} u_0 + \alpha d$ to calculate the probability vector of each node in the boundary region walking to each of the other nodes, where
u is the probability vector of each node in the boundary region walking to each of the other nodes; α is the preset jump probability; P is the normalized matrix; $u_0$ is the input probability vector; d is the transition probability vector; and $P^{T}$ is the transpose of the normalized matrix;
E: take the probability vector u of each node in the boundary region walking to each of the other nodes as the new probability vector $u_0$, and return to step D, until the probability vector of each node in the boundary region walking to each of the other nodes converges, giving the target probability vector u of each node in the boundary region walking to each of the other nodes;
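Steps D and E amount to iterating u = (1 − α)·Pᵀ·u₀ + α·d to a fixed point. The sketch below makes several assumptions that the text does not fix: the similarity matrix is normalized row by row, the walk for one node starts from an indicator vector of that node, the same indicator is used for d, and convergence is tested with a hypothetical tolerance `tol`. The value α = 0.15 is only an example.

```python
def row_normalise(S):
    """Divide each row by its sum (assumed normalization)."""
    P = []
    for row in S:
        total = sum(row)
        P.append([x / total for x in row] if total else list(row))
    return P


def random_walk(P, u0, d, alpha=0.15, tol=1e-10, max_iter=1000):
    """Iterate u = (1 - alpha) * P^T u0 + alpha * d until u stops changing."""
    n = len(P)
    u = list(u0)
    for _ in range(max_iter):
        # (P^T u)[i] = sum_k P[k][i] * u[k]
        nxt = [(1 - alpha) * sum(P[k][i] * u[k] for k in range(n)) + alpha * d[i]
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, u)) < tol:
            return nxt
        u = nxt
    return u


S = [[1.0, 0.6, 0.6],
     [0.6, 1.0, 0.6],
     [0.6, 0.6, 1.0]]
P = row_normalise(S)
start = [1.0, 0.0, 0.0]  # walk started at node 0
print(random_walk(P, start, start))
```

Because each row of P sums to 1, the iteration preserves the total probability, and the factor (1 − α) makes it contract, so it converges for any α in (0, 1].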
F: calculate by formula the probability that each node in the boundary region walks to each divided community in the abstract network, where
$P(n \to C_j)$ is the probability that node n in the boundary region walks to the j-th divided community; Avg{ } is the averaging function; n is the index of the node in the boundary region; $v_i$ is the i-th node in divided community $C_j$; j is the index of the divided community; and the quantifier symbol denotes "for any";
G: for each node in the boundary region, assign the node to the divided community corresponding to the maximum among its probabilities of walking to each divided community, and update the boundary region, the divided communities, and the first division result containing the structural relationship of the divided communities.
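Steps F and G can be sketched as below. The formula in step F is not reproduced in the text; from the Avg{ } and "for any" symbols it is assumed here that P(n → C_j) is the average, over the members of C_j, of the converged walk probability from n to each member. Excluding n itself from the average is a further assumption, and `assign_boundary_node` and the probabilities in the example are hypothetical.

```python
def assign_boundary_node(node, walk_prob, communities):
    """Pick the community with the highest average walk probability.

    walk_prob maps a target node to the converged probability of the walk
    started at `node` reaching that target.
    """
    best_index, best_score = None, -1.0
    for index, community in enumerate(communities):
        members = [v for v in community if v != node]
        if not members:
            continue
        score = sum(walk_prob.get(v, 0.0) for v in members) / len(members)
        if score > best_score:
            best_index, best_score = index, score
    return best_index, best_score


communities = [{1, 2, 3, 4}, {1, 5, 6, 9, 10}]
walk_prob = {2: 0.3, 3: 0.3, 4: 0.2, 5: 0.05, 6: 0.05, 9: 0.05, 10: 0.05}
index, score = assign_boundary_node(1, walk_prob, communities)
# Step G: keep the node only in the winning community.
updated = [c if k == index else c - {1} for k, c in enumerate(communities)]
print(index, updated)
```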
In this step, iterating the community division process multiple times can improve the precision of the community division. It also reduces the number of nodes in the boundary region, which lowers the amount of computation and improves efficiency.
S105: for each node in the updated boundary region, process the node using a three-way decision method to obtain a second division result, and take the second division result as the target division result.
In practical applications, three-way decision methods are useful for handling uncertain information. For example, when information is insufficient, three-way decision theory first divides an uncertain problem into three regions, namely the positive region (accept), the negative region (reject), and the boundary region (non-commitment), and places the nodes on which no decision can currently be made into the boundary region, that is, the decision is deferred. Once sufficient information has been obtained, decisions are made on the nodes in the boundary region, which reduces decision errors and makes the decisions more reasonable.
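The three regions can be illustrated with the usual two-threshold rule of three-way decision theory. The thresholds 0.7 and 0.3 and the use of a membership probability as the evidence are assumptions made for this sketch; the text does not specify how the regions are computed.

```python
def three_way_decision(probability, accept_at=0.7, reject_at=0.3):
    """Classify into positive (accept), negative (reject) or boundary (defer)."""
    if probability >= accept_at:
        return "positive"
    if probability <= reject_at:
        return "negative"
    return "boundary"


for p in (0.9, 0.5, 0.1):
    print(p, three_way_decision(p))
```

A node that lands in the boundary region is left undecided until more evidence, such as the walk probabilities from step S104, becomes available.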
In addition, the random walk algorithm has the advantages of not depending on the choice of the starting node, not being prone to local optima, and giving relatively high community detection precision. Therefore, processing the overlapping parts of the divided communities with the embodiment of the present invention can yield more reasonable non-overlapping communities, which effectively improves the precision of community division, helps in analyzing and understanding the structure of the network, and makes the network easier to optimize and manage.
With the embodiment of the present invention shown in Fig. 1, the three-way decision method defers the decision for nodes in the boundary region on which no decision can yet be made, and decides on those nodes once sufficient information has been obtained, which reduces decision errors and makes the decisions more reasonable. Compared with the prior art, the embodiment of the present invention can therefore yield more reasonable non-overlapping communities, which effectively improves the precision of community division, helps in analyzing and understanding the structure of the network, and makes the network easier to optimize and manage.
In addition, existing algorithms are prone to local optima during community division, which degrades the precision of the division. The embodiment of the present invention introduces the random walk into the three-way decision approach to process the overlapping parts of communities and is therefore not prone to local optima.
Moreover, the embodiment of the present invention first obtains the optimal structure of the overlapping communities and so takes fuller account of the original structure among the nodes in the boundary region, which makes it possible to obtain a more realistic community structure.
Furthermore, applying the embodiment of the present invention shown in Fig. 1 helps in analyzing and understanding the structure of the network and makes the network easier to optimize and manage. In social networks such as WeChat and QQ, with users as nodes and the friendships between users as edges, community division can be used to recommend users who share common friends to one another as friends. Such users often belong to the same community, and the recommendations increase interaction between users.
In a specific implementation of the embodiment of the present invention, on the basis of the embodiment shown in Fig. 1, the following step is added before step S103: deduplicate the first division results, that is, keep only one of any identical divided communities.
Applying the above embodiment of the present invention reduces the amount of computation and increases the computation speed.
Corresponding to the embodiment of the present invention shown in Fig. 1, an embodiment of the present invention also provides a three-way decision community division device based on random-walk boundary region processing.
Fig. 3 is a schematic structural diagram of a three-way decision community division device based on random-walk boundary region processing provided by an embodiment of the present invention. As shown in Fig. 3, the device comprises:
a first obtaining module 301, configured to abstract an acquired network on which community division is to be performed into an abstract network of connected nodes;
a setting module 302, configured to, for each node in the abstract network, according to the connection attribute of the node, perform initial granulation and then cluster granulation on the abstract network, and take the structural relationship of the abstract network after cluster granulation as a first division result, where the connection attribute comprises the connection relationship between the node and other nodes;
a computing module 303, configured to calculate the overlapping-community modularity of each granular layer and obtain the first division result of the abstract network corresponding to the maximum overlapping-community modularity, where overlapping communities are communities that share a node, and each granular layer is the first division result corresponding to a different preset granulation coefficient;
a division module 304, configured to take the overlapping part of the divided communities in the first division result as the boundary region, and divide all nodes in the boundary region using a random walk algorithm, updating the first division result and the boundary region until the number of updates of the first division result and the boundary region reaches a preset number;
a second obtaining module 305, configured to process each node in the updated boundary region using a three-way decision method to obtain a second division result, and take the second division result as the target division result.
With the embodiment of the present invention shown in Fig. 3, the three-way decision method defers the decision for nodes in the boundary region on which no decision can yet be made, and decides on those nodes once sufficient information has been obtained, which reduces decision errors and makes the decisions more reasonable. Compared with the prior art, the embodiment of the present invention can therefore yield more reasonable non-overlapping communities, which effectively improves the precision of community division, helps in analyzing and understanding the structure of the network, and makes the network easier to optimize and manage.
In a specific implementation of the embodiment of the present invention, the first obtaining module 301 is further configured to:
abstract each of the interconnected network individuals in the acquired network on which community division is to be performed into a node, and then abstract the connecting links between the nodes into connecting lines between the nodes.
In a specific implementation of the embodiment of the present invention, the setting module 302 is further configured to:
A: for each node in the abstract network, take the node as the central node and obtain the set of neighbor nodes of the central node; determine whether the network formed by the central node and its neighbor nodes contains a community made up of nodes that are connected only through the central node; if so, obtain each community contained in the network formed by the central node and its neighbor nodes, between which there is no connection other than through the central node, and take each such community as a divided community; if not, take the network formed by the central node and its neighbor nodes as a divided community.
B: for each of a preset number of preset granulation coefficients, and for every pair of connected divided communities in the set of divided communities, obtain the granulation coefficient of that pair, and then obtain the maximum of these granulation coefficients; if the maximum is not less than the preset granulation coefficient, merge the corresponding pair of connected divided communities into one community, and repeat until the granulation coefficient of every pair of connected divided communities in the set is less than the preset granulation coefficient; take the structural relationship of the communities obtained as the first division result.
In a specific implementation of the embodiment of the present invention, on the basis of the embodiment shown in Fig. 3, a deduplication module is added, configured to deduplicate the first division results.
Applying the above embodiment of the present invention reduces the amount of computation and increases the computation speed.
In a specific implementation of the embodiment of the present invention, the computing module 303 is further configured to:
obtain the overlapping-community modularity corresponding to each preset granulation coefficient using the formula

$$EQ = \frac{1}{2m} \sum_{r} \sum_{i \in G_r,\, j \in G_r} \frac{1}{o_i o_j} \left[ A_{ij} - \frac{d_i d_j}{2m} \right]$$

where EQ is the overlapping-community modularity corresponding to each preset granulation coefficient; m is the number of edges in the abstract network; $G_r$ is each divided community in the community division result corresponding to each preset granulation coefficient; i is the index of the i-th node; j is the index of the j-th node; $o_i$ is the number of divided communities to which the i-th node belongs; $o_j$ is the number of divided communities to which the j-th node belongs; $A_{ij}$ is the corresponding element of the adjacency matrix of the abstract network; $d_i$ is the degree of the i-th node; and $d_j$ is the degree of the j-th node.
In a specific implementation of the embodiment of the present invention, the division module 304 is further configured to:
A: obtain the adjacency matrix corresponding to the abstract network;
B: according to the adjacency matrix, calculate by formula the similarity value between any two connected nodes in the boundary region, where
$S(v_i, v_j)$ is the similarity value between any two connected nodes in the boundary region; $v_i$ is the i-th node in the boundary region; $v_j$ is the j-th node in the boundary region; min() is the minimum function; $Y_{i\gamma}$ is the inner product of the vector formed by all elements of row i of the adjacency matrix and the vector formed by all elements of row γ, with $Y_{i\gamma} = \sum_t A_{it} A_{\gamma t}$; $A_{it}$ is the element in row i, column t of the adjacency matrix; $A_{\gamma t}$ is the element in row γ, column t of the adjacency matrix; ∑ denotes summation; t is the column index of the element in the adjacency matrix; $Y_{j\gamma}$ is the inner product of the vector formed by all elements of row j of the adjacency matrix and the vector formed by all elements of row γ, with $Y_{j\gamma} = \sum_t A_{jt} A_{\gamma t}$; $A_{jt}$ is the element in row j, column t of the adjacency matrix; max() is the maximum function; and γ is the row index in the adjacency matrix;
C: obtain the similarity value between any two nodes in the boundary region, where: when the two nodes have the same index, the similarity value between them is set to 1; when there is an edge between the two nodes, the similarity value between them is calculated using the formula in step B; when there is no edge between the two nodes, the similarity value between them is set to 0;
D: normalize the similarity values between any two nodes in the boundary region to obtain a normalized matrix, and use the formula $u = (1-\alpha)\,P^{T} u_0 + \alpha d$ to calculate the probability vector of each node in the boundary region walking to each of the other nodes, where
u is the probability vector of each node in the boundary region walking to each of the other nodes; α is the preset jump probability; P is the normalized matrix; $u_0$ is the input probability vector; d is the transition probability vector; and $P^{T}$ is the transpose of the normalized matrix;
E: take the probability vector u of each node in the boundary region walking to each of the other nodes as the new probability vector $u_0$, and return to step D, until the probability vector of each node in the boundary region walking to each of the other nodes converges, giving the target probability vector u of each node in the boundary region walking to each of the other nodes;
F: calculate by formula the probability that each node in the boundary region walks to each divided community in the abstract network, where
$P(n \to C_j)$ is the probability that node n in the boundary region walks to the j-th divided community; Avg{ } is the averaging function; n is the index of the node in the boundary region; $v_i$ is the i-th node in divided community $C_j$; j is the index of the divided community; and the quantifier symbol denotes "for any";
G: for each node in the boundary region, assign the node to the divided community corresponding to the maximum among its probabilities of walking to each divided community, and update the boundary region, the divided communities, and the first division result containing the structural relationship of the divided communities.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (12)
1. A three-way decision community division method based on random-walk boundary region processing, characterized in that the method comprises:
1) abstracting an acquired network on which community division is to be performed into an abstract network of connected nodes;
2) for each node in the abstract network, according to the connection attribute of the node, performing initial granulation and then cluster granulation on the abstract network, and taking the structural relationship of the abstract network after cluster granulation as a first division result, wherein the connection attribute comprises the connection relationship between the node and other nodes;
3) calculating the overlapping-community modularity of each granular layer, and obtaining the first division result of the abstract network corresponding to the maximum overlapping-community modularity, wherein overlapping communities are communities that share a node, and each granular layer is the first division result corresponding to a different preset granulation coefficient;
4) taking the overlapping part of the divided communities in the first division result as a boundary region, and dividing all nodes in the boundary region using a random walk algorithm, updating the first division result and the boundary region until the number of updates of the first division result and the boundary region reaches a preset number;
5) for each node in the updated boundary region, processing the node using a three-way decision method to obtain a second division result, and taking the second division result as the target division result.
2. The three-way decision community division method based on random-walk boundary region processing according to claim 1, characterized in that step 1) comprises:
abstracting each of the interconnected network individuals in the acquired network on which community division is to be performed into a node, and then abstracting the connecting links between the nodes into connecting lines between the nodes.
3. The three-way decision community division method based on random-walk boundary region processing according to claim 1, characterized in that step 2) comprises:
A: for each node in the abstract network, taking the node as the central node and obtaining the set of neighbor nodes of the central node; determining whether the network formed by the central node and its neighbor nodes contains a community made up of nodes that are connected only through the central node; if so, obtaining each community contained in the network formed by the central node and its neighbor nodes, between which there is no connection other than through the central node, and taking each such community as a divided community; if not, taking the network formed by the central node and its neighbor nodes as a divided community.
B: for each of a preset number of preset granulation coefficients, and for every pair of connected divided communities in the set of divided communities, obtaining the granulation coefficient of that pair, and then obtaining the maximum of these granulation coefficients; if the maximum is not less than the preset granulation coefficient, merging the corresponding pair of connected divided communities into one community, and repeating until the granulation coefficient of every pair of connected divided communities in the set is less than the preset granulation coefficient; taking the structural relationship of the communities obtained as the first division result.
4. The three-way decision community division method based on random-walk boundary region processing according to claim 1, characterized in that, before step 3), the method further comprises:
deduplicating the first division results.
5. The three-way decision community division method based on random-walk boundary region processing according to claim 1, characterized in that step 3) comprises:
obtaining the overlapping-community modularity corresponding to each preset granulation coefficient using the formula

$$EQ = \frac{1}{2m} \sum_{r} \sum_{i \in G_r,\, j \in G_r} \frac{1}{o_i o_j} \left[ A_{ij} - \frac{d_i d_j}{2m} \right]$$

wherein EQ is the overlapping-community modularity corresponding to each preset granulation coefficient; m is the number of edges contained in the abstract network; $G_r$ is each divided community in the community division result corresponding to each preset granulation coefficient; i is the index of the i-th node; j is the index of the j-th node; $o_i$ is the number of divided communities to which the i-th node belongs; $o_j$ is the number of divided communities to which the j-th node belongs; $A_{ij}$ is the corresponding element of the adjacency matrix of the abstract network; $d_i$ is the degree of the i-th node; and $d_j$ is the degree of the j-th node.
6. The three-way decision community division method based on random-walk boundary region processing according to claim 1, characterized in that dividing all nodes in the boundary region using the random walk algorithm comprises:
A: obtaining the adjacency matrix corresponding to the abstract network;
B: according to the adjacency matrix, calculating by formula the similarity value between any two connected nodes in the boundary region, wherein
$S(v_i, v_j)$ is the similarity value between any two connected nodes in the boundary region; $v_i$ is the i-th node in the boundary region; $v_j$ is the j-th node in the boundary region; min() is the minimum function; $Y_{i\gamma}$ is the inner product of the vector formed by all elements of row i of the adjacency matrix and the vector formed by all elements of row γ, with $Y_{i\gamma} = \sum_t A_{it} A_{\gamma t}$; $A_{it}$ is the element in row i, column t of the adjacency matrix; $A_{\gamma t}$ is the element in row γ, column t of the adjacency matrix; ∑ denotes summation; t is the column index of the element in the adjacency matrix; $Y_{j\gamma}$ is the inner product of the vector formed by all elements of row j of the adjacency matrix and the vector formed by all elements of row γ, with $Y_{j\gamma} = \sum_t A_{jt} A_{\gamma t}$; $A_{jt}$ is the element in row j, column t of the adjacency matrix; max() is the maximum function; and γ is the row index in the adjacency matrix;
C: obtaining the similarity value between any two nodes in the boundary region, wherein: when the two nodes have the same index, the similarity value between them is set to 1; when there is an edge between the two nodes, the similarity value between them is calculated using the formula in step B; when there is no edge between the two nodes, the similarity value between them is set to 0;
D: normalizing the similarity values between any two nodes in the boundary region to obtain a normalized matrix, and using the formula $u = (1-\alpha)\,P^{T} u_0 + \alpha d$ to calculate the probability vector of each node in the boundary region walking to each of the other nodes, wherein
u is the probability vector of each node in the boundary region walking to each of the other nodes; α is the preset jump probability; P is the normalized matrix; $u_0$ is the input probability vector; d is the transition probability vector; and $P^{T}$ is the transpose of the normalized matrix;
E: taking the probability vector u of each node in the boundary region walking to each of the other nodes as the new probability vector $u_0$, and returning to step D, until the probability vector of each node in the boundary region walking to each of the other nodes converges, giving the target probability vector of each node in the boundary region walking to each of the other nodes;
F: calculating by formula the probability that each node in the boundary region walks to each divided community in the abstract network, wherein
$P(n \to C_j)$ is the probability that the n-th node in the boundary region walks to the j-th divided community; Avg{ } is the averaging function; n is the index of the node in the boundary region; $v_i$ is the i-th node in divided community $C_j$; j is the index of the divided community; and the quantifier symbol denotes "for any";
G: for each node in the boundary region, assigning the node to the divided community corresponding to the maximum among its probabilities of walking to each divided community, and updating the boundary region, the divided communities, and the first division result containing the structural relationship of the divided communities.
7. A three-way decision community division device based on random-walk boundary region processing, characterized in that the device comprises:
a first obtaining module, configured to abstract the obtained network to be divided into communities into an abstract network of connected nodes;
a setup module, configured to, for each node in the abstract network, perform initial granulation and then cluster granulation on the abstract network according to the connection attribute of the node, and take the structural relationship of the abstract network after cluster granulation as a first division result, wherein the connection attribute comprises the connection relationship between the node and other nodes;
a computing module, configured to calculate the overlapping community modularity corresponding to each granular layer, and obtain the first division result of the abstract network corresponding to the maximum overlapping community modularity, wherein overlapping communities are communities that share a node, and each granular layer is the first division result corresponding to a different preset granulation coefficient;
a division module, configured to take the overlapping part of the divided communities in the first division result as the boundary region, divide all nodes in the boundary region using a random walk algorithm, and update the first division result and the boundary region until the number of updates of the first division result and the boundary region reaches a preset number;
a second obtaining module, configured to process each node in the updated boundary region using the three-way decision method to obtain a second division result, and take the second division result as the target division result.
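The boundary region used by the division module is the overlap between divided communities. A minimal Python sketch, assuming communities are held as a dict of node sets (a hypothetical representation; the claims do not prescribe one):

```python
from collections import Counter


def boundary_region(communities):
    """Return the nodes that belong to more than one community."""
    counts = Counter(node for members in communities.values() for node in members)
    return {node for node, count in counts.items() if count > 1}


# Toy example: node 3 is shared by C1 and C2, node 4 by C2 and C3.
communities = {"C1": {1, 2, 3}, "C2": {3, 4}, "C3": {4, 5}}
region = boundary_region(communities)
```

Nodes outside the returned set belong to exactly one community and need no further processing by the random walk.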
8. The three-way decision community division device based on random-walk boundary region processing according to claim 7, characterized in that the first obtaining module is further configured to:
abstract each interconnected individual in the obtained network to be divided into communities as a node, and then abstract each connecting link between individuals as an edge between the corresponding nodes.
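As an illustration of this abstraction step, the following Python sketch turns a list of links between individuals into an undirected adjacency structure. The input format (pairs of hashable identifiers) and the choice to ignore self-links are assumptions made for the example.

```python
def build_abstract_network(links):
    """Map each individual to a node and each link to an undirected edge.

    links: iterable of (individual_a, individual_b) pairs (assumed input format).
    Returns a dict mapping each node to the set of its neighbor nodes.
    """
    adjacency = {}
    for a, b in links:
        if a == b:
            continue  # assumption: self-links carry no community information
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


network = build_abstract_network([("ann", "bob"), ("bob", "cy"), ("cy", "cy")])
```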
9. The three-way decision community division device based on random-walk boundary region processing according to claim 7, characterized in that the setup module is further configured to:
A: for each node in the abstract network, take the node as a central node, obtain the set of neighbor nodes of the central node, and judge whether the network formed by the central node and the neighbor nodes contains communities of nodes that are connected to one another only through the central node; if so, obtain each community in that network that has no connection to the other communities except through the central node, and take each such community as a divided community; if not, take the network formed by the central node and the neighbor nodes as a divided community.
B: for each preset granulation coefficient among a preset number of preset granulation coefficients, and for each pair of connected divided communities in the set of divided communities, obtain the granulation coefficients respectively corresponding to the connected divided communities and take the maximum of these granulation coefficients; if the maximum is not less than the preset granulation coefficient, merge the connected divided communities into one community, until the granulation coefficient of every pair of connected divided communities in the set is less than the preset granulation coefficient; take the structural relationship of the resulting communities as the first division result.
10. The three-way decision community division device based on random-walk boundary region processing according to claim 7, characterized in that the device further comprises a deduplication module, configured to deduplicate the first division result.
11. The three-way decision community division device based on random-walk boundary region processing according to claim 7, characterized in that the computing module is further configured to:
obtain, using the formula, the overlapping community modularity corresponding to each preset granulation coefficient, wherein
EQ is the overlapping community modularity corresponding to each preset granulation coefficient; m is the number of edges in the abstract network; G_r is each divided community in the community division result corresponding to each preset granulation coefficient; i is the index of the i-th node; j is the index of the j-th node; o_i is the number of divided communities to which the i-th node belongs; o_j is the number of divided communities to which the j-th node belongs; A_ij is the corresponding element of the adjacency matrix of the abstract network; d_i is the degree of the i-th node; d_j is the degree of the j-th node.
12. The three-way decision community division device based on random-walk boundary region processing according to claim 7, characterized in that the division module is further configured to:
A: obtain the adjacency matrix corresponding to the abstract network;
B: according to the adjacency matrix, calculate, using the formula, the similarity value between any two connected nodes in the boundary region, wherein
S(v_i, v_j) is the similarity value between any two connected nodes in the boundary region; v_i is the i-th node in the boundary region; v_j is the j-th node in the boundary region; Min() is the minimum function; Y_iγ is the inner product between the vector formed by all elements of row i of the adjacency matrix and the vector formed by all elements of row γ, with Y_iγ = ∑_t A_it A_γt; A_it is the element in row i, column t of the adjacency matrix; A_γt is the element in row γ, column t of the adjacency matrix; ∑ is the summation; t is the index of the column of an element in the adjacency matrix; Y_jγ is the inner product between the vector formed by all elements of row j of the adjacency matrix and the vector formed by all elements of row γ, with Y_jγ = ∑_t A_jt A_γt; A_jt is the element in row j, column t of the adjacency matrix; Max() is the maximum function; γ is the index of a row in the adjacency matrix;
C: obtain the similarity value between any two nodes in the boundary region, wherein, when the two nodes have the same index, the similarity value between them is set to 1; when an edge connects the two nodes, the similarity value between them is calculated using the formula in step B; when no edge exists between the two nodes, the similarity value between them is set to 0;
D: normalize the similarity values between any two nodes in the boundary region to obtain a normalization matrix, and calculate, using the formula u = (1 - α) P^T · u_0 + α d, the probability vector of each node in the boundary region walking to each of the other nodes, wherein
u is the probability vector of each node in the boundary region walking to each of the other nodes; α is the preset jump probability; P is the normalization matrix; u_0 is the probability vector; d is the transition probability vector; P^T is the transpose of the normalization matrix;
E: take the probability vector of each node in the boundary region walking to each of the other nodes as the probability vector u_0, and return to step D, until the probability vector of each node in the boundary region walking to each of the other nodes converges, thereby obtaining the target probability vector of each node in the boundary region walking to each of the other nodes;
F: calculate, using the formula, the probability that each node in the boundary region walks to each divided community in the abstract network, wherein
P(n→C_j) is the probability that the n-th node in the boundary region walks to the j-th divided community; Avg{ } is the averaging function; n is the index of the node in the boundary region; v_i is the i-th node in the divided community C_j; j is the index of the divided community; ∀ denotes "for any";
G: for each node in the boundary region, assign the node to the divided community corresponding to the maximum among the node's probabilities of walking to each divided community, and update the boundary region, the divided communities, and the first division result containing the structural relationship of the divided communities.
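The iteration in steps D and E, u = (1 - α) P^T · u_0 + α d repeated until convergence, can be sketched in plain Python. The sketch assumes that "normalization" means dividing each row of the similarity matrix by its row sum, that the starting vector u_0 equals d, and that d is an indicator vector for the starting node; none of these choices is fixed by the claim. The default values of `alpha`, `tol` and `max_iter` are illustrative.

```python
def row_normalize(similarity):
    """Divide each row by its sum (assumed meaning of 'normalization')."""
    normalized = []
    for row in similarity:
        total = sum(row)
        normalized.append([x / total for x in row] if total else [0.0] * len(row))
    return normalized


def walk_until_converged(P, d, alpha=0.15, tol=1e-10, max_iter=1000):
    """Iterate u = (1 - alpha) * P^T u0 + alpha * d until u stops changing."""
    size = len(P)
    u = list(d)  # assumption: the first u0 is the restart vector d
    for _ in range(max_iter):
        # (P^T u)[j] = sum over i of P[i][j] * u[i]
        nxt = [
            (1 - alpha) * sum(P[i][j] * u[i] for i in range(size)) + alpha * d[j]
            for j in range(size)
        ]
        if max(abs(a - b) for a, b in zip(nxt, u)) < tol:
            return nxt
        u = nxt  # step E: feed the result back in as u0
    return u


# Three boundary nodes in a chain: 0 - 1 - 2, with self-similarity 1 (step C).
similarity = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.5], [0.0, 0.5, 1.0]]
P = row_normalize(similarity)
u = walk_until_converged(P, d=[1.0, 0.0, 0.0])
```

Because each row of P sums to 1, the entries of u keep summing to 1, and the walker started at node 0 ends up most likely at node 0, then node 1, then node 2.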
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811045237.6A CN109242713A (en) | 2018-09-07 | 2018-09-07 | Three decision group dividing methods and device based on the processing of random walk Boundary Region |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811045237.6A CN109242713A (en) | 2018-09-07 | 2018-09-07 | Three decision group dividing methods and device based on the processing of random walk Boundary Region |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109242713A (en) | 2019-01-18 |
Family
ID=65067477
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811045237.6A Pending CN109242713A (en) | 2018-09-07 | 2018-09-07 | Three decision group dividing methods and device based on the processing of random walk Boundary Region |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109242713A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111444454A (en) * | 2020-03-24 | 2020-07-24 | 哈尔滨工程大学 | Dynamic community dividing method based on spectrum method |
CN111756568A (en) * | 2020-05-06 | 2020-10-09 | 北京明略软件系统有限公司 | Method, device, computer storage medium and terminal for realizing community discovery |
CN112651764A (en) * | 2019-10-12 | 2021-04-13 | 武汉斗鱼网络科技有限公司 | Target user identification method, device, equipment and storage medium |
CN113011471A (en) * | 2021-02-26 | 2021-06-22 | 山东英信计算机技术有限公司 | Social group dividing method, social group dividing system and related devices |
- 2018-09-07 CN CN201811045237.6A patent/CN109242713A/en active Pending
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112651764A (en) * | 2019-10-12 | 2021-04-13 | 武汉斗鱼网络科技有限公司 | Target user identification method, device, equipment and storage medium |
CN112651764B (en) * | 2019-10-12 | 2023-03-31 | 武汉斗鱼网络科技有限公司 | Target user identification method, device, equipment and storage medium |
CN111444454A (en) * | 2020-03-24 | 2020-07-24 | 哈尔滨工程大学 | Dynamic community dividing method based on spectrum method |
CN111444454B (en) * | 2020-03-24 | 2023-05-05 | 哈尔滨工程大学 | Dynamic community division method based on spectrum method |
CN111756568A (en) * | 2020-05-06 | 2020-10-09 | 北京明略软件系统有限公司 | Method, device, computer storage medium and terminal for realizing community discovery |
CN113011471A (en) * | 2021-02-26 | 2021-06-22 | 山东英信计算机技术有限公司 | Social group dividing method, social group dividing system and related devices |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109242713A (en) | Three decision group dividing methods and device based on the processing of random walk Boundary Region | |
Gupta et al. | Top-k interesting subgraph discovery in information networks | |
CN104217015B (en) | Based on the hierarchy clustering method for sharing arest neighbors each other | |
Zhang et al. | An MST cluster analysis method under hesitant fuzzy environment | |
CN106254321A (en) | A kind of whole network abnormal data stream sorting technique | |
CN105930688A (en) | Improved PSO algorithm based protein function module detection method | |
Zou et al. | Answering pattern match queries in large graph databases via graph embedding | |
CN106708989A (en) | Spatial time sequence data stream application-based Skyline query method | |
CN103888541A (en) | Method and system for discovering cells fused with topology potential and spectral clustering | |
CN103279505B (en) | A kind of based on semantic mass data processing method | |
Akram et al. | Bipolar neutrosophic hypergraphs with applications | |
CN114896249A (en) | Unbalanced area tree index structure and n-dimensional space inverse nearest neighbor query algorithm | |
Gulzar et al. | Skyline query processing for incomplete data in cloud environment | |
CN103345509B (en) | Obtain the level partition tree method and system of the most farthest multiple neighbours on road network | |
Peng et al. | Member promotion in social networks via skyline | |
CN106611418A (en) | Image segmentation algorithm | |
Yuan et al. | Boundary-connection deletion strategy based method for community detection in complex networks | |
CN108776691A (en) | A kind of optimization method and system of space diagram aggregation | |
CN108829694A (en) | The optimization method of flexible polymer K-NN search G tree on road network | |
Zhao et al. | Joint optimization of latency and energy consumption for mobile edge computing based proximity detection in road networks | |
CN113255720A (en) | Multi-view clustering method and system based on hierarchical graph pooling | |
CN108804599A (en) | A kind of fast searching method of similar subgraph | |
CN111444454B (en) | Dynamic community division method based on spectrum method | |
Chou et al. | Finding Maximal Quasi-cliques Containing a Target Vertex in a Graph. | |
CN101477689A (en) | Aerial robot vision layered matching process based adaptive ant colony intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190118 |