CN110825935A - Community core character mining method, system, electronic equipment and readable storage medium


Publication number
CN110825935A
Authority
CN
China
Prior art keywords
community
node
target user
calculating
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910914473.5A
Other languages
Chinese (zh)
Inventor
黄萍
张江华
潘飞
吕绪祥
刘世峰
高佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
FUJIAN NEW LAND SOFTWARE ENGINEERING Co Ltd
Original Assignee
FUJIAN NEW LAND SOFTWARE ENGINEERING Co Ltd
Application filed by FUJIAN NEW LAND SOFTWARE ENGINEERING Co Ltd
Priority to CN201910914473.5A
Publication of CN110825935A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a community core character mining method comprising the following steps: acquiring a target user group under a target group number, and acquiring communication data within the target user group; cleaning and converting the communication data to construct target user communication sequences; dividing the target user group into communities from the communication data using the Louvain algorithm; computing the shortest paths between all nodes of each community using the Dijkstra algorithm, calculating the centrality of each node, and identifying the central nodes of the network graph; and calculating the volatility of each central node, the central node with low volatility being the community core of the target user group. The whole process is unsupervised: the final structure depends entirely on algorithmic clustering, and no classes need to be preset manually. The method places no strict requirement on the size of the graph, converges quickly after several iterations, and is computationally efficient.

Description

Community core character mining method, system, electronic equipment and readable storage medium
Technical Field
The invention relates to the technical field of data mining, in particular to a community core character mining method and system, electronic equipment and a readable storage medium.
Background
Phone calls are one of the most common means of contact in modern communication, and relationships between people can be mined from call data. For illegal organizations, for example illegal distribution and marketing schemes, call records can be used to identify the people involved and to single out the core people among them. Existing community discovery approaches include the following.
One approach divides communities by analyzing the similarity of user messages. A library of specific message content is built, each user's messages are compared against the library, users are classified and weighted according to the degree of similarity, and core users are identified from those weights. Taking each core user as a node, the core communication circle of that user is then extracted from the pairwise-connected communication network. However, this approach requires a dedicated message library to be built and message similarity to be analyzed. Where little message content is exchanged, neither community discovery nor analysis of the group's organizational structure can be carried out. Where the exchanged messages have no distinctive features, no message library with distinctive features can be built, and the effectiveness of the algorithm drops sharply. Moreover, the approach analyzes only per-user message similarity and not communication frequency, so the core people of a group cannot be mined.
Another approach divides communities by expanding a topological graph built from the communication data of a telephone network. The edge with the highest weight in the current topological graph is selected, and that edge together with its two endpoint nodes is taken as an interaction circle. For each user node, the neighbor node with the greatest degree of attribution is found; if that degree exceeds a preset value the interaction circle is enlarged, otherwise expansion stops. With this approach, if the designated user is not the core of the group, the expanded interaction circle may not meet requirements, and the process is too coarse to be refined further.
Disclosure of Invention
The invention aims to provide a method and a device for accurately mining a group organization structure and a core character of a mobile user.
In order to solve the technical problems, the technical scheme of the invention is as follows:
a community core character mining method comprises the following steps:
acquiring a target user group under a target group number, and acquiring communication data in the target user group;
cleaning and converting the communication data to construct a target user communication sequence;
dividing the community structure of the target user group by using a Louvain algorithm through the communication data;
computing the shortest paths between all nodes of each community using the Dijkstra algorithm, calculating the centrality of each node, and identifying the central nodes of the network graph;
and calculating the volatility of each central node, wherein the central node with small volatility is the community core of the target user group.
Preferably, the process of dividing the community structure of the target user group by using the Louvain algorithm through the communication data includes:
S31: each target user is taken as a node, and each node initially forms its own community;
S32: for each node i, calculating the change ΔQ in the modularity of the whole network that would result from merging i into each adjacent community, merging i into the community with the largest ΔQ, and, if that ΔQ is negative, leaving the community of i unchanged;
S33: repeating step S32 until moving any node to another adjacent community can no longer improve ΔQ;
S34: merging communities: compressing each community obtained into a single node, and giving each edge between community nodes, as its new weight, the sum of the edge weights of all node pairs in the original communities.
Preferably, the process of computing the shortest paths between the nodes of the community by the Dijkstra algorithm comprises:
S41: for each node v0 in the community, letting S = {v0} and T = {other nodes};
S42: calculating the distance from v0 to every vertex vi in T: if there is an arc from v0 to vi, the distance value of vi is the weight on that arc; if there is no such arc, the distance value of vi is infinite;
S43: selecting the vertex w with the smallest distance value from T and adding it to the set S;
S44: updating the distance values of the vertices remaining in T: if using w as an intermediate vertex shortens the distance from v0 to vi, replacing the distance value with the shorter one;
S45: repeating steps S43 and S44 until all vertices are contained in S.
Preferably, the process of calculating the centrality of each node comprises:
for each pair of nodes (s, t) within the community, calculating all shortest paths between them;
for each pair of nodes (s, t) in the community, judging whether the node v is on the solved shortest path;
accumulating over the shortest paths, and calculating the node betweenness centrality of the node v:
$$C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
wherein σ_st is the number of shortest paths from s to t, and σ_st(v) is the number of those shortest paths that pass through node v;
and calculating the node betweenness centrality of all the nodes.
Preferably, the process of calculating the centrality of each node further comprises: calculating the betweenness centrality of each edge:
calculating all shortest paths between node pairs in the community;
judging whether the edge e is on the shortest path;
accumulating over the shortest paths to obtain the betweenness centrality of the edge e:
$$C_B(e) = \sum_{s \neq t} \frac{\sigma_{st}(e)}{\sigma_{st}}$$
where σ_st is the number of shortest paths from s to t in graph G, and σ_st(e) is the number of those shortest paths that contain edge e;
and calculating the betweenness centrality of all edges.
Preferably, the process of calculating the volatility of each said central node comprises:
by calculating the standard deviation of node v from all other nodes in the network:
Figure BDA0002215676960000031
wherein:
Figure BDA0002215676960000032
the smaller the standard deviation, the smaller the volatility and the closer the node is to the center; such a node is the community core.
Preferably, the step of acquiring the target user group under the target group number includes: filtering out the numbers whose status is invalid among the target users, joining the target users' call detail record table, obtaining one month of call records for the target users whose status is valid, and generating a call record sequence for each target user, wherein each call record comprises: user identification, called number and call duration.
In a second aspect, the present invention further provides a system for mining community core people, including:
an acquisition module: acquiring a target user group under a target group number, and acquiring communication data in the target user group;
a cleaning module: cleaning and converting the communication data to construct a target user communication sequence;
a dividing module: dividing the community structure of the target user group by using a Louvain algorithm through the communication data;
a central node module: computing the shortest paths between all nodes of each community using the Dijkstra algorithm, calculating the centrality of each node, and identifying the central nodes of the network graph;
a core mining module: and calculating the volatility of each central node, wherein the central node with small volatility is the community core of the target user group.
In a third aspect, the present invention further provides an electronic device for community core people mining, including a memory, a processor and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the community core people mining method when executing the program.
In a fourth aspect, the present invention further provides a readable storage medium for community core persona mining, having a computer program stored thereon, the computer program being executed by a processor to implement the steps of the community core persona mining method described above.
Based on an operator's large-scale call data, the invention applies the Louvain algorithm and the Dijkstra algorithm to determine node centrality and calculate node volatility, so that communities are divided automatically and community core members are identified automatically. The community structure produced by the algorithm is hierarchical: the new graph obtained after each round of calculation is the result of discovering several finer communities within a larger community. This hierarchy is a natural attribute of the network, so researchers can gain a deeper understanding of the internal structure and formation mechanism of a given community. Because the invention uses the Louvain algorithm, the whole process is unsupervised: the final structure depends entirely on algorithmic clustering, and no classes need to be preset manually. The algorithm also performs well: compared with some classical community detection algorithms, it places almost no upper limit on the size of the graph and converges quickly after several iterations, so it is efficient.
Drawings
FIG. 1 is a flowchart of an embodiment of a community core people mining method of the present invention;
FIG. 2 is a flowchart of step S30 in FIG. 1;
FIG. 3 is a flowchart of step S40 in FIG. 1;
FIG. 4 is a schematic diagram illustrating the principle of dividing the community structure of the target user group by the Louvain algorithm in the present invention.
Detailed Description
The following further describes embodiments of the present invention with reference to the drawings. It should be noted that the description of the embodiments is provided to help understanding of the present invention, but the present invention is not limited thereto. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Referring to fig. 1, 2 and 3, an embodiment of the present invention provides a community core character mining method, including:
S10: acquiring a target user group under a target group number, and acquiring communication data within the target user group. Numbers whose status is invalid are filtered out of the target users, the target users' call detail record table is joined, one month of call records is obtained for the target users whose status is valid, and a call record sequence is generated for each target user, wherein each call record comprises: user identification, called number and call duration.
S20: cleaning and converting the communication data to construct target user communication sequences;
S30: dividing the target user group into communities from the communication data using the Louvain algorithm.
Each target user is taken as a node, and each node initially forms its own community. For each node i, the change ΔQ in the modularity of the whole network that would result from merging i into each adjacent community is calculated, and i is merged into the community with the largest ΔQ; if that ΔQ is negative, the community of i is left unchanged. This step is repeated until moving any node to another adjacent community can no longer improve ΔQ. Communities are then merged: each community obtained is compressed into a single node, and each edge between community nodes is given, as its new weight, the sum of the edge weights of all node pairs in the original communities.
S40: computing the shortest paths between all nodes of each community using the Dijkstra algorithm, calculating the centrality of each node, and identifying the central nodes of the network graph.
For each node v0 in the community, let S = {v0} and T = {other nodes}. Calculate the distance from v0 to every vertex vi in T: if there is an arc from v0 to vi, the distance value of vi is the weight on that arc; if there is no such arc, the distance value of vi is infinite. Select the vertex w with the smallest distance value from T and add it to the set S. Update the distance values of the vertices remaining in T: if using w as an intermediate vertex shortens the distance from v0 to vi, replace the distance value with the shorter one. Repeat the previous two steps until all vertices are contained in S.
Calculating node centrality includes calculating the node betweenness centrality of all nodes and calculating the betweenness centrality of each edge.
The process of calculating the centrality of each node comprises:
for each pair of nodes (s, t) within the community, calculating all shortest paths between them;
for each pair of nodes (s, t) in the community, judging whether the node v is on the solved shortest path;
accumulating over the shortest paths, and calculating the node betweenness centrality of the node v:
$$C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
wherein σ_st is the number of shortest paths from s to t, and σ_st(v) is the number of those shortest paths that pass through node v;
the process of calculating the centrality of each node further comprises: calculating the betweenness centrality of each edge:
calculating all shortest paths between node pairs in the community;
judging whether the edge e is on the shortest path;
accumulating over the shortest paths to obtain the betweenness centrality of the edge e:
$$C_B(e) = \sum_{s \neq t} \frac{\sigma_{st}(e)}{\sigma_{st}}$$
where σ_st is the number of shortest paths from s to t in graph G, and σ_st(e) is the number of those shortest paths that contain edge e;
and calculating the betweenness centrality of all edges.
S50: calculating the volatility of each central node, wherein the central node with low volatility is the community core of the target user group. The volatility is obtained by calculating the standard deviation of node v relative to all other nodes in the network:
Figure BDA0002215676960000052
wherein:
Figure BDA0002215676960000053
the smaller the standard deviation, the smaller the volatility and the closer the node is to the center; such a node is the community core.
In another embodiment of the invention, the target user group under a group number is retrieved by that group number and restricted to users whose mobile phone status is valid; the users' call detail record table is then joined, and finally the communication data of the target users is extracted. The test data covers a period of one month; the statistical period can subsequently be chosen according to specific circumstances.
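The extraction step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout `(user_id, called_number, duration)`, the `status` field with the value `"valid"`, and the decision to drop zero-duration calls are all hypothetical and are not specified by the source.

```python
from collections import defaultdict

# Hypothetical layout: users carry a "status" flag; each call record is
# (user_id, called_number, duration_in_seconds). None of these names come from the source.
def build_call_sequences(users, call_records):
    """Group call records per user, keeping only users whose number status is valid."""
    valid_users = {u["user_id"] for u in users if u["status"] == "valid"}
    sequences = defaultdict(list)
    for user_id, called_number, duration in call_records:
        # Assumption: records from invalid numbers and zero-length calls are discarded.
        if user_id in valid_users and duration > 0:
            sequences[user_id].append((called_number, duration))
    return dict(sequences)

users = [
    {"user_id": "A", "status": "valid"},
    {"user_id": "B", "status": "invalid"},
]
call_records = [("A", "C", 60), ("A", "C", 30), ("B", "A", 45), ("A", "D", 0)]
sequences = build_call_sequences(users, call_records)
```

In this toy input only user A survives the status filter, and the zero-duration call to D is dropped during cleaning.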
Community discovery is then performed, as shown in FIG. 4: 1) initialization: each node in the network is assumed to form its own community; 2) for each node i, the change ΔQ in the Q value of the whole network after i is merged into each adjacent community is calculated, and the community giving the largest change in Q is found; if that ΔQ is negative, the community of i is left unchanged;
Figure BDA0002215676960000054
can be simplified as follows:
Figure BDA0002215676960000055
where k_{i,in} is the sum of the weights of the edges from node i that are incident on community C, Σ_tot is the total weight of the edges incident on community C, and k_i is the total weight of the edges incident on node i.
3) Step 2 is repeated until the Q value no longer changes, that is, until moving any node to another adjacent community can no longer improve ΔQ and no node in the current network moves any more; 4) communities are merged: the original graph is compressed, each community obtained in the previous steps becomes a node of the new graph, and each edge of the new graph is given, as its new weight, the sum of the edge weights of all node pairs in the original communities.
These steps comprise two stages: finding the optimal Q value, and merging the communities obtained in that round of division into a new graph. The two stages together are called one round. When one round finishes, the algorithm automatically enters the first stage of the next round. After several iterations the Q value of the network no longer increases; at that point the network has aggregated into several small communities with dense internal connections and sparse external connections, and the algorithm ends.
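The two-stage round described above can be sketched as follows. This is a simplified illustration, not the patented implementation: instead of the closed-form ΔQ, it recomputes Q from scratch for every tentative move, which is slow but unambiguous. The graph representation `adj[u][v] = weight` and all function names are assumptions made for the example, and the input graph is assumed to contain at least one edge.

```python
from collections import defaultdict

def modularity(adj, comm):
    """Q of a partition; adj[u][v] is a symmetric edge weight, comm[u] a community label."""
    two_m = sum(w for u in adj for w in adj[u].values())   # every edge is seen twice
    inside = defaultdict(float)    # weight over ordered node pairs inside a community
    total = defaultdict(float)     # summed node strength per community
    for u in adj:
        for v, w in adj[u].items():
            total[comm[u]] += w
            if comm[u] == comm[v]:
                inside[comm[u]] += w
    return sum(inside[c] / two_m - (total[c] / two_m) ** 2 for c in total)

def local_moving(adj):
    """Stage 1: move nodes between adjacent communities while Q keeps increasing."""
    comm = {u: u for u in adj}                 # every node starts in its own community
    improved = True
    while improved:
        improved = False
        for u in adj:
            start = comm[u]
            best_c, best_q = start, modularity(adj, comm)
            for c in {comm[v] for v in adj[u] if v != u}:
                comm[u] = c                    # tentatively move u, then re-evaluate Q
                q = modularity(adj, comm)
                if q > best_q + 1e-12:
                    best_c, best_q = c, q
            comm[u] = best_c                   # unchanged when no move has positive gain
            improved = improved or best_c != start
    return comm

def aggregate(adj, comm):
    """Stage 2: compress each community into one node and sum the edge weights."""
    new_adj = defaultdict(lambda: defaultdict(float))
    for u in adj:
        for v, w in adj[u].items():
            new_adj[comm[u]][comm[v]] += w     # internal weight becomes a self-loop
    return {c: dict(nbrs) for c, nbrs in new_adj.items()}

def louvain(adj):
    """Repeat rounds until a round merges nothing; returns node -> community label."""
    membership = {u: u for u in adj}
    while True:
        comm = local_moving(adj)
        if len(set(comm.values())) == len(adj):
            return membership
        membership = {n: comm[c] for n, c in membership.items()}
        adj = aggregate(adj, comm)

# Toy graph: two triangles joined by the single edge c-d, all weights 1.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
adj = defaultdict(dict)
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0
membership = louvain(dict(adj))
groups = {frozenset(n for n in membership if membership[n] == c)
          for c in set(membership.values())}
```

On this toy graph the first round groups each triangle into one community, and the second round finds that merging the two would lower Q, so the procedure stops with two communities.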
It should be noted that the Louvain algorithm is a community discovery algorithm based on modularity, and the algorithm performs well in efficiency and effect, and can discover a hierarchical community structure, and the optimization goal is to maximize the modularity of the whole community network.
Modularity is defined as:
$$Q = \frac{1}{2m} \sum_{i,j} \left[ A_{i,j} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$$
$$A_{i,j} = \mathrm{freq}_{i,j} \cdot \log\left(\sum \mathrm{time}\right)$$
where A_{i,j} is the weight of the edge connecting nodes i and j; freq_{i,j} is the frequency of calls between nodes i and j; time is the duration of each call; m is the number of edges in the network;
k_i is the sum of the weights of all edges connected to node i; c_i is the community to which node i belongs; and δ(c_i, c_j) equals 1 when its two arguments are the same and 0 otherwise.
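As a rough illustration of these definitions, the sketch below derives edge weights from call records and evaluates Q for a given partition. Several points are assumptions rather than statements from the source: the natural logarithm is used, calls in both directions between two users are pooled into one undirected edge, durations are assumed positive, and m is taken as the total edge weight (the usual convention for weighted graphs). The function names are hypothetical.

```python
import math
from collections import defaultdict

def edge_weights(calls):
    """A_ij = freq_ij * log(sum of call durations) for each unordered pair of users."""
    freq = defaultdict(int)
    total_time = defaultdict(float)
    for caller, callee, duration in calls:
        if caller == callee:
            continue                      # ignore self-calls
        pair = frozenset((caller, callee))
        freq[pair] += 1
        total_time[pair] += duration
    # Natural log is an assumption; a total duration of 1 or less gives a weight <= 0,
    # and a total of 0 would raise ValueError, so durations are assumed positive.
    return {pair: freq[pair] * math.log(total_time[pair]) for pair in freq}

def modularity(weights, community):
    """Q = (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j), grouped by community."""
    m = sum(weights.values())             # taken here as the total edge weight
    strength = defaultdict(float)         # k_i
    internal = defaultdict(float)         # edge weight inside each community
    for pair, w in weights.items():
        a, b = tuple(pair)
        strength[a] += w
        strength[b] += w
        if community[a] == community[b]:
            internal[community[a]] += w
    total = defaultdict(float)            # summed k_i per community
    for node, k in strength.items():
        total[community[node]] += k
    return sum(internal[c] / m - (total[c] / (2 * m)) ** 2 for c in total)

calls = [("A", "B", 60), ("B", "A", 40), ("C", "D", 100)]
weights = edge_weights(calls)
q = modularity(weights, {"A": 0, "B": 0, "C": 1, "D": 1})
```

Here A and B exchange two calls totalling 100 seconds, so their edge weight is 2·log(100), twice that of the single 100-second call between C and D.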
The community structure produced by the Louvain algorithm is hierarchical: the new graph obtained after each round of calculation is the result of discovering several finer communities within a larger community. This hierarchy is a natural attribute of the network, so researchers can gain a deeper understanding of the internal structure and formation mechanism of a given community. The whole calculation is unsupervised: the final structure depends entirely on algorithmic clustering, and no classes need to be preset manually. The algorithm also performs well: compared with some classical community detection algorithms, the Louvain algorithm places almost no upper limit on the size of the graph and converges quickly after several iterations.
Core members are mined from the social network graph of the community; the problem is abstracted as the problem of finding the central nodes of a complex network graph.
In a complex network, the centrality of nodes can be used to characterize the connection mechanism and to explain observed phenomena. Research on complex networks uses several definitions of centrality: degree centrality, node betweenness centrality, closeness centrality, edge betweenness centrality, eigenvector centrality, and so on.
The embodiment of the invention uses node betweenness centrality and edge betweenness centrality; in addition, a custom measure, the volatility of a node's position relative to the center, is defined as needed.
First, the shortest paths of the graph are computed. A shortest path from one vertex of the graph to another is the path, among all paths along the edges of the graph between the two vertices, whose sum of edge weights is smallest.
In the embodiment of the invention, the Dijkstra algorithm is used to calculate the shortest paths from one node to all other nodes. Its main characteristic is that it expands outward layer by layer from the starting point until the end point is reached.
The algorithm comprises the following steps:
1. Initially, let S = {v0} and T = {other nodes}.
2. Calculate the distance from v0 to every vertex vi in T:
if there is an arc from v0 to vi, the distance value of vi is the weight on that arc;
if there is no such arc, the distance value of vi is infinite.
3. Select the vertex w with the smallest distance value from T and add it to the set S.
4. Update the distance values of the vertices remaining in T: if using w as an intermediate vertex shortens the distance from v0 to vi, replace the distance value with the shorter one.
5. Repeat steps 3 and 4 until all vertices are contained in S.
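The steps above can be sketched with a priority queue, which is a common way to implement step 3 (picking the vertex in T with the smallest distance value). The representation `adj[u][v] = distance` and the function name are assumptions for the example. The source does not say how call-based edge weights are converted into distances, so the sketch simply takes non-negative distances as given.

```python
import heapq

def dijkstra(adj, source):
    """Shortest distance from source to every reachable node; adj[u][v] >= 0."""
    dist = {source: 0.0}              # tentative distance values
    done = set()                      # the set S of finalized vertices
    heap = [(0.0, source)]
    while heap:
        d, w = heapq.heappop(heap)    # vertex outside S with the smallest distance
        if w in done:
            continue                  # stale heap entry, w was finalized earlier
        done.add(w)
        for v, weight in adj[w].items():
            nd = d + weight
            if nd < dist.get(v, float("inf")):   # going through w shortens the path to v
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Toy undirected graph with hypothetical distances.
adj = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"b": 5, "c": 1},
}
dist = dijkstra(adj, "a")
```

Vertices missing from `dist` correspond to the "infinite" distance of step 2; in this connected toy graph every vertex is reached.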
The process of calculating the centrality of each node comprises:
for each pair of nodes (s, t) within the community, calculating all shortest paths between them;
for each pair of nodes (s, t) in the community, judging whether the node v is on the solved shortest path;
accumulating over the shortest paths, and calculating the node betweenness centrality of the node v:
$$C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
wherein σ_st is the number of shortest paths from s to t, and σ_st(v) is the number of those shortest paths that pass through node v;
the process of calculating the centrality of each node further comprises: calculating the betweenness centrality of each edge:
calculating all shortest paths between node pairs in the community;
judging whether the edge e is on the shortest path;
accumulating over the shortest paths to obtain the betweenness centrality of the edge e:
$$C_B(e) = \sum_{s \neq t} \frac{\sigma_{st}(e)}{\sigma_{st}}$$
where σ_st is the number of shortest paths from s to t in graph G, and σ_st(e) is the number of those shortest paths that contain edge e;
and calculating the betweenness centrality of all edges.
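The node and edge betweenness procedure above can be sketched by brute force: enumerate every shortest path for every node pair and give each path a share of 1/σ_st. This is a minimal illustration that follows the listed steps literally and is only practical for small graphs. It assumes an undirected graph with strictly positive weights stored as `adj[u][v]`, counts each unordered pair once, and relies on exact equality to detect ties, which is safe for integer weights. All names are hypothetical.

```python
import heapq
from collections import defaultdict
from itertools import combinations

def all_shortest_paths(adj, s, t):
    """Every minimum-weight path from s to t, via Dijkstra with predecessor lists."""
    dist = {s: 0.0}
    preds = defaultdict(list)
    heap = [(0.0, s)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                preds[v] = [u]                 # strictly better: reset predecessors
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                preds[v].append(u)             # tie: one more shortest route into v
    if t not in dist:
        return []

    def walk(node):
        if node == s:
            return [[s]]
        return [p + [node] for q in preds[node] for p in walk(q)]

    return walk(t)

def betweenness(adj):
    """Node and edge betweenness, accumulated pair by pair."""
    node_score = defaultdict(float)
    edge_score = defaultdict(float)
    for s, t in combinations(adj, 2):          # each unordered pair once
        paths = all_shortest_paths(adj, s, t)
        for path in paths:
            share = 1.0 / len(paths)           # 1 / sigma_st
            for v in path[1:-1]:               # interior nodes only (s != v != t)
                node_score[v] += share
            for a, b in zip(path, path[1:]):
                edge_score[frozenset((a, b))] += share
    return dict(node_score), dict(edge_score)

# Toy graph: a 4-cycle a-b-c-d-a with unit weights.
adj = {
    "a": {"b": 1, "d": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1, "d": 1},
    "d": {"c": 1, "a": 1},
}
node_bc, edge_bc = betweenness(adj)
```

In the 4-cycle, opposite corners are joined by two equally short paths, so each intermediate node receives half a path per opposite pair. For larger graphs, Brandes' algorithm computes the same quantities far more efficiently.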
S50: calculating the volatility of each central node, wherein the central node with low volatility is the community core of the target user group. The volatility is obtained by calculating the standard deviation of node v relative to all other nodes in the network:
Figure BDA0002215676960000073
wherein:
Figure BDA0002215676960000074
the smaller the standard deviation, the smaller the volatility and the closer the node is to the center; such a node is the community core.
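The formula images for the volatility are not reproduced in this text, so the following is only one possible reading, stated as an assumption: the volatility of a node is taken as the population standard deviation of its shortest-path distances to all other nodes, and the candidate with the smallest value is chosen as the core. The function name and the sample distances are hypothetical.

```python
import math

def volatility(distances):
    """Population standard deviation of one node's distances to all other nodes.

    Assumption: the source's exact formula is unavailable; this is a plausible reading.
    """
    values = list(distances)
    mean = sum(values) / len(values)
    return math.sqrt(sum((d - mean) ** 2 for d in values) / len(values))

# Hypothetical shortest-path distances from each candidate central node to the others.
dist_from = {
    "hub": [1, 1, 1, 1],
    "edge_member": [1, 2, 2, 3],
}
core = min(dist_from, key=lambda v: volatility(dist_from[v]))
```

A node that is equally close to everyone has zero spread in its distances, which matches the statement that low volatility indicates a position near the center.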
The invention also provides a system for mining the community core character, which comprises the following steps:
an acquisition module: acquiring a target user group under a target group number, and acquiring communication data in the target user group;
a cleaning module: cleaning and converting the communication data to construct a target user communication sequence;
a dividing module: dividing the community structure of the target user group by using a Louvain algorithm through the communication data;
a central node module: computing the shortest paths between all nodes of each community using the Dijkstra algorithm, calculating the centrality of each node, and identifying the central nodes of the network graph;
a core mining module: and calculating the volatility of each central node, wherein the central node with small volatility is the community core of the target user group.
The system for mining the community core people can also realize the community core people mining method.
The invention also provides electronic equipment for mining the community core characters, which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the steps of the community core character mining method are realized when the processor executes the program.
The invention also proposes a readable storage medium for community core persona mining, on which a computer program is stored, the computer program being executed by a processor for implementing the steps of the community core persona mining method described above.
Based on an operator's large-scale call data, the invention applies the Louvain algorithm and the Dijkstra algorithm to determine node centrality, so that communities are divided automatically and community core members are identified automatically. The results produced by the algorithm are hierarchical: the new graph obtained after each round of calculation is the result of discovering several finer communities within a larger community. This hierarchy is a natural attribute of the network, so researchers can gain a deeper understanding of the internal structure and formation mechanism of a given community.
The invention brings several beneficial effects. The community structure produced by the algorithm is hierarchical, as described above. Because the invention uses the Louvain algorithm, the whole process is unsupervised: the final structure depends entirely on algorithmic clustering, and no classes need to be preset manually. The algorithm also performs well: compared with some classical community detection algorithms, the Louvain algorithm places almost no upper limit on the size of the graph and converges quickly after several iterations, so it is efficient.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the described embodiments. It will be apparent to those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, and such variations still fall within the scope of protection of the invention.

Claims (10)

1. A community core character mining method is characterized by comprising the following steps:
acquiring a target user group under a target group number, and acquiring communication data in the target user group;
cleaning and converting the communication data to construct a target user communication sequence;
dividing the community structure of the target user group by using a Louvain algorithm through the communication data;
computing the shortest paths among all nodes of the community through the Dijkstra algorithm, calculating the centrality of each node, and finding the central nodes of the network graph;
and calculating the volatility of each central node, wherein the central node with low volatility is the community core of the target user group.
2. The community core character mining method according to claim 1, wherein the process of dividing the community structure of the target user group by using the Louvain algorithm through the communication data comprises:
S31: each target user is taken as a node, and each node initially belongs to its own community;
S32: for any node i, calculating the change ΔQ in the modularity of the whole network after node i is merged into each adjacent community, and merging node i into the community with the largest ΔQ; if the calculated ΔQ is negative, the community to which node i belongs is not changed;
S33: repeating step S32 until moving any node to an adjacent community in the network can no longer increase the modularity;
S34: merging communities: compressing each obtained community into a single node, and assigning to each edge between community nodes, as its new weight, the sum of the edge weights of all node pairs in the original communities.
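Steps S31 to S34 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names, the dict-of-dicts graph layout and the toy graph are assumptions made for illustration, and ΔQ is obtained by recomputing the modularity Q before and after each trial move, which is simple to check but slower than the incremental formula used in practical Louvain implementations. The aggregation step follows the common Louvain convention (inter-community weights are summed, intra-community weight is kept as a self-loop), which is one reading of S34.

```python
from collections import defaultdict

def modularity(adj, community):
    """Weighted modularity Q; adj is a symmetric dict-of-dicts of edge weights."""
    two_m = sum(w for u in adj for w in adj[u].values())
    inside = defaultdict(float)   # weight over ordered node pairs inside a community
    degree = defaultdict(float)   # total weighted degree of each community
    for u in adj:
        for v, w in adj[u].items():
            degree[community[u]] += w
            if community[u] == community[v]:
                inside[community[u]] += w
    return sum(inside[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)

def local_moving(adj):
    community = {u: u for u in adj}          # S31: one community per node
    improved = True
    while improved:                          # S33: stop when no move raises Q
        improved = False
        for i in adj:
            home = community[i]
            best_c, best_q = home, modularity(adj, community)
            # S32: try every adjacent community and keep the largest gain
            for c in sorted({community[j] for j in adj[i] if j != i}):
                community[i] = c
                q = modularity(adj, community)
                if q > best_q + 1e-12:       # strict gain only; ties keep the earlier choice
                    best_c, best_q = c, q
            community[i] = best_c
            if best_c != home:
                improved = True
    return community

def aggregate(adj, community):
    """S34: compress each community into one node and sum the edge weights."""
    merged = defaultdict(lambda: defaultdict(float))
    for u in adj:
        for v, w in adj[u].items():
            merged[community[u]][community[v]] += w
    return {c: dict(nbrs) for c, nbrs in merged.items()}

# Toy graph (hypothetical): two triangles joined by the single edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("c", "d")]
adj = {n: {} for n in "abcdef"}
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0

community = local_moving(adj)
merged = aggregate(adj, community)
print(community, round(modularity(adj, community), 4))
```

On this toy graph the local-moving phase groups each triangle into one community, and the aggregated graph has two community nodes joined by an edge of weight 1. Running `local_moving` again on `merged` would give the next level of the hierarchy.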
3. The community core character mining method according to any one of claims 1 or 2, wherein the process of making the shortest path between the community nodes by the Dijkstra algorithm comprises:
S41: for each node v0 in the community, letting S = {v0} and T = {all other nodes};
S42: calculating the distance from v0 to every vertex vi in T: if there is an arc between vi and v0, the distance value from vi to v0 is the weight on the arc; if there is no arc between vi and v0, the distance value from vi to v0 is infinite;
S43: selecting the vertex w with the minimum distance value from T, and adding the vertex w to the set S;
S44: modifying the distance values of the vertices remaining in T: if adding w as an intermediate vertex shortens the distance from v0 to vi, modifying the distance value accordingly;
S45: repeating steps S43 and S44 until all the vertices are contained in S.
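Steps S41 to S45 correspond to the classic set-based form of the Dijkstra algorithm. The following Python sketch follows them step by step; the function name, the graph layout and the example weights are assumptions for illustration, and the edge weights are treated directly as non-negative distances, since the claim does not say how call data is mapped to weights.

```python
import math

def dijkstra(adj, source):
    """Shortest distances from `source`; adj maps node -> {neighbor: weight >= 0}."""
    # S41/S42: S = {source}; every other vertex starts at its arc weight or infinity.
    dist = {v: adj[source].get(v, math.inf) for v in adj}
    dist[source] = 0.0
    remaining = set(adj) - {source}              # the set T
    while remaining:                             # S45: repeat until T is empty
        w = min(remaining, key=lambda v: dist[v])    # S43: closest vertex in T
        remaining.remove(w)
        if dist[w] == math.inf:
            break                                # everything left is unreachable
        for v, weight in adj[w].items():         # S44: relax through w
            if v in remaining and dist[w] + weight < dist[v]:
                dist[v] = dist[w] + weight
    return dist

# Hypothetical weighted, undirected example graph.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"C": 1},
}
distances = dijkstra(graph, "A")
print(distances)
```

Selecting the minimum by scanning T costs O(n) per step, which matches the wording of the claim; a binary heap is the usual optimization for large sparse graphs.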
4. The community core character mining method according to claim 3, wherein the process of calculating the centrality of each node comprises:
for each pair of nodes (s, t) within the community, calculating all shortest paths between them;
for each pair of nodes (s, t) in the community, judging whether the node v is on the solved shortest path;
accumulating over the shortest paths, and calculating the node betweenness centrality of the node v:
$$B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
wherein $\sigma_{st}$ is the number of shortest paths from s to t, and $\sigma_{st}(v)$ is the number of those shortest paths that pass through node v;
and calculating the node betweenness centrality of all the nodes.
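The node betweenness calculation can be sketched in Python as follows. The sketch rests on several assumptions that are not stated in the claim: the graph is undirected, each unordered pair (s, t) is counted once, and the weights are positive integers so that distances can be compared exactly. It uses the fact that v lies on a shortest s-t path exactly when d(s, v) + d(v, t) = d(s, t), in which case the number of shortest s-t paths through v is the product of the shortest-path counts from s to v and from v to t. All function names are illustrative.

```python
import heapq
import math
from itertools import combinations

def shortest_path_counts(adj, source):
    """Dijkstra from `source`; returns (dist, sigma), sigma[v] = number of shortest paths."""
    dist = {v: math.inf for v in adj}
    sigma = {v: 0 for v in adj}
    dist[source], sigma[source] = 0, 1
    heap, done = [(0, source)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:                 # strictly shorter: restart the count
                dist[v], sigma[v] = nd, sigma[u]
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:              # equally short: add the new paths
                sigma[v] += sigma[u]
    return dist, sigma

def node_betweenness(adj):
    info = {s: shortest_path_counts(adj, s) for s in adj}
    score = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):        # each unordered pair once
        d_st, sigma_st = info[s][0][t], info[s][1][t]
        if sigma_st == 0:
            continue                         # t is unreachable from s
        for v in adj:
            if v in (s, t):
                continue
            if info[s][0][v] + info[v][0][t] == d_st:   # v is on a shortest path
                score[v] += info[s][1][v] * info[v][1][t] / sigma_st
    return score

# Hypothetical graph: A-C has two equally short routes (direct, and via B).
graph = {
    "A": {"B": 1, "C": 3},
    "B": {"A": 1, "C": 2},
    "C": {"A": 3, "B": 2, "D": 1},
    "D": {"C": 1},
}
betweenness = node_betweenness(graph)
print(betweenness)
```

In this example node C scores highest because every shortest path to D passes through it, while B only carries half of the shortest paths from A to C and from A to D.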
5. The community core character mining method according to claim 3, wherein the process of calculating the centrality of each node further comprises: calculating the betweenness centrality of each edge:
calculating all shortest paths between node pairs in the community;
judging whether the edge e is on the shortest path;
accumulating over the shortest paths to obtain the betweenness centrality of the edge e:
$$B(e) = \sum_{s \neq t} \frac{\sigma_{st}(e)}{\sigma_{st}}$$
wherein $\sigma_{st}$ is the number of all shortest paths from s to t in graph G, and $\sigma_{st}(e)$ is the number of those shortest paths that contain edge e;
and calculating the betweenness centrality of all edges.
6. The community core character mining method according to claim 5, wherein the process of calculating the volatility of each of the central nodes comprises:
by calculating the standard deviation of node v from all other nodes in the network:
wherein:
and the smaller the standard deviation, the smaller the volatility and the closer the node is to the center; such a node is the community core.
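The formulas of this claim did not survive in the text, so the following Python sketch shows only one possible reading, labeled here as an assumption: the volatility of a central node v is taken to be the population standard deviation of the shortest-path distances from v to all other nodes, and the candidate with the smallest value is chosen as the core. The names `volatility`, `community_core` and the distance table are hypothetical.

```python
from statistics import pstdev

def volatility(dist, v):
    """Population standard deviation of v's shortest-path distances to other nodes.

    ASSUMPTION: this is one reading of "standard deviation of node v from all
    other nodes"; the patent's own formula is not available in the text.
    """
    others = [d for u, d in dist[v].items() if u != v]
    return pstdev(others)

def community_core(dist, central_nodes):
    """Pick the candidate central node with the smallest volatility."""
    return min(central_nodes, key=lambda v: volatility(dist, v))

# Hypothetical precomputed shortest-path distances for two candidate nodes.
dist = {
    "hub":  {"hub": 0, "x": 1, "y": 1, "z": 2},
    "edge": {"edge": 0, "x": 1, "y": 3, "z": 5},
}
core = community_core(dist, ["hub", "edge"])
print(core, volatility(dist, "hub"), volatility(dist, "edge"))
```

Here the node labeled "hub" is selected because its distances to the other nodes are more uniform than those of "edge".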
7. The community core character mining method according to claim 1, wherein the process of acquiring a target user group under a target group number and acquiring the communication data inside the target user group comprises: filtering out the numbers whose number state is invalid among the target users; associating the call detail table of the target users; acquiring the call records, within one month, of the target users whose state is valid; and generating a call record sequence of the target users, wherein each call record comprises: user identification, called number and call duration.
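The filtering and sequence-building step can be sketched in Python as below. The field names (`user_id`, `status`, `called_number`, `duration`), the use of total call duration as the edge weight, and the restriction to calls whose two endpoints are both valid group members are assumptions made for illustration; the claim only lists the three fields of a call record.

```python
from collections import defaultdict

def build_call_graph(users, call_records):
    """Drop invalid numbers and aggregate in-group calls into a weighted graph."""
    valid = {u["user_id"] for u in users if u["status"] == "valid"}
    adj = defaultdict(lambda: defaultdict(float))
    for rec in call_records:
        caller = rec["user_id"]
        callee = rec["called_number"]
        duration = rec["duration"]
        # Keep only calls between two distinct valid members of the group.
        if caller in valid and callee in valid and caller != callee and duration > 0:
            adj[caller][callee] += duration      # undirected: store both directions
            adj[callee][caller] += duration
    return {u: dict(nbrs) for u, nbrs in adj.items()}

# Hypothetical one-month sample.
users = [
    {"user_id": "u1", "status": "valid"},
    {"user_id": "u2", "status": "valid"},
    {"user_id": "u3", "status": "invalid"},
]
call_records = [
    {"user_id": "u1", "called_number": "u2", "duration": 60},
    {"user_id": "u2", "called_number": "u1", "duration": 30},
    {"user_id": "u1", "called_number": "u3", "duration": 100},
]
call_graph = build_call_graph(users, call_records)
print(call_graph)
```

The resulting dict-of-dicts can serve as the input graph for community division. Whether duration, call count, or some transform of them should be the weight is a design choice the claim leaves open.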
8. A system for community core character mining, comprising:
an acquisition module: acquiring a target user group under a target group number, and acquiring communication data in the target user group;
a cleaning module: cleaning and converting the communication data to construct a target user communication sequence;
a dividing module: dividing the community structure of the target user group by using a Louvain algorithm through the communication data;
a central node module: making shortest paths among all nodes of the community through a Dijkstra algorithm, calculating the centrality of each node, and exploring a central node of a network graph;
a core mining module: and calculating the volatility of each central node, wherein the central node with small volatility is the community core of the target user group.
9. An electronic device for community core character mining, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein: the processor, when executing the program, performs the steps of the community core character mining method of any one of claims 1-7.
10. A readable storage medium for community core character mining, having a computer program stored thereon, characterized in that: the computer program, when executed by a processor, performs the steps of the community core character mining method of any one of claims 1-7.
CN201910914473.5A 2019-09-26 2019-09-26 Community core character mining method, system, electronic equipment and readable storage medium Pending CN110825935A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910914473.5A CN110825935A (en) 2019-09-26 2019-09-26 Community core character mining method, system, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910914473.5A CN110825935A (en) 2019-09-26 2019-09-26 Community core character mining method, system, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN110825935A true CN110825935A (en) 2020-02-21

Family

ID=69548395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910914473.5A Pending CN110825935A (en) 2019-09-26 2019-09-26 Community core character mining method, system, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN110825935A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111949835A (en) * 2020-07-13 2020-11-17 北京明略软件系统有限公司 Data processing method and device
CN112100427A (en) * 2020-09-03 2020-12-18 Oppo广东移动通信有限公司 Video processing method and device, electronic equipment and storage medium
CN113761080A (en) * 2021-04-01 2021-12-07 京东城市(北京)数字科技有限公司 Community division method, device, equipment and storage medium
CN114547143A (en) * 2022-02-15 2022-05-27 支付宝(杭州)信息技术有限公司 Core business object mining method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102202012A (en) * 2011-05-30 2011-09-28 中国人民解放军总参谋部第五十四研究所 Group dividing method and system of communication network
CN103020302A (en) * 2012-12-31 2013-04-03 中国科学院自动化研究所 Academic core author excavation and related information extraction method and system based on complex network
CN108509551A (en) * 2018-03-19 2018-09-07 西北大学 A kind of micro blog network key user digging system under the environment based on Spark and method
CN108509607A (en) * 2018-04-03 2018-09-07 三盟科技股份有限公司 A kind of community discovery method and system based on Louvain algorithms
US20190044821A1 (en) * 2017-08-01 2019-02-07 Elsevier, Inc. Systems and methods for extracting structure from large, dense, and noisy networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
张玉琢: "《数据结构实验教程》", 31 August 2018 *
杨济海: "基于复杂网络的电力通信网拓扑分析与优化", 《计算机与数字工程》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200221