CN110247805B - Method and device for identifying propagation key nodes based on K-shell decomposition - Google Patents

Method and device for identifying propagation key nodes based on K-shell decomposition

Publication number
CN110247805B
Authority
CN
China
Prior art keywords
node
nodes
propagation
network
shell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910550676.0A
Other languages
Chinese (zh)
Other versions
CN110247805A (en
Inventor
钱琳
俞俊
朱广新
郭云涛
房涛
庞恒茂
许明杰
王琳
梅竹
陈海洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NARI Group Corp
Nari Technology Co Ltd
State Grid Shaanxi Electric Power Co Ltd
Original Assignee
NARI Group Corp
Nari Technology Co Ltd
State Grid Shaanxi Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NARI Group Corp, Nari Technology Co Ltd, State Grid Shaanxi Electric Power Co Ltd filed Critical NARI Group Corp
Priority to CN201910550676.0A priority Critical patent/CN110247805B/en
Publication of CN110247805A publication Critical patent/CN110247805A/en
Application granted granted Critical
Publication of CN110247805B publication Critical patent/CN110247805B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12Discovery or management of network topologies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14Network analysis or design
    • H04L41/142Network analysis or design using statistical or mathematical methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/52User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services

Abstract

The invention discloses a method and a device for identifying propagation key nodes based on K-shell decomposition. The method comprises the following steps: establishing a propagation network by collecting message forwarding on a social platform, and taking each individual who forwards a message in the propagation network as a node; obtaining the number of connecting edges of each node from the friend list data; calculating the K-shell index of each node from the node degree; calculating the shortest distance between each pair of nodes in the propagation network, where the shortest distance between nodes represents the propagation position of the individual in the social network; and calculating the ranking score of each node from the K-shell index and the shortest distances between nodes, thereby obtaining the key propagators in the propagation network. The method and the device locate nodes in the network more accurately, mine key propagators in the social network accurately, and reduce the misjudgment rate.

Description

Method and device for identifying propagation key nodes based on K-shell decomposition
Technical Field
The invention relates to the field of network information mining, in particular to a method and a device for identifying propagation key nodes based on K-shell decomposition.
Background
In today's networked society, messages spread very quickly. Key nodes in propagation are special individuals that can affect the structure and function of the network to a greater extent, and the key nodes in a rumor propagation network are the nodes that can accelerate rumor spreading the most. For example, a microblog "V" account can accelerate the spread of a rumor. Key propagators therefore need to be accurately identified among a large number of users in the rumor propagation network, which helps to better control rumor propagation.
At present, rumors are mostly handled by manual deletion. Some approaches apply node ranking to the rumor propagation network, but they use only a single method to determine either the position or the influence of a node. A node that such a method places at the center of the network may actually lie at its edge, and the position of a node with high influence may likewise be misjudged, so the accuracy of identifying key propagators in the rumor propagation network is low.
Disclosure of Invention
The purpose of the invention is as follows: in order to overcome the defects of the prior art, the invention provides a method for identifying propagation key nodes based on K-shell decomposition, which can solve the problem of error evaluation of high-K-shell nodes at the edge of a network and more accurately mine key propagators of a propagation network.
The technical scheme is as follows: the invention discloses a method for identifying propagation key nodes based on K-shell decomposition, which comprises the following steps:
establishing a propagation network by collecting message forwarding on a social platform, determining a node set from the individuals who forward the message in the propagation network, and collecting the friend list data corresponding to each node in the node set;
obtaining the number of directly connected edges of each node from the friend list data, where a directly connected edge exists between two nodes if the corresponding individuals are friends;
determining the degree of each node in the propagation network from the number of its directly connected edges, and calculating the K-shell index of each node from its degree, where the K-shell index represents the influence of the individual in the propagation network;
and calculating the shortest distance between each pair of nodes in the propagation network by using the Floyd algorithm, and calculating the ranking score of each node from the K-shell index and the shortest distance, so as to obtain the propagation key nodes in the propagation network.
Further, comprising:
determining the degree of each node in the propagation network from the number of its directly connected edges specifically includes: the degree of a node in the propagation network is equal to the number of directly connected edges of the node.
Further, comprising:
the K shell index of each node is calculated according to the node degree, and the method specifically comprises the following steps:
obtaining the degree of each node in the node set S of the propagation network, where $S = \{s_1, s_2, \ldots, s_n\}$ and n is the total number of nodes;
traversing the node set S, finding all nodes of degree 1, deleting those nodes and their directly connected edges, and storing the nodes in the set $S_1$; the K-shell index of each node in $S_1$ is $K_{S_1}(m) = 1,\ m \in S_1$;
traversing the node set S, finding all nodes of degree 2, deleting those nodes and the edges connected to them, and storing the nodes in the set $S_2$; the K-shell index of each node in $S_2$ is $K_{S_2}(m) = 2,\ m \in S_2$;
repeatedly traversing the node set S, increasing the searched degree by 1 each time, until all nodes corresponding to the maximum degree n have been found, deleting those nodes and the edges connected to them, and storing the nodes in the set $S_n$; the K-shell index of each node in $S_n$ is $K_{S_n}(m) = n,\ m \in S_n$.
Further, comprising:
the calculating the shortest distance between each pair of nodes in the propagation network by using the Floyd algorithm comprises the following steps:
acquiring a network adjacency matrix G corresponding to the propagation network;
initializing a distance matrix D from the adjacency matrix G: if node j is directly reachable from node i, then D[i][j] = d, where d is the length of that path; otherwise D[i][j] = ∞, where 1 ≤ i < j ≤ n;
defining a matrix L to record the inserted nodes, where L[i][j] denotes the node that must be passed through on the way from node i to node j, initialized as L[i][j] = j; inserting each other node k into the path from i to j and comparing the distance from i to j after the insertion with the original distance, that is, D[i][j] = min(D[i][j], D[i][k] + D[k][j]); if the value of D[i][j] becomes smaller, setting L[i][j] = k; and repeating this operation until D is no longer updated, then outputting the distance matrix D.
Further, comprising:
the formula for calculating the ranking corresponding score of each node is as follows:
Figure GDA0003281752530000024
where $K_S(t)$ is the K-shell index of the t-th node and $d_{i,t}$ is the shortest distance between nodes i and t.
An apparatus for identifying propagation critical nodes based on K-shell decomposition, comprising:
an acquisition module, configured to establish a propagation network by collecting message forwarding on a social platform, determine a node set from the individuals who forward the message in the propagation network, and collect the friend list data corresponding to each node in the node set;
the direct connection edge calculation module is used for obtaining the number of the direct connection edges corresponding to each node according to the friend list data; if the individuals are in a friend relationship, a direct connection edge exists between the two corresponding nodes;
the K-shell index calculation module is used for determining the degree of each node in the propagation network according to the number of the directly connected edges of each node, and calculating the K-shell index of each node according to the degree of each node, wherein the K-shell index is used for representing the influence of individuals in the propagation network;
and the score calculation module is used for calculating the shortest distance between each pair of nodes in the propagation network by adopting a Floyd algorithm, and calculating the ranking corresponding score of each node according to the K shell index and the shortest distance so as to obtain the propagation key nodes in the propagation network.
Further, comprising:
in the K-shell index calculation module, the degree of the nodes in the propagation network is equal to the number of the directly connected edges of the nodes.
Further, comprising:
the K-shell index calculation module further includes:
obtaining the degree of each node in the node set S of the propagation network, where $S = \{s_1, s_2, \ldots, s_n\}$ and n is the total number of nodes;
traversing the node set S, finding all nodes of degree 1, deleting those nodes and their directly connected edges, and storing the nodes in the set $S_1$; the K-shell index of each node in $S_1$ is $K_{S_1}(m) = 1,\ m \in S_1$;
traversing the node set S, finding all nodes of degree 2, deleting those nodes and the edges connected to them, and storing the nodes in the set $S_2$; the K-shell index of each node in $S_2$ is $K_{S_2}(m) = 2,\ m \in S_2$;
repeatedly traversing the node set S, increasing the searched degree by 1 each time, until all nodes corresponding to the maximum degree n have been found, deleting those nodes and the edges connected to them, and storing the nodes in the set $S_n$; the K-shell index of each node in $S_n$ is $K_{S_n}(m) = n,\ m \in S_n$.
Further, comprising:
the calculating the shortest distance between each pair of nodes in the propagation network by using the Floyd algorithm further comprises:
acquiring a network adjacency matrix G corresponding to the propagation network;
initializing a distance matrix D from the adjacency matrix G: if node j is directly reachable from node i, then D[i][j] = d, where d is the length of that path; otherwise D[i][j] = ∞, where 1 ≤ i < j ≤ n;
defining a matrix L to record the inserted nodes, where L[i][j] denotes the node that must be passed through on the way from node i to node j, initialized as L[i][j] = j; inserting each other node k into the path from i to j and comparing the distance from i to j after the insertion with the original distance, that is, D[i][j] = min(D[i][j], D[i][k] + D[k][j]); if the value of D[i][j] becomes smaller, setting L[i][j] = k; and repeating this operation until D is no longer updated, then outputting the distance matrix D.
Further, comprising:
in the score calculation module, a formula for calculating the ranking corresponding score of each node is as follows:
Figure GDA0003281752530000041
where $K_S(t)$ is the K-shell index of the t-th node and $d_{i,t}$ is the shortest distance between nodes i and t.
Beneficial effects: the invention improves on commonly used network node indices to obtain a ranking score that comprehensively reflects the role of key propagation nodes in the propagation network. It locates nodes in the network more accurately, mines key propagators in the social network accurately, and reduces the misjudgment rate. In addition, the Floyd algorithm is used to calculate the distances between nodes; compared with other distance algorithms its time complexity is low, so the time cost is lower when the method is applied to a complex social network, and identification efficiency is improved.
Drawings
FIG. 1 is a flow chart of an identification method in an embodiment of the present invention;
FIG. 2 is a network node connection diagram according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a structure of an identification device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device in an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The influence of a node on propagation is determined not only by the number of individuals it can directly influence but also by whether its position in the network is sufficiently central. A node ranking method therefore needs to consider both the influence of the node itself (the number of individuals with whom it is closely connected in the social network) and the position of the node in the network. The current classical node ranking methods do not consider both aspects at the same time, so the existing methods cannot comprehensively evaluate the role of a node in the network.
The k-shell decomposition method is widely used because of its low computational complexity and good identification performance. A node with a larger k-shell index is usually located at the center of the network, but it may also appear at the edge of the network; the k-shell decomposition method still treats such a node as important, although in fact it is not.
To solve these problems, the invention designs a new node ranking method that integrates node path information into the k-shell decomposition method to improve identification accuracy. It corrects the k-shell decomposition method's mis-evaluation of high-k-shell nodes at the edge of the network and mines the key propagators of the propagation network more accurately, which has great application value.
Referring to fig. 1, the present invention provides a method for identifying propagation key nodes based on K-shell decomposition, which specifically includes:
S110, establishing a propagation network by collecting message forwarding on a social platform, determining a node set from the individuals who forward the message in the propagation network, and collecting the friend list data corresponding to each node in the node set, where the node set is $S = \{s_1, s_2, \ldots, s_n\}$ and n is the total number of nodes.
The embodiment of the invention does not limit the specific network: the whole forwarding chain of a certain message on a microblog or other social platform can be collected, and the propagation network is established from the whole chain.
S120, obtaining the number of connecting edges corresponding to each node according to the friend list data; if the individuals are in a friend relationship, a direct connection edge exists between the two corresponding nodes;
Specifically, the method does not limit the specific social software. The friend list relationships are obtained from the corresponding database of the social software. If two individuals are friends, a directly connected edge exists between them; if they are not friends, no directly connected edge exists, and individuals who are not friends influence each other through individuals who are friends of both.
S130, determining the degree of each node in the propagation network according to the number of the direct connection edges of each node, and calculating the K-shell index of each node according to the degree of the node, wherein the K-shell index is used for representing the direct influence of individuals in the propagation network.
The number of the directly connected edges corresponding to each node is the degree corresponding to the node and is represented as K (i), wherein i is the network node number, i is more than or equal to 1 and less than or equal to n, n is the total number of the nodes in the propagation network, and meanwhile, the node with the directly connected edge is a neighbor node corresponding to the node.
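Steps S120 and S130 up to this point can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `friend_lists` data, the user ids, and the helper names `build_edges` and `degrees` are all hypothetical, and friendship is assumed to be symmetric (an edge is kept if either user lists the other).

```python
# Hypothetical friend-list data: user id -> set of friend ids.
# Only users who forwarded the message are nodes of the propagation network.
friend_lists = {
    "a": {"b", "c", "x"},   # "x" did not forward the message, so it is ignored
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}


def build_edges(friend_lists):
    """Return the set of undirected direct edges between forwarding users."""
    nodes = set(friend_lists)
    edges = set()
    for user, friends in friend_lists.items():
        for friend in friends:
            # Keep an edge only if both endpoints are nodes of the network.
            if friend in nodes and friend != user:
                edges.add(frozenset((user, friend)))
    return edges


def degrees(friend_lists, edges):
    """Degree K(i): the number of direct edges incident to node i."""
    degree = {node: 0 for node in friend_lists}
    for edge in edges:
        for node in edge:
            degree[node] += 1
    return degree


edges = build_edges(friend_lists)
degree = degrees(friend_lists, edges)
print(sorted(tuple(sorted(e)) for e in edges))
print(degree)
```

Storing each edge as a `frozenset` means a friendship listed by both users is counted once, so the degree equals the number of distinct neighbor nodes.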
The calculating the K-shell index of each node according to the degree of the node specifically includes:
Input: the degree of each node in the node set S of the propagation network;
Steps:
S1: traverse the node set S, find all nodes of degree 1, delete those nodes and the edges connected to them, and store the nodes in the set $S_1$; the K-shell index of each node in $S_1$ is $K_{S_1}(m) = 1,\ m \in S_1$;
S2: traverse the node set S, find all nodes of degree 2, delete those nodes and the edges connected to them, and store the nodes in the set $S_2$; the K-shell index of each node in $S_2$ is $K_{S_2}(m) = 2,\ m \in S_2$;
S3: repeatedly traverse the node set S, increasing the searched degree by 1 each time, until all nodes corresponding to the maximum degree n have been found; delete those nodes and the edges connected to them, and store the nodes in the set $S_n$; the K-shell index of each node in $S_n$ is $K_{S_n}(m) = n,\ m \in S_n$;
Output: the node sets $S_1, S_2, \ldots, S_n$ and the K-shell index of the nodes in each set.
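The peeling procedure above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patent's code: the function name `k_shell` is hypothetical, the pruning at each level is repeated until no node of that degree or lower remains (as the worked example of this embodiment does), and the edge list is transcribed from the adjacency matrix given later in this embodiment.

```python
def k_shell(adjacency):
    """Return {node: K-shell index} by iterative pruning.

    adjacency: dict mapping each node to the set of its neighbours.
    """
    # Work on a copy so the caller's graph is left untouched.
    remaining = {node: set(nbrs) for node, nbrs in adjacency.items()}
    index = {}
    k = 1  # isolated nodes, if any, are grouped with shell 1 in this sketch
    while remaining:
        # Nodes whose current degree is <= k are peeled at level k; removing
        # them can push their neighbours down to degree <= k as well, so the
        # same level is re-checked before k is increased.
        peel = [n for n, nbrs in remaining.items() if len(nbrs) <= k]
        if not peel:
            k += 1
            continue
        for node in peel:
            index[node] = k
            for nbr in remaining.pop(node):
                if nbr in remaining:
                    remaining[nbr].discard(node)
    return index


# Example network of this embodiment (nodes numbered 1..10).
edge_list = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (2, 9), (3, 9),
             (3, 10), (4, 5), (4, 6), (5, 6), (5, 7), (5, 8)]
graph = {node: set() for node in range(1, 11)}
for u, v in edge_list:
    graph[u].add(v)
    graph[v].add(u)

shells = k_shell(graph)
print(sorted(shells.items()))
```

On this network the sketch assigns index 1 to nodes 7, 8 and 10 and index 2 to the remaining seven nodes, matching the indices stated in Step 1 of the worked example.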
S140, calculating the shortest distance between each pair of nodes in the propagation network by using a Floyd algorithm, wherein the shortest distance between the nodes is used for correcting the K-shell index of each node.
For example, a high-k-shell node located at the edge of the network is far from the other nodes of the network, so its influence is overestimated and it should be given a lower weight. This weight can be described by the reciprocal of the distance between nodes: the longer the distance, the smaller its reciprocal, and the less important the node is.
The calculating the shortest distance between each pair of nodes in the propagation network by using the Floyd algorithm comprises the following steps:
Input: the network adjacency matrix G corresponding to the propagation network;
Steps:
(1) initialize a distance matrix D from the adjacency matrix G: if node j is directly reachable from node i, then D[i][j] = d, where d is the length of that path; otherwise D[i][j] = ∞, where 1 ≤ i < j ≤ n;
(2) define a matrix L to record the inserted nodes, where L[i][j] denotes the node that must be passed through on the way from node i to node j, initialized as L[i][j] = j; insert each other node k into the path i → j and compare the distance of i → j after the insertion with the original distance, that is, D[i][j] = min(D[i][j], D[i][k] + D[k][j]); if the value of D[i][j] becomes smaller, set L[i][j] = k; repeat this operation until D is no longer updated;
Output: the distance matrix D between the nodes in the network.
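A minimal Python sketch of the two steps above, assuming an unweighted, undirected network in which every direct edge has length 1. The function name `floyd` and the 4-node test graph are hypothetical; the triple loop tries every node k once as an intermediate node, after which D can no longer be updated.

```python
import math


def floyd(adjacency_matrix):
    """Return (D, L) for an unweighted graph given as a 0/1 matrix."""
    n = len(adjacency_matrix)
    # D[i][j]: 0 on the diagonal, 1 for a direct edge, infinity otherwise.
    D = [[0 if i == j else (1 if adjacency_matrix[i][j] else math.inf)
          for j in range(n)] for i in range(n)]
    # L[i][j]: last node inserted into the i -> j path, initialised to j.
    L = [[j for j in range(n)] for _ in range(n)]
    for k in range(n):              # candidate intermediate node
        for i in range(n):
            for j in range(n):
                through_k = D[i][k] + D[k][j]
                if through_k < D[i][j]:
                    D[i][j] = through_k
                    L[i][j] = k
    return D, L


# Hypothetical 4-node path graph 0 - 1 - 2 - 3 (0-based indices).
G = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
D, L = floyd(G)
print(D[0][3])  # 3: the only route is 0 -> 1 -> 2 -> 3
print(L[0][3])  # 2: node 2 was the last node inserted into the 0 -> 3 path
```

Using `math.inf` for unreachable pairs lets the comparison `through_k < D[i][j]` work without special cases, since infinity plus any finite length stays infinite.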
Further, comprising:
the ranking score formula of each node is as follows:
Figure GDA0003281752530000071
where $K_S(t)$ is the K-shell index of the t-th node and $d_{i,t}$ is the shortest distance between nodes i and t.
S150, calculating ranking corresponding scores of all the nodes according to the K shell indexes and the shortest distance between the nodes, and further obtaining key propagators in the propagation network.
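The ranking formula itself appears only as an image and is not reproduced in this text. One form that is consistent with the description above, in which each other node t contributes its K-shell index weighted by the reciprocal of its distance from node i, is sketched below. It is an assumption for illustration, not the confirmed formula of the invention, and $\mathrm{score}(i)$ is a hypothetical symbol.

```latex
% ASSUMED form of the ranking score (not confirmed by the source text)
\mathrm{score}(i) \;=\; \sum_{t \neq i} \frac{K_S(t)}{d_{i,t}}

% Toy example: a path a - b - c with K_S = 1 for every node
\mathrm{score}(b) = \frac{1}{1} + \frac{1}{1} = 2,
\qquad
\mathrm{score}(a) = \frac{1}{1} + \frac{1}{2} = 1.5
```

In the toy example the middle node b scores higher than the end node a even though all three nodes have the same K-shell index, which is the kind of position correction the description calls for.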
It should be noted that the method flowchart in the embodiment of the present invention is intended to illustrate the technical solution more clearly and does not limit the technical solution provided in the embodiment. The embodiment is also not limited to social network applications; the technical solution provided in the embodiment is equally applicable to similar problems in other system structures and business applications.
The following detailed description of the embodiments of the present invention will be made with reference to the example network of fig. 2.
Step 1: calculate the K-shell index of every node. The nodes of degree 1 in the network (7, 8, 10) and their connecting edges are removed, after which no node of degree 1 remains, so the K-shell index of nodes 7, 8 and 10 is 1 and the network now contains only seven nodes (1, 2, 3, 4, 5, 6, 9). Next, the nodes of degree 2 (5, 6, 9) and their connecting edges are removed. Nodes of degree 2 (3, 4) still exist in the network, so nodes 3 and 4 and their connecting edges are removed. The remaining nodes (1, 2) then have degree less than or equal to 2, so nodes 1 and 2 are removed as well. No nodes remain in the network at this point, and the K-shell index of nodes 1, 2, 3, 4, 5, 6 and 9 is 2. The K-shell indices of all nodes are given in Table 1:
TABLE 1 K-shell index of each node
Node            1  2  3  4  5  6  7  8  9  10
K-shell index   2  2  2  2  2  2  1  1  2  1
Step 2: calculate the shortest distance $d_{ij}$ between each pair of nodes in the network.
The input is the corresponding network adjacency matrix, and the output is the distance matrix, and the corresponding adjacency matrix in this embodiment is represented as:
0 1 1 1 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 0
1 1 0 0 0 0 0 0 1 1
1 1 0 0 1 1 0 0 0 0
0 0 0 1 0 1 1 1 0 0
0 0 0 1 1 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0
0 1 1 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
Taking node 1 as an example, it has directly connected edges to nodes 2, 3 and 4, so the corresponding entries are 1; it has no directly connected edge to the remaining six nodes, so the corresponding entries are 0.
See Table 2: the rows and columns of the table are the node numbers, and the entries form the shortest-distance matrix between nodes. Take the node pair (1, 6) as an example. Because node 1 and node 6 are not directly adjacent ($a_{1,6} = 0$), the initial distance is set to infinity. Inserting node 2 between nodes 1 and 6 does not make them reachable from each other, and neither does inserting node 3. Inserting node 4 makes node 1 and node 6 mutually reachable, so the shortest distance between them is updated to 2. Nodes 5, 6, 7, 8, 9 and 10 are then inserted in sequence without further shortening the distance between node 1 and node 6, so $d_{1,6} = 2$.
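The updates for the pair (1, 6) can be written out with the relaxation rule D[i][j] = min(D[i][j], D[i][k] + D[k][j]), reading the edge lengths from the adjacency matrix above and taking every direct edge to have length 1 (an assumption consistent with the 0/1 matrix).

```latex
\begin{align*}
\text{initial:}\quad & D[1][6] = \infty \qquad (a_{1,6} = 0) \\
k = 2:\quad & \min\bigl(\infty,\; D[1][2] + D[2][6]\bigr) = \min(\infty,\; 1 + \infty) = \infty \\
k = 3:\quad & \min\bigl(\infty,\; D[1][3] + D[3][6]\bigr) = \min(\infty,\; 1 + \infty) = \infty \\
k = 4:\quad & \min\bigl(\infty,\; D[1][4] + D[4][6]\bigr) = \min(\infty,\; 1 + 1) = 2
\end{align*}
```

No later intermediate node gives a route shorter than 2, so the entry stays at $d_{1,6} = 2$.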
TABLE 2 shortest distance between pairs of nodes
Figure GDA0003281752530000081
Step 3: calculate the ranking score of each node; the higher the ranking score, the more important the node.
Taking node 1 as an example:
Figure GDA0003281752530000082
the scores for all nodes are shown in table 3:
TABLE 3 score of each node
Figure GDA0003281752530000091
From the scores of table 3, a final node importance ranking can be obtained, see table 4:
TABLE 4 importance ranking of nodes
Figure GDA0003281752530000092
As can be seen from Table 4, node 4 is the most important. It can readily be seen from FIG. 2 that node 4 connects the two small communities on the left and right, and node 4, in the "core network location", is indeed the most important node. In a real propagation network, node 4 has the best message propagation position and can serve as the medium of communication between the two groups. Other nodes, such as node 5, need a longer message propagation path to reach the right group. If node 4 acts as a message spreader, and more specifically a rumor spreader, the rumor will spread rapidly across the two groups around it, accelerating rumor propagation.
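The ranking step can be tried on the example network with the sketch below. Two points are assumptions and not taken from the source: the score form, sum over t ≠ i of K_S(t) / d(i, t), is assumed because the formula image is not reproduced in this text, and breadth-first search is used in place of the Floyd algorithm to keep the sketch short (both give the same distances on an unweighted network). The names `A`, `KS`, `distances_from` and `score` are hypothetical.

```python
from collections import deque
from fractions import Fraction

# Adjacency matrix of the example network, transcribed from this embodiment.
A = [
    [0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
]

# K-shell indices as stated in Step 1 (nodes are numbered from 1).
KS = {1: 2, 2: 2, 3: 2, 4: 2, 5: 2, 6: 2, 9: 2, 7: 1, 8: 1, 10: 1}


def distances_from(source):
    """Breadth-first search distances (edge counts) from one node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(1, len(A) + 1):
            if A[u - 1][v - 1] and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def score(i):
    """ASSUMED score form: sum over t != i of KS(t) / d(i, t)."""
    dist = distances_from(i)
    return sum(Fraction(KS[t], d) for t, d in dist.items() if t != i)


# Nodes sorted from most to least important under the assumed score.
ranking = sorted(KS, key=score, reverse=True)
print(ranking[0], float(score(ranking[0])))
```

Under this assumed form node 4 ranks first, which agrees with the conclusion drawn from Table 4; the numeric scores are not claimed to match the values in Table 3.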
Based on the above embodiment, referring to fig. 3, in an embodiment of the present invention, an apparatus for identifying propagation key nodes based on K-shell decomposition includes:
the acquisition module 21, configured to establish a propagation network by collecting message forwarding on a social platform, take the individuals who forward the message in the propagation network as nodes, number each node to obtain the node set $S = \{s_1, s_2, \ldots, s_n\}$, and collect the friend list data corresponding to each node in the node set, where n is the total number of nodes;
a direct connecting edge calculating module 22, configured to obtain the number of connecting edges corresponding to each node according to the buddy list data; if the individuals are in a friend relationship, a direct connection edge exists between the two corresponding nodes;
a K-shell index calculation module 23, configured to determine degrees of each node in the propagation network according to the number of directly-connected edges of each node, and calculate a K-shell index of each node according to the degrees of the nodes, where the K-shell index is used to represent a direct influence of an individual in the propagation network;
the shortest distance calculation module 24 is configured to calculate a shortest distance between each pair of nodes in the propagation network by using a Floyd algorithm, where the shortest distance between the nodes is used to represent a propagation position of an individual in a social network;
and the score calculating module 25, configured to calculate the ranking score of each node from the K-shell index and the shortest distances between nodes, so as to obtain the key propagators in the propagation network.
Further, comprising:
in the K-shell index calculation module 23, the degree of a node in the propagation network is equal to the number of directly connected edges of the node.
Further, comprising:
the K-shell index calculating module 23 further includes:
an input unit, configured to input degrees of each node in a node set S in the propagation network;
a traversal unit, configured to: first, traverse the node set S, find all nodes of degree 1, delete those nodes and the edges connected to them, and store the nodes in the set $S_1$, where the K-shell index of each node in $S_1$ is $K_{S_1}(m) = 1,\ m \in S_1$;
secondly, traverse the node set S, find all nodes of degree 2, delete those nodes and the edges connected to them, and store the nodes in the set $S_2$, where the K-shell index of each node in $S_2$ is $K_{S_2}(m) = 2,\ m \in S_2$;
finally, repeatedly traverse the node set S, increasing the searched degree by 1 each time, until all nodes corresponding to the maximum degree n have been found, delete those nodes and the edges connected to them, and store the nodes in the set $S_n$, where the K-shell index of each node in $S_n$ is $K_{S_n}(m) = n,\ m \in S_n$;
an output unit, configured to output the node sets $S_1, S_2, \ldots, S_n$ and the K-shell index of the nodes in each set.
Further, comprising:
the shortest distance calculating module 24 further includes:
an input unit that inputs a network adjacency matrix G corresponding to the propagation network;
a calculating unit, configured to: first, initialize a distance matrix D from the adjacency matrix G, where D[i][j] = d if node j is directly reachable from node i, d being the length of that path, and D[i][j] = ∞ otherwise, with 1 ≤ i < j ≤ n;
secondly, define a matrix L to record the inserted nodes, where L[i][j] denotes the node that must be passed through on the way from node i to node j, initialized as L[i][j] = j; insert each other node k into the path i → j and compare the distance of i → j after the insertion with the original distance, that is, D[i][j] = min(D[i][j], D[i][k] + D[k][j]); if the value of D[i][j] becomes smaller, set L[i][j] = k; repeat this operation until D is no longer updated;
and the output unit is used for outputting the distance matrix D between the nodes in the network.
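For illustration, the distance computation above can be sketched with the classical Floyd (Floyd-Warshall) triple loop, which tries every node k as an inserted node for every pair (i, j); after the loop finishes, no further update of D is possible. The function name `floyd_shortest_paths` is hypothetical, and the sketch assumes that a zero entry in G means "no edge" and a positive entry is the edge length.

```python
import math


def floyd_shortest_paths(G):
    """Return (D, L) for an n x n adjacency matrix G.

    Assumption for this sketch: G[i][j] == 0 means no edge between i and j,
    and a positive G[i][j] is the length of the edge.
    """
    n = len(G)
    # D[i][j]: best known distance from i to j (infinite if unreachable).
    D = [
        [0 if i == j else (G[i][j] if G[i][j] else math.inf) for j in range(n)]
        for i in range(n)
    ]
    # L[i][j]: node recorded for the path i -> j, initialised to j.
    L = [[j for j in range(n)] for _ in range(n)]
    for k in range(n):              # candidate node to insert
        for i in range(n):
            for j in range(n):
                via_k = D[i][k] + D[k][j]
                if via_k < D[i][j]:  # inserting k shortens the path i -> j
                    D[i][j] = via_k
                    L[i][j] = k
    return D, L


if __name__ == "__main__":
    # Path 0 - 1 - 2 plus an isolated node 3.
    G = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ]
    D, L = floyd_shortest_paths(G)
    print(D[0][2], L[0][2], D[0][3])
```

Pairs that remain unreachable keep an infinite distance, so any later use of D has to decide how such pairs are treated.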
Further, comprising:
in the score calculation module 25, the ranking score formula of each node is:
Figure GDA0003281752530000111
wherein $K_S(t)$ is the K-shell index corresponding to the t-th node, and $d_{i,t}$ represents the shortest distance between nodes i and t.
Fig. 4 shows a structural diagram of an electronic device in an embodiment of the invention.
An embodiment of the present invention provides an electronic device, which may include a processor 310 (CPU), a memory 320, an input device 330, an output device 340, and the like, wherein the input device 330 may include a keyboard, a mouse, a touch screen, and the like, and the output device 340 may include a Display device, such as a Liquid Crystal Display (LCD), a Cathode Ray Tube (CRT), and the like.
Memory 320 may include Read Only Memory (ROM) and Random Access Memory (RAM), and provides processor 310 with program instructions and data stored in memory 320. In an embodiment of the present invention, the memory 320 may be used to store a program of the above method for identifying propagation key nodes based on K-shell decomposition.
The processor 310 is configured to execute the steps of any one of the methods for identifying propagation critical nodes based on K-shell decomposition according to the obtained program instructions by calling the program instructions stored in the memory 320.
Based on the foregoing embodiments, in the embodiments of the present invention, a computer-readable storage medium is provided, on which a computer program is stored, and the computer program, when executed by a processor, implements the method for identifying propagation key nodes based on K-shell decomposition in any of the above method embodiments.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass such modifications and variations.

Claims (8)

1. A method for identifying propagation key nodes based on K-shell decomposition is characterized by comprising the following steps:
establishing a propagation network by collecting message propagation on a social platform, determining a node set according to the individuals propagating messages in the propagation network, and collecting friend list data corresponding to each node in the node set;
obtaining the number of directly connected edges corresponding to each node according to the friend list data, wherein, if two individuals are in a friend relationship, a directly connected edge exists between the two corresponding nodes;
determining the degree of each node in the propagation network according to the number of the directly connected edges of each node, and calculating a K-shell index of each node according to the degree of each node, wherein the K-shell index is used for representing the influence of individuals in the propagation network;
calculating the shortest distance between each pair of nodes in the propagation network by using a Floyd algorithm, and calculating the ranking corresponding score of each node according to the K shell index and the shortest distance so as to obtain a propagation key node in the propagation network;
the formula for calculating the ranking corresponding score of each node is as follows:
Figure FDA0003281752520000011
wherein $K_S(t)$ is the K-shell index corresponding to the t-th node, and $d_{i,t}$ represents the shortest distance between nodes i and t.
2. The method for identifying propagation key nodes based on K-shell decomposition as claimed in claim 1, wherein the degree of each node in the propagation network is determined according to the number of directly connected edges of each node, specifically: the degree of a node in the propagation network is equal to the number of directly connected edges of the node.
3. The method for identifying propagation key nodes based on K-shell decomposition as claimed in claim 2, wherein the K-shell index of each node is calculated according to the degree of the node, and the specific steps include:
obtaining the degree of each node in a node set S in the propagation network, wherein the node set $S = \{s_1, s_2, \ldots, s_n\}$ and n is the total number of nodes;
traversing the node set S, searching for all nodes with degree 1, deleting those nodes and the edges connected to them, and storing them in a set $S_1$, the K-shell index of each node in $S_1$ being $K_{s_1}(p) = 1,\ p \in S_1$;
traversing the node set S, searching for all nodes with degree 2, deleting those nodes and the edges connected to them, and storing them in a set $S_2$, the K-shell index of each node in $S_2$ being $K_{s_2}(q) = 2,\ q \in S_2$;
repeatedly traversing the node set S, increasing the searched degree by 1 each time, until all nodes corresponding to the maximum degree n in the set have been found, deleting those nodes and the edges connected to them, and storing them in a set $S_n$, the K-shell index of each node in $S_n$ being $K_{s_n}(m) = n,\ m \in S_n$.
4. The K-shell decomposition-based method for identifying propagation critical nodes of claim 3,
the calculating the shortest distance between each pair of nodes in the propagation network by using the Floyd algorithm comprises the following steps:
acquiring a network adjacency matrix G corresponding to the propagation network;
initializing a distance matrix D according to the adjacency matrix G, wherein, if node j is reachable from node i, $D[i][j] = d$, where d represents the length of the path; otherwise, $D[i][j] = \infty$, where $1 \le i < j \le n$;
defining a matrix L to record the inserted nodes, wherein $L[i][j]$ represents the node that must be passed through from node i to node j, and initializing $L[i][j] = j$; inserting other nodes k into the path from i to j, and comparing the distance from i to j after the insertion with the original distance, expressed as: $D[i][j] = \min(D[i][j],\ D[i][k] + D[k][j])$; if the value of $D[i][j]$ becomes smaller, $L[i][j] = k$; and repeating the operation until D is no longer updated, and outputting the distance matrix D.
5. An apparatus for identifying propagation key nodes based on K-shell decomposition, comprising:
an acquisition module, configured to establish a propagation network by collecting message propagation on a social platform, determine a node set according to the individuals propagating messages in the propagation network, and collect friend list data corresponding to each node in the node set;
a directly connected edge calculation module, configured to obtain the number of directly connected edges corresponding to each node according to the friend list data, wherein, if two individuals are in a friend relationship, a directly connected edge exists between the two corresponding nodes;
the K-shell index calculation module is used for determining the degree of each node in the propagation network according to the number of the directly connected edges of each node, and calculating the K-shell index of each node according to the degree of each node, wherein the K-shell index is used for representing the influence of individuals in the propagation network;
the score calculation module is used for calculating the shortest distance between each pair of nodes in the propagation network by adopting a Floyd algorithm, calculating the ranking corresponding score of each node according to the K shell index and the shortest distance, and further obtaining the propagation key nodes in the propagation network;
in the score calculation module, the formula for calculating the ranking corresponding score of each node is as follows:
Figure FDA0003281752520000021
wherein $K_S(t)$ is the K-shell index corresponding to the t-th node, and $d_{i,t}$ represents the shortest distance between nodes i and t.
6. The K-shell decomposition-based propagation key node identification apparatus according to claim 5, wherein in the K-shell index calculation module, the degree of nodes in the propagation network is equal to the number of directly connected edges of the nodes.
7. The apparatus for identifying propagation critical nodes based on K-shell decomposition of claim 6, wherein the K-shell index calculation module further comprises:
obtaining the degree of each node in a node set S in the propagation network, wherein the node set $S = \{s_1, s_2, \ldots, s_n\}$ and n is the total number of nodes;
traversing the node set S, searching for all nodes with degree 1, deleting those nodes and the edges connected to them, and storing them in a set $S_1$, the K-shell index of each node in $S_1$ being $K_{s_1}(p) = 1,\ p \in S_1$;
traversing the node set S, searching for all nodes with degree 2, deleting those nodes and the edges connected to them, and storing them in a set $S_2$, the K-shell index of each node in $S_2$ being $K_{s_2}(q) = 2,\ q \in S_2$;
repeatedly traversing the node set S, increasing the searched degree by 1 each time, until all nodes corresponding to the maximum degree n in the set have been found, deleting those nodes and the edges connected to them, and storing them in a set $S_n$, the K-shell index of each node in $S_n$ being $K_{s_n}(m) = n,\ m \in S_n$.
8. The K-shell decomposition-based apparatus for identifying propagation critical nodes of claim 7,
the calculating the shortest distance between each pair of nodes in the propagation network by using the Floyd algorithm further comprises:
acquiring a network adjacency matrix G corresponding to the propagation network;
initializing a distance matrix D according to the adjacency matrix G, wherein, if node j is reachable from node i, $D[i][j] = d$, where d represents the length of the path; otherwise, $D[i][j] = \infty$, where $1 \le i < j \le n$;
defining a matrix L to record the inserted nodes, wherein $L[i][j]$ represents the node that must be passed through from node i to node j, and initializing $L[i][j] = j$; inserting other nodes k into the path from i to j, and comparing the distance from i to j after the insertion with the original distance, expressed as: $D[i][j] = \min(D[i][j],\ D[i][k] + D[k][j])$; if the value of $D[i][j]$ becomes smaller, $L[i][j] = k$; and repeating the operation until D is no longer updated, and outputting the distance matrix D.
CN201910550676.0A 2019-06-24 2019-06-24 Method and device for identifying propagation key nodes based on K-shell decomposition Active CN110247805B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910550676.0A CN110247805B (en) 2019-06-24 2019-06-24 Method and device for identifying propagation key nodes based on K-shell decomposition

Publications (2)

Publication Number Publication Date
CN110247805A CN110247805A (en) 2019-09-17
CN110247805B true CN110247805B (en) 2021-11-30

Family

ID=67889156

Country Status (1)

Country Link
CN (1) CN110247805B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110837608B (en) * 2019-11-07 2024-04-12 中科天玑数据科技股份有限公司 Public opinion topic propagation path analysis system and method based on multi-source data
CN112231591B (en) * 2020-11-06 2024-02-09 烟台大学 Information recommendation method and system considering social network user group compactness
CN113259170B (en) * 2021-06-01 2021-09-24 宁波大学 Method for identifying sub-network and key target thereof in computer network and application thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104992266A (en) * 2015-06-15 2015-10-21 广东电网有限责任公司电力调度控制中心 Method of determining power grid node importance degree and system thereof
CN106681334A (en) * 2017-03-13 2017-05-17 东莞市迪文数字技术有限公司 Automatic-guided-vehicle dispatching control method based on genetic algorithm

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140188994A1 (en) * 2012-12-28 2014-07-03 Wal-Mart Stores, Inc. Social Neighborhood Determination

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant