CN106100870A - A kind of community network event detecting method based on link prediction - Google Patents
- Publication number
- CN106100870A (application CN201610374849.4A)
- Authority
- CN
- China
- Prior art keywords
- network
- node
- event
- similarity
- evolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/142—Network analysis or design using statistical or mathematical methods
- H04L41/147—Network analysis or design for predicting network behaviour
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Algebra (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Pure & Applied Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a social network event detection method based on link prediction. First, the algorithm SimC computes the similarity of the input network G over each time period, and a network evolution sequence GraphS is derived from the results. Then the event detection algorithm EventD is applied to GraphS together with a threshold T to output the event sequence EventS. The invention helps analyze social network evolution and detect major events in a network, and thereby helps guide the network toward normal evolution.
Description
Technical Field
The invention belongs to the technical field of social network analysis, and relates to a social network event detection method LinkEvent (consisting of a similarity calculation algorithm SimC and an event detection algorithm EventD) based on link prediction, which can be used for uniformly evaluating the volatility of different networks and establishing an event detection model according to the volatility.
Background
Social networks are constantly developing and changing, and network evolution analysis and event detection are important components of social network analysis. Network evolution analysis describes the evolution rules of a network by tracking how its characteristics change across stages, so that behaviors such as network growth and propagation can be analyzed, the future structure predicted, and even human intervention applied to obtain an expected result. With the explosive development of social networks, network evolution analysis has been widely applied to fields such as user behavior analysis and message propagation guidance. However, different social networks have very different characteristics and complicated evolution mechanisms, so efficiently simulating the growth, propagation and similar behaviors of a real network is currently the primary challenge of network evolution analysis. Event detection is a specific application of network evolution analysis. Generally, building on a description of the network's evolution rules, it detects events occurring in the network by analyzing the differences between stages of the network, and proposes intervention strategies. Event detection is an important guide for tasks such as analyzing the replacement of core leaders in a criminal network or judging changes of organizational structure in a company e-mail network. In a real network, various events may cause the network to deviate from normal evolution and thus exhibit different structural changes. How to define and detect these events, evaluate their influence, and propose corresponding intervention strategies is the difficult point of event detection research.
To reflect the characteristic changes of a network and reveal its internal evolution laws, scholars have proposed various models, typically the Erdős-Rényi (E-R) random graph model, the Watts-Strogatz (W-S) small-world model, and the Barabási-Albert (B-A) scale-free model. The general steps of these methods are: construct a network model based on one or more evolution mechanisms, adjust the model parameters to fit a real network, simulate to obtain the network at each time period, and finally evaluate how well the model describes the real network using network statistics such as the degree distribution and the average clustering coefficient. The model evolution (ME) method is simple to implement and can construct different networks by adjusting parameters according to network characteristics. However, each of these models is designed for a specific kind of network, it is difficult for them to account for multiple statistical characteristics at once, and there is no uniform standard for comparing the performance of different models. Moreover, because the relationship between the network before and after each time period is not considered, the fluctuation in the evolution process is ignored; these models therefore struggle to describe how stable the network evolution is and cannot detect events in the network.
Link prediction (LP) is the problem of accurately predicting the edges that will newly appear in the next time period, given the topology (the link relationships between nodes) of the network in the current time period. The specific steps are: calculate a score for every node pair in the current time period according to a chosen index, remove the node pairs that already exist (that is, the edges already present in the network), sort the remaining node pairs in descending order of score, and output the top L node pairs as the prediction result, which is then assessed with an evaluation index. Unlike the model evolution method, link prediction makes full use of the information available in the current time period and uses indexes built on different evolution mechanisms to predict the network structure of the next time period.
[Document 1] Palla G, Barabási A L, Vicsek T. Quantifying social group evolution [J]. Nature, 2007, 446(7136): 664-667.
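The link prediction procedure described in the paragraph above can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: it assumes an undirected network stored as a dict of neighbour sets, and it uses the common-neighbours count as the scoring index, which is only one of many possible link prediction indexes. The function name `predict_links` and the sample graph are hypothetical.

```python
from itertools import combinations


def predict_links(adj, top_l):
    """Return the top_l non-adjacent node pairs ranked by common-neighbour count.

    adj maps each node to the set of its neighbours (undirected graph).
    """
    scored = []
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # pairs that are already edges are removed from the ranking
        scored.append((len(adj[u] & adj[v]), (u, v)))
    # Descending score; ties are broken by pair name (an arbitrary choice here).
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pair for _, pair in scored[:top_l]]


# Hypothetical snapshot of the current time period.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
print(predict_links(adj, 2))
```

Swapping the scoring expression for another index (for example, the product of the two degrees for preferential attachment) changes the evolution mechanism the prediction assumes.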
Disclosure of Invention
In order to solve the technical problems, the invention provides a method for describing the evolution trend of a network and detecting events in the network by using related indexes of link prediction based on similarity analysis of network evolution sequences.
The technical scheme adopted by the invention is as follows: a social network event detection method based on link prediction is characterized by comprising the following steps:
step 1: calculating the similarity of each time segment of the input network G by adopting an algorithm SimC, and obtaining a network evolution sequence GraphS according to the calculation result;
step 2: applying the event detection algorithm EventD to GraphS, in combination with a threshold T, and outputting the event sequence EventS.
Preferably, the specific implementation process of step 1 is as follows:
for a given network $G = \{g_1, g_2, g_3, \ldots, g_n\}$, the network snapshot at time $t$ is represented by the graph $g_t$; for snapshots $g_t$ and $g_{t+1}$ of G, the similarity of node i is defined as the degree to which node i remains stable between $g_t$ and $g_{t+1}$; the similarity of graphs $g_t$ and $g_{t+1}$ is the macroscopic expression of the superposition of the similarities of all nodes in the graphs, and is denoted $S(g_t, g_{t+1})$,
wherein $U_{t,t+1} = g_t \cup g_{t+1}$, and $v_i$ at times $t$ and $t+1$ is regarded as two distinct nodes, denoted $v_i^{t}$ and $v_i^{t+1}$ respectively;
the network volatility over the time period $[t, t+1]$ is derived from this similarity: the larger the similarity, the smaller the volatility;
the network evolution sequence GraphS is defined as the set of the volatilities of all time periods;
Preferably, the similarity of node i is calculated as follows: 8 indexes from link prediction are applied to the similarity calculation, and a virtual node $V_{virtual}$, called the "observer", is introduced; a virtual edge exists between $V_{virtual}$ and every node in the network, which yields 8 node similarity calculation indexes, as listed in Table 1;
Table 1. Node similarity calculation indexes after introducing the virtual node
link prediction is performed on the network G, and the index with the best AUC is selected as the optimal index O, which reflects the evolution mechanism of the network;
AUC is the main index for measuring the accuracy of a link prediction algorithm, and is defined as:
$$\mathrm{AUC} = \frac{n' + 0.5\,n''}{n}$$
where n is the number of comparisons, n' is the number of times the score of an edge randomly selected from the test set is greater than the score of an edge randomly selected from the set of nonexistent edges, and n'' is the number of times the two scores are equal.
Preferably, the similarity of node i is calculated and an improved node similarity is finally obtained, the implementation comprising the following sub-steps:
step 1: 8 indexes from link prediction are applied to the similarity calculation, and a virtual node $V_{virtual}$, also called the "observer", is introduced; a virtual edge exists between $V_{virtual}$ and every node in the network, which yields 8 node similarity calculation indexes, as listed in Table 1;
Table 1. Node similarity calculation indexes after introducing the virtual node
link prediction is performed on the network G, and the index with the best AUC is selected as the optimal index O, which reflects the evolution mechanism of the network;
AUC is the main index for measuring the accuracy of a link prediction algorithm, and is defined as:
$$\mathrm{AUC} = \frac{n' + 0.5\,n''}{n}$$
where n is the number of comparisons, n' is the number of times the score of an edge randomly selected from the test set is greater than the score of an edge randomly selected from the set of nonexistent edges, and n'' is the number of times the two scores are equal;
step 2: computing the evolution weight of the node;
step 3: computing the improved node similarity.
Preferably, the evolution weight of a node is determined by the ratio of the node's link prediction accuracy to the random evolution value 0.5,
wherein the link prediction accuracy of node $v_i$ is defined as the average of the link prediction accuracies of the edges newly added to $v_i$ in $g_{t+1}$,
the average being taken over the set of edges newly added to $v_i$ in $g_{t+1}$; this accuracy reflects how well the change of the node conforms to the evolution rule, and the larger its value, the better the change conforms.
Preferably, the evolution weight of a node is calculated by a formula
in which α denotes a scaling factor intended to distinguish three levels more clearly, and the link prediction accuracy of node $v_i$ is defined as the average of the link prediction accuracies of the edges newly added to $v_i$ in $g_{t+1}$,
the average being taken over the set of edges newly added to $v_i$ in $g_{t+1}$; this accuracy reflects how well the change of the node conforms to the evolution rule, and the larger its value, the better the change conforms.
Preferably, the specific implementation of step 2 comprises the following sub-steps:
step 2.1: calculating the interval of event occurrence values from the existing events;
step 2.2: determining the event occurrence threshold T;
step 2.3: traversing GraphS and outputting the event sequence EventS.
Preferably, step 2 is implemented as follows: according to the event sequence composed of event points that have already occurred, EventO = {k | an event occurs when t = k, k ∈ [1, m], m ≤ n}, the evolution sequence GraphS is analyzed and the event occurrence value interval [L, H] is learned, where L is the lower bound and H the upper bound of event occurrence; T ∈ [L, H] is selected as the occurrence threshold, and T is determined manually; when t = k, if the volatility of the period exceeds T, the network is in the event state during the time period [k, k+1]; otherwise it is in a relatively steady state; after the analysis is finished, the event sequence EventS is output.
Preferably, the evaluation criterion of the event detection method is as follows:
suppose $k_1$ and $k_2$ are two adjacent event points in the network G, with $k_1 + 1 < k_2$; G is in a relatively steady state during the time period $[k_1+1, k_2]$ and in the event state during $[k_2, k_2+1]$; the event sensitivity performance Per is then defined,
wherein Per > 1;
the event sensitivity performance Per is the ratio of the volatility of the network's event segment to the average volatility of its stationary segments, and a larger ratio indicates that the event is more easily detected.
The invention has the beneficial effects that: the method is beneficial to analyzing social network evolution, detecting major events in the network, further guiding the normal evolution of the network and avoiding malignant group events.
Drawings
FIG. 1 is a schematic flow diagram of an embodiment of the present invention;
FIG. 2 is a diagram illustrating the evolution process of a simple network from t to t +3 in accordance with an embodiment of the present invention;
FIG. 3 is a schematic diagram of the evolution of a disconnected network after the virtual node is introduced, in accordance with an embodiment of the present invention.
Detailed Description
In order to facilitate the understanding and implementation of the present invention for those of ordinary skill in the art, the present invention is further described in detail with reference to the accompanying drawings and examples, it is to be understood that the embodiments described herein are merely illustrative and explanatory of the present invention and are not restrictive thereof.
The method provided by the invention is based on similarity analysis of network evolution sequences, and a high-efficiency reasonable algorithm is designed by using related indexes of link prediction, so that the evolution trend of the network is described, and events in the network are detected.
Referring to FIG. 1, the social network event detection method based on link prediction provided by the invention (hereinafter the LinkEvent method) builds on similarity analysis of network evolution sequences and uses link prediction indexes to design an efficient and reasonable algorithm that describes the evolution trend of a network and detects events in it.
The LinkEvent method comprises the following two steps:
step 1: calculating the similarity of each time segment of the input network G by adopting an algorithm SimC, and obtaining a network evolution sequence GraphS (Graph evolution sequence) according to the calculation result;
step 2: applying the event detection algorithm EventD to GraphS, in combination with a threshold T, and outputting an event sequence EventS (event sequence).
In step 1, an algorithm SimC is adopted to calculate the similarity of each time period for the input network G, and the specific implementation process is as follows:
for a given network $G = \{g_1, g_2, g_3, \ldots, g_n\}$, the network snapshot at time $t$ is represented by the graph $g_t$. The degree of similarity between $g_t$ and $g_{t+1}$ is influenced by three factors:
(1) relative to $g_t$, the addition of new nodes in $g_{t+1}$ and the new edges they introduce;
(2) relative to $g_t$, the disappearance of old nodes in $g_{t+1}$ and of their corresponding edges;
(3) nodes that remain in both $g_t$ and $g_{t+1}$ while only their edges are added or removed.
The three factors are mutually superposed, and each node and the associated edge thereof continuously change along with time, and are expressed as the fluctuation of the whole network macroscopically. How to describe the fluctuation degree of the network provides an analysis basis for event detection, which becomes a first problem to be faced at present. Therefore, the invention provides the SimC algorithm and discusses the relevant problems.
For snapshots $g_t$ and $g_{t+1}$ of G, the similarity of node i is defined as the degree to which node i remains stable between $g_t$ and $g_{t+1}$. $v_i$ at times $t$ and $t+1$ is regarded as two different nodes, denoted $v_i^{t}$ and $v_i^{t+1}$ respectively.
The similarity of graphs $g_t$ and $g_{t+1}$ is the macroscopic expression of the superposition of the similarities of all nodes in the graphs; it is denoted $S(g_t, g_{t+1})$ and defined by formula (1),
where $U_{t,t+1} = g_t \cup g_{t+1}$.
The similarity of $g_t$ and $g_{t+1}$ reflects how close the two graphs are: the larger its value, the smaller the change of the network during the time period $[t, t+1]$ and the smaller the volatility of the network. The network volatility over $[t, t+1]$ is defined by formula (2).
The network evolution sequence GraphS is defined as the set of the volatilities of all time periods, as shown in formula (3).
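A minimal Python sketch of how GraphS could be assembled from a list of snapshots. The formulas referred to above are not reproduced in this text, so two choices here are assumptions rather than the patent's definitions: graph similarity is taken as the mean node similarity over the union of the two snapshots' nodes, and volatility is taken as one minus that similarity. The function names and the toy node similarity `same_neighbours` are hypothetical.

```python
def graph_similarity(g_t, g_t1, node_sim):
    """Mean node similarity over the union of both node sets (assumed aggregation)."""
    union = set(g_t) | set(g_t1)
    if not union:
        return 1.0  # two empty snapshots are treated as identical
    return sum(node_sim(g_t, g_t1, v) for v in union) / len(union)


def evolution_sequence(snapshots, node_sim):
    """Volatility of each period [t, t+1], assumed here to be 1 - similarity."""
    return [
        1.0 - graph_similarity(a, b, node_sim)
        for a, b in zip(snapshots, snapshots[1:])
    ]


def same_neighbours(g_t, g_t1, v):
    """Toy node similarity: 1.0 if v keeps exactly the same neighbours, else 0.0."""
    return 1.0 if v in g_t and v in g_t1 and g_t[v] == g_t1[v] else 0.0


# Snapshots are dicts mapping a node to the set of its neighbours.
g1 = {"a": {"b"}, "b": {"a"}}
g2 = {"a": {"b"}, "b": {"a"}}
g3 = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}

graph_s = evolution_sequence([g1, g2, g3], same_neighbours)
print(graph_s)
```

Any of the node similarity indexes discussed in this section can be passed in place of `same_neighbours`, which keeps the aggregation step independent of the index chosen.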
The calculation of the node similarity is described in detail below;
(1) traditional calculation methods;
The node similarity indexes in link prediction measure how similar two different nodes in a graph are; the core idea is that the similarity of two nodes depends on their topological information (including the number of common neighbors, degree, and so on). Borrowing this idea, node i of network G in $g_t$ and $g_{t+1}$ can be seen as two different nodes $v_i^{t}$ and $v_i^{t+1}$, and the similarity between the two can likewise be described by their topology.
For example, the Jaccard index in link prediction expresses that the similarity of $v_i$ and $v_j$ is determined by their common neighbors; accordingly, the similarity of $v_i^{t}$ and $v_i^{t+1}$ can be described by formula (4), which is denoted the JAS index.
In the same way, the PA index in link prediction is applied to the similarity calculation of $v_i^{t}$ and $v_i^{t+1}$, giving formula (5), which is denoted the PAS index.
Combining formula (4) with formula (1), or formula (5) with formula (1), gives the similarity of graphs $g_t$ and $g_{t+1}$.
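As an illustration of the JAS idea, the sketch below applies the Jaccard coefficient to the neighbour sets of the same node in two consecutive snapshots. Formula (4) is not reproduced in this text, so this reading of it is an assumption; the function name `jas` and the sample snapshots are hypothetical.

```python
def jas(g_t, g_t1, node):
    """Jaccard coefficient between a node's neighbour sets at t and at t+1.

    A node missing from a snapshot is treated as having no neighbours there.
    """
    before = g_t.get(node, set())
    after = g_t1.get(node, set())
    union = before | after
    if not union:
        return 0.0  # assumption: a node isolated in both snapshots scores 0
    return len(before & after) / len(union)


# Node "c" disappears and node "d" appears between the two snapshots.
g_t = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
g_t1 = {"a": {"b", "d"}, "b": {"a"}, "d": {"a"}}

for node in sorted(set(g_t) | set(g_t1)):
    print(node, jas(g_t, g_t1, node))
```

Under this reading, a node that appears or disappears always scores 0, which matches the behaviour the text attributes to JAS and PAS for newly added and vanished nodes.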
In contrast to the invention, which obtains graph similarity by accumulating node similarities, [Document 1] gives a method for calculating the relative overlap of the states of the same community before and after a step of community evolution. Applied to $g_t$ and $g_{t+1}$, it can also describe their degree of similarity; this index is denoted ROS and is shown in formula (6),
where $A(g_t)$ denotes the set of all nodes in $g_t$.
The superiority of the similarity calculation method proposed by the present invention is illustrated by way of example analysis below;
FIG. 2 illustrates the evolution of a simple network from t to t+3; only one node is added at each step, and the time window is set to 1.
The similarities of $g_t$, $g_{t+1}$ and of $g_{t+1}$, $g_{t+2}$ were calculated using JAS, PAS and ROS respectively; the results are shown in Table 2.
Table 2. Results of the three similarity indexes
To increase the sensitivity of event detection, the range over which network volatility varies should be as large as possible, that is, the changes in $S(g_t, g_{t+1})$ should be as pronounced as possible. As Table 2 shows, JAS performs better than ROS, and PAS performs better than JAS. ROS considers only the change of nodes at the macroscopic level and ignores the change of edges, so it performs worst. JAS does capture the topology changes of each node and its associated edges, but it cannot reflect the evolution law of the network. The network shown in FIG. 2 evolves by preferential attachment, so PAS, which is based on the preferential attachment mechanism, performs best.
With PAS and JAS, the similarity of a node newly added relative to $g_t$ is 0, and the similarity of a node that disappears in $g_{t+1}$ is also 0. Under this calculation, the disappearance or appearance of an entire isolated part of the network (a subgraph of a disconnected network) has no effect on the similarity calculation, which is clearly unrealistic.
(2) Calculation method after introducing the virtual node $V_{virtual}$;
To solve this problem, the invention introduces a virtual node $V_{virtual}$ into the network, also called the "observer", which can be understood as "observing" the changes of the entire network from the viewpoint of this node. A virtual edge exists between $V_{virtual}$ and every node in the network, as shown in FIG. 3.
FIG. 3 shows the evolution of a disconnected network. At time t+1, the previously existing subgraph CDE disappears and the subgraph FG appears. Without $V_{virtual}$, the influence of the disappearance and appearance of these subgraphs on the network cannot be reflected when the similarity is calculated. Once $V_{virtual}$ is introduced, it is reflected through the similarity calculation of $V_{virtual}$.
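The effect of the observer can be illustrated with a small Python sketch. The network below is hypothetical and only loosely modelled on the situation described for FIG. 3 (the persistent component A-B and the chain shape of C-D-E are assumptions), and a Jaccard-style node similarity is used purely for illustration.

```python
def jaccard(before, after):
    """Jaccard coefficient of two neighbour sets."""
    union = before | after
    return len(before & after) / len(union) if union else 0.0


def add_observer(graph, name="V_virtual"):
    """Return a copy of the graph with an observer node linked to every node."""
    augmented = {v: set(nbrs) | {name} for v, nbrs in graph.items()}
    augmented[name] = set(graph)
    return augmented


# Component A-B persists; subgraph C-D-E disappears and subgraph F-G appears.
g_t = {"A": {"B"}, "B": {"A"}, "C": {"D"}, "D": {"C", "E"}, "E": {"D"}}
g_t1 = {"A": {"B"}, "B": {"A"}, "F": {"G"}, "G": {"F"}}

o_t, o_t1 = add_observer(g_t), add_observer(g_t1)

# The persistent nodes look unchanged, with or without the observer.
print(jaccard(g_t["A"], g_t1["A"]), jaccard(o_t["A"], o_t1["A"]))

# The observer's own neighbourhood records the subgraph turnover.
observer_similarity = jaccard(o_t["V_virtual"], o_t1["V_virtual"])
print(observer_similarity)
```

The observer shares 2 of 7 neighbours across the two snapshots, so its similarity is well below 1 even though every surviving node keeps exactly the same neighbourhood.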
After the virtual node is introduced, node similarity calculation can fully describe network volatility. Applying 8 indexes from link prediction to the similarity calculation yields 8 node similarity calculation indexes, as listed in Table 3.
Table 3. Node similarity calculation indexes after introducing the virtual node
For a network governed by a given evolution mechanism, indexes based on the same mechanism perform better than other indexes. For a network whose evolution mechanism is unknown, link prediction is first used to infer the mechanism, and the similarity is then calculated with an index of the corresponding mechanism. Link prediction is performed on the network G, and the index with the best AUC is selected as the optimal index (also called the evolution index) O. The index O reflects the evolution mechanism of the network.
AUC is the main index for measuring the accuracy of a link prediction algorithm, and is defined as:
$$\mathrm{AUC} = \frac{n' + 0.5\,n''}{n}$$
where n is the number of comparisons, n' is the number of times the score of an edge randomly selected from the test set is greater than the score of an edge randomly selected from the set of nonexistent edges, and n'' is the number of times the two scores are equal.
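The AUC definition above, (n' + 0.5 n'') / n, can be sketched with Python's standard library. The function name `auc`, the default number of comparisons and the fixed seed are choices made for this illustration only.

```python
import random


def auc(test_scores, absent_scores, n=10000, seed=0):
    """Estimate AUC from n random comparisons.

    test_scores: scores of edges in the test set.
    absent_scores: scores of node pairs that are not edges.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    greater = equal = 0
    for _ in range(n):
        a = rng.choice(test_scores)
        b = rng.choice(absent_scores)
        if a > b:
            greater += 1  # contributes to n'
        elif a == b:
            equal += 1  # contributes to n''
    return (greater + 0.5 * equal) / n


print(auc([3, 4], [1, 2]))  # every test edge outranks every absent edge
print(auc([1], [1]))        # all comparisons tie
print(auc([0], [1]))        # every test edge is outranked
```

A value near 0.5 means the index does no better than random scoring, which is why 0.5 serves as the random baseline elsewhere in the method.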
According to the analysis, the invention provides a similarity calculation algorithm SimC, which is specifically shown as algorithm 1.
(3) Calculation method with node evolution weights, built on Algorithm 1;
The original SimC algorithm treats all nodes equally: node similarities are accumulated directly to obtain the graph similarity, and microscopic differences between nodes are not considered. In fact, if the change of a node and of its surrounding topology conforms to the network evolution rule, the node can be regarded as evolving normally, and its influence on network volatility is small. If the change of a node does not conform to the evolution rule, the internal evolution principle may have been broken by the occurrence of an event, and its influence on network volatility is large. Different nodes should therefore be treated differently when similarity is calculated. To this end, the invention introduces the concept of node evolution weights.
The evolution weight w of a node is defined as the degree to which the changes of the node and of its surrounding topology conform to the network evolution law. A larger w indicates that the change of the node conforms better to the evolution law. The weight is defined per node $v_i$ and per time period $[t, t+1]$.
Link prediction accuracy is analyzed at the microscopic level as follows. Suppose that link prediction infers the optimal index (that is, the evolution index) of network G to be O, that no edge exists between nodes $v_i$ and $v_j$ in $g_t$, and that an edge $e_{ij}$ is generated between them in $g_{t+1}$. Take $e_{ij}$ as the test set and the edges that do not exist between $v_i$ and other nodes in $g_t$ as the random selection set, and each time compare the score of $e_{ij}$ under index O with that of an edge drawn from the random selection set. After n comparisons, the resulting AUC is defined as the link prediction accuracy of edge $e_{ij}$. It represents how well the generation of $e_{ij}$ fits the evolution law: the larger its value, the better the generation of $e_{ij}$ conforms to the evolution mechanism corresponding to O.
Further, the link prediction accuracy of node $v_i$ is defined as the average of the link prediction accuracies of the edges newly added to $v_i$ in $g_{t+1}$, as shown in formula (8),
where the average is taken over the set of edges newly added to $v_i$ in $g_{t+1}$. The node's link prediction accuracy reflects how well the change of the node conforms to the evolution rule; the larger its value, the better the change conforms.
From the node's link prediction accuracy, the evolution weight of the node can be calculated. The invention provides two weight strategies.
In the first strategy, the weight is determined by the ratio of the node's link prediction accuracy to the random evolution value 0.5, as shown in formula (9).
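The first weight strategy can be sketched in Python as follows. Several points are assumptions made for this illustration and are not taken from the patent: the common-neighbours count stands in for the evolution index O, each new edge is compared against every nonexistent edge of the node rather than against n random draws, and a node with no new edges is given the neutral weight 1.0. All function names are hypothetical.

```python
def common_neighbours(g, u, v):
    """Stand-in for the evolution index O (assumption)."""
    return len(g.get(u, set()) & g.get(v, set()))


def edge_accuracy(g_t, node, new_nbr, score=common_neighbours):
    """AUC of one new edge against all edges that `node` lacks in g_t."""
    known = g_t.get(node, set())
    absent = [v for v in g_t if v != node and v != new_nbr and v not in known]
    if not absent:
        return 0.5  # assumption: nothing to compare against, treat as random
    s_new = score(g_t, node, new_nbr)
    total = 0.0
    for v in absent:
        s_old = score(g_t, node, v)
        total += 1.0 if s_new > s_old else 0.5 if s_new == s_old else 0.0
    return total / len(absent)


def evolution_weight(g_t, g_t1, node, score=common_neighbours):
    """Node accuracy (mean over its new edges) divided by the random value 0.5."""
    new_nbrs = g_t1.get(node, set()) - g_t.get(node, set())
    if not new_nbrs:
        return 1.0  # assumption: no new edges, neutral weight
    accuracy = sum(edge_accuracy(g_t, node, v, score) for v in new_nbrs) / len(new_nbrs)
    return accuracy / 0.5


g_t = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"},
       "d": {"b", "c", "e"}, "e": {"d"}}
# "a" links to "d", with whom it shares two neighbours: the expected change.
g_expected = {**g_t, "a": {"b", "c", "d"}, "d": {"a", "b", "c", "e"}}
# "a" links to "e", with whom it shares none: the unexpected change.
g_unexpected = {**g_t, "a": {"b", "c", "e"}, "e": {"a", "d"}}

print(evolution_weight(g_t, g_expected, "a"))
print(evolution_weight(g_t, g_unexpected, "a"))
```

With this ratio, a weight above 1 marks a change that the index predicts better than chance, and a weight below 1 marks a change that runs against it.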
The second strategy grades the link prediction accuracy of the node into levels: the lowest level indicates that the evolution law is not met, 0.8 indicates that it is met fairly well, and 1.0 indicates that it is met completely. The evolution weight corresponding to each level is calculated as shown in formula (10),
where α is a scaling factor intended to distinguish the three levels more clearly.
After the node evolution weights are taken into account, the improved wSimC algorithm is obtained, as given in Algorithm 2.
In the original SimC algorithm, for a network G (with n time snapshots, an average of N nodes and an average degree of k), the number of nodes whose similarity must be calculated is (n-1)N. In the improved wSimC algorithm, calculating the similarity of a node also requires calculating the link prediction accuracy of all edges of that node, with time complexity O((N-k)k), so the time complexity of the whole algorithm is O((n-1)N(N-k)k). When studying real networks, Barabási and Albert found that most networks follow a power-law degree distribution and are sparse graphs whose node degrees are small, so k can be regarded as a constant. Because of how the data set is collected, the number of time snapshots n can also be regarded as a constant. Thus, for large sparse networks, the time complexity of the wSimC algorithm is $O(N^2)$.
Performing the similarity calculation on the network yields the network evolution sequence GraphS. GraphS describes the evolution trends of the network, such as stability, development and decline. By analyzing the stages of the sequence, new events can be detected on the basis of information about events that have already occurred.
Network steady states and events
The network evolution situation can be divided into three states according to the volatility:
(1) If the networks of G at times t and t+1 are identical, G is said to be in an absolutely steady state during the time period [t, t+1], and a fixed volatility value is stipulated for this case. The absolutely steady state is an ideal state that almost never occurs in a real network.
(2) If the volatility of G between times t and t+1 is smaller than the threshold T, G is said to be in a relatively steady state during the time period [t, t+1], and [t, t+1] is called a stationary segment. The relatively steady state reflects the normal fluctuation of the network under the evolution law.
(3) If the volatility of G between times t and t+1 exceeds the threshold T, G is said to be in an event state during the time period [t, t+1]; t is called an event point and [t, t+1] an event segment. An event can be defined as an occurrence that interferes with the normal evolution of the network by changing the topology of particular nodes or edges.
When a real network evolves, it stays in a relatively steady state for long periods, enters the event state when an event occurs, and returns to a new relatively steady state after the influence of the event fades; the two states alternate in this way.
The specific implementation process of the event detection algorithm in the step 2 is as follows:
Analyzing GraphS against the set of existing event points makes it possible to judge the evolution state of the network and to detect the occurrence of events. The basic idea of the event detection algorithm EventD provided by the invention is as follows:
According to the event sequence EventO = {k | an event occurs at t = k, k ∈ [1, m], m ≤ n} composed of event points that have already occurred, the evolution sequence GraphS is analyzed and an event occurrence value interval [L, H] is learned (L is the lower boundary of event occurrence, H the upper boundary), and T ∈ [L, H] is selected as the occurrence threshold. To increase the flexibility of event detection, the threshold T is determined manually. At t = k, if the volatility of the period [k, k+1] exceeds the threshold T, the network is in the event state in the time period [k, k+1]; otherwise it is in a relatively steady state. After the analysis is finished, the event sequence EventS is output, as shown in formula (11).
The specific process is algorithm 3.
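The idea of EventD can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the text does not say how the interval [L, H] is learned, so `learn_interval` simply takes the smallest and largest volatility observed at the known event points; the function names, the 1-based time indexing, the `>=` comparison, and the sample values are likewise hypothetical.

```python
def learn_interval(graph_s, event_o):
    """Assumption: [L, H] is the min/max volatility at known event points."""
    if not event_o:
        raise ValueError("at least one known event point is required")
    values = [graph_s[k - 1] for k in event_o]  # time points are 1-based
    return min(values), max(values)


def event_d(graph_s, threshold):
    """Return EventS: the 1-based time points whose volatility reaches T."""
    return [k for k, v in enumerate(graph_s, start=1) if v >= threshold]


# GraphS as a list: entry k-1 holds the volatility of the period [k, k+1].
graph_s = [0.10, 0.12, 0.50, 0.11, 0.60, 0.09, 0.55]
low, high = learn_interval(graph_s, event_o=[3, 5])
threshold = low  # picked manually from inside [L, H]
events = event_d(graph_s, threshold)
print((low, high), events)
```

Choosing T near L makes the detector more sensitive, while choosing it near H makes it stricter; this is the flexibility that manual selection is meant to provide.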
The invention also provides a simple evaluation criterion for the event detection method:
Suppose $k_1$ and $k_2$ are two adjacent event points in the network G ($k_1 + 1 < k_2$), so that G is in a relatively steady state during $[k_1+1, k_2]$ and in the event state during $[k_2, k_2+1]$. The event sensitivity Per is then defined as:
Here Per > 1, because the volatility of the event segment is necessarily greater than that of the steady segment.
The event sensitivity performance Per is the ratio of the network event segment fluctuation to the stationary segment average fluctuation, and the larger the ratio is, the more easily the event is detected, and the event sensitivity performance Per can be used for evaluating the performance of the event detection method. In practical application, different event detection algorithms can be designed according to parameters such as similarity indexes, weight strategies and the like, and the optimal parameter configuration is selected according to the evaluation result of Per.
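As a rough illustration, Per can be computed from GraphS as the event-segment volatility divided by the mean volatility of the preceding steady segment. The exact formula is not reproduced in this text, so the averaging range used here (the periods starting at $k_1+1$ through $k_2-1$) is one reading of the definition, and the function name `event_sensitivity` and the sample values are hypothetical.

```python
from statistics import mean


def event_sensitivity(graph_s, k1, k2):
    """Per = volatility of [k2, k2+1] / mean volatility of the steady segment.

    graph_s: list where entry k-1 is the volatility of the period [k, k+1]
    k1, k2:  adjacent event points (1-based) with k1 + 1 < k2
    """
    if not k1 + 1 < k2:
        raise ValueError("need k1 + 1 < k2 so the steady segment is non-empty")
    # Assumption: the steady segment covers periods k1+1 .. k2-1.
    steady = [graph_s[t - 1] for t in range(k1 + 1, k2)]
    baseline = mean(steady)
    if baseline == 0:
        raise ValueError("steady segment has zero volatility; Per is undefined")
    return graph_s[k2 - 1] / baseline


graph_s = [0.50, 0.25, 0.25, 0.75]
per = event_sensitivity(graph_s, k1=1, k2=4)
print(per)
```

Comparing this value across different similarity indexes or weight strategies is how the text suggests selecting a parameter configuration.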
It should be understood that parts of the specification not set forth in detail are well within the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (9)
1. A social network event detection method based on link prediction is characterized by comprising the following steps:
step 1: calculating the similarity of each time segment of the input network G by adopting an algorithm SimC, and obtaining a network evolution sequence GraphS according to the calculation result;
step 2: applying the event detection algorithm EventD to GraphS, in combination with a threshold T, and outputting an event sequence EventS.
2. The method for detecting social network events based on link prediction as claimed in claim 1, wherein the step 1 is implemented by:
for a given network $G = \{g_1, g_2, g_3, \dots, g_n\}$, the network snapshot at time t is represented by the graph $g_t$; between the snapshots $g_t$ and $g_{t+1}$ of G, the similarity of node i is defined as the degree to which node i remains stable across $g_t$ and $g_{t+1}$; the similarity of the graphs $g_t$ and $g_{t+1}$ is the macroscopic expression of the superposition of the similarities of all nodes in the graphs, is denoted $S(g_t, g_{t+1})$, and is defined as:
wherein $U_{t,t+1} = g_t \cup g_{t+1}$, and the node $v_i$ at times t and t+1 is regarded as two distinct nodes, each denoted by its own symbol;
the network volatility over the time period [t, t+1] is defined as:
the network evolution sequence GraphS is defined as the set of the volatilities of all time periods, as shown in the following formula;
3. The method of claim 2, wherein the similarity of node i is calculated as follows: 8 indexes from link prediction are applied to the similarity calculation, and a virtual node $V_{virtual}$, called the "observer", is introduced; a virtual edge exists between $V_{virtual}$ and every node in the network, giving 8 node similarity calculation indexes, as shown in Table 1;
Table 1: node similarity calculation incorporating virtual nodes
link prediction is performed on the network G, and the index with the best AUC is selected as the optimal index O, wherein the index O reflects the evolution mechanism of the network;
the AUC is used as the main measure of the accuracy of the link prediction algorithm, and is specifically defined as:
$$\mathrm{AUC} = \frac{n' + 0.5\,n''}{n}$$
wherein n represents the number of comparisons, n' represents the number of times the score of an edge randomly selected from the test set is greater than the score of an edge randomly selected from the set of nonexistent edges, and n'' represents the number of times the two scores are equal.
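The sampling procedure behind this AUC can be sketched in Python. The sketch assumes the form commonly used in link prediction, AUC = (n' + 0.5 n'') / n; the function name `sampled_auc`, its parameters, and the fixed seed are hypothetical choices made for illustration.

```python
import random


def sampled_auc(test_scores, nonexistent_scores, n=1000, seed=0):
    """Estimate AUC from n random comparisons.

    test_scores:        similarity scores of edges in the test set
    nonexistent_scores: similarity scores of edges that do not exist
    """
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    higher = 0  # n': test edge scored strictly higher
    equal = 0   # n'': both edges scored the same
    for _ in range(n):
        s_test = rng.choice(test_scores)
        s_none = rng.choice(nonexistent_scores)
        if s_test > s_none:
            higher += 1
        elif s_test == s_none:
            equal += 1
    return (higher + 0.5 * equal) / n


print(sampled_auc([0.9, 0.8], [0.1, 0.2]))
print(sampled_auc([0.5], [0.5]))
```

A value near 0.5 means the index does no better than random scoring, which is why 0.5 serves as the reference point when the indexes are compared.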
4. The method of claim 2, wherein, starting from the similarity of node i, an improved node similarity is finally calculated; the implementation process comprises the following sub-steps:
step 1: 8 indexes from link prediction are applied to the similarity calculation, and a virtual node $V_{virtual}$, also called the "observer", is introduced; a virtual edge exists between $V_{virtual}$ and every node in the network, giving 8 node similarity calculation indexes, as shown in Table 1;
Table 1: node similarity calculation incorporating virtual nodes
link prediction is performed on the network G, and the index with the best AUC is selected as the optimal index O, wherein the index O reflects the evolution mechanism of the network;
the AUC is used as the main measure of the accuracy of the link prediction algorithm, and is specifically defined as:
$$\mathrm{AUC} = \frac{n' + 0.5\,n''}{n}$$
wherein n represents the number of comparisons, n' represents the number of times the score of an edge randomly selected from the test set is greater than the score of an edge randomly selected from the set of nonexistent edges, and n'' represents the number of times the two scores are equal;
step 2: calculating the evolution weight of each node;
step 3: calculating the improved node similarity.
5. The method of claim 4, wherein the evolution weight of a node is determined by the ratio of the node's average link prediction precision to the random evolution value 0.5, namely:
wherein the link prediction precision of node $v_i$ is defined as the average link prediction precision of the edges newly added by $v_i$ in $g_{t+1}$,
wherein the average is taken over the set of edges newly added by $v_i$ in $g_{t+1}$; the value reflects the degree of conformity between the change of the node and the evolution rule, and the larger the value, the more the change of the node conforms to the evolution rule.
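The ratio described in this claim can be illustrated with a few lines of Python. This is a hedged sketch: the function name `evolution_weight` and its input (a list with one link prediction precision value per newly added edge) are assumptions, and because the text does not say what happens for a node without new edges, the sketch raises an error in that case.

```python
RANDOM_EVOLUTION_VALUE = 0.5  # precision expected from purely random evolution


def evolution_weight(new_edge_precisions):
    """Weight = average precision of the node's new edges / 0.5."""
    if not new_edge_precisions:
        # Not specified in the text; refuse rather than guess a default.
        raise ValueError("node has no newly added edges in g_{t+1}")
    average = sum(new_edge_precisions) / len(new_edge_precisions)
    return average / RANDOM_EVOLUTION_VALUE


print(evolution_weight([0.75, 0.75]))
print(evolution_weight([0.5, 0.5]))
```

Under this reading, a weight above 1 means the node's new edges are better predicted than chance, that is, the change conforms to the evolution rule, while a weight below 1 means the change runs against it.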
6. The method of claim 4, wherein the evolution weight of a node is calculated by the following formula:
where α denotes a scaling factor, intended to distinguish three levels more clearly; the link prediction precision of node $v_i$ is defined as the average link prediction precision of the edges newly added by $v_i$ in $g_{t+1}$,
wherein the average is taken over the set of edges newly added by $v_i$ in $g_{t+1}$; the value reflects the degree of conformity between the change of the node and the evolution rule, and the larger the value, the more the change of the node conforms to the evolution rule.
7. The method for detecting social networking events based on link prediction as claimed in claim 2, wherein the step 2 is implemented by the following steps:
step 2.1, calculating an event occurrence value interval according to the existing event;
step 2.2, determining an event occurrence threshold T;
step 2.3, traversing GraphS and outputting an event sequence EventS.
8. The method for detecting social network events based on link prediction as claimed in claim 2, wherein step 2 is implemented as follows: according to the event sequence EventO = {k | an event occurs at t = k, k ∈ [1, m], m ≤ n} composed of event points that have already occurred, the evolution sequence GraphS is analyzed and an event occurrence value interval [L, H] is learned, where L is the lower boundary of event occurrence and H is the upper boundary; T ∈ [L, H] is selected as the occurrence threshold, the threshold T being determined manually; at t = k, if the volatility of the period [k, k+1] exceeds T, the network is in an event state in the time period [k, k+1], otherwise it is in a relatively steady state; after the analysis is finished, an event sequence EventS is finally output:
9. The method for social network event detection based on link prediction according to any one of claims 2 to 8, wherein the evaluation criterion of the event detection method is:
suppose $k_1$ and $k_2$ are two adjacent event points in the network G, with $k_1 + 1 < k_2$; G is in a relatively steady state during $[k_1+1, k_2]$ and in the event state during $[k_2, k_2+1]$, and the event sensitivity Per is defined as:
wherein Per > 1;
the event sensitivity Per is the ratio of the volatility of the network's event segment to the average volatility of its steady segment, and a larger ratio indicates that the event is more easily detected.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610374849.4A CN106100870A (en) | 2016-05-31 | 2016-05-31 | A kind of community network event detecting method based on link prediction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106100870A true CN106100870A (en) | 2016-11-09 |
Family
ID=57230318
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610374849.4A Pending CN106100870A (en) | 2016-05-31 | 2016-05-31 | A kind of community network event detecting method based on link prediction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106100870A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100145771A1 (en) * | 2007-03-15 | 2010-06-10 | Ariel Fligler | System and method for providing service or adding benefit to social networks |
US20130144818A1 (en) * | 2011-12-06 | 2013-06-06 | The Trustees Of Columbia University In The City Of New York | Network information methods devices and systems |
CN103995866A (en) * | 2014-05-19 | 2014-08-20 | 北京邮电大学 | Commodity information pushing method and device based on link forecasting |
CN105354749A (en) * | 2015-10-16 | 2016-02-24 | 重庆邮电大学 | Social network based mobile terminal user grouping method |
Non-Patent Citations (1)
Title |
---|
Hu Wenbin et al.: "基于链路预测的社会网络事件检测方法" (Social network event detection method based on link prediction), 《软件学报》 (Journal of Software) * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107086933A (en) * | 2017-05-23 | 2017-08-22 | 杨武略 | A kind of link prediction method based on Bayesian Estimation and seed node degree |
CN110245237A (en) * | 2018-03-09 | 2019-09-17 | 北京国双科技有限公司 | Event prediction method and device |
CN109347697A (en) * | 2018-10-10 | 2019-02-15 | 南昌航空大学 | Opportunistic network link prediction method, apparatus and readable storage medium storing program for executing |
CN109347697B (en) * | 2018-10-10 | 2019-12-03 | 南昌航空大学 | Opportunistic network link prediction method, apparatus and readable storage medium storing program for executing |
CN109245952A (en) * | 2018-11-16 | 2019-01-18 | 大连理工大学 | A kind of disappearance link prediction method based on MPA model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20161109 |