CN112597439A - Method and system for detecting abnormal account of online social network - Google Patents
Method and system for detecting abnormal account of online social network
- Publication number
- CN112597439A, CN202011428803.9A
- Authority
- CN
- China
- Prior art keywords
- node
- centrality
- importance
- value
- graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
The invention relates to a method and a system for detecting abnormal accounts in an online social network. The method comprises the following steps: generating a simple graph from a user relationship data set of the online social network; calculating the importance of each node from the simple graph through a node importance evaluation algorithm, where the importance of a node comprises its in-degree centrality, closeness centrality and betweenness centrality; fusing the importance measures of each node to obtain an importance fusion value for each node; taking the importance fusion value of each node as the weight of that node, thereby converting the simple graph into a weighted graph; combining the weight and the out-degree of each node in the weighted graph, performing power iteration of trust seed propagation, and assigning a corresponding trust value to each node in the weighted graph; and determining the accounts corresponding to the nodes with smaller trust values in the weighted graph to be abnormal accounts. The method and the system can improve the accuracy of abnormal account detection.
Description
Technical Field
The invention relates to the field of account detection, in particular to a method and a system for detecting an abnormal account of an online social network.
Background
With the widespread use of the Internet and mobile terminals, online social networks play an increasingly important role in people's daily work, study and life. Different types of online social networks that meet different needs keep emerging. Together with other technologies, they have driven the rapid development of the digital economy and have become ever more tightly linked with daily life. The large user base of an online social network therefore often represents a large economic benefit. While these networks bring convenience and meet many needs, they also bring risks. Profiting from fake accounts, bot accounts and hijacked accounts has become a common phenomenon in online social networks, and it severely harms the experience of normal users and the security of their personal information and property. Some malicious users use abnormal accounts to spread rumors, hype events, inflame sensitive topics and steer public opinion in harmful directions, which also affects social stability and unity. Abnormal accounts have thus seriously damaged the reputation evaluation systems of online social networks and the trust relationships among users. Analyzing and discovering abnormal accounts has therefore become one of the key problems to be solved in the development of the digital economy.
Through the joint efforts of academia and industry, many solutions for detecting abnormal accounts have been proposed. These schemes can be broadly divided into two categories: supervised detection schemes based on behavioral characteristics and content, and unsupervised detection schemes based on graph structure. Supervised detection schemes include extracting information entropy from user content and behavioral characteristics, performing semantic detection or behavior analysis based on registration information and user activity in combination with an LDA (Latent Dirichlet Allocation) model, building classifiers from user-defined abnormal behaviors, filtering trigger words, and similar methods. Because a supervised method requires a classifier to be trained in advance, while abnormal accounts or attackers continuously update their behavior (attack) patterns to avoid detection, its ability to detect unknown attack patterns is poor.
Disclosure of Invention
The invention aims to provide a method and a system for detecting abnormal account numbers of an online social network, so as to improve the accuracy of abnormal account number detection.
In order to achieve the purpose, the invention provides the following scheme:
a method for detecting abnormal account numbers of an online social network comprises the following steps:
generating a simple graph from a user relationship data set of the online social network; the user relationship data set comprises the accounts of the users and the association relationships among the accounts; the nodes of the simple graph are user accounts, and the edges are association relationships between two users;
calculating the importance of each node from the simple graph through a node importance evaluation algorithm; the importance of a node comprises the in-degree centrality, the closeness centrality and the betweenness centrality of the node;
fusing the importance measures of each node to obtain an importance fusion value for each node;
taking the importance fusion value of each node as the weight of the corresponding node, and converting the simple graph into a weighted graph;
combining the weight and the out-degree of each node in the weighted graph, performing power iteration of trust seed propagation, and assigning a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes randomly selected in the weighted graph, and each trust seed is assigned an initial trust value;
and determining the accounts corresponding to the nodes with smaller trust values in the weighted graph to be abnormal accounts.
Optionally, calculating the importance of each node from the simple graph through a node importance evaluation algorithm specifically includes:
calculating the in-degree centrality of each node using the formula $C_D(u) = \frac{\sum_{v \neq u} X_{vu}}{n-1}$; where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that there is a connection pointing from node v to node u and $X_{vu} = 0$ meaning that there is no connection pointing from node v to node u; n-1 is the number of all nodes in the simple graph except node u;
calculating the closeness centrality of each node using the formula $C_C(u) = \frac{1}{\sum_{v \neq u} d(v,u)}$; where $C_C(u)$ is the closeness centrality of node u; d(v, u) is the length of the shortest path from node v to node u;
calculating the betweenness centrality of each node using the formula $C_B(u) = \sum_{s,t \in V} \frac{\sigma(s,t \mid u)}{\sigma(s,t)}$; where $C_B(u)$ is the betweenness centrality of node u; V is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; $\sigma(s,t \mid u)$ is the number of those shortest paths from node s to node t that pass through node u.
Optionally, fusing the importance measures of each node to obtain an importance fusion value for each node specifically includes:
normalizing the in-degree centrality of each node using a formula [not reproduced in the source text] whose result is the normalized in-degree centrality of node u; $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; n is the number of nodes in the simple graph;
normalizing the closeness centrality of each node using the formula $C'_C(u) = \frac{C_C(u) - \min_i C_C(i)}{\max_i C_C(i) - \min_i C_C(i)}$; where $C'_C(u)$ is the normalized closeness centrality of node u; $C_C(u)$ is the closeness centrality of node u; $\min_i C_C(i)$ is the minimum closeness centrality over all nodes of the simple graph; $\max_i C_C(i)$ is the maximum closeness centrality over all nodes of the simple graph;
fusing the importance measures of each node using the Euler formula [not reproduced in the source text], whose result is the importance fusion value of node u; $C_B(u)$ is the betweenness centrality of node u.
Optionally, combining the weight and the out-degree of each node in the weighted graph, performing power iteration of trust seed propagation, and assigning a corresponding trust value to each node in the weighted graph specifically includes:
determining the weight of each edge in each propagation direction according to the weights and out-degrees of the nodes at the two ends of the edge;
based on the out-degree of each node, performing $O(\log n)$ power iterations using the trust update formula [not reproduced in the source text] to obtain the trust value of each node; where $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; outdeg(v) is the out-degree of node v; the formula also uses the importance fusion value of node v; U is the set of edges in the weighted graph; (u, v) denotes the edge pointing from node v to node u.
The invention also provides a system for detecting abnormal accounts in an online social network, which comprises:
a simple graph generating module, used for generating a simple graph from a user relationship data set of the online social network; the user relationship data set comprises the accounts of the users and the association relationships among the accounts; the nodes of the simple graph are user accounts, and the edges are association relationships between two users;
a node importance calculating module, used for calculating the importance of each node from the simple graph through a node importance evaluation algorithm; the importance of a node comprises the in-degree centrality, the closeness centrality and the betweenness centrality of the node;
an importance fusion module, used for fusing the importance measures of each node to obtain an importance fusion value for each node;
a weighted graph generating module, used for taking the importance fusion value of each node as the weight of the corresponding node and converting the simple graph into a weighted graph;
a trust value transfer module, used for combining the weight and the out-degree of each node in the weighted graph, performing power iteration of trust seed propagation, and assigning a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes randomly selected in the weighted graph, and each trust seed is assigned an initial trust value;
and an abnormal account determining module, used for determining the accounts corresponding to the nodes with smaller trust values in the weighted graph to be abnormal accounts.
Optionally, the node importance calculating module specifically includes:
an in-degree centrality calculation unit, used for calculating the in-degree centrality of each node using the formula $C_D(u) = \frac{\sum_{v \neq u} X_{vu}}{n-1}$; where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that there is a connection pointing from node v to node u and $X_{vu} = 0$ meaning that there is no connection pointing from node v to node u; n-1 is the number of all nodes in the simple graph except node u;
a closeness centrality calculation unit, used for calculating the closeness centrality of each node using the formula $C_C(u) = \frac{1}{\sum_{v \neq u} d(v,u)}$; where $C_C(u)$ is the closeness centrality of node u; d(v, u) is the length of the shortest path from node v to node u;
a betweenness centrality calculation unit, used for calculating the betweenness centrality of each node using the formula $C_B(u) = \sum_{s,t \in V} \frac{\sigma(s,t \mid u)}{\sigma(s,t)}$; where $C_B(u)$ is the betweenness centrality of node u; V is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; $\sigma(s,t \mid u)$ is the number of those shortest paths from node s to node t that pass through node u.
Optionally, the importance fusion module specifically includes:
an in-degree centrality normalization unit, used for normalizing the in-degree centrality of each node using a formula [not reproduced in the source text] whose result is the normalized in-degree centrality of node u; $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; n is the number of nodes in the simple graph;
a closeness centrality normalization unit, used for normalizing the closeness centrality of each node using the formula $C'_C(u) = \frac{C_C(u) - \min_i C_C(i)}{\max_i C_C(i) - \min_i C_C(i)}$; where $C'_C(u)$ is the normalized closeness centrality of node u; $C_C(u)$ is the closeness centrality of node u; $\min_i C_C(i)$ is the minimum closeness centrality over all nodes of the simple graph; $\max_i C_C(i)$ is the maximum closeness centrality over all nodes of the simple graph;
an importance fusion unit, used for fusing the importance measures of each node using the Euler formula [not reproduced in the source text], whose result is the importance fusion value of node u; $C_B(u)$ is the betweenness centrality of node u.
Optionally, the trust value transfer module specifically includes:
an edge weight determining unit, used for determining the weight of each edge in each propagation direction according to the weights and out-degrees of the nodes at the two ends of the edge;
a trust value transfer unit, used for performing, based on the out-degree of each node, $O(\log n)$ power iterations using the trust update formula [not reproduced in the source text] to obtain the trust value of each node; where $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; outdeg(v) is the out-degree of node v; the formula also uses the importance fusion value of node v; U is the set of edges in the weighted graph; (u, v) denotes the edge pointing from node v to node u.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
In the method, the trust values of the nodes in the weighted graph are iterated using the importance of the nodes, and abnormal accounts are finally identified from the node trust values, which effectively improves the accuracy of abnormal account detection.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a method for detecting an abnormal account in an online social network according to the present invention;
FIG. 2 is a schematic diagram of a simple graph formed by an online social network;
FIG. 3 is a schematic diagram of a weighted graph;
FIG. 4 is a schematic diagram of a first iteration of trust seed delivery using the method of the present invention;
FIG. 5 is a schematic diagram of a second iteration of trust seed delivery using the method of the present invention;
FIG. 6 is a schematic diagram of a third iteration of trust seed delivery using the method of the present invention;
FIG. 7 is a schematic diagram of a fourth iteration of trust seed delivery using the method of the present invention;
FIG. 8 is a schematic diagram of trust seed delivery after iteration is completed using the method of the present invention;
FIG. 9 is a schematic diagram of trust seed delivery after iteration is completed using the SybilRank algorithm;
FIG. 10 is a schematic structural diagram of a system for detecting an abnormal account in an online social network according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In contrast to supervised detection, unsupervised detection schemes are mainly graph-based and do not need a classifier to be trained in advance. In essence, they use a friend relationship graph and judge the relationship between unknown nodes and known nodes through algorithms such as random walk, adaptive maximum flow, power iteration and Markov random fields in order to detect whether a node is abnormal. Such methods can detect unknown abnormal behaviors and are not easily bypassed by attackers, so they have gradually become a research hotspot in abnormal account detection. However, unsupervised detection has obvious shortcomings: its accuracy is relatively low compared with supervised detection, and its detection performance varies across different types of online social networks. At present, graph-based detection methods are studied mostly in theory and are deployed relatively rarely in practice.
To address the need for credit evaluation based on online social networks in the Internet finance industry, the invention proposes three criteria for graph-structure-based abnormal account detection and performs data cleaning and credit evaluation on that basis, on the view that network structure matters more in credit evaluation than the behavioral characteristics of individual users. The invention provides an improved algorithm, based on the SybilRank algorithm, for detecting abnormal accounts. It redefines the power iteration formula of SybilRank using the importance of nodes, which effectively improves the accuracy of abnormal account detection. In addition, abnormal account detection for large-scale social networks is implemented on Pregel, a distributed graph-computing framework, which reduces time overhead.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Fig. 1 is a schematic flow chart of a method for detecting an abnormal account in an online social network according to the present invention. As shown in fig. 1, the method for detecting an abnormal account in an online social network of the present invention includes the following steps:
Step 100: a simple graph is generated from a user relationship data set of an online social network. The user relationship data set includes the accounts of the users and the association relationships between the accounts. User accounts in the online social network are taken as nodes, and association relationships between accounts, formed by following each other or in other ways, are represented as edges. Some association relationships are bidirectional, such as friend relationships; some are one-way, such as comment replies. A simple graph is formed from the association relationships between the accounts.
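Step 100 can be illustrated with a short Python sketch. The record layout `(source, target, bidirectional)` and the function name `build_simple_graph` are assumptions made for this example and do not come from the patent; the sketch only shows how one-way and bidirectional relationships might map onto directed edges of a simple graph.

```python
from collections import defaultdict


def build_simple_graph(relations):
    """Build a directed simple graph as {node: set of successor nodes}.

    `relations` is a hypothetical list of (source, target, bidirectional)
    records; this layout is an assumption for illustration only.
    """
    successors = defaultdict(set)
    nodes = set()
    for src, dst, bidirectional in relations:
        if src == dst:
            continue  # a simple graph has no self-loops
        nodes.update((src, dst))
        successors[src].add(dst)  # sets silently drop parallel edges
        if bidirectional:
            successors[dst].add(src)
    # Every node gets an entry, even if it has no outgoing edge.
    return {node: set(successors[node]) for node in nodes}


relations = [
    ("alice", "bob", True),     # friendship: edges in both directions
    ("carol", "alice", False),  # comment reply: one-way edge
    ("carol", "alice", False),  # duplicate record, collapsed by the set
    ("bob", "bob", False),      # self-loop, skipped
]
graph = build_simple_graph(relations)
```

Storing successors in sets is one simple way to enforce the "no parallel edges" property of a simple graph.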
The SybilRank algorithm is a detection algorithm based on a random walk model. It selects some nodes as trust seeds, propagates trust values to the other nodes through O(log n) power iterations, normalizes the trust values by node degree and sorts the results; nodes with smaller trust values are regarded as suspicious. Online social networks are usually directed graphs, whereas SybilRank is an abnormal account detection algorithm for undirected graphs. When a graph formed by directed edges is processed with SybilRank, the original topology of the directed graph is changed and some original attributes of the online social network are lost, which reduces accuracy. For example, simply converting a directed graph into an undirected graph so that SybilRank can be applied loses much information. Because an undirected graph does not distinguish out-degree from in-degree, the degree of a node is determined only by the edges connected to it. An attacker can therefore often evade SybilRank by following many normal accounts to raise its own degree. Different abnormal accounts can even follow one another to imitate a normal network structure and avoid detection.
In a directed graph, it is difficult for an abnormal account to imitate the network structure of a normal account, because abnormal accounts follow normal accounts in large numbers while normal accounts rarely follow abnormal accounts. Moreover, the SybilRank algorithm is only suitable for online social networks with few attack edges. As the number of attack edges increases, the effectiveness of the algorithm gradually decreases. The algorithm is also sensitive to the distribution of attack edges: the farther the attack edges are from the selected trust seeds, the better the detection performance.
Based on this, the invention adopts an improved SybilRank algorithm to identify abnormal accounts in an online social network. Because online social networks often reflect the social characteristics of the accounts themselves, different accounts present different levels of importance. If more trust can be distributed to nodes of high importance, the accuracy of SybilRank in identifying abnormal accounts can be greatly improved. Because the online social network can hardly provide user weights directly, and the simple graph alone does not reflect the importance of its nodes, the simple graph is converted into a weighted graph by assigning an effective trust weight to each node, so as to improve accuracy. The specific process is shown in steps 200 to 400.
Step 200: calculate the importance of each node from the simple graph through a node importance evaluation algorithm. The importance of a node includes the in-degree centrality, closeness centrality and betweenness centrality of the node.
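For reference, the baseline procedure described above can be sketched in Python. This is a minimal illustration that assumes the commonly published form of SybilRank on an undirected graph, in which every node splits its current trust equally among its neighbours; the names `sybilrank`, `adjacency`, `seeds` and `total_trust`, and the choice of ceil(log2 n) rounds as a concrete instance of O(log n), are assumptions of this example.

```python
import math


def sybilrank(adjacency, seeds, total_trust=1.0):
    """Baseline SybilRank sketch on an undirected graph.

    adjacency: {node: set of neighbours}, symmetric by assumption.
    seeds: nodes that share `total_trust` equally at the start.
    Returns degree-normalized trust scores.
    """
    n = len(adjacency)
    trust = {u: 0.0 for u in adjacency}
    for s in seeds:
        trust[s] = total_trust / len(seeds)
    rounds = max(1, math.ceil(math.log2(n)))  # early termination, O(log n)
    for _ in range(rounds):
        nxt = {u: 0.0 for u in adjacency}
        for v, neighbours in adjacency.items():
            if not neighbours:
                continue  # isolated node: nothing to distribute
            share = trust[v] / len(neighbours)
            for u in neighbours:
                nxt[u] += share
        trust = nxt
    # Normalize by degree so high-degree nodes are not favoured.
    return {
        u: (trust[u] / len(adjacency[u]) if adjacency[u] else 0.0)
        for u in adjacency
    }


adjacency = {
    "h1": {"h2", "h3"},
    "h2": {"h1", "h3"},
    "h3": {"h1", "h2", "s1"},  # h3-s1 plays the role of an attack edge
    "s1": {"h3", "s2"},
    "s2": {"s1"},
}
scores = sybilrank(adjacency, seeds=["h1"])
suspects_first = sorted(scores, key=scores.get)  # lowest trust first
```

Because the degree used here is the undirected degree, an account can raise it simply by following others, which is the weakness the surrounding text points out.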
In a real online social network, the more followers an account has, the higher its prestige, and the more important the corresponding node is in the graph. In-degree centrality is thus the most direct index of node centrality. It is denoted $C_D(u)$:
$$C_D(u) = \frac{\sum_{v \neq u} X_{vu}}{n-1}$$
where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that there is a connection pointing from node v to node u and $X_{vu} = 0$ meaning that there is no connection pointing from node v to node u; n-1 is the number of all nodes in the simple graph except node u.
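As a small worked example, in-degree centrality can be computed by counting, for each node u, how many of the other n-1 nodes point to it. The function name `in_degree_centrality` and the graph representation `{node: set of successors}` are assumptions of this sketch.

```python
def in_degree_centrality(graph):
    """In-degree centrality: incoming edge count divided by n - 1.

    graph: {node: set of successor nodes}; every node must appear as a key
    and the graph is assumed to have at least two nodes.
    """
    n = len(graph)
    in_degree = {u: 0 for u in graph}
    for v, successors in graph.items():
        for u in successors:
            in_degree[u] += 1  # X_vu = 1: v points to u
    return {u: in_degree[u] / (n - 1) for u in graph}


graph = {
    "a": {"b", "c"},
    "b": {"c"},
    "c": set(),
    "d": {"c"},
}
cd = in_degree_centrality(graph)
# "c" is pointed to by all three other nodes, so cd["c"] is 1.0.
```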
Closeness centrality is an index of proximity in the network. It reflects the distance between a node and all other nodes and describes how easily the node can be reached through the network. Its value is the reciprocal of the sum of the shortest distances between all other nodes in the network and the node; the closer a node is to the other nodes, the greater its closeness centrality and the higher its importance. It is denoted $C_C(u)$:
$$C_C(u) = \frac{1}{\sum_{v \neq u} d(v,u)}$$
where $C_C(u)$ is the closeness centrality of node u; d(v, u) is the length of the shortest path from node v to node u, $d(v,u) = \min(X_{v1} + X_{12} + \dots + X_{ij} + \dots + X_{(k-1)k} + X_{ku})$, and 1, 2, ..., i, j, ..., k-1, k are the nodes passed in sequence on the path from node v to node u.
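The reciprocal-of-summed-distances definition above can be sketched with a breadth-first search over reversed edges, which yields d(v, u) for every v that can reach u. The text does not say how unreachable nodes are handled, so this sketch makes an assumption: unreachable nodes are skipped, and a node that nobody can reach gets centrality 0. The function name `closeness_centrality` is also an assumption.

```python
from collections import deque


def closeness_centrality(graph):
    """Closeness centrality as 1 / sum of shortest distances d(v, u).

    graph: {node: set of successor nodes} with unweighted directed edges.
    Assumption: nodes that cannot reach u are ignored in the sum.
    """
    predecessors = {u: set() for u in graph}
    for v, successors in graph.items():
        for u in successors:
            predecessors[u].add(v)

    result = {}
    for u in graph:
        # BFS backwards from u gives the distance from each v to u.
        dist = {u: 0}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            for p in predecessors[x]:
                if p not in dist:
                    dist[p] = dist[x] + 1
                    queue.append(p)
        total = sum(dist.values())
        result[u] = 1.0 / total if total else 0.0
    return result


graph = {"a": {"b"}, "b": {"c"}, "c": set()}
cc = closeness_centrality(graph)
# For "c": d(b, c) = 1 and d(a, c) = 2, so cc["c"] is 1 / 3.
```

Skipping unreachable nodes can favour nodes with a single close predecessor (here `cc["b"]` is 1.0), so a real deployment would need to decide how to treat disconnected parts of the graph.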
Betweenness centrality refers to the number of times a node lies on the shortest path between any two other nodes in the network; it expresses the importance of the node in connecting other nodes. It is denoted $C_B(u)$:
$$C_B(u) = \sum_{s,t \in V} \frac{\sigma(s,t \mid u)}{\sigma(s,t)}$$
where $C_B(u)$ is the betweenness centrality of node u; V is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; $\sigma(s,t \mid u)$ is the number of those shortest paths from node s to node t that pass through node u.
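The quantities σ(s, t) and σ(s, t | u) defined above can be computed directly on small graphs. The sketch below assumes the standard definition, in which the ratio σ(s, t | u) / σ(s, t) is summed over all ordered pairs (s, t) with s, t and u distinct; it uses a simple all-pairs approach rather than an optimized algorithm, and the function names are assumptions of this example.

```python
from collections import deque


def shortest_path_counts(graph, source):
    """BFS from `source`: distance and number of shortest paths per node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in graph[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                sigma[y] = 0
                queue.append(y)
            if dist[y] == dist[x] + 1:
                sigma[y] += sigma[x]  # every shortest path to x extends to y
    return dist, sigma


def betweenness_centrality(graph):
    """Sum of sigma(s, t | u) / sigma(s, t) over ordered pairs s != u != t."""
    info = {s: shortest_path_counts(graph, s) for s in graph}
    result = {u: 0.0 for u in graph}
    for source in graph:
        dist_s, sigma_s = info[source]
        for target in dist_s:
            if target == source:
                continue
            for mid in graph:
                if mid in (source, target) or mid not in dist_s:
                    continue
                dist_m, sigma_m = info[mid]
                # mid lies on a shortest path iff the distances add up.
                if target in dist_m and dist_s[mid] + dist_m[target] == dist_s[target]:
                    through = sigma_s[mid] * sigma_m[target]
                    result[mid] += through / sigma_s[target]
    return result


# Diamond graph: two shortest paths from "s" to "t", one via "a", one via "b".
graph = {"s": {"a", "b"}, "a": {"t"}, "b": {"t"}, "t": set()}
cb = betweenness_centrality(graph)
```

In the diamond graph each of the two middle nodes carries half of the shortest paths from `s` to `t`, so each receives a betweenness of 0.5.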
Step 300: fuse the importance measures of each node to obtain an importance fusion value for each node. First, the in-degree centrality and the closeness centrality are normalized in a suitable manner.
The in-degree centrality of each node is normalized using a formula [not reproduced in the source text] in terms of $C_D(u)$, $C_D(i)$ and n,
where the result is the normalized in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; and n denotes the number of nodes in the simple graph.
The closeness centrality of each node is normalized: $$C'_C(u) = \frac{C_C(u) - \min_i C_C(i)}{\max_i C_C(i) - \min_i C_C(i)}$$
where $C'_C(u)$ is the normalized closeness centrality of node u; $\min_i C_C(i)$ is the minimum closeness centrality over all nodes of the simple graph; $\max_i C_C(i)$ is the maximum closeness centrality over all nodes of the simple graph.
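The normalization of closeness centrality is defined in terms of the minimum and maximum over all nodes, which suggests ordinary min-max scaling. The sketch below assumes that form; the function name `min_max_normalize` and the handling of the degenerate case where all values are equal (mapped to 0.0) are assumptions of this example.

```python
def min_max_normalize(values):
    """Scale a {node: value} mapping linearly onto the range [0, 1]."""
    lowest = min(values.values())
    highest = max(values.values())
    if highest == lowest:
        # Assumption: with no spread, every node gets 0.0.
        return {u: 0.0 for u in values}
    span = highest - lowest
    return {u: (x - lowest) / span for u, x in values.items()}


closeness = {"a": 0.0, "b": 0.25, "c": 1.0}
normalized = min_max_normalize(closeness)
```

After scaling, the node with the smallest closeness centrality maps to 0 and the node with the largest maps to 1, which puts the measure on a common scale before fusion.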
Then, the importance measures of each node are fused using the Euler formula [not reproduced in the source text],
where the result is the importance fusion value of node u.
Step 400: take the importance fusion value of each node as the weight of the corresponding node, and convert the simple graph into a weighted graph.
Step 500: combine the weight and the out-degree of each node in the weighted graph, perform power iteration of trust seed propagation, and assign a corresponding trust value to each node in the weighted graph. The trust seeds are a subset of nodes randomly selected in the weighted graph, and each trust seed is assigned an initial trust value. Because SybilRank is essentially a random walk algorithm on an undirected graph, the invention instead performs the power iteration of trust seeds based on out-degree. Starting from the simple graph shown in FIG. 2 (in FIG. 2, gray nodes are Sybil accounts, that is, abnormal accounts, and white nodes are non-Sybil accounts), the weight of each node is calculated first. The weight of each edge is then calculated from the weights of the nodes at its two ends and their out-degrees and is assigned to the edge, which gives the weighted graph shown in FIG. 3. In this graph the connections between nodes may be regarded as undirected, but they carry the direction of information transfer between nodes implicitly.
Then, based on the start node and end node of each information transfer and the out-degree of each node, O(log n) power iterations are performed on the trust values to obtain the trust value of each node. The trust value update formula [not reproduced in the source text] uses the following quantities:
where $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; outdeg(v) is the out-degree of node v; the formula also uses the importance fusion value of node v; U is the set of edges in the weighted graph; (u, v) denotes the edge pointing from node v to node u; $\sum_{(u,v) \in U}$ denotes summation of the corresponding terms over all edges pointing to node u.
Any two nodes in FIG. 3 are taken as trust seeds and given initial trust values to carry out the iteration; the iteration process is shown in FIGS. 4 to 8. The trust values calculated during the iteration are marked directly on the nodes. FIG. 8 shows the final trust values obtained after the iteration is completed.
Step 600: determine the accounts corresponding to the nodes with smaller trust values in the weighted graph to be abnormal accounts.
To compare the final results, iteration is also performed with the conventional SybilRank algorithm on the simple graph shown in FIG. 2; the final rank values are shown in FIG. 9. Comparing FIG. 8 with FIG. 9 shows that, after the improved algorithm completes its iteration, the trust values of all nodes in the benign region are greater than the trust values of all nodes in the Sybil region. In the result of the original SybilRank algorithm, 3 of the 6 nodes in the benign region have trust values lower than that of a node in the Sybil region, so the resulting ranking is clearly not ideal. In this example, the improved algorithm therefore ranks the accounts markedly more accurately than the original SybilRank algorithm.
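The exact update formula is not legible in this text, so the following Python sketch is only one plausible reading of the listed quantities, not the patent's formula. It assumes that in each round node v passes the amount T(v) × weight(v) / outdeg(v) along each of its outgoing edges, and that node u sums what arrives over all edges pointing to it. The names `propagate_trust`, `weight`, `initial_trust` and `rounds`, and the use of ceil(log2 n) rounds, are assumptions of this example.

```python
import math


def propagate_trust(graph, weight, seeds, initial_trust=1.0, rounds=None):
    """Out-degree-based trust propagation (assumed form, see lead-in).

    graph: {node: set of successor nodes}; weight: {node: fusion value}.
    seeds: trust seeds, each starting with `initial_trust`.
    """
    if rounds is None:
        rounds = max(1, math.ceil(math.log2(len(graph))))  # O(log n) rounds
    trust = {u: 0.0 for u in graph}
    for s in seeds:
        trust[s] = initial_trust
    for _ in range(rounds):
        nxt = {u: 0.0 for u in graph}
        for v, successors in graph.items():
            if not successors:
                continue  # outdeg(v) = 0: v has nothing to pass on
            # ASSUMPTION: trust is scaled by v's weight and split by out-degree.
            share = trust[v] * weight[v] / len(successors)
            for u in successors:
                nxt[u] += share
        trust = nxt
    return trust


graph = {"a": {"b", "c"}, "b": set(), "c": set()}
weight = {"a": 0.5, "b": 1.0, "c": 1.0}
trust = propagate_trust(graph, weight, seeds=["a"], rounds=1)
```

Because only outgoing edges carry trust, an account cannot raise its own score merely by following many others, which is the motivation the text gives for using out-degree.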
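Step 600 does not state how small a trust value must be, so any cut-off is a deployment choice. The sketch below assumes a hypothetical `threshold` parameter and made-up trust values that are purely illustrative and are not taken from the patent's figures.

```python
def flag_abnormal(trust, threshold):
    """Return accounts whose trust value is below `threshold`, lowest first.

    `threshold` is a hypothetical parameter; the source only says that
    nodes with smaller trust values are treated as abnormal.
    """
    ranked = sorted(trust.items(), key=lambda item: item[1])
    return [account for account, value in ranked if value < threshold]


# Illustrative values only.
trust = {"a": 0.4, "b": 0.05, "c": 0.3, "d": 0.01}
suspects = flag_abnormal(trust, threshold=0.1)
```

An alternative with the same ranking would be to flag a fixed number of lowest-ranked accounts instead of using a threshold.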
Based on the above method, the invention further provides a system for detecting abnormal accounts of an online social network; Fig. 10 is a schematic structural diagram of this system. As shown in Fig. 10, the system for detecting abnormal accounts of an online social network of the present invention includes:
a simple graph generation module 1001, configured to generate a simple graph from a user relationship data set of the online social network; the user relationship data set comprises the accounts of the users and the association relationships among the accounts; the nodes in the simple graph are user accounts, and the edges are association relationships between two users.
A node importance calculating module 1002, configured to calculate the importance of each node from the simple graph through a node importance evaluation algorithm; the importance of a node includes its in-degree centrality, closeness centrality, and betweenness centrality.
An importance fusion module 1003, configured to fuse the importance values of each node to obtain an importance fusion value of each node.
A weighted graph generating module 1004, configured to take the importance fusion value of each node as the weight of that node and convert the simple graph into a weighted graph.
A trust value transfer module 1005, configured to perform power iteration of trust seed transfer using the weight and out-degree of each node in the weighted graph, and to assign a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes randomly selected from the weighted graph, and each trust seed is assigned an initial trust value.
An abnormal account determining module 1006, configured to determine the accounts corresponding to the nodes with smaller trust values in the weighted graph as abnormal accounts.
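The first module's job, turning a user relationship data set into a simple graph, might look like the sketch below. It assumes the relationships arrive as directed `(source, target)` account pairs, which fits the later use of in-degree and out-degree but is not stated explicitly; the function name `build_simple_graph` is hypothetical.

```python
def build_simple_graph(accounts, relations):
    """Build a directed simple graph as an adjacency dict of sets.

    accounts  : iterable of account identifiers (the nodes)
    relations : iterable of (source, target) pairs (assumed directed)
    """
    graph = {account: set() for account in accounts}
    for source, target in relations:
        if source == target:
            continue  # a simple graph has no self-loops
        # Sets drop duplicate edges, so at most one edge exists per direction.
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())
    return graph
```

Storing neighbors in sets keeps the graph simple by construction, so the later centrality and trust steps do not need to deduplicate edges.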
As a specific embodiment, in the system for detecting an abnormal account in an online social network according to the present invention, the node importance calculating module 1002 specifically includes:
an in-degree centrality calculation unit, configured to calculate the in-degree centrality of each node using the formula $C_D(u) = \frac{\sum_{v} X_{vu}}{n-1}$; where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that there is a connection from node v to node u and $X_{vu} = 0$ meaning that there is none; and $n-1$ is the number of nodes in the simple graph other than node u.
A closeness centrality calculation unit, configured to calculate the closeness centrality of each node using a formula in which $C_C(u)$ is the closeness centrality of node u and $d(v, u)$ is the length of the shortest path from node v to node u.
A betweenness centrality calculation unit, configured to calculate the betweenness centrality of each node using the formula $C_B(u) = \sum_{s,t \in V} \frac{\sigma(s,t \mid u)}{\sigma(s,t)}$; where $C_B(u)$ is the betweenness centrality of node u; $V$ is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; and $\sigma(s,t \mid u)$ is the number of those shortest paths that pass through node u.
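The three centrality measures can be computed with breadth-first search on an unweighted directed graph. The sketch below is illustrative only. The closeness form `(n - 1) / sum of d(v, u)` is an assumption, since the text defines only the distance term, and betweenness is summed over ordered pairs `(s, t)` with `s`, `t`, and `u` distinct. The function names are hypothetical.

```python
from collections import deque


def bfs_counts(graph, source):
    """Shortest-path lengths and shortest-path counts from `source`."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]  # every shortest path to v extends to w
    return dist, sigma


def centralities(graph):
    """Return (in_degree, closeness, betweenness) dicts; needs >= 2 nodes."""
    nodes = sorted(graph)
    n = len(nodes)
    info = {s: bfs_counts(graph, s) for s in nodes}
    in_degree, closeness, betweenness = {}, {}, {}
    for u in nodes:
        # In-degree centrality: number of incoming edges divided by n - 1.
        in_degree[u] = sum(1 for v in nodes if u in graph[v]) / (n - 1)
        # Closeness (assumed form), using only nodes v that can reach u.
        total = sum(info[v][0][u] for v in nodes if v != u and u in info[v][0])
        closeness[u] = (n - 1) / total if total else 0.0
        # Betweenness: sum of sigma(s,t|u) / sigma(s,t) over ordered pairs.
        dist_u, sigma_u = info[u]
        score = 0.0
        for s in nodes:
            dist_s, sigma_s = info[s]
            if s == u or u not in dist_s:
                continue
            for t in nodes:
                if t in (s, u) or t not in dist_s or t not in dist_u:
                    continue
                if dist_s[u] + dist_u[t] == dist_s[t]:
                    score += sigma_s[u] * sigma_u[t] / sigma_s[t]
        betweenness[u] = score
    return in_degree, closeness, betweenness
```

This brute-force betweenness is cubic in the number of nodes; for graphs of realistic size a dedicated algorithm such as Brandes' would be the usual choice.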
As a specific embodiment, in the system for detecting an abnormal account in an online social network, the importance fusion module 1003 specifically includes:
an in-degree centrality normalization unit, configured to normalize the in-degree centrality of each node using a formula that yields the normalized in-degree centrality of node u; where $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; and $n$ is the number of nodes in the simple graph.
A closeness centrality normalization unit, configured to normalize the closeness centrality of each node using the min-max formula $\frac{C_C(u) - \mathrm{Min}\,C_C(i)}{\mathrm{Max}\,C_C(i) - \mathrm{Min}\,C_C(i)}$, which yields the normalized closeness centrality of node u; where $C_C(u)$ is the closeness centrality of node u; $\mathrm{Min}\,C_C(i)$ is the minimum closeness centrality among all nodes of the simple graph; and $\mathrm{Max}\,C_C(i)$ is the maximum closeness centrality among all nodes of the simple graph.
An importance fusion unit, configured to fuse the importance values of each node using Euler's formula, which yields the importance fusion value of node u; where $C_B(u)$ is the betweenness centrality of node u.
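The min-max normalization applied to the closeness values can be sketched as follows. Mapping every value to 0 when all inputs are equal is an assumption added to avoid division by zero, and the function name `min_max_normalize` is hypothetical. The in-degree normalization and the fusion step are not sketched because their formulas are not recoverable from the text.

```python
def min_max_normalize(values):
    """Rescale a dict of node -> value into the range [0, 1].

    Each value becomes (value - min) / (max - min) over all nodes.
    """
    lo = min(values.values())
    hi = max(values.values())
    if hi == lo:
        # Assumption: with no spread, every node gets 0 instead of dividing by 0.
        return {node: 0.0 for node in values}
    return {node: (value - lo) / (hi - lo) for node, value in values.items()}
```

After this step the node with the lowest closeness maps to 0 and the node with the highest maps to 1.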
As a specific embodiment, in the system for detecting an abnormal account in an online social network, the trust value transmitting module 1005 specifically includes:
an edge weight determining unit, configured to determine the weight of each edge in each transmission direction according to the weights and out-degrees of the nodes at the two ends of the edge.
A trust value transfer unit, configured to perform, based on the out-degree of each node, $O(\log n)$ rounds of power iteration with the trust value update formula to obtain the trust value of each node; where $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; $\mathrm{outdeg}(v)$ is the out-degree of node v; the formula also takes the importance fusion value of node v as an input; $U$ is the set of edges in the weighted graph; and $(u, v)$ denotes the edge pointing from node v to node u.
The embodiments in this description are presented progressively: each embodiment focuses on its differences from the others, and the parts that are the same or similar can be cross-referenced between embodiments. Because the disclosed system corresponds to the disclosed method, its description is relatively brief; for the relevant details, refer to the description of the method.
Specific examples are used herein to explain the principles and embodiments of the invention; they are provided only to help in understanding the method and its core concept. A person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. Accordingly, the content of this description should not be construed as limiting the invention.
Claims (8)
1. A method for detecting abnormal accounts of an online social network, characterized by comprising the following steps:
generating a simple graph from a user relationship data set of the online social network; the user relationship data set comprises the accounts of the users and the association relationships among the accounts; the nodes in the simple graph are user accounts, and the edges are association relationships between two users;
calculating the importance of each node from the simple graph through a node importance evaluation algorithm; the importance of a node comprises its in-degree centrality, closeness centrality, and betweenness centrality;
fusing the importance values of each node to obtain an importance fusion value of each node;
taking the importance fusion value of each node as the weight of that node, and converting the simple graph into a weighted graph;
performing power iteration of trust seed transfer using the weight and out-degree of each node in the weighted graph, and assigning a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes randomly selected from the weighted graph, and each trust seed is assigned an initial trust value;
and determining the accounts corresponding to the nodes with smaller trust values in the weighted graph as abnormal accounts.
2. The method for detecting the abnormal account of the online social network according to claim 1, wherein the calculating the importance of each node through a node importance evaluation algorithm according to the simple graph specifically comprises:
calculating the in-degree centrality of each node using the formula $C_D(u) = \frac{\sum_{v} X_{vu}}{n-1}$; where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that there is a connection from node v to node u and $X_{vu} = 0$ meaning that there is none; and $n-1$ is the number of nodes in the simple graph other than node u;
calculating the closeness centrality of each node using a formula in which $C_C(u)$ is the closeness centrality of node u and $d(v, u)$ is the length of the shortest path from node v to node u;
calculating the betweenness centrality of each node using the formula $C_B(u) = \sum_{s,t \in V} \frac{\sigma(s,t \mid u)}{\sigma(s,t)}$; where $C_B(u)$ is the betweenness centrality of node u; $V$ is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; and $\sigma(s,t \mid u)$ is the number of those shortest paths that pass through node u.
3. The method for detecting the abnormal account of the online social network according to claim 1, wherein the fusing the importance of each node to obtain an importance fused value of each node specifically comprises:
normalizing the in-degree centrality of each node using a formula that yields the normalized in-degree centrality of node u; where $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; and $n$ is the number of nodes in the simple graph;
normalizing the closeness centrality of each node using the min-max formula $\frac{C_C(u) - \mathrm{Min}\,C_C(i)}{\mathrm{Max}\,C_C(i) - \mathrm{Min}\,C_C(i)}$, which yields the normalized closeness centrality of node u; where $C_C(u)$ is the closeness centrality of node u; $\mathrm{Min}\,C_C(i)$ is the minimum closeness centrality among all nodes of the simple graph; and $\mathrm{Max}\,C_C(i)$ is the maximum closeness centrality among all nodes of the simple graph;
4. The method for detecting an abnormal account in an online social network according to claim 1, wherein the power iteration of trust seed delivery is performed in combination with the weight and the out-degree of each node in the weighted graph, and each node in the weighted graph is assigned with a corresponding trust value, specifically comprising:
determining the weight of each edge in each transmission direction according to the weight and the out degree of nodes at two ends of each edge;
performing, based on the out-degree of each node, $O(\log n)$ rounds of power iteration with the trust value update formula to obtain the trust value of each node; where $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; $\mathrm{outdeg}(v)$ is the out-degree of node v; the formula also takes the importance fusion value of node v as an input; $U$ is the set of edges in the weighted graph; and $(u, v)$ denotes the edge pointing from node v to node u.
5. A system for detecting abnormal accounts of an online social network, characterized by comprising:
a simple graph generating module, configured to generate a simple graph from a user relationship data set of the online social network; the user relationship data set comprises the accounts of the users and the association relationships among the accounts; the nodes in the simple graph are user accounts, and the edges are association relationships between two users;
a node importance calculating module, configured to calculate the importance of each node from the simple graph through a node importance evaluation algorithm; the importance of a node comprises its in-degree centrality, closeness centrality, and betweenness centrality;
an importance fusion module, configured to fuse the importance values of each node to obtain an importance fusion value of each node;
a weighted graph generating module, configured to take the importance fusion value of each node as the weight of that node and convert the simple graph into a weighted graph;
a trust value transfer module, configured to perform power iteration of trust seed transfer using the weight and out-degree of each node in the weighted graph, and to assign a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes randomly selected from the weighted graph, and each trust seed is assigned an initial trust value;
and an abnormal account determining module, configured to determine the accounts corresponding to the nodes with smaller trust values in the weighted graph as abnormal accounts.
6. The system for detecting an abnormal account in an online social network according to claim 5, wherein the node importance calculating module specifically includes:
an in-degree centrality calculation unit, configured to calculate the in-degree centrality of each node using the formula $C_D(u) = \frac{\sum_{v} X_{vu}}{n-1}$; where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that there is a connection from node v to node u and $X_{vu} = 0$ meaning that there is none; and $n-1$ is the number of nodes in the simple graph other than node u;
a closeness centrality calculation unit, configured to calculate the closeness centrality of each node using a formula in which $C_C(u)$ is the closeness centrality of node u and $d(v, u)$ is the length of the shortest path from node v to node u;
a betweenness centrality calculation unit, configured to calculate the betweenness centrality of each node using the formula $C_B(u) = \sum_{s,t \in V} \frac{\sigma(s,t \mid u)}{\sigma(s,t)}$; where $C_B(u)$ is the betweenness centrality of node u; $V$ is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; and $\sigma(s,t \mid u)$ is the number of those shortest paths that pass through node u.
7. The system for detecting the abnormal account of the online social network according to claim 5, wherein the importance fusion module specifically comprises:
an in-degree centrality normalization unit, configured to normalize the in-degree centrality of each node using a formula that yields the normalized in-degree centrality of node u; where $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; and $n$ is the number of nodes in the simple graph;
a closeness centrality normalization unit, configured to normalize the closeness centrality of each node using the min-max formula $\frac{C_C(u) - \mathrm{Min}\,C_C(i)}{\mathrm{Max}\,C_C(i) - \mathrm{Min}\,C_C(i)}$, which yields the normalized closeness centrality of node u; where $C_C(u)$ is the closeness centrality of node u; $\mathrm{Min}\,C_C(i)$ is the minimum closeness centrality among all nodes of the simple graph; and $\mathrm{Max}\,C_C(i)$ is the maximum closeness centrality among all nodes of the simple graph;
8. The system for detecting abnormal accounts of an online social network according to claim 5, wherein the trust value transfer module specifically comprises:
an edge weight determining unit, configured to determine the weight of each edge in each transmission direction according to the weights and out-degrees of the nodes at the two ends of the edge;
a trust value transfer unit, configured to perform, based on the out-degree of each node, $O(\log n)$ rounds of power iteration with the trust value update formula to obtain the trust value of each node; where $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; $\mathrm{outdeg}(v)$ is the out-degree of node v; the formula also takes the importance fusion value of node v as an input; $U$ is the set of edges in the weighted graph; and $(u, v)$ denotes the edge pointing from node v to node u.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011428803.9A CN112597439B (en) | 2020-12-07 | 2020-12-07 | Method and system for detecting abnormal account number of online social network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112597439A CN112597439A (en) | 2021-04-02 |
CN112597439B CN112597439B (en) | 2024-03-01 |
Family
ID=75191163
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011428803.9A Active CN112597439B (en) | 2020-12-07 | 2020-12-07 | Method and system for detecting abnormal account number of online social network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112597439B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108932669A (en) * | 2018-06-27 | 2018-12-04 | 北京工业大学 | A kind of abnormal account detection method based on supervised analytic hierarchy process (AHP) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113378899A (en) * | 2021-05-28 | 2021-09-10 | 百果园技术(新加坡)有限公司 | Abnormal account identification method, device, equipment and storage medium |
CN113326437A (en) * | 2021-06-22 | 2021-08-31 | 哈尔滨工程大学 | Microblog early rumor detection method based on dual-engine network and DRQN |
CN113326437B (en) * | 2021-06-22 | 2022-06-21 | 哈尔滨工程大学 | Microblog early rumor detection method based on dual-engine network and DRQN |
CN113610521A (en) * | 2021-07-27 | 2021-11-05 | 胜斗士(上海)科技技术发展有限公司 | Method and apparatus for detecting anomalies in behavioral data |
Also Published As
Publication number | Publication date |
---|---|
CN112597439B (en) | 2024-03-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||