CN112597439A - Method and system for detecting abnormal account of online social network - Google Patents

Publication number
CN112597439A
Authority
CN
China
Prior art keywords
node
centrality
importance
value
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011428803.9A
Other languages
Chinese (zh)
Other versions
CN112597439B (en)
Inventor
邓明森
丁健
喻曦
龙昌庭
刘振涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou University of Finance and Economics
Original Assignee
Guizhou University of Finance and Economics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou University of Finance and Economics
Priority to CN202011428803.9A
Publication of CN112597439A
Application granted
Publication of CN112597439B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Abstract

The invention relates to a method and a system for detecting abnormal accounts in an online social network. The method comprises the following steps: generating a simple graph from a user relationship data set of the online social network; calculating the importance of each node from the simple graph through a node importance evaluation algorithm, where the importance of a node comprises its in-degree centrality, closeness centrality, and betweenness centrality; fusing the importance measures of each node to obtain an importance fusion value for each node; taking the importance fusion value of each node as the weight of that node, thereby converting the simple graph into a weighted graph; performing power iteration of trust seed propagation using the weight and out-degree of each node in the weighted graph, so that each node in the weighted graph is assigned a trust value; and determining the accounts corresponding to the nodes with smaller trust values in the weighted graph to be abnormal accounts. The method and the system can improve the accuracy of abnormal account detection.

Description

Method and system for detecting abnormal account of online social network
Technical Field
The invention relates to the field of account detection, in particular to a method and a system for detecting an abnormal account of an online social network.
Background
With the widespread use of the internet and mobile terminals, online social networks play an increasingly important role in people's daily work, study, and life. Different types of online social networks that meet different needs keep emerging. Together with other technologies, they have driven the rapid development of the digital economy and have become ever more tightly linked with daily life. The large user base of an online social network therefore often represents a large economic benefit. These networks bring convenience and meet a variety of needs, but they also bring risks. Profiting from fake accounts, bot accounts, and hijacked accounts has become common in online social networks, and it seriously harms the experience of normal users and the security of their personal information and property. Some malicious users use abnormal accounts to spread rumors, hype stories, and stoke sensitive topics; such attempts to steer public opinion also affect social stability and unity. Abnormal accounts have thus seriously damaged the reputation evaluation system of online social networks and the trust relationships among users. Techniques for analyzing and discovering abnormal accounts have therefore become one of the key problems to be solved in the development of the digital economy.
Academia and industry have together proposed many solutions for detecting abnormal accounts. These schemes fall broadly into two categories: supervised detection schemes based on behavioral characteristics and content, and unsupervised detection schemes based on graph structure. Supervised detection schemes include extracting information entropy from user content and behavioral characteristics, semantic or behavioral analysis of registration information and user activity combined with an LDA (Latent Dirichlet Allocation) model, building classifiers from custom-defined abnormal behaviors, and filtering trigger words. However, supervised detection requires a classifier to be trained in advance, while abnormal accounts or attackers often keep changing their behavior (attack) patterns to avoid detection, so these schemes detect unknown attack patterns poorly.
Disclosure of Invention
The invention aims to provide a method and a system for detecting abnormal accounts in an online social network, so as to improve the accuracy of abnormal account detection.
In order to achieve this purpose, the invention provides the following scheme:
A method for detecting abnormal accounts in an online social network comprises the following steps:
generating a simple graph from a user relationship data set of the online social network; the user relationship data set comprises the accounts of the users and the association relations among the accounts; the nodes of the simple graph are user accounts, and the edges are association relations between two users;
calculating the importance of each node from the simple graph through a node importance evaluation algorithm; the importance of a node comprises its in-degree centrality, closeness centrality, and betweenness centrality;
fusing the importance measures of each node to obtain an importance fusion value for each node;
taking the importance fusion value of each node as the weight of that node, and converting the simple graph into a weighted graph;
performing power iteration of trust seed propagation using the weight and out-degree of each node in the weighted graph, and assigning each node in the weighted graph a trust value; the trust seeds are a subset of nodes randomly selected in the weighted graph, and each trust seed is given an initial trust value;
and determining the accounts corresponding to the nodes with smaller trust values in the weighted graph to be abnormal accounts.
Optionally, calculating the importance of each node from the simple graph through a node importance evaluation algorithm specifically includes:
using the formula
$$C_D(u) = \frac{\sum_{v} X_{vu}}{n-1}$$
to calculate the in-degree centrality of each node, where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that there is a connection directed from node v to node u and $X_{vu} = 0$ meaning that there is none; and n-1 is the number of nodes in the simple graph other than node u;
using the formula
$$C_C(u) = \frac{1}{\sum_{v} d(v, u)}$$
to calculate the closeness centrality of each node, where $C_C(u)$ is the closeness centrality of node u, and $d(v, u)$ is the length of the shortest path from node v to node u;
using the formula
$$C_B(u) = \sum_{s \neq u \neq t \in V} \frac{\sigma(s, t \mid u)}{\sigma(s, t)}$$
to calculate the betweenness centrality of each node, where $C_B(u)$ is the betweenness centrality of node u; V is the set of nodes in the simple graph; $\sigma(s, t)$ is the number of shortest paths from node s to node t; and $\sigma(s, t \mid u)$ is the number of those shortest paths that pass through node u.
Optionally, fusing the importance measures of each node to obtain the importance fusion value of each node specifically includes:
using the formula
Figure BDA0002820065260000031
to normalize the in-degree centrality of each node, where
Figure BDA0002820065260000032
is the normalized in-degree centrality of node u; $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; and n is the number of nodes in the simple graph;
using the formula
Figure BDA0002820065260000033
to normalize the closeness centrality of each node, where
Figure BDA0002820065260000034
is the normalized closeness centrality of node u; $C_C(u)$ is the closeness centrality of node u; $\mathrm{Min}\,C_C(i)$ is the minimum closeness centrality over all nodes of the simple graph; and $\mathrm{Max}\,C_C(i)$ is the maximum closeness centrality over all nodes of the simple graph;
using Euler's formula
Figure BDA0002820065260000035
to fuse the importance measures of each node, where
Figure BDA0002820065260000036
is the importance fusion value of node u, and $C_B(u)$ is the betweenness centrality of node u.
Optionally, performing power iteration of trust seed propagation using the weight and out-degree of each node in the weighted graph, and assigning each node in the weighted graph a trust value, specifically includes:
determining the weight of each edge in each propagation direction from the weights and out-degrees of the nodes at its two ends;
based on the out-degree of each node, using the formula
Figure BDA0002820065260000037
to perform O(log n) power iterations and obtain the trust value of each node, where $T^{(i)}(u)$ is the trust value of node u after the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v after the (i-1)-th iteration; outdeg(v) is the out-degree of node v;
Figure BDA0002820065260000038
is the importance fusion value of node v; U is the set of edges in the weighted graph; and (u, v) denotes the edge directed from node v to node u.
The invention also provides a system for detecting abnormal accounts in an online social network, which comprises:
a simple graph generation module, configured to generate a simple graph from a user relationship data set of the online social network; the user relationship data set comprises the accounts of the users and the association relations among the accounts; the nodes of the simple graph are user accounts, and the edges are association relations between two users;
a node importance calculation module, configured to calculate the importance of each node from the simple graph through a node importance evaluation algorithm; the importance of a node comprises its in-degree centrality, closeness centrality, and betweenness centrality;
an importance fusion module, configured to fuse the importance measures of each node to obtain an importance fusion value for each node;
a weighted graph generation module, configured to take the importance fusion value of each node as the weight of that node and convert the simple graph into a weighted graph;
a trust value transfer module, configured to perform power iteration of trust seed propagation using the weight and out-degree of each node in the weighted graph, and to assign each node in the weighted graph a trust value; the trust seeds are a subset of nodes randomly selected in the weighted graph, and each trust seed is given an initial trust value;
and an abnormal account determination module, configured to determine the accounts corresponding to the nodes with smaller trust values in the weighted graph to be abnormal accounts.
Optionally, the node importance calculation module specifically includes:
an in-degree centrality calculation unit, configured to use the formula
$$C_D(u) = \frac{\sum_{v} X_{vu}}{n-1}$$
to calculate the in-degree centrality of each node, where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that there is a connection directed from node v to node u and $X_{vu} = 0$ meaning that there is none; and n-1 is the number of nodes in the simple graph other than node u;
a closeness centrality calculation unit, configured to use the formula
$$C_C(u) = \frac{1}{\sum_{v} d(v, u)}$$
to calculate the closeness centrality of each node, where $C_C(u)$ is the closeness centrality of node u, and $d(v, u)$ is the length of the shortest path from node v to node u;
a betweenness centrality calculation unit, configured to use the formula
$$C_B(u) = \sum_{s \neq u \neq t \in V} \frac{\sigma(s, t \mid u)}{\sigma(s, t)}$$
to calculate the betweenness centrality of each node, where $C_B(u)$ is the betweenness centrality of node u; V is the set of nodes in the simple graph; $\sigma(s, t)$ is the number of shortest paths from node s to node t; and $\sigma(s, t \mid u)$ is the number of those shortest paths that pass through node u.
Optionally, the importance fusion module specifically includes:
an in-degree centrality normalization unit, configured to use the formula
Figure BDA0002820065260000051
to normalize the in-degree centrality of each node, where
Figure BDA0002820065260000052
is the normalized in-degree centrality of node u; $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; and n is the number of nodes in the simple graph;
a closeness centrality normalization unit, configured to use the formula
Figure BDA0002820065260000053
to normalize the closeness centrality of each node, where
Figure BDA0002820065260000054
is the normalized closeness centrality of node u; $C_C(u)$ is the closeness centrality of node u; $\mathrm{Min}\,C_C(i)$ is the minimum closeness centrality over all nodes of the simple graph; and $\mathrm{Max}\,C_C(i)$ is the maximum closeness centrality over all nodes of the simple graph;
an importance fusion unit, configured to use Euler's formula
Figure BDA0002820065260000055
to fuse the importance measures of each node, where
Figure BDA0002820065260000056
is the importance fusion value of node u, and $C_B(u)$ is the betweenness centrality of node u.
Optionally, the trust value transfer module specifically includes:
an edge weight determination unit, configured to determine the weight of each edge in each propagation direction from the weights and out-degrees of the nodes at its two ends;
a trust value transfer unit, configured to use, based on the out-degree of each node, the formula
Figure BDA0002820065260000057
to perform O(log n) power iterations and obtain the trust value of each node, where $T^{(i)}(u)$ is the trust value of node u after the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v after the (i-1)-th iteration; outdeg(v) is the out-degree of node v;
Figure BDA0002820065260000058
is the importance fusion value of node v; U is the set of edges in the weighted graph; and (u, v) denotes the edge directed from node v to node u.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
In the method, the trust values of the nodes in the weighted graph are iterated using the importance of the nodes, and abnormal accounts are finally identified by the node trust values, which effectively improves the accuracy of abnormal account detection.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a method for detecting an abnormal account in an online social network according to the present invention;
FIG. 2 is a schematic diagram of a simple graph formed by an online social network;
FIG. 3 is a schematic diagram of a weighted graph;
FIG. 4 is a schematic diagram of the first iteration of trust seed propagation using the method of the present invention;
FIG. 5 is a schematic diagram of the second iteration of trust seed propagation using the method of the present invention;
FIG. 6 is a schematic diagram of the third iteration of trust seed propagation using the method of the present invention;
FIG. 7 is a schematic diagram of the fourth iteration of trust seed propagation using the method of the present invention;
FIG. 8 is a schematic diagram of trust seed propagation after iteration is completed using the method of the present invention;
FIG. 9 is a schematic diagram of trust seed propagation after iteration is completed using the SybilRank algorithm;
FIG. 10 is a schematic structural diagram of a system for detecting an abnormal account in an online social network according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Unlike supervised detection, unsupervised detection schemes are mainly graph-based and do not require a classifier to be trained in advance. In essence, they use the friend relationship graph and judge the relationship between unknown nodes and known nodes through algorithms such as random walk, adaptive maximum flow, power iteration, and Markov random fields, and thereby detect whether a node is abnormal. Such methods can detect previously unknown abnormal behavior and are hard for attackers to bypass, so they have gradually become a research focus in abnormal account detection. However, unsupervised detection has clear drawbacks: its accuracy is lower than that of supervised detection, and its effectiveness varies across different types of online social networks. At present, graph-based detection is studied extensively in theory but deployed relatively rarely in practice.
To meet the needs of credit evaluation based on online social networks in the internet finance industry, the invention proposes three criteria for graph-structure-based abnormal account detection and performs data cleaning and credit evaluation on that basis, so that the credit evaluation process gives more weight to the network structure than to the behavioral characteristics of individual users. The invention provides an improved algorithm based on the SybilRank algorithm for detecting abnormal accounts; it redefines the power iteration formula of SybilRank using the importance of nodes and effectively improves the accuracy of abnormal account detection. In addition, abnormal account detection for large-scale social networks is implemented on Pregel, a distributed graph computing framework, which reduces the time overhead.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Fig. 1 is a schematic flow chart of a method for detecting an abnormal account in an online social network according to the present invention. As shown in fig. 1, the method for detecting an abnormal account in an online social network of the present invention includes the following steps:
Step 100: Generate a simple graph from a user relationship data set of the online social network. The user relationship data set includes the accounts of the users and the associations between the accounts. User accounts in the online social network are taken as nodes, and the associations formed between accounts by following each other or in other ways are represented as edges. Some associations are bidirectional, such as friend relationships; others are one-way, such as comment replies. A simple graph is formed from the associations between the accounts.
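As an illustration of Step 100, the following minimal sketch builds a directed simple graph from relationship records. The record layout `(source, target, bidirectional)` and the function name `build_simple_graph` are assumptions made for this example; the patent does not specify a data format.

```python
from collections import defaultdict


def build_simple_graph(relations):
    """Build a directed simple graph (no self-loops, no duplicate edges).

    `relations` is an iterable of (source, target, bidirectional) tuples,
    a hypothetical record layout chosen for this sketch.
    Returns a dict mapping every account to the set of accounts it points to.
    """
    successors = defaultdict(set)
    nodes = set()
    for source, target, bidirectional in relations:
        nodes.update((source, target))
        if source == target:
            continue  # a simple graph has no self-loops
        successors[source].add(target)
        if bidirectional:
            successors[target].add(source)
    # Every account appears as a key, even if it has no outgoing edge.
    return {node: set(successors[node]) for node in nodes}


relations = [
    ("alice", "bob", True),     # friendship: two-way association
    ("carol", "alice", False),  # comment reply: one-way association
    ("carol", "alice", False),  # duplicate record, collapsed by the set
]
graph = build_simple_graph(relations)
```

Storing successors in sets is one simple way to keep the graph free of parallel edges; other representations would work equally well.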
The SybilRank algorithm is a detection algorithm based on a random walk model. It selects some nodes as trust seeds, propagates trust values to the other nodes through O(log n) power iterations, normalizes the trust values by node degree, and ranks the results; nodes with smaller trust values are regarded as suspicious. Online social networks are usually directed graphs, whereas SybilRank is an abnormal account detection algorithm for undirected graphs. Therefore, when SybilRank processes a graph made of directed edges, the original topology of the directed graph is changed and some original attributes of the online social network are lost, which reduces accuracy. For example, simply converting a directed graph into an undirected graph so that it satisfies
Figure BDA0002820065260000081
loses much information. Because an undirected graph does not distinguish out-degree from in-degree, the degree of a node is determined only by the edges connected to it. An attacker can therefore often evade SybilRank by following many normal accounts to raise its own degree. Different abnormal accounts can even follow each other to imitate a normal network structure and avoid detection.
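The information loss described above can be seen on a toy example. The sketch below uses a hypothetical edge list in which one account follows three others and is followed by nobody; the account names and the helper `degree_views` are illustrative assumptions, not part of the patent.

```python
def degree_views(edges):
    """Return (in_degree, out_degree, undirected_degree) for a directed edge list."""
    in_deg, out_deg, neighbours = {}, {}, {}
    for src, dst in edges:
        out_deg[src] = out_deg.get(src, 0) + 1
        in_deg[dst] = in_deg.get(dst, 0) + 1
        # In the undirected view, direction is discarded.
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)
    undirected = {node: len(nb) for node, nb in neighbours.items()}
    return in_deg, out_deg, undirected


# "sybil" follows three accounts; nobody follows it back.
edges = [("sybil", "u1"), ("sybil", "u2"), ("sybil", "u3"), ("u1", "u2")]
in_deg, out_deg, undirected = degree_views(edges)
```

In the directed view the account has in-degree 0, yet in the undirected view it has the highest degree in the graph, which is the kind of inflation an attacker can exploit.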
In a directed graph, it is difficult for an abnormal account to imitate the network structure of a normal account: abnormal accounts follow normal accounts in large numbers, but normal accounts rarely follow abnormal accounts. In addition, SybilRank is only suitable for online social networks with few attack edges. As the number of attack edges grows, its effectiveness gradually declines. It is also sensitive to the distribution of attack edges: the farther the attack edges are from the selected trust seeds, the better it detects.
Based on this, the invention uses an improved SybilRank algorithm to identify abnormal accounts in the online social network. Because an online social network often mirrors the social characteristics of the account holders, different accounts carry different levels of importance. If more trust seeds can be assigned to nodes of high importance, the accuracy of SybilRank in identifying abnormal accounts can be greatly improved. Because an online social network rarely provides user weights directly, the importance of nodes is not reflected in the graph; the invention therefore assigns an effective trust weight to each node, turning the simple graph into a weighted graph, to improve accuracy. The specific process is shown in steps 200 to 400.
Step 200: Calculate the importance of each node from the simple graph through a node importance evaluation algorithm. The importance of a node includes its in-degree centrality, closeness centrality, and betweenness centrality.
In a real online social network, the more followers an account has, the higher its prestige and the more important the corresponding node is in the graph. In-degree centrality is therefore the most direct index of node centrality, denoted $C_D(u)$:
$$C_D(u) = \frac{\sum_{v} X_{vu}}{n-1}$$
where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that there is a connection directed from node v to node u and $X_{vu} = 0$ meaning that there is none; and n-1 is the number of nodes in the simple graph other than node u.
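The definitions above correspond to the usual in-degree centrality, the number of incoming edges divided by n-1. A minimal sketch follows, assuming the graph is stored as a dict from each node to the set of nodes it points to (a representation chosen for this example only).

```python
def in_degree_centrality(graph):
    """In-degree centrality: incoming edge count divided by (n - 1).

    `graph` maps each node to the set of nodes it points to; every node
    must appear as a key (an assumption of this sketch).
    """
    n = len(graph)
    if n < 2:
        return {u: 0.0 for u in graph}
    in_deg = {u: 0 for u in graph}
    for v, targets in graph.items():
        for u in targets:
            in_deg[u] += 1  # X_vu = 1: there is an edge from v to u
    return {u: count / (n - 1) for u, count in in_deg.items()}


graph = {"a": {"b", "c"}, "b": {"c"}, "c": set()}
cd = in_degree_centrality(graph)
```

In this three-node example, node `c` is pointed to by both other nodes and therefore reaches the maximum value of 1.0.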
Closeness centrality measures how close a node is to all other nodes in the network, and describes how easily the node and the other nodes can reach one another through the network. Its value is the reciprocal of the sum of the shortest distances from all other nodes in the network to the node: the closer a node is to the other nodes, the greater its closeness centrality and the higher its importance. It is denoted $C_C(u)$:
$$C_C(u) = \frac{1}{\sum_{v} d(v, u)}$$
where $C_C(u)$ is the closeness centrality of node u, and $d(v, u)$ is the length of the shortest path from node v to node u, $d(v, u) = \min(X_{v1} + X_{12} + \dots + X_{ij} + \dots + X_{(k-1)k} + X_{ku})$, in which 1, 2, ..., i, j, ..., (k-1), k are the nodes passed through in sequence on the path from node v to node u.
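A sketch of this computation follows, taking closeness centrality as the reciprocal of the sum of shortest distances into the node, as the text describes. It runs a breadth-first search over reversed edges so that distances are measured from v to u. How unreachable nodes are handled is not stated in the source; skipping them, and returning 0.0 when no node can reach u, are assumptions of this example.

```python
from collections import deque


def closeness_centrality(graph):
    """Closeness of u = 1 / sum of shortest-path lengths d(v, u) over nodes v.

    `graph` maps each node to the set of nodes it points to. Nodes that
    cannot reach u are skipped (an assumption of this sketch).
    """
    predecessors = {u: set() for u in graph}
    for v, targets in graph.items():
        for u in targets:
            predecessors[u].add(v)

    result = {}
    for u in graph:
        # BFS backwards from u gives d(v, u) for every v that can reach u.
        dist = {u: 0}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            for p in predecessors[x]:
                if p not in dist:
                    dist[p] = dist[x] + 1
                    queue.append(p)
        total = sum(dist.values())
        result[u] = 1.0 / total if total > 0 else 0.0
    return result


graph = {"a": {"b"}, "b": {"c"}, "c": set()}
cc = closeness_centrality(graph)
```

On the chain a → b → c, node `c` is reached from `b` in one step and from `a` in two, so the sum of distances is 3.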
Betweenness centrality counts how often a node lies on the shortest paths between any two other nodes in the network, and expresses the importance of the node in connecting other nodes. It is denoted $C_B(u)$:
$$C_B(u) = \sum_{s \neq u \neq t \in V} \frac{\sigma(s, t \mid u)}{\sigma(s, t)}$$
where $C_B(u)$ is the betweenness centrality of node u; V is the set of nodes in the simple graph; $\sigma(s, t)$ is the number of shortest paths from node s to node t; and $\sigma(s, t \mid u)$ is the number of those shortest paths that pass through node u.
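The quantities σ(s, t) and σ(s, t | u) can be computed with breadth-first search. The sketch below is a deliberately simple cubic-time version for small graphs, using the identity σ(s, t | u) = σ(s, u)·σ(u, t) whenever d(s, u) + d(u, t) = d(s, t). It is an illustrative choice, not the implementation used in the patent; a production system would more likely use Brandes' algorithm.

```python
from collections import deque


def shortest_path_counts(graph, source):
    """BFS from `source`: distances and number of shortest paths to each node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in graph[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                sigma[y] = 0
                queue.append(y)
            if dist[y] == dist[x] + 1:
                sigma[y] += sigma[x]  # every shortest path to x extends to y
    return dist, sigma


def betweenness_centrality(graph):
    """Sum over ordered pairs (s, t) of sigma(s, t | u) / sigma(s, t)."""
    info = {s: shortest_path_counts(graph, s) for s in graph}
    result = {u: 0.0 for u in graph}
    for s in graph:
        dist_s, sigma_s = info[s]
        for t in dist_s:
            if t == s:
                continue
            for u in graph:
                if u == s or u == t or u not in dist_s:
                    continue
                dist_u, sigma_u = info[u]
                # u lies on a shortest s-t path iff the distances add up.
                if t in dist_u and dist_s[u] + dist_u[t] == dist_s[t]:
                    result[u] += sigma_s[u] * sigma_u[t] / sigma_s[t]
    return result


graph = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": {"e"}, "e": set()}
cb = betweenness_centrality(graph)
```

In this example, node `d` lies on every shortest path into `e` from `a`, `b`, and `c`, so it receives the highest score.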
Step 300: Fuse the importance measures of each node to obtain an importance fusion value for each node. First, the in-degree centrality and the closeness centrality are normalized in a preferred manner.
The in-degree centrality of each node is normalized:
Figure BDA0002820065260000093
where
Figure BDA0002820065260000094
is the normalized in-degree centrality of node u, and n is the number of nodes in the simple graph.
The closeness centrality of each node is normalized:
Figure BDA0002820065260000095
where
Figure BDA0002820065260000096
is the normalized closeness centrality of node u; $\mathrm{Min}\,C_C(i)$ is the minimum closeness centrality over all nodes of the simple graph; and $\mathrm{Max}\,C_C(i)$ is the maximum closeness centrality over all nodes of the simple graph.
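The normalization formulas themselves are images that are not reproduced in this text. The minimum and maximum terms defined above suggest ordinary min-max scaling for the closeness centrality; the sketch below shows that scaling under this assumption and should not be read as the patent's exact formula. The function name and the handling of the all-equal case are also assumptions.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] as (x - min) / (max - min).

    Assumed form of the closeness normalization; if all values are equal,
    every node is mapped to 0.0 (a choice made for this sketch).
    """
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {node: 0.0 for node in values}
    return {node: (x - lo) / (hi - lo) for node, x in values.items()}


closeness = {"a": 0.25, "b": 0.5, "c": 1.0}
normalized = min_max_normalize(closeness)
```

After scaling, the node with the smallest closeness maps to 0 and the node with the largest maps to 1, which puts closeness on a scale comparable with the other measures before fusion.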
Then, the importance measures of each node are fused using Euler's formula:
Figure BDA0002820065260000101
where
Figure BDA0002820065260000102
is the importance fusion value of node u.
Step 400: Take the importance fusion value of each node as the weight of that node, and convert the simple graph into a weighted graph.
Step 500: Perform power iteration of trust seed propagation using the weight and out-degree of each node in the weighted graph, and assign each node in the weighted graph a trust value. The trust seeds are a subset of nodes randomly selected in the weighted graph, and each trust seed is given an initial trust value. Because SybilRank is essentially a random walk algorithm on undirected graphs, the invention instead performs the power iteration of the trust seeds based on out-degree. Starting from the simple graph shown in fig. 2 (gray nodes are Sybil accounts, that is, abnormal accounts; white nodes are non-Sybil accounts), the weight of each node is calculated first; the weight of each edge is then calculated from the weights and out-degrees of the nodes at its two ends and assigned to the edge, giving the weighted graph shown in fig. 3. In that figure, the connections between nodes may be regarded as undirected, but the direction of information transfer between nodes is implied.
Then, based on the start node and end node of each information transfer and the out-degree of each node, O(log n) power iterations are performed on the trust values to obtain the trust value of each node. The trust value update formula is as follows:
Figure BDA0002820065260000103
where $T^{(i)}(u)$ is the trust value of node u after the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v after the (i-1)-th iteration; outdeg(v) is the out-degree of node v;
Figure BDA0002820065260000104
is the importance fusion value of node v; U is the set of edges in the weighted graph; (u, v) denotes the edge directed from node v to node u; and $\sum_{(u,v) \in U}$ denotes summing the parameter
Figure BDA0002820065260000105
over all edges pointing to node u.
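The update formula is an image that is not reproduced in this text. From the symbols defined above, one plausible reading is that each node v passes its previous trust, scaled by its importance fusion value and divided by its out-degree, along every outgoing edge. The sketch below implements that assumed rule; the exact form, the even split of initial trust among the seeds, and the choice of ⌈log₂ n⌉ iterations are assumptions of this example.

```python
import math


def propagate_trust(graph, weights, seeds, total_trust=1.0):
    """Power iteration of trust over a directed graph.

    Assumed update: T_i(u) = sum over edges v -> u of
    T_{i-1}(v) * weights[v] / outdeg(v).
    `graph` maps each node to the set of nodes it points to, and
    `weights` holds the importance fusion value of each node.
    """
    trust = {u: 0.0 for u in graph}
    for seed in seeds:
        trust[seed] = total_trust / len(seeds)

    iterations = max(1, math.ceil(math.log2(len(graph))))  # O(log n) rounds
    for _ in range(iterations):
        updated = {u: 0.0 for u in graph}
        for v, targets in graph.items():
            if not targets:
                continue  # a node without outgoing edges passes nothing on
            share = trust[v] * weights[v] / len(targets)
            for u in targets:
                updated[u] += share
        trust = updated
    return trust


# "s" follows "a", but no account points to "s".
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "s": {"a"}}
weights = {"a": 1.0, "b": 0.5, "c": 1.0, "s": 1.0}
trust = propagate_trust(graph, weights, seeds=["a"])
```

Because trust flows only along edge directions, the account `s` that merely follows others receives no trust in this example, which reflects the motivation for working on the directed graph.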
Any two nodes in fig. 3 are taken as trust seeds and given initial trust values, and the iteration is then carried out; the iteration process is shown in figs. 4 to 8. The trust value calculated in each iteration is marked directly on the node. Fig. 8 shows the final trust values obtained after the iteration is completed.
Step 600: Determine the accounts corresponding to the nodes with smaller trust values in the weighted graph to be abnormal accounts.
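Step 600 can be illustrated by ranking nodes by trust value and flagging those at the bottom. The source does not say how many nodes count as having a "smaller" trust value, so the cut-off `k` below is a hypothetical parameter, as are the function name and the sample values.

```python
def flag_abnormal(trust, k):
    """Return the k accounts with the smallest trust values.

    Ties are broken by account name so the result is deterministic.
    `k` is a hypothetical cut-off; the source does not specify one.
    """
    ranked = sorted(trust, key=lambda node: (trust[node], node))
    return ranked[:k]


trust = {"a": 0.4, "b": 0.3, "s1": 0.01, "s2": 0.0}
suspects = flag_abnormal(trust, k=2)
```

A fixed threshold on the trust value would be an equally valid design; a top-k cut is used here only because it needs no assumption about the scale of the trust values.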
To compare the final results, iteration is also performed with the conventional SybilRank algorithm on the simple graph shown in fig. 2; the final rank values are shown in fig. 9. Comparing fig. 8 with fig. 9 shows that, after the improved algorithm finishes iterating, the trust values of all nodes in the benign region are greater than the trust values of all nodes in the Sybil region. In the result of the original SybilRank algorithm, the trust values of 3 of the 6 nodes in the benign region are lower than that of a node in the Sybil region, so the resulting ranking is clearly not ideal. The accuracy of the improved algorithm is therefore significantly higher than that of the original SybilRank algorithm.
Based on the above method, the invention further provides a system for detecting abnormal accounts of an online social network; Fig. 10 is a schematic structural diagram of this system. As shown in Fig. 10, the system for detecting abnormal accounts of an online social network of the present invention includes:
a simple graph generation module 1001, configured to generate a simple graph according to a user relationship data set of the online social network; the user relationship data set comprises the accounts of the users and the association relations among the accounts; the nodes in the simple graph are the accounts of the users, and each edge is an association relation between two users.
A node importance calculation module 1002, configured to calculate the importance of each node from the simple graph through a node importance evaluation algorithm; the importance of a node includes its in-degree centrality, closeness centrality, and betweenness centrality.
An importance fusion module 1003, configured to fuse the importance of each node to obtain an importance fusion value of each node.
A weighted graph generation module 1004, configured to convert the simple graph into a weighted graph by using the importance fusion value of each node as the weight of the corresponding node.
A trust value transfer module 1005, configured to perform power iteration of trust seed propagation by combining the weight and the out-degree of each node in the weighted graph, and to assign a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes randomly selected from the weighted graph, and each trust seed is assigned an initial trust value.
An abnormal account determination module 1006, configured to determine the accounts corresponding to the nodes with smaller trust values in the weighted graph as abnormal accounts.
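As an illustration of the first module in the list above, the following sketch builds a directed simple graph from a list of relation records. The function name `build_simple_graph` and the (source account, target account) record layout are assumptions made for this example; the source does not specify a data format.

```python
def build_simple_graph(relations):
    """Build a directed simple graph from (source, target) account pairs.

    Returns a dict mapping each account to a sorted list of the accounts it
    points to. Duplicate relations and self-loops are dropped so that the
    result is a simple graph.
    """
    successors = {}
    for src, dst in relations:
        # Register both endpoints so isolated targets still appear as nodes.
        successors.setdefault(src, set())
        successors.setdefault(dst, set())
        if src != dst:
            successors[src].add(dst)
    return {node: sorted(targets) for node, targets in successors.items()}
```

The adjacency-dict representation is a design choice for the sketch; any graph structure that exposes the out-neighbors of each node would serve the later modules equally well.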
As a specific embodiment, in the system for detecting an abnormal account in an online social network according to the present invention, the node importance calculating module 1002 specifically includes:
an in-degree centrality calculation unit, configured to use the formula
Figure BDA0002820065260000121
to calculate the in-degree centrality of each node; wherein $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu}=1$ meaning that there is a connection in the direction from node v to node u and $X_{vu}=0$ meaning that there is no such connection; n-1 is the number of nodes in the simple graph other than node u.
A closeness centrality calculation unit, configured to use the formula
Figure BDA0002820065260000122
to calculate the closeness centrality of each node; wherein $C_C(u)$ is the closeness centrality of node u; d(v, u) is the shortest-path distance from node v to node u.
A betweenness centrality calculation unit, configured to use the formula
Figure BDA0002820065260000123
to calculate the betweenness centrality of each node; wherein $C_B(u)$ is the betweenness centrality of node u; V is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; $\sigma(s,t|u)$ is the number of those shortest paths from node s to node t that pass through node u.
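The three centrality measures can be sketched in plain Python as shown below. The formula images are not reproduced in the text, so the sketch uses the standard textbook definitions that match the symbol descriptions: in-degree divided by n-1, closeness as the number of nodes that can reach u divided by the sum of their distances to u, and betweenness computed with Brandes' algorithm without normalization. These choices, and all function names, are assumptions rather than the patent's confirmed formulas. The graph is an adjacency dict in which every node appears as a key.

```python
from collections import deque


def in_degree_centrality(succ):
    """C_D(u) = (number of edges pointing to u) / (n - 1); needs n >= 2."""
    n = len(succ)
    indeg = {u: 0 for u in succ}
    for targets in succ.values():
        for u in targets:
            indeg[u] += 1
    return {u: indeg[u] / (n - 1) for u in succ}


def closeness_centrality(succ):
    """Closeness from incoming distances d(v, u), via BFS on the reversed graph."""
    pred = {u: [] for u in succ}
    for v, targets in succ.items():
        for u in targets:
            pred[u].append(v)
    result = {}
    for u in succ:
        dist = {u: 0}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            for v in pred[x]:
                if v not in dist:
                    dist[v] = dist[x] + 1
                    queue.append(v)
        total = sum(dist.values())
        # Nodes that cannot reach u are ignored (an assumption of this sketch).
        result[u] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(succ):
    """Unnormalized betweenness for a directed, unweighted graph (Brandes)."""
    cb = {v: 0.0 for v in succ}
    for s in succ:
        stack = []
        preds = {v: [] for v in succ}
        sigma = {v: 0 for v in succ}   # number of shortest paths from s
        dist = {v: -1 for v in succ}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in succ[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in succ}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                cb[w] += delta[w]
    return cb
```

On large social graphs a graph library would normally replace these loops; the sketch is only meant to make the symbol definitions concrete.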
As a specific embodiment, in the system for detecting an abnormal account in an online social network, the importance fusion module 1003 specifically includes:
an in-degree centrality normalization unit, configured to use the formula
Figure BDA0002820065260000124
to normalize the in-degree centrality of each node; wherein
Figure BDA0002820065260000125
is the normalized in-degree centrality of node u; $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; n is the number of nodes in the simple graph.
A closeness centrality normalization unit, configured to use the formula
Figure BDA0002820065260000126
to normalize the closeness centrality of each node; wherein
Figure BDA0002820065260000127
is the normalized closeness centrality of node u; $C_C(u)$ is the closeness centrality of node u; $\min C_C(i)$ is the minimum closeness centrality over all nodes of the simple graph; $\max C_C(i)$ is the maximum closeness centrality over all nodes of the simple graph.
An importance fusion unit, configured to use Euler's formula
Figure BDA0002820065260000131
to fuse the importance of each node; wherein
Figure BDA0002820065260000132
is the importance fusion value of node u; $C_B(u)$ is the betweenness centrality of node u.
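The normalization and fusion steps can be sketched as follows. Since the formulas are only available as images, this sketch makes three explicit assumptions: the in-degree centrality is normalized by dividing by the sum over all nodes, the closeness centrality is min-max normalized, and the fusion is a Euclidean-style combination (square root of the sum of squares) of the two normalized values and the raw betweenness centrality. The function name `fuse_importance` is likewise hypothetical.

```python
import math


def fuse_importance(c_d, c_c, c_b):
    """Fuse three centrality dicts into one importance value per node.

    c_d, c_c, c_b : dicts mapping node -> in-degree, closeness, and
                    betweenness centrality respectively
    Assumed (not confirmed by the source):
      - in-degree normalized by the sum over all nodes
      - closeness min-max normalized
      - fusion = sqrt(d_norm^2 + c_norm^2 + betweenness^2)
    """
    total_d = sum(c_d.values())
    lo, hi = min(c_c.values()), max(c_c.values())
    fused = {}
    for u in c_d:
        d_norm = c_d[u] / total_d if total_d else 0.0
        c_norm = (c_c[u] - lo) / (hi - lo) if hi > lo else 0.0
        fused[u] = math.sqrt(d_norm ** 2 + c_norm ** 2 + c_b[u] ** 2)
    return fused
```

The resulting values serve as the node weights when the simple graph is converted into the weighted graph.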
As a specific embodiment, in the system for detecting an abnormal account in an online social network, the trust value transfer module 1005 specifically includes:
an edge weight determination unit, configured to determine the weight of each edge in each transfer direction according to the weights and out-degrees of the nodes at the two ends of the edge.
A trust value transfer unit, configured to use, based on the out-degree of each node, the formula
Figure BDA0002820065260000133
to perform power iteration O(log n) times to obtain the trust value of each node; wherein $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; outdeg(v) is the out-degree of node v;
Figure BDA0002820065260000134
is the importance fusion value of node v; U is the set of edges in the weighted graph; (u, v) denotes the edge from node v to node u.
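The edge weight determination unit can be illustrated with a small sketch. It assumes that the weight of an edge in the direction from v to u is the importance fusion value of the sending node v divided by that node's out-degree; this reading is inferred from the symbol definitions and is not confirmed by the source. The function name `directed_edge_weights` is hypothetical.

```python
def directed_edge_weights(edges, fused):
    """Compute a weight for every directed edge (v, u), meaning v points to u.

    edges : list of (v, u) pairs without duplicates
    fused : dict mapping node -> importance fusion value
    Assumed rule: weight(v -> u) = fused[v] / outdeg(v)
    """
    outdeg = {}
    for v, _ in edges:
        outdeg[v] = outdeg.get(v, 0) + 1
    return {(v, u): fused[v] / outdeg[v] for v, u in edges}
```

Under this assumption, an undirected association between two accounts yields two directed edges whose weights generally differ, because the two endpoints have different fusion values and out-degrees.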
The embodiments in this description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief; for relevant details, refer to the description of the method.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (8)

1. A method for detecting an abnormal account of an online social network, characterized by comprising the following steps:
generating a simple graph according to a user relationship data set of the online social network; the user relationship data set comprises the accounts of the users and the association relations among the accounts; the nodes in the simple graph are the accounts of the users, and each edge is an association relation between two users;
calculating the importance of each node through a node importance evaluation algorithm according to the simple graph; the importance of a node comprises the in-degree centrality, the closeness centrality and the betweenness centrality of the node;
fusing the importance of each node to obtain an importance fusion value of each node;
taking the importance fusion value of each node as the weight of the corresponding node, and converting the simple graph into a weighted graph;
combining the weight and the out-degree of each node in the weighted graph, performing power iteration of trust seed propagation, and assigning a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes randomly selected from the weighted graph, and each trust seed is assigned an initial trust value;
and determining the accounts corresponding to the nodes with smaller trust values in the weighted graph as abnormal accounts.
2. The method for detecting the abnormal account of the online social network according to claim 1, wherein the calculating the importance of each node through a node importance evaluation algorithm according to the simple graph specifically comprises:
using the formula
Figure FDA0002820065250000011
to calculate the in-degree centrality of each node; wherein $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu}=1$ meaning that there is a connection in the direction from node v to node u and $X_{vu}=0$ meaning that there is no such connection; n-1 is the number of nodes in the simple graph other than node u;
using the formula
Figure FDA0002820065250000012
to calculate the closeness centrality of each node; wherein $C_C(u)$ is the closeness centrality of node u; d(v, u) is the shortest-path distance from node v to node u;
using the formula
Figure FDA0002820065250000013
to calculate the betweenness centrality of each node; wherein $C_B(u)$ is the betweenness centrality of node u; V is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; $\sigma(s,t|u)$ is the number of those shortest paths from node s to node t that pass through node u.
3. The method for detecting the abnormal account of the online social network according to claim 1, wherein the fusing the importance of each node to obtain an importance fusion value of each node specifically comprises:
using the formula
Figure FDA0002820065250000021
to normalize the in-degree centrality of each node; wherein
Figure FDA0002820065250000022
is the normalized in-degree centrality of node u; $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; n is the number of nodes in the simple graph;
using the formula
Figure FDA0002820065250000023
to normalize the closeness centrality of each node; wherein
Figure FDA0002820065250000024
is the normalized closeness centrality of node u; $C_C(u)$ is the closeness centrality of node u; $\min C_C(i)$ is the minimum closeness centrality over all nodes of the simple graph; $\max C_C(i)$ is the maximum closeness centrality over all nodes of the simple graph;
using Euler's formula
Figure FDA0002820065250000025
to fuse the importance of each node; wherein
Figure FDA0002820065250000026
is the importance fusion value of node u; $C_B(u)$ is the betweenness centrality of node u.
4. The method for detecting an abnormal account in an online social network according to claim 1, wherein the performing power iteration of trust seed propagation in combination with the weight and the out-degree of each node in the weighted graph, and assigning a corresponding trust value to each node in the weighted graph, specifically comprises:
determining the weight of each edge in each transfer direction according to the weights and out-degrees of the nodes at the two ends of the edge;
based on the out-degree of each node, using the formula
Figure FDA0002820065250000027
to perform power iteration O(log n) times to obtain the trust value of each node; wherein $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; outdeg(v) is the out-degree of node v;
Figure FDA0002820065250000028
is the importance fusion value of node v; U is the set of edges in the weighted graph; (u, v) denotes the edge from node v to node u.
5. A system for detecting an abnormal account of an online social network, characterized by comprising:
a simple graph generation module, configured to generate a simple graph according to a user relationship data set of the online social network; the user relationship data set comprises the accounts of the users and the association relations among the accounts; the nodes in the simple graph are the accounts of the users, and each edge is an association relation between two users;
a node importance calculation module, configured to calculate the importance of each node through a node importance evaluation algorithm according to the simple graph; the importance of a node comprises the in-degree centrality, the closeness centrality and the betweenness centrality of the node;
an importance fusion module, configured to fuse the importance of each node to obtain an importance fusion value of each node;
a weighted graph generation module, configured to take the importance fusion value of each node as the weight of the corresponding node and convert the simple graph into a weighted graph;
a trust value transfer module, configured to perform power iteration of trust seed propagation by combining the weight and the out-degree of each node in the weighted graph, and to assign a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes randomly selected from the weighted graph, and each trust seed is assigned an initial trust value;
and an abnormal account determination module, configured to determine the accounts corresponding to the nodes with smaller trust values in the weighted graph as abnormal accounts.
6. The system for detecting an abnormal account in an online social network according to claim 5, wherein the node importance calculation module specifically comprises:
an in-degree centrality calculation unit, configured to use the formula
Figure FDA0002820065250000031
to calculate the in-degree centrality of each node; wherein $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, with $X_{vu}=1$ meaning that there is a connection in the direction from node v to node u and $X_{vu}=0$ meaning that there is no such connection; n-1 is the number of nodes in the simple graph other than node u;
a closeness centrality calculation unit, configured to use the formula
Figure FDA0002820065250000032
to calculate the closeness centrality of each node; wherein $C_C(u)$ is the closeness centrality of node u; d(v, u) is the shortest-path distance from node v to node u;
a betweenness centrality calculation unit, configured to use the formula
Figure FDA0002820065250000033
to calculate the betweenness centrality of each node; wherein $C_B(u)$ is the betweenness centrality of node u; V is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; $\sigma(s,t|u)$ is the number of those shortest paths from node s to node t that pass through node u.
7. The system for detecting the abnormal account of the online social network according to claim 5, wherein the importance fusion module specifically comprises:
an in-degree centrality normalization unit, configured to use the formula
Figure FDA0002820065250000041
to normalize the in-degree centrality of each node; wherein
Figure FDA0002820065250000042
is the normalized in-degree centrality of node u; $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; n is the number of nodes in the simple graph;
a closeness centrality normalization unit, configured to use the formula
Figure FDA0002820065250000043
to normalize the closeness centrality of each node; wherein
Figure FDA0002820065250000044
is the normalized closeness centrality of node u; $C_C(u)$ is the closeness centrality of node u; $\min C_C(i)$ is the minimum closeness centrality over all nodes of the simple graph; $\max C_C(i)$ is the maximum closeness centrality over all nodes of the simple graph;
an importance fusion unit, configured to use Euler's formula
Figure FDA0002820065250000045
to fuse the importance of each node; wherein
Figure FDA0002820065250000046
is the importance fusion value of node u; $C_B(u)$ is the betweenness centrality of node u.
8. The system for detecting an abnormal account in an online social network according to claim 5, wherein the trust value transfer module specifically comprises:
an edge weight determination unit, configured to determine the weight of each edge in each transfer direction according to the weights and out-degrees of the nodes at the two ends of the edge;
a trust value transfer unit, configured to use, based on the out-degree of each node, the formula
Figure FDA0002820065250000047
to perform power iteration O(log n) times to obtain the trust value of each node; wherein $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; outdeg(v) is the out-degree of node v;
Figure FDA0002820065250000048
is the importance fusion value of node v; U is the set of edges in the weighted graph; (u, v) denotes the edge from node v to node u.
CN202011428803.9A 2020-12-07 2020-12-07 Method and system for detecting abnormal account number of online social network Active CN112597439B (en)

Publications (2)

Publication Number Publication Date
CN112597439A true CN112597439A (en) 2021-04-02
CN112597439B CN112597439B (en) 2024-03-01

Family

ID=75191163

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant