CN112597439B - Method and system for detecting abnormal account number of online social network - Google Patents


Publication number
CN112597439B
Authority
CN
China
Prior art keywords
node
centrality
importance
graph
value
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011428803.9A
Other languages
Chinese (zh)
Other versions
CN112597439A (en
Inventor
邓明森
丁健
喻曦
龙昌庭
刘振涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou University of Finance and Economics
Original Assignee
Guizhou University of Finance and Economics

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking


Abstract

The invention relates to a method and a system for detecting abnormal accounts in an online social network. The method comprises the following steps: generating a simple graph from a user relation data set of the online social network; calculating, from the simple graph, the importance of each node by a node importance evaluation algorithm, where the importance of a node comprises its in-degree centrality, closeness centrality, and betweenness centrality; fusing the importance measures of each node to obtain an importance fusion value for each node; using the importance fusion value of each node as the weight of that node, thereby converting the simple graph into a weighted graph; performing a power iteration of trust seed propagation that combines the weight and the out-degree of each node in the weighted graph, thereby assigning a trust value to each node; and determining the accounts corresponding to nodes with smaller trust values in the weighted graph to be abnormal accounts. The method and the system can improve the accuracy of abnormal account detection.

Description

Method and system for detecting abnormal account number of online social network
Technical Field
The invention relates to the field of account detection, in particular to a method and a system for detecting an abnormal account of an online social network.
Background
With the widespread use of the internet and mobile terminals, online social networks play an increasingly important role in people's daily work, study, and life. More importantly, different types of online social networks that meet people's different needs keep emerging. Together with other technologies, they drive the rapid development of the digital economy and are becoming ever more closely tied to people's daily lives. The vast user bases of online social networks therefore often mean great economic benefits. While they bring convenience and meet various needs, they also bring certain risks. Colluding for profit by creating fake accounts, bot accounts, and hijacked accounts has become a common phenomenon in online social networks, and it seriously affects the experience of normal users and the security of their personal information and property. Some malicious users use abnormal accounts to spread rumors, hype and inflame sensitive topics, and misguide public opinion, which affects social stability and cohesion. Abnormal accounts have thus seriously damaged the reputation rating systems of online social networks and the trust relationships among users. Abnormal account analysis and discovery is therefore one of the key problems to be solved in the current development of the digital economy.
Both academia and industry have proposed many schemes for detecting abnormal accounts. These schemes fall broadly into two categories: supervised detection schemes based on behavioral characteristics and content, and unsupervised detection schemes based on graph structure. Supervised detection schemes include extracting information entropy from user content, behavioral characteristics, and the like; semantic or behavioral analysis based on registration information and user activity in combination with an LDA model; and constructing classifiers, trigger-word filters, and the like from self-defined abnormal behaviors. Because a supervised detection method requires the classifier to be trained in advance, and because abnormal accounts or attackers often keep changing their behavior (attack) patterns to avoid detection, such methods detect unknown attack patterns poorly.
Disclosure of Invention
The invention aims to provide a method and a system for detecting abnormal accounts in an online social network, so as to improve the accuracy of abnormal account detection.
In order to achieve the above object, the present invention provides the following solutions:
A method for detecting abnormal accounts in an online social network comprises the following steps:
generating a simple graph according to a user relation data set of the online social network; the user relation data set comprises the association relations between the accounts of users; the nodes in the simple graph are the accounts of users, and the edges are the association relations between two users;
calculating, according to the simple graph, the importance of each node through a node importance evaluation algorithm; the importance of a node comprises the in-degree centrality, closeness centrality, and betweenness centrality of the node;
fusing the importance of each node to obtain an importance fusion value of each node;
taking the importance fusion value of each node as the weight of the corresponding node, and converting the simple graph into a weighted graph;
performing power iteration of trust seed propagation by combining the weight and the out-degree of each node in the weighted graph, and assigning a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes selected at random in the weighted graph, and each trust seed is given an initial trust value;
and determining the accounts corresponding to the nodes with smaller trust values in the weighted graph to be abnormal accounts.
Optionally, calculating the importance of each node according to the simple graph through a node importance evaluation algorithm specifically includes:
calculating the in-degree centrality of each node using the formula
$$C_D(u) = \frac{\sum_{v \neq u} X_{vu}}{n-1}$$
where $C_D(u)$ is the in-degree centrality of node $u$; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that node $v$ has a connection pointing to node $u$ and $X_{vu} = 0$ meaning that node $v$ has no connection pointing to node $u$; and $n-1$ is the number of all nodes in the simple graph other than node $u$;
calculating the closeness centrality of each node using the formula
$$C_C(u) = \frac{1}{\sum_{v \neq u} d(v, u)}$$
where $C_C(u)$ is the closeness centrality of node $u$ and $d(v, u)$ is the length of the shortest path from node $v$ to node $u$;
calculating the betweenness centrality of each node using the formula
$$C_B(u) = \sum_{s \neq u \neq t \in V} \frac{\sigma(s, t \mid u)}{\sigma(s, t)}$$
where $C_B(u)$ is the betweenness centrality of node $u$; $V$ is the set of nodes in the simple graph; $\sigma(s, t)$ is the number of shortest paths from node $s$ to node $t$; and $\sigma(s, t \mid u)$ is the number of those shortest paths that pass through node $u$.
Optionally, fusing the importance of each node to obtain an importance fusion value of each node specifically includes:
normalizing the in-degree centrality of each node using the formula [formula missing in source], which gives the normalized in-degree centrality of node $u$; here $C_D(u)$ is the in-degree centrality of node $u$, $C_D(i)$ is the in-degree centrality of node $i$, and $n$ is the number of nodes in the simple graph;
normalizing the closeness centrality of each node using the formula
$$\bar{C}_C(u) = \frac{C_C(u) - \min_i C_C(i)}{\max_i C_C(i) - \min_i C_C(i)}$$
where $\bar{C}_C(u)$ is the normalized closeness centrality of node $u$; $C_C(u)$ is the closeness centrality of node $u$; $\min_i C_C(i)$ is the minimum closeness centrality among all nodes of the simple graph; and $\max_i C_C(i)$ is the maximum closeness centrality among all nodes of the simple graph;
fusing the importance of each node using the Euler formula [formula missing in source], which gives the importance fusion value of node $u$; here $C_B(u)$ is the betweenness centrality of node $u$.
Optionally, performing power iteration of trust seed propagation by combining the weight and the out-degree of each node in the weighted graph, and assigning a corresponding trust value to each node in the weighted graph, specifically includes:
determining the weight of each edge in each propagation direction according to the weights and out-degrees of the nodes at the two ends of the edge;
performing power iteration based on the out-degree of each node using the formula [formula missing in source], iterating $O(\log n)$ times in total to obtain the trust value of each node; where $T^{(i)}(u)$ is the trust value of node $u$ obtained in the $i$-th iteration; $T^{(i-1)}(v)$ is the trust value of node $v$ obtained in the $(i-1)$-th iteration; $\mathrm{outdeg}(v)$ is the out-degree of node $v$; the formula also uses the importance fusion value of node $v$; $U$ is the set of edges in the weighted graph; and $(u, v)$ denotes an edge from node $v$ pointing to node $u$.
The invention also provides a system for detecting abnormal accounts in an online social network, which comprises the following modules:
the simple graph generation module is used for generating a simple graph according to the user relation data set of the online social network; the user relation data set comprises the association relations between the accounts of users; the nodes in the simple graph are the accounts of users, and the edges are the association relations between two users;
the node importance calculating module is used for calculating the importance of each node through a node importance evaluation algorithm according to the simple graph; the importance of a node comprises the in-degree centrality, closeness centrality, and betweenness centrality of the node;
the importance fusion module is used for fusing the importance of each node to obtain an importance fusion value of each node;
the weighted graph generating module is used for taking the importance fusion value of each node as the weight of the corresponding node and converting the simple graph into a weighted graph;
the trust value transfer module is used for performing power iteration of trust seed propagation by combining the weight and the out-degree of each node in the weighted graph, and assigning a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes selected at random in the weighted graph, and each trust seed is given an initial trust value;
and the abnormal account determining module is used for determining the accounts corresponding to the nodes with smaller trust values in the weighted graph to be abnormal accounts.
Optionally, the node importance calculating module specifically includes:
an in-degree centrality calculating unit for calculating the in-degree centrality of each node using the formula
$$C_D(u) = \frac{\sum_{v \neq u} X_{vu}}{n-1}$$
where $C_D(u)$ is the in-degree centrality of node $u$; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that node $v$ has a connection pointing to node $u$ and $X_{vu} = 0$ meaning that node $v$ has no connection pointing to node $u$; and $n-1$ is the number of all nodes in the simple graph other than node $u$;
a closeness centrality calculating unit for calculating the closeness centrality of each node using the formula
$$C_C(u) = \frac{1}{\sum_{v \neq u} d(v, u)}$$
where $C_C(u)$ is the closeness centrality of node $u$ and $d(v, u)$ is the length of the shortest path from node $v$ to node $u$;
a betweenness centrality calculating unit for calculating the betweenness centrality of each node using the formula
$$C_B(u) = \sum_{s \neq u \neq t \in V} \frac{\sigma(s, t \mid u)}{\sigma(s, t)}$$
where $C_B(u)$ is the betweenness centrality of node $u$; $V$ is the set of nodes in the simple graph; $\sigma(s, t)$ is the number of shortest paths from node $s$ to node $t$; and $\sigma(s, t \mid u)$ is the number of those shortest paths that pass through node $u$.
Optionally, the importance fusion module specifically includes:
an in-degree centrality normalization unit for normalizing the in-degree centrality of each node using the formula [formula missing in source], which gives the normalized in-degree centrality of node $u$; here $C_D(u)$ is the in-degree centrality of node $u$, $C_D(i)$ is the in-degree centrality of node $i$, and $n$ is the number of nodes in the simple graph;
a closeness centrality normalization unit for normalizing the closeness centrality of each node using the formula
$$\bar{C}_C(u) = \frac{C_C(u) - \min_i C_C(i)}{\max_i C_C(i) - \min_i C_C(i)}$$
where $\bar{C}_C(u)$ is the normalized closeness centrality of node $u$; $C_C(u)$ is the closeness centrality of node $u$; $\min_i C_C(i)$ is the minimum closeness centrality among all nodes of the simple graph; and $\max_i C_C(i)$ is the maximum closeness centrality among all nodes of the simple graph;
an importance fusion unit for fusing the importance of each node using the Euler formula [formula missing in source], which gives the importance fusion value of node $u$; here $C_B(u)$ is the betweenness centrality of node $u$.
Optionally, the trust value transfer module specifically includes:
an edge weight determining unit for determining the weight of each edge in each propagation direction according to the weights and out-degrees of the nodes at the two ends of the edge;
a trust value transfer unit for performing power iteration based on the out-degree of each node using the formula [formula missing in source], iterating $O(\log n)$ times in total to obtain the trust value of each node; where $T^{(i)}(u)$ is the trust value of node $u$ obtained in the $i$-th iteration; $T^{(i-1)}(v)$ is the trust value of node $v$ obtained in the $(i-1)$-th iteration; $\mathrm{outdeg}(v)$ is the out-degree of node $v$; the formula also uses the importance fusion value of node $v$; $U$ is the set of edges in the weighted graph; and $(u, v)$ denotes an edge from node $v$ pointing to node $u$.
According to the specific embodiments provided by the invention, the invention discloses the following technical effects:
The invention iterates the node trust values in the weighted graph using the importance of the nodes and finally identifies abnormal accounts by node trust value, which effectively improves the accuracy of abnormal account detection.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the method for detecting abnormal accounts in an online social network;
FIG. 2 is a schematic diagram of a simple graph formed from an online social network;
FIG. 3 is a schematic diagram of a weighted graph;
FIG. 4 is a schematic diagram of the first iteration of trust seed propagation using the method of the present invention;
FIG. 5 is a schematic diagram of the second iteration of trust seed propagation using the method of the present invention;
FIG. 6 is a schematic diagram of the third iteration of trust seed propagation using the method of the present invention;
FIG. 7 is a schematic diagram of the fourth iteration of trust seed propagation using the method of the present invention;
FIG. 8 is a schematic diagram of the completed trust seed propagation iteration using the method of the present invention;
FIG. 9 is a schematic diagram of the completed trust seed propagation iteration using the SybilRank algorithm;
FIG. 10 is a schematic structural diagram of the system for detecting abnormal accounts in an online social network.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Compared with supervised detection methods, unsupervised detection schemes are mainly graph-based and do not require a classifier to be trained in advance. In essence, they use the friend relationship graph and algorithms such as random walk, adaptive maximum flow, power iteration, and Markov random fields to judge the relation between unknown nodes and known nodes, and thereby detect whether a node is abnormal. Such methods can detect unknown abnormal behaviors and are hard for attackers to bypass, and they have gradually become a research hotspot in abnormal account detection. However, the drawbacks of unsupervised detection are also obvious: compared with supervised detection, its accuracy is relatively low, and its detection performance varies across different types of online social networks. At present, graph-based detection methods are studied mostly in theory and are rarely deployed in practice.
To meet the internet finance industry's demand for credit evaluation based on online social networks, the invention proposes three criteria for detecting abnormal accounts based on graph structure, performs data cleaning and credit evaluation on that basis, and finds that network structure is more important than the behavioral characteristics of individual users in credit evaluation. The invention provides an improvement of the graph-based SybilRank algorithm for detecting abnormal accounts: it redefines the power iteration formula of SybilRank using node importance, which effectively improves the accuracy of abnormal account detection. In addition, abnormal account detection for large-scale social networks is implemented on Pregel, a distributed framework for graph computation, which reduces the time cost.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
FIG. 1 is a flow chart of the method for detecting abnormal accounts in an online social network. As shown in FIG. 1, the method comprises the following steps:
Step 100: A simple graph is generated from a user relation data set of the online social network. The user relation data set includes the association relations between the accounts of users. The accounts of users in the online social network are taken as nodes, and the association relations formed between accounts by mutual following or in other ways are represented as edges. Some association relations are bidirectional, such as friend relations; others are unidirectional, such as comment replies. The simple graph is formed from these association relations between accounts.
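As a minimal sketch of Step 100, relation records can be loaded into an adjacency mapping. The record layout `(source, target, bidirectional)`, the function name `build_simple_graph`, and the sample accounts are assumptions made for illustration; the text does not specify a data format.

```python
from collections import defaultdict


def build_simple_graph(relations):
    """Build a directed simple graph as {account: set of accounts it points to}.

    `relations` is an iterable of (source, target, bidirectional) records.
    Self-loops and duplicate edges are dropped so the graph stays simple.
    """
    successors = defaultdict(set)
    nodes = set()
    for source, target, bidirectional in relations:
        nodes.update((source, target))
        if source == target:
            continue  # a simple graph has no self-loops
        successors[source].add(target)  # a set ignores duplicate edges
        if bidirectional:
            successors[target].add(source)
    return {node: set(successors[node]) for node in nodes}


relations = [
    ("alice", "bob", True),     # friend relation: both directions
    ("carol", "alice", False),  # comment reply: one direction
    ("carol", "alice", False),  # duplicate record, ignored
    ("dave", "dave", False),    # self-loop, ignored
]
graph = build_simple_graph(relations)
```

Keeping every account as a key, even one without outgoing edges, makes the later per-node computations simpler.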
The SybilRank algorithm is a detection algorithm based on a random walk model. It first selects some nodes as trust seeds, then propagates trust values to the other nodes through $O(\log n)$ power iterations, normalizes the trust values by node degree, and sorts the results; nodes with smaller trust values are considered suspicious. Online social networks are usually directed graphs, whereas SybilRank is an abnormal account detection algorithm for undirected graphs. When SybilRank processes a directed graph, it therefore changes the original topology of the graph and loses some of the original attributes of the online social network, which reduces accuracy. For example, simply converting the directed graph into an undirected graph so that it meets the requirement [formula missing in source] loses much information. Because an undirected graph does not distinguish out-degree from in-degree, the degree of a node is determined only by the edges connected to it. An attacker can therefore often raise the degree of an abnormal account by following a large number of normal accounts, and so evade detection by SybilRank. Different abnormal accounts can even follow each other to imitate a normal network structure and avoid detection.
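The baseline procedure described above (seed trust, a logarithmic number of power iterations, degree normalization, sorting) can be sketched as follows. This is a simplified illustration of the original undirected SybilRank idea, not the improved method of the invention; the function name `sybilrank`, the equal split of trust among seeds, the use of `ceil(log2(n))` iterations, and the toy graph are all assumptions.

```python
import math


def sybilrank(adjacency, seeds, total_trust=1.0):
    """Degree-normalized trust after ceil(log2 n) power iterations.

    `adjacency` maps each node to the set of its undirected neighbors.
    """
    n = len(adjacency)
    trust = {node: 0.0 for node in adjacency}
    for seed in seeds:
        trust[seed] = total_trust / len(seeds)
    for _ in range(max(1, math.ceil(math.log2(n)))):
        updated = {node: 0.0 for node in adjacency}
        for node, neighbors in adjacency.items():
            if not neighbors:
                updated[node] += trust[node]  # isolated node keeps its trust
                continue
            share = trust[node] / len(neighbors)
            for neighbor in neighbors:
                updated[neighbor] += share
        trust = updated
    # Normalize by degree so high-degree nodes are not favored.
    return {
        node: (trust[node] / len(adjacency[node]) if adjacency[node] else 0.0)
        for node in adjacency
    }


# Benign triangle a-b-c, Sybil pair x-y, and one attack edge c-x.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "x"},
    "x": {"c", "y"},
    "y": {"x"},
}
scores = sybilrank(adjacency, seeds=["a", "b"])
suspects = sorted(scores, key=scores.get)[:2]  # lowest-ranked nodes
```

Stopping early, after about log n iterations, is what keeps trust from leaking evenly across the attack edge into the Sybil region.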
In a directed graph, it is difficult for abnormal accounts to imitate the network structure of normal accounts, because abnormal accounts follow many normal accounts while normal accounts rarely follow abnormal accounts. Moreover, the SybilRank algorithm is only suitable for online social networks with few attack edges. Its effectiveness declines as the number of attack edges increases, and it is also sensitive to the distribution of the attack edges: the farther the attack edges are from the selected trust seeds, the better the detection.
On this basis, the invention uses an improved SybilRank algorithm to identify abnormal accounts in an online social network. Because an online social network often reflects the social attributes of the accounts themselves, different accounts show different degrees of importance. If more trust seeds can be assigned to vertices of high importance, the accuracy with which SybilRank identifies abnormal accounts can be greatly improved. Because an online social network rarely provides user weights directly, and a simple graph cannot reflect the importance of its nodes, the invention converts the simple graph into a weighted graph by giving each node an effective trust weight, so as to improve accuracy. The specific process is shown in steps 200 to 400.
Step 200: The importance of each node is calculated from the simple graph by a node importance evaluation algorithm. The importance of a node comprises its in-degree centrality, closeness centrality, and betweenness centrality.
In a real online social network, an account that is followed more often tends to have a higher reputation. In the graph, this means that a vertex with a higher in-degree is more important. In-degree centrality is therefore the most direct index of node centrality, and is denoted $C_D(u)$:
$$C_D(u) = \frac{\sum_{v \neq u} X_{vu}}{n-1}$$
where $C_D(u)$ is the in-degree centrality of node $u$; $X_{vu}$ is 1 or 0, with $X_{vu} = 1$ meaning that node $v$ has a connection pointing to node $u$ and $X_{vu} = 0$ meaning that node $v$ has no connection pointing to node $u$; and $n-1$ is the number of all nodes in the simple graph other than node $u$.
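The in-degree centrality described above can be computed directly from an adjacency mapping. The sketch below assumes the graph is stored as `{node: set of nodes it points to}` with every node present as a key and at least two nodes; the function name and sample graph are illustrative.

```python
def in_degree_centrality(successors):
    """Return {u: in-degree of u divided by (n - 1)} for a directed simple graph."""
    n = len(successors)
    in_degree = {u: 0 for u in successors}
    for v, targets in successors.items():
        for u in targets:
            in_degree[u] += 1  # edge v -> u contributes X_vu = 1
    return {u: in_degree[u] / (n - 1) for u in successors}


example_graph = {
    "a": {"b"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "c"},
}
c_d = in_degree_centrality(example_graph)
```

In this sample graph node `a` is pointed to by all three other nodes, so it reaches the maximum value of 1.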
Closeness centrality measures proximity in the network. It reflects the distance between a given node and all other nodes, and describes how difficult it is for the node to reach other nodes through the network. Its value is the inverse of the sum of the shortest distances from all other nodes in the network to the node: the closer a node is to the other nodes, the more central and the more important it is. It is denoted $C_C(u)$:
$$C_C(u) = \frac{1}{\sum_{v \neq u} d(v, u)}$$
where $C_C(u)$ is the closeness centrality of node $u$; $d(v, u)$ is the length of the shortest path from node $v$ to node $u$, $d(v, u) = \min(X_{v1} + X_{12} + \dots + X_{ij} + \dots + X_{(k-1)k} + X_{ku})$; and $1, 2, \dots, i, j, \dots, (k-1), k$ are the nodes passed through in order on the path from node $v$ to node $u$.
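A small sketch of the closeness computation, using breadth-first search on the reversed edges so that each search from $u$ yields the distances $d(v, u)$. How to treat nodes that cannot reach $u$ is not stated in the text; this sketch simply leaves them out of the sum and returns 0 when no node can reach $u$, which is an assumption, as are the function name and sample graph.

```python
from collections import deque


def closeness_centrality(successors):
    """Return {u: 1 / sum of d(v, u) over nodes v that can reach u}."""
    # Reverse the edges: predecessors[u] holds every v with an edge v -> u.
    predecessors = {u: set() for u in successors}
    for v, targets in successors.items():
        for u in targets:
            predecessors[u].add(v)

    result = {}
    for u in successors:
        distance = {u: 0}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            for v in predecessors[x]:
                if v not in distance:
                    distance[v] = distance[x] + 1  # d(v, u) in the original graph
                    queue.append(v)
        total = sum(distance.values())
        result[u] = 1.0 / total if total > 0 else 0.0
    return result


example_graph = {
    "a": {"b"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "c"},
}
c_c = closeness_centrality(example_graph)
```

Because unreachable nodes are skipped, a node reached by only one close neighbor (here `c`) can score higher than a well-connected node; a production implementation would need an explicit rule for that case.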
Betweenness centrality is the number of times a node lies on the shortest paths between any two other nodes in the network, and reflects the importance of the node in connecting other nodes. It is denoted $C_B(u)$:
$$C_B(u) = \sum_{s \neq u \neq t \in V} \frac{\sigma(s, t \mid u)}{\sigma(s, t)}$$
where $C_B(u)$ is the betweenness centrality of node $u$; $V$ is the set of nodes in the simple graph; $\sigma(s, t)$ is the number of shortest paths from node $s$ to node $t$; and $\sigma(s, t \mid u)$ is the number of those shortest paths that pass through node $u$.
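The betweenness definition above can be evaluated by counting shortest paths with breadth-first search. The sketch below is a straightforward, unoptimized approach for small graphs (it is not Brandes' algorithm); it relies on the fact that a shortest path from $s$ to $t$ passes through $u$ exactly when $d(s,u) + d(u,t) = d(s,t)$, in which case $\sigma(s,t \mid u) = \sigma(s,u)\,\sigma(u,t)$. Function names and the sample graph are illustrative assumptions.

```python
from collections import deque


def shortest_path_counts(successors, source):
    """BFS from `source`: return (distance, number of shortest paths) per reachable node."""
    distance = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in successors[x]:
            if y not in distance:
                distance[y] = distance[x] + 1
                sigma[y] = 0
                queue.append(y)
            if distance[y] == distance[x] + 1:
                sigma[y] += sigma[x]  # every shortest path to x extends to y
    return distance, sigma


def betweenness_centrality(successors):
    """Sum over ordered pairs (s, t) of sigma(s, t | u) / sigma(s, t)."""
    info = {s: shortest_path_counts(successors, s) for s in successors}
    result = {u: 0.0 for u in successors}
    for s in successors:
        dist_s, sigma_s = info[s]
        for t in dist_s:
            if t == s:
                continue
            for u in successors:
                if u in (s, t) or u not in dist_s:
                    continue
                dist_u, sigma_u = info[u]
                if t in dist_u and dist_s[u] + dist_u[t] == dist_s[t]:
                    result[u] += sigma_s[u] * sigma_u[t] / sigma_s[t]
    return result


# Two equally short routes from a to c: one via b, one via d.
diamond = {"a": {"b", "d"}, "b": {"c"}, "d": {"c"}, "c": set()}
c_b = betweenness_centrality(diamond)
```

For large social graphs this cubic-time loop would be too slow; it is shown only because it mirrors the formula term by term.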
Step 300: The importance measures of each node are fused to obtain an importance fusion value for each node. First, the in-degree centrality and the closeness centrality are normalized according to a preferred scheme.
The in-degree centrality of each node is normalized using the formula [formula missing in source], which gives the normalized in-degree centrality of node $u$; $n$ is the number of nodes in the simple graph.
The closeness centrality of each node is normalized:
$$\bar{C}_C(u) = \frac{C_C(u) - \min_i C_C(i)}{\max_i C_C(i) - \min_i C_C(i)}$$
where $\bar{C}_C(u)$ is the normalized closeness centrality of node $u$; $\min_i C_C(i)$ is the minimum closeness centrality among all nodes of the simple graph; and $\max_i C_C(i)$ is the maximum closeness centrality among all nodes of the simple graph.
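The definitions above refer to the minimum and maximum closeness centrality over all nodes, which suggests a min-max rescaling to the range 0 to 1. The sketch below assumes that reading; the function name, the handling of the case where all values are equal, and the sample values are illustrative.

```python
def min_max_normalize(values):
    """Rescale {node: value} linearly so the minimum maps to 0 and the maximum to 1."""
    lowest = min(values.values())
    highest = max(values.values())
    if highest == lowest:
        # All nodes are equally central; avoid division by zero.
        return {node: 0.0 for node in values}
    span = highest - lowest
    return {node: (value - lowest) / span for node, value in values.items()}


closeness = {"a": 0.25, "b": 0.5, "c": 1.25}
normalized = min_max_normalize(closeness)
```

Rescaling puts the closeness values on a common 0-to-1 scale before the measures are fused.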
Then, the importance measures of each node are fused using the Euler formula [formula missing in source], which gives the importance fusion value of node $u$.
Step 400: The importance fusion value of each node is taken as the weight of the corresponding node, and the simple graph is converted into a weighted graph.
Step 500: Power iteration of trust seed propagation is performed by combining the weight and the out-degree of each node in the weighted graph, and each node in the weighted graph is assigned a corresponding trust value. The trust seeds are a subset of nodes selected at random in the weighted graph, and each trust seed is given an initial trust value. Because SybilRank is essentially a random walk algorithm on an undirected graph, the invention instead performs the power iteration on the basis of out-degree. Starting from the simple graph shown in FIG. 2 (the gray nodes in FIG. 2 are Sybil accounts, i.e. abnormal accounts, and the white nodes are non-Sybil accounts), the weight of each node is calculated first; the weight of each edge is then calculated from the weights of the nodes at its two ends and their out-degrees and assigned to the edge, which gives the weighted graph shown in FIG. 3. In this graph, the connections between nodes may be regarded as undirected, but the transfer of information between nodes is directional.
Then, according to the start node and end node of each information transfer, power iteration is performed on the trust value of each node based on the out-degree of each node, iterating $O(\log n)$ times in total to obtain the trust value of each node. The trust value is updated using the formula [formula missing in source], where $T^{(i)}(u)$ is the trust value of node $u$ obtained in the $i$-th iteration; $T^{(i-1)}(v)$ is the trust value of node $v$ obtained in the $(i-1)$-th iteration; $\mathrm{outdeg}(v)$ is the out-degree of node $v$; the formula also uses the importance fusion value of node $v$; $U$ is the set of edges in the weighted graph; $(u, v)$ denotes an edge from node $v$ pointing to node $u$; and $\sum_{(u,v) \in U}$ denotes summation of the corresponding terms over all edges pointing to node $u$.
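The exact update formula is not reproduced in this text, so the following is only a sketch under a stated assumption: each node $v$ passes on its previous trust value, scaled by its importance fusion value and divided equally among its out-edges, and each node $u$ sums what arrives over the edges pointing to it. That specific form, the function name `propagate_trust`, the default iteration count, and the sample data are hypothetical and may differ from the patented formula.

```python
import math


def propagate_trust(successors, weight, seeds, initial_trust=1.0, iterations=None):
    """Directed, weighted trust propagation (assumed form, see the lead-in).

    successors: {v: set of nodes that v points to}
    weight:     {v: importance fusion value of v}
    seeds:      set of trust seed nodes, each given `initial_trust`
    """
    if iterations is None:
        iterations = max(1, math.ceil(math.log2(len(successors))))
    trust = {u: (initial_trust if u in seeds else 0.0) for u in successors}
    for _ in range(iterations):
        updated = {u: 0.0 for u in successors}
        for v, targets in successors.items():
            if not targets:
                continue  # a node with out-degree 0 passes nothing on
            # ASSUMPTION: contribution = weight(v) * T(v) / outdeg(v)
            share = weight[v] * trust[v] / len(targets)
            for u in targets:
                updated[u] += share
        trust = updated
    return trust


successors = {"a": {"b", "c"}, "b": {"c"}, "c": {"a"}}
weight = {"a": 1.0, "b": 0.5, "c": 1.0}
trust = propagate_trust(successors, weight, seeds={"a"}, iterations=2)
```

Under this assumed form, a node with a low fusion value (here `b`) forwards less of its trust, which is one way node importance could shape the ranking.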
In FIG. 3, any two nodes are taken as trust seeds and given a certain trust value, and the iteration is carried out as shown in FIGS. 4 to 8. The trust values calculated during the iteration are marked directly on the nodes. FIG. 8 shows the final trust values obtained after the iteration is completed.
Step 600: The accounts corresponding to the nodes with smaller trust values in the weighted graph are determined to be abnormal accounts.
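Step 600 amounts to ranking nodes by trust value and flagging the lowest ones. The text does not say how many nodes, or which threshold, counts as "smaller", so the cutoff `k` below is a hypothetical parameter, as are the function name and the sample trust values.

```python
def flag_abnormal_accounts(trust, k):
    """Return the k accounts with the smallest trust values, lowest first."""
    # Ties are broken by account name so the result is deterministic.
    ranked = sorted(trust, key=lambda account: (trust[account], str(account)))
    return ranked[:k]


sample_trust = {"a": 0.23, "b": 0.23, "c": 0.38, "x": 0.04, "y": 0.08}
flagged = flag_abnormal_accounts(sample_trust, k=2)
```

In practice the cutoff would have to be chosen from the deployment's tolerance for false positives, for example by reviewing the lowest-ranked accounts manually.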
To allow the final results to be compared, the iteration was also performed with the conventional SybilRank algorithm on the simple graph shown in FIG. 2, and the resulting final rank values are shown in FIG. 9. Comparing FIG. 8 with FIG. 9 shows that, after the iteration of the improved algorithm is completed, the trust values of all nodes in the benign region are greater than the trust values of all nodes in the Sybil region. In the result of the original SybilRank algorithm, by contrast, 3 of the 6 nodes in the benign region have trust values lower than that of a node in the Sybil region, so the resulting ranking is clearly not ideal. The accuracy of the improved algorithm is therefore significantly higher than that of the original SybilRank algorithm.
Based on the above method, the present invention also provides a system for detecting abnormal accounts of an online social network; fig. 10 is a schematic structural diagram of this system. As shown in fig. 10, the system includes:
a simple graph generating module 1001, configured to generate a simple graph according to a user relationship data set of an online social network; the user relation data set comprises the association relation between the accounts of the users; in the simple graph, nodes are accounts of users, and edges are association relations between two users.
A node importance calculating module 1002, configured to calculate the importance of each node according to the simple graph through a node importance evaluation algorithm; the importance of a node includes the in-degree centrality, the proximity centrality, and the intermediacy centrality of the node.
The importance fusion module 1003 is configured to fuse the importance of each node to obtain an importance fusion value of each node.
The weighted graph generating module 1004 is configured to convert the simple graph into a weighted graph by taking the importance fusion value of each node as the weight of the corresponding node.
The trust value transfer module 1005 is configured to perform power iteration of trust seed transfer by combining the weight and the out-degree of each node in the weighted graph, and to assign a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes selected randomly in the weighted graph, and each trust seed is given an initial trust value.
The abnormal account determining module 1006 is configured to determine the account corresponding to a node with a smaller trust value in the weighted graph as an abnormal account.
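The simple graph generation performed by module 1001 can be sketched as follows. The sketch assumes that the user relation data set is available as (source account, target account) pairs and that the graph is directed, which is suggested by the use of in-degree and out-degree but is not stated explicitly; the function name and the sample accounts are hypothetical.

```python
def build_simple_graph(relations):
    """Build a directed simple graph (no self-loops, no duplicate edges)
    from (source_account, target_account) pairs.

    Returns a dict mapping each account to the set of accounts it points to.
    """
    successors = {}
    for src, dst in relations:
        successors.setdefault(src, set())
        successors.setdefault(dst, set())
        if src != dst:  # a simple graph has no self-loops
            successors[src].add(dst)  # the set silently drops duplicate edges
    return successors

# Hypothetical relation data: one duplicate pair and one self-loop.
relations = [("alice", "bob"), ("bob", "alice"), ("bob", "carol"),
             ("bob", "carol"), ("carol", "carol")]
graph = build_simple_graph(relations)
```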
As a specific embodiment, in the system for detecting abnormal accounts of an online social network of the present invention, the node importance calculating module 1002 specifically includes:
an in-degree centrality calculating unit, configured to calculate the in-degree centrality of each node using the formula $C_D(u) = \frac{1}{n-1}\sum_{v} X_{vu}$; where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, $X_{vu} = 1$ meaning that there is a connection pointing from node v to node u and $X_{vu} = 0$ meaning that there is no such connection; and $n-1$ is the number of nodes in the simple graph other than node u.
A proximity centrality calculating unit, configured to calculate the proximity centrality of each node from the shortest-path distances to that node; where $C_C(u)$ is the proximity centrality of node u and $d(v, u)$ is the shortest-path distance from node v to node u.
An intermediacy centrality calculating unit, configured to calculate the intermediacy centrality of each node using the formula $C_B(u) = \sum_{s,t \in V} \frac{\sigma(s,t \mid u)}{\sigma(s,t)}$; where $C_B(u)$ is the intermediacy centrality of node u; V is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; and $\sigma(s,t \mid u)$ is the number of those shortest paths that pass through node u.
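The three measures handled by these units (proximity and intermediacy centrality are commonly called closeness and betweenness centrality) can be sketched in plain Python as follows. The closeness form $(n-1)/\sum_v d(v,u)$ is an assumption, since only $d(v,u)$ is defined in the text, and it presumes that every node can reach every other node. All function names are hypothetical, and the betweenness routine is a direct cubic-time illustration of the definition rather than an efficient algorithm.

```python
from collections import deque

def in_degree_centrality(succ):
    """C_D(u) = (number of nodes v with an edge v -> u) / (n - 1)."""
    n = len(succ)
    indeg = {u: 0 for u in succ}
    for targets in succ.values():
        for u in targets:
            indeg[u] += 1
    return {u: indeg[u] / (n - 1) for u in succ}

def bfs_counts(succ, s):
    """Hop distances from s and the number of shortest paths to each reachable node."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in succ[v]:
            if w not in dist:  # first time w is reached
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:  # v precedes w on a shortest path
                sigma[w] += sigma[v]
    return dist, sigma

def closeness_centrality(succ):
    """Assumed form: C_C(u) = (n - 1) / sum over v of d(v, u)."""
    n = len(succ)
    total = {u: 0 for u in succ}
    for v in succ:
        dist, _ = bfs_counts(succ, v)
        for u, d in dist.items():
            total[u] += d
    return {u: ((n - 1) / total[u] if total[u] else 0.0) for u in succ}

def betweenness_centrality(succ):
    """C_B(u) = sum over ordered pairs (s, t), s != t, u not in {s, t},
    of sigma(s, t | u) / sigma(s, t)."""
    info = {s: bfs_counts(succ, s) for s in succ}
    cb = {u: 0.0 for u in succ}
    for s in succ:
        dist_s, sigma_s = info[s]
        for t in dist_s:
            if t == s:
                continue
            for u in succ:
                if u in (s, t) or u not in dist_s:
                    continue
                dist_u, sigma_u = info[u]
                # u lies on a shortest s-t path iff d(s,u) + d(u,t) == d(s,t).
                if t in dist_u and dist_s[u] + dist_u[t] == dist_s[t]:
                    cb[u] += sigma_s[u] * sigma_u[t] / sigma_s[t]
    return cb

# Toy graph a - b - c with every link usable in both directions.
succ = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
c_d = in_degree_centrality(succ)
c_c = closeness_centrality(succ)
c_b = betweenness_centrality(succ)
```

In the toy graph the middle node b scores highest on all three measures, since it is the only node lying on a shortest path between two other nodes.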
As a specific embodiment, in the system for detecting an abnormal account of an online social network of the present invention, the importance fusion module 1003 specifically includes:
an in-degree centrality normalization unit, configured to normalize the in-degree centrality of each node over all nodes of the simple graph, to obtain the normalized in-degree centrality of node u; where $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; and n is the number of nodes in the simple graph.
A proximity centrality normalization unit, configured to normalize the proximity centrality of each node to the value $\frac{C_C(u) - \min C_C(i)}{\max C_C(i) - \min C_C(i)}$, which is the normalized proximity centrality of node u; where $C_C(u)$ is the proximity centrality of node u; $\min C_C(i)$ is the minimum proximity centrality among all nodes of the simple graph; and $\max C_C(i)$ is the maximum proximity centrality among all nodes of the simple graph.
An importance fusion unit, configured to perform importance fusion for each node using the Euler formula, to obtain the importance fusion value of node u; where $C_B(u)$ is the intermediacy centrality of node u.
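The normalization and fusion performed by these units can be sketched as follows. Only the min-max normalization follows directly from the definitions in the text. Dividing the in-degree centrality by its sum over all nodes, and fusing the three scores with a Euclidean norm, are assumptions made for illustration; the function names and the input values are likewise hypothetical.

```python
import math

def sum_normalize(values):
    """Assumed in-degree normalization: divide each value by the sum over all nodes."""
    total = sum(values.values())
    return {u: (x / total if total else 0.0) for u, x in values.items()}

def min_max_normalize(values):
    """Min-max normalization: (x - min) / (max - min), mapped into [0, 1]."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:  # all values equal: avoid division by zero
        return {u: 0.0 for u in values}
    return {u: (x - lo) / (hi - lo) for u, x in values.items()}

def fuse_importance(c_d, c_c, c_b):
    """Assumed fusion: Euclidean norm of the three per-node scores."""
    d_norm, c_norm = sum_normalize(c_d), min_max_normalize(c_c)
    return {u: math.sqrt(d_norm[u] ** 2 + c_norm[u] ** 2 + c_b[u] ** 2) for u in c_d}

# Hypothetical centrality values for a three-node graph.
c_d = {"a": 0.5, "b": 1.0, "c": 0.5}
c_c = {"a": 2 / 3, "b": 1.0, "c": 2 / 3}
c_b = {"a": 0.0, "b": 2.0, "c": 0.0}
fused = fuse_importance(c_d, c_c, c_b)
```

The fused value then serves as the node weight when the simple graph is converted into the weighted graph.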
As a specific embodiment, in the system for detecting an abnormal account of an online social network of the present invention, the trust value transmission module 1005 specifically includes:
an edge weight determining unit, configured to determine the weight of each edge in each transfer direction according to the weights and the out-degrees of the nodes at the two ends of the edge.
A trust value transfer unit, configured to perform power iteration with the trust value update formula, based on the out-degree of each node, for a total of O(log n) iterations, to obtain the trust value of each node; where $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; $\mathrm{Outdeg}(v)$ is the out-degree of node v; the weight of node v is its importance fusion value; U is the set of edges in the weighted graph; and (u, v) denotes an edge pointing from node v to node u.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Because the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.
The principles and embodiments of the present invention have been described herein with reference to specific examples; the description of these examples is intended only to assist in understanding the method of the present invention and its core ideas. Modifications made by those of ordinary skill in the art in light of the present teachings also fall within the scope of the present invention. In view of the foregoing, this description should not be construed as limiting the invention.

Claims (8)

1. A method for detecting abnormal accounts of an online social network, characterized by comprising the following steps:
generating a simple graph according to a user relation data set of the online social network; the user relation data set comprises the association relation between the accounts of the users; nodes in the simple graph are accounts of users, and edges are association relations between two users;
according to the simple graph, calculating the importance of each node through a node importance evaluation algorithm; the importance of a node includes the in-degree centrality, the proximity centrality, and the intermediacy centrality of the node;
fusing the importance of each node to obtain an importance fusion value of each node;
the importance fusion value of each node is used as the weight of the corresponding node, and the simple graph is converted into a weighted graph;
combining the weight and the out-degree of each node in the weighted graph, performing power iteration of trust seed transfer, and assigning a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes selected randomly in the weighted graph, and each trust seed is given an initial trust value;
and determining the account corresponding to a node with a smaller trust value in the weighted graph as an abnormal account.
2. The method for detecting abnormal accounts of an online social network according to claim 1, wherein the calculating the importance of each node through a node importance evaluation algorithm according to the simple graph specifically comprises:
calculating the in-degree centrality of each node using the formula $C_D(u) = \frac{1}{n-1}\sum_{v} X_{vu}$; where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, $X_{vu} = 1$ meaning that there is a connection pointing from node v to node u and $X_{vu} = 0$ meaning that there is no such connection; and $n-1$ is the number of nodes in the simple graph other than node u;
calculating the proximity centrality of each node from the shortest-path distances to that node; where $C_C(u)$ is the proximity centrality of node u and $d(v, u)$ is the shortest-path distance from node v to node u;
calculating the intermediacy centrality of each node using the formula $C_B(u) = \sum_{s,t \in V} \frac{\sigma(s,t \mid u)}{\sigma(s,t)}$; where $C_B(u)$ is the intermediacy centrality of node u; V is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; and $\sigma(s,t \mid u)$ is the number of those shortest paths that pass through node u.
3. The method for detecting abnormal accounts of an online social network according to claim 1, wherein the fusing the importance of each node to obtain the importance fusion value of each node specifically comprises:
normalizing the in-degree centrality of each node over all nodes of the simple graph, to obtain the normalized in-degree centrality of node u; where $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; and n is the number of nodes in the simple graph;
normalizing the proximity centrality of each node to the value $\frac{C_C(u) - \min C_C(i)}{\max C_C(i) - \min C_C(i)}$, which is the normalized proximity centrality of node u; where $C_C(u)$ is the proximity centrality of node u; $\min C_C(i)$ is the minimum proximity centrality among all nodes of the simple graph; and $\max C_C(i)$ is the maximum proximity centrality among all nodes of the simple graph;
performing importance fusion for each node using the Euler formula, to obtain the importance fusion value of node u; where $C_B(u)$ is the intermediacy centrality of node u.
4. The method for detecting abnormal accounts of an online social network according to claim 1, wherein the performing power iteration of trust seed transfer by combining the weight and the out-degree of each node in the weighted graph, and assigning a corresponding trust value to each node in the weighted graph, specifically comprises:
determining the weight of each edge in each transfer direction according to the weights and the out-degrees of the nodes at the two ends of the edge;
performing power iteration with the trust value update formula, based on the out-degree of each node, for a total of O(log n) iterations, to obtain the trust value of each node; where $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; $\mathrm{Outdeg}(v)$ is the out-degree of node v; the weight of node v is its importance fusion value; U is the set of edges in the weighted graph; and (u, v) denotes an edge pointing from node v to node u.
5. A system for detecting abnormal accounts of an online social network, characterized by comprising:
the simple graph generation module is used for generating a simple graph according to the user relation data set of the online social network; the user relation data set comprises the association relation between the accounts of the users; nodes in the simple graph are accounts of users, and edges are association relations between two users;
the node importance calculating module is used for calculating the importance of each node through a node importance evaluation algorithm according to the simple graph; the importance of a node includes the in-degree centrality, the proximity centrality, and the intermediacy centrality of the node;
the importance fusion module is used for fusing the importance of each node to obtain an importance fusion value of each node;
the weighted graph generating module is used for taking the importance fusion value of each node as the weight of the corresponding node and converting the simple graph into a weighted graph;
the trust value transfer module is used for performing power iteration of trust seed transfer by combining the weight and the out-degree of each node in the weighted graph, and assigning a corresponding trust value to each node in the weighted graph; the trust seeds are a subset of nodes selected randomly in the weighted graph, and each trust seed is given an initial trust value;
and the abnormal account determining module is used for determining the account corresponding to a node with a smaller trust value in the weighted graph as an abnormal account.
6. The system for detecting abnormal accounts of online social networks according to claim 5, wherein the node importance calculating module specifically comprises:
an in-degree centrality calculating unit, configured to calculate the in-degree centrality of each node using the formula $C_D(u) = \frac{1}{n-1}\sum_{v} X_{vu}$; where $C_D(u)$ is the in-degree centrality of node u; $X_{vu}$ is 1 or 0, $X_{vu} = 1$ meaning that there is a connection pointing from node v to node u and $X_{vu} = 0$ meaning that there is no such connection; and $n-1$ is the number of nodes in the simple graph other than node u;
a proximity centrality calculating unit, configured to calculate the proximity centrality of each node from the shortest-path distances to that node; where $C_C(u)$ is the proximity centrality of node u and $d(v, u)$ is the shortest-path distance from node v to node u;
an intermediacy centrality calculating unit, configured to calculate the intermediacy centrality of each node using the formula $C_B(u) = \sum_{s,t \in V} \frac{\sigma(s,t \mid u)}{\sigma(s,t)}$; where $C_B(u)$ is the intermediacy centrality of node u; V is the set of nodes in the simple graph; $\sigma(s,t)$ is the number of shortest paths from node s to node t; and $\sigma(s,t \mid u)$ is the number of those shortest paths that pass through node u.
7. The system for detecting abnormal accounts of online social networks according to claim 5, wherein the importance fusion module specifically comprises:
an in-degree centrality normalization unit, configured to normalize the in-degree centrality of each node over all nodes of the simple graph, to obtain the normalized in-degree centrality of node u; where $C_D(u)$ is the in-degree centrality of node u; $C_D(i)$ is the in-degree centrality of node i; and n is the number of nodes in the simple graph;
a proximity centrality normalization unit, configured to normalize the proximity centrality of each node to the value $\frac{C_C(u) - \min C_C(i)}{\max C_C(i) - \min C_C(i)}$, which is the normalized proximity centrality of node u; where $C_C(u)$ is the proximity centrality of node u; $\min C_C(i)$ is the minimum proximity centrality among all nodes of the simple graph; and $\max C_C(i)$ is the maximum proximity centrality among all nodes of the simple graph;
an importance fusion unit, configured to perform importance fusion for each node using the Euler formula, to obtain the importance fusion value of node u; where $C_B(u)$ is the intermediacy centrality of node u.
8. The system for detecting abnormal accounts of online social networks according to claim 5, wherein the trust value transfer module specifically comprises:
an edge weight determining unit, configured to determine the weight of each edge in each transfer direction according to the weights and the out-degrees of the nodes at the two ends of the edge;
a trust value transfer unit, configured to perform power iteration with the trust value update formula, based on the out-degree of each node, for a total of O(log n) iterations, to obtain the trust value of each node; where $T^{(i)}(u)$ is the trust value of node u obtained in the i-th iteration; $T^{(i-1)}(v)$ is the trust value of node v obtained in the (i-1)-th iteration; $\mathrm{Outdeg}(v)$ is the out-degree of node v; the weight of node v is its importance fusion value; U is the set of edges in the weighted graph; and (u, v) denotes an edge pointing from node v to node u.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011428803.9A CN112597439B (en) 2020-12-07 2020-12-07 Method and system for detecting abnormal account number of online social network

Publications (2)

Publication Number Publication Date
CN112597439A CN112597439A (en) 2021-04-02
CN112597439B true CN112597439B (en) 2024-03-01

Family

ID=75191163

Country Status (1)

Country Link
CN (1) CN112597439B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326437B (en) * 2021-06-22 2022-06-21 哈尔滨工程大学 Microblog early rumor detection method based on dual-engine network and DRQN
CN113610521A (en) * 2021-07-27 2021-11-05 胜斗士(上海)科技技术发展有限公司 Method and apparatus for detecting anomalies in behavioral data

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108932669A (en) * 2018-06-27 2018-12-04 北京工业大学 A kind of abnormal account detection method based on supervised analytic hierarchy process (AHP)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant