CN112445957A - Social network abnormal user detection method, system, medium, equipment and terminal - Google Patents

Social network abnormal user detection method, system, medium, equipment and terminal


Publication number
CN112445957A
Authority
CN
China
Prior art keywords
social network
abnormal
matrix
user
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011226262.1A
Other languages
Chinese (zh)
Inventor
朱辉
俞志鹏
李鹤麟
李晖
兰玮
文浩斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202011226262.1A
Publication of CN112445957A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Abstract

The invention belongs to the technical field of social network data mining and discloses a method, system, medium, equipment and terminal for detecting abnormal users in a social network. Crawled social network data is preprocessed to construct a social network adjacency matrix, a social network attribute matrix and a social network adjacency attribute matrix. Based on the social network attribute matrix and the social network adjacency attribute matrix, a deep neural network model with a self-coding structure produces a low-dimensional representation matrix of the social network users and updates an abnormal value for each user. Finally, the degree of abnormality of each user is evaluated through the abnormal value, completing the detection and identification of abnormal users in the social network. By combining network representation learning with the task of detecting abnormal social network users, the method identifies abnormal users while reducing their influence on the representation learning of the social network, generating robust network embedding vectors that are convenient for downstream data mining tasks.

Description

Social network abnormal user detection method, system, medium, equipment and terminal
Technical Field
The invention belongs to the technical field of social network data mining, and particularly relates to a method, a system, a medium, equipment and a terminal for detecting abnormal users in a social network.
Background
With the rapid development and wide application of internet technology, social networks have become an essential part of people's digital lives thanks to their convenience, entertainment value and real-time nature. Social networks carry massive amounts of media and social information, and they also contain a large amount of private information and considerable commercial value, which attracts a large number of malicious attackers. These attackers create fake accounts or steal legitimate ones and use them to publish malicious information, commit financial fraud, launch network attacks and so on, seriously threatening people's lives and property as well as the normal order and trust relationships of the social network. These malicious attackers are collectively referred to as abnormal users.
The detection and identification of abnormal user nodes in a social network faces the following difficulties:
(1) Traditional social network anomaly detection methods incur large time overhead and labor costs. The user base of a social network is large and wide-ranging and contains many kinds of abnormal users whose behavioral characteristics change dynamically over time; when abnormal users change their behavior patterns, traditional methods cannot handle them effectively.
(2) The complexity of social networks makes anomaly detection very difficult. Because of the edges in the topological structure, the data is high-dimensional and sparse, user nodes are highly coupled, and the relationships among user nodes are repeatedly iterated, so user characteristics are difficult to capture.
To address these problems, the following solutions have been proposed:
(1) A method and system for detecting abnormal social network accounts based on network representation learning. It constructs a joint optimization model of the topological structure and the node attributes of the social network, solves for each account's anomaly factors with respect to topological structure, node attributes and structure-attribute consistency, and jointly evaluates the three anomaly factors to detect and identify abnormal social network accounts.
(2) A method for detecting abnormal users in a social network based on graph embedding. It constructs a user node embedding model from the community membership values of user nodes in the social network, then solves for the embedding weight vector and the anomaly level of each user node, and defines user nodes whose anomaly level is above a maximum threshold or below a minimum threshold as abnormal user nodes.
However, these solutions have certain limitations:
(1) The method and system for detecting abnormal social network accounts based on network representation learning has the following defects:
1) matrix factorization techniques are not suitable for large-scale social networks;
2) real-world social networks exhibit highly complex non-linearity that matrix factorization techniques struggle to capture.
(2) The method for detecting abnormal users in a social network based on graph embedding has the following defects:
1) the community structure of a social network lacks a universal definition, community detection in large-scale social networks is difficult, and the accuracy of the community partition directly determines the effectiveness of the method;
2) the method places no constraint on abnormal user nodes during embedding to reduce their influence on the final embedding vectors, so it is difficult to construct a robust graph embedding model.
From the above analysis, the problems and defects of the prior art are:
(1) The matrix factorization techniques used by the existing method and system for detecting abnormal social network accounts based on network representation learning are not suitable for large-scale social networks, and they struggle to capture the highly complex non-linearity of real-world social networks.
(2) The existing graph-embedding-based method for detecting abnormal users depends on the community structure of the social network, which lacks a universal definition and is difficult to detect in large-scale social networks, while the accuracy of the community partition directly determines the effectiveness of the method. The method also places no constraint on abnormal user nodes during embedding to reduce their influence on the final embedding vectors, so it is difficult to construct a robust graph embedding model.
The difficulty in solving these problems and defects is that, for a large-scale social network, the model must effectively detect and identify abnormal users while also being robust enough to be convenient for downstream data mining tasks.
The significance of solving these problems and defects is that it matters for social networking site security and user privacy protection, and it also has research value for group event monitoring and public opinion analysis in social networks.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a method, a system, a medium, equipment and a terminal for detecting abnormal users in a social network.
The invention is realized as follows. A method for detecting abnormal users in a social network comprises:
preprocessing the crawled social network data and constructing a social network adjacency matrix, a social network attribute matrix and a social network adjacency attribute matrix, wherein the social network attribute matrix is used as the model input, the social network adjacency attribute matrix is used as the expected model output, and the model is trained to reduce the loss;
based on the social network attribute matrix and the social network adjacency attribute matrix, obtaining a low-dimensional representation matrix of the social network users by using a deep neural network model with a self-coding structure, and updating the abnormal value of each user in the social network; this both enables anomaly detection and identification in the social network and, by introducing a coefficient factor inversely proportional to the abnormal value, yields a robust user low-dimensional representation matrix;
evaluating the degree of abnormality of each user in the social network through the abnormal value, wherein users are ranked from high to low by abnormal value to complete the detection and identification of abnormal users in the social network.
Further, the social network abnormal user detection method comprises the following steps:
(1) constructing a social network (V, E, A) by using the social network data set, wherein V is a set of all nodes in the social network, E is a set of all edges in the social network, and A is a set of all node attributes in the social network;
(2) preprocessing the social network data (V, E, A) from step (1) to construct an N × N social network adjacency matrix G, an N × M social network attribute matrix A and an N × M social network adjacency attribute matrix Y;
(3) Constructing a social network abnormal user detection model based on network embedding;
(4) initializing the relevant parameters of the model and iterating by gradient descent to reduce the loss function value until convergence; then ranking all nodes in the social network from high to low by abnormal value, and outputting the result to data mining personnel for detecting and identifying abnormal nodes in the social network.
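The final ranking in step (4) can be illustrated with a minimal Python sketch. The function name `rank_by_outlier` and the sample values are hypothetical, and the sketch assumes the abnormal (outlier) values are already available as a list ordered by node row number.

```python
def rank_by_outlier(outlier_values):
    """Return 1-based node indices sorted from highest to lowest abnormal value."""
    # Node indices start at 1, matching the row-number indexing of the data set.
    indexed = list(enumerate(outlier_values, start=1))
    # list.sort is stable, so nodes with equal values keep ascending index order.
    indexed.sort(key=lambda pair: pair[1], reverse=True)
    return [node for node, _ in indexed]


ranking = rank_by_outlier([0.05, 0.40, 0.10, 0.45])
print(ranking)  # nodes most likely to be abnormal come first
```

The head of the returned list is what would be handed to data mining personnel for inspection.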
Further, the preprocessing step includes:
1) converting node identifiers to integers: for each user node v in the social network data set, the row number of v is used as its unique integer index, with row numbers starting from 1;
2) constructing the social network adjacency matrix: an N × N social network adjacency matrix G is constructed from the |E| × 2 matrix of pairwise follow relationships in the social network data;
3) constructing the social network attribute matrix: the attribute vectors of all nodes in the social network data set form an N × M social network attribute matrix A;
4) constructing the social network adjacency attribute matrix: for each node v in the graph, the neighbor node set Neigh(v) is obtained. If v has neighbors, its adjacency attribute vector is the mean of the attribute vectors of all its neighbors, namely
$$Y_v = \frac{1}{|\mathrm{Neigh}(v)|} \sum_{u \in \mathrm{Neigh}(v)} A_u$$
where $A_u$ denotes the attribute vector (row) of node u. If v has no neighbors, its adjacency attribute vector is set to its own attribute vector, namely
$$Y_v = A_v$$
Performing this operation on all nodes in the social network yields the N × M social network adjacency attribute matrix Y.
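The preprocessing steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, edges are given as 1-based (source, target) pairs, and Neigh(v) is assumed to contain every node linked to v in either direction, excluding v itself.

```python
def build_adjacency(num_nodes, edges):
    """Build the N x N 0/1 adjacency matrix G from 1-based (source, target) pairs."""
    G = [[0] * num_nodes for _ in range(num_nodes)]
    for src, dst in edges:
        G[src - 1][dst - 1] = 1
    return G


def neighbors(G, v):
    """Assumption: a neighbor is any node linked to v in either direction (v is 0-based)."""
    return [u for u in range(len(G)) if u != v and (G[v][u] or G[u][v])]


def build_adjacency_attributes(G, A):
    """Row v is the mean attribute vector of v's neighbors, or A[v] if v is isolated."""
    num_attrs = len(A[0])
    Y = []
    for v in range(len(A)):
        neigh = neighbors(G, v)
        if neigh:
            Y.append([sum(A[u][j] for u in neigh) / len(neigh) for j in range(num_attrs)])
        else:
            Y.append(list(A[v]))
    return Y


edges = [(1, 2), (1, 3)]  # node 4 has no edges
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
G = build_adjacency(4, edges)
Y = build_adjacency_attributes(G, A)
print(Y)
```

For real data sets with thousands of nodes, a sparse representation of G would be preferable to the dense nested lists used here.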
Further, the model building step includes:
1) constructing a deep neural network model based on a self-coding structure: the encoder consists of K fully connected layers, where K is a positive integer; the K encoder layers reduce the M-dimensional attribute vector to a D-dimensional hidden-layer output, with a hyperbolic tangent activation function connecting the fully connected layers;
the decoder likewise consists of K fully connected layers; the K decoder layers expand the D-dimensional input vector to an M-dimensional output vector, with a hyperbolic tangent activation function connecting the fully connected layers;
on this basis, an N × D low-dimensional representation matrix E of the social network users is constructed;
2) constructing the loss function of the deep neural network model based on the self-coding structure: so that the network-embedding-based social network abnormal user detection model detects and identifies abnormal nodes to the greatest extent while minimizing the influence of abnormal nodes on social network representation learning, a loss function is constructed;
3) updating the outliers of the deep neural network model based on the self-coding structure.
Further, in the deep neural network model based on the self-coding structure, the encoder's fully connected layers are connected through a hyperbolic tangent activation function:
[Equation provided as an image in the original: encoder layer formula]
The decoder consists of K fully connected layers, likewise connected through a hyperbolic tangent activation function:
[Equation provided as an image in the original: decoder layer formula]
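A forward pass through such an encoder and decoder can be sketched as follows. Since the layer formulas are not reproduced in this text, several details are assumptions: each layer is taken to compute tanh(Wx + b) with a bias term, tanh is applied to the final decoder layer as well, and the weights are initialized randomly. All names and the toy sizes (M = 8, D = 2, K = 2) are hypothetical.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return W, b


def dense_tanh(x, layer):
    """Apply one fully connected layer followed by tanh (assumed form tanh(Wx + b))."""
    W, b = layer
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi) for row, bi in zip(W, b)]


def build_stack(sizes, rng):
    """Create consecutive layers mapping sizes[0] -> sizes[1] -> ... -> sizes[-1]."""
    return [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]


def forward(x, encoder, decoder):
    """Return the D-dimensional embedding and the M-dimensional reconstruction."""
    h = x
    for layer in encoder:
        h = dense_tanh(h, layer)
    embedding = h
    for layer in decoder:
        h = dense_tanh(h, layer)
    return embedding, h


rng = random.Random(0)
sizes = [8, 4, 2]  # toy example: M = 8, D = 2, K = 2
encoder = build_stack(sizes, rng)
decoder = build_stack(sizes[::-1], rng)
embedding, output = forward([1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0], encoder, decoder)
print(len(embedding), len(output))
```

Stacking the embeddings of all N nodes row by row gives the N × D low-dimensional representation matrix.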
Further, the loss function is:
[Equation provided as an image in the original: loss function]
where N is the total number of nodes in the social network, M is the dimension of the node attribute vector, $Y$ denotes the adjacency attribute matrix and $Y_{ij}$ the j-th adjacency attribute value of the i-th node, $\hat{Y}$ denotes the adjacency attribute matrix output by the deep neural network based on the self-coding structure and $\hat{Y}_{ij}$ the j-th adjacency attribute value of the i-th node in that output, and $\lambda_i$ denotes the abnormal value of the i-th node, which identifies the degree of abnormality of that node.
The abnormal values are updated according to the following formula:
[Equation provided as an image in the original: abnormal value update]
where M, $Y$, $Y_{ij}$, $\hat{Y}$ and $\hat{Y}_{ij}$ are as defined for the loss function, and $\lVert \cdot \rVert_F$ denotes the Frobenius norm of a matrix. The abnormal values must be updated after each iteration in which the deep neural network parameters are updated.
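The loss and update formulas themselves are not reproduced in this text. One form that is consistent with the symbol definitions above, and with the earlier statement that a coefficient factor decreasing with the abnormal value is introduced, is sketched below. Both expressions are assumptions for illustration, with $\hat{Y}$ denoting the network output; they are not the patent's exact formulas.

```latex
% Assumed loss: each node's squared reconstruction error is scaled by a factor
% that shrinks as its abnormal value \lambda_i grows.
\mathcal{L} = \frac{1}{N M} \sum_{i=1}^{N} \log\!\left(\frac{1}{\lambda_i}\right)
              \sum_{j=1}^{M} \left(Y_{ij} - \hat{Y}_{ij}\right)^{2}

% Assumed update: \lambda_i is node i's share of the total squared error,
% so that \sum_i \lambda_i = 1.
\lambda_i = \frac{\sum_{j=1}^{M} \left(Y_{ij} - \hat{Y}_{ij}\right)^{2}}
                 {\lVert Y - \hat{Y} \rVert_F^{2}}
```

Under these assumed forms, a node that is reconstructed poorly receives a large $\lambda_i$ and therefore a small weight in the loss, which limits its influence on the learned parameters.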
It is a further object of the invention to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of:
preprocessing the crawled social network data, and constructing a social network adjacency matrix, a social network attribute matrix and a social network adjacency attribute matrix;
based on the social network attribute matrix and the social network adjacency attribute matrix, obtaining a social network user low-dimensional representation matrix by using a deep neural network model of a self-coding structure, and updating an abnormal value of each user in the social network;
and evaluating the abnormal degree of each user in the social network through the abnormal value, and finishing the detection and identification of the abnormal user in the social network.
It is another object of the present invention to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
preprocessing the crawled social network data, and constructing a social network adjacency matrix, a social network attribute matrix and a social network adjacency attribute matrix;
based on the social network attribute matrix and the social network adjacency attribute matrix, obtaining a social network user low-dimensional representation matrix by using a deep neural network model of a self-coding structure, and updating an abnormal value of each user in the social network;
and evaluating the abnormal degree of each user in the social network through the abnormal value, and finishing the detection and identification of the abnormal user in the social network.
The invention also aims to provide an information data processing terminal, which is used for realizing the social network abnormal user detection method.
Another object of the present invention is to provide a system for detecting an abnormal user in a social network, which implements the method for detecting an abnormal user in a social network, the system comprising:
the data preprocessing module is used for preprocessing the crawled social network data and constructing a social network adjacency matrix, a social network attribute matrix and a social network adjacency attribute matrix;
the abnormal value updating module is used for obtaining a social network user low-dimensional representation matrix by utilizing a deep neural network model of a self-coding structure based on the social network attribute matrix and the social network adjacency attribute matrix, and updating the abnormal value of each user in the social network;
and the abnormal user detection and identification module is used for evaluating the abnormal degree of each user in the social network through the abnormal value so as to complete the detection and identification of the abnormal user in the social network.
Combining all the above technical schemes, the advantages and positive effects of the invention are demonstrated by the experimental results shown in fig. 6, where the abscissa is the top L% of user nodes after all user nodes in the social network are ranked from high to low by abnormal value, and the ordinate is the recall rate of abnormal user node detection and identification in the social network.
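The quantity plotted in fig. 6 can be computed as in the following sketch, which assumes a set of known abnormal nodes is available for evaluation. The function name, the 0-based node indices, the sample values and the choice to round the cut-off up are all assumptions made for illustration.

```python
import math


def recall_at_top_percent(outlier_values, true_anomalies, percent):
    """Fraction of known abnormal nodes found in the top `percent`% of the ranking."""
    n = len(outlier_values)
    k = math.ceil(n * percent / 100)  # assumption: round the cut-off up
    order = sorted(range(n), key=lambda i: outlier_values[i], reverse=True)
    top = set(order[:k])
    if not true_anomalies:
        return 0.0
    return len(top & set(true_anomalies)) / len(true_anomalies)


values = [0.30, 0.02, 0.25, 0.03, 0.04, 0.20, 0.05, 0.06, 0.03, 0.02]
known_abnormal = {0, 2, 7}
for percent in (20, 50):
    print(percent, recall_at_top_percent(values, known_abnormal, percent))
```

Sweeping `percent` over a range of values and plotting the results gives a recall curve of the kind described for fig. 6.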
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a social network abnormal user detection method according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a system for detecting abnormal users in a social network according to an embodiment of the present invention.
In fig. 2: 1. data preprocessing module; 2. abnormal value updating module; 3. abnormal user detection and identification module.
Fig. 3 is a flowchart of an implementation of a method for detecting an abnormal user in a social network according to an embodiment of the present invention.
Fig. 4 is an overall frame diagram of a social network abnormal user detection method according to an embodiment of the present invention.
Fig. 5 is a social network diagram of a social network abnormal user detection method according to an embodiment of the present invention.
Fig. 6 is an experimental effect diagram of an embodiment of the social network abnormal user detection method according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Aiming at the problems in the prior art, the invention provides a method, a system, a medium, a device and a terminal for detecting abnormal users in a social network, and the invention is described in detail with reference to the accompanying drawings.
As shown in fig. 1, the method for detecting abnormal users in social networks provided by the present invention includes the following steps:
S101: preprocessing the crawled social network data, and constructing a social network adjacency matrix, a social network attribute matrix and a social network adjacency attribute matrix;
S102: based on the social network attribute matrix and the social network adjacency attribute matrix, obtaining a social network user low-dimensional representation matrix by using a deep neural network model of a self-coding structure, and updating an abnormal value of each user in the social network;
S103: evaluating the abnormal degree of each user in the social network through the abnormal value, and completing the detection and identification of the abnormal user in the social network.
Those skilled in the art can also implement the method for detecting abnormal users in social network provided by the present invention by using other steps, and the method for detecting abnormal users in social network provided by the present invention in fig. 1 is only a specific embodiment.
As shown in fig. 2, the system for detecting abnormal users in social networks provided by the present invention includes:
the data preprocessing module 1 is used for preprocessing the crawled social network data and constructing a social network adjacency matrix, a social network attribute matrix and a social network adjacency attribute matrix;
the abnormal value updating module 2 is used for obtaining a social network user low-dimensional representation matrix by using a deep neural network model of a self-coding structure based on the social network attribute matrix and the social network adjacency attribute matrix, and updating the abnormal value of each user in the social network;
and the abnormal user detection and identification module 3 is used for evaluating the abnormal degree of each user in the social network through the abnormal value so as to complete the detection and identification of the abnormal user in the social network.
The technical solution of the present invention is further described below with reference to the accompanying drawings.
As shown in fig. 3, the method for detecting abnormal users in social networks provided by the present invention includes the following steps:
(1) constructing a social network (V, E, A) from a social network data set as shown in fig. 4, in which the total number of user nodes is |V| = 2708, the total number of directed edges is |E| = 5429, and the user attribute vector dimension is M = 1433;
(2) preprocessing the social network data (V, E, A) from step (1) to construct a 2708 × 2708 social network adjacency matrix G, a 2708 × 1433 social network attribute matrix A and a 2708 × 1433 social network adjacency attribute matrix Y.
Specifically, the preprocessing steps are as follows:
1) converting node identifiers to integers: for each user node v in the social network data set, the row number of v is used as its unique integer index, with row numbers starting from 1;
2) constructing the social network adjacency matrix: a 2708 × 2708 social network adjacency matrix G is constructed from the 5429 × 2 matrix of pairwise follow relationships in the social network data;
3) constructing the social network attribute matrix: the attribute vectors of all nodes in the social network data set form a 2708 × 1433 social network attribute matrix A;
4) constructing the social network adjacency attribute matrix: for each node v in the graph, the neighbor node set Neigh(v) is obtained. If v has neighbors, its adjacency attribute vector is the mean of the attribute vectors of all its neighbors, namely
$$Y_v = \frac{1}{|\mathrm{Neigh}(v)|} \sum_{u \in \mathrm{Neigh}(v)} A_u$$
where $A_u$ denotes the attribute vector (row) of node u. If v has no neighbors, its adjacency attribute vector is set to its own attribute vector, namely
$$Y_v = A_v$$
Performing this operation on all nodes in the social network yields the 2708 × 1433 social network adjacency attribute matrix Y.
(3) constructing a social network abnormal user detection model based on network embedding. Specifically, the model construction steps are as follows:
1) constructing a deep neural network model based on a self-coding structure: in the encoder of this embodiment, K = 5 and D = 32. As shown in fig. 5, the encoder consists of five fully connected layers: the first layer reduces the 1433-dimensional attribute vector to 512 dimensions, the second reduces the 512-dimensional output of the previous layer to 256 dimensions, the third reduces 256 dimensions to 128, the fourth reduces 128 dimensions to 64, and the fifth reduces 64 dimensions to the 32-dimensional hidden-layer output. The fully connected layers are connected by a hyperbolic tangent activation function, that is:
[Equations provided as images in the original: encoder layer formulas]
In the decoder of this embodiment, K = 5 and D = 32. As shown in fig. 5, the decoder consists of five fully connected layers: the first layer expands the 32-dimensional input to 64 dimensions, the second expands the 64-dimensional output of the previous layer to 128 dimensions, the third expands 128 dimensions to 256, the fourth expands 256 dimensions to 512, and the fifth expands 512 dimensions to 1433. The fully connected layers are connected by a hyperbolic tangent activation function, that is:
[Equations provided as images in the original: decoder layer formulas]
On this basis, a 2708 × 32 low-dimensional representation matrix E of the social network users is constructed;
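The layer dimensions of this embodiment can be checked with a short calculation. The sketch below lists the input and output sizes of each fully connected layer and counts the trainable parameters, assuming each layer has a weight matrix plus a bias vector; the bias terms and the function names are assumptions, since the layer formulas are not reproduced in this text.

```python
def layer_shapes(sizes):
    """Pair consecutive sizes into (input_dim, output_dim) for each layer."""
    return list(zip(sizes, sizes[1:]))


def count_parameters(sizes):
    """Weights plus biases per layer; the bias terms are an assumption."""
    return sum(n_in * n_out + n_out for n_in, n_out in layer_shapes(sizes))


encoder_sizes = [1433, 512, 256, 128, 64, 32]  # K = 5 layers, D = 32
decoder_sizes = encoder_sizes[::-1]

print(layer_shapes(encoder_sizes))
print(count_parameters(encoder_sizes), count_parameters(decoder_sizes))
```

Under these assumptions the first encoder layer and the last decoder layer hold most of the parameters, because they connect to the 1433-dimensional attribute space.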
2) constructing the loss function of the deep neural network model based on the self-coding structure: so that the network-embedding-based social network abnormal user detection model detects and identifies abnormal nodes to the greatest extent while minimizing the influence of abnormal nodes on social network representation learning, the following loss function is constructed:
[Equation provided as an image in the original: loss function]
where 2708 is the total number of nodes in the social network, 1433 is the dimension of the node attribute vector, $Y$ denotes the adjacency attribute matrix and $Y_{ij}$ the j-th adjacency attribute value of the i-th node, $\hat{Y}$ denotes the adjacency attribute matrix output by the deep neural network based on the self-coding structure and $\hat{Y}_{ij}$ the j-th adjacency attribute value of the i-th node in that output, and $\lambda_i$ denotes the abnormal value of the i-th node, which identifies the degree of abnormality of that node.
3) updating the abnormal values of the deep neural network model based on the self-coding structure: the abnormal values are updated according to the following formula:
[Equation provided as an image in the original: abnormal value update]
where 1433 is the dimension of the node attribute vector, $Y$, $Y_{ij}$, $\hat{Y}$ and $\hat{Y}_{ij}$ are as defined for the loss function, and $\lVert \cdot \rVert_F$ denotes the Frobenius norm of a matrix. The abnormal values must be updated after each iteration in which the deep neural network parameters are updated.
(4) initializing the relevant parameters of the model and iterating by gradient descent to reduce the loss function value until convergence; all nodes in the social network are then ranked from high to low by abnormal value, and the result is output to data mining personnel for detecting and identifying abnormal nodes in the social network. The experimental results are shown in fig. 6, where the abscissa is the top L% of user nodes after all user nodes are ranked from high to low by abnormal value, and the ordinate is the recall rate of abnormal user node detection and identification in the social network.
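The alternation between gradient descent on the network parameters and the refresh of the abnormal values can be sketched on a toy scale as follows. This is not the patent's implementation: the network is reduced to one tanh encoder layer and one tanh decoder layer without biases, the loss weight log(1/λ_i) and the update rule (each node's share of the total squared error) are assumed forms because the original formulas are not reproduced in this text, and all names, data and hyperparameters are hypothetical.

```python
import math
import random


def forward(x, W1, W2):
    """One tanh encoder layer followed by one tanh decoder layer (toy K = 1)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y_hat = [math.tanh(sum(w * hk for w, hk in zip(row, h))) for row in W2]
    return h, y_hat


def update_outliers(A, Y, W1, W2, eps=1e-12):
    """Assumed rule: lambda_i is node i's share of the total squared error."""
    errs = []
    for x, y in zip(A, Y):
        _, y_hat = forward(x, W1, W2)
        errs.append(sum((t - p) ** 2 for t, p in zip(y, y_hat)))
    total = sum(errs)
    if total == 0.0:
        return [1.0 / len(errs)] * len(errs)
    return [max(e / total, eps) for e in errs]


def train(A, Y, hidden=2, epochs=50, lr=0.05, seed=0):
    rng = random.Random(seed)
    n, m = len(A), len(A[0])
    W1 = [[rng.uniform(-0.1, 0.1) for _ in range(m)] for _ in range(hidden)]
    W2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(m)]
    lam = [1.0 / n] * n  # start with equal abnormal values
    for _ in range(epochs):
        for i in range(n):
            x, y = A[i], Y[i]
            weight = math.log(1.0 / lam[i])  # assumed down-weighting of likely outliers
            h, y_hat = forward(x, W1, W2)
            # Backpropagate the weighted squared error through both tanh layers.
            d_out = [2.0 * weight * (p - t) * (1.0 - p * p) for p, t in zip(y_hat, y)]
            d_hid = [(1.0 - h[k] ** 2) * sum(d_out[j] * W2[j][k] for j in range(m))
                     for k in range(hidden)]
            for j in range(m):
                for k in range(hidden):
                    W2[j][k] -= lr * d_out[j] * h[k]
            for k in range(hidden):
                for l in range(m):
                    W1[k][l] -= lr * d_hid[k] * x[l]
        # Abnormal values are refreshed after each round of parameter updates.
        lam = update_outliers(A, Y, W1, W2)
    return lam, W1, W2


A = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Y = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
lam, W1, W2 = train(A, Y)
ranking = sorted(range(len(lam)), key=lambda i: lam[i], reverse=True)
print(ranking)
```

A fixed number of epochs stands in for a convergence test here; a practical version would stop when the change in the loss falls below a tolerance.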
Demonstration section (specific embodiments and experimental data that demonstrate the inventive step of the present invention)
The experimental data set is shown in the following table:
Figure BDA0002763749540000117
The evaluation of the experimental results is shown in the following table:
Figure BDA0002763749540000121
It should be noted that the embodiments of the present invention can be realized by hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer-executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a disk, CD-ROM or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The apparatus and its modules of the present invention may be implemented by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices; by software executed by various types of processors; or by a combination of hardware circuits and software, e.g., firmware.
The above description covers only specific embodiments of the present invention and does not limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention falls within the scope of protection defined by the appended claims.

Claims (10)

1. A method for detecting abnormal users in a social network, characterized by comprising the following steps:
preprocessing the crawled social network data, and constructing a social network adjacency matrix, a social network attribute matrix and a social network adjacency attribute matrix;
based on the social network attribute matrix and the social network adjacency attribute matrix, obtaining a social network user low-dimensional representation matrix by using a deep neural network model with a self-coding structure, and updating the abnormal value of each user in the social network;
and evaluating the degree of abnormality of each user in the social network by means of the abnormal value, thereby completing the detection and identification of abnormal users in the social network.
2. The method for detecting abnormal users in a social network according to claim 1, wherein the method comprises the following steps:
(1) constructing a social network (V, E, A) by using the social network data set, wherein V is a set of all nodes in the social network, E is a set of all edges in the social network, and A is an attribute matrix of all nodes in the social network;
(2) preprocessing the social network data (V, E, A) from step (1) to construct an N × N social network adjacency matrix G, an N × M social network attribute matrix A, and an N × M social network adjacency attribute matrix
Figure FDA0002763749530000011
(3) Constructing a social network abnormal user detection model based on network embedding;
(4) initializing the relevant parameters of the model, and iterating by gradient descent to reduce the loss function value
Figure FDA0002763749530000012
until convergence; then ranking all nodes in the social network from high to low by abnormal value, and outputting the result and feeding it back to data mining personnel for detecting and identifying the abnormal nodes in the social network.
3. The social networking anomalous user detection method of claim 2, wherein said preprocessing step includes:
1) converting the unique identifiers of the social network nodes to integers: for each user node v in the social network data set, taking its row number as its unique integer index, with row numbering starting from 1;
2) constructing a social network adjacency matrix: constructing an N × N social network adjacency matrix G based on the E × 2 matrix of pairwise follow relationships in the social network data;
3) constructing a social network attribute matrix: the attribute vectors of all nodes in the social network data set form an N × M social network attribute matrix A;
4) constructing a social network adjacency attribute matrix: for each node v in the graph, acquiring its neighbor node set Neigh(v); if neighbor nodes exist, the adjacency attribute vector of v is the average of the attribute vectors of all its neighbor nodes, namely
Figure FDA0002763749530000021
if no neighbor node exists, the adjacency attribute vector of v is assigned its own attribute vector, namely
Figure FDA0002763749530000022
performing this operation on all nodes in the social network yields the N × M social network adjacency attribute matrix
Figure FDA0002763749530000023
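The preprocessing steps of claim 3 can be illustrated with a short NumPy sketch. It is a sketch under stated assumptions rather than the patented implementation: the function names and toy data are hypothetical, and treating each follow relationship as an undirected edge is an assumption that the text does not confirm.

```python
import numpy as np

def build_adjacency(edges, n):
    """Build the N x N adjacency matrix G from pairs of 1-based node indices."""
    G = np.zeros((n, n), dtype=float)
    for u, v in edges:
        # Assumption: a follow relationship is treated as an undirected edge.
        G[u - 1, v - 1] = 1.0
        G[v - 1, u - 1] = 1.0
    return G

def build_adjacency_attributes(G, A):
    """Row i is the mean attribute vector of node i's neighbours, or A[i] if it has none."""
    degree = G.sum(axis=1, keepdims=True)           # number of neighbours per node
    neighbour_sum = G @ A                           # sum of neighbour attribute vectors
    safe_degree = np.where(degree > 0, degree, 1.0) # avoid division by zero
    mean = neighbour_sum / safe_degree
    # Isolated nodes fall back to their own attribute vector.
    return np.where(degree > 0, mean, A)

# Toy network (hypothetical): 4 nodes, node 4 has no neighbours.
edges = [(1, 2), (1, 3)]
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.5, 0.5]])
G = build_adjacency(edges, n=4)
Y = build_adjacency_attributes(G, A)
```

Here node 1 receives the average of the attribute vectors of nodes 2 and 3, while the isolated node 4 keeps its own attribute vector.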
4. The method for detecting abnormal users in a social network according to claim 2, wherein the model construction step comprises:
1) constructing a deep neural network model based on a self-coding structure: the encoder consists mainly of K fully connected layers, where K is a positive integer greater than or equal to 1; the K encoder layers reduce the M-dimensional attribute vector to a D-dimensional hidden-layer output, and the fully connected layers are connected through a hyperbolic tangent activation function;
the decoder likewise consists of K fully connected layers; the K decoder layers expand the D-dimensional input vector into an M-dimensional output vector, and the fully connected layers are connected through a hyperbolic tangent activation function;
constructing the N × D social network user low-dimensional representation matrix E from the D-dimensional hidden-layer outputs of the encoder;
2) constructing a loss function of the deep neural network model based on the self-coding structure: in order for the network-embedding-based social network abnormal user detection model to detect and identify abnormal nodes to the greatest extent while minimizing the influence of abnormal nodes on social media network representation learning, the following loss function is constructed;
3) updating the outliers of the deep neural network model based on the self-coding structure.
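The encoder and decoder structure of claim 4 can be sketched as a plain NumPy forward pass. The layer sizes, the random initialization, and the function names are hypothetical; applying the hyperbolic tangent after every layer, including the last one, is an assumption, since the text only states that the fully connected layers are connected through that activation.

```python
import numpy as np

def init_layers(sizes, rng):
    """Create (weight, bias) pairs for consecutive fully connected layers."""
    return [(rng.normal(0.0, 0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Apply each fully connected layer followed by tanh (assumed for every layer)."""
    for W, b in layers:
        x = np.tanh(x @ W + b)
    return x

rng = np.random.default_rng(0)
N, M, D = 5, 8, 2                      # hypothetical sizes
encoder = init_layers([M, 4, D], rng)  # K = 2 encoder layers: M -> 4 -> D
decoder = init_layers([D, 4, M], rng)  # K = 2 decoder layers: D -> 4 -> M
A = rng.random((N, M))                 # toy N x M attribute matrix
E = forward(A, encoder)                # N x D low-dimensional representation
Y_hat = forward(E, decoder)            # N x M decoder output
```

Training, that is, adjusting the weights by gradient descent against the loss function, is deliberately left out of this sketch.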
5. The method for detecting abnormal users in a social network as claimed in claim 4, wherein, in constructing the deep neural network model based on the self-coding structure, the fully connected layers of the encoder are connected through a hyperbolic tangent activation function:
Figure FDA0002763749530000024
the decoder consists of K fully connected layers, and the fully connected layers are connected through a hyperbolic tangent activation function:
Figure FDA0002763749530000031
6. The method for detecting abnormal users in a social network according to claim 4, wherein the loss function is:
Figure FDA0002763749530000032
wherein N is the total number of nodes in the social network, M is the dimension of the node attribute vector, Y represents the adjacency attribute matrix and, in particular, Y_ij represents the j-th adjacency attribute value of the i-th node,
Figure FDA0002763749530000033
represents the adjacency attribute matrix output by the deep neural network based on the self-coding structure and, in particular,
Figure FDA0002763749530000034
represents the j-th adjacency attribute value of the i-th node in the matrix output by the deep neural network, and λ_i represents the outlier of the i-th node, which identifies the degree of abnormality of that node;
outlier updates are based on the following formula:
Figure FDA0002763749530000035
where M is the dimension of the node attribute vector, Y represents the adjacency attribute matrix and, in particular, Y_ij represents the j-th adjacency attribute value of the i-th node,
Figure FDA0002763749530000036
represents the adjacency attribute matrix output by the deep neural network based on the self-coding structure and, in particular,
Figure FDA0002763749530000037
represents the j-th adjacency attribute value of the i-th node in the matrix output by the deep neural network, and
Figure FDA0002763749530000038
denotes the Frobenius norm of the matrix
Figure FDA0002763749530000039
The outlier must be updated after the deep neural network parameters are updated in each iteration.
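The exact loss and outlier update formulas of claim 6 are contained in the referenced figures and are not reproduced here. The sketch below only illustrates two quantities the text names, the difference between Y and the network output, and the Frobenius norm. Using the row-wise squared error as a per-node quantity is an assumption, and the function names and toy matrices are hypothetical.

```python
import numpy as np

def per_node_squared_error(Y, Y_hat):
    """Squared Euclidean reconstruction error of each node (one value per row)."""
    diff = np.asarray(Y, dtype=float) - np.asarray(Y_hat, dtype=float)
    return np.sum(diff ** 2, axis=1)

def frobenius_norm(X):
    """Frobenius norm: square root of the sum of all squared entries."""
    return float(np.sqrt(np.sum(np.asarray(X, dtype=float) ** 2)))

# Toy matrices (hypothetical): 3 nodes, 2 attributes each.
Y = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Y_hat = np.array([[1.0, 0.0], [0.0, 0.0], [-2.0, 1.0]])
errors = per_node_squared_error(Y, Y_hat)  # one error per node
fro = frobenius_norm(Y - Y_hat)            # norm of the whole residual matrix
```

The squared Frobenius norm of the residual matrix equals the sum of the per-node squared errors, which is why a node with a large row-wise error dominates the overall reconstruction error.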
7. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of:
preprocessing the crawled social network data, and constructing a social network adjacency matrix, a social network attribute matrix and a social network adjacency attribute matrix;
based on the social network attribute matrix and the social network adjacency attribute matrix, obtaining a social network user low-dimensional representation matrix by using a deep neural network model with a self-coding structure, and updating the abnormal value of each user in the social network;
and evaluating the degree of abnormality of each user in the social network by means of the abnormal value, thereby completing the detection and identification of abnormal users in the social network.
8. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
preprocessing the crawled social network data, and constructing a social network adjacency matrix, a social network attribute matrix and a social network adjacency attribute matrix;
based on the social network attribute matrix and the social network adjacency attribute matrix, obtaining a social network user low-dimensional representation matrix by using a deep neural network model with a self-coding structure, and updating the abnormal value of each user in the social network;
and evaluating the degree of abnormality of each user in the social network by means of the abnormal value, thereby completing the detection and identification of abnormal users in the social network.
9. An information data processing terminal, characterized in that the information data processing terminal is used for implementing the social network abnormal user detection method of any one of claims 1 to 6.
10. A social network abnormal user detection system for implementing the social network abnormal user detection method of any one of claims 1 to 6, wherein the social network abnormal user detection system comprises:
the data preprocessing module is used for preprocessing the crawled social network data and constructing a social network adjacency matrix, a social network attribute matrix and a social network adjacency attribute matrix;
the abnormal value updating module is used for obtaining a social network user low-dimensional representation matrix by using a deep neural network model with a self-coding structure, based on the social network attribute matrix and the social network adjacency attribute matrix, and for updating the abnormal value of each user in the social network;
and the abnormal user detection and identification module is used for evaluating the abnormal degree of each user in the social network through the abnormal value so as to complete the detection and identification of the abnormal user in the social network.
CN202011226262.1A 2020-11-05 2020-11-05 Social network abnormal user detection method, system, medium, equipment and terminal Pending CN112445957A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011226262.1A CN112445957A (en) 2020-11-05 2020-11-05 Social network abnormal user detection method, system, medium, equipment and terminal

Publications (1)

Publication Number Publication Date
CN112445957A true CN112445957A (en) 2021-03-05

Family

ID=74736610

Country Status (1)

Country Link
CN (1) CN112445957A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210305