CN116304311A - Online social network spam comment user detection method - Google Patents
- Publication number
- CN116304311A (application CN202310148077.2A)
- Authority
- CN
- China
- Prior art keywords
- node
- neighbors
- dimension
- neural network
- graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9538—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a method for detecting spam comment users in an online social network, comprising the following steps. Step one, graph construction and preprocessing: a graph structure is established with online social platform users as nodes and the interaction relationships among users as edges, and an adjacency matrix is constructed; part of the data is manually labeled, giving the numbers and labels of the labeled nodes, where 1 represents a spam comment sender and 0 represents a normal user; a confidence vector is established. Step two, graph neural network construction: the graph neural network comprises two layers, and the output dimension of the last layer is 2; the 1st dimension represents the confidence with which the network judges the node to be a spam comment sender, and the 2nd dimension represents the confidence with which it judges the node to be a normal user; the network obtains the features of a node by aggregating the features of its neighbors, takes the category information of the neighbors into account when extracting node features, and applies different feature aggregation strategies to neighbors of different categories. Step three, iterative optimization.
Description
Technical Field
The invention belongs to the field of data mining and relates to an anomaly detection method based on a graph neural network. The method models an online social network as a graph, treats spam comment users as abnormal nodes in that network, uses a graph neural network that fuses edge information to extract the features of the users, and feeds these features into a classifier for semi-supervised anomaly detection.
Background
With the development of the Internet, more and more online platforms such as Weibo, Dianping and Douban have emerged. As the user base grows, meaningless and malicious comments on these platforms are increasing. In addition, some unscrupulous merchants hire dedicated fake-review accounts to post positive reviews under their products in order to inflate their ratings. These malicious users seriously damage the credibility of online platforms. Manual inspection alone consumes a great deal of manpower, so the demand for automated detection of spam comment users is growing.
By modeling users as nodes and the various interactions between users as edges, an online social network can be established for an online social platform, so that graph-based algorithms can be used to detect spam comment senders. Owing to the complexity of graph data, the camouflage of spam comment senders, and other factors, detecting them still faces various challenges.
Graph neural networks are widely applied to graph learning tasks because of their excellent performance in graph feature extraction. FdGars [1] is a method that detects spam comment senders using a graph neural network. However, to avoid being detected by the algorithm, spam comment senders may engage in camouflage behavior, such as establishing normal interactions with a large number of normal users, or disguising their own user attributes and posted comments to resemble those of normal users. The structure of the graph neural network therefore needs to be optimized to suit spam comment sender detection when camouflage behavior is present.
[1] Wang J, Wen R, Wu C, et al. FdGars: Fraudster Detection via Graph Convolutional Networks in Online App Review System[C]. In Companion of The 2019 World Wide Web Conference, 2019: 310–316.
Disclosure of Invention
The main purpose of the invention is to provide a method for detecting spam comment users in an online social network that detects spam comment senders more accurately. The technical scheme is as follows:
A method for detecting spam comment users in an online social network comprises the following steps:
step one, graph construction and preprocessing
(1) Establishing a graph structure by taking online social platform users as nodes and the interaction relationship among the users as edges, and constructing an adjacency matrix;
(2) Digitizing the user attributes, constructing an attribute matrix, wherein each row of the attribute matrix represents the attribute of the corresponding user;
(3) Manually labeling part of the data, giving the numbers and labels of the labeled nodes, where 1 represents a spam comment sender and 0 represents a normal user, and dividing the labeled nodes into a training set and a test set;
(4) Establishing a confidence vector $B \in \{0,1\}^N$, where N is the number of nodes; a value of 0 at the i-th position indicates that node i is more likely a normal user, and a value of 1 indicates that node i is more likely a spam comment sender; initializing the confidence vector so that the positions in B corresponding to training-set nodes labeled 0 are 0, the positions corresponding to training-set nodes labeled 1 are 1, and all remaining positions are 0;
step two, constructing a graph neural network
The graph neural network comprises two layers, the output dimension of the last layer is 2, the 1 st dimension represents the confidence that the neural network judges the node as a spam comment sender, the 2 nd dimension represents the confidence that the neural network judges the node as a normal user, the graph neural network acquires the characteristics of the node by aggregating the characteristics of the node neighbors, the category information of the neighbors is considered when the node characteristics are extracted, and different characteristic aggregation strategies are executed for the neighbors of different types;
each layer of the graph neural network comprises the following process:
(1) Using a fully connected layer to reduce the dimension of the feature $h_u$ of user u, obtaining the dimension-reduced user feature $z_u$ by the formula:
$z_u = W_t h_u$
where $W_t \in \mathbb{R}^{d_{out} \times d_{in}}$ is the weight matrix of the fully connected layer, $d_{in}$ is the input dimension of the layer, and $d_{out}$ is the output dimension of the layer;
(2) Regarding node v as the central node, for each neighbor u of node v under relation r, calculating the importance coefficient of u according to its relationship with the central node [formula not available in the source text];
(3) Judging, according to the confidence vector B, whether each neighbor of the node belongs to the same category as the node, and placing the homogeneous neighbors of node v under relation r into one set and the heterogeneous neighbors into another set;
(4) Normalizing the importance coefficients of the two categories of neighbors separately to obtain the attention scores used for aggregation; for a neighbor u of node v under relation r, if node u is of the same category as node v, its attention score is obtained by normalizing over the set of homogeneous neighbors of node v, where exp is the natural exponential function and σ is a nonlinear activation function; similarly, if the neighbor is heterogeneous with node v, its attention score is obtained by normalizing over the set of heterogeneous neighbors of node v [formulas not available in the source text];
(5) According to the attention scores obtained in the previous step, calculating separately the embedding of the homogeneous neighbors and the embedding of the heterogeneous neighbors of the central node v under relation r [formula not available in the source text];
(6) For each node v, obtaining a k-nearest-neighbor graph composed of its k nearest neighbor nodes, based on the Euclidean distance between its feature $z_v$ and the features of the other nodes;
(7) Performing an aggregation operation on the k-nearest-neighbor graph to obtain the k-nearest-neighbor embedding $h_{knn,v}$ of each node v [formula not available in the source text], where k is the number of neighbors selected by the k-nearest-neighbor search, and the remaining symbols of the formula denote a weight matrix and the k-nearest-neighbor set of node v;
(8) For each node v, fusing its homogeneous-neighbor embedding, its heterogeneous-neighbor embedding and its k-nearest-neighbor embedding $h_{knn,v}$ to obtain the comprehensive embedding of node v under relation r;
(9) Introducing a multi-head attention mechanism: repeating steps (1) to (8) H times and concatenating the resulting embeddings to obtain the multi-head attention feature of node v under relation r;
(10) Integrating the features of node v under the multiple relations into $h'_v$ by concatenation followed by a linear transformation;
After stacking two layers of the graph neural network, the output $h_{v,out}$ of the last layer is obtained; the output is a two-dimensional vector whose 1st dimension represents the confidence that the node is a spam comment sender and whose 2nd dimension represents the confidence that it is a normal user; applying the softmax operation to $h_{v,out}$ yields the probabilities that the node is an abnormal node and a normal node; when the 1st-dimension value of $h_{v,out}$ is larger than the 2nd-dimension value, the node is judged to be an abnormal node; otherwise, the node is judged to be a normal node;
step three, iterative optimization
(1) Inputting the whole graph into the graph neural network to obtain the output $h_{out}$, which is the vertical concatenation of all $h_{v,out}$ (one row per node);
(2) Undersampling the training labels to obtain the set of nodes participating in the loss calculation, so that the numbers of normal and abnormal nodes participating in the loss calculation are similar, avoiding the influence of imbalance between the 0 and 1 labels;
(3) Calculating the cross-entropy loss over the sampled nodes and updating the network parameters by gradient descent [formula not available in the source text], where $y_v$ is the label of node v;
(4) Updating the confidence vector B according to the model output $h_{out}$: the positions in B corresponding to rows of $h_{out}$ whose 1st-dimension value is greater than the 2nd-dimension value are set to 1, and the remaining positions are set to 0;
step four, outputting the categories of unlabeled users
(1) Obtaining the model output $h_{out}$ and taking out the rows corresponding to the unlabeled nodes;
(2) If the 1st-dimension value of the row of $h_{out}$ corresponding to node i is larger than the 2nd-dimension value, the node is a spam comment sender; otherwise, it is a normal user.
The invention first models users as nodes and the interaction relationships among users as edges to establish a graph structure, while a small number of spam comment senders are manually labeled. A graph neural network is then built, consisting mainly of three parts: neighborhood feature extraction, global feature extraction, and feature fusion. The network finally outputs a two-dimensional vector, whose first dimension can be regarded as the probability that the user is a spam comment sender and whose second dimension can be regarded as the probability that the user is a normal user. Iterative optimization is then carried out with a gradient descent algorithm: in each iteration, the loss of the network is calculated from the label information using cross entropy, and the network parameters are updated by gradient descent according to the loss. Finally, after the loss converges, the output of the network is taken as the detection result. The invention has the following characteristics: little manually labeled data is required, and the ability to detect camouflaged spam comment senders is high.
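The softmax and cross-entropy computation mentioned above can be illustrated with a small worked example. This is a minimal sketch, not the patented implementation: the function names are hypothetical, and the mapping of label 1 to the first output dimension is an assumption that follows the dimension convention stated in the text (1st dimension for spam comment senders, 2nd dimension for normal users).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy loss for one node.

    Assumption: label 1 (spam comment sender) corresponds to the 1st output
    dimension and label 0 (normal user) to the 2nd output dimension.
    """
    probs = softmax(logits)
    target = 0 if label == 1 else 1
    return -math.log(probs[target])


# Hypothetical two-dimensional output of the network for one node.
h_v_out = [2.0, 0.0]
probs = softmax(h_v_out)
loss_if_spam = cross_entropy(h_v_out, 1)
loss_if_normal = cross_entropy(h_v_out, 0)
```

With this output the loss is smaller when the true label is 1, because the first dimension carries the larger score.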
Drawings
FIG. 1 is a flow chart of the steps performed.
Detailed Description
Users interact in various ways on an online social platform. For example, on Dianping, users can interact by commenting on each other, or by commenting on the same item. The invention therefore mainly addresses the detection of spam comment senders on multi-relation graphs. A multi-relation graph consists of a node set, a set of node attributes, and, for each relation r ∈ {1, 2, …, R}, a set of edges; an edge between node $v_i$ and node $v_j$ in the edge set of relation r means that the two nodes are connected under relation r. The specific steps of the invention are as follows:
1) Graph construction and preprocessing
First, a graph structure and an adjacency matrix are built with users as nodes and the interaction relationships among users as edges.
Second, the user attributes are digitized to construct an attribute matrix, each row of which represents the attributes of the corresponding user.
Third, 3% of the data is manually labeled, giving the numbers and labels of the labeled nodes; 1 represents a spam comment sender and 0 represents a normal user.
Fourth, the manually labeled nodes are divided into a training set and a test set at a ratio of 7:3.
Fifth, a confidence vector $B \in \{0,1\}^N$ is established, where N is the number of nodes; a value of 0 at the i-th position indicates that node i is more likely a normal user, and a value of 1 indicates that node i is more likely a spam comment sender. The confidence vector is initialized so that the positions in B corresponding to training-set nodes labeled 0 are 0 and the positions corresponding to training-set nodes labeled 1 are 1. All remaining positions are 0.
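The preprocessing steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the function names, the toy edge list and the toy labels are hypothetical, the graph is treated as undirected, and only a single relation is shown.

```python
def build_adjacency(num_nodes, edges):
    """Return a symmetric 0/1 adjacency matrix (undirected graph assumed)."""
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        adj[i][j] = 1
        adj[j][i] = 1
    return adj


def init_confidence(num_nodes, train_labels):
    """Initialize B: 1 for training nodes labeled 1, 0 for every other node."""
    b = [0] * num_nodes
    for node, label in train_labels.items():
        b[node] = 1 if label == 1 else 0
    return b


# Toy data (hypothetical): 5 users, 4 interactions, 2 labeled training nodes.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
train_labels = {0: 0, 3: 1}  # node id -> label (1 = spam sender, 0 = normal)

A = build_adjacency(5, edges)
B = init_confidence(5, train_labels)
```

For a multi-relation graph, one such adjacency matrix would be built per relation.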
2) Graph neural network construction
The graph neural network comprises two layers, and the output dimension of the last layer is 2. The 1st dimension represents the confidence with which the network judges the node to be a spam comment sender, and the 2nd dimension represents the confidence with which it judges the node to be a normal user.
Conventional spam comment sender detection often ignores the camouflage of spam comment senders. A graph neural network obtains the features of a node by aggregating the features of its neighbors. If a spam comment sender interacts with a large number of normal users, it may obtain features similar to those of normal users after passing through the network. The method therefore takes the category information of the neighbors into account when extracting node features and applies different feature aggregation strategies to neighbors of different categories.
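The idea of category-aware aggregation can be sketched as follows. This is a simplified illustration, not the patented formula: the importance coefficients are assumed to be given (their formula is not available in this text), the nonlinear activation is omitted, and all function names and toy values are hypothetical.

```python
import math


def split_neighbors(v, neighbors, B):
    """Split the neighbors of v by whether their confidence entry equals B[v]."""
    same = [u for u in neighbors if B[u] == B[v]]
    diff = [u for u in neighbors if B[u] != B[v]]
    return same, diff


def group_softmax(scores):
    """Normalize importance coefficients inside one neighbor group."""
    if not scores:
        return []
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(group, weights, Z, dim):
    """Attention-weighted sum of the feature vectors of one neighbor group."""
    out = [0.0] * dim
    for u, w in zip(group, weights):
        for d in range(dim):
            out[d] += w * Z[u][d]
    return out


# Toy data (hypothetical): node 0 has neighbors 1, 2 and 3.
B = [0, 0, 1, 1]
Z = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [2.0, 0.0], 3: [0.0, 2.0]}
importance = {1: 0.5, 2: 0.2, 3: 0.2}  # assumed precomputed coefficients

same, diff = split_neighbors(0, [1, 2, 3], B)
w_same = group_softmax([importance[u] for u in same])
w_diff = group_softmax([importance[u] for u in diff])
h_same = aggregate(same, w_same, Z, dim=2)
h_diff = aggregate(diff, w_diff, Z, dim=2)
```

Because the two groups are normalized separately, a large number of heterogeneous neighbors does not dilute the weights of the homogeneous ones.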
Each layer of the graph neural network comprises the following process:
Algorithm 1: Graph neural network fusing edge types
First step, a fully connected layer is used to reduce the dimension of the feature $h_u$ of user u, giving the dimension-reduced user feature $z_u$. The specific formula is as follows:
$z_u = W_t h_u$
where $W_t \in \mathbb{R}^{d_{out} \times d_{in}}$ is the weight matrix of the fully connected layer, $d_{in}$ is the input dimension of the layer, and $d_{out}$ is the output dimension of the layer.
Second step, node v is regarded as the central node, and for each neighbor u of node v under relation r, the importance coefficient of u is calculated according to its relationship with the central node [formula not available in the source text].
Third step, according to the confidence vector B, each neighbor of the node is judged to be of the same category as the node or not; the homogeneous neighbors of node v under relation r are placed into one set and the heterogeneous neighbors into another set.
Fourth step, the importance coefficients of the two categories of neighbors are normalized separately to obtain the attention scores used for aggregation. For a neighbor u of node v under relation r, if node u is of the same category as node v, its attention score is obtained by normalizing over the set of homogeneous neighbors of node v, where exp is the natural exponential function and σ is an arbitrary nonlinear activation function. Similarly, if the neighbor is heterogeneous with node v, its attention score is obtained by normalizing over the set of heterogeneous neighbors of node v [formulas not available in the source text].
Fifth step, according to the attention scores calculated in the previous step, the embedding of the homogeneous neighbors and the embedding of the heterogeneous neighbors of the central node v under relation r are calculated separately [formula not available in the source text].
Sixth step, for each node v, a k-nearest-neighbor graph composed of its k nearest neighbor nodes is obtained, based on the Euclidean distance between its feature $z_v$ and the features of the other nodes.
Seventh step, an aggregation operation is performed on the k-nearest-neighbor graph to obtain the k-nearest-neighbor embedding $h_{knn,v}$ of each node v [formula not available in the source text], where k is the number of neighbors selected by the k-nearest-neighbor search, typically 2, and the remaining symbols of the formula denote a weight matrix and the k-nearest-neighbor set of node v.
Eighth step, for each node v, its homogeneous-neighbor embedding, heterogeneous-neighbor embedding and k-nearest-neighbor embedding $h_{knn,v}$ are fused to obtain the comprehensive embedding of node v under relation r [formula not available in the source text]. The fusion uses a linear transformation matrix that integrates the embedding of the node itself, the homogeneous-neighbor embedding, the heterogeneous-neighbor embedding and the k-nearest-neighbor embedding into a $d_{out}$-dimensional vector; $\|$ denotes the concatenation operation.
Ninth step, a multi-head attention mechanism is introduced: the first to eighth steps are repeated H times and the resulting embeddings are concatenated to obtain the multi-head attention feature of node v under relation r. The recommended value of H is 4.
Tenth step, the features of node v under the multiple relations are integrated into $h'_v$ directly by concatenation followed by a linear transformation [formula not available in the source text].
The above is the operation flow of one layer of the graph neural network. After two such layers are stacked, the output $h_{v,out}$ of the last layer is a two-dimensional vector. The 1st dimension represents the confidence that the node is a spam comment sender, and the 2nd dimension represents the confidence that it is a normal user. Applying the softmax operation to $h_{v,out}$ yields the probabilities that the node is an abnormal node and a normal node. When the 1st-dimension value of $h_{v,out}$ is larger than the 2nd-dimension value, the node is judged to be an abnormal node; otherwise, the node is judged to be a normal node.
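The k-nearest-neighbor selection of the sixth step can be sketched with a brute-force search. This is an illustrative sketch only: the function name and the toy features are hypothetical, ties are broken by node id, and the subsequent weighted aggregation of the seventh step is not shown because its formula is not available in this text.

```python
import math


def knn_graph(Z, k):
    """For each node, return the ids of its k nearest nodes by Euclidean distance."""
    neighbors = {}
    for v, zv in enumerate(Z):
        # Distance from v to every other node, paired with the node id.
        dists = [(math.dist(zv, zu), u) for u, zu in enumerate(Z) if u != v]
        dists.sort()
        neighbors[v] = [u for _, u in dists[:k]]
    return neighbors


# Toy dimension-reduced features z_v (hypothetical values).
Z = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0], [0.0, 0.2]]
knn = knn_graph(Z, k=2)
```

The brute-force search costs quadratic time in the number of nodes; a large graph would need an approximate or indexed search instead.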
3) Iterative optimization
First step, the whole graph is input into the graph neural network to obtain the output $h_{out}$, which is the vertical concatenation of all $h_{v,out}$ (one row per node).
Second step, the training labels are undersampled to obtain the set of nodes participating in the loss calculation, so that the numbers of normal and abnormal nodes participating in the loss calculation are similar, avoiding the influence of imbalance between the 0 and 1 labels.
Third step, the cross-entropy loss over the sampled nodes is calculated and the network parameters are updated by gradient descent [formula not available in the source text], where $y_v$ is the label of node v.
Fourth step, the confidence vector B is updated according to the model output $h_{out}$: the positions in B corresponding to rows of $h_{out}$ whose 1st-dimension value is greater than the 2nd-dimension value are set to 1, and the remaining positions are set to 0.
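The undersampling and confidence-update steps can be sketched as follows. This is a minimal sketch under assumptions: the text only requires the two classes to be of similar size, so sampling the majority class down to exactly the minority size is one possible choice; the function names, the seed and the toy values are hypothetical.

```python
import random


def undersample(train_labels, seed=0):
    """Keep the whole minority class and an equal-sized random sample of the majority class."""
    rng = random.Random(seed)
    pos = [n for n, y in train_labels.items() if y == 1]
    neg = [n for n, y in train_labels.items() if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    sampled = rng.sample(majority, len(minority))
    return sorted(minority + sampled)


def update_confidence(h_out):
    """Set B[i] = 1 where the 1st output dimension exceeds the 2nd, else 0."""
    return [1 if row[0] > row[1] else 0 for row in h_out]


# Toy training labels (hypothetical): four normal users, two spam senders.
train_labels = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
picked = undersample(train_labels)

# Toy model output with one row per node (hypothetical values).
new_B = update_confidence([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
```

Resampling in every iteration lets different normal nodes contribute to the loss over the course of training.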
4) Label-free user class output
First, the model output $h_{out}$ is obtained and the rows corresponding to the unlabeled nodes are taken out.
Second, if the 1st-dimension value of the row of $h_{out}$ corresponding to node i is larger than the 2nd-dimension value, the node is a spam comment sender; otherwise, it is a normal user.
Third, if metrics such as the accuracy of the model are needed, they are calculated from the labels of the test set and the corresponding outputs.
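The output rule and a test-set metric can be sketched together. This is an illustrative sketch: the function names and toy values are hypothetical, and recall is chosen as the example metric only because it is the metric reported in this document.

```python
def predict(h_out, nodes):
    """Label a node 1 (spam sender) if its 1st output dimension exceeds the 2nd, else 0."""
    return {i: 1 if h_out[i][0] > h_out[i][1] else 0 for i in nodes}


def recall(pred, labels):
    """Fraction of true spam senders (label 1) that are predicted as spam senders."""
    positives = [n for n, y in labels.items() if y == 1]
    if not positives:
        return 0.0
    hits = sum(1 for n in positives if pred[n] == 1)
    return hits / len(positives)


# Toy model output and test labels (hypothetical values).
h_out = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.4, 0.6]]
test_labels = {0: 1, 1: 0, 2: 1, 3: 1}

pred = predict(h_out, test_labels.keys())
test_recall = recall(pred, test_labels)
```

The same `predict` function applies unchanged to the unlabeled nodes, for which no metric can be computed.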
The method is applicable to spam comment sender detection tasks on various online platforms and can effectively detect camouflaged spam comment senders. On comment data for Amazon instrument products, the invention builds a graph structure with users as nodes, the interaction relationships among users as edges, and user attributes as node features, and outputs a candidate list of spam comment senders after iterative training with the graph neural network. The recall can reach 90%.
Claims (1)
1. A method for detecting spam comment users in an online social network, comprising the following steps:
step one, graph construction and preprocessing
(1) Establishing a graph structure by taking online social platform users as nodes and the interaction relationship among the users as edges, and constructing an adjacency matrix;
(2) Digitizing the user attributes, constructing an attribute matrix, wherein each row of the attribute matrix represents the attribute of the corresponding user;
(3) Manually labeling part of the data, giving the numbers and labels of the labeled nodes, where 1 represents a spam comment sender and 0 represents a normal user, and dividing the labeled nodes into a training set and a test set;
(4) Establishing a confidence vector $B \in \{0,1\}^N$, where N is the number of nodes; a value of 0 at the i-th position indicates that node i is more likely a normal user, and a value of 1 indicates that node i is more likely a spam comment sender; initializing the confidence vector so that the positions in B corresponding to training-set nodes labeled 0 are 0, the positions corresponding to training-set nodes labeled 1 are 1, and all remaining positions are 0;
step two, constructing a graph neural network
The graph neural network comprises two layers, the output dimension of the last layer is 2, the 1 st dimension represents the confidence that the neural network judges the node as a spam comment sender, the 2 nd dimension represents the confidence that the neural network judges the node as a normal user, the graph neural network acquires the characteristics of the node by aggregating the characteristics of the node neighbors, the category information of the neighbors is considered when the node characteristics are extracted, and different characteristic aggregation strategies are executed for the neighbors of different types;
each layer of the graph neural network comprises the following process:
(1) Using a fully connected layer to reduce the dimension of the feature $h_u$ of user u, obtaining the dimension-reduced user feature $z_u$ by the formula:
$z_u = W_t h_u$
where $W_t \in \mathbb{R}^{d_{out} \times d_{in}}$ is the weight matrix of the fully connected layer, $d_{in}$ is the input dimension of the layer, and $d_{out}$ is the output dimension of the layer;
(2) Regarding node v as the central node, for each neighbor u of node v under relation r, calculating the importance coefficient of u according to its relationship with the central node [formula not available in the source text];
(3) Judging, according to the confidence vector B, whether each neighbor of the node belongs to the same category as the node, and placing the homogeneous neighbors of node v under relation r into one set and the heterogeneous neighbors into another set;
(4) Normalizing the importance coefficients of the two categories of neighbors separately to obtain the attention scores used for aggregation; for a neighbor u of node v under relation r, if node u is of the same category as node v, its attention score is obtained by normalizing over the set of homogeneous neighbors of node v, where exp is the natural exponential function and σ is a nonlinear activation function; similarly, if the neighbor is heterogeneous with node v, its attention score is obtained by normalizing over the set of heterogeneous neighbors of node v [formulas not available in the source text];
(5) According to the attention scores obtained in the previous step, calculating separately the embedding of the homogeneous neighbors and the embedding of the heterogeneous neighbors of the central node v under relation r [formula not available in the source text];
(6) For each node v, obtaining a k-nearest-neighbor graph composed of its k nearest neighbor nodes, based on the Euclidean distance between its feature $z_v$ and the features of the other nodes;
(7) Performing an aggregation operation on the k-nearest-neighbor graph to obtain the k-nearest-neighbor embedding $h_{knn,v}$ of each node v [formula not available in the source text], where k is the number of neighbors selected by the k-nearest-neighbor search, and the remaining symbols of the formula denote a weight matrix and the k-nearest-neighbor set of node v;
(8) For each node v, fusing its homogeneous-neighbor embedding, its heterogeneous-neighbor embedding and its k-nearest-neighbor embedding $h_{knn,v}$ to obtain the comprehensive embedding of node v under relation r;
(9) Introducing a multi-head attention mechanism: repeating steps (1) to (8) H times and concatenating the resulting embeddings to obtain the multi-head attention feature of node v under relation r;
(10) Integrating the features of node v under the multiple relations into $h'_v$ by concatenation followed by a linear transformation;
After stacking two layers of the graph neural network, the output $h_{v,out}$ of the last layer is obtained; the output is a two-dimensional vector whose 1st dimension represents the confidence that the node is a spam comment sender and whose 2nd dimension represents the confidence that it is a normal user; applying the softmax operation to $h_{v,out}$ yields the probabilities that the node is an abnormal node and a normal node; when the 1st-dimension value of $h_{v,out}$ is larger than the 2nd-dimension value, the node is judged to be an abnormal node; otherwise, the node is judged to be a normal node;
step three, iterative optimization
(1) Inputting the whole graph into the graph neural network to obtain the output $h_{out}$, which is the vertical concatenation of all $h_{v,out}$ (one row per node);
(2) Undersampling the training labels to obtain the set of nodes participating in the loss calculation, so that the numbers of normal and abnormal nodes participating in the loss calculation are similar, avoiding the influence of imbalance between the 0 and 1 labels;
(3) Calculating the cross-entropy loss over the sampled nodes and updating the network parameters by gradient descent [formula not available in the source text], where $y_v$ is the label of node v;
(4) Updating the confidence vector B according to the model output $h_{out}$: the positions in B corresponding to rows of $h_{out}$ whose 1st-dimension value is greater than the 2nd-dimension value are set to 1, and the remaining positions are set to 0;
step four, outputting the categories of unlabeled users
(1) Obtaining the model output $h_{out}$ and taking out the rows corresponding to the unlabeled nodes;
(2) If the 1st-dimension value of the row of $h_{out}$ corresponding to node i is larger than the 2nd-dimension value, the node is a spam comment sender; otherwise, it is a normal user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310148077.2A CN116304311A (en) | 2023-02-22 | 2023-02-22 | Online social network spam comment user detection method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116304311A (en) | 2023-06-23 |
Family
ID=86816079
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310148077.2A Pending CN116304311A (en) | 2023-02-22 | 2023-02-22 | Online social network spam comment user detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116304311A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117828514A (en) * | 2024-03-04 | 2024-04-05 | 清华大学深圳国际研究生院 | User network behavior data anomaly detection method based on graph structure learning |
CN117828514B (en) * | 2024-03-04 | 2024-05-03 | 清华大学深圳国际研究生院 | User network behavior data anomaly detection method based on graph structure learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108805200B (en) | Optical remote sensing scene classification method and device based on depth twin residual error network | |
CN111881350B (en) | Recommendation method and system based on mixed graph structured modeling | |
CN111222332B (en) | Commodity recommendation method combining attention network and user emotion | |
CN113961718A (en) | Knowledge inference method based on industrial machine fault diagnosis knowledge graph | |
CN111292195A (en) | Risk account identification method and device | |
CN114817663B (en) | Service modeling and recommendation method based on class perception graph neural network | |
CN111753207B (en) | Collaborative filtering method for neural map based on comments | |
CN110851491A (en) | Network link prediction method based on multiple semantic influences of multiple neighbor nodes | |
CN110489661B (en) | Social relationship prediction method based on generation of confrontation network and transfer learning | |
CN112417063B (en) | Heterogeneous relation network-based compatible function item recommendation method | |
KR102284436B1 (en) | Method and Device for Completing Social Network Using Artificial Neural Network | |
CN112381179A (en) | Heterogeneous graph classification method based on double-layer attention mechanism | |
CN109447110A (en) | The method of the multi-tag classification of comprehensive neighbours' label correlative character and sample characteristics | |
CN116304311A (en) | Online social network spam comment user detection method | |
CN115687925A (en) | Fault type identification method and device for unbalanced sample | |
CN106997373A (en) | A kind of link prediction method based on depth confidence network | |
CN112819024B (en) | Model processing method, user data processing method and device and computer equipment | |
CN114064627A (en) | Knowledge graph link completion method and system for multiple relations | |
CN114898121A (en) | Concrete dam defect image description automatic generation method based on graph attention network | |
CN113628059A (en) | Associated user identification method and device based on multilayer graph attention network | |
CN112465226A (en) | User behavior prediction method based on feature interaction and graph neural network | |
CN114942998B (en) | Knowledge graph neighborhood structure sparse entity alignment method integrating multi-source data | |
CN113361928B (en) | Crowd-sourced task recommendation method based on heterogram attention network | |
CN112668633B (en) | Adaptive graph migration learning method based on fine granularity field | |
CN109978013A (en) | A kind of depth clustering method for figure action identification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||