CN112732921B - False user comment detection method and system - Google Patents
- Publication number: CN112732921B
- Application number: CN202110070347.3A
- Authority: CN (China)
- Legal status: Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention provides a false user comment detection method and system, comprising the following steps: collecting users' product comments and the subject texts related to the comments, and establishing a user comment data set; pre-training a false user comment detection model with the user comment data set S, the model consisting of a text generator G, a discriminator D and a classifier C; performing adversarial training on the false user comment detection model with the user comment data set S; and inputting a user comment and its subject into the classifier of the false user comment detection model, which outputs the detection result, i.e. whether the user comment is a false comment or a genuine comment. The method yields detection results with higher accuracy.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a false user comment detection method and system.
Background
A false user comment is an untruthful comment written deliberately to promote a product or to damage its reputation and word of mouth. Detecting false user comments is a basic text classification task in natural language processing: the goal is to analyze semantic relationships from the information associated with a user comment and judge whether the comment is false. With the rapid development and gradual maturation of e-commerce platforms, the problem of false user comments has become more and more prominent, and many researchers at home and abroad have begun to work on it.
Early studies of false user comment detection typically employed traditional supervised learning algorithms, which focused on extracting features by methods such as N-grams and LDA to train classifiers. These methods require complicated, cumbersome feature engineering to extract text features. More recently, deep neural network models such as the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) have shown state-of-the-art performance on this task without any laborious feature engineering. Li et al. use a convolutional neural network to build document-level semantic representations for false comment classification; they add an attention mechanism to the CNN, use KL divergence as the weight calculation to compute the importance of each word in a sentence and hence the importance weight of each comment sentence, and combine the weighted sentence vectors into a document vector for classification. Zhao et al. propose embedding word-order features in the convolutional and pooling layers of a CNN to capture semantic features related to the word order of comments, making the CNN better suited to the false comment detection problem. Wang et al. propose an attention-based CNN model that extracts features through a CNN and analyzes both the semantic and the behavioral dimensions of comments, so that the model learns to classify from either perspective, or from both at once. Ren et al. combine convolutional and recurrent neural networks to identify false comments: a convolutional neural network learns sentence representations of comments, a gated recurrent neural network with an attention mechanism combines them to model discourse information and generate document vectors, and the document representations are used directly for false comment identification. Yuan et al. incorporate reviewer and product information for feature extraction and false comment classification, proposing a self-attention-based model that obtains semantic representations by self-attention encoding of the comment text, obtains reviewer-related and product-related representations through vector decomposition, and classifies after combining the features. Li et al. propose false comment detection based on the Graph Convolutional Network (GCN), which uses a heterogeneous graph and a homogeneous graph to acquire local and global information, extracts key features from complex graph-structured information and multi-modal attribute information through aggregation, and combines these key features to adapt to more varied comment environments. Deng et al. propose a PU-learning-based autoencoder model that constructs feature vectors from the metadata associated with input comments, encodes the feature vectors with the autoencoder, uses the K-means clustering distance to determine categories, and then performs PU learning. Aghakhani et al. propose FakeGAN, the first model to introduce a GAN into the false comment detection task; using a SeqGAN-based framework, it generates GAN samples from a small amount of labeled data and uses the large volume of data generated by the GAN to satisfy the huge sample requirements of a classification neural network, achieving good results. Stanton et al. propose SpamGAN, which improves on FakeGAN by reducing the amount of computation and optimizing the reward function, thereby achieving a performance improvement.
Although the introduction of deep learning has greatly improved the performance of false comment detection models, false comments remain hidden and confusing, the number of comments is huge, manual inspection is difficult, and labeled data sets are scarce, so existing deep learning models overfit easily and still leave considerable room for optimization. At the same time, most false comment detection uses only the comment text as its recognition dimension; this single angle makes the detection performance easily disturbed by outlier noise.
Disclosure of Invention
In view of this, the present invention provides a false user comment detection method and system whose detection is not easily disturbed by outlier noise and whose results are more accurate.
The invention is realized by adopting the following scheme: a false user comment detection method specifically comprises the following steps:
step A: collecting users' product comments and the subject texts related to the comments, and establishing a user comment data set S = S_L ∪ S_U, where S_L represents the labeled user comment data set and S_U represents the unlabeled user comment data set;
step B: pre-training a false user comment detection model with the user comment data set S, the model consisting of a text generator G, a discriminator D and a classifier C;
step C: performing adversarial training on the false user comment detection model with the user comment data set S;
step D: inputting the user comment and the topic into the classifier of the false user comment detection model and outputting the detection result for the user comment, i.e. whether the user comment is a false comment or a genuine comment.
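By way of a non-limiting illustration only, the following PyTorch sketch shows how the three components named in steps A-D could be laid out; all class names, dimensions and layer choices here (ToyGenerator, ToyScorer, a single Transformer encoder layer, etc.) are assumptions for exposition, not the patented architecture.

```python
# Minimal runnable toy of the generator/discriminator/classifier layout.
# Sizes and internals are illustrative assumptions only.
import torch
import torch.nn as nn

VOCAB, DIM, N = 1000, 64, 32            # dictionary size, vector dim, length

class ToyGenerator(nn.Module):          # stands in for text generator G
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.gru = nn.GRU(DIM, DIM, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * DIM, VOCAB)

    def forward(self, tokens):          # (B, N) token ids -> word logits
        h, _ = self.gru(self.emb(tokens))
        return self.out(h)              # (B, N, VOCAB)

class ToyScorer(nn.Module):             # shared shape for D and C
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(DIM, 2)

    def forward(self, tokens):          # (B, N) -> sentence class probs
        o = self.enc(self.emb(tokens))              # per-word vectors O
        q = torch.softmax(self.head(o), dim=-1)     # per-word distribution Q
        return q.mean(dim=1)                        # sentence average (B24/B33)

G, D, C = ToyGenerator(), ToyScorer(), ToyScorer()
reviews = torch.randint(0, VOCAB, (8, N))           # toy batch of token ids
print(G(reviews).shape, D(reviews).shape, C(reviews).shape)
```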
Further, the step B specifically includes the steps of:
step B1: pre-training a text generator by using a user comment data set S;
step B2: generating comments with the text generator obtained in step B1, and using them together with the comments in the user comment data set S to pre-train the discriminator and its evaluator;
step B3: the classifier and its evaluator are pre-trained using the user comment data set S.
Further, step B1 specifically includes the following steps:
step B11: traversing the comment training set S, representing each labeled training sample in S_L as s = (r, t, c) and each unlabeled training sample in S_U as s = (r, t), where r represents the comment text, t represents the topic text to which the comment relates, and c is the category label indicating whether the comment is false; segmenting the comment r and the topic t in the training sample s into words and removing stop words, then setting the texts of the comment r and the topic t to fixed lengths N and M respectively: if the number of words in r or t after word segmentation and stop-word removal is smaller than the fixed length, padding with the supplementary symbol <PAD>; if it is larger, truncating;
after word segmentation, stop-word removal and length fixing, the comment r is represented as:

r = {w_1^r, w_2^r, ..., w_RN^r}

where w_i^r is the i-th word in the text of the comment r after word segmentation, stop-word removal and length fixing, i = 1, 2, ..., RN, RN ≤ N;
after word segmentation, stop-word removal and length fixing, the topic t is represented as:

t = {w_1^t, w_2^t, ..., w_TM^t}

where w_i^t is the i-th word in the text of the topic t after word segmentation, stop-word removal and length fixing, i = 1, 2, ..., TM, TM ≤ M;
step B12: encoding the comment text r and the topic text t processed in step B11 to obtain the characterization vectors v_r and v_t of the comment and the topic respectively;
v_r is expressed as:

v_r = {e_1^r, e_2^r, ..., e_N^r}

where e_i^r is the word vector corresponding to the i-th word w_i^r of the comment text, obtained by lookup in the pre-trained word vector matrix E ∈ R^{d×|V|}, i = 1, 2, ..., N; d represents the dimension of a word vector and |V| is the number of words in the dictionary;
v_t is expressed as:

v_t = {e_1^t, e_2^t, ..., e_M^t}

where e_i^t is the word vector corresponding to the i-th word w_i^t of the topic text, obtained by lookup in the pre-trained word vector matrix E ∈ R^{d×|V|}, i = 1, 2, ..., M; d represents the dimension of a word vector and |V| is the number of words in the dictionary;
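As a non-limiting sketch of steps B11-B12 (tokenization, <PAD> id and vocabulary size are assumptions), fixed-length padding/truncation followed by an embedding lookup could look like this in PyTorch:

```python
# Hedged sketch of B11-B12: pad/truncate to a fixed length, then look up
# word vectors in a (here randomly initialized) embedding matrix E.
import torch
import torch.nn as nn

PAD, N, D, VOCAB = 0, 32, 64, 1000      # <PAD> id, fixed length, dims

def to_fixed_length(token_ids, fixed_len=N, pad_id=PAD):
    """Truncate or pad a tokenized, stop-word-filtered text (step B11)."""
    token_ids = token_ids[:fixed_len]
    return token_ids + [pad_id] * (fixed_len - len(token_ids))

embedding = nn.Embedding(VOCAB, D, padding_idx=PAD)  # stands in for matrix E

ids = to_fixed_length([17, 256, 3, 981])             # toy comment, 4 word ids
v_r = embedding(torch.tensor([ids]))                 # characterization v_r
print(v_r.shape)                                     # torch.Size([1, 32, 64])
```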
step B13: applying a linear transformation and an activation function to the topic characterization vector v_t and then using max pooling to extract the characterization vector v_t' of the trunk information of the topic:

v_t' = maxpool(f(W_t ⊙ v_t + b_t))

where v_t' is the characterization vector of the trunk information of the topic, W_t is the weight matrix, ⊙ represents the matrix dot-product operation, and b_t is a bias term;
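A minimal sketch of step B13, assuming a ReLU activation (the patent does not name the activation function):

```python
# Sketch of B13: linear transform + activation, then max pooling over the
# topic words to obtain the trunk vector v_t'. ReLU is an assumption.
import torch
import torch.nn as nn

M, D = 16, 64
v_t = torch.randn(1, M, D)              # topic characterization vector v_t
linear = nn.Linear(D, D)                # weight matrix W_t and bias b_t
v_t_trunk = torch.relu(linear(v_t)).max(dim=1).values  # pool over M words
print(v_t_trunk.shape)                  # torch.Size([1, 64])
```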
step B14: inputting the vector sequence {e_1^r, e_2^r, ..., e_N^r} that forms v_r, one step at a time, into the topic-fusion multi-head attention unit of the generator, the input of the i-th time step being e_i^r; at each time step, combining e_i^r with v_t' through the multi-head attention mechanism to fuse the comment with the topic information and obtain the topic-fused feature vector of the time step, then splicing it with the random noise ẑ to obtain the vector sequence {x_1, x_2, ..., x_i, ..., x_N};
Step B15: inputting the vector sequence {x_1, x_2, ..., x_i, ..., x_N} obtained in step B14 into a bidirectional GRU; at the i-th time step, the forward layer of the bidirectional GRU outputs the hidden layer state vector h_i^f = f(x_i, h_{i-1}^f), i = 1, 2, ..., N, and the backward layer outputs the hidden layer state vector h_i^b = f(x_i, h_{i+1}^b), i = N, ..., 2, 1, where f is the activation function; at each time step, every weight matrix of the GRU is updated with spectral normalization: with W_i^G denoting a weight matrix of the GRU at the i-th time step and σ(W_i^G) its maximum singular value, spectrally normalizing W_i^G gives the weight matrix of the GRU at the (i+1)-th time step, W_{i+1}^G, represented as follows:

W_{i+1}^G = W_i^G / σ(W_i^G)

repeating the above yields the forward hidden layer state vector sequence {h_1^f, h_2^f, ..., h_N^f} and the backward hidden layer state vector sequence {h_1^b, h_2^b, ..., h_N^b};
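The spectral normalization in step B15 divides a weight matrix by its largest singular value. A sketch written out explicitly (torch.nn.utils.spectral_norm offers a power-iteration equivalent for module weights):

```python
# Sketch of the B15 weight update: W_{i+1} = W_i / sigma(W_i), where sigma
# is the maximum singular value of the matrix.
import torch

def spectral_normalize(W: torch.Tensor) -> torch.Tensor:
    sigma = torch.linalg.svdvals(W)[0]  # largest singular value of W
    return W / sigma

W = torch.randn(64, 64)                 # a GRU weight matrix W_i^G
W_next = spectral_normalize(W)          # weight matrix for the next step
print(torch.linalg.svdvals(W_next)[0])  # ~1.0 after normalization
```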
Step B16: connecting the forward and backward hidden layer state vectors to obtain the topic-fused comment characterization vector H = [h_1, ..., h_i, ..., h_N]^T, where h_i is the connection of the forward hidden layer state vector h_i^f and the backward hidden layer state vector h_i^b;
Step B17: linearly transforming the topic-fused comment characterization vector H and inputting it into softmax to obtain the word probability distribution matrix B; randomly sampling according to the word probability distribution matrix B generates the word sequence of the comment text, y = {y_1, y_2, ..., y_i, ..., y_N};
Step B18: the text generator G is trained according to the following objective loss function:

L_G = -E_{(r,t)~S} [ Σ_{i=1}^{N} log G(y_i | y_{1:i-1}, t, c, z; θ_g) ]

where G(y_i | y_{1:i-1}, t, c, z; θ_g) represents the conditional probability calculated by the generator at the target word position, θ_g is the generator's parameter set, c is the class label and z is random noise.
Further, the step B14 is specifically:
first, with X_i representing the input e_i^r of the i-th time step, an orthogonal decomposition of X_i is performed in the direction of the vector v_t' to obtain the topic-related information and the other information in X_i, corresponding respectively to the parallel vector X_i^∥ and the vertical vector X_i^⊥, expressed as:

X_i^∥ = ( v_t' (v_t')^T / ((v_t')^T v_t') ) X_i
X_i^⊥ = X_i − X_i^∥

where X_i^∥ is the parallel vector, X_i^⊥ is the vertical vector, and (v_t')^T represents the transpose of the vector v_t';
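The decomposition above is the standard vector projection. A small sketch (the dimension 64 is an arbitrary assumption):

```python
# Sketch of the orthogonal decomposition of X_i against the topic trunk
# vector v_t': parallel part = projection, vertical part = remainder.
import torch

def orthogonal_decompose(x: torch.Tensor, v: torch.Tensor):
    v_hat = v / v.norm()                # unit vector along the topic trunk
    parallel = (x @ v_hat) * v_hat      # topic-related component of x
    vertical = x - parallel             # component orthogonal to the topic
    return parallel, vertical

x, v = torch.randn(64), torch.randn(64)
par, ver = orthogonal_decompose(x, v)
print(torch.dot(par, ver))              # ~0: the parts are orthogonal
```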
then information screening is carried out with the multi-head attention mechanism: for each attention head, the parallel vector X_i^∥ is linearly transformed to obtain Q^∥, used as Q in the multi-head attention mechanism, and linearly transformed again to obtain K^∥ and V^∥, used as K and V in the multi-head attention mechanism respectively, expressed as:

Q^∥ = X_i^∥ W_Q^∥,  K^∥ = X_i^∥ W_K^∥,  V^∥ = X_i^∥ W_V^∥

where W_Q^∥, W_K^∥ and W_V^∥ are respectively the weight matrices to be trained;
then Q^∥, K^∥ and V^∥ are input into the multi-head attention unit for the multi-head attention calculation, expressed as:

U_i^∥ = MA(Q^∥, K^∥, V^∥) = Concat(head_1, ..., head_H) W_O^∥

where U_i^∥ is the output vector of the multi-head attention mechanism in the parallel direction, MA represents the multi-head attention mechanism, H represents the total number of attention heads, head_i represents the calculation result of the i-th attention head, and W_O^∥ is a weight matrix to be trained;
after that, the softmax function maps U_i^∥ to values between 0 and 1, giving the information gate vector g_i^∥ of the parallel vector X_i^∥ in the parallel direction after the multi-head attention mechanism, expressed as:

g_i^∥ = softmax(U_i^∥);
the vertical vector X_i^⊥ is linearly transformed to obtain Q^⊥ as Q in the multi-head attention mechanism, and linearly transformed again to obtain K^⊥ and V^⊥ as K and V in the attention mechanism respectively; Q^⊥, K^⊥ and V^⊥ are input into the multi-head attention unit for the multi-head attention calculation to obtain U_i^⊥, and the softmax function gives the information gate vector g_i^⊥ of the vertical vector X_i^⊥ in the vertical direction after the multi-head attention mechanism; the two gate vectors g_i^∥ and g_i^⊥ then screen the information of X_i to obtain the topic-fused characterization vector X̃_i of the i-th time step, expressed as:

X̃_i = g_i^∥ ⊙ (X_i^∥ W^∥ + b^∥) + g_i^⊥ ⊙ (X_i^⊥ W^⊥ + b^⊥)

where W^∥ and W^⊥ represent the weight matrices in the parallel and vertical directions respectively, b^∥ and b^⊥ respectively represent the input bias terms in the parallel and vertical directions, and ⊙ represents the matrix dot-product operation;
then X̃_i and the random noise ẑ are spliced to obtain the output vector x_i of the i-th time step, expressed as:

x_i = [X̃_i ; ẑ]

where [· ; ·] represents the connection operation and ẑ is the random noise, expressed as:

ẑ = [z ; c]

where z is obtained by sampling from the random distribution P_z conforming to a standard Gaussian distribution, and the class label c is obtained from the random distribution P_c conforming to a standard Bernoulli distribution; c = 1 represents a genuine comment and c = 0 represents a false comment.
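Gathering the end of step B14 into one sketch: the two gates weight linear transforms of the two components, and the fused vector is spliced with the class-conditioned noise ẑ = [z ; c] (dimensions and gate values here are placeholders):

```python
# Sketch of the B14 gated fusion and noise splicing. The gates g_par/g_ver
# would come from the multi-head attention + softmax described above.
import torch
import torch.nn as nn

D, DZ = 64, 16
lin_par, lin_ver = nn.Linear(D, D), nn.Linear(D, D)  # W, b per direction

def fuse(par, ver, g_par, g_ver, z, c):
    fused = g_par * lin_par(par) + g_ver * lin_ver(ver)  # element-wise gates
    z_hat = torch.cat([z, c], dim=-1)                    # noise + class label
    return torch.cat([fused, z_hat], dim=-1)             # x_i for the GRU

par, ver = torch.randn(1, D), torch.randn(1, D)      # X_i components
g_par, g_ver = torch.rand(1, D), torch.rand(1, D)    # placeholder gates
z = torch.randn(1, DZ)                               # z ~ standard Gaussian
c = torch.bernoulli(torch.full((1, 1), 0.5))         # c ~ standard Bernoulli
x_i = fuse(par, ver, g_par, g_ver, z, c)
print(x_i.shape)                                     # torch.Size([1, 81])
```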
Further, step B2 specifically includes the following steps:
step B21: after the pre-training of the generator G is completed, a comment data set S_G is generated with the generator G; labeled and unlabeled comments are randomly extracted from S_G and S to form the pre-training set S_D of the discriminator D, each training sample in S_D being represented as s = (r, c_D), where r represents the comment text and c_D is the category label indicating whether the comment text was generated by the generator; the training samples in S_D are input into the Transformer-based discriminator D for pre-training;
step B22: for each training sample in S_D, the initial characterization vector v_r of the comment text r is obtained according to steps B11-B12, and the position vectors are added to obtain the position-aware characterization vector v_r^p, expressed as:

v_r^p = {e_1^r + p_1, e_2^r + p_2, ..., e_N^r + p_N}

where p_i is the position vector, obtained by consulting the position vector matrix E_p ∈ R^{d×N}; p_i represents the position coding vector corresponding to the i-th word, d represents the dimension of a position vector, which is the same as the dimension of a word vector, and N is the fixed maximum length of the comment text;
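A sketch of the position-aware encoding in step B22, assuming learned positional embeddings (the patent only fixes the shape E_p ∈ R^{d×N}):

```python
# Sketch of B22: add a position vector p_i to each word vector e_i so the
# Transformer discriminator can perceive word order.
import torch
import torch.nn as nn

N, D = 32, 64
pos_table = nn.Embedding(N, D)          # stands in for matrix E_p
v_r = torch.randn(1, N, D)              # initial characterization vector v_r
positions = torch.arange(N).unsqueeze(0)           # indices 0 .. N-1
v_r_pos = v_r + pos_table(positions)               # position-aware v_r^p
print(v_r_pos.shape)                               # torch.Size([1, 32, 64])
```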
step B23: v_r^p is input into the Transformer network of the discriminator D to obtain the characterization vector O_D of the comment;
Step B24: to ODAfter linear transformation, softmax is input and identification is calculatedClass probability distribution Q of the discriminator D over all words of the commentD:
QD=softmax(ODWD+bD);
In the formula (I), the compound is shown in the specification,representing the actual class probability distribution of comments over all terms, QDThe ith row in the figure represents the actual class probability distribution of the discriminator at the ith word,as a weight matrix, the weight matrix is,is a bias term;
from a review of the class probability distribution Q over all termsDGet the entire sentence about category cDAverage class probability distribution of discriminator:
in the formula (I), the compound is shown in the specification,representing the conditional probability of the class, θ, calculated by the discriminator on the commentsdParameter set, Q, representing discriminator DD iRepresenting the actual category probability distribution of the discriminator on the ith term;
step B25: the comment characterization vector O_D is input into the evaluator D_critic, which consists of a fully connected layer; after a linear transformation of O_D and softmax, the class probability distribution V_D of the comment is obtained:

V_D = softmax(O_D W_D^c + b_D^c)

where V_D represents the target class probability distribution of the comment over all terms and serves as the standard for evaluating the actual class probability distribution Q_D; the i-th row of V_D represents the target class probability distribution of the discriminator on the i-th term; W_D^c is the evaluator weight matrix of the discriminator and b_D^c is a bias term;
step B26: the discriminator is trained with the cross-entropy loss L_D, and the evaluator D_critic is trained with the mean-square-error loss L_Dc:

L_D = -E_{r~S} [log D(c_D | r; θ_d)] − E_{r~G} [log D(c_D | r; θ_d)]

where the first term is the classification loss on the samples of S_D extracted from S and the second term is the classification loss on the samples of S_D extracted from S_G; E_{r~S}[·] denotes the expectation, computed over comments sampled from the data set S, of the cross-entropy loss with respect to category c_D, and E_{r~G}[·] denotes the expectation, computed over comments generated by the generator, of the cross-entropy loss with respect to category c_D;

L_Dc = E[ (1/N) Σ_{i=1}^{N} (V_D^i − Q_D^i)^2 ]

where L_Dc represents the expected value of the mean-square-error loss between the target class probability distribution and the actual class probability distribution of the evaluator, and V_D^i represents the target class probability distribution of the discriminator on the i-th term.
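A sketch of the two B26 objectives with the semantics just described; the real/generated label assignment (1 for comments from S, 0 for generated ones) is an assumption:

```python
# Sketch of B26: cross-entropy for the discriminator on real and generated
# comments, MSE between V_D and Q_D for the evaluator D_critic.
import torch
import torch.nn.functional as F

B, N = 8, 32
logits_real = torch.randn(B, 2)         # discriminator output on S samples
logits_fake = torch.randn(B, 2)         # discriminator output on S_G samples
loss_D = F.cross_entropy(logits_real, torch.ones(B, dtype=torch.long)) \
       + F.cross_entropy(logits_fake, torch.zeros(B, dtype=torch.long))

Q_D = torch.softmax(torch.randn(B, N, 2), -1)  # actual per-word distribution
V_D = torch.softmax(torch.randn(B, N, 2), -1)  # critic target distribution
loss_critic = F.mse_loss(V_D, Q_D)             # mean-square-error loss
print(loss_D.item(), loss_critic.item())
```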
Further, step B3 specifically includes the following steps:
step B31: the labeled data set S_L is used to pre-train the classifier; for each training sample s = (r, t, c) in S_L, the processing steps of B11-B12 are followed to obtain the comment characterization vector v_r and the topic characterization vector v_t, and the processing step of B13 is followed to obtain the trunk information representation v_t' of the topic;
Step B32: according to the processing procedure of B14, the structure vrThe vector sequence of (a) is input into a multi-head attention unit of the fusion topic in turn, andcombining, fusing the comments and the topics through a multi-head attention mechanism to obtain a fusion vector of each time step, and fusing the fusion vector of each time step with random noiseSplicing to obtain the comment characterization vector of the fusion subjectWhereinRepresenting the feature vector of the ith word in the comment feature vector of the fusion subject; query location vector matrix Ep∈Rd×NObtaining a position vectorAndadding to obtain the comment characterization vector of position perceptionInputting the words into a Transformer network to obtain a representation matrix of all the words in the comment
Step B33: to OCAfter linear transformation, softmax is input, and the class probability distribution of all the words of the comment by the classifier is calculated
QC=softmax(OCWC+bC);
In the formula (I), the compound is shown in the specification,as a weight matrix, the weight matrix is,is a bias term;
according to QCGet the average class probability distribution of the classifier for the whole sentence with respect to class c:
in the formula, QC iRepresenting the probability distribution of the actual category of the comment on the ith word,representing the category conditional probability obtained by the evaluator through calculation on the comments;
the classifier is pre-trained with the cross-entropy loss L_C, calculated as follows:

L_C = -E_{(r,t,c)~S_L} [log C(c | r, t; θ_c)]

where E_{(r,t,c)~S_L}[·] denotes the expectation, computed over samples drawn from the data set S_L, of the cross-entropy loss with respect to class c; C(c | r, t; θ_c) represents the class conditional probability the classifier calculates on the comment, and θ_c represents the classifier parameters;
step B34: the comment characterization vector O_C is input into the evaluator C_critic, which consists of a fully connected layer; after a linear transformation of O_C and softmax, the target distribution V_C for the actual class probability distribution is obtained, expressed as:

V_C = softmax(O_C W_C^c + b_C^c)

where W_C^c is the evaluator weight matrix of the classifier and b_C^c is a bias term; V_C^i represents the target class probability distribution of the comment at the i-th word with respect to class c.
Further, the step C specifically includes the steps of:
step C1: each training sample in the data set S is traversed; for each training sample, the comment characterization vector v_r and the topic characterization vector v_t are obtained according to the processing steps of B11-B12, and the trunk information representation v_t' of the topic is obtained according to the processing step of B13;
Step C2: for each of the data sets STraining samples, with generators, from random distribution PzAnd randomly distributing PcRespectively sampling to obtain random noise z and class c to obtain noise containing class informationExpressed as:
step C3: according to the processing procedure of B14, the vector sequence forming v_r is input, one step at a time, into the topic-fusion multi-head attention unit and combined with v_t'; the comment and the topic are fused through the multi-head attention mechanism to obtain the fusion vector of each time step, which is spliced with the random noise ẑ to obtain the topic-fused comment characterization vector v_r^{FG} = {x_1^{FG}, x_2^{FG}, ..., x_N^{FG}}, where x_i^{FG} represents the characterization vector of the i-th word and the superscript FG denotes the topic-fusion multi-head attention calculation on the generator input; a comment y is then generated according to the processing steps of B15-B17;
step C4: y and the corresponding training sample in the data set S are input together into the discriminator and the classifier, which classify the comment respectively; the parameters of the discriminator are updated with the loss function L_D, and the classifier is updated with the adversarial training loss function L_C^adv:

L_C^adv = L_C^S + α L_C^G

where L_C^S is the cross-entropy of the classifier's predicted classification on the labeled samples of the data set S; L_C^G is the loss of the classifier's classification prediction on the comments generated by the generator, in which H(·) represents the Shannon entropy and α is a balance parameter used to balance the influence of the Shannon entropy;
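A sketch of the C4 classifier objective as described: supervised cross-entropy on labeled samples plus an α-weighted Shannon-entropy term on generated comments (the sign convention on the entropy term is an assumption):

```python
# Sketch of the adversarial classifier loss in C4: cross-entropy on labeled
# data plus alpha-weighted Shannon entropy on generator output.
import torch
import torch.nn.functional as F

alpha, B = 0.1, 8
logits_labeled = torch.randn(B, 2)      # classifier logits on labeled S_L
labels = torch.randint(0, 2, (B,))      # class c: 1 genuine, 0 false
loss_sup = F.cross_entropy(logits_labeled, labels)

probs_gen = torch.softmax(torch.randn(B, 2), -1)   # classifier on G output
shannon = -(probs_gen * probs_gen.clamp_min(1e-8).log()).sum(-1).mean()
loss_C = loss_sup + alpha * shannon     # balanced by parameter alpha
print(loss_C.item())
```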
step C5: training the generator in a reinforcement learning mode.
Further, step C5 specifically includes:
the process by which the generator produces a comment is regarded as a sequential decision process, the generator serving as the agent, or actor, in reinforcement learning; during generation, the already generated term sequence {y_1, y_2, ..., y_{i-1}} is regarded as the state the agent is currently in, and the next word y_i to be generated is the action taken by the agent; the action taken by the agent is selected according to the policy distribution G(y_i | y_{1:i-1}, t, c, z; θ_g), which gives the probability of each action by calculating its expected reward; the agent selects the corresponding action according to these probabilities, and the generator agent learns to maximize the expected reward, namely:

J(θ_g) = E_{y~G(·; θ_g)} [R(y)]

where R(·) represents the reward of the whole comment sample, determined and provided jointly by the discriminator and the classifier; D(c_D | r; θ_d) represents the class conditional probability calculated by the discriminator on the comment, and C(c | r, t; θ_c) represents the class conditional probability calculated by the classifier on the comment;
to maximize J(θ_g), the generator learns step by step through a policy gradient algorithm to adjust its parameters θ_g, expressed as:

∇_{θ_g} J = E[ Σ_{i=1}^{N} β (Q_i − V_i) ∇_{θ_g} log G(y_i | y_{1:i-1}, t, c, z; θ_g) ]

where Q_i − V_i is the advantage function and β is a linearly decreasing parameter, β = N − i, used when updating the generator parameters θ_g to raise the importance of the words generated first, so that the generator produces more diversified terms in the initial stage of generation.
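A sketch of the C5 update: a REINFORCE-style gradient weighted by the advantage Q_i − V_i and by the linearly decreasing position weight β = N − i (reward and baseline values are placeholders):

```python
# Sketch of C5: policy-gradient update with advantage (Q_i - V_i) and a
# linearly decreasing weight beta that emphasizes early words.
import torch

N = 32
logits = torch.randn(N, requires_grad=True)   # generator scores (stand-in)
log_probs = logits.log_softmax(dim=0)         # log G(y_i | y_1..i-1; theta_g)
Q = torch.rand(N)                             # expected reward per position
V = torch.rand(N)                             # critic baseline per position
beta = torch.arange(N - 1, -1, -1).float()    # beta_i = N - i, decreasing

loss_G = -(beta * (Q - V) * log_probs).sum()  # maximize expected advantage
loss_G.backward()                             # gradients flow to theta_g
print(logits.grad.norm())
```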
The present invention provides a false user comment detection system comprising a memory, a processor and computer program instructions stored on the memory and executable by the processor, the computer program instructions when executed by the processor being capable of implementing the method steps as above.
The present invention also provides a computer readable storage medium having stored thereon computer program instructions executable by a processor, the computer program instructions when executed by the processor being capable of performing the method steps as above.
Compared with the prior art, the invention has the following beneficial effects: the model is not prone to overfitting or mode collapse; because it considers both the comment text and the topic text, its detection performance is not easily disturbed by outlier noise, and the detection results are more accurate.
Drawings
FIG. 1 is a schematic flow chart of a method according to an embodiment of the present invention.
Fig. 2 is a schematic system structure according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in fig. 1, the present embodiment provides a method and a system for detecting false user comments, which specifically include the following steps:
step A: collecting users' product comments and the subject texts related to the comments, and establishing a user comment data set S = S_L ∪ S_U, where S_L represents the labeled user comment data set and S_U represents the unlabeled user comment data set;
step B: pre-training a false user comment detection model with the user comment data set S, the model consisting of a text generator G, a discriminator D and a classifier C;
step C: performing adversarial training on the false user comment detection model with the user comment data set S;
step D: inputting the user comment and the topic into the classifier of the false user comment detection model and outputting the detection result for the user comment, i.e. whether the user comment is a false comment or a genuine comment.
In this embodiment, step B specifically includes the following steps:
step B1: pre-training a text generator by using a user comment data set S;
step B2: generating comments with the text generator obtained in step B1, and using them together with the comments in the user comment data set S to pre-train the discriminator and its evaluator;
step B3: the classifier and its evaluator are pre-trained using the user comment data set S.
In this embodiment, step B1 specifically includes the following steps:
step B11: traversing the comment training set S, representing each labeled training sample in S_L as s = (r, t, c) and each unlabeled training sample in S_U as s = (r, t), where r represents the comment text, t represents the topic text to which the comment relates, and c is the category label indicating whether the comment is false; segmenting the comment r and the topic t in the training sample s into words and removing stop words, then setting the texts of the comment r and the topic t to fixed lengths N and M respectively: if the number of words in r or t after word segmentation and stop-word removal is smaller than the fixed length, padding with the supplementary symbol <PAD>; if it is larger, truncating;
after word segmentation, stop-word removal and length fixing, the comment r is represented as:

r = {w_1^r, w_2^r, ..., w_RN^r}

where w_i^r is the i-th word in the text of the comment r after word segmentation, stop-word removal and length fixing, i = 1, 2, ..., RN, RN ≤ N;
after word segmentation, stop-word removal and length fixing, the topic t is represented as:

t = {w_1^t, w_2^t, ..., w_TM^t}

where w_i^t is the i-th word in the text of the topic t after word segmentation, stop-word removal and length fixing, i = 1, 2, ..., TM, TM ≤ M;
step B12: encoding the comment text r and the topic text t processed in step B11 to obtain the characterization vectors v_r and v_t of the comment and the topic respectively;
v_r is expressed as:

v_r = {e_1^r, e_2^r, ..., e_N^r}

where e_i^r is the word vector corresponding to the i-th word w_i^r of the comment text, obtained by lookup in the pre-trained word vector matrix E ∈ R^{d×|V|}, i = 1, 2, ..., N; d represents the dimension of a word vector and |V| is the number of words in the dictionary;
v_t is expressed as:

v_t = {e_1^t, e_2^t, ..., e_M^t}

where e_i^t is the word vector corresponding to the i-th word w_i^t of the topic text, obtained by lookup in the pre-trained word vector matrix E ∈ R^{d×|V|}, i = 1, 2, ..., M; d represents the dimension of a word vector and |V| is the number of words in the dictionary;
step B13: applying a linear transformation and an activation function to the topic characterization vector v_t and then using max pooling to extract the characterization vector v_t' of the trunk information of the topic:

v_t' = maxpool(f(W_t ⊙ v_t + b_t))

where v_t' is the characterization vector of the trunk information of the topic, W_t is the weight matrix, ⊙ represents the matrix dot-product operation, and b_t is a bias term;
step B14: inputting the vector sequence {e_1^r, e_2^r, ..., e_N^r} that forms v_r, one step at a time, into the topic-fusion multi-head attention unit (TMAU) of the generator, the input of the i-th time step being e_i^r; at each time step, combining e_i^r with v_t' through the multi-head attention mechanism to fuse the comment with the topic information and obtain the topic-fused feature vector of the time step, then splicing it with the random noise ẑ to obtain the vector sequence {x_1, x_2, ..., x_i, ..., x_N};
Step B15: inputting the vector sequence {x_1, x_2, ..., x_i, ..., x_N} obtained in step B14 into a bidirectional GRU; at the i-th time step, the forward layer of the bidirectional GRU outputs the hidden layer state vector h_i^f = f(x_i, h_{i-1}^f), i = 1, 2, ..., N, and the backward layer outputs the hidden layer state vector h_i^b = f(x_i, h_{i+1}^b), i = N, ..., 2, 1, where f is the activation function; at each time step, every weight matrix of the GRU is updated with spectral normalization: with W_i^G denoting a weight matrix of the GRU at the i-th time step and σ(W_i^G) its maximum singular value, spectrally normalizing W_i^G gives the weight matrix of the GRU at the (i+1)-th time step, W_{i+1}^G, represented as follows:

W_{i+1}^G = W_i^G / σ(W_i^G)

repeating the above yields the forward hidden layer state vector sequence {h_1^f, h_2^f, ..., h_N^f} and the backward hidden layer state vector sequence {h_1^b, h_2^b, ..., h_N^b};
Step B16: connecting the forward and backward hidden layer state vectors to obtain the topic-fused comment characterization vector H = [h_1, ..., h_i, ..., h_N]^T, where h_i is the connection of the forward hidden layer state vector h_i^f and the backward hidden layer state vector h_i^b;
Step B17: linearly transforming the topic-fused comment characterization vector H and inputting it into softmax to obtain the word probability distribution matrix B; randomly sampling according to the word probability distribution matrix B generates the word sequence of the comment text, y = {y_1, y_2, ..., y_i, ..., y_N};
Step B18: the text generator G is trained according to the following objective loss function:

L_G = -E_{(r,t)~S} [ Σ_{i=1}^{N} log G(y_i | y_{1:i-1}, t, c, z; θ_g) ]

where G(y_i | y_{1:i-1}, t, c, z; θ_g) represents the conditional probability calculated by the generator at the target word position, θ_g is the generator's parameter set, c is the class label and z is random noise.
In this embodiment, the step B14 specifically includes:
first, with X_i representing the input e_i^r of the i-th time step, an orthogonal decomposition of X_i is performed in the direction of the vector v_t' to obtain the topic-related information and the other information in X_i, corresponding respectively to the parallel vector X_i^∥ and the vertical vector X_i^⊥, expressed as:

X_i^∥ = ( v_t' (v_t')^T / ((v_t')^T v_t') ) X_i
X_i^⊥ = X_i − X_i^∥

where X_i^∥ is the parallel vector, X_i^⊥ is the vertical vector, and (v_t')^T represents the transpose of the vector v_t';
then information screening is carried out with the multi-head attention mechanism: for each attention head, the parallel vector X_i^∥ is linearly transformed to obtain Q^∥, used as Q in the multi-head attention mechanism, and linearly transformed again to obtain K^∥ and V^∥, used as K and V in the multi-head attention mechanism respectively, expressed as:

Q^∥ = X_i^∥ W_Q^∥,  K^∥ = X_i^∥ W_K^∥,  V^∥ = X_i^∥ W_V^∥

where W_Q^∥, W_K^∥ and W_V^∥ are respectively the weight matrices to be trained;
then Q^∥, K^∥ and V^∥ are input into the multi-head attention unit for the multi-head attention calculation, expressed as:

U_i^∥ = MA(Q^∥, K^∥, V^∥) = Concat(head_1, ..., head_H) W_O^∥

where U_i^∥ is the output vector of the multi-head attention mechanism in the parallel direction, MA represents the multi-head attention mechanism, H represents the total number of attention heads, head_i represents the calculation result of the i-th attention head, and W_O^∥ is a weight matrix to be trained;
after that, the softmax function maps U_i^∥ to values between 0 and 1, giving the information gate vector g_i^∥ of the parallel vector X_i^∥ in the parallel direction after the multi-head attention mechanism, expressed as:

g_i^∥ = softmax(U_i^∥);
the vertical vector X_i^⊥ is linearly transformed to obtain Q^⊥ as Q in the multi-head attention mechanism, and linearly transformed again to obtain K^⊥ and V^⊥ as K and V in the attention mechanism respectively; Q^⊥, K^⊥ and V^⊥ are input into the multi-head attention unit for the multi-head attention calculation to obtain U_i^⊥, and the softmax function gives the information gate vector g_i^⊥ of the vertical vector X_i^⊥ in the vertical direction after the multi-head attention mechanism; the two gate vectors g_i^∥ and g_i^⊥ then screen the information of X_i to obtain the topic-fused characterization vector X̃_i of the i-th time step, expressed as:

X̃_i = g_i^∥ ⊙ (X_i^∥ W^∥ + b^∥) + g_i^⊥ ⊙ (X_i^⊥ W^⊥ + b^⊥)

where W^∥ and W^⊥ represent the weight matrices in the parallel and vertical directions respectively, b^∥ and b^⊥ respectively represent the input bias terms in the parallel and vertical directions, and ⊙ represents the matrix dot-product operation;
then X̃_i and the random noise ẑ are spliced to obtain the output vector x_i of the i-th time step, expressed as:

x_i = [X̃_i ; ẑ]

where [· ; ·] represents the connection operation and ẑ is the random noise, expressed as:

ẑ = [z ; c]

where z is obtained by sampling from the random distribution P_z conforming to a standard Gaussian distribution, and the class label c is obtained from the random distribution P_c conforming to a standard Bernoulli distribution; c = 1 represents a genuine comment and c = 0 represents a false comment.
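As a non-limiting sketch of the information-gate computation in this embodiment's step B14, using PyTorch's built-in multi-head attention; treating the pass as self-attention over the component vector is an assumption, since the patent leaves the exact K/V source to its figures:

```python
# Sketch of one information gate: multi-head attention over a component
# vector followed by softmax, yielding gate values in [0, 1].
import torch
import torch.nn as nn

D, HEADS = 64, 4
mha = nn.MultiheadAttention(D, HEADS, batch_first=True)

x_par = torch.randn(1, 1, D)            # parallel component X_i (as Q, K, V)
u_par, _ = mha(x_par, x_par, x_par)     # U_i: attention output
g_par = torch.softmax(u_par, dim=-1)    # information gate vector g_i
print(g_par.shape)                      # torch.Size([1, 1, 64])
```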
In this embodiment, step B2 specifically includes the following steps:
step B21: after the pre-training of the generator G is completed, a comment data set S_G is generated with the generator G; labeled and unlabeled comments are randomly extracted from S_G and S to form the pre-training set S_D of the discriminator D, each training sample in S_D being represented as s = (r, c_D), where r represents the comment text and c_D is the category label indicating whether the comment text was generated by the generator; the training samples in S_D are input into the Transformer-based discriminator D for pre-training;
step B22: for each training sample in S_D, the initial characterization vector v_r of the comment text r is obtained according to steps B11-B12, and the position vectors are added to obtain the position-aware characterization vector v_r^p, expressed as:

v_r^p = {e_1^r + p_1, e_2^r + p_2, ..., e_N^r + p_N}

where p_i is the position vector, obtained by consulting the position vector matrix E_p ∈ R^{d×N}; p_i represents the position coding vector corresponding to the i-th word, d represents the dimension of a position vector, which is the same as the dimension of a word vector, and N is the fixed maximum length of the comment text;
step B23: v_r^p is input into the Transformer network of the discriminator D to obtain the characterization vector O_D of the comment;
Step B24: to ODAfter linear transformation, softmax is input, and the class probability distribution Q of the discriminator D on all the words of the comment is calculatedD:
QD=softmax(ODWD+bD);
In the formula (I), the compound is shown in the specification,representing the actual class probability distribution of comments over all terms, QDThe ith row in the figure represents the actual class probability distribution of the discriminator at the ith word,as a weight matrix, the weight matrix is,is a bias term;
from a review of the class probability distribution Q over all termsDGet the entire sentence about category cDAverage class probability distribution of discriminator:
in the formula (I), the compound is shown in the specification,representing the conditional probability of the class, θ, calculated by the discriminator on the commentsdParameter set, Q, representing discriminator DD iRepresenting the actual class probability distribution of the discriminator on the ith term;
step B25: the comment characterization vector O_D is input into the evaluator D_critic, which consists of a fully connected layer; after a linear transformation of O_D and softmax, the class probability distribution V_D of the comment is obtained:

V_D = softmax(O_D W_D^c + b_D^c)

where V_D represents the target class probability distribution of the comment over all terms and serves as the standard for evaluating the actual class probability distribution Q_D; the i-th row of V_D represents the target class probability distribution of the discriminator on the i-th term; W_D^c is the evaluator weight matrix of the discriminator and b_D^c is a bias term;
step B26: the discriminator is trained with the cross-entropy loss L_D, and the evaluator D_critic is trained with the mean-square-error loss L_Dc:

L_D = -E_{r~S} [log D(c_D | r; θ_d)] − E_{r~G} [log D(c_D | r; θ_d)]

where the first term is the classification loss on the samples of S_D extracted from S and the second term is the classification loss on the samples of S_D extracted from S_G; E_{r~S}[·] denotes the expectation, computed over comments sampled from the data set S, of the cross-entropy loss with respect to category c_D, and E_{r~G}[·] denotes the expectation, computed over comments generated by the generator, of the cross-entropy loss with respect to category c_D;

L_Dc = E[ (1/N) Σ_{i=1}^{N} (V_D^i − Q_D^i)^2 ]

where L_Dc represents the expected value of the mean-square-error loss between the target class probability distribution and the actual class probability distribution of the evaluator, and V_D^i represents the target class probability distribution of the discriminator on the i-th term.
In this embodiment, step B3 specifically includes the following steps:
step B31: the labeled data set S_L is used to pre-train the classifier; for each training sample s = (r, t, c) in S_L, the processing steps of B11-B12 are followed to obtain the comment characterization vector v_r and the topic characterization vector v_t, and the processing step of B13 is followed to obtain the trunk information representation v_t' of the topic;
Step B32: according to the processing procedure of B14, v is formedrThe vector sequence of (a) is input into a multi-head attention unit of the fusion topic in turn, andcombining, fusing the comments and the topics through a multi-head attention mechanism to obtain a fusion vector of each time step, and fusing the fusion vector of each time step with random noiseSplicing to obtain the comment characterization vector of the fusion subjectWhereinRepresenting the feature vector of the ith word in the comment feature vector of the fusion subject; query location vector matrix Ep∈Rd×NObtaining a position vectorAndadding to obtain the comment characterization vector of position perceptionInputting the words into a Transformer network to obtain a representation matrix of all the words in the comment
Step B33: to OCAfter linear transformation, softmax is input, and the class probability distribution of all the words of the comment is calculated by a classifier
QC=softmax(OCWC+bC);
In the formula (I), the compound is shown in the specification,as a weight matrix, the weight matrix is,is a bias term;
according to QCGet the average class probability distribution of the classifier for the whole sentence with respect to class c:
in the formula, QC iIndicating that the comment is true on the ith wordThe probability distribution of the inter-category,representing the category conditional probability obtained by the evaluator through calculation on the comments;
the classifier is pre-trained with the cross-entropy loss L_C, calculated as follows:

L_C = -E_{(r,t,c)~S_L} [log C(c | r, t; θ_c)]

where E_{(r,t,c)~S_L}[·] denotes the expectation, computed over samples drawn from the data set S_L, of the cross-entropy loss with respect to class c; C(c | r, t; θ_c) represents the class conditional probability the classifier calculates on the comment, and θ_c represents the classifier parameters;
step B34: the comment characterization vector O_C is input into the evaluator C_critic, which consists of a fully connected layer; after a linear transformation of O_C and softmax, the target distribution V_C for the actual class probability distribution is obtained, expressed as:

V_C = softmax(O_C W_C^c + b_C^c)

where W_C^c is the evaluator weight matrix of the classifier and b_C^c is a bias term; V_C^i represents the target class probability distribution of the comment at the i-th word with respect to class c.
In this embodiment, step C specifically includes the following steps:
step C1: each training sample in the data set S is traversed; for each training sample, the comment characterization vector v_r and the topic characterization vector v_t are obtained according to the processing steps of B11-B12, and the trunk information representation v_t' of the topic is obtained according to the processing step of B13;
Step C2: for each training sample in the data set S, the generator is used to randomly distribute PzAnd randomly distributing PcRespectively sampling to obtain random noise z and class c to obtain noise containing class informationExpressed as:
step C3: according to the processing procedure of B14, the vector sequence forming v_r is input, one step at a time, into the topic-fusion multi-head attention unit and combined with v_t'; the comment and the topic are fused through the multi-head attention mechanism to obtain the fusion vector of each time step, which is spliced with the random noise ẑ to obtain the topic-fused comment characterization vector v_r^{FG} = {x_1^{FG}, x_2^{FG}, ..., x_N^{FG}}, where x_i^{FG} represents the characterization vector of the i-th word and the superscript FG denotes the topic-fusion multi-head attention calculation on the generator input; a comment y is then generated according to the processing steps of B15-B17;
step C4: y and the corresponding training sample in the data set S are input together into the discriminator and the classifier, which classify the comment respectively; the parameters of the discriminator are updated with the loss function L_D, and the classifier is updated with the adversarial training loss function L_C^adv:

L_C^adv = L_C^S + α L_C^G

where L_C^S is the cross-entropy of the classifier's predicted classification on the labeled samples of the data set S; L_C^G is the loss of the classifier's classification prediction on the comments generated by the generator, in which H(·) represents the Shannon entropy and α is a balance parameter used to balance the influence of the Shannon entropy;
step C5: training the generator in a reinforcement learning mode.
In this embodiment, step C5 specifically includes:
the process by which the generator produces a comment is regarded as a sequential decision process, the generator serving as the agent, or actor, in reinforcement learning; during generation, the already generated term sequence {y_1, y_2, ..., y_{i-1}} is regarded as the state the agent is currently in, and the next word y_i to be generated is the action taken by the agent; the action taken by the agent is selected according to the policy distribution G(y_i | y_{1:i-1}, t, c, z; θ_g), which gives the probability of each action by calculating its expected reward; the agent selects the corresponding action according to these probabilities, and the generator agent learns to maximize the expected reward, namely:

J(θ_g) = E_{y~G(·; θ_g)} [R(y)]

where R(·) represents the reward of the whole comment sample, determined and provided jointly by the discriminator and the classifier; D(c_D | r; θ_d) represents the class conditional probability calculated by the discriminator on the comment, and C(c | r, t; θ_c) represents the class conditional probability calculated by the classifier on the comment;
to maximize J(θ_g), the generator learns step by step through a policy gradient algorithm to adjust its parameters θ_g, expressed as:

∇_{θ_g} J = E[ Σ_{i=1}^{N} β (Q_i − V_i) ∇_{θ_g} log G(y_i | y_{1:i-1}, t, c, z; θ_g) ]

where Q_i − V_i is the advantage function and β is a linearly decreasing parameter, β = N − i, used when updating the generator parameters θ_g to raise the importance of the words generated first, so that the generator produces more diversified terms in the initial stage of generation.
The present embodiment also provides a false user comment detection system comprising a memory, a processor, and computer program instructions stored on the memory and executable by the processor; when executed by the processor, the computer program instructions implement the method steps described above.
The present embodiment also provides a computer-readable storage medium storing computer program instructions executable by a processor; when executed by the processor, the computer program instructions perform the method steps described above.
Preferably, as shown in fig. 2, the present embodiment correspondingly includes the following functional modules:
the data collection module is used for extracting the user comments and the theme information related to the comments, labeling the false category labels of the comments and constructing a training set;
the text preprocessing module is used for preprocessing the training samples in the training set, including unifying letter case, performing word segmentation and removing stop words;
the text coding module is used for searching word vectors of words in the preprocessed user comments and topics in the pre-trained word vector matrix to obtain the characteristic vectors of the user comments and the characteristic vectors of the topics;
and the pre-training module is used for inputting the characterization vectors of the user comments and the characterization vectors of the topics into each component of the deep learning network for pre-training respectively to obtain a pre-trained deep network model.
The adversarial training module is used for inputting the characterization vectors of the user comments and of the topics into the modules of the deep learning network, each module obtaining topic-fused comment characterization vectors; the deep learning network is trained through reinforcement learning, using the probability that a characterization vector belongs to a certain class together with the labels in the training set as the loss, and the overall deep learning network is trained with the goal of minimizing this loss, yielding an adversarially trained deep learning network model;
and the false comment analysis module is used for analyzing and processing the input user comments and topics with the adversarially trained deep learning network model and outputting the false category of the user comments.
The foregoing is directed to preferred embodiments of the present invention, and the scope of the invention is determined by the claims that follow. However, any simple modification, equivalent change or adaptation of the above embodiments according to the technical essence of the present invention still falls within the protection scope of the technical solution of the present invention.
Claims (8)
1. A false user comment detection method is characterized by comprising the following steps:
step A: collecting users' product comments and the subject texts related to the comments, and establishing a user comment data set S = S_L ∪ S_U, where S_L represents the labeled user comment data set and S_U represents the unlabeled user comment data set;
step B: pre-training a false user comment detection model with the user comment data set S, the model consisting of a text generator G, a discriminator D and a classifier C;
step C: performing adversarial training on the false user comment detection model with the user comment data set S;
step D: inputting the user comment and the topic into the classifier of the false user comment detection model and outputting the detection result for the user comment, i.e. whether the user comment is a false comment or a genuine comment;
the step B specifically comprises the following steps:
step B1: pre-training a text generator by using a user comment data set S;
step B2: generating comments with the text generator obtained in step B1, and using them together with the comments in the user comment data set S to pre-train the discriminator and its evaluator;
step B3: pre-training the classifier and the evaluator thereof by using a user comment data set S;
step B1 specifically includes the following steps:
step B11: traversing the comment training set S, and dividing SLIs given as S ═ r, t, c, let S denoteUIs represented as s ═ r, t, where r represents the comment text, t represents the subject text to which the comment relates, and c is the subject text to which the comment relatesA category label to comment false or not; segmenting the comment r and the subject t in the training sample s and removing stop words, then setting the texts of the comment r and the subject t to be fixed lengths N and M respectively, and if the number of words in the comment r and the subject t after segmentation and removal of the stop words is smaller than the fixed length value, using the supplementary symbols<PAD>Supplementing, and cutting if the length is larger than the fixed length value;
After word segmentation, stop-word removal, and adjustment to the fixed length, the comment r is expressed as:

r = (w_1^r, w_2^r, ..., w_N^r)

where w_i^r is the i-th word of the fixed-length comment text after word segmentation and stop-word removal, i = 1, 2, ..., N;
After word segmentation, stop-word removal, and adjustment to the fixed length, the topic t is expressed as:

t = (w_1^t, w_2^t, ..., w_M^t)

where w_i^t is the i-th word of the fixed-length topic text after word segmentation and stop-word removal, i = 1, 2, ..., M;
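A minimal Python sketch of this fixed-length preprocessing may help; the whitespace tokenizer and the stop-word list are illustrative assumptions, not the patent's own implementation:

```python
PAD = "<PAD>"

def to_fixed_length(tokens, fixed_len):
    """Pad with <PAD> up to fixed_len, or truncate beyond it (step B11)."""
    if len(tokens) < fixed_len:
        return tokens + [PAD] * (fixed_len - len(tokens))
    return tokens[:fixed_len]

def preprocess(text, stop_words, fixed_len):
    # Word segmentation here is a plain whitespace split; a real Chinese
    # segmenter (e.g. jieba) would be used in practice.
    tokens = [w for w in text.split() if w not in stop_words]
    return to_fixed_length(tokens, fixed_len)

N, M = 64, 16   # assumed fixed lengths for comment r and topic t
r_words = preprocess("great product would buy again", {"would"}, N)
t_words = preprocess("wireless earbuds", set(), M)
```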
Step B12: encoding the comment text r and the topic text t processed in step B11 to obtain the characterization vectors v_r of the comment and v_t of the topic;

where v_r is expressed as:

v_r = (v_1^r, v_2^r, ..., v_N^r)

where v_i^r is the word vector corresponding to the i-th word w_i^r of the comment text, obtained by looking it up in a pre-trained word vector matrix E ∈ R^{d×|V|}, i = 1, 2, ..., N; d denotes the dimension of a word vector and |V| is the number of words in the dictionary;
where v_t is expressed as:

v_t = (v_1^t, v_2^t, ..., v_M^t)

where v_i^t is the word vector corresponding to the i-th word w_i^t of the topic text, obtained by looking it up in the pre-trained word vector matrix E ∈ R^{d×|V|}, i = 1, 2, ..., M; d denotes the dimension of a word vector and |V| is the number of words in the dictionary;
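As a sketch of this lookup, under the assumption of a toy vocabulary and random values standing in for a matrix E that would be pre-trained in practice:

```python
import numpy as np

d, vocab = 50, {"<PAD>": 0, "great": 1, "product": 2}
E = np.random.randn(d, len(vocab))   # stands in for the pre-trained matrix E (d x |V|)

def encode(words):
    # Unknown words fall back to <PAD>; the patent does not specify OOV handling.
    return np.stack([E[:, vocab.get(w, 0)] for w in words], axis=0)  # (len, d)

v_r = encode(["great", "product"])   # comment characterization vectors, one row per word
```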
Step B13: for the topic characterization vector v_t, after a linear transformation and an activation function, max pooling is used to extract the characterization vector ṽ_t of the trunk information of the topic:

ṽ_t = maxpooling(f(v_t ⊙ W_t + b_t))

where ṽ_t is the characterization vector of the trunk information of the topic, W_t is a weight matrix, ⊙ denotes the matrix dot product operation, f is the activation function, and b_t is a bias term;
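A minimal sketch of this pooling step, with tanh assumed as the activation function (the patent's formula image is lost) and random values standing in for learned parameters:

```python
import numpy as np

M, d = 16, 50
v_t = np.random.randn(M, d)   # topic word vectors from step B12
W_t = np.random.randn(d, d)   # weight matrix to be trained
b_t = np.zeros(d)             # bias term

# Linear transform + activation per position, then max pooling over the M
# topic positions to keep the strongest feature in each dimension.
trunk = np.max(np.tanh(v_t @ W_t + b_t), axis=0)   # (d,) trunk vector of the topic
```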
Step B14: sequentially inputting the vector sequence (v_1^r, ..., v_N^r) forming v_r into the topic-fusing multi-head attention unit of the generator, the input of the i-th time step being X_i = v_i^r; at each time step X_i and ṽ_t are combined, the comment and the topic information are fused through the multi-head attention mechanism to obtain the topic-fused feature vector of each time step, which is spliced with the random noise z̃ to obtain the vector sequence {x_1, x_2, ..., x_i, ..., x_N};
Step B15: inputting the vector sequence {x_1, x_2, ..., x_i, ..., x_N} obtained in step B14 into a bidirectional GRU; at the i-th time step the forward layer of the bidirectional GRU outputs the hidden state vector h_i^fwd = f(h_{i-1}^fwd, x_i), and the reverse layer outputs the hidden state vector h_i^bwd = f(h_{i+1}^bwd, x_i), where f is the activation function; at each time step every weight matrix of the GRU is updated with spectral normalization: letting W_i^G denote a weight matrix of the GRU at the i-th time step and σ(W_i^G) its maximum singular value, spectral normalization of W_i^G gives the weight matrix of the GRU at the (i+1)-th time step:

W_{i+1}^G = W_i^G / σ(W_i^G)

Repeating the above steps yields the forward hidden state vector sequence {h_1^fwd, ..., h_N^fwd} and the reverse hidden state vector sequence {h_1^bwd, ..., h_N^bwd};
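A sketch of the spectral-normalization update; the patent only states that the maximum singular value is used, so power iteration here is an assumed (and standard) way of approximating it:

```python
import numpy as np

def spectral_normalize(W, n_iter=5):
    """Divide W by an estimate of its largest singular value sigma(W)."""
    u = np.random.randn(W.shape[0])
    for _ in range(n_iter):          # power iteration for sigma(W)
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ W @ v                # approximate maximum singular value
    return W / sigma

W_next = spectral_normalize(np.random.randn(128, 128))   # GRU weight for step i+1
```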
Step B16: connecting the forward and reverse hidden state vectors to obtain the topic-fused comment characterization vector H = [h_1, ..., h_i, ..., h_N]^T, where h_i is the concatenation of the forward hidden state vector h_i^fwd and the reverse hidden state vector h_i^bwd;
Step B17: linearly transforming the topic-fused comment characterization vector H and inputting it into softmax to obtain a word probability distribution matrix B; random sampling according to B generates the word sequence y = {y_1, y_2, ..., y_i, ..., y_N} of the comment text;
Step B18: training the text generator G according to its target loss function.
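A sketch of steps B17-B18; the loss formula in the published text is an image and is not recoverable here, so the maximum-likelihood (negative log-likelihood) objective below is an assumption consistent with standard SeqGAN-style generator pre-training:

```python
import numpy as np

rng = np.random.default_rng(0)
N, V = 4, 10                        # toy sequence length and vocabulary size
logits = rng.normal(size=(N, V))    # stands in for the linearly transformed H of step B17

B = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # row-wise softmax
y = [rng.choice(V, p=B[i]) for i in range(N)]                   # sampled word sequence
nll = -np.mean([np.log(B[i, y[i]]) for i in range(N)])          # assumed MLE pre-training loss
```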
2. The false user comment detection method according to claim 1, wherein step B14 is specifically as follows:
First, letting X_i denote the input of the i-th time step, an orthogonal decomposition of X_i in the direction of the vector ṽ_t separates the topic-related information in X_i from the other information, corresponding respectively to the parallel vector X_i^∥ and the vertical vector X_i^⊥:

X_i^∥ = ((X_i^T ṽ_t) / (ṽ_t^T ṽ_t)) ṽ_t
X_i^⊥ = X_i − X_i^∥

where X_i^∥ is the parallel vector, X_i^⊥ is the vertical vector, and X_i^T denotes the transpose of the vector X_i;
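A minimal sketch of this standard vector projection:

```python
import numpy as np

def orthogonal_decompose(x, v):
    """Split x into its projection onto v and the orthogonal remainder."""
    parallel = (x @ v) / (v @ v) * v   # component of x along v
    perpendicular = x - parallel       # remainder, orthogonal to v
    return parallel, perpendicular

x_i = np.array([3.0, 1.0])
v_t = np.array([1.0, 0.0])
x_par, x_perp = orthogonal_decompose(x_i, v_t)   # -> [3, 0] and [0, 1]
```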
Then information screening is carried out with a multi-head attention mechanism: for each attention head h, the parallel vector X_i^∥ is linearly transformed to obtain Q_h^∥, used as Q in the multi-head attention mechanism; the input X_i is linearly transformed to obtain K_h^∥ and V_h^∥, used as K and V in the multi-head attention mechanism respectively:

Q_h^∥ = X_i^∥ W_h^Q,  K_h^∥ = X_i W_h^K,  V_h^∥ = X_i W_h^V

where W_h^Q, W_h^K, and W_h^V are weight matrices to be trained;
Then Q^∥, K^∥, and V^∥ are input into the multi-head attention unit and multi-head attention is computed:

X̃_i^∥ = MHA(Q^∥, K^∥, V^∥) = Concat(head_1, ..., head_H) W^O

where X̃_i^∥ is the output vector of the multi-head attention mechanism in the parallel direction, MHA denotes the multi-head attention mechanism, H is the total number of attention heads, head_h denotes the calculation result of the h-th attention head, and W^O is a weight matrix to be trained;
After that, the softmax function maps X̃_i^∥ to values between 0 and 1, giving the information gate vector of the parallel direction after the multi-head attention mechanism:

g_i^∥ = softmax(X̃_i^∥)
For the vertical vector X_i^⊥, a linear transformation gives Q^⊥ as Q in the multi-head attention mechanism, and linear transformations of X_i give K^⊥ and V^⊥ as K and V respectively; inputting them into the multi-head attention unit yields X̃_i^⊥, and the softmax function gives the information gate vector g_i^⊥ of the vertical direction after the multi-head attention mechanism; the two gate vectors g_i^∥ and g_i^⊥ screen the information of X_i to obtain the topic-fused characterization vector of the i-th time step:

X̃_i = g_i^∥ ⊙ (X_i^∥ W^∥ + b^∥) + g_i^⊥ ⊙ (X_i^⊥ W^⊥ + b^⊥)

where W^∥ and W^⊥ are the weight matrices of the parallel and vertical directions respectively, b^∥ and b^⊥ are the input bias terms of the parallel and vertical directions respectively, and ⊙ denotes the matrix dot product operation;
Then X̃_i is spliced with the random noise z̃ to obtain the output vector x_i of the i-th time step:

x_i = [X̃_i ; z̃]

where [ ; ] denotes the concatenation operation and z̃ is the random noise, expressed as:

z̃ = [z ; c]

where z is sampled from the random distribution P_z, which follows a standard Gaussian distribution, and the class label c is sampled from the random distribution P_c, which follows a standard Bernoulli distribution; c = 1 indicates a genuine comment and c = 0 indicates a false comment.
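A sketch of this gated fusion and noise splicing; softmax is used for the gates because the claim states it (sigmoid would be the more common gating choice), and all shapes and values are illustrative assumptions:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

d = 8
x_par, x_perp = np.random.randn(d), np.random.randn(d)   # from the decomposition
g_par, g_perp = softmax(np.random.randn(d)), softmax(np.random.randn(d))  # gate vectors
W_par, W_perp = np.random.randn(d, d), np.random.randn(d, d)
b_par, b_perp = np.zeros(d), np.zeros(d)

# Gate-weighted combination of the parallel and vertical components.
fused = g_par * (x_par @ W_par + b_par) + g_perp * (x_perp @ W_perp + b_perp)

z = np.random.randn(d)                  # z ~ P_z, standard Gaussian
c = np.random.binomial(1, 0.5)          # c ~ P_c, Bernoulli; 1 = genuine, 0 = false
x_i = np.concatenate([fused, z, [c]])   # generator input at time step i
```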
3. The false user comment detection method according to claim 1, wherein step B2 specifically comprises the following steps:
Step B21: after the pre-training of the generator G is completed, generating a comment data set S_G with the generator G; randomly extracting generated comments from S_G and labeled and unlabeled comments from S to form the pre-training set S_D of the discriminator D, each training sample of S_D being denoted s = (r, c_D), where r denotes the comment text and c_D is the category label indicating whether the comment text was produced by the generator; inputting the training samples of S_D into the Transformer-based discriminator D for pre-training;
Step B22: for each training sample in S_D, obtaining the initial characterization vector v_r of the comment text r according to step B11, and adding a position vector to obtain the position-aware characterization vector:

v_r^p = v_r + (p_1, p_2, ..., p_N)

where p_i is the position encoding vector corresponding to the i-th word, obtained by looking up the position vector matrix E_p ∈ R^{d×N}; d denotes the dimension of a position vector, which is the same as the dimension of a word vector, and N is the fixed maximum length of the comment text;
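A minimal sketch of this position-aware encoding, with a randomly initialized matrix standing in for the patent's position vector matrix E_p:

```python
import numpy as np

d, N = 50, 64
E_p = np.random.randn(d, N)   # position vector matrix E_p (d x N); learned in practice
v_r = np.random.randn(N, d)   # word vectors of the comment from step B12

v_r_pos = v_r + E_p.T         # add one position vector per token -> position-aware input
```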
Step B23: inputting v_r^p into the Transformer network of the discriminator D to obtain the characterization vector O_D of the comment;
Step B24: after a linear transformation of O_D, inputting it into softmax to compute the class probability distribution Q_D of the discriminator over all the words of the comment:

Q_D = softmax(O_D W_D + b_D)

where Q_D represents the actual class probability distribution of the comment over all words, the i-th row Q_D^i being the actual class probability distribution of the discriminator at the i-th word; W_D is a weight matrix and b_D is a bias term;
From the class probability distribution Q_D over all words, the average class probability distribution of the discriminator for the whole sentence with respect to category c_D is obtained:

D(c_D | r; θ_d) = (1/N) Σ_{i=1}^{N} Q_D^i

where D(c_D | r; θ_d) denotes the class conditional probability computed by the discriminator on the comment, θ_d denotes the parameter set of the discriminator D, and Q_D^i denotes the actual class probability distribution of the discriminator at the i-th word;
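A sketch of this per-word classification and sentence-level averaging (shapes and values are illustrative):

```python
import numpy as np

def softmax_rows(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

N, d, n_classes = 64, 50, 2
O_D = np.random.randn(N, d)                   # Transformer outputs, one row per word
W_D, b_D = np.random.randn(d, n_classes), np.zeros(n_classes)

Q_D = softmax_rows(O_D @ W_D + b_D)           # (N, 2) per-word class distribution
D_sentence = Q_D.mean(axis=0)                 # average distribution for the whole comment
```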
Step B25: inputting the comment characterization vector O_D into the evaluator D_critic, which consists of a fully connected layer; after a linear transformation of O_D and softmax, the class probability distribution V_D of the comment is obtained:

V_D = softmax(O_D W_D' + b_D')

where V_D represents the target class probability distribution of the comment over all words, used as the standard against which the actual class probability distribution Q_D is evaluated, the i-th row V_D^i being the target class probability distribution of the discriminator at the i-th word; W_D' is the evaluator weight matrix of the discriminator and b_D' is a bias term;
Step B26: training the discriminator with the cross-entropy loss L_D and training the evaluator D_critic with the mean squared error loss L_Dc:

L_D = E_{r∼S}[−log D(c_D | r; θ_d)] + E_{r∼S_G}[−log D(c_D | r; θ_d)]

where the first term is the classification loss on the samples of S_D extracted from S, computed as the expectation of the cross-entropy loss with respect to category c_D over comments sampled from the data set S, and the second term is the classification loss on the samples of S_D extracted from S_G, computed as the expectation of the cross-entropy loss with respect to category c_D over comments produced by the generator;

L_Dc = E[(1/N) Σ_{i=1}^{N} (V_D^i − Q_D^i)^2]

i.e., the expected value of the mean squared error between the evaluator's target class probability distribution and the actual class probability distribution, where V_D^i denotes the target class probability distribution of the discriminator at the i-th word.
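A sketch of the two losses under illustrative probabilities; the additive cross-entropy form follows the wording of step B26, with distributions drawn at random here purely for shape:

```python
import numpy as np

def cross_entropy(p_true_class):
    """Cross-entropy loss given the probability assigned to the true class."""
    return -np.log(p_true_class + 1e-12)

d_real = 0.9   # D's probability of the true label for a comment from S
d_fake = 0.8   # D's probability of the true label for a generated comment
loss_D = cross_entropy(d_real) + cross_entropy(d_fake)   # discriminator loss

Q_D = np.random.dirichlet([1, 1], size=64)   # actual per-word class distribution
V_D = np.random.dirichlet([1, 1], size=64)   # evaluator's target distribution
loss_Dc = np.mean((V_D - Q_D) ** 2)          # mean squared error for D_critic
```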
4. The false user comment detection method according to claim 1, wherein step B3 specifically comprises the following steps:
Step B31: pre-training the classifier with the labeled data set S_L; for each training sample s = (r, t, c) of S_L, obtaining the comment characterization vector v_r and the topic characterization vector v_t according to the procedure of steps B11-B12, and the trunk information characterization ṽ_t of the topic according to the procedure of step B13;
Step B32: according to the procedure of B14, sequentially inputting the vector sequence forming v_r into the topic-fusing multi-head attention unit, combining it with ṽ_t, fusing the comment and the topic through the multi-head attention mechanism to obtain the fusion vector of each time step, and splicing the fusion vector of each time step with the random noise z̃ to obtain the topic-fused comment characterization vector X̃ = (X̃_1, ..., X̃_N), where X̃_i denotes the characterization vector of the i-th word; querying the position vector matrix E_p ∈ R^{d×N} for the position vectors and adding them to X̃ gives the position-aware comment characterization vector, which is input into the Transformer network to obtain the characterization matrix O_C of all the words in the comment;
Step B33: to O isCAfter linear transformation, softmax is input, and the class probability distribution of all the words of the comment by the classifier is calculated
QC=softmax(OCWC+bC);
In the formula (I), the compound is shown in the specification,as a weight matrix, the weight matrix is,is a bias term;
According to Q_C, the average class probability distribution of the classifier for the whole sentence with respect to category c is obtained:

C(c | r, t; θ_c) = (1/N) Σ_{i=1}^{N} Q_C^i

where Q_C^i denotes the actual class probability distribution of the comment at the i-th word and C(c | r, t; θ_c) denotes the class conditional probability computed by the classifier on the comment;
The classifier is pre-trained with the cross-entropy loss L_C, computed as follows:

L_C = E_{s∼S_L}[−log C(c | r, t; θ_c)]

where the expectation of the cross-entropy loss with respect to category c is computed over samples drawn from the data set S_L, C(c | r, t; θ_c) denotes the class conditional probability computed by the classifier on the comment, and θ_c denotes the classifier parameters;
Step B34: inputting the comment characterization vector O_C into the evaluator C_critic, which consists of a fully connected layer; after a linear transformation of O_C and softmax, the target distribution V_C of the actual class probability distribution is obtained:

V_C = softmax(O_C W_C' + b_C')

where W_C' is the evaluator weight matrix of the classifier, b_C' is a bias term, and the i-th row V_C^i denotes the target class probability distribution of the comment at the i-th word with respect to category c.
5. The false user comment detection method according to claim 1, wherein step C specifically comprises the following steps:
Step C1: traversing each training sample in the data set S; for each training sample, obtaining the comment characterization vector v_r and the topic characterization vector v_t according to the procedure of steps B11-B12, and the trunk information characterization ṽ_t of the topic according to the procedure of step B13;
Step C2: for each training sample in the data set S, sampling with the generator from the random distributions P_z and P_c respectively to obtain the random noise z and the class c, giving the noise containing class information:

z̃ = [z ; c]
Step C3: according to the procedure of B14, sequentially inputting the vector sequence forming v_r into the topic-fusing multi-head attention unit, combining it with ṽ_t, fusing the comment and the topic through the multi-head attention mechanism to obtain the fusion vector of each time step, and splicing the fusion vector of each time step with the random noise z̃ to obtain the topic-fused comment characterization vector X̃^FG = (X̃_1^FG, ..., X̃_N^FG), where X̃_i^FG denotes the characterization vector of the i-th word and the superscript FG indicates the topic-fusing multi-head attention computation on the generator input; then generating the comment y according to the procedure of steps B15-B17;
Step C4: inputting y together with the corresponding training sample in the data set S into the discriminator and the classifier, which classify the comment categories respectively; the discriminator parameters are updated with the loss function L_D, and the classifier is updated with the adversarial training loss function L_C^adv:

L_C^adv = L_C + α E_{y∼G}[H(C(· | y; θ_c))]

where L_C is the cross entropy of the classifier's predicted classification on the labeled samples of the data set S, the second term is the classification loss of the classifier on the comments generated by the generator, H denotes the Shannon entropy, and α is a balance parameter used to balance the influence of the Shannon entropy;
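A sketch of this combined loss; the additive form and the placement of α are reconstructions from the wording above (the published formula is an image), so treat this as an assumption:

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy of a discrete distribution p."""
    return -np.sum(p * np.log(p + 1e-12))

p_true = 0.85                 # classifier's probability of the true label (labeled sample)
p_gen = np.array([0.6, 0.4])  # classifier's distribution on a generated comment

alpha = 0.1                   # balance parameter for the entropy term
loss_C_adv = -np.log(p_true) + alpha * shannon_entropy(p_gen)
```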
Step C5: training the generator by means of reinforcement learning.
6. The false user comment detection method according to claim 5, wherein step C5 is specifically as follows:
The comment generation process of the generator is regarded as a sequential decision process, with the generator acting as the agent (actor) in reinforcement learning; during comment generation, the already generated word sequence {y_1, y_2, ..., y_{i-1}} is regarded as the current state of the agent, and the next word y_i to be generated is the action taken by the agent; the action is selected according to the policy distribution, which assigns each action a probability by computing its expected reward, and the agent selects the corresponding action according to these probabilities; the generator agent learns to maximize the expected reward, namely:

J(θ_g) = E[R(r)]

where R(r) denotes the reward of the whole comment sample, determined and provided jointly by the discriminator and the classifier: D denotes the class conditional probability computed by the discriminator on the comment and C denotes the class conditional probability computed by the classifier on the comment;
To maximize the expected reward, the generator learns and adjusts its parameters θ_g through a policy gradient algorithm:

∇_{θ_g} J(θ_g) = E[ Σ_{i=1}^{N} β (Q_i − V_i) ∇_{θ_g} log G(y_i | y_1, ..., y_{i-1}; θ_g) ]

where Q_i − V_i is the advantage function and β is a linearly decreasing parameter, β = N − i, used when updating the generator parameters θ_g to raise the importance of the words generated first, so that the generator obtains more diversified generated terms in the initial stage of generation.
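A sketch of this advantage-weighted policy-gradient estimate; the REINFORCE-with-baseline form is a reconstruction from the wording above, and all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
log_probs = np.log(rng.uniform(0.1, 1.0, N))  # log G(y_i | y_<i) for the sampled words
Q = rng.uniform(0.0, 1.0, N)                  # expected reward per step (from D and C)
V = rng.uniform(0.0, 1.0, N)                  # evaluator baseline per step
beta = N - np.arange(N)                       # linearly decreasing weight, beta = N - i

# Scalar surrogate whose gradient w.r.t. the generator parameters is the
# advantage-weighted policy gradient described in claim 6.
surrogate = np.sum(beta * (Q - V) * log_probs)
```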
7. A false user comment detection system, comprising a memory, a processor, and computer program instructions stored on the memory and executable by the processor, the method steps of any one of claims 1-6 being implemented when the processor executes the computer program instructions.
8. A computer-readable storage medium on which computer program instructions executable by a processor are stored, the method steps of any one of claims 1-6 being implemented when the processor executes the computer program instructions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110070347.3A CN112732921B (en) | 2021-01-19 | 2021-01-19 | False user comment detection method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112732921A (en) | 2021-04-30
CN112732921B (en) | 2022-06-14
Family
ID=75592450
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110070347.3A Active CN112732921B (en) | 2021-01-19 | 2021-01-19 | False user comment detection method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112732921B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113392334B * | 2021-06-29 | 2024-03-08 | Changsha University of Science and Technology | False comment detection method in cold start environment
CN114610877B * | 2022-02-23 | 2023-04-25 | Soochow University | Criticizing variance criterion-based film evaluation emotion analysis preprocessing method and system
CN115168677B * | 2022-06-09 | 2023-03-28 | Tianyi iMusic Culture Technology Co., Ltd. | Comment classification method, device, equipment and storage medium
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20190123397A (en) * | 2018-04-24 | 2019-11-01 | Sungkyunkwan University Research & Business Foundation | Classification model selection method for discriminating fake review
CN109670542A (en) * | 2018-12-11 | 2019-04-23 | 田刚 | A kind of false comment detection method based on comment external information |
CN109829733A (en) * | 2019-01-31 | 2019-05-31 | 重庆大学 | A kind of false comment detection system and method based on Shopping Behaviors sequence data |
CN110580341A (en) * | 2019-09-19 | 2019-12-17 | 山东科技大学 | False comment detection method and system based on semi-supervised learning model |
CN111666480A (en) * | 2020-06-10 | 2020-09-15 | 东北电力大学 | False comment identification method based on rolling type collaborative training |
Non-Patent Citations (1)
Title |
---|
Research on detection techniques for fake online product reviews; Lü Hai et al.; Journal of Shenyang Ligong University; 2018-12-15; Vol. 37, No. 6; full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110298037B (en) | Convolutional neural network matching text recognition method based on enhanced attention mechanism | |
CN110134757B (en) | Event argument role extraction method based on multi-head attention mechanism | |
CN110532900B (en) | Facial expression recognition method based on U-Net and LS-CNN | |
CN112732921B (en) | False user comment detection method and system | |
CN111126386B (en) | Sequence domain adaptation method based on countermeasure learning in scene text recognition | |
CN110321563B (en) | Text emotion analysis method based on hybrid supervision model | |
CN110232395B (en) | Power system fault diagnosis method based on fault Chinese text | |
CN110866542B (en) | Depth representation learning method based on feature controllable fusion | |
CN110287323B (en) | Target-oriented emotion classification method | |
CN104573669A (en) | Image object detection method | |
CN110046356B (en) | Label-embedded microblog text emotion multi-label classification method | |
CN112732916A (en) | BERT-based multi-feature fusion fuzzy text classification model | |
Islam et al. | InceptB: a CNN based classification approach for recognizing traditional bengali games | |
CN110297888A (en) | A kind of domain classification method based on prefix trees and Recognition with Recurrent Neural Network | |
CN112231477A (en) | Text classification method based on improved capsule network | |
KR20200010672A (en) | Smart merchandise searching method and system using deep learning | |
CN115526236A (en) | Text network graph classification method based on multi-modal comparative learning | |
CN112733764A (en) | Method for recognizing video emotion information based on multiple modes | |
CN116383387A (en) | Combined event extraction method based on event logic | |
CN115827954A (en) | Dynamically weighted cross-modal fusion network retrieval method, system and electronic equipment | |
CN115168579A (en) | Text classification method based on multi-head attention mechanism and two-dimensional convolution operation | |
CN113240033B (en) | Visual relation detection method and device based on scene graph high-order semantic structure | |
CN112347252B (en) | Interpretability analysis method based on CNN text classification model | |
CN111708865B (en) | Technology forecasting and patent early warning analysis method based on improved XGboost algorithm | |
CN116775880A (en) | Multi-label text classification method and system based on label semantics and transfer learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |