CN112732921B

CN112732921B - False user comment detection method and system

Info

Publication number: CN112732921B
Application number: CN202110070347.3A
Authority: CN
Inventors: 陈羽中; 徐闽樟
Original assignee: Fuzhou University
Current assignee: Fuzhou University
Priority date: 2021-01-19
Filing date: 2021-01-19
Publication date: 2022-06-14
Anticipated expiration: 2041-01-19
Also published as: CN112732921A

Abstract

The invention provides a false user comment detection method and a false user comment detection system, which comprise the following steps: collecting product comments of users and subject texts related to the comments, and establishing a user comment data set; using user comment data setsSPre-training a false user comment detection model, wherein the model consists of a text generator G, a discriminator D and a classifier C; using user comment data setsSCarrying out countermeasure training on the false user comment detection model; and inputting the user comment and the subject into a classifier of the false user comment detection model, and outputting a detection result of the user comment, namely the user comment is a false comment or a real comment. The method can obtain a detection result with higher accuracy.

Description

False user comment detection method and system

Technical Field

The invention relates to the technical field of natural language processing, in particular to a false user comment detection method and system.

Background

The false user comment refers to unreal comment for intentionally promoting or defatting the reputation of a commodity and a public praise, and the detection of false user comment is a basic task of a text classification task in natural language processing, and the basic goal is to analyze the semantic relationship of the text classification task according to the related information of the user comment and detect the false. With the rapid development and the gradual maturity of e-commerce platforms, the problem of false user comment is more and more prominent, and many domestic and foreign researchers begin to work on the problem.

Early studies of false user comment detection typically employed traditional supervised learning algorithms, which focused on extracting features by methods such as N-gram, LDA, etc. to train classifiers. These methods require complicated feature engineering to extract text features, which is cumbersome. Recently, deep-learning Neural Network models, such as Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN), have shown the most advanced performance on this task without any laborious feature engineering. LiL, the like, uses a convolutional neural network to perform semantic representation on a document level to perform false comment classification, and adds an attention mechanism into a CNN, uses KL divergence as weight calculation, calculates the importance of each word in a sentence, further obtains the importance weight of the comment sentence, and combines the importance weight of the comment sentence with a comment sentence vector after weighting into a document vector for classification; zhao et al propose to embed word order features in the convolutional layer and the pooling layer of CNN to capture semantic features related to the word order of comments, making CNN more suitable for solving the problem of false comment detection; wang et al propose a CNN model based on attention mechanism, which performs feature extraction through CNN, and analyzes two dimensions of semantics and behaviors of comments by combining the attention mechanism, so that the model learns to classify from the perspective of semantics or behaviors, even by referring to the two angles at the same time; ren et al use convolutional neural networks and combine with recurrent neural networks to build models to identify false comments, where convolutional neural networks are used to learn comment sentence expressions, then gated recurrent neural networks with attention mechanisms are used to combine them, modeling is performed on dialog information and document vectors are generated, and finally, the document expression forms are directly used for false comment identification; yuan et al combines with reviewers and products to perform feature extraction and false comment classification, provides a self-attention-based model, obtains semantic representation by performing self-attention coding on comment texts, obtains reviewer-related representation and product-related representation by utilizing vector decomposition, and performs classification after combining features; li et al propose a false comment detection based on Graph Convolutional neural Network (GCN), which uses an heteromorphic Graph and a homographic Graph to acquire local information and global information, extracts key features from complex Graph data structure information and multi-modal attribute information through aggregation, and performs false comment detection by combining the key features to adapt to more varied comment environments; deng et al propose a PU learning-based self-coding model, construct a feature vector based on the input comment-related metadata, perform coding learning on the feature vector through the self-coding model, calculate clustering distance by using a K-means method to determine categories, and perform PU learning; the model FakeGAN for introducing the GAN into a false comment detection task for the first time is proposed by Aghakhani and the like, a small part of marking data is used for generating a GAN sample by adopting a SeqGAN-based framework, and a large amount of marking data generated by the GAN are utilized to meet the huge sample requirement of a classified neural network, so that a good result is obtained; stantong et al propose SpamGAN, which is improved on the basis of FakeGAN, to reduce the amount of calculation and optimize the reward function, thereby achieving performance improvement.

Although the introduction of deep learning greatly improves the performance of the false comment detection model, the false comment has certain concealment and confusion, the number of comments is large, the difficulty of manual detection is high, the marked data set is deficient, the existing deep learning model is easy to over-fit, so that the model still has a large optimization space, meanwhile, the identification dimensionality of the false comment detection is only a comment text, the angle is too single, and the model detection performance is easily interfered by outlier noise.

Disclosure of Invention

In view of this, the present invention provides a method and a system for detecting false user comments, where the model detection is not easily interfered by outlier noise, and the obtained result is more accurate.

The invention is realized by adopting the following scheme: a false user comment detection method specifically comprises the following steps:

step A: collecting product comments of users and subject texts related to the comments, and establishing a user comment data set S-S_L∪S_UIn which S is_LRepresenting a tagged user comment data set, S_URepresenting an unlabeled user comment dataset;

and B: pre-training a false user comment detection model by using a user comment data set S, wherein the model consists of a text generator G, a discriminator D and a classifier C;

and C: carrying out countermeasure training on the false user comment detection model by using the user comment data set S;

step D: and inputting the user comment and the subject into a classifier of the false user comment detection model, and outputting a detection result of the user comment, namely the user comment is a false comment or a real comment.

Further, the step B specifically includes the steps of:

step B1: pre-training a text generator by using a user comment data set S;

step B2: generating comments by using the text generator obtained in the step B1, and using the comments together with the comments in the user comment data set S to pre-train the identifier and the evaluator thereof;

step B3: the classifier and its evaluator are pre-trained using a user review data set S.

Further, step B1 specifically includes the following steps:

step B11: traversing the comment training set S, and comparing S_LRepresents S as (r, t, c), S_UEach unlabeled training sample in (a) is represented as s ═ r, t, where r represents the comment text, t represents the subject text to which the comment relates, and c is a category label of whether the comment is false or not; segmenting the comment r and the subject t in the training sample s and removing stop words, then setting the texts of the comment r and the subject t to be fixed lengths N and M respectively, and if the number of words in the comment r and the subject t after segmentation and removal of the stop words is smaller than the fixed length value, using the supplementary symbols<PAD>Supplementing, and cutting if the length is larger than the fixed length value;

after the comment r is subjected to word segmentation and stop word removal and is set to be a fixed length, the comment r is represented as:

in the formula (I), the compound is shown in the specification,

the method comprises the following steps that 1,2, RN is the ith word in a text after a comment r is subjected to word segmentation and stop word removal and is set to be of a fixed length, and is equal to 1,2, wherein RN is less than or equal to N;

after the topic t is subjected to word segmentation and stop word removal and is set to be a fixed length, the topic t is represented as follows:

in the formula (I), the compound is shown in the specification,

the method comprises the following steps that i is 1,2, wherein TM is a subject t, words are segmented, stop words are removed, and the i is the ith word in a text with a fixed length, i is 1,2, TM is less than or equal to M;

step B12: coding the comment text r and the subject text t processed in the step B11 to respectively obtain the representation vectors v of the comment and the subject_rAnd v_t；

Wherein v is_rExpressed as:

in the formula (I), the compound is shown in the specification,

as the ith word of comment text

Corresponding word vectors are obtained by pre-training a word vector matrix

The method includes searching, wherein i is 1,2, N, d represents the dimension of a word vector, and | V | is the number of words in a dictionary;

wherein v is_tExpressed as:

in the formula (I), the compound is shown in the specification,

as the ith word of the subject text

Corresponding word vectors are obtained by pre-training a word vector matrix

Where, i ═ 1, 2., M, d denote the dimension of the word vector, | V | is the number of words in the dictionary;

step B13: characterization vector v for a topic_tExtracting the characterization vector of the trunk information of the theme by adopting maximum pooling after linear transformation and activation function

Wherein the content of the first and second substances,

is a characterization vector of the stem information of the topic,

for the weight matrix, a matrix dot product operation is represented,

is a bias term;

step B14: will form v_rVector sequence of

Sequentially inputting a multi-head attention unit of the fusion subject in the generator, and inputting the ith time step as

At each time step will

And

combining the comments with the topic information through a multi-head attention mechanism to obtain a feature vector of the fusion topic of each time step and random noise

Splicing to obtain the vector sequence { x₁，x₂，...，x_i，...，x_N}；

Step B15: the vector sequence { x ] obtained in the step B14₁，x₂，...，x_i，...，x_NInputting bidirectional GRU, at the ith time step, outputting hidden layer state vector of bidirectional GRU

1, 2.. N, for the reverse layer of a bidirectional GRU, the output hidden layer state vector is

1,2, N, d is an activation function; updating each weight matrix of GRU by adopting spectral normalization at each time step, and using W_i ^GA certain weight matrix representing GRU at the ith time step is obtained to obtain W_i ^GMaximum singular value of

To W_i ^GPerforming spectrum normalization to obtain a weight matrix of GRU at the (i + 1) th time step

Is represented as follows:

repeating the steps to obtain a forward hidden layer state vector sequence

And reverse hidden layer state vector sequence

Step B16: connecting the forward hidden layer state vector with the reverse hidden layer state vector to obtain a comment characterization vector H of the fusion subject, wherein H is [ H ═ H₁，...，h_i，...，h_N]^T，

h_iAs forward hidden layer state vectors

And reverse hidden layer state vector

The connection of (1);

step B17: linearly transforming the comment characterization vector H of the fusion subject, inputting softmax to obtain a word probability distribution matrix B, randomly sampling according to the word probability distribution matrix B, and generating a word sequence y of the comment text which is y ═ y { (y)₁，y₂，...，y_i，...，y_N}；

Step B18: the text generator G is trained according to the following objective loss function:

wherein the content of the first and second substances,

representing the conditional probability, theta, calculated by the generator at the target word position_gTo generate a set of parameters, c is the class label and z is random noise.

Further, the step B14 is specifically:

first, with X_iInput representing the ith time step

To X_iIn that

Is subjected to orthogonal decomposition operation in the vector direction to obtain X_iInformation about the subject portion and other information in (1), respectively corresponding to the parallel vectors

And vertical vector

Expressed as:

in the formula (I), the compound is shown in the specification,

is a parallel vector, and is a parallel vector,

is a vertical vector, and is,

representing a vector

Transposing;

then, information screening is carried out by using a multi-head attention mechanism: for each attention head, the pair of parallel vectors

Is subjected to linear transformation to obtain

As Q in a multi-head attention mechanism; to pair

Is subjected to linear transformation to obtain

And

as K and V in the multi-head attention mechanism, respectively, are expressed as:

in the formula (I), the compound is shown in the specification,

respectively are weight matrixes to be trained;

then, will

Inputting into a multi-head attention unit to perform multi-head attention calculation, and expressing as follows:

in the formula (I), the compound is shown in the specification,

the output vector of the multi-head attention mechanism in the parallel direction is shown, M pieces A show the multi-head attention mechanism, H shows the total number of the attention heads,

representing the result of the calculation of the ith head of attention,

is a weight matrix to be trained;

after that, the function of softmax is used to control the operation of the computer

Mapping between 0 and 1 to obtain parallel vector

Information gate vector in parallel direction after multi-head attention mechanism

Expressed as:

for vertical vector

Is subjected to linear transformation to obtain

As Q in a multi-head attention mechanism, pair

Is subjected to linear transformation to obtain

And

as K and V in the attention mechanism, respectively, will

Inputting the data into a multi-head attention unit to perform multi-head attention calculation to obtain

And obtaining a vertical vector through a softmax function

Information gate vector in vertical direction after multi-head attention mechanism

By using

And

two gate vector pairs X_iScreening information to obtain the characterization vector of the fusion subject of the ith time step

Expressed as:

in the formula (I), the compound is shown in the specification,

representing the weight matrices in the parallel and vertical directions respectively,

respectively representInput bias terms in the parallel and vertical directions, representing a matrix dot product operation;

then will be

And random noise

Splicing to obtain the output vector x of the ith time step_iExpressed as:

in the formula (I), the compound is shown in the specification,

(ii) a It is shown that the connection operation is performed,

random noise, expressed as:

in the formula (I), the compound is shown in the specification,

random distribution P from a standard-Gauss distribution_zObtained by intermediate sampling, P_zRandom distribution P conforming to a standard Gaussian distribution and class labels c from conforming to a standard Bernoulli distribution_cThe average value is obtained, and c is 1 to represent a normal comment, and c is 0 to represent a false comment.

Further, step B2 specifically includes the following steps:

step B21: after the pre-training of the generator G is completed, the generator G is utilized to generate a comment data set S_GFrom S_GAnd randomly extracting marked comments and unmarked comments from the S to form a pre-training set S of the discriminator D_D，S_DEach training sample in (1) representsIs s ═ r, c^D) Where r denotes comment text, c^DWhether the comment text is a category label generated by the generator or not is shown, S_DInputting the training sample in (1) into a Transformer-based discriminator D for pre-training;

step B22: to S_DAccording to the step B11, obtaining an initial characterization vector v of the comment text r for each training sample in the comment text r_rAdding position vector to obtain position sensing characterization vector

Expressed as:

in the formula (I), the compound is shown in the specification,

for the position vector, by consulting the position vector matrix E_p∈Rd×^NObtained, expressed as:

in the formula (I), the compound is shown in the specification,

representing a position coding vector corresponding to the ith word, d representing the dimension of the position vector, wherein the dimension of the position vector is the same as the dimension of the word vector, and N is the fixed maximum length of the comment text;

step B23: will be provided with

Inputting the feature vector of the comment into a Transformer network of a discriminator D

Step B24: to O_DAfter linear transformation, softmax is input and identification is calculatedClass probability distribution Q of the discriminator D over all words of the comment_D：

Q_D＝softmax(O_DW^D+b^D)；

In the formula (I), the compound is shown in the specification,

representing the actual class probability distribution of comments over all terms, Q_DThe ith row in the figure represents the actual class probability distribution of the discriminator at the ith word,

as a weight matrix, the weight matrix is,

is a bias term;

from a review of the class probability distribution Q over all terms_DGet the entire sentence about category c^DAverage class probability distribution of discriminator:

in the formula (I), the compound is shown in the specification,

representing the conditional probability of the class, θ, calculated by the discriminator on the comments_dParameter set, Q, representing discriminator D_D ⁱRepresenting the actual category probability distribution of the discriminator on the ith term;

step B25: token vector O of the comment_DInput evaluator D_criticThe evaluator consists of a fully connected layer, O_DAfter linear transformation and softmax, obtaining the category probability distribution V of the comments_D：

In the formula (I), the compound is shown in the specification,

expressing the probability distribution of the target category of the comments on all the terms, and evaluating the probability distribution Q of the actual category by taking the probability distribution as a standard_D，V_DRow i in the figure shows the target class probability distribution of the discriminator on the ith term,

is an evaluator weight matrix for the discriminator,

is a bias term;

step B26: using cross entropy losses

Training the discriminator by using the loss of mean square error

To evaluator D_criticTraining is carried out;

wherein the content of the first and second substances,

expressed as:

in the formula (I), the compound is shown in the specification,

represents a pair S_DIs divided into two fractions of a sample extracted from SThe loss of the class(s) is,

represents a pair S_DIs extracted from S_GIs lost in the classification of the sample(s),

indicating that a desired calculation of the comments upsampled on the dataset S resulted in a category c^DThe expected value of the cross-entropy loss of (c),

indicating that a desired calculation of a generator-generated comment resulted with respect to category c^D(ii) a cross entropy loss expectation;

wherein the content of the first and second substances,

expressed as:

in the formula (I), the compound is shown in the specification,

expected value of mean square error loss, V, representing the target class probability distribution and actual class probability distribution of the evaluator_D ⁱRepresenting the target class probability distribution of the discriminator on the ith term.

Further, step B3 specifically includes the following steps:

step B31: using annotated data sets S_LPre-training the classifier, and performing S_LS ═ r, t, c), the process steps of B11-B12 are followed to obtain a comment characterization vector v_rAnd a topic representation vector v_tObtaining the main information representation of the subject according to the processing step of B13

Step B32: according to the processing procedure of B14, the structure v_rThe vector sequence of (a) is input into a multi-head attention unit of the fusion topic in turn, and

combining, fusing the comments and the topics through a multi-head attention mechanism to obtain a fusion vector of each time step, and fusing the fusion vector of each time step with random noise

Splicing to obtain the comment characterization vector of the fusion subject

Wherein

Representing the feature vector of the ith word in the comment feature vector of the fusion subject; query location vector matrix E_p∈R^d×NObtaining a position vector

And

adding to obtain the comment characterization vector of position perception

Inputting the words into a Transformer network to obtain a representation matrix of all the words in the comment

Step B33: to O_CAfter linear transformation, softmax is input, and the class probability distribution of all the words of the comment by the classifier is calculated

Q_C＝softmax(O_CW_C+b_C)；

In the formula (I), the compound is shown in the specification,

as a weight matrix, the weight matrix is,

is a bias term;

according to Q_CGet the average class probability distribution of the classifier for the whole sentence with respect to class c:

in the formula, Q_C ⁱRepresenting the probability distribution of the actual category of the comment on the ith word,

representing the category conditional probability obtained by the evaluator through calculation on the comments;

using cross entropy losses

The classifier is pre-trained and the classifier is pre-trained,

the calculation formula of (a) is as follows:

in the formula (I), the compound is shown in the specification,

representing pairs of slave data sets S_LThe expected calculation of the mid-sampled samples yields the expected value of cross-entropy loss for class c,

the expression discriminator calculates the commentConditional probability of arrival class, θ_cRepresenting classifier parameters;

step B34: token vector O of the comment_CInput evaluator C_criticThe evaluator consists of a fully connected layer, O_CAfter linear transformation and softmax, obtaining the target distribution V of the actual class probability distribution_CExpressed as:

in the formula (I), the compound is shown in the specification,

is the evaluator weight matrix of the classifier,

is a bias term;

step B35: using mean square error loss

Evaluator C for classifier_criticTraining is carried out:

in the formula, V_C ⁱRepresenting a target category probability distribution of the comment on the ith word for category c.

Further, the step C specifically includes the steps of:

step C1: traversing each training sample in the data set S, and obtaining a comment characterization vector v for each training sample according to the processing steps of B11-B12_rAnd a topic characterization vector v_tObtaining the main information representation of the subject according to the processing step of B13

Step C2: for each of the data sets STraining samples, with generators, from random distribution P_zAnd randomly distributing P_cRespectively sampling to obtain random noise z and class c to obtain noise containing class information

Expressed as:

step C3: according to the processing procedure of B14, the structure v_rThe vector sequence of (a) is input into a multi-head attention unit of the fusion topic in turn, and

Splicing to obtain the comment characterization vector of the fusion subject

Wherein in

A token vector representing the ith word in the comment token vector of the fused topic, wherein the superscript F_GRepresenting a multi-headed attention calculation of a fusion topic to the generator input; then generating a comment y according to the processing steps of B15-B17;

step C4: inputting the y and the corresponding training sample in the data set S into a discriminator and a classifier together, classifying the comment classes respectively, and adopting a loss function for the discriminator

Updating parameters, and applying a loss function of countermeasure training to the classifier

Updating is carried out;

wherein the content of the first and second substances,

expressed as:

in the formula (I), the compound is shown in the specification,

is the cross entropy of the classifier' S predicted classification score on the labeled samples of the dataset S;

is the loss of the classifier's classification prediction on the reviews generated by the generator, where

Expressing the Shannon entropy, wherein alpha is a balance parameter used for balancing the influence of the Shannon entropy;

step C5: and training the generator in a reinforcement learning mode.

Further, step C5 specifically includes:

the process of generating comments by the generator is regarded as a sequence decision process, the generator is used as an agent or an actor in reinforcement learning, and the generated term sequence { y is used in the process of generating comments₁，y₂，...，y_i-1Consider the state the agent is currently inNext word y to be generated_iActions taken for the agent, the actions taken by the agent being based on policy distribution

And selecting, wherein the strategy distribution gives the probability of each behavior by calculating the expected reward of each behavior, the agent selects the corresponding behavior according to the probability, and the generator agent learns to maximize the expected reward, namely:

wherein the content of the first and second substances,

wherein R (r) represents the reward of the whole comment sample, and is determined and provided by the identifier and the classifier together, D represents the category conditional probability calculated by the identifier on the comment,

to maximize

The generator adjusts the parameter theta of the generator by learning step by step through a gradient strategy algorithm_gExpressed as:

in the formula, Qⁱ-VⁱIs a merit function, wherein:

where β is a linearly decreasing parameter, β -N-i, used to update the generator parameter θ_gThe importance of the initially generated words is improved, so that the generator obtains more diversified generated terms in the initial generation stage.

The present invention provides a false user comment detection system comprising a memory, a processor and computer program instructions stored on the memory and executable by the processor, the computer program instructions when executed by the processor being capable of implementing the method steps as above.

The present invention also provides a computer readable storage medium having stored thereon computer program instructions executable by a processor, the computer program instructions when executed by the processor being capable of performing the method steps as above.

Compared with the prior art, the invention has the following beneficial effects: the model is not easy to generate the phenomena of overfitting and mode collapse, and has the angle between the comment text and the theme text, the detection performance of the model is not easy to be interfered by outlier noise, and the detection result has higher accuracy.

Drawings

FIG. 1 is a schematic flow chart of a method according to an embodiment of the present invention.

Fig. 2 is a schematic system structure according to an embodiment of the present invention.

Detailed Description

The invention is further explained below with reference to the drawings and the embodiments.

It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.

As shown in fig. 1, the present embodiment provides a method and a system for detecting false user comments, which specifically include the following steps:

step A: collecting product comments of users and subject texts related to the comments, and establishing a user comment data set S-S_L∪S_UIn which S is_LRepresenting a marked user comment data set, S_URepresenting an unlabeled user comment dataset;

and B: pre-training a false user comment detection model by using a user comment data set S, wherein the model consists of a text generator G, an identifier D and a classifier C;

In this embodiment, step B specifically includes the following steps:

step B1: pre-training a text generator by using a user comment data set S;

In this embodiment, step B1 specifically includes the following steps:

step B11: traversing the comment training set S, and comparing S_LEach of which is labeledLet training sample be denoted as S ═ r, t, c, let S_UEach unlabeled training sample in (a) is represented as s ═ r, t, where r represents the comment text, t represents the subject text to which the comment relates, and c is a category label of whether the comment is false or not; segmenting the comment r and the subject t in the training sample s and removing stop words, then setting the texts of the comment r and the subject t to be fixed lengths N and M respectively, and if the number of words in the comment r and the subject t after segmentation and removal of the stop words is smaller than the fixed length value, using the supplementary symbols<PAD>Supplementing, and cutting if the length is larger than the fixed length value;

after the comment r is subjected to word segmentation and stop word removal and is set to be of a fixed length, the comment r is expressed as follows:

in the formula (I), the compound is shown in the specification,

the method comprises the following steps that i is 1,2, RN is the ith word in a text after a comment r is subjected to word segmentation and stop word removal, and the fixed length is set, i is 1,2, and RN is not more than N;

in the formula (I), the compound is shown in the specification,

Wherein v is_rExpressed as:

in the formula (I), the compound is shown in the specification,

as the ith word of comment text

Corresponding word vectors are obtained by pre-training a word vector matrix

wherein v is_tExpressed as:

in the formula (I), the compound is shown in the specification,

for the ith word w of the subject text_i ^tCorresponding word vectors are obtained by pre-training a word vector matrix

Wherein the content of the first and second substances,

is a characterization vector of the stem information of the topic,

for the weight matrix, a matrix dot product operation is represented,

is a bias term;

step B14: will form v_rVector sequence of (2)

Sequentially inputting a multi-head attention unit (TMAU) of the fusion subject in the generator, and inputting the ith time step

At each time step will

And

combining, fusing the comments and the theme information through a multi-head attention mechanism to obtain a feature vector of a fusion theme of each time step, and combining the feature vector with random noise

1,2, N, f is an activation function; updating each weight matrix of GRU by adopting spectral normalization at each time step, and using W_i ^GA certain weight matrix representing GRU at the ith time step is obtained to obtain W_i ^GMaximum singular value of

Is represented as follows:

repeating the steps to obtain a forward hidden layer state vector sequence

And reverse hidden layer state vector sequence

Step B16: connecting the forward hidden layer state vector with the reverse hidden layer state vector to obtain a comment characterization vector H of the fusion subject, wherein H is [ H ═ H₁,...,h_i,...,h_N]^T，

h_iAs forward hidden layer state vectors

And reverse hidden layer state vector

The connection of (2);

wherein, the first and the second end of the pipe are connected with each other,

representing the conditional probability, theta, calculated by the generator at the target word position_gFor the generator's parameter set, c is the class label and z is random noise.

In this embodiment, the step B14 specifically includes:

first, with X_iInput representing the ith time step

To X_iIn that

Is subjected to orthogonal decomposition operation in the vector direction to obtain X_iThe information about the subject part and other information in (1) respectively correspond to the parallel vectors

And a vertical vector

Expressed as:

in the formula (I), the compound is shown in the specification,

is a parallel vector, and is a parallel vector,

is a vertical vector, and is,

representing a vector

Transposing;

Is subjected to linear transformation to obtain

As Q in a multi-head attention mechanism; to pair

Is subjected to linear transformation to obtain

And

in the formula (I), the compound is shown in the specification,

respectively are weight matrixes to be trained;

then, will

in the formula (I), the compound is shown in the specification,

representing the result of the calculation of the ith head of attention,

is a weight matrix to be trained;

Mapping between 0 and 1 to obtain parallel vector

Expressed as:

for vertical vector

Is subjected to linear transformation to obtain

As Q in a multi-head attention mechanism, pair

Is subjected to linear transformation to obtain

And

as K and V in the attention mechanism, respectively, will

And obtaining a vertical vector through a softmax function

By using

And

Expressed as:

in the formula (I), the compound is shown in the specification,

representing the input bias terms in the parallel and vertical directions, respectively, representing the matrix dot product operation;

then will be

And random noise

Splicing to obtain the output vector x of the ith time step_iExpressed as:

in the formula (I), the compound is shown in the specification,

(ii) a It is shown that the connection operation is performed,

random noise, expressed as:

in the formula (I), the compound is shown in the specification,

random distribution P from a standard Gaussian distribution_zObtained by intermediate sampling, P_zRandom distribution P conforming to a standard Gaussian distribution and class labels c from conforming to a standard Bernoulli distribution_cAnd c is obtained, wherein the normal comment is represented when c is 1, and the false comment is represented when c is 0.

In this embodiment, step B2 specifically includes the following steps:

step B21: after the pre-training of the generator G is completed, the generator G is utilized to generate a comment data set S_GFrom S_GAnd randomly extracting marked comments and unmarked comments from the S to form a pre-training set S of the discriminator D_D，S_DEach training sample in (a) is denoted as s ═ (r, c)^D) Where r denotes comment text, c^DWhether the comment text is a category label generated by the generator or not is shown, S_DInputting the training sample in (1) into a Transformer-based discriminator D for pre-training;

Expressed as:

in the formula (I), the compound is shown in the specification,

for the position vector, by consulting the position vector matrix E_p∈R^d×NObtained, expressed as:

in the formula (I), the compound is shown in the specification,

step B23: will be provided with

Step B24: to O_DAfter linear transformation, softmax is input, and the class probability distribution Q of the discriminator D on all the words of the comment is calculated_D：

Q_D＝softmax(O_DW^D+b^D)；

In the formula (I), the compound is shown in the specification,

as a weight matrix, the weight matrix is,

is a bias term;

in the formula (I), the compound is shown in the specification,

representing the conditional probability of the class, θ, calculated by the discriminator on the comments_dParameter set, Q, representing discriminator D_D ⁱRepresenting the actual class probability distribution of the discriminator on the ith term;

In the formula (I), the compound is shown in the specification,

for the evaluator weight matrix of the discriminator,

is a bias term;

step B26: using cross entropy losses

Training the discriminator by using the loss of mean square error

To evaluator D_criticTraining is carried out;

wherein the content of the first and second substances,

expressed as:

in the formula (I), the compound is shown in the specification,

represents a pair S_DThe loss of classification of the samples extracted from S,

wherein the content of the first and second substances,

expressed as:

in the formula (I), the compound is shown in the specification,

In this embodiment, step B3 specifically includes the following steps:

step B31: using annotated data sets S_LPre-training the classifier, and performing S_LS ═ r, t, c), the process steps of B11-B12 are followed to obtain a comment characterization vector v_rAnd a topic characterization vector v_tObtaining the main information representation of the subject according to the processing step of B13

Step B32: according to the processing procedure of B14, v is formed_rThe vector sequence of (a) is input into a multi-head attention unit of the fusion topic in turn, and

Splicing to obtain the comment characterization vector of the fusion subject

Wherein

And

adding to obtain the comment characterization vector of position perception

Step B33: to O_CAfter linear transformation, softmax is input, and the class probability distribution of all the words of the comment is calculated by a classifier

Q_C＝softmax(O_CW_C+b_C)；

In the formula (I), the compound is shown in the specification,

as a weight matrix, the weight matrix is,

is a bias term;

in the formula, Q_C ⁱIndicating that the comment is true on the ith wordThe probability distribution of the inter-category,

using cross entropy losses

The classifier is pre-trained and the classifier is pre-trained,

the calculation formula of (a) is as follows:

in the formula (I), the compound is shown in the specification,

representing the conditional probability of the class, θ, calculated by the discriminator on the comments_cRepresenting classifier parameters;

in the formula (I), the compound is shown in the specification,

is the evaluator weight matrix of the classifier,

is a bias term;

step B35: using mean square error loss

Evaluator C for classifier_criticTraining is carried out:

In this embodiment, step C specifically includes the following steps:

step C1: traversing each training sample in the data set S, and obtaining a comment characterization vector v for each training sample according to the processing steps of B11-B12_rAnd a topic characterization vector v_rObtaining the main information representation of the subject according to the processing step of B13

Step C2: for each training sample in the data set S, the generator is used to randomly distribute P_zAnd randomly distributing P_cRespectively sampling to obtain random noise z and class c to obtain noise containing class information

Expressed as:

Splicing to obtain the comment characterization vector of the fusion subject

Wherein in

Updating is carried out;

wherein the content of the first and second substances,

expressed as:

in the formula (I), the compound is shown in the specification,

is the loss of the classifier's classification prediction on the comments generated by the generator, where

step C5: and training the generator by adopting a reinforcement learning mode.

In this embodiment, step C5 specifically includes:

the process of generating comments by the generator is regarded as a sequence decision process, the generator is used as an agent or an actor in reinforcement learning, and the generated term sequence { y is used in the process of generating comments₁，y₂，...，y_i-1The situation is regarded as the current state of the intelligent agent, the next word yi to be generated is the action taken by the intelligent agent, and the action taken by the intelligent agent is based on strategy distribution

wherein the content of the first and second substances,

to maximize

The generator learns and adjusts the parameter theta of the generator step by step through a gradient strategy algorithm_gExpressed as:

in the formula, Qⁱ-VⁱIs a merit function, wherein:

The present embodiment also provides a false user comment detection system comprising a memory, a processor, and computer program instructions stored on the memory and executable by the processor, which when executed by the processor, enable the method steps as above.

The present embodiments also provide a computer readable storage medium having stored thereon computer program instructions executable by a processor, the computer program instructions, when executed by the processor, being capable of performing the method steps as above.

Preferably, as shown in fig. 2, the present embodiment correspondingly includes the following functional modules:

the data collection module is used for extracting the user comments and the theme information related to the comments, labeling the false category labels of the comments and constructing a training set;

the text preprocessing module is used for preprocessing the training samples in the training set, including unifying case and case, processing word segmentation and removing stop words;

the text coding module is used for searching word vectors of words in the preprocessed user comments and topics in the pre-trained word vector matrix to obtain the characteristic vectors of the user comments and the characteristic vectors of the topics;

and the pre-training module is used for inputting the characterization vectors of the user comments and the characterization vectors of the topics into each component of the deep learning network for pre-training respectively to obtain a pre-trained deep network model.

The countercheck training module is used for inputting the characterization vectors of the comments of the user and the characterization vectors of the subjects into each module of the deep learning network, each module obtains comment characterization vectors fusing the subjects, the deep learning network is trained through reinforcement learning, the probability that the characterization vectors belong to a certain class and the marks in a training set are used as losses, the overall deep learning network is trained by taking the minimum losses as a target, and a deep learning network model which is subjected to countercheck training is obtained;

and the false comment analysis module is used for analyzing and processing the input user comments and topics by utilizing the countertraining deep learning network model and outputting false categories of the user comments.

The foregoing is directed to preferred embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. However, any simple modification, equivalent change and modification of the above embodiments according to the technical essence of the present invention will still fall within the protection scope of the technical solution of the present invention.

Claims

1. A false user comment detection method is characterized by comprising the following steps:

step D: inputting the user comment and the theme into a classifier of a false user comment detection model, and outputting a detection result of the user comment, namely the user comment is a false comment or a real comment;

the step B specifically comprises the following steps:

step B1: pre-training a text generator by using a user comment data set S;

step B2: b1, generating comments by using the text generator obtained in the step B1, and using the comments together with the comments in the user comment data set S to pre-train the identifier and the evaluator thereof;

step B3: pre-training the classifier and the evaluator thereof by using a user comment data set S;

step B1 specifically includes the following steps:

step B11: traversing the comment training set S, and dividing S_LIs given as S ═ r, t, c, let S denote_UIs represented as s ═ r, t, where r represents the comment text, t represents the subject text to which the comment relates, and c is the subject text to which the comment relatesA category label to comment false or not; segmenting the comment r and the subject t in the training sample s and removing stop words, then setting the texts of the comment r and the subject t to be fixed lengths N and M respectively, and if the number of words in the comment r and the subject t after segmentation and removal of the stop words is smaller than the fixed length value, using the supplementary symbols<PAD>Supplementing, and cutting if the length is larger than the fixed length value;

in the formula (I), the compound is shown in the specification,

dividing words for the comment r, removing stop words, and setting the comment r as the ith word in the text with a fixed length, wherein i is 1,2, and RN is less than or equal to N;

in the formula (I), the compound is shown in the specification,

dividing words for a subject t, removing stop words, and setting the subject t as the ith word in the text with a fixed length, wherein i is 1,2, TM is less than or equal to M;

Wherein v is_rExpressed as:

in the formula (I), the compound is shown in the specification,

as the ith word of comment text

Corresponding word vectors are obtained by pre-training a word vector matrix

wherein v is_tExpressed as:

in the formula (I), the compound is shown in the specification,

as the ith word of the subject text

Corresponding word vectors are obtained by pre-training a word vector matrix

Wherein the content of the first and second substances,

is a characterization vector of the stem information of the topic,

for the weight matrix, a matrix dot product operation is represented,

is a bias term;

step B14: will form v_rVector sequence of (2)

At each time step will

And

Step B15: the vector sequence { x ] obtained in the step B14₁，x₂，...，x_i，…，x_NThe inputThe bidirectional GRU outputs a hidden layer state vector of the forward layer of the bidirectional GRU at the ith time step

For the reverse layer of a bidirectional GRU, the output hidden layer state vector is

f is an activation function; updating each weight matrix of GRU by adopting spectral normalization at each time step, and using W_i ^GA certain weight matrix representing GRU at the ith time step is obtained to obtain W_i ^GMaximum singular value of

Is represented as follows:

repeating the steps to obtain a forward hidden layer state vector sequence

And reverse hidden layer state vector sequence

Step B16: connecting forward and reverse hiddenLayer state vector to obtain comment characterization vector H of fusion subject, [ H ═ H [₁,...,h_i,...,h_N]^T，

h_iAs forward hidden layer state vectors

And reverse hidden layer state vector

The connection of (1);

step B17: linearly transforming the comment characteristic vector H of the fusion subject, inputting softmax to obtain a word probability distribution matrix B, randomly sampling according to the word probability distribution matrix B, and generating a word sequence y of the comment text, wherein y is { y ═ y }₁，y₂，...，y_i，...，y_N}；

wherein the content of the first and second substances,

2. The method for detecting false user comments as claimed in claim 1, wherein the step B14 is specifically as follows:

first, with X_iInput representing the ith time step

To X_iIn that

And a vertical vector

Expressed as:

in the formula (I), the compound is shown in the specification,

is a parallel vector, and is a parallel vector,

is a vertical vector, and is,

representing a vector

Transposing;

Is subjected to linear transformation to obtain

As Q in a multi-head attention mechanism; to pair

Is subjected to linear transformation to obtain

And

in the formula (I), the compound is shown in the specification,

respectively are weight matrixes to be trained;

then, will

in the formula (I), the compound is shown in the specification,

the output vector of the multi-head attention mechanism in the parallel direction is shown, MHA represents the multi-head attention mechanism, H represents the total number of attention heads,

indicating the result of the calculation of the ith head of attention,

is a weight matrix to be trained;

Mapping between 0 and 1 to obtain parallel vector

Expressed as:

for vertical vector

Is subjected to linear transformation to obtain

As Q in a multi-head attention mechanism, pair

Is subjected to linear transformation to obtain

And

as K and V in the attention mechanism, respectively, will

And obtaining a vertical vector through a softmax function

By using

And

two gate vector pairs X_iInformation screening is carried out to obtain the characterization vector of the fusion subject of the ith time step

Expressed as:

in the formula (I), the compound is shown in the specification,

representing weights in the parallel and perpendicular directions, respectivelyA matrix of values is formed by a matrix of values,

then will be

And random noise

Splicing to obtain the output vector x of the ith time step_iExpressed as:

in the formula (I), the compound is shown in the specification,

(ii) a It is shown that the connection operation is performed,

random noise, expressed as:

in the formula (I), the compound is shown in the specification,

random distribution P from a standard-Gauss distribution_zObtained by intermediate sampling, P_zRandom distribution P conforming to a standard Gaussian distribution and class labels c from conforming to a standard Bernoulli distribution_cMedium sample, c is 1, indicating normal comment, whenWhen c is 0, it represents a false comment.

3. The method for detecting false user comments as claimed in claim 1, wherein the step B2 specifically comprises the following steps:

Expressed as:

in the formula (I), the compound is shown in the specification,

in the formula (I), the compound is shown in the specification,

indicating the position coding direction corresponding to the ith wordQuantity, d represents the dimension of the position vector, which is the same as the dimension of the word vector, and N is the fixed maximum length of the comment text;

step B23: will be provided with

Q_D＝softmax(O_DW^D+b^D)；

In the formula (I), the compound is shown in the specification,

as a weight matrix, the weight matrix is,

is a bias term;

in the formula (I), the compound is shown in the specification,

presentation evaluator tallies reviewsCalculated class conditional probability, θ_dParameter set, Q, representing discriminator D_D ⁱRepresenting the actual class probability distribution of the discriminator on the ith term;

step B25: token vector O of the comment_DInput evaluator D_critic,The evaluator consists of a fully connected layer, O_DAfter linear transformation and softmax, obtaining the category probability distribution V of the comments_D：

In the formula (I), the compound is shown in the specification,

for the evaluator weight matrix of the discriminator,

is a bias term;

step B26: using cross entropy losses

Training the discriminator by using the loss of mean square error

To evaluator D_criticTraining is carried out;

wherein the content of the first and second substances,

expressed as:

in the formula (I), the compound is shown in the specification,

represents a pair S_DThe loss of classification of the sample extracted from S,

wherein the content of the first and second substances,

expressed as:

in the formula (I), the compound is shown in the specification,

4. The method for detecting false user comments as claimed in claim 1, wherein the step B3 specifically comprises the following steps:

Splicing to obtain the comment characterization vector of the fusion subject

Wherein

And with

Adding to obtain the comment characterization vector of position perception

Step B33: to O is_CAfter linear transformation, softmax is input, and the class probability distribution of all the words of the comment by the classifier is calculated

Q_C＝softmax(O_CW_C+b_C)；

In the formula (I), the compound is shown in the specification,

as a weight matrix, the weight matrix is,

is a bias term;

using cross entropy losses

The classifier is pre-trained and the classifier is pre-trained,

the calculation formula of (c) is as follows:

in the formula (I), the compound is shown in the specification,

representing pairs of slave data sets S_LThe expected calculation for the mid-sampled samples yields the expected cross-entropy loss for class c,

step B34: token vector O of the comment_CInput evaluator C_criticThe evaluator consists of a fully connected layer, O_CAfter linear transformation and softmax, obtaining target distribution V of actual class probability distribution_CExpressed as:

in the formula (I), the compound is shown in the specification,

is the evaluator weight matrix of the classifier,

is a bias term;

step B35: using mean square error loss

Evaluator C for classifier_criticTraining is carried out:

5. The method for detecting false user comments as claimed in claim 1, wherein the step C specifically includes the steps of:

step C1: traversing each training sample in the data set S, and obtaining a comment characterization vector v for each training sample according to the processing steps of B11-B12_rAnd a topic representation vector v_tObtaining the main information representation of the subject according to the processing step of B13

Expressed as:

Splicing to obtain the comment characterization vector of the fusion subject

Wherein the content of the first and second substances,

Updating is carried out;

wherein the content of the first and second substances,

expressed as:

in the formula (I), the compound is shown in the specification,

step C5: and training the generator in a reinforcement learning mode.

6. The method for detecting false user comments as claimed in claim 5, wherein the step C5 is specifically as follows:

the process of generating comments by the generator is regarded as a sequence decision process, the generator is used as an agent or an actor in reinforcement learning, and the generated term sequence { y is used in the process of generating comments₁，y₂，...，y_i-1The next word y to be generated is regarded as the current state of the agent_iActions taken by the agent, the actions taken by the agent based on policiesAre slightly distributed

wherein the content of the first and second substances,

to maximize

The generator learns and adjusts the parameter theta of the generator through a gradient strategy algorithm_gExpressed as:

in the formula, Qⁱ-VⁱIs a merit function, wherein:

7. A false user comment detection system comprising a memory, a processor and computer program instructions stored on the memory and executable by the processor, the computer program instructions when executed by the processor being capable of carrying out the method steps of any one of claims 1 to 6.

8. A computer-readable storage medium, on which computer program instructions are stored which are executable by a processor, the method steps of any of claims 1-6 being implementable when the processor executes the computer program instructions.