CN111428513A - False comment analysis method based on convolutional neural network - Google Patents

False comment analysis method based on convolutional neural network Download PDF

Info

Publication number
CN111428513A
CN111428513A CN202010393416.XA CN202010393416A CN111428513A CN 111428513 A CN111428513 A CN 111428513A CN 202010393416 A CN202010393416 A CN 202010393416A CN 111428513 A CN111428513 A CN 111428513A
Authority
CN
China
Prior art keywords
convolution
vector
layer
neural network
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010393416.XA
Other languages
Chinese (zh)
Inventor
宋丹
陆奎
吴杰胜
刘洋
戴旭凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University of Science and Technology
Original Assignee
Anhui University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University of Science and Technology filed Critical Anhui University of Science and Technology
Priority to CN202010393416.XA priority Critical patent/CN111428513A/en
Publication of CN111428513A publication Critical patent/CN111428513A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • G06Q30/0203Market surveys; Market polls

Abstract

The invention discloses a false comment analysis method based on a convolutional neural network, which combines a mainstream short-text feature mining model to identify false commodity reviews in the field of short-text processing. The method first selects a sample data set, taking 80% of the samples as a training set and 20% as a test set. The training set is input into a Word2Vec model to obtain the word vectors. After convolution computation and feature extraction, the model output achieves high precision, recall and F1-score, demonstrating the feasibility of applying convolutional neural network models to false comment identification in practice.

Description

False comment analysis method based on convolutional neural network
Technical Field
The invention relates to the field of machine learning, in particular to a false comment analysis method based on a convolutional neural network.
Background
With the development of the internet, online consumption has become part of daily life. People use comments as an important reference, and many consumption platforms provide user feedback mechanisms. However, driven by commercial interests, many merchants publish false comments; if consumers cannot judge correctly among the mass of comments, their consumption experience suffers, and the development of e-commerce platforms is also harmed.
Therefore, a method for accurately identifying false comments is urgently needed. Scholars have explored many methods, but the results are not ideal and the accuracy is low. Convolutional neural networks have good fault tolerance, parallel processing capability and self-learning capability; they can handle problems without clear background knowledge or explicit inference rules, and can accommodate sample data with large defects or distortions.
Disclosure of Invention
Aiming at problems such as the high cost and low accuracy of existing false comment detection methods, the invention provides a false comment analysis method based on a convolutional neural network. The invention adopts the following technical scheme to realize this purpose:
Step 1: collect comment documents and extract labels from the comments in them; randomly divide the collected comments into a training group and a test group in a ratio of 4:1, and test the identification performance of the network model on false comments using a cross-validation method.
Step 2: encode the sentence using the word vectors output by Word2Vec. The words in the comment are One-Hot encoded, i.e. each word is expressed as a vector of length L (L is the size of the word list), and the sentence is then expressed by these word vectors.
Embedding layer notation: y = w[x]
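For illustration only, the following is a minimal sketch of producing such word vectors with the gensim implementation of Word2Vec and performing the embedding lookup y = w[x]; gensim itself, the toy corpus and the 300-dimensional vector size are assumptions, not requirements of the invention.

    # Minimal sketch (assumptions: gensim is installed; the corpus is already tokenized).
    from gensim.models import Word2Vec
    import numpy as np

    tokenized_reviews = [
        ["this", "hotel", "was", "clean", "and", "quiet"],
        ["terrible", "service", "never", "again"],
    ]

    # Train Word2Vec on the review corpus (300-dimensional vectors assumed).
    w2v = Word2Vec(sentences=tokenized_reviews, vector_size=300, window=5,
                   min_count=1, workers=4)

    def encode(sentence):
        """Embedding lookup y = w[x]: map each known word to its vector."""
        return np.stack([w2v.wv[word] for word in sentence if word in w2v.wv])

    sentence_matrix = encode(["clean", "and", "quiet"])   # shape: (3, 300)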
Step 3: extract features of the sentence through convolution kernels of size h × k, where h is the number of words covered by one convolution (equivalent to the value of N in an N-gram model) and k is the dimensionality of the word vector. Because different convolution window sizes and different numbers of convolution kernels extract different semantic encodings of the text vector, several window sizes are tried: the number of convolution kernels is set to be consistent with the word vector dimension, and the convolution window sizes are set to different values. The resulting feature map has 1 column and (n − h + 1) rows, i.e. c = (c_1, c_2, ..., c_{n−h+1}), where c_i = f(w * x_{i:i+h−1} + b).
Here w is the convolution kernel of the corresponding size, b is the bias, * denotes the convolution operation, and f is the activation function.
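As an illustrative sketch of this convolution (numpy assumed; the random sentence matrix, the ReLU activation and the filter values below are placeholders):

    import numpy as np

    def conv_feature_map(X, w, b, f=lambda z: np.maximum(z, 0)):
        """Compute c_i = f(sum(w * X[i:i+h]) + b) over a sentence matrix X of shape
        (n, k) with a filter w of shape (h, k), giving a feature map of length n - h + 1."""
        n, _ = X.shape
        h = w.shape[0]
        return np.array([f(np.sum(w * X[i:i + h]) + b) for i in range(n - h + 1)])

    X = np.random.rand(10, 300)        # a 10-word sentence with 300-dim word vectors (assumed)
    w = np.random.rand(3, 300)         # convolution window covering h = 3 words
    c = conv_feature_map(X, w, b=0.1)  # feature map of length 10 - 3 + 1 = 8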
Step 4: the output of the previous layer is subjected to a pooling operation to reduce the number of parameters, specifically as follows:
First, features are extracted from the output of the convolutional layer: among the many feature values, a specific value (generally the strongest feature) is selected as the value retained by the pooling layer. The vector screened out by the pooling layer then undergoes a nonlinear transformation and is sent to a classifier for classification.
Pooling layer notation: max(C, axis=1)
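A minimal sketch of this max-over-time pooling (numpy assumed; C stacks the feature maps of step 3 row-wise, one row per convolution kernel):

    import numpy as np

    C = np.random.rand(300, 8)        # (num_kernels, n - h + 1), placeholder values
    pooled = np.max(C, axis=1)        # max(C, axis=1): keep the strongest feature per kernel
    print(pooled.shape)               # (300,)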
Step 5: the fully connected layer obtains the probability of each class using the softmax classifier, and the performance index results for false comment detection are finally obtained, specifically as follows:
softmax function formula:
P(y_i | x_i; w) = exp(w_{y_i} · x_i) / Σ_j exp(w_j · x_i)
Given the input x_i and the parameters w, this is the normalized probability assigned to the correct class label y_i. The softmax classifier assigns a larger probability to the correct class and smaller probabilities to the incorrect ones.
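A small numerical sketch of the softmax output (numpy assumed; the two scores and the class order [genuine, false] are illustrative assumptions):

    import numpy as np

    def softmax(scores):
        """Normalize fully connected layer scores into class probabilities."""
        e = np.exp(scores - np.max(scores))   # subtract the max for numerical stability
        return e / e.sum()

    probs = softmax(np.array([1.2, -0.4]))    # e.g. probabilities for [genuine, false]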
The invention has the following advantages:
1. The method can perform false comment analysis not only on Chinese text but can also analyze and classify English text;
2. By adopting cross-validation, the demand for data samples is reduced, more effective information is obtained from limited data, and over-fitting is reduced to a certain extent.
3. The word vectors output by Word2Vec are used as the embedding layer, which emphasizes the relevance between words and improves the accuracy of the detection result.
Drawings
FIG. 1 is a system flow chart of a convolutional neural network-based false comment analysis method
FIG. 2 is a schematic diagram of a structure based on a convolutional neural network model
Detailed Description
The invention is further illustrated by the following specific examples.
With reference to fig. 1 and fig. 2, a convolutional neural network-based false comment analysis method includes the following steps:
Step 1: collect commodity reviews and preprocess the data, filtering out invalid reviews (for example, sentences that are too short or duplicated), since invalid comments would affect the accuracy of the experimental results. Labels are added to the processed comment documents, and the comment samples are then randomly divided into a training group and a test group in a ratio of 4:1; a cross-validation method is used to test the identification performance of the network model on false comments. Specifically, the comment samples are randomly divided into five parts: one part is selected as the test group and the other four as the training group; after one round of testing, another part is selected as the test group and the remaining four as the training group, until all data samples have been traversed.
Step 2: encode the sentence using the word vectors output by Word2Vec. The words in the comment are One-Hot encoded, i.e. each word is expressed as a vector of length L (L is the size of the word list), and the sentence is then expressed by these word vectors.
For example, take the sentence "write code, change world".
The method is developed using word vectors; in general, the word vectors can be trained with a Google open-source word2vec program. Suppose the word vectors are "write" = (1.1, 2.1), "code" = (1.5, 2.9), "change" = (2.7, 3.1) and "world" = (2.9, 3.5). The word sequence "write code, change world" can then be rewritten as ((1.1, 2.1), (1.5, 2.9), (2.7, 3.1), (2.9, 3.5)): the original text sequence is a 4 × 1 vector, and the rewritten text can be represented as a 4 × 2 matrix.
In general, any text sequence can be represented as an m × d matrix, where m is the number of words in the sequence and d is the dimensionality of the word vectors.
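As a small illustration of this representation, using the example vectors quoted above (numpy assumed):

    import numpy as np

    word_vectors = {                      # the example 2-dimensional word vectors
        "write":  [1.1, 2.1],
        "code":   [1.5, 2.9],
        "change": [2.7, 3.1],
        "world":  [2.9, 3.5],
    }
    sentence = ["write", "code", "change", "world"]
    X = np.array([word_vectors[w] for w in sentence])   # shape (4, 2): m = 4 words, d = 2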
Step 3: the convolution operation first slides a window over the m × d input matrix. Assuming the window covers h = 2 words and slides with a stride of 1, each convolution is performed on a 2 × d window, and m − h + 1 = m − 1 results are obtained.
These results are then concatenated to form an (m − 1)-dimensional feature vector, where c_i = f(w * x_{i:i+h−1} + b).
Here x_{i:i+h−1} is the h × d sliding window matrix formed by rows i through i + h − 1 of the input matrix; w is a filter, also of size h × d; b is a bias parameter; f is a non-linear activation function; and c_i is a scalar.
Finally, different values of h are set in the convolutional layer to generate several different filters. The convolution operation of each filter is equivalent to extracting one feature, so different features can be extracted through filters with different sliding window sizes.
Step 4: max pooling is adopted. The largest value in the feature vector generated by each filter of the previous layer is selected as a feature value, and all these values form the feature vector of the pooling layer. This vector is finally sent to the fully connected layer, and a Softmax function is used to classify each comment vector to obtain the false comment result.
Step 5: compare the classification result obtained by the Softmax function with the data set label. If they are the same, the parameters are kept unchanged; if they differ, the parameters are adjusted and training is repeated until the loss function converges.
Example 1
This experiment verifies the invention using the gold-standard deceptive hotel review data set collected by Myle Ott et al., which is divided into two categories, positive and negative comments, each of which is further divided into true and false comments. Each of the four types contains an equal number of samples (400), giving a total of 1600 hotel reviews as data samples, and each data sample carries these two characteristic labels from the outset.
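Purely as an illustration of preparing such a corpus for the experiment below, the following sketch loads a four-category review collection, performs the 4:1 split and sets up 5-fold cross-validation; the directory names, label coding and scikit-learn usage are assumptions, not details given in the patent or in the Ott et al. corpus specification.

    import os
    import numpy as np
    from sklearn.model_selection import train_test_split, KFold

    CATEGORIES = {                       # assumed folder names; 1 = false comment, 0 = genuine
        "positive_truthful": 0, "positive_deceptive": 1,
        "negative_truthful": 0, "negative_deceptive": 1,
    }

    def load_reviews(root):
        texts, labels = [], []
        for folder, label in CATEGORIES.items():
            for name in sorted(os.listdir(os.path.join(root, folder))):
                with open(os.path.join(root, folder, name), encoding="utf-8") as fh:
                    texts.append(fh.read())
                    labels.append(label)
        return np.array(texts), np.array(labels)

    # texts, labels = load_reviews("review_corpus")            # 1600 reviews, 400 per category
    # X_train, X_test, y_train, y_test = train_test_split(     # 80% training / 20% test
    #     texts, labels, test_size=0.2, random_state=42)
    # for tr_idx, te_idx in KFold(n_splits=5, shuffle=True).split(texts):   # 5-fold CV
    #     ...  # train on texts[tr_idx], evaluate on texts[te_idx]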
A false comment analysis method based on a convolutional neural network comprises the following steps:
Step 1 is executed: first, the sample data set is loaded from the designated path. Because the number of training samples has a direct influence on the experimental result, the experiment randomly divides the data set, with 80% of the samples used as the training set to train the data model and 20% of the samples used as the test set to validate it.
Step 2 is executed: the sample data set is input into a Word2Vec model, so that each data sample is represented by word vectors. The processed word vectors are sent into the convolutional layer; convolution windows of different sizes are tried in the experiment, and the number of convolution kernels is finally set to 300, consistent with the word vector dimension.
(1) The convolution window sizes are set to 1 through 5, the number of convolution kernels for each window length is set to 300, and the stride is set to 1; each convolution kernel then performs a sliding convolution over the sentence vectors.
(2) After process (1) is repeated, 300 feature vectors are extracted for each convolution window length, and the feature vectors obtained with different convolution window lengths differ.
Step 3 is executed: in the pooling layer, the maximum value of the feature vector obtained for each convolution window length is taken. The multiple convolution kernels of the convolutional layer yield multiple feature maps, which the pooling layer reduces to multiple one-dimensional vectors; these are spliced together to form the input of the fully connected layer, where the output vectors of the previous layer are weighted and summed. The output layer uses dropout to prevent the model from over-fitting, and finally the class with the largest value among the output results is taken as the prediction result of the model.
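As an illustrative sketch of the network described in this example (TensorFlow/Keras is assumed; the vocabulary size, sequence length, dropout rate and optimizer are assumptions, while the window sizes 1-5, the 300 kernels per size, the max pooling, dropout and two-class softmax output follow the text above):

    from tensorflow.keras import layers, Model

    vocab_size, seq_len, embed_dim = 20000, 200, 300   # assumed sizes

    inputs = layers.Input(shape=(seq_len,), dtype="int32")
    x = layers.Embedding(vocab_size, embed_dim)(inputs)   # Word2Vec weights could be loaded here

    # One branch per convolution window length 1..5, 300 kernels each, stride 1.
    branches = []
    for h in range(1, 6):
        c = layers.Conv1D(filters=300, kernel_size=h, strides=1, activation="relu")(x)
        branches.append(layers.GlobalMaxPooling1D()(c))    # max-over-time pooling

    merged = layers.Concatenate()(branches)
    merged = layers.Dropout(0.5)(merged)                     # dropout against over-fitting
    outputs = layers.Dense(2, activation="softmax")(merged)  # genuine vs. false comment

    model = Model(inputs, outputs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=5)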
After completing steps 2-3 and fixing the model parameters, the model is verified using the test set; the results are shown in the table below.
Table 1. Test results (the precision, recall and F1-score values are presented as an image in the original publication and are not reproduced here)
The test results on the test set samples show high precision, recall and F1-score, sufficient for most detection tasks.
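For reference, precision, recall and F1-score of the kind reported in Table 1 can be computed, for example, with scikit-learn (an assumption; y_test and y_pred below are placeholders for the test labels and the model predictions):

    from sklearn.metrics import classification_report

    y_test = [0, 1, 1, 0, 1]          # placeholder ground-truth labels (1 = false comment)
    y_pred = [0, 1, 0, 0, 1]          # placeholder model predictions
    print(classification_report(y_test, y_pred,
                                target_names=["genuine", "false"],
                                digits=4))   # precision, recall, f1-score per class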
The above description is provided to help those skilled in the art understand the present invention; it is not intended to limit the scope of the invention, which is defined by the appended claims and their equivalents and may be applied, directly or indirectly, in other fields without departing from that scope.

Claims (2)

1. A false comment analysis method based on a convolutional neural network is characterized by comprising the following steps:
Step 1: collect comment documents and extract labels from the comments in them; randomly divide the collected comments into a training group and a test group in a ratio of 4:1, and test the identification performance of the network model on false comments using a cross-validation method.
Step 2: encode the sentences using the word vectors output by Word2Vec (in general, the word vectors can be trained with a Google open-source word2vec program). The words are One-Hot encoded, i.e. each word is represented as a vector of length L (L is the size of the word list), and the sentence is then expressed by these word vectors.
The embedding layer word vector output is: y = w[x]
Step 3: extract features of the sentence through convolution kernels of size h × k, where h is the number of words covered by one convolution (equivalent to the value of N in an N-gram model) and k is the dimensionality of the word vector. Because different convolution window sizes and different numbers of convolution kernels extract different semantic encodings of the text vector, several window sizes are tried: the number of convolution kernels is set to be consistent with the word vector dimension, and the convolution window sizes are set to different values. The feature map has 1 column and (n − h + 1) rows, i.e. c = (c_1, c_2, ..., c_{n−h+1}),
where c_i = f(w * x_{i:i+h−1} + b).
Here w is the convolution kernel of the corresponding size, b is the bias, * denotes the convolution operation, and f is the activation function.
Step 4: the output of the previous layer is subjected to a pooling operation to reduce the number of parameters, specifically as follows:
First, features are extracted from the output of the convolutional layer: among the many feature values, a specific value (generally the strongest feature) is selected as the value retained by the pooling layer. The vector screened out by the pooling layer then undergoes a nonlinear transformation and is sent to a classifier for classification.
The pooling layer output is: max(C, axis=1)
Step 5: the fully connected layer obtains the probability of each class using the softmax classifier, and the performance index results for false comment detection are finally obtained, specifically as follows:
softmax function formula:
P(y_i | x_i; w) = exp(w_{y_i} · x_i) / Σ_j exp(w_j · x_i)
Given the input x_i and the parameters w, this is the normalized probability assigned to the correct class label y_i. The softmax classifier assigns a larger probability to the correct class and smaller probabilities to the incorrect ones.
2. The convolutional neural network-based false comment analysis method of claim 1, characterized in that: when word vectors are obtained using a Word2Vec model, a three-layer neural network is used to construct a probabilistic language model that converts text into numeric vectors; the probability of the next word is predicted from the currently known first n−1 words, and the first n−1 word vectors are concatenated end to end to serve as the input layer.
CN202010393416.XA 2020-05-11 2020-05-11 False comment analysis method based on convolutional neural network Pending CN111428513A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010393416.XA CN111428513A (en) 2020-05-11 2020-05-11 False comment analysis method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010393416.XA CN111428513A (en) 2020-05-11 2020-05-11 False comment analysis method based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN111428513A true CN111428513A (en) 2020-07-17

Family

ID=71555131

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010393416.XA Pending CN111428513A (en) 2020-05-11 2020-05-11 False comment analysis method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN111428513A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112307755A (en) * 2020-09-28 2021-02-02 天津大学 Multi-feature and deep learning-based spam comment identification method
CN112463966A (en) * 2020-12-08 2021-03-09 北京邮电大学 False comment detection model training method, detection method and device
CN112905739A (en) * 2021-02-05 2021-06-04 北京邮电大学 False comment detection model training method, detection method and electronic equipment
CN114492423A (en) * 2021-12-28 2022-05-13 广州大学 False comment detection method, system and medium based on feature fusion and screening

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108345587A (en) * 2018-02-14 2018-07-31 广州大学 A kind of the authenticity detection method and system of comment
CN109670542A (en) * 2018-12-11 2019-04-23 田刚 A kind of false comment detection method based on comment external information

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108345587A (en) * 2018-02-14 2018-07-31 广州大学 A kind of the authenticity detection method and system of comment
CN109670542A (en) * 2018-12-11 2019-04-23 田刚 A kind of false comment detection method based on comment external information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李静 [Li Jing]: "基于卷积神经网络的虚假评论的识别" [Identification of false comments based on convolutional neural networks], 《软件》 [Software] *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112307755A (en) * 2020-09-28 2021-02-02 天津大学 Multi-feature and deep learning-based spam comment identification method
CN112463966A (en) * 2020-12-08 2021-03-09 北京邮电大学 False comment detection model training method, detection method and device
CN112463966B (en) * 2020-12-08 2024-04-05 北京邮电大学 False comment detection model training method, false comment detection model training method and false comment detection model training device
CN112905739A (en) * 2021-02-05 2021-06-04 北京邮电大学 False comment detection model training method, detection method and electronic equipment
CN114492423A (en) * 2021-12-28 2022-05-13 广州大学 False comment detection method, system and medium based on feature fusion and screening
CN114492423B (en) * 2021-12-28 2022-10-18 广州大学 False comment detection method, system and medium based on feature fusion and screening

Similar Documents

Publication Publication Date Title
CN109543084B (en) Method for establishing detection model of hidden sensitive text facing network social media
CN111428513A (en) False comment analysis method based on convolutional neural network
CN112364638B (en) Personality identification method based on social text
WO2022116536A1 (en) Information service providing method and apparatus, electronic device, and storage medium
CN110309867B (en) Mixed gas identification method based on convolutional neural network
CN110516074B (en) Website theme classification method and device based on deep learning
CN109492230B (en) Method for extracting insurance contract key information based on interested text field convolutional neural network
CN112732916A (en) BERT-based multi-feature fusion fuzzy text classification model
CN111475622A (en) Text classification method, device, terminal and storage medium
CN111966825A (en) Power grid equipment defect text classification method based on machine learning
CN112966068A (en) Resume identification method and device based on webpage information
CN112667782A (en) Text classification method, device, equipment and storage medium
CN113742733A (en) Reading understanding vulnerability event trigger word extraction and vulnerability type identification method and device
CN113987187A (en) Multi-label embedding-based public opinion text classification method, system, terminal and medium
CN111274494B (en) Composite label recommendation method combining deep learning and collaborative filtering technology
WO2023159756A1 (en) Price data processing method and apparatus, electronic device, and storage medium
CN115712740A (en) Method and system for multi-modal implication enhanced image text retrieval
CN113377844A (en) Dialogue type data fuzzy retrieval method and device facing large relational database
CN112347252A (en) Interpretability analysis method based on CNN text classification model
CN115759095A (en) Named entity recognition method and device for tobacco plant diseases and insect pests
CN116089605A (en) Text emotion analysis method based on transfer learning and improved word bag model
CN114648029A (en) Electric power field named entity identification method based on BiLSTM-CRF model
CN111046934B (en) SWIFT message soft clause recognition method and device
CN114626367A (en) Sentiment analysis method, system, equipment and medium based on news article content
CN113297376A (en) Legal case risk point identification method and system based on meta-learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200717

WD01 Invention patent application deemed withdrawn after publication