CN111984931B - Public opinion calculation and deduction method and system for social event web text

Info

Publication number
CN111984931B
Authority
CN
China
Prior art keywords
text
social event
layer
network
output result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010841830.2A
Other languages
Chinese (zh)
Other versions
CN111984931A (en)
Inventor
王欣芝
彭艳
骆祥峰
刘杨
罗均
谢少荣
张丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority to CN202010841830.2A
Publication of CN111984931A
Application granted
Publication of CN111984931B
Legal status: Active

Classifications

    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08 Learning methods
    • G06Q50/01 Social networking


Abstract

The invention discloses a public opinion calculation and deduction method and system for social event web texts, relating to the technical field of web text processing. The method comprises the following steps: acquiring a social event web text; preprocessing the social event web text to obtain network social event text character features, word features and implicit features; respectively inputting the character features, word features and implicit features into a trained social emotion calculation model and a trained text emotion calculation model for prediction, obtaining six emotion probabilities of the social event web text; and determining the emotional orientation of the social event web text with a voting mechanism according to the six emotion probabilities. By analyzing the multiple emotions of a social event web text, the method and system can determine its final emotional orientation.

Description

Public opinion calculation and deduction method and system for social event web text
Technical Field
The invention relates to the technical field of web text processing, and in particular to a public opinion calculation and deduction method and system for social event web texts.
Background
With the development of the internet and network media, more and more emergency-management decision makers and scholars pay attention to the complexity of the emotions triggered by social event network information, and to the adverse consequences of handling such information improperly; public opinion analysis of social events has therefore become an active research topic. When a new event occurs, the reliability of its handling can be improved if historical handling processes can be drawn on, that is, handling clues for the new event are obtained from the handling of historical cases. Public opinion calculation for social event network text based on supervised learning aims to summarize the rules of related historical events, infer the public opinion trend of the social event text to be analyzed, and understand the current event on the basis of existing historical information. The goal is to perform effective emotion calculation on a new event that has such reference information, so as to help decision makers understand and guide social phenomena.
Traditional public opinion calculation methods for social event network text assume that the text carries a single emotion. In practical engineering applications, however, the emotions carried by the statements published by their authors are often diverse.
Disclosure of Invention
The invention aims to provide a public opinion calculation and deduction method and system for social event web texts, so as to determine the final emotional orientation of a social event web text through the analysis of its multiple emotions.
In order to achieve the purpose, the invention provides the following scheme:
A public opinion calculation and deduction method for social event network texts comprises the following steps:
acquiring a social event network text;
preprocessing the social event network text to obtain network social event text character features, word features and implicit features;
respectively inputting the network social event text character features, word features and implicit features into a trained social emotion calculation model and a trained text emotion calculation model for prediction, to obtain six emotion probabilities of the social event network text;
and determining the emotional orientation of the social event network text by adopting a voting mechanism method according to the six emotional probabilities of the social event network text.
Optionally, the training process of the trained social emotion calculation model includes:
acquiring initial characteristics of a network social event text to be trained; the initial characteristics of the network social event text to be trained comprise initial character characteristics of the network social event text, initial word characteristics of the network social event text and initial implicit characteristics of the network social event text;
inputting the initial features of the network social event text to be trained into a word embedding vector layer of a CNN-LSTM model to obtain the network social event text initial features in dense word-embedding form; the specific formula is as follows:
$$v_i^j = \mathrm{Embedding}(w_i^j)$$
where $w_i^j$ is the one-hot vector representing the $i$-th initial feature of the $j$-th sample in the network social event text to be trained, and $v_i^j$ is the corresponding word vector, namely the network social event text initial feature in dense word-embedding form;
determining the word vectors in each sliding window according to the sliding window and the network social event text initial features in dense word-embedding form;
inputting the word vectors in the sliding window into a CNN convolution layer of the CNN-LSTM model to determine text feature vectors; the specific formula is as follows:
$$c_i^j = \mathrm{Conv}([v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}])$$
where $c_i^j$ is the text feature vector obtained after convolution-layer processing, $[v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}]$ is the word vectors in the $i$-th sliding window, and $[\cdot]$ denotes vector concatenation;
inputting the text feature vector into a ReLU activation layer of the CNN-LSTM model to obtain the output result of the ReLU activation layer; the specific formula is as follows:
$$r_i^j = \mathrm{ReLU}(c_i^j)$$
where $r_i^j$ is the output result of the ReLU activation layer;
inputting the output result of the ReLU activation layer into the LSTM layers of the CNN-LSTM model to obtain the output result of the LSTM layers; the specific formulas are as follows:
$$h_i^{j,1} = \mathrm{LSTM}_1(r_i^j)$$
$$h_i^{j,2} = \mathrm{LSTM}_2(h_i^{j,1})$$
where $h_i^{j,1}$ is the output result of the first LSTM layer, and $h_i^{j,2}$ is the output result of the second LSTM layer;
performing a dropout operation on the output result of the LSTM layers to obtain the output result of the dropout operation; the specific formula is as follows:
$$d_i^j = \mathrm{Dropout}(h_i^{j,2})$$
where $d_i^j$ is the output result of the dropout operation;
performing mean pooling on the output result of the dropout operation to determine the valid data; the specific formula is as follows:
$$m^j = \frac{1}{\sum_{i=1}^{N}\delta_i}\sum_{i=1}^{N}\delta_i\, d_i^j$$
where $m^j$ is the valid data, $\delta_i \in \{0, 1\}$ is the validity parameter, determined by whether the data in the current sliding window is valid, and $N$ is the length of the network social event text after padding with the default value;
inputting the valid data into a fully connected layer of the CNN-LSTM model to obtain the output result of the fully connected layer, and performing softmax classification on the output result of the fully connected layer to determine the six emotion probabilities of the network social event text; the specific formulas are as follows:
$$z^j = W^T m^j + b$$
$$p_l^j = \frac{\exp(z_l^j)}{\sum_{k=1}^{6}\exp(z_k^j)}$$
where $z^j$ is the output result of the fully connected layer, $W^T$ is the transpose of the weight parameters in the fully connected layer, $b$ is the bias in the fully connected layer, $z_l^j$ is the value of the $j$-th sample in the $l$-th emotion dimension, and $p_l^j$ is the predicted probability that the $j$-th sample in the network social event text expresses the $l$-th emotion;
determining a loss function according to the six emotion probabilities of the network social event text by the formula
$$L = -\sum_{j}\sum_{l=1}^{6} y_l^j \log p_l^j$$
where $L$ is the loss function and $y_l^j$ is the true value of the $j$-th sample in the network social event text on the $l$-th emotion dimension;
and optimizing the parameters of the CNN-LSTM model with the goal of minimizing the loss function, to obtain the trained social emotion calculation model.
Optionally, the training process of the trained text emotion calculation model includes:
acquiring initial characteristics of a network social event text to be trained; the initial characteristics of the network social event text to be trained comprise initial character characteristics of the network social event text, initial word characteristics of the network social event text and initial implicit characteristics of the network social event text;
inputting the initial features of the network social event text to be trained into a word embedding vector layer of a CNN-LSTM-STACK model to obtain the network social event text initial features in dense word-embedding form; the specific formula is as follows:
$$v_i^j = \mathrm{Embedding}(w_i^j)$$
where $w_i^j$ is the one-hot vector representing the $i$-th initial feature of the $j$-th sample in the network social event text to be trained, and $v_i^j$ is the corresponding word vector, namely the network social event text initial feature in dense word-embedding form;
determining the word vectors in each sliding window according to the sliding window and the network social event text initial features in dense word-embedding form;
inputting the word vectors in the sliding window into a CNN convolution layer of the CNN-LSTM-STACK model to determine text feature vectors; the specific formula is as follows:
$$c_i^j = \mathrm{Conv}([v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}])$$
where $c_i^j$ is the text feature vector obtained after convolution-layer processing and $[v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}]$ is the word vectors in the $i$-th sliding window;
inputting the text feature vector into a ReLU activation layer of the CNN-LSTM-STACK model to obtain the output result of the ReLU activation layer; the specific formula is as follows:
$$r_i^j = \mathrm{ReLU}(c_i^j)$$
where $r_i^j$ is the output result of the ReLU activation layer;
inputting the output result of the ReLU activation layer into the LSTM layers of the CNN-LSTM-STACK model to obtain the output result of the LSTM layers; the specific formulas are as follows:
$$h_i^{j,1} = \mathrm{LSTM}_1(r_i^j)$$
$$h_i^{j,2} = \mathrm{LSTM}_2(h_i^{j,1})$$
where $h_i^{j,1}$ is the output result of the first LSTM layer, and $h_i^{j,2}$ is the output result of the second LSTM layer;
performing a dropout operation on the output result of the LSTM layers to obtain the output result of the dropout operation; the specific formula is as follows:
$$d_i^j = \mathrm{Dropout}(h_i^{j,2})$$
where $d_i^j$ is the output result of the dropout operation;
inputting the network social event text initial features in dense word-embedding form into the fully connected layer of the original-feature attention mechanism of the CNN-LSTM-STACK model to obtain the output result of that fully connected layer, and applying sigmoid activation to that output result to determine the output result of the original-feature attention mechanism; the specific formulas are as follows:
$$u_i^j = W_a^T v_i^j + b_a$$
$$a_i^j = \mathrm{sigmoid}(u_i^j)$$
where $u_i^j$ is the output result of the fully connected layer of the original-feature attention mechanism, $W_a$ and $b_a$ are its weight parameters and bias, and $a_i^j$ is the output of the original-feature attention mechanism;
performing mean pooling jointly on the output result of the dropout operation and the output result of the original-feature attention mechanism to determine the valid data; the specific formula is as follows:
$$m^j = \frac{1}{\sum_{i=1}^{N}\delta_i}\sum_{i=1}^{N}\delta_i\,\big(d_i^j \odot a_i^j\big)$$
where $m^j$ is the valid data, $\odot$ denotes the element-wise combination of the dropout output with the attention output, $\delta_i \in \{0, 1\}$ is the validity parameter determined by whether the data in the current sliding window is valid, and $N$ is the length of the network social event text after padding with the default value;
inputting the valid data into a fully connected layer of the CNN-LSTM-STACK model to obtain the output result of the fully connected layer, and performing softmax classification on the output result of the fully connected layer to determine the six emotion probabilities of the network social event text; the specific formulas are as follows:
$$z^j = W^T m^j + b$$
$$p_l^j = \frac{\exp(z_l^j)}{\sum_{k=1}^{6}\exp(z_k^j)}$$
where $z^j$ is the output result of the fully connected layer, $W^T$ is the transpose of the weight parameters in the fully connected layer, $b$ is the bias in the fully connected layer, $z_l^j$ is the value of the $j$-th sample in the $l$-th emotion dimension, and $p_l^j$ is the predicted probability that the $j$-th sample in the network social event text expresses the $l$-th emotion;
determining a loss function according to the six emotion probabilities of the network social event text by the formula
$$L = -\sum_{j}\sum_{l=1}^{6} y_l^j \log p_l^j$$
where $L$ is the loss function and $y_l^j$ is the true value of the $j$-th sample in the network social event text on the $l$-th emotion dimension;
and optimizing the parameters of the CNN-LSTM-STACK model with the goal of minimizing the loss function, to obtain the trained text emotion calculation model.
Optionally, determining the emotional orientation of the social event web text by a voting mechanism according to its six emotion probabilities specifically includes:
acquiring the six emotion probabilities of the social event network text;
acquiring the number of those six emotion probabilities that exceed the valid misjudgment threshold;
and determining the emotional orientation of the social event network text by threshold comparison according to that number.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the invention provides a public sentiment calculation and deduction method and a system of a social event network text, which are characterized in that the character features, word features and implicit features of the network social event text are input into a trained text sentiment calculation model and a trained social sentiment calculation model to obtain six sentiment probabilities of the social event network text, and the sentiment orientation of the social event network text is determined through a voting mechanism, so that the final sentiment orientation of the social event network text is realized through the analysis of multiple sentiments of the social event network text.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive labor.
FIG. 1 is a flow chart of the public opinion calculation and deduction method for social event web text according to the present invention;
FIG. 2 is a schematic diagram of the public opinion calculation and deduction method for social event web text according to the present invention;
FIG. 3 is a schematic diagram of the CNN-LSTM model of the method according to the present invention;
FIG. 4 is a schematic diagram of the CNN-LSTM-STACK model of the method according to the present invention;
FIG. 5 is a schematic diagram of the voting mechanism of the method according to the present invention;
FIG. 6 is a schematic diagram of the modules of the CNN-LSTM model of the method according to the present invention;
FIG. 7 is a schematic diagram of the original-feature attention mechanism of the CNN-LSTM-STACK model of the method according to the present invention;
FIG. 8 is a schematic diagram of the public opinion calculation and deduction system for social event web text according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a public opinion calculation and deduction method and system for social event web texts, which determine the final emotional orientation of a social event web text through the analysis of its multiple emotions.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in fig. 1, the public opinion calculating and deducing method of social event web text provided by the present invention includes:
step 101: and acquiring a social event network text. Assuming that enough label data supports supervised emotion calculation of social event related text, the acquired data comprises social event network text and emotion labels corresponding to the text.
Step 102: preprocess the social event network text to obtain the network social event text character features, word features and implicit features. The character features, word features and implicit features are divided into a training set, a validation set and a test set according to a certain proportion, as sketched below.
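The text does not fix the split proportion; a minimal sketch of this data division, assuming a common 8:1:1 ratio, could look like this:

```python
import random

def split_dataset(samples, ratios=(0.8, 0.1, 0.1), seed=42):
    """Split preprocessed feature samples into train/validation/test sets.

    The 8:1:1 ratio is an assumption; the text only says the features
    are divided 'according to a certain proportion'.
    """
    samples = samples[:]                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(samples)  # reproducible shuffle
    n_train = int(len(samples) * ratios[0])
    n_val = int(len(samples) * ratios[1])
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])
```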
Step 103: respectively input the network social event text character features, word features and implicit features into the trained social emotion calculation model and the trained text emotion calculation model for prediction, obtaining six emotion probabilities of the network social event text.
Step 104: and determining the emotional orientation of the social event network text by adopting a voting mechanism method according to the six emotional probabilities of the social event network text.
The specific training process of the trained social emotion calculation model is as follows:
Acquire the initial features of the network social event text to be trained; these comprise the initial character features, the initial word features and the initial implicit features of the network social event text.
Module one: dense word-vector representation of the input data
The $i$-th initial feature of the $j$-th sample in the network social event text character-feature training set is expressed as a one-hot vector $w_i^j$ whose length is the vocabulary size, and the $j$-th sample is recorded as $[w_1^j, w_2^j, \dots, w_N^j]$, where $N$ is the length of the network social event text after padding with the default value: if the text is shorter than $N$, the default value 'None' is appended at the tail. For example, if $N$ is 5, the two-word text 'love cat' is padded to 'love cat None None None'. The CNN-LSTM model contains two modules. Module one comprises the input layer (Input), the word embedding vector layer, the CNN convolution layer and the ReLU activation layer; module two comprises the LSTM layers, the dropout layer, the mean pooling layer (Mean), the fully connected layer (dense), the softmax classification and the output layer, as shown in fig. 3.
The initial features of the network social event text to be trained are input into the word embedding vector layer of the CNN-LSTM model to obtain the network social event text initial features in dense word-embedding form. The specific formula is as follows:
$$v_i^j = \mathrm{Embedding}(w_i^j)$$
where $w_i^j$ is the one-hot vector representing the $i$-th initial feature of the $j$-th sample in the network social event text to be trained, and $v_i^j$ is the word vector corresponding to the one-hot vector, namely the network social event text initial feature in dense word-embedding form. The dimension of $w_i^j$ equals the vocabulary length; only the dimension corresponding to the feature takes the value 1, and every other dimension is 0. The one-hot expression is sparse and insufficient to reflect the semantic relationships in the input data. The length of $v_i^j$ can be any dimension, with each dimension taking values between 0 and 1. Compared with the one-hot vector it is semantically dense and can reflect part of the semantic relationships in the data.
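For illustration, the sketch below shows this one-hot-to-dense mapping with a trainable embedding table; PyTorch is an assumed framework and the vocabulary size is a placeholder, while the 100-dimensional word vectors match the experiment settings quoted later:

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 5000   # placeholder vocabulary length (the one-hot dimension)
EMBED_DIM = 100     # word-vector length used in the experiments below

# nn.Embedding maps a token index (equivalent to multiplying a one-hot
# vector w_i^j by the embedding matrix) to a dense, trainable vector v_i^j.
embedding = nn.Embedding(VOCAB_SIZE, EMBED_DIM)

token_ids = torch.tensor([[12, 7, 431, 0, 0]])   # one sample padded to N = 5
dense = embedding(token_ids)
print(dense.shape)                               # torch.Size([1, 5, 100])
```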
The word vectors in each sliding window are determined according to the sliding window and the network social event text initial features in dense word-embedding form. The sliding window traverses the padded text so as to keep the context information in the text. If the window size is 5, the padding $w_{-1}^j = w_0^j = \mathrm{None}$ can be set, which ensures that the content acquired by the first sliding window takes the first word of the text as its center word. The content of the $i$-th window can be expressed as $[w_{i-2}, w_{i-1}, w_i, w_{i+1}, w_{i+2}]$, with corresponding word vectors $[v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}]$.
The word vectors in the sliding window are input into the CNN convolution layer of the CNN-LSTM model to determine the text feature vectors. The specific formula is as follows:
$$c_i^j = \mathrm{Conv}([v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}])$$
where $c_i^j$ is the text feature vector obtained after convolution-layer processing, $[v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}]$ is the word vectors in the $i$-th sliding window, and $[\cdot]$ denotes vector concatenation. The convolution calculation extracts information useful for emotion analysis from the dense word-embedding expression. At this stage one window of data corresponds to one output datum, different windows share the same convolution operation, and no relationship between windows has yet been established.
The text feature vector is input into the ReLU activation layer of the CNN-LSTM model to obtain the output result of the ReLU activation layer. The specific formula is as follows:
$$r_i^j = \mathrm{ReLU}(c_i^j)$$
where $r_i^j$ is the output result of the ReLU activation layer. The ReLU activation enhances the extracted features, enlarging or reducing the effect of part of the features on the result. $r_i^j$ is the output data of module one and simultaneously the input data of module two. At this stage the data of different windows remain independent, and no semantic relationship between the data of different windows has been established.
Module two: emotion calculation based on dense word vectors
The output result of the ReLU activation layer is input into the LSTM layers of the CNN-LSTM model to obtain the output result of the LSTM layers. The specific formulas are as follows:
$$h_i^{j,1} = \mathrm{LSTM}_1(r_i^j)$$
$$h_i^{j,2} = \mathrm{LSTM}_2(h_i^{j,1})$$
where $h_i^{j,1}$ is the output result of the first LSTM layer and $h_i^{j,2}$ is the output result of the second LSTM layer. The CNN-LSTM model uses two LSTM layers for data processing; after this operation, context semantic relationships are established between the previously independent windows, which are no longer isolated from each other.
A dropout operation is performed on the output result of the LSTM layers to obtain the output result of the dropout operation. The specific formula is as follows:
$$d_i^j = \mathrm{Dropout}(h_i^{j,2})$$
where $d_i^j$ is the output result of the dropout operation. The value range of $i$ is $[1, N]$, where $N$ is the length of the social event network text after padding with the default value, i.e. if the text length is less than $N$, the default value 'None' is appended at the end. Since the actual length of the text is generally not $N$, a validity parameter $\delta_i$ is defined: if the data pointed to by $i$ is valid data, $\delta_i = 1$; if the data pointed to by $i$ is default (padding) data, $\delta_i = 0$. The dropout operation, applied on top of $h_i^{j,2}$, randomly inhibits the activity of part of the neurons to prevent overfitting.
To unify the calculation process without influencing the emotional tendency of the text corresponding to the social event, the mean pooling operation processes only the valid data, retaining the valid data and discarding the invalid data. Mean pooling is performed on the output result of the dropout operation to determine the valid data. The specific formula is as follows:
$$m^j = \frac{1}{\sum_{i=1}^{N}\delta_i}\sum_{i=1}^{N}\delta_i\, d_i^j$$
where $m^j$ is the valid (pooled) data, $\delta_i$ is the validity parameter taking the value 0 or 1 according to whether the data in the current sliding window is valid, and $N$ is the length of the network social event text after padding with the default value.
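The masked pooling is straightforward to implement; a sketch that mirrors the formula for $m^j$, with $\delta$ passed in as a 0/1 tensor:

```python
import torch

def masked_mean_pool(d, delta):
    """Mean-pool only the valid positions, as in the formula for m^j.

    d:     (batch, N, hidden) dropout output d_i^j
    delta: (batch, N) validity indicators, 1.0 for real tokens and
           0.0 for 'None' padding
    """
    delta = delta.unsqueeze(-1)                 # (batch, N, 1)
    summed = (d * delta).sum(dim=1)             # zero out padding, then sum
    count = delta.sum(dim=1).clamp(min=1.0)     # number of valid positions
    return summed / count                       # (batch, hidden)

d = torch.randn(2, 5, 128)
delta = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]], dtype=torch.float)
print(masked_mean_pool(d, delta).shape)         # torch.Size([2, 128])
```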
The valid data is input into the fully connected layer of the CNN-LSTM model to obtain the output result of the fully connected layer, and softmax classification is performed on that output to determine the six emotion probabilities of the network social event text. The specific formulas are as follows:
$$z^j = W^T m^j + b$$
$$p_l^j = \frac{\exp(z_l^j)}{\sum_{k=1}^{6}\exp(z_k^j)}$$
where $z^j$ is the output result of the fully connected layer, $W^T$ is the transpose of the weight parameters in the fully connected layer, $b$ is the bias in the fully connected layer, $z_l^j$ is the value of the $j$-th sample in the $l$-th emotion dimension, and $p_l^j$ is the predicted probability that the $j$-th sample in the network social event text expresses the $l$-th emotion. The full connection transforms the data into six dimensions corresponding to the six emotions, with $W$ and $b$ the weight parameters and bias of the full connection. The softmax operation establishes mutual exclusivity among the six dimensions and limits the value range of the emotion intensity to $[0, 1]$. The valid data of the text has been converted by feature extraction into vector information of uniform length; full connection and softmax classification are added on this basis, and the classification result is one of the six emotions.
Let the true value of the $j$-th sample in the $l$-th emotion dimension be $y_l^j$: if the true label is $l$, then $y_l^j = 1$; otherwise $y_l^j = 0$. A cross-entropy loss function is adopted and determined from the six emotion probabilities of the network social event text by the formula
$$L = -\sum_{j}\sum_{l=1}^{6} y_l^j \log p_l^j$$
where $L$ is the loss function and $y_l^j$ is the true value of the $j$-th sample in the network social event text on the $l$-th emotion dimension. The parameters of the CNN-LSTM model are optimized with the goal of minimizing the loss function, yielding the trained social emotion calculation model.
Module two of the CNN-LSTM model further processes the data encoded by module one, establishes semantic relationships between characters, and maps the calculation result onto the six emotions. The details of module one and module two are shown in fig. 6.
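Putting modules one and two together, a hedged end-to-end sketch of the CNN-LSTM model might read as follows; the word-vector, hidden and output dimensions (100, 128, 6) follow the experiment section, while the filter count and dropout rate are assumptions:

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """Sketch: embedding -> width-5 CNN -> ReLU -> two LSTM layers ->
    dropout -> masked mean pooling -> fully connected -> six logits."""

    def __init__(self, vocab_size, embed_dim=100, hidden=128, n_emotions=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, hidden, kernel_size=5, padding=2)
        self.relu = nn.ReLU()
        self.lstm = nn.LSTM(hidden, hidden, num_layers=2, batch_first=True)
        self.dropout = nn.Dropout(0.5)          # rate is an assumption
        self.fc = nn.Linear(hidden, n_emotions)

    def forward(self, token_ids, delta):
        v = self.embed(token_ids)                            # (B, N, E)
        c = self.conv(v.transpose(1, 2)).transpose(1, 2)     # (B, N, H)
        r = self.relu(c)
        h, _ = self.lstm(r)                                  # two stacked LSTMs
        d = self.dropout(h)
        mask = delta.unsqueeze(-1)                           # (B, N, 1)
        m = (d * mask).sum(1) / mask.sum(1).clamp(min=1.0)   # masked mean pool
        return self.fc(m)                                    # logits; softmax in loss

model = CNNLSTM(vocab_size=5000)
ids = torch.randint(0, 5000, (2, 40))   # two samples padded to N = 40
delta = torch.ones(2, 40)
print(model(ids, delta).shape)          # torch.Size([2, 6])
```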
The specific training process of the trained text emotion calculation model is as follows:
acquiring initial characteristics of a network social event text to be trained; the initial characteristics of the network social event text to be trained comprise initial character characteristics of the network social event text, initial word characteristics of the network social event text and initial implicit characteristics of the network social event text.
The initial features of the network social event text to be trained are input into the word embedding vector layer of the CNN-LSTM-STACK model to obtain the network social event text initial features in dense word-embedding form. The specific formula is as follows:
$$v_i^j = \mathrm{Embedding}(w_i^j)$$
where $w_i^j$ is the one-hot vector representing the $i$-th initial feature of the $j$-th sample in the network social event text to be trained, and $v_i^j$ is the word vector, namely the network social event text initial feature in dense word-embedding form.
The word vectors in each sliding window are determined according to the sliding window and the network social event text initial features in dense word-embedding form.
The word vectors in the sliding window are input into the CNN convolution layer of the CNN-LSTM-STACK model to determine the text feature vectors. The specific formula is as follows:
$$c_i^j = \mathrm{Conv}([v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}])$$
where $c_i^j$ is the text feature vector obtained after convolution-layer processing and $[v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}]$ is the word vectors in the $i$-th sliding window.
The text feature vector is input into the ReLU activation layer of the CNN-LSTM-STACK model to obtain the output result of the ReLU activation layer. The specific formula is as follows:
$$r_i^j = \mathrm{ReLU}(c_i^j)$$
where $r_i^j$ is the output result of the ReLU activation layer.
The output result of the ReLU activation layer is input into the LSTM layers of the CNN-LSTM-STACK model to obtain the output result of the LSTM layers. The specific formulas are as follows:
$$h_i^{j,1} = \mathrm{LSTM}_1(r_i^j)$$
$$h_i^{j,2} = \mathrm{LSTM}_2(h_i^{j,1})$$
where $h_i^{j,1}$ is the output result of the first LSTM layer and $h_i^{j,2}$ is the output result of the second LSTM layer.
A dropout operation is performed on the output result of the LSTM layers to obtain the output result of the dropout operation. The specific formula is as follows:
$$d_i^j = \mathrm{Dropout}(h_i^{j,2})$$
where $d_i^j$ is the output result of the dropout operation.
The network social event text initial features in dense word-embedding form are input into the fully connected layer of the original-feature attention mechanism of the CNN-LSTM-STACK model to obtain the output result of that fully connected layer; sigmoid activation is then applied to it to determine the output result of the original-feature attention mechanism. The specific formulas are as follows:
$$u_i^j = W_a^T v_i^j + b_a$$
$$a_i^j = \mathrm{sigmoid}(u_i^j)$$
where $u_i^j$ is the output result of the fully connected layer of the original-feature attention mechanism, $W_a$ and $b_a$ are its weight parameters and bias, and $a_i^j$ is the output of the original-feature attention mechanism.
Mean pooling is performed jointly on the output result of the dropout operation and the output result of the original-feature attention mechanism to determine the valid data. The specific formula is as follows:
$$m^j = \frac{1}{\sum_{i=1}^{N}\delta_i}\sum_{i=1}^{N}\delta_i\,\big(d_i^j \odot a_i^j\big)$$
where $m^j$ is the valid data, $\odot$ denotes the element-wise combination of the dropout output with the attention output, $\delta_i$ is the validity parameter determined by whether the data in the current sliding window is valid, and $N$ is the length of the network social event text after padding with the default value. Module three is intended to differentially emphasize the effect of the input word vectors on the emotion calculation result; the details of module three are shown in fig. 7.
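A sketch of module three under this reading: a fully connected layer plus sigmoid applied directly to the word vectors, whose output gates the module-two features before pooling. The gating (element-wise product) is an assumption; the text states only that the dropout output and the attention output are pooled together:

```python
import torch
import torch.nn as nn

class OriginalFeatureAttention(nn.Module):
    """Fully connected layer + sigmoid over the raw word embeddings,
    bypassing the CNN/LSTM stack so gradients reach the embedding
    layer on a short path (the stated motivation for module three)."""

    def __init__(self, embed_dim=100, hidden=128):
        super().__init__()
        self.fc = nn.Linear(embed_dim, hidden)

    def forward(self, v):                  # v: (B, N, embed_dim) word vectors
        return torch.sigmoid(self.fc(v))   # (B, N, hidden), values in (0, 1)

attn = OriginalFeatureAttention()
v = torch.randn(2, 40, 100)                # word-embedding output
a = attn(v)                                # attention output a_i^j
d = torch.randn(2, 40, 128)                # dropout output d_i^j from module two
gated = d * a                              # assumed combination before mean pooling
print(gated.shape)                         # torch.Size([2, 40, 128])
```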
The valid data is input into the fully connected layer of the CNN-LSTM-STACK model to obtain the output result of the fully connected layer, and softmax classification is performed on that output to determine the six emotion probabilities of the network social event text. The specific formulas are as follows:
$$z^j = W^T m^j + b$$
$$p_l^j = \frac{\exp(z_l^j)}{\sum_{k=1}^{6}\exp(z_k^j)}$$
where $z^j$ is the output result of the fully connected layer, $W^T$ is the transpose of the weight parameters in the fully connected layer, $b$ is the bias in the fully connected layer, $z_l^j$ is the value of the $j$-th sample in the $l$-th emotion dimension, and $p_l^j$ is the predicted probability that the $j$-th sample in the network social event text expresses the $l$-th emotion. The full connection transforms the data into six dimensions corresponding to the six emotions, with $W$ and $b$ the weight parameters and bias of the full connection. The softmax operation establishes mutual exclusivity among the six dimensions and limits the value range of the emotion intensity to $[0, 1]$.
A loss function is determined from the six emotion probabilities of the network social event text by the formula
$$L = -\sum_{j}\sum_{l=1}^{6} y_l^j \log p_l^j$$
where $L$ is the loss function and $y_l^j$ is the true value of the $j$-th sample in the network social event text on the $l$-th emotion dimension. The parameters of the CNN-LSTM-STACK model are optimized with the goal of minimizing the loss function, yielding the trained text emotion calculation model.
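A minimal optimization step for the cross-entropy objective, kept self-contained with a linear stand-in for either emotion calculation model; the Adam optimizer and learning rate are assumptions, since the text only says the loss is minimized:

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 6)                  # stand-in for CNN-LSTM(-STACK)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()          # softmax + cross-entropy loss L

pooled = torch.randn(8, 128)               # toy batch of pooled valid data m^j
labels = torch.randint(0, 6, (8,))         # true emotion index per sample

logits = model(pooled)                     # z^j
loss = criterion(logits, labels)           # L = -sum_j sum_l y_l^j log p_l^j
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(float(loss))
```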
The invention provides the following specific implementation of the public opinion calculation and deduction method for social event web text.
1. Acquire the social event network texts and the emotion label corresponding to each text.
2. Preprocess the acquired data into the different expression modes.
3. Construct and train the CNN-LSTM model.
4. Load the trained social emotion calculation model.
5. Organize the data in the test set in batches and input them into the social emotion calculation model; the padded network social event text is input during testing, and the representation of the test data is kept consistent with that of the training data of the loaded model.
6. Obtain the network social event emotion calculation result; the result is the emotion corresponding to the maximum probability output by softmax.
7. Construct and train the CNN-LSTM-STACK model. As shown in fig. 4, module one and module two together form the CNN-LSTM model; as the model becomes deeper, the vanishing-gradient phenomenon becomes obvious during the back propagation of parameter optimization. To solve this problem, an original-feature attention mechanism is added on the basis of these two stages of information processing: a fully connected layer followed by a sigmoid activation layer, connecting the word embedding vector layer to the mean pooling layer. The CNN-LSTM-STACK model is trained on the same training set as the CNN-LSTM model. It comprises three modules: module one and module two are the same as in the CNN-LSTM model, and module three is the original-feature attention mechanism shown in fig. 7, which uses a fully connected layer with sigmoid activation to connect the word embedding vector layer and the mean pooling layer.
8. Predict the emotion of the test data with the CNN-LSTM-STACK model according to steps 4, 5 and 6.
9. According to steps 3-8, perform emotion calculation on the word-feature expression of the data with the CNN-LSTM model and the CNN-LSTM-STACK model, respectively.
10. According to steps 3-8, perform emotion calculation on the implicit-feature expression of the data with the CNN-LSTM model and the CNN-LSTM-STACK model, respectively.
11. For one data set, this yields a total of six groups of results: three features each predicted by the two models.
12. Mine the emotion deduction relationships of the network social events.
To summarize the misjudgment rules and mine the intrinsic relevance of emotional expression, a voting mechanism is established over the multiple models and multiple features, so as to obtain the common part of the several groups of results and increase the reliability of the analysis. The analysis process of the voting mechanism is shown in fig. 5 and can be applied to both objective data and subjective data. Determining the emotional orientation of the social event network text by the voting mechanism according to its six emotion probabilities comprises the following steps:
Acquire the six emotion probabilities of the social event network text.
Acquire the number of those six emotion probabilities that exceed the valid misjudgment threshold.
Determine the emotional orientation of the social event network text by threshold comparison according to that number.
Given data $D_i$, initial feature $F_j$ and model $M_k$, where the models $M_k$ include the text emotion calculation model and the social emotion calculation model, the rate at which emotion $e_b$ is misjudged as emotion $e_a$ is denoted $C(e_a|e_b, D_i, F_j, M_k)$: after data $D_i$ is represented with initial feature $F_j$ and predicted with model $M_k$, it is the ratio of the number of sentences whose label is $e_b$ but whose model prediction is $e_a$ to the total number of sentences. The binary incidence value $C_r \in \{0, 1\}$ is computed as:
$$C_r(e_a|e_b, D_i, F_j, M_k) = \begin{cases} 1, & C(e_a|e_b, D_i, F_j, M_k) > \theta_1 \\ 0, & \text{otherwise} \end{cases}$$
where $\theta_1$ is the valid misjudgment threshold and $C_r$ forms the emotion incidence matrix, whose elements take the value 0 or 1. Strictly, a misjudgment result is voted valid only if all model and feature combinations vote for it, recorded as:
$$T(e_a|e_b) = \mathrm{count}_{i,j,k}\big(C_r(e_a|e_b, D_i, F_j, M_k) > \theta_2\big)$$
where $T(e_a|e_b)$ is the number of the six groups of results that support emotion $e_b$ being judged as emotion $e_a$.
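A sketch of the voting computation, with the six (data, feature, model) groups stacked along the first axis; the threshold values are placeholders, since the text leaves $\theta_1$ and $\theta_2$ open:

```python
import numpy as np

def vote_misjudgements(C, theta1=0.3, theta2=0.5):
    """C[g, b, a]: fraction of sentences labeled emotion b that group g
    (one data/feature/model combination) predicted as emotion a."""
    Cr = (C > theta1).astype(int)     # emotion incidence matrices C_r, 0 or 1
    T = (Cr > theta2).sum(axis=0)     # T[b, a] = count over (i, j, k)
    accepted = T == C.shape[0]        # valid only if every group votes for it
    return T, accepted

C = np.random.rand(6, 6, 6) * 0.4     # toy ratios: 6 groups x 6 x 6 emotions
T, accepted = vote_misjudgements(C)
print(T.shape, int(accepted.sum()))
```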
The input text data has three expression modes: the network social event text character features, the network social event text word features, and the network social event text implicit features. The models process each of the three expression modes separately. Character features and word features are general methods of text processing, while the implicit feature expression is a feature representation designed for social events, intended to reduce the importance of irrelevant information and thus improve the effective processing of relevant information.
The implicit expression of network social event text is built on top of the word features using two methods, one dictionary-based and one part-of-speech-based. Its semantic density lies between that of the character features and the word features, and the vocabulary length of the corresponding corpus also lies between the two.
Corpus implicit expression based on synonym dictionary
The dictionary used in the invention is the Harbin Institute of Technology synonym forest (Tongyici Cilin); vocabulary that appears more than once in the forest, i.e. words with several senses, is not implicitly expressed. The specific working process is as follows:
Step 1: load the synonym forest. The forest is stored as a dictionary tree (trie); compared with a plain dictionary, this structure retrieves faster and occupies less storage space, improving the efficiency of event information processing.
Step 2: implicitly express the corpus based on the synonym dictionary. Traverse the vocabulary in the corpus and query each word in the dictionary tree. If a word cannot be found, keep its original expression; if it can be found, replace the original expression with its code in the dictionary tree. For example, after two synonymous sentences such as 'Student Wang likes summer' and 'Student Wang loves summer' are expressed with the dictionary codes, both become 'Student Wang #Gb09A01 summer'.
After the vocabulary is implicitly expressed, part of the synonyms receive a unified symbolic expression, which reduces the sparsity of the corpus and improves its semantic compactness. The implicit expression of words is used as one of the text features studied subsequently.
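A minimal sketch of the dictionary-tree lookup and replacement described in steps 1 and 2; the code 'Gb09A01' and the word entries are illustrative stand-ins, not entries of the actual synonym forest:

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.code = None                # synonym-forest code, set at word end

class SynonymTrie:
    def __init__(self, entries):
        self.root = TrieNode()
        for word, code in entries:
            node = self.root
            for ch in word:
                node = node.children.setdefault(ch, TrieNode())
            node.code = code

    def lookup(self, word):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return None             # not in the forest: keep the original word
        return node.code

trie = SynonymTrie([("likes", "#Gb09A01"), ("loves", "#Gb09A01")])
tokens = ["Student", "Wang", "likes", "summer"]
print([trie.lookup(t) or t for t in tokens])
# ['Student', 'Wang', '#Gb09A01', 'summer']
```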
Corpus implicit expression based on part-of-speech features
Entities in a corpus are generally highly dispersed, but most entities appear infrequently, so they are hard to train sufficiently during a specific training process, and some entity vectors perform poorly as a result. An entity may be implicitly expressed when its specific information is not of concern in the event currently being processed. For example, 'Tianjin has severe haze today' and 'Beijing has severe haze today' have high semantic similarity; it is the place names 'Tianjin' and 'Beijing' that cause the semantic dispersion. If the place name is not of concern and only the 'haze' information matters, expressing both uniformly as '#ns has severe haze today' reduces the semantic dispersion caused by entities. In addition, numbers are typical high-dispersion corpus content whose specific values cannot be exhausted, so numbers can also be uniformly and implicitly expressed when their specific content is irrelevant. Entities in Chinese text include several types, mainly orientation words (nd, e.g. left side, right side), person names (nr), organization names (nt), place names (nl, e.g. suburb), geographic names (ns, e.g. Beijing, Haidian District), time words (t), other proper nouns (nz, e.g. Nobel Prize), numbers, and so on. Through corpus implicit expression based on part-of-speech features, the dispersion of words in the corpus can be further reduced on the basis of entity implicit expression, improving the training quality of the word vectors. The working process is as follows:
Step 1: select the target parts of speech that need implicit expression according to the event text to be processed, adjusting adaptively to the actual situation. For example, in the event 'Wenchuan earthquake', the entity 'Wenchuan' is an important processing target and therefore must not be implicitly expressed; whether other entities need implicit expression depends on the situation.
Step 2: design a suitable implicit expression form, for example using the part of speech of the corresponding vocabulary: #nd, #nr, #nt, #nl, #ns, #t, #nz, and so on.
Step 3: traverse the vocabulary in the corpus and implicitly express the corpus content using the designed form, as sketched below.
After the implicit expression based on the synonym dictionary, the corpus is given a deeper implicit expression using parts of speech, further reducing its sparsity; for example, 'Student Wang #Gb09A01 summer' becomes '#nr #Gb09A01 summer'. The corpus is thus reformulated through two steps: implicit expression based on the synonym dictionary and implicit expression based on part-of-speech features. Compared with the original corpus, the implicitly expressed corpus has a smaller vocabulary and lower sparsity, and synonyms are preliminarily distinguished from other vocabulary, which helps improve the learning of the word vectors.
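A sketch of the part-of-speech replacement in steps 1 to 3; the tag set follows the tags quoted above (nr, ns, nt, ...), while the hand-tagged input and the protected-word mechanism are illustrative assumptions:

```python
TARGET_POS = {"nd", "nr", "nt", "nl", "ns", "t", "nz", "m"}  # 'm': numbers (assumed tag)

def implicit_expression(tagged_tokens, protected=frozenset()):
    """Replace (word, pos) pairs whose pos is targeted with '#pos',
    unless the word is protected (e.g. 'Wenchuan' in a Wenchuan
    earthquake corpus, which must stay explicit)."""
    return ["#" + pos if pos in TARGET_POS and word not in protected else word
            for word, pos in tagged_tokens]

sent = [("Beijing", "ns"), ("today", "t"), ("severe", "a"), ("haze", "n")]
print(implicit_expression(sent))                        # ['#ns', '#t', 'severe', 'haze']
print(implicit_expression(sent, protected={"Beijing"})) # ['Beijing', '#t', 'severe', 'haze']
```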
In this experiment, the window size of the model input data is set to 5, the dimension of the word vectors to 100, the hidden-layer dimension of the long short-term memory network to 128, and the output-layer dimension to 6. The effect of the proposed model is tested on three groups of data.
The data of the invention comes from Sina, mainly from the Sina social news channel. Each news item has three important components on the website: the main news content, the user voting distribution, and the user comments. The method acquires the news titles, the main news content and the news comments as three different data sets, serving as objective and subjective data respectively.
Emotion calculation model effect evaluation
The performance of the three different data sets on the models is observed separately, and the loss and accuracy trends of the training set are recorded over the training iterations. The cost function is the cross-entropy function. The precision, recall and F1 value of each of the six emotions on the test set are also recorded. Here gd_precision, zj_precision, gx_precision, ng_precision, xq_precision and fn_precision denote the precision of the touched (gd), shocked (zj), amused (gx), sad (ng), novel (xq) and angry (fn) dimensions respectively; gd_recall, zj_recall, gx_recall, ng_recall, xq_recall and fn_recall denote the corresponding recall rates; and gd_F1, zj_F1, gx_F1, ng_F1, xq_F1 and fn_F1 denote the corresponding F1 values, where, for each emotion dimension,
$$\mathrm{precision} = \frac{TP}{TP + FP}, \qquad \mathrm{recall} = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$
with $TP$, $FP$ and $FN$ the true positives, false positives and false negatives of that dimension.
In addition, the overall accuracy on the test set is also counted.
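The per-dimension metrics follow directly from the one-vs-rest counts; a sketch that computes them for arrays of predicted and true emotion indices:

```python
import numpy as np

def per_emotion_metrics(y_true, y_pred, n_emotions=6):
    """Precision, recall and F1 per emotion dimension (one-vs-rest)."""
    metrics = {}
    for e in range(n_emotions):
        tp = np.sum((y_pred == e) & (y_true == e))
        fp = np.sum((y_pred == e) & (y_true != e))
        fn = np.sum((y_pred != e) & (y_true == e))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        metrics[e] = (p, r, f1)
    return metrics

y_true = np.array([0, 1, 2, 2, 5, 3])
y_pred = np.array([0, 1, 2, 1, 5, 3])
print(per_emotion_metrics(y_true, y_pred)[2])   # metrics for emotion index 2
```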
Evaluation of the emotion calculation effect on the news comment data set
The precision of each dimension on the test set of the first group of data (the news comment data) is as follows. The feature and model combination with the highest overall accuracy is the word feature expression with the CNN-LSTM model, reaching 85.0%; next is the character feature expression with the CNN-LSTM model, at 84.4%. The overall accuracies of the remaining combinations are 82.9% for the word feature expression with the CNN-LSTM-STACK model, 82.2% for the implicit expression with the CNN-LSTM model, 81.5% for the character feature expression with the CNN-LSTM-STACK model, and 76.1% for the implicit expression with the CNN-LSTM-STACK model.
Evaluation of the emotion calculation effect on the news data set
The accuracy and loss on the training set of the second group of data (the news data set) are as follows. The accuracy of the two feature sets, word feature expression and implicit expression, stabilizes after a limited number of iterations. Combined with model CNN-LSTM-STACK, the two feature sets reach accuracies above 98.0%, with losses of 0.03 and 0.05 respectively; combined with model CNN-LSTM, their accuracies are approximately 79.0% and 85.0%, with losses of 0.68 and 0.49 respectively. This shows that the fitting effect of model CNN-LSTM-STACK on the news data set is better than that of model CNN-LSTM. The accuracy of character feature expression improves on both models and the loss decreases, but the results are unstable, meaning that character feature expression cannot effectively fit the training data on the basis of the current models; if character features are to be used, a network for intermediate feature processing needs to be added to improve the effect.
Evaluation of the emotion calculation effect on the news headline data set
On the training set of the third group of data (the headline data set), the accuracy of all six combinations of the three feature sets and the two models stabilizes after a limited number of iterations, and both models fit the three feature sets well. The combination of implicit expression and model CNN-LSTM has an accuracy of 94% and a loss of 0.16; the other five groups of results all have accuracies above 97% and losses below 0.1.
The combination with the highest overall accuracy is character feature expression with model CNN-LSTM-STACK, reaching 82.0%. Next are implicit expression with CNN-LSTM-STACK at 81.3%, word feature expression with CNN-LSTM-STACK at 80.5%, character feature expression with CNN-LSTM at 80.0%, word feature expression with CNN-LSTM at 79.6%, and implicit expression with CNN-LSTM at 77.8%. The precision, recall and F1 value of the moved dimension are higher across the model combinations, while the precision, recall and F1 values of the angry and novel dimensions are at a lower level. This phenomenon reflects, from the side, that news content avoids inducing negative emotions and strives to guide positive emotions, consistent with the results on the second group of data (the news data set).
Emotion misinterpretation analysis in social event web text information
The calculation results of the three groups of original features and the two models on the three data sets are analyzed, mainly by using confusion matrices to observe which emotions are difficult to distinguish and to analyze their regularities. The confusion matrices are summarized as follows, where the shade of color indicates the corresponding discrimination probability. For example, if the abscissa is angry (fn) and the ordinate is moved (gd), the value at the corresponding position is the proportion of data labeled moved (gd) that is misjudged as angry (fn).
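A minimal sketch of such a row-normalized confusion matrix, where entry [t][p] is the proportion of samples labeled t that were judged as p:

    import numpy as np

    EMOTIONS = ["gd", "zj", "gx", "ng", "xq", "fn"]

    def confusion_matrix(y_true, y_pred):
        idx = {e: k for k, e in enumerate(EMOTIONS)}
        m = np.zeros((len(EMOTIONS), len(EMOTIONS)))
        for t, p in zip(y_true, y_pred):
            m[idx[t], idx[p]] += 1
        rows = m.sum(axis=1, keepdims=True)
        return m / np.maximum(rows, 1)   # row t, column p: share of t judged as p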
The data adopted by the invention can be divided into subjective data and objective data. News data and news headline data are written by practitioners and mainly describe the content of events, so they are objective; news comment data are written by netizens to express their views and opinions on the objective data, and carry a large amount of subjective opinion. Comparing the three groups of data, the confusion degree of the emotion calculation results on the news data is greater than that on the news comment and news headline data; that is, the emotions of the news data are harder to distinguish than those of the news comment and news headline data sets. News comments, as subjective data, can be used to observe netizens' emotional tendencies toward the opinions expressed about an event; news headlines and news, as factual objective data, are mainly used to observe how objective description guides subjective emotion.
If the model provided by the invention is regarded as an observer of limited intelligence, the misjudgment rules of subjective data and the rules by which objective data guides subjective data can be observed. A reversal phenomenon appears in the three groups of data: in subjective comments, other people's comments are easily understood as anger, while in the two groups of objective descriptions, the emotion caused by news is easily mistaken for being moved. Two phenomena can be summarized: news content intends to guide non-angry emotions but unexpectedly causes anger; and netizens use negative (angry) words to express positive emotions when posting comments. Combining the two phenomena, netizens' angry emotion easily converts from the content and from other emotions, and as an event is pushed forward, anger easily becomes the final emotional tendency.
As shown in fig. 8, the public opinion calculating and deducing system of social event web text provided by the present invention comprises:
the text acquisition module 201 is configured to acquire a social event web text.
The preprocessing module 202 is configured to preprocess the social event web text to obtain network social event text character features, network social event text word features, and network social event text implicit features.
The predicting module 203 is configured to input the network social event text character features, the network social event text word features and the network social event text implicit features into a trained social emotion calculation model and a trained text emotion calculation model respectively for prediction, to obtain six emotion probabilities of the network social event text.
The emotion orientation determining module 204 is configured to determine the emotion orientation of the social event web text by using a voting mechanism method according to the six emotion probabilities of the social event web text.
The prediction module comprises a social emotion calculation model training module, and the social emotion calculation model training module specifically comprises:
the initial characteristic acquisition unit is used for acquiring initial characteristics of the network social event text to be trained; the initial characteristics of the network social event text to be trained comprise initial character characteristics of the network social event text, initial word characteristics of the network social event text and initial implicit characteristics of the network social event text.
The word embedding vector layer input unit is used for inputting the initial characteristics of the network social event text to be trained into a word embedding vector layer of the CNN-LSTM model to obtain the initial characteristics of the network social event text in dense word embedding form; the specific formula is as follows:

v_i^j = W_e · x_i^j

wherein x_i^j represents a one-hot vector denoting the ith initial feature of the jth sample in the network social event text to be trained, W_e represents the word embedding matrix, and v_i^j represents a word vector, namely the initial feature of the network social event text in dense word embedding form.
The word vector determining unit is used for determining the word vectors in the sliding window according to the sliding window and the initial characteristics of the network social event text in dense word embedding form.
A text feature vector determining unit, configured to input the word vectors in the sliding window into the CNN convolution layer of the CNN-LSTM model and determine a text feature vector; the specific formula is as follows:

u_i^j = CNN([v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}])

wherein u_i^j represents the text feature vector obtained after convolution layer processing, [v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}] represents the word vectors in the ith sliding window, and [·] represents vector stitching.
The ReLU activation layer input unit is used for inputting the text feature vector into a ReLU activation layer of the CNN-LSTM model to obtain an output result of the ReLU activation layer; the specific formula is as follows:

r_i^j = ReLU(u_i^j) = max(0, u_i^j)

wherein r_i^j represents the output result of the ReLU activation layer.
The LSTM layer input unit is used for inputting the output result of the ReLU activation layer into the LSTM layers of the CNN-LSTM model to obtain the output result of the LSTM layers; the specific formula is as follows:

h1_i^j = LSTM(r_i^j)
h2_i^j = LSTM(h1_i^j)

wherein h1_i^j represents the output result of the first LSTM layer and h2_i^j represents the output result of the second LSTM layer.
A dropout operation unit, configured to perform a dropout operation on the output result of the LSTM layers to obtain an output result of the dropout operation; the specific formula is as follows:

d_i^j = Dropout(h2_i^j)

wherein d_i^j represents the output result of the dropout operation.
The mean pooling operation unit is used for performing a mean pooling operation on the output result of the dropout operation and determining effective data; the specific formula is as follows:

s^j = (1/N) Σ_{i=1}^{N} δ_i · d_i^j

wherein s^j is the effective data, δ_i is the effective parameter whose value is determined according to whether the data in the current sliding window is effective, and N represents the length of the network social event text after being padded with a default value.
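A minimal sketch of this masked mean pooling step, with the validity indicator δ supplied as a 0/1 mask:

    import torch

    def masked_mean_pool(d, valid):
        """d: (B, N, H) dropout outputs; valid: (B, N), 1.0 for real data, 0.0 for padding."""
        weighted = d * valid.unsqueeze(-1)           # zero out padded positions
        return weighted.sum(dim=1) / valid.shape[1]  # divide by the padded length N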
A full connection layer input unit, configured to input the effective data into a full connection layer of the CNN-LSTM model to obtain an output result of the full connection layer, perform softmax classification on the output result of the full connection layer, and determine the six emotion probabilities of the network social event text; the specific formula is as follows:

o^j = W^T · s^j + b
ŷ_i^j = exp(o_i^j) / Σ_k exp(o_k^j)

wherein o^j represents the output result of the full connection layer, W^T denotes the transpose of the weight parameter in the full connection layer, b denotes the bias in the full connection layer, o_i^j represents the value of the jth sample in the ith emotion dimension, and ŷ_i^j is the probability that the jth sample in the network social event text is predicted as the ith emotion.
A loss function determining unit, configured to determine a loss function according to the six emotion probabilities of the network social event text by using the formula

L = −Σ_j Σ_{i=1}^{6} y_i^j · log(ŷ_i^j)

wherein L represents the loss function and y_i^j represents the real value of the jth sample in the network social event text in the ith emotion dimension.
And the parameter optimization unit is used for optimizing the parameters in the CNN-LSTM model by taking the minimized loss function as a target to obtain a trained social emotion calculation model.
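A minimal training-step sketch for this unit; the optimizer choice is an assumption, since the text only states that parameters are optimized to minimize the loss:

    # Cross-entropy between predicted emotion distribution and labels,
    # minimized by a standard optimizer (Adam here is an assumption).
    import torch
    import torch.nn as nn

    model = CNNLSTM()                    # sketch defined earlier
    opt = torch.optim.Adam(model.parameters())
    loss_fn = nn.CrossEntropyLoss()      # softmax + negative log-likelihood

    def train_step(ids, mask, labels):
        opt.zero_grad()
        logits = model(ids, mask)        # (B, 6)
        loss = loss_fn(logits, labels)   # L = -sum y log(softmax(o))
        loss.backward()
        opt.step()
        return loss.item()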
The prediction module comprises a text emotion calculation model training module, and the text emotion calculation model training module specifically comprises:
the initial characteristic acquisition unit is used for acquiring initial characteristics of the network social event text to be trained; the initial characteristics of the network social event text to be trained comprise initial character characteristics of the network social event text, initial word characteristics of the network social event text and initial implicit characteristics of the network social event text.
The word embedding vector layer input unit is used for inputting the initial features of the network social event text to be trained into a word embedding vector layer of a CNN-LSTM-STACK model to obtain the initial features of the network social event text in dense word embedding form; the specific formula is as follows:

v_i^j = W_e · x_i^j

wherein x_i^j represents a one-hot vector denoting the ith initial feature of the jth sample in the network social event text to be trained, W_e represents the word embedding matrix, and v_i^j represents a word vector, namely the initial feature of the network social event text in dense word embedding form.
The word vector determining unit is used for determining the word vectors in the sliding window according to the sliding window and the initial characteristics of the network social event text in dense word embedding form.
A text feature vector determining unit, configured to input the word vectors in the sliding window into the CNN convolution layer of the CNN-LSTM-STACK model and determine a text feature vector; the specific formula is as follows:

u_i^j = CNN([v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}])

wherein u_i^j represents the text feature vector obtained after convolution layer processing and [v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}] represents the word vectors in the ith sliding window.
The ReLU activation layer input unit is used for inputting the text feature vector into a ReLU activation layer of the CNN-LSTM-STACK model to obtain an output result of the ReLU activation layer; the specific formula is as follows:

r_i^j = ReLU(u_i^j) = max(0, u_i^j)

wherein r_i^j represents the output result of the ReLU activation layer.
The LSTM layer input unit is used for inputting the output result of the ReLU activation layer into the LSTM layers of the CNN-LSTM-STACK model to obtain the output result of the LSTM layers; the specific formula is as follows:

h1_i^j = LSTM(r_i^j)
h2_i^j = LSTM(h1_i^j)

wherein h1_i^j represents the output result of the first LSTM layer and h2_i^j represents the output result of the second LSTM layer.
A dropout operation unit, configured to perform a dropout operation on the output result of the LSTM layers to obtain an output result of the dropout operation; the specific formula is as follows:

d_i^j = Dropout(h2_i^j)

wherein d_i^j represents the output result of the dropout operation.
The attention mechanism input unit is used for inputting the initial characteristics of the network social event text in dense word embedding form into a full connection layer of the original feature attention mechanism of the CNN-LSTM-STACK model to obtain the output result of the full connection layer of the original feature attention mechanism, and performing sigmoid activation on the output result of the full connection layer of the original feature attention mechanism to determine the output result of the original feature attention mechanism; the specific formula is as follows:

q_i^j = W_a^T · v_i^j + b_a
α_i^j = sigmoid(q_i^j)

wherein q_i^j represents the output result of the original feature attention mechanism full connection layer, W_a^T and b_a denote the weight transpose and the bias of this full connection layer, and α_i^j represents the output of the original feature attention mechanism.
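A minimal sketch of this original-feature attention gate; the output dimensionality is an assumption, chosen so the weights can gate the LSTM outputs elementwise:

    import torch
    import torch.nn as nn

    class OriginalFeatureAttention(nn.Module):
        def __init__(self, emb_dim=100, hidden=128):
            super().__init__()
            self.fc = nn.Linear(emb_dim, hidden)  # q = W_a^T v + b_a

        def forward(self, v):
            # v: (B, N, emb_dim) word embeddings; returns weights in (0, 1)
            return torch.sigmoid(self.fc(v))      # α = sigmoid(q)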
The mean pooling operation unit is used for performing a mean pooling operation on the output result of the dropout operation and the output result of the original feature attention mechanism to determine effective data; the specific formula is as follows:

s^j = (1/N) Σ_{i=1}^{N} δ_i · (α_i^j ⊙ d_i^j)

wherein s^j is the effective data, ⊙ denotes the element-wise product, δ_i is the effective parameter whose value is determined according to whether the data in the current sliding window is effective, and N represents the length of the network social event text after being padded with a default value.
A full connection layer input unit, configured to input the effective data into a full connection layer of the CNN-LSTM-STACK model to obtain an output result of the full connection layer, perform softmax classification on the output result of the full connection layer, and determine the six emotion probabilities of the network social event text; the specific formula is as follows:

o^j = W^T · s^j + b
ŷ_i^j = exp(o_i^j) / Σ_k exp(o_k^j)

wherein o^j represents the output result of the full connection layer, W^T denotes the transpose of the weight parameter in the full connection layer, b denotes the bias in the full connection layer, o_i^j represents the value of the jth sample in the ith emotion dimension, and ŷ_i^j is the probability that the jth sample in the network social event text is predicted as the ith emotion.
A loss function determining unit, configured to determine a loss function according to the six emotion probabilities of the network social event text by using the formula

L = −Σ_j Σ_{i=1}^{6} y_i^j · log(ŷ_i^j)

wherein L represents the loss function and y_i^j represents the real value of the jth sample in the network social event text in the ith emotion dimension.
And the parameter optimization unit is used for optimizing the parameters in the CNN-LSTM-STACK model by taking the minimized loss function as a target to obtain a trained text emotion calculation model.
The emotion orientation determining module 204 specifically includes:
and the emotion probability acquisition unit is used for acquiring six kinds of emotion probabilities of the social event network text.
And the number acquisition unit is used for acquiring the number of the six kinds of emotion probabilities of the social event network text which are greater than the effective misjudgment threshold value.
And the emotion orientation unit determines the emotion orientation of the social event network text by adopting a threshold comparison method according to the number.
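A minimal sketch of this voting-and-threshold decision; the concrete threshold values and the tie-breaking rule are assumptions, since the text states only that counts above the effective misjudgment threshold are compared with a threshold:

    def emotion_orientation(probs, emotions=("gd", "zj", "gx", "ng", "xq", "fn"),
                            vote_threshold=0.5, min_votes=1):
        """probs: six emotion probabilities in the order of `emotions`."""
        votes = [e for e, p in zip(emotions, probs) if p > vote_threshold]
        if len(votes) >= min_votes:
            # orient toward the voted emotion with the highest probability
            return max(votes, key=lambda e: probs[emotions.index(e)])
        return None  # no emotion exceeds the effective misjudgment threshold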
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and core concept of the present invention. Meanwhile, a person skilled in the art may, according to the idea of the present invention, make changes to the specific embodiments and the scope of application. In view of the foregoing, the description is not to be taken in a limiting sense.

Claims (6)

1. A public opinion calculating and deducing method of social event network texts is characterized by comprising the following steps:
acquiring a social event network text;
preprocessing the social event network text to obtain network social event text character features, network social event text word features and network social event text implicit features;
respectively inputting the network social event text character features, the network social event text word features and the network social event text implicit features into a trained social emotion calculation model and a trained text emotion calculation model for prediction to obtain six emotion probabilities of the social event network text;
determining the emotional orientation of the social event network text by adopting a voting mechanism method according to the six emotional probabilities of the social event network text;
the specific training process of the trained social emotion calculation model comprises the following steps:
acquiring initial characteristics of a network social event text to be trained; the initial characteristics of the network social event text to be trained comprise initial character characteristics of the network social event text, initial word characteristics of the network social event text and initial implicit characteristics of the network social event text;
inputting the initial characteristics of the network social event text to be trained into a word embedding vector layer of a CNN-LSTM model to obtain the initial characteristics of the network social event text in dense word embedding form; the specific formula is as follows:

v_i^j = W_e · x_i^j

wherein x_i^j represents a one-hot vector denoting the ith initial feature of the jth sample in the network social event text to be trained, W_e represents the word embedding matrix, and v_i^j represents a word vector, namely the initial feature of the network social event text in dense word embedding form;
determining word vectors in the sliding window according to the sliding window and the initial characteristics of the network social event text in dense word embedding form;
inputting the word vectors in the sliding window into a CNN convolution layer of the CNN-LSTM model to determine a text feature vector; the specific formula is as follows:

u_i^j = CNN([v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}])

wherein u_i^j represents the text feature vector obtained after convolution layer processing, [v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}] represents the word vectors in the ith sliding window, and [·] represents vector stitching;
inputting the text feature vector into a ReLU activation layer of the CNN-LSTM model to obtain an output result of the ReLU activation layer; the specific formula is as follows:

r_i^j = ReLU(u_i^j) = max(0, u_i^j)

wherein r_i^j represents the output result of the ReLU activation layer;
inputting the output result of the ReLU activation layer into the LSTM layers of the CNN-LSTM model to obtain the output result of the LSTM layers; the specific formula is as follows:

h1_i^j = LSTM(r_i^j)
h2_i^j = LSTM(h1_i^j)

wherein h1_i^j represents the output result of the first LSTM layer and h2_i^j represents the output result of the second LSTM layer;
performing a dropout operation on the output result of the LSTM layers to obtain an output result of the dropout operation; the specific formula is as follows:

d_i^j = Dropout(h2_i^j)

wherein d_i^j represents the output result of the dropout operation;
performing a mean pooling operation on the output result of the dropout operation to determine effective data; the specific formula is as follows:

s^j = (1/N) Σ_{i=1}^{N} δ_i · d_i^j

wherein s^j is the effective data, δ_i is the effective parameter whose value is determined according to whether the data in the current sliding window is effective, and N represents the length of the network social event text after being padded with a default value;
inputting the effective data into a full connection layer of the CNN-LSTM model to obtain an output result of the full connection layer, and performing softmax classification on the output result of the full connection layer to determine the six emotion probabilities of the network social event text; the specific formula is as follows:

o^j = W^T · s^j + b
ŷ_i^j = exp(o_i^j) / Σ_k exp(o_k^j)

wherein o^j represents the output result of the full connection layer, W^T denotes the transpose of the weight parameter in the full connection layer, b denotes the bias in the full connection layer, o_i^j represents the value of the jth sample in the ith emotion dimension, and ŷ_i^j is the probability that the jth sample in the network social event text is predicted as the ith emotion;
determining a loss function according to the six emotion probabilities of the network social event text by using the formula

L = −Σ_j Σ_{i=1}^{6} y_i^j · log(ŷ_i^j)

wherein L represents the loss function and y_i^j represents the real value of the jth sample in the network social event text in the ith emotion dimension;
and optimizing parameters in the CNN-LSTM model by taking the minimized loss function as a target to obtain a trained social emotion calculation model.
2. The method for public opinion calculation and deduction of social event web text according to claim 1, wherein the training process of the trained text emotion calculation model comprises:
acquiring initial characteristics of a network social event text to be trained; the initial characteristics of the network social event text to be trained comprise initial character characteristics of the network social event text, initial word characteristics of the network social event text and initial implicit characteristics of the network social event text;
inputting the initial characteristics of the network social event text to be trained into a word embedding vector layer of a CNN-LSTM-STACK model to obtain the initial characteristics of the network social event text in dense word embedding form; the specific formula is as follows:

v_i^j = W_e · x_i^j

wherein x_i^j represents a one-hot vector denoting the ith initial feature of the jth sample in the network social event text to be trained, W_e represents the word embedding matrix, and v_i^j represents a word vector, namely the initial feature of the network social event text in dense word embedding form;
determining word vectors in the sliding window according to the sliding window and the initial characteristics of the network social event text in dense word embedding form;
inputting the word vectors in the sliding window into a CNN convolution layer of the CNN-LSTM-STACK model to determine a text feature vector; the specific formula is as follows:

u_i^j = CNN([v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}])

wherein u_i^j represents the text feature vector obtained after convolution layer processing and [v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}] represents the word vectors in the ith sliding window;
inputting the text feature vector into a ReLU activation layer of the CNN-LSTM-STACK model to obtain an output result of the ReLU activation layer; the specific formula is as follows:

r_i^j = ReLU(u_i^j) = max(0, u_i^j)

wherein r_i^j represents the output result of the ReLU activation layer;
inputting the output result of the ReLU activation layer into the LSTM layers of the CNN-LSTM-STACK model to obtain the output result of the LSTM layers; the specific formula is as follows:

h1_i^j = LSTM(r_i^j)
h2_i^j = LSTM(h1_i^j)

wherein h1_i^j represents the output result of the first LSTM layer and h2_i^j represents the output result of the second LSTM layer;
performing a dropout operation on the output result of the LSTM layers to obtain an output result of the dropout operation; the specific formula is as follows:

d_i^j = Dropout(h2_i^j)

wherein d_i^j represents the output result of the dropout operation;
inputting the initial characteristics of the network social event text in dense word embedding form into a full connection layer of an original feature attention mechanism of the CNN-LSTM-STACK model to obtain an output result of the full connection layer of the original feature attention mechanism, and performing sigmoid activation on the output result of the full connection layer of the original feature attention mechanism to determine the output result of the original feature attention mechanism; the specific formula is as follows:

q_i^j = W_a^T · v_i^j + b_a
α_i^j = sigmoid(q_i^j)

wherein q_i^j represents the output result of the original feature attention mechanism full connection layer, W_a^T and b_a denote the weight transpose and the bias of this full connection layer, and α_i^j represents the output of the original feature attention mechanism;
performing a mean pooling operation on the output result of the dropout operation and the output result of the original feature attention mechanism to determine effective data; the specific formula is as follows:

s^j = (1/N) Σ_{i=1}^{N} δ_i · (α_i^j ⊙ d_i^j)

wherein s^j is the effective data, ⊙ denotes the element-wise product, δ_i is the effective parameter whose value is determined according to whether the data in the current sliding window is effective, and N represents the length of the network social event text after being padded with a default value;
inputting the effective data into a full connection layer of the CNN-LSTM-STACK model to obtain an output result of the full connection layer, performing softmax classification on the output result of the full connection layer, and determining the six emotion probabilities of the network social event text; the specific formula is as follows:

o^j = W^T · s^j + b
ŷ_i^j = exp(o_i^j) / Σ_k exp(o_k^j)

wherein o^j represents the output result of the full connection layer, W^T denotes the transpose of the weight parameter in the full connection layer, b denotes the bias in the full connection layer, o_i^j represents the value of the jth sample in the ith emotion dimension, and ŷ_i^j is the probability that the jth sample in the network social event text is predicted as the ith emotion;
determining a loss function according to the six emotion probabilities of the network social event text by using the formula

L = −Σ_j Σ_{i=1}^{6} y_i^j · log(ŷ_i^j)

wherein L represents the loss function and y_i^j represents the real value of the jth sample in the network social event text in the ith emotion dimension;
and optimizing parameters in the CNN-LSTM-STACK model by taking the minimized loss function as a target to obtain a trained text emotion calculation model.
3. A method as claimed in claim 1, wherein the determining the emotional orientation of the social event web text by using a voting mechanism method according to six emotional probabilities of the social event web text specifically comprises:
acquiring six kinds of emotional probabilities of the social event network text;
acquiring the number of six kinds of emotional probabilities of the social event network text which are larger than an effective misjudgment threshold;
and determining the emotional orientation of the social event network text by adopting a threshold comparison method according to the number.
4. A public opinion calculating and deducing system of social event web text is characterized by comprising:
the text acquisition module is used for acquiring a social event network text;
the preprocessing module is used for preprocessing the social event network text to obtain network social event text character features, network social event text word features and network social event text implicit features;
the prediction module is used for inputting the network social event text character features, the network social event text word features and the network social event text implicit features into a trained social emotion calculation model and a trained text emotion calculation model respectively for prediction to obtain six emotion probabilities of the network social event text;
the emotion orientation determination module is used for determining the emotion orientation of the social event network text by adopting a voting mechanism method according to the six emotion probabilities of the social event network text;
the prediction module comprises a social emotion calculation model training module, and the social emotion calculation model training module specifically comprises:
the initial characteristic acquisition unit is used for acquiring initial characteristics of the network social event text to be trained; the initial characteristics of the network social event text to be trained comprise initial character characteristics of the network social event text, initial word characteristics of the network social event text and initial implicit characteristics of the network social event text;
the word embedding vector layer input unit is used for inputting the initial characteristics of the network social event text to be trained into a word embedding vector layer of the CNN-LSTM model to obtain the initial characteristics of the network social event text in dense word embedding form; the specific formula is as follows:

v_i^j = W_e · x_i^j

wherein x_i^j represents a one-hot vector denoting the ith initial feature of the jth sample in the network social event text to be trained, W_e represents the word embedding matrix, and v_i^j represents a word vector, namely the initial feature of the network social event text in dense word embedding form;
the word vector determining unit is used for determining word vectors in the sliding window according to the sliding window and the initial characteristics of the network social event text in dense word embedding form;
a text feature vector determining unit, configured to input the word vectors in the sliding window into the CNN convolution layer of the CNN-LSTM model and determine a text feature vector; the specific formula is as follows:

u_i^j = CNN([v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}])

wherein u_i^j represents the text feature vector obtained after convolution layer processing, [v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}] represents the word vectors in the ith sliding window, and [·] represents vector stitching;
the ReLU activation layer input unit is used for inputting the text feature vector into a ReLU activation layer of the CNN-LSTM model to obtain an output result of the ReLU activation layer; the specific formula is as follows:

r_i^j = ReLU(u_i^j) = max(0, u_i^j)

wherein r_i^j represents the output result of the ReLU activation layer;
the LSTM layer input unit is used for inputting the output result of the ReLU activation layer into the LSTM layers of the CNN-LSTM model to obtain the output result of the LSTM layers; the specific formula is as follows:

h1_i^j = LSTM(r_i^j)
h2_i^j = LSTM(h1_i^j)

wherein h1_i^j represents the output result of the first LSTM layer and h2_i^j represents the output result of the second LSTM layer;
a dropout operation unit, configured to perform a dropout operation on the output result of the LSTM layers to obtain an output result of the dropout operation; the specific formula is as follows:

d_i^j = Dropout(h2_i^j)

wherein d_i^j represents the output result of the dropout operation;
the mean pooling operation unit is used for performing a mean pooling operation on the output result of the dropout operation and determining effective data; the specific formula is as follows:

s^j = (1/N) Σ_{i=1}^{N} δ_i · d_i^j

wherein s^j is the effective data, δ_i is the effective parameter whose value is determined according to whether the data in the current sliding window is effective, and N represents the length of the network social event text after being padded with a default value;
a full connection layer input unit, configured to input the effective data into a full connection layer of the CNN-LSTM model to obtain an output result of the full connection layer, perform softmax classification on the output result of the full connection layer, and determine the six emotion probabilities of the network social event text; the specific formula is as follows:

o^j = W^T · s^j + b
ŷ_i^j = exp(o_i^j) / Σ_k exp(o_k^j)

wherein o^j represents the output result of the full connection layer, W^T denotes the transpose of the weight parameter in the full connection layer, b denotes the bias in the full connection layer, o_i^j represents the value of the jth sample in the ith emotion dimension, and ŷ_i^j is the probability that the jth sample in the network social event text is predicted as the ith emotion;
a loss function determining unit, configured to determine a loss function according to the six emotion probabilities of the network social event text by using the formula

L = −Σ_j Σ_{i=1}^{6} y_i^j · log(ŷ_i^j)

wherein L represents the loss function and y_i^j represents the real value of the jth sample in the network social event text in the ith emotion dimension;
and the parameter optimization unit is used for optimizing the parameters in the CNN-LSTM model by taking the minimized loss function as a target to obtain a trained social emotion calculation model.
5. A system for public opinion computation and deduction of web texts of social events as claimed in claim 4, wherein the prediction module comprises a text emotion computation model training module, the text emotion computation model training module specifically comprising:
the initial characteristic acquisition unit is used for acquiring initial characteristics of the network social event text to be trained; the initial characteristics of the network social event text to be trained comprise initial character characteristics of the network social event text, initial word characteristics of the network social event text and initial implicit characteristics of the network social event text;
the word embedding vector layer input unit is used for inputting the initial features of the network social event text to be trained into a word embedding vector layer of a CNN-LSTM-STACK model to obtain the initial features of the network social event text in dense word embedding form; the specific formula is as follows:

v_i^j = W_e · x_i^j

wherein x_i^j represents a one-hot vector denoting the ith initial feature of the jth sample in the network social event text to be trained, W_e represents the word embedding matrix, and v_i^j represents a word vector, namely the initial feature of the network social event text in dense word embedding form;
the word vector determining unit is used for determining word vectors in the sliding window according to the sliding window and the initial characteristics of the network social event text in dense word embedding form;
a text feature vector determining unit, configured to input the word vectors in the sliding window into the CNN convolution layer of the CNN-LSTM-STACK model and determine a text feature vector; the specific formula is as follows:

u_i^j = CNN([v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}])

wherein u_i^j represents the text feature vector obtained after convolution layer processing and [v_{i-2}, v_{i-1}, v_i, v_{i+1}, v_{i+2}] represents the word vectors in the ith sliding window;
the ReLU activation layer input unit is used for inputting the text feature vector into a ReLU activation layer of the CNN-LSTM-STACK model to obtain an output result of the ReLU activation layer; the specific formula is as follows:

r_i^j = ReLU(u_i^j) = max(0, u_i^j)

wherein r_i^j represents the output result of the ReLU activation layer;
the LSTM layer input unit is used for inputting the output result of the ReLU activation layer into the LSTM layers of the CNN-LSTM-STACK model to obtain the output result of the LSTM layers; the specific formula is as follows:

h1_i^j = LSTM(r_i^j)
h2_i^j = LSTM(h1_i^j)

wherein h1_i^j represents the output result of the first LSTM layer and h2_i^j represents the output result of the second LSTM layer;
a dropout operation unit, configured to perform a dropout operation on the output result of the LSTM layers to obtain an output result of the dropout operation; the specific formula is as follows:

d_i^j = Dropout(h2_i^j)

wherein d_i^j represents the output result of the dropout operation;
the attention mechanism input unit is used for inputting the initial characteristics of the network social event text in dense word embedding form into a full connection layer of the original feature attention mechanism of the CNN-LSTM-STACK model to obtain an output result of the full connection layer of the original feature attention mechanism, and performing sigmoid activation on the output result of the full connection layer of the original feature attention mechanism to determine the output result of the original feature attention mechanism; the specific formula is as follows:

q_i^j = W_a^T · v_i^j + b_a
α_i^j = sigmoid(q_i^j)

wherein q_i^j represents the output result of the original feature attention mechanism full connection layer, W_a^T and b_a denote the weight transpose and the bias of this full connection layer, and α_i^j represents the output of the original feature attention mechanism;
the mean pooling operation unit is used for performing a mean pooling operation on the output result of the dropout operation and the output result of the original feature attention mechanism to determine effective data; the specific formula is as follows:

s^j = (1/N) Σ_{i=1}^{N} δ_i · (α_i^j ⊙ d_i^j)

wherein s^j is the effective data, ⊙ denotes the element-wise product, δ_i is the effective parameter whose value is determined according to whether the data in the current sliding window is effective, and N represents the length of the network social event text after being padded with a default value;
a full connection layer input unit, configured to input the effective data into a full connection layer of the CNN-LSTM-STACK model to obtain an output result of the full connection layer, perform softmax classification on the output result of the full connection layer, and determine the six emotion probabilities of the network social event text; the specific formula is as follows:

o^j = W^T · s^j + b
ŷ_i^j = exp(o_i^j) / Σ_k exp(o_k^j)

wherein o^j represents the output result of the full connection layer, W^T denotes the transpose of the weight parameter in the full connection layer, b denotes the bias in the full connection layer, o_i^j represents the value of the jth sample in the ith emotion dimension, and ŷ_i^j is the probability that the jth sample in the network social event text is predicted as the ith emotion;
a loss function determining unit, configured to determine a loss function according to the six emotion probabilities of the network social event text by using the formula

L = −Σ_j Σ_{i=1}^{6} y_i^j · log(ŷ_i^j)

wherein L represents the loss function and y_i^j represents the real value of the jth sample in the network social event text in the ith emotion dimension;
and the parameter optimization unit is used for optimizing the parameters in the CNN-LSTM-STACK model by taking the minimized loss function as a target to obtain a trained text emotion calculation model.
6. A system for public opinion calculation and deduction of web text of social events according to claim 4, wherein the emotion orientation determination module further comprises:
the emotion probability acquisition unit is used for acquiring six emotion probabilities of the social event network text;
the number acquisition unit is used for acquiring the number of the six kinds of emotion probabilities of the social event network text which are greater than the effective misjudgment threshold;
and the emotion orientation unit determines the emotion orientation of the social event network text by adopting a threshold comparison method according to the number.
CN202010841830.2A 2020-08-20 2020-08-20 Public opinion calculation and deduction method and system for social event web text Active CN111984931B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010841830.2A CN111984931B (en) 2020-08-20 2020-08-20 Public opinion calculation and deduction method and system for social event web text

Publications (2)

Publication Number Publication Date
CN111984931A CN111984931A (en) 2020-11-24
CN111984931B true CN111984931B (en) 2022-06-03

Family

ID=73442847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010841830.2A Active CN111984931B (en) 2020-08-20 2020-08-20 Public opinion calculation and deduction method and system for social event web text

Country Status (1)

Country Link
CN (1) CN111984931B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113096640A (en) * 2021-03-08 2021-07-09 北京达佳互联信息技术有限公司 Voice synthesis method and device, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831184A (en) * 2012-08-01 2012-12-19 中国科学院自动化研究所 Method and system for predicating social emotions in accordance with word description on social event
CN104765733A (en) * 2014-01-02 2015-07-08 华为技术有限公司 Method and device for analyzing social network event
CN107315778A (en) * 2017-05-31 2017-11-03 温州市鹿城区中津先进科技研究院 A kind of natural language the analysis of public opinion method based on big data sentiment analysis
WO2018036239A1 (en) * 2016-08-24 2018-03-01 慧科讯业有限公司 Method, apparatus and system for monitoring internet media events based on industry knowledge mapping database
CN108363753A (en) * 2018-01-30 2018-08-03 南京邮电大学 Comment text sentiment classification model is trained and sensibility classification method, device and equipment
CN108733748A (en) * 2018-04-04 2018-11-02 浙江大学城市学院 A kind of cross-border product quality risk fuzzy prediction method based on comment on commodity public sentiment
CN108804417A (en) * 2018-05-21 2018-11-13 山东科技大学 A kind of documentation level sentiment analysis method based on specific area emotion word
CN109446404A (en) * 2018-08-30 2019-03-08 中国电子进出口有限公司 A kind of the feeling polarities analysis method and device of network public-opinion
CN109902177A (en) * 2019-02-28 2019-06-18 上海理工大学 Text emotion analysis method based on binary channels convolution Memory Neural Networks

Also Published As

Publication number Publication date
CN111984931A (en) 2020-11-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant