CN114691836A - Method, device, equipment and medium for analyzing emotion tendency of text

Publication number: CN114691836A
Authority: CN (China)
Prior art keywords: text, features, analyzed, sample, splicing
Legal status: Pending
Application number: CN202210433270.6A
Other languages: Chinese (zh)
Inventors: 金力, 李晓宇, 张泽群, 刘庆, 张林浩, 李树超
Current assignee: Aerospace Information Research Institute of CAS
Original assignee: Aerospace Information Research Institute of CAS
Application filed by Aerospace Information Research Institute of CAS; priority to CN202210433270.6A

Classifications

    • G06F16/3344 Information retrieval of unstructured textual data; query execution using natural language analysis
    • G06F16/35 Information retrieval of unstructured textual data; clustering; classification
    • G06F40/279 Handling natural language data; recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/30 Semantic analysis
    • G06N3/044 Neural networks; recurrent networks, e.g. Hopfield networks
    • G06N3/08 Neural networks; learning methods


Abstract

The disclosure provides a method for analyzing the emotion tendency of a text, comprising the following steps: respectively extracting semantic features, part-of-speech features and co-occurrence word features from a text to be analyzed, wherein the text to be analyzed comprises text content and comment content associated with the text content; splicing the semantic features, the part-of-speech features and the co-occurrence word features to obtain splicing features; preprocessing the splicing features to obtain the word sequence features of the text to be analyzed; aggregating the word sequence features to obtain a sentence vector of the text to be analyzed; and inputting the sentence vector into an emotion tendency analysis model and outputting an emotion tendency analysis result of the text to be analyzed. The disclosure also provides a device, equipment, storage medium and program product for analyzing the emotion tendency of a text.

Description

Method, device, equipment and medium for analyzing emotion tendency of text
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular to a method, an apparatus, a device, a medium, and a program product for emotion tendency analysis of text.
Background
With the rapid development of Internet technology, social networks such as WeChat and Twitter have flourished. Network users publish or disseminate vast amounts of information every day. Much of this information carries users' viewpoints and emotional tendencies; it is a valuable opinion resource that reflects people's different views and positions on all kinds of social phenomena and touches every aspect of life across many fields. Automatically analyzing and processing such emotional text with computer technology therefore has great application value in assisted prediction, precision marketing, decision making, and the like.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a method, apparatus, device, medium, and program product for emotion tendency analysis of text.
According to a first aspect of the present disclosure, there is provided a method for emotion tendency analysis of a text, comprising:
respectively extracting semantic features, part-of-speech features and co-occurrence word features from a text to be analyzed, wherein the text to be analyzed comprises text content and comment content associated with the text content;
splicing the semantic features, the part-of-speech features and the co-occurrence word features to obtain spliced features;
preprocessing the splicing characteristics to obtain word sequence characteristics in the text to be analyzed;
aggregating word sequence characteristics in the text to be analyzed to obtain sentence vectors of the text to be analyzed; and
inputting the sentence vector into an emotion tendency analysis model, and outputting an emotion tendency analysis result of the text to be analyzed.
According to the embodiment of the disclosure, preprocessing the splicing features to obtain word sequence features in a text to be analyzed includes:
performing linear activation conversion on the splicing characteristics to generate linearly activated characteristics;
carrying out nonlinear activation conversion on the splicing characteristics to generate nonlinear activated characteristics;
based on the splicing characteristics, obtaining a weight vector by using an attention mechanism;
and combining the linearly activated features and the non-linearly activated features according to the weight vector to obtain the word sequence features in the text to be analyzed.
According to an embodiment of the present disclosure, obtaining the weight vector using an attention mechanism based on the splicing features includes:
converting the splicing characteristics into query vectors, key vectors and value vectors according to different mapping matrixes;
obtaining splicing characteristics with different weight values by using an attention mechanism;
reducing the dimension of the splicing features with different weight values through a forward propagation layer to obtain the spliced features with different weight values after dimension reduction;
and generating a weight vector by the splicing characteristics with different weight values after dimension reduction through a nonlinear activation function.
According to the embodiment of the present disclosure, aggregating word sequence features in a text to be analyzed to obtain a sentence vector of the text to be analyzed includes:
inputting the word sequence characteristics in the text to be analyzed into a gate control circulation unit;
and outputting a sentence vector of the text to be analyzed.
According to the embodiment of the disclosure, the extracting semantic features, part-of-speech features and co-occurrence word features from the text to be analyzed respectively comprises the following steps:
inputting a text to be analyzed into a pre-training language representation model, and outputting semantic features;
embedding and representing the words that occur in both the text content and the comment content to obtain the co-occurrence word features;
and embedding and representing the part of speech of each word in the text to be analyzed to obtain part of speech characteristics.
According to the embodiment of the disclosure, the emotion tendency analysis model is obtained by pre-training; the pre-training method comprises the following steps:
respectively extracting sample semantic features, sample part-of-speech features and sample co-occurrence word features from training samples, wherein the training samples comprise: a text content sample, a comment content sample associated with the text content sample, and an emotion label;
splicing the sample semantic features, the sample part-of-speech features and the sample co-occurrence word features to obtain sample splicing features;
preprocessing the sample splicing characteristics to obtain sample word sequence characteristics in the training sample;
aggregating the sample word sequence characteristics in the training samples to obtain sample sentence vectors of the training samples;
inputting the sentence vectors of the samples into a classification model, and outputting emotion tendency classification results of the training samples;
and adjusting parameters of the classification model based on the emotion tendency classification result and the emotion label, and taking the trained classification model as an emotion tendency analysis model.
A second aspect of the present disclosure provides an emotion tendency analysis apparatus for text, including:
the feature extraction module is used for respectively extracting semantic features, part-of-speech features and co-occurrence word features from a text to be analyzed, wherein the text to be analyzed comprises text content and comment content associated with the text content;
the feature splicing module is used for splicing the semantic features, the part-of-speech features and the co-occurrence word features to obtain splicing features;
the preprocessing module is used for preprocessing the splicing characteristics to obtain word sequence characteristics in the text to be analyzed;
the aggregation module is used for aggregating the word sequence characteristics in the text to be analyzed to obtain sentence vectors of the text to be analyzed; and
and the analysis module is used for inputting the sentence vector into the emotion tendency analysis model and outputting the emotion tendency analysis result of the text to be analyzed.
A third aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method for emotion tendency analysis of text described above.
A fourth aspect of the present disclosure provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the method for emotion tendency analysis of text described above.
A fifth aspect of the present disclosure provides a computer program product comprising a computer program which, when executed by a processor, implements the method for emotion tendency analysis of text described above.
According to embodiments of the disclosure, the semantic features, co-occurrence word features and part-of-speech features of the text to be analyzed are extracted and spliced to obtain the splicing features; the splicing features are preprocessed so that they are enhanced, then aggregated into a sentence vector of the text to be analyzed, which is fed into the emotion tendency analysis model to obtain the emotion tendency analysis result. To fully mine the linguistic characteristics of the text to be analyzed, the co-occurrence word features and the part-of-speech features are fused in to assist the analysis, and aggregating the features into a sentence vector improves the accuracy of the emotion tendency analysis.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, which proceeds with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of the method, apparatus, device, medium, and program product for emotion tendency analysis of text according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flowchart of a method of emotion tendency analysis of text according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flowchart of a method for preprocessing the splicing features to obtain the word sequence features of a text to be analyzed according to an embodiment of the present disclosure;
FIG. 4 schematically shows a model architecture diagram for obtaining the sentence vector of the text to be analyzed in the emotion tendency analysis method according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a flowchart of the method by which the emotion tendency analysis model is obtained through pre-training according to an embodiment of the disclosure;
FIG. 6 schematically illustrates a block diagram of the structure of the emotion tendency analysis apparatus according to an embodiment of the present disclosure; and
FIG. 7 schematically shows a block diagram of an electronic device suitable for implementing the emotion tendency analysis method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, the construction is generally intended in the sense one having skill in the art would understand it (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
In the technical solution of the present disclosure, the acquisition, storage, application and the like of the personal information of the users involved all comply with the provisions of relevant laws and regulations, necessary security measures are taken, and public order and good customs are not violated.
In the related art, there are three main types of text emotion tendency analysis methods: methods based on sentiment dictionaries, on statistical machine learning, and on deep learning. The first two have wide application scenarios and are simple to use; they solve the sentiment analysis task in a preliminary way mainly by computing word frequencies and applying machine learning methods. A keyword dictionary is built in advance for the target scene, and the sentiment of a sentence is judged by aggregating the sentiments of its keywords. Constructing such a sentiment dictionary requires considerable manual labor.
Emotion tendency analysis methods based on deep learning mainly rely on CNN (convolutional neural network) and RNN (recurrent neural network) structures, which greatly improve the results. However, for long-distance dependencies the gradient-vanishing problem arises, and long-distance information cannot be captured when the sequence is long. The LSTM (long short-term memory network) was then introduced; it has a long-term memory function, is simple to implement, and has certain advantages in sequence modeling. In general, however, recurrent models factor computation along the symbol positions of the input and output sequences: aligning positions to computation time steps, they generate a sequence of hidden states, each based on the previous hidden state and the input at the current position. This inherently sequential nature prevents parallelization within training samples, which becomes especially critical for longer sequences, when memory is insufficient for larger batch training. Although computational efficiency has been improved through factorization tricks and conditional computation (the latter also improving model performance), the inherent sequential constraint remains unresolved.
In practicing the present disclosure, it has been found that the attention mechanism has become an integral part of sequence modeling and transduction models for various tasks. It can model dependencies between symbols without regard to their distance in the sequence. In most cases, however, such attention mechanisms are used together with an RNN (recurrent neural network). The Transformer architecture abandons the RNN structure and relies entirely on the attention mechanism to map the global dependencies between input and output. The Transformer model has better parallelism and has achieved great performance improvements on NLP (natural language processing) tasks.
Most NLP (natural language processing) related tasks now also adopt the Transformer structure. Using BERT as the upstream model and, by virtue of the generalization advantage of pre-training, combining it with a CNN (convolutional neural network) or LSTM (long short-term memory network) for downstream classification greatly improves the results. The reason is that multi-head attention better captures the relevance of words at different distances, and semantically related words are closer in the mapped vector space.
Based on this, the embodiment of the present disclosure provides an emotion tendentiousness analysis method, including: respectively extracting semantic features, part-of-speech features and co-occurrence word features from a text to be analyzed, wherein the text to be analyzed comprises text content and comment content associated with the text content; splicing the semantic features, the part-of-speech features and the co-occurrence word features to obtain spliced features; preprocessing the splicing characteristics to obtain word sequence characteristics in the text to be analyzed; aggregating word sequence characteristics in the text to be analyzed to obtain sentence vectors of the text to be analyzed; and inputting the sentence vector into an emotion tendency analysis model, and outputting an emotion tendency analysis result of the text to be analyzed.
Fig. 1 schematically illustrates an application scenario diagram of a method, an apparatus, a device, a medium, and a program product for emotion tendency analysis of a text according to an embodiment of the present disclosure.
As shown in fig. 1, the application scenario 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as a financial product type application, a shopping type application, a web browser application, a search type application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the emotion tendency analysis method for text provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the emotion tendency analysis device for text provided by the embodiment of the present disclosure can be generally disposed in the server 105. The emotion tendency analysis method for the text provided by the embodiment of the present disclosure may also be executed by a server or a server cluster which is different from the server 105 and can communicate with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the emotion tendency analysis device for text provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster which is different from the server 105 and can communicate with the terminal devices 101, 102, 103 and/or the server 105.
The emotion tendency analysis method for the text provided by the embodiment of the disclosure can also be executed by the terminal devices 101, 102 and 103. Accordingly, the emotion tendency analysis device for text provided by the embodiments of the present disclosure may also be generally disposed in the terminal devices 101, 102, and 103. The emotion tendency analysis method for the text provided by the embodiment of the present disclosure can also be executed by other terminals different from the terminal devices 101, 102, and 103. Accordingly, the emotion tendency analysis device for texts provided by the embodiment of the present disclosure may also be disposed in other terminals different from the terminal devices 101, 102, and 103.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The method for emotion tendency analysis of text according to the embodiments of the present disclosure will be described in detail below with reference to FIGS. 2 to 5, based on the scenario described in FIG. 1.
FIG. 2 schematically shows a flowchart of a method of emotion tendency analysis of text according to an embodiment of the present disclosure.
As shown in FIG. 2, the emotion tendency analysis method 200 of this embodiment includes operations S201 to S205.
In operation S201, semantic features, part-of-speech features, and co-occurrence word features are respectively extracted from a text to be analyzed, wherein the text to be analyzed includes text content and comment content associated with the text content.
According to an embodiment of the disclosure, the obtained text to be analyzed can be encoded in the BERT double-sentence (sentence-pair) mode to obtain the semantic features. The co-occurrence word features can be obtained by identifying, and then encoding, the words that appear in both the text content and the comment content associated with it. The part of speech of each word of the text to be analyzed can be tagged with the word segmentation tool jieba to construct the part-of-speech features.
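As a minimal sketch of this step (an illustration only, assuming the HuggingFace transformers and jieba packages; shapes follow the embodiment described later, i.e., sequence length 512 and BERT hidden size 768):

```python
# Hypothetical sketch: extracting semantic, co-occurrence and part-of-speech
# inputs with HuggingFace transformers and jieba; not the patent's exact code.
import jieba.posseg as pseg
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")

def extract_features(text: str, comment: str, max_len: int = 512):
    # BERT double-sentence (sentence-pair) mode: text_a = content, text_b = comment.
    enc = tokenizer(text, comment, max_length=max_len,
                    padding="max_length", truncation=True, return_tensors="pt")
    semantic = bert(**enc).last_hidden_state            # (1, 512, 768)

    # Co-occurrence indicator: 1 if the character also appears in the comment.
    chars = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    co_occur = torch.tensor([[1 if c in comment else 0 for c in chars]])  # (1, 512)

    # Part-of-speech tag per character, via jieba's POS tagger.
    tag_of_char = {}
    for word, flag in pseg.cut(text + comment):
        for ch in word:
            tag_of_char.setdefault(ch, flag)
    pos_tags = [tag_of_char.get(c, "x") for c in chars]  # tag strings, e.g. 'a', 'v'
    return semantic, co_occur, pos_tags
```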
In operation S202, the semantic features, the part-of-speech features, and the co-occurrence word features are spliced to obtain the splicing features.
In operation S203, the splicing feature is preprocessed to obtain a word sequence feature in the text to be analyzed.
According to an embodiment of the disclosure, a Highway Network based on a self-attention mechanism can be adopted to enhance and transfer the splicing features, obtaining the word sequence features of the text to be analyzed.
It should be noted that the Highway Network introduces a Transform Gate and a Carry Gate based on a gate mechanism, and its output is composed of a transformed input and a carried input. When generating the Transform Gate, the conventional method directly applies a linear transformation and a nonlinear activation to the input vector to obtain the output value of a learnable gating unit, so that the gating behavior lies between feature transformation and feature transfer. This traditional way of generating the transform gate gives each dimension of the input vector equal weight and does not consider that different dimensions of the feature vector carry different amounts of information.
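For reference, a conventional highway layer of the kind criticized here might be sketched as follows (a minimal illustration, not the disclosure's improved version, which replaces the sigmoid gate with the self-attention gate described below):

```python
import torch
import torch.nn as nn

class PlainHighway(nn.Module):
    """Conventional highway layer: y = T(x) * H(x) + C(x) * x, with C = 1 - T."""
    def __init__(self, dim: int):
        super().__init__()
        self.transform = nn.Linear(dim, dim)   # H: nonlinear transformation
        self.gate = nn.Linear(dim, dim)        # T: learnable transform gate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.tanh(self.transform(x))
        t = torch.sigmoid(self.gate(x))        # per-dimension gate in (0, 1)
        return t * h + (1.0 - t) * x           # carry gate C = 1 - T
```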
In operation S204, the word sequence features in the text to be analyzed are aggregated to obtain a sentence vector of the text to be analyzed.
According to an embodiment of the disclosure, the word sequence features of the text to be analyzed can be input into a Gated Recurrent Unit (GRU), which aggregates word information by capturing the dependencies among word vectors to obtain the sentence vector of the text to be analyzed.
In operation S205, the sentence vector is input into the emotion tendency analysis model, and an emotion tendency analysis result of the text to be analyzed is output.
According to an embodiment of the disclosure, the emotion tendency analysis model can be obtained through pre-training; after the sentence vector is input, the emotion tendency analysis result of the text to be analyzed is output. The emotion tendency analysis results may include positive, negative, neutral, and the like.
According to embodiments of the disclosure, the semantic features, co-occurrence word features and part-of-speech features of the text to be analyzed are extracted and spliced to obtain the splicing features; the splicing features are preprocessed so that they are enhanced, then aggregated into a sentence vector of the text to be analyzed, which is fed into the emotion tendency analysis model to obtain the emotion tendency analysis result. To fully mine the linguistic characteristics of the text to be analyzed, the co-occurrence word features and the part-of-speech features are fused in to assist the analysis, and aggregating the features into a sentence vector improves the accuracy of the emotion tendency analysis.
FIG. 3 schematically illustrates a flowchart of a method for preprocessing the splicing features to obtain the word sequence features of a text to be analyzed according to an embodiment of the present disclosure; FIG. 4 schematically shows a model architecture diagram for obtaining the sentence vector of the text to be analyzed in the emotion tendency analysis method according to an embodiment of the present disclosure.
As shown in FIG. 3, the method 300 of this embodiment for preprocessing the splicing features to obtain the word sequence features of the text to be analyzed includes operations S301 to S304.
In an embodiment of the present disclosure, as shown in FIG. 4, a splicing feature (concat feature) may be obtained after the semantic features (Bert_Out), the part-of-speech features (tag_feature) and the co-occurrence word features (word_match) of the text to be analyzed are spliced. A Highway Network based on the self-attention mechanism then completes the enhancement and transfer of the features, obtaining the word sequence features of the text to be analyzed.
For example, let the sentence length of the text to be analyzed be $n$, a positive integer. The sentence sequence can be dictionary-mapped into the vector $W$, as shown in formula (1):

$W = [w_1, w_2, \dots, w_n]$  (1)

The sentence sequence vector $W$ can be input into the BERT model for encoding to obtain the semantic features $F_w$, as shown in formula (2):

$F_w = \mathrm{BERT}(W)$  (2)

where the $i$-th word $w_i$ of the sentence sequence vector $W$ yields the semantic feature $f_i$, of dimension $d_h$, after BERT encoding; thus the semantic features $F_w$ may be represented as in formula (3):

$F_w = [f_1, f_2, \dots, f_n] \in \mathbb{R}^{n \times d_h}$  (3)

The semantic features $F_w$, part-of-speech features $F_t$ and co-occurrence word features $F_c$ of the text to be analyzed are spliced to obtain the splicing feature $F$, as shown in formula (4):

$F = [F_w : F_t : F_c] \in \mathbb{R}^{n \times (d_h + n_t + n_c)}$  (4)

where $[:]$ denotes concatenation along the feature dimension, $n_c$ denotes the length of the co-occurrence word codes after word embedding, and $n_t$ denotes the length of the part-of-speech tags after embedding.
In operation S301, a linear activation transformation is performed on the spliced features to generate linearly activated features.
According to an embodiment of the disclosure, for example, a linear activation conversion may be performed on the splicing feature $F$ to generate the linearly activated feature $F_L$, as shown in formula (5):

$F_L = W_L \cdot F + b_L$  (5)

where $W_L$ and $b_L$ respectively denote a trainable parameter matrix and a bias term.
In operation S302, a non-linear activation transformation is performed on the spliced features to generate non-linear activated features.
According to an embodiment of the disclosure, for example, a nonlinear activation conversion may be performed on the splicing feature $F$ to generate the nonlinearly activated feature $F_N$, as shown in formula (6):

$F_N = \sigma(W_N \cdot F + b_N)$  (6)

where $\sigma(\cdot)$ denotes the nonlinear activation function used, here the tanh function, and $W_N$ and $b_N$ both represent trainable parameters.
In operation S303, a weight vector is obtained using an attention mechanism based on the splicing features.
According to an embodiment of the present disclosure, obtaining the weight vector using an attention mechanism based on the splicing features includes:
converting the splicing characteristics into query vectors, key vectors and value vectors according to different mapping matrixes;
obtaining splicing characteristics with different weight values by using an attention mechanism;
reducing the dimension of the splicing features with different weight values through a forward propagation layer to obtain the spliced features with different weight values after dimension reduction;
and generating a weight vector by the splicing characteristics with different weight values after dimension reduction through a nonlinear activation function.
According to an embodiment of the present disclosure, as shown in FIG. 4, a multi-head self-attention mechanism may be adopted. The splicing features are taken as input, and different mapping matrices apply linear transformations to them to generate the Query, Key and Value vectors; the different dimensions of the input features are then aggregated with different weights through multi-head self-attention (Self Attention). After the features output by the multiple heads are spliced, dimension reduction is performed through a forward propagation layer (Feed Forward), and the gate weight vector (Gate Vector) is finally generated through the nonlinear activation function softmax. Two layers of residual connection and normalization (Add & Norm) are added to the overall multi-head attention mechanism to prevent overfitting. The output gate weight vector contains the Transform Gate values and the corresponding Carry Gate values and is used in the next step to aggregate the linearly activated features and the nonlinearly activated features, yielding the word sequence features (feature sequence) of the text to be analyzed; these are then passed through the gated recurrent unit (GRU), which aggregates word information by capturing the dependencies among word vectors to obtain the sentence vector (sentence representation) of the text to be analyzed.
For example, the inputs of the $i$-th attention head ($\mathrm{Head}_i$) are first obtained from the splicing feature $F$: the query vector $Q_i$, the key vector $K_i$ and the value vector $V_i$ of the $i$-th attention head, as shown in formulas (7) to (9):

$Q_i = W_q^i \cdot F + b_q^i$  (7)

$K_i = W_k^i \cdot F + b_k^i$  (8)

$V_i = W_v^i \cdot F + b_v^i$  (9)

where $W_q^i$, $W_k^i$, $W_v^i$, $b_q^i$, $b_k^i$ and $b_v^i$ all represent trainable parameters.

The similarity between the query vector $Q_i$ and the key vector $K_i$ is calculated, the similarity scores are normalized into weights through the softmax function, and the value vector $V_i$ is re-weighted according to these similarity weights to obtain the output $H_i$ of the $i$-th attention head ($\mathrm{Head}_i$), as shown in formula (10):

$H_i = \mathrm{softmax}\!\left(\dfrac{Q_i K_i^{\mathsf{T}}}{\sqrt{d_k}}\right) V_i$  (10)

where $\mathsf{T}$ denotes the transpose operation of the matrix and $d_k$ is the dimension of the key vectors.
The outputs $H_1, H_2, \dots, H_m$ of the $m$ attention heads are obtained by performing the above calculation $m$ times; they are spliced into a hidden representation whose dimension is reduced through the linear matrix $W_h$ to obtain the multi-head attention output $\tilde{H}$, with residual connection and regularization added between it and the splicing feature $F$, as shown in formula (11):

$\tilde{H} = \mathrm{LayerNorm}\!\left(F + [H_1 : H_2 : \dots : H_m] \cdot W_h\right)$  (11)

where $[:]$ denotes the splicing operation along the feature dimension.
The multi-head attention output $\tilde{H}$ is then nonlinearly mapped, residual connection and regularization are adopted again, and the result is finally fed into softmax for normalization to obtain the weight vector $G$, as shown in formula (12):

$G = \mathrm{softmax}\!\left(\mathrm{LayerNorm}\!\left(\tilde{H} + \sigma_2(W_g \cdot \tilde{H} + b_g)\right)\right)$  (12)

where $\sigma_2(\cdot)$ denotes the nonlinear mapping function used, here the ReLU function, and $W_g$ and $b_g$ represent trainable parameters.
According to an embodiment of the disclosure, the improved Highway Network gives the gate mechanism a stronger learning capacity and more reasonable behavior between feature conversion and feature transfer, so that the Highway Network has a stronger ability to capture key features.
In operation S304, the linearly activated features and the non-linearly activated features are combined according to the weight vector to obtain word sequence features in the text to be analyzed.
According to the embodiment of the disclosure, weight values can be distributed to the linearly activated features and the non-linearly activated features according to the weight vectors, and the weight values are combined to obtain word sequence features in the text to be analyzed.
For example, the nonlinearly activated feature $F_N$ and the linearly activated feature $F_L$ may be weighted according to the weight vector $G$ to obtain the word sequence features $F_s$ of the text to be analyzed, as shown in formula (13):

$F_s = G[0] \odot F_N + G[1] \odot F_L$  (13)

where $G[0]$ denotes the element of the $G$ vector with index 0, and $G[1]$ denotes the element with index 1.
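As an illustrative sketch only, and not the disclosure's exact implementation, operations S301 to S304 (formulas (5) to (13)) could be composed roughly as follows in PyTorch; the hidden size 785, the head count 5, and the exact construction of the gate head are assumptions:

```python
import torch
import torch.nn as nn

class AttentionGatedHighway(nn.Module):
    """Sketch of formulas (5)-(13): a highway layer whose transform/carry
    gates come from multi-head self-attention instead of a linear-sigmoid
    gate. Sizes (785 features, 5 heads) are illustrative assumptions."""
    def __init__(self, dim: int = 785, heads: int = 5):
        super().__init__()
        self.linear = nn.Linear(dim, dim)       # formula (5): F_L
        self.nonlinear = nn.Linear(dim, dim)    # formula (6): F_N, with tanh
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)  # (7)-(10)
        self.norm1 = nn.LayerNorm(dim)          # Add & Norm after attention (11)
        self.ffn = nn.Linear(dim, 2)            # Feed Forward dimension reduction
        self.norm2 = nn.LayerNorm(2)            # second normalization layer

    def forward(self, F: torch.Tensor) -> torch.Tensor:
        # F: (batch, seq_len, dim) splicing features
        f_l = self.linear(F)                    # linearly activated features
        f_n = torch.tanh(self.nonlinear(F))     # nonlinearly activated features
        h, _ = self.attn(F, F, F)               # multi-head self-attention
        h = self.norm1(F + h)                   # residual connection + norm
        g = torch.softmax(self.norm2(self.ffn(h).relu()), dim=-1)  # gate G (12)
        # formula (13): transform gate G[0] on F_N, carry gate G[1] on F_L
        return g[..., 0:1] * f_n + g[..., 1:2] * f_l
```

Producing a two-way softmax per position keeps the transform and carry gates summing to 1, mirroring the C = 1 - T relation of the conventional highway layer.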
According to the embodiment of the present disclosure, aggregating word sequence features in a text to be analyzed to obtain a sentence vector of the text to be analyzed includes:
inputting the word sequence characteristics in the text to be analyzed into a gate control circulation unit;
and outputting a sentence vector of the text to be analyzed.
For example, the word sequence features $F_s$ of the text to be analyzed can be input into the GRU for encoding to obtain the sentence vector $R_s$ of the text to be analyzed, as shown in formula (14):

$R_s = \mathrm{GRU}(F_s)$  (14)

where $R_s \in \mathbb{R}^{d_h + n_t + n_c}$ (matching the 1 × 785 sentence vector in the embodiment below).
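A minimal sketch of formula (14), assuming a single-layer GRU whose final hidden state serves as the sentence vector (shapes follow the embodiment below):

```python
import torch
import torch.nn as nn

dim = 785                                   # spliced feature size: 768 + 16 + 1
gru = nn.GRU(input_size=dim, hidden_size=dim, batch_first=True)

f_s = torch.randn(1, 512, dim)              # word sequence features F_s
_, h_n = gru(f_s)                           # h_n: (1, 1, 785) final hidden state
r_s = h_n.squeeze(0)                        # sentence vector R_s, shape (1, 785)
```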
According to embodiments of the disclosure, this emotion tendency analysis method for text, based on BERT and on the improved Highway Network combining the self-attention mechanism with the GRU, can improve the accuracy of text emotion tendency analysis at the cost of only a small amount of additional computing resources.
According to the embodiment of the disclosure, the extracting semantic features, part-of-speech features and co-occurrence word features from the text to be analyzed respectively comprises the following steps:
inputting a text to be analyzed into a pre-training language representation model, and outputting semantic features;
embedding and representing the words that occur in both the text content and the comment content to obtain the co-occurrence word features;
and embedding and representing the part of speech of each word in the text to be analyzed to obtain part of speech characteristics.
According to an embodiment of the disclosure, traditional emotion tendency analysis methods adopt pre-trained word vectors such as word2vec or GloVe as the word embeddings of text words, and lack the ability to model the bidirectional relationships within a sentence and the dependencies between long-distance words. The present disclosure employs a pre-trained language representation model (the BERT model) based on the Transformer architecture as the semantic feature encoder of the text to be analyzed.
It should be noted that this encoder can be fine-tuned: the word embedding features obtained by running a self-supervised learning method over a massive corpus serve as the vector representation of the words of the text to be analyzed and are fine-tuned during training, finally yielding a text encoder that adjusts jointly according to context as the semantic feature encoder of the text to be analyzed.
According to an embodiment of the disclosure, adopting the BERT double-sentence mode not only provides a stronger ability to capture fine-grained dependencies between words within a sentence, but also adds, at a coarser granularity, the ability to analyze and judge the consistency of emotional tendency between sentences.
FIG. 5 schematically shows a flowchart of the method for obtaining the emotion tendency analysis model through pre-training according to an embodiment of the disclosure.
As shown in FIG. 5, the method 500 of this embodiment for obtaining the emotion tendency analysis model through pre-training includes operations S501 to S506.
In operation S501, sample semantic features, sample part-of-speech features, and sample co-occurrence features are extracted from training samples, respectively, where the training samples include: a text content sample, a comment content sample associated with the text content sample, and an emotion tag.
According to the embodiment of the disclosure, the initial data sample may be obtained first, and then the training sample is obtained after the initial data sample is cleaned. The cleaning of the initial data sample may include converting a traditional Chinese character into a simplified Chinese character and removing an emoticon in the initial data sample to obtain a first processed data sample. And then unifying the formats of the first processing data samples to obtain second processing data samples. And dividing the second processing data sample into a training sample and a verification sample according to a certain proportion. The proportion may be determined from an actual training model.
For example, post (blog) text content samples and the comment content samples associated with them may be obtained; traditional Chinese characters are converted to simplified Chinese characters through a conversion function (e.g., one from a langconv-style library), and the emoji emoticons in the text are then filtered out by calling a function in the emoji library. The data are re-unified into the format [content, comment, publisher, polarity], i.e., [blog content, blog comment, blog publisher, emotion tag], separated by English commas. For example: [this notebook computer works very well, runs quickly, and everyone is recommended to buy it, with this computer, besides running quickly, its price is also very cheap, very cost-effective!, 1111, +1], where +1 indicates that the emotion tag is positive. The emotion tag may also be represented by -1 for negative, 0 for neutral, and so on.
Finally, the data are uniformly divided into training samples and verification samples at a ratio of 3:1.
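A possible preprocessing sketch; zhconv and emoji are assumed here as the conversion and emoji-filtering libraries (the text names only a conversion function and the emoji library), and the raw record is invented:

```python
import emoji
from sklearn.model_selection import train_test_split
from zhconv import convert   # traditional -> simplified Chinese conversion

raw_records = [("這台筆記本電腦很好用", "運行很快，價格便宜", "1111", 1)]  # invented

def clean_sample(content: str, comment: str, publisher: str, label: int):
    content = convert(content, "zh-hans")            # to simplified Chinese
    comment = convert(comment, "zh-hans")
    content = emoji.replace_emoji(content, "")       # strip emoji emoticons
    comment = emoji.replace_emoji(comment, "")
    return [content, comment, publisher, label]      # unified record format

records = [clean_sample(*r) for r in raw_records]
train, valid = train_test_split(records, test_size=0.25)   # 3:1 split
```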
According to an embodiment of the present disclosure, the BERT double-sentence mode may be used to encode the training samples, each of which is composed of four parts: guid, the index of the current training sample; text_a, a blog content sample; text_b, the comment text associated with the blog content sample; and label, the emotion label. The sample semantic features are obtained from this encoding.
Characters that co-occur in text_a and text_b can be labeled as 1, and 0 otherwise; the sample co-occurrence word features obtained in this way assist the emotion tendency analysis.
For example, if text_a is: this notebook computer is very good in use and fast in running, and everyone is recommended to buy it
and the corresponding text_b is: the same type of computer, besides being fast in operation, is very cheap and very cost-effective!
then the sample co-occurrence feature may be expressed as: 01000111100111111000000
where characters such as "computer", "running fast" and "very" in text_a also appear in text_b.
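A sketch of this binary labeling (the Chinese strings below are an invented rendering of the example; padding and special tokens are ignored):

```python
def co_occurrence_labels(text_a: str, text_b: str) -> list[int]:
    # 1 if a character of text_a also occurs in text_b, else 0.
    chars_b = set(text_b)
    return [1 if ch in chars_b else 0 for ch in text_a]

labels = co_occurrence_labels("这台笔记本电脑很好用，运行快，推荐大家购买",
                              "同款电脑，除了运行快，价格也很便宜，很划算！")
print("".join(map(str, labels)))
```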
Sample part-of-speech features can be constructed by introducing the jieba word segmentation tool, to assist the emotion tendency analysis, as sketched after this note.
It should be noted that words of different parts of speech carry different amounts of emotional information. Adjectives such as "happy" and "angry" and verbs such as "oppose" express obvious praise or blame and contain clear emotional tendencies, while some conjunctions such as "and" and "then" can be considered to carry almost no emotional information. Based on this, the sample part-of-speech features are constructed to assist the emotion tendency analysis.
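For illustration, jieba's part-of-speech tagger can supply per-character tags as follows; jieba.posseg is the library's real interface, while the tag-to-id table is an assumption:

```python
import jieba.posseg as pseg

TAG2ID = {"a": 1, "v": 2, "n": 3, "c": 4, "x": 0}    # assumed tag vocabulary

def pos_ids(text: str) -> list[int]:
    ids = []
    for word, flag in pseg.cut(text):                 # e.g. ('开心', 'a')
        ids.extend([TAG2ID.get(flag[0], 0)] * len(word))  # one id per character
    return ids

print(pos_ids("这台电脑运行快"))
```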
In operation S502, the sample semantic features, the sample part-of-speech features, and the sample co-occurrence word features are concatenated to obtain sample concatenation features.
According to embodiments of the present disclosure, the sample co-occurrence word features may be input into an embedding layer of dimension 2 × 1, which maps them to a single-valued vector, e.g., [[0.0427], [0.0427], [-0.0015], [0.0427], …], of shape 1 × 512 × 1.
The sample part-of-speech features may be input into an embedding layer of dimension 30 × 16, mapping the part-of-speech information of each character to a 16-dimensional continuous vector; e.g., the adjective part of speech 'a' becomes [0.0133, -0.0248, -0.0248, 0.0067, …, -0.0092]. The part-of-speech feature vector of the whole sample has dimension 1 × 512 × 16.
The sample semantic features may be represented by the BERT model as a vector of dimension 1 × 512 × 768.
The sample semantic features, sample part-of-speech features and sample co-occurrence word features can be spliced in the third dimension to obtain a sample splicing feature of dimension 1 × 512 × 785.
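These shapes can be reproduced with two embedding layers and a concatenation along the third dimension (a sketch; the 2 × 1 and 30 × 16 embedding sizes follow the text, the rest is illustrative):

```python
import torch
import torch.nn as nn

co_embed = nn.Embedding(2, 1)      # co-occurrence indicator {0, 1} -> 1-dim
pos_embed = nn.Embedding(30, 16)   # up to 30 POS tags -> 16-dim

semantic = torch.randn(1, 512, 768)             # BERT output
co_ids = torch.randint(0, 2, (1, 512))          # binary co-occurrence labels
pos_tag_ids = torch.randint(0, 30, (1, 512))    # POS tag ids

spliced = torch.cat([semantic, pos_embed(pos_tag_ids), co_embed(co_ids)], dim=2)
print(spliced.shape)               # torch.Size([1, 512, 785])
```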
In operation S503, the sample concatenation feature is preprocessed to obtain a sample word sequence feature in the training sample.
According to an embodiment of the disclosure, the sample splicing features can be input into the Highway Network for further feature extraction; a module in the Highway Network may be, for example, a 785 × 785 fully connected layer, and the sample word sequence features of the training sample are output without changing the shape of the vector.
In operation S504, the sample word sequence features in the training sample are aggregated to obtain a sample sentence vector of the training sample.
According to an embodiment of the present disclosure, a sample word sequence feature in a training sample may be input to a GRU for encoding, resulting in a sample sentence vector of the training sample, for example, the dimension may be 1 × 785.
In operation S505, the sentence vectors of the sample are input into the classification model, and the emotion tendency classification result of the training sample is output.
According to an embodiment of the present disclosure, the sample sentence vector may be input into a 785 × 3 fully connected layer, which outputs a score for each emotion tendency category. For example, for the training sample [this notebook computer works very well, runs quickly, and everyone is recommended to buy it, with this computer, besides running quickly, its price is also very cheap, very cost-effective!, abcd, +1], the output is, for example, tensor([-2.8360, -3.6533, 5.5173]); the score of the third class is the highest, and the corresponding emotion label is +1, i.e., the emotion tendency classification result of the training sample is positive.
In operation S506, parameters of the classification model are adjusted based on the emotion tendency classification result and the emotion label, and the trained classification model is used as an emotion tendency analysis model.
According to an embodiment of the present disclosure, the loss may be calculated with the cross-entropy loss function from the scores (logits) in the emotion tendency classification result and the emotion labels (labels), for example Loss = CrossEntropyLoss(logits, label) = 0.2660. The parameters of the classification model are adjusted according to the loss value, and the trained classification model is used as the emotion tendency analysis model.
According to an embodiment of the disclosure, model parameters such as the embedding of each character and the parameters of the BERT model, the Highway Network and the GRU can be updated during loss back-propagation; the Adam optimizer is adopted with an initial learning rate of 1e-5, iteration continues until the loss falls below 1, and a converged model is obtained for prediction.
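A condensed training-loop sketch consistent with these details (Adam, learning rate 1e-5, cross-entropy loss); here `model` and `train_loader` are assumed placeholders bundling the modules and data pipeline sketched above:

```python
import torch
import torch.nn as nn

# 'model' is assumed to chain BERT + the embeddings + the attention-gated
# highway + the GRU + the 785 x 3 classification head sketched earlier.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
criterion = nn.CrossEntropyLoss()

for batch in train_loader:                    # assumed DataLoader of samples
    logits = model(batch["text_a"], batch["text_b"])   # (batch, 3) class scores
    loss = criterion(logits, batch["label"])  # cross-entropy against labels
    optimizer.zero_grad()
    loss.backward()                           # back-propagation updates BERT,
    optimizer.step()                          # highway, GRU and embeddings
```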
Based on the method for emotion tendency analysis of text, the disclosure also provides a device for emotion tendency analysis of text. The device will be described in detail below with reference to FIG. 6.
FIG. 6 schematically shows a block diagram of the device for emotion tendency analysis of text according to an embodiment of the present disclosure.
As shown in fig. 6, the apparatus 600 for analyzing emotion tendencies of texts in this embodiment includes a feature extraction module 610, a feature concatenation module 620, a preprocessing module 630, an aggregation module 640, and an analysis module 650.
The feature extraction module 610 is configured to respectively extract semantic features, part-of-speech features, and co-occurrence features from a text to be analyzed, where the text to be analyzed includes text content and comment content associated with the text content. In an embodiment, the feature extraction module 610 may be configured to perform the operation S201 described above, which is not described herein again.
The feature concatenation module 620 is configured to concatenate the semantic feature, the part-of-speech feature, and the co-occurrence word feature to obtain a concatenation feature. In an embodiment, the feature splicing module 620 may be configured to perform the operation S202 described above, which is not described herein again.
The preprocessing module 630 is configured to preprocess the concatenation characteristics to obtain word sequence characteristics in the text to be analyzed. In an embodiment, the preprocessing module 630 may be configured to perform the operation S203 described above, which is not described herein again.
The aggregation module 640 is configured to aggregate word sequence features in the text to be analyzed to obtain a sentence vector of the text to be analyzed. In an embodiment, the aggregation module 640 may be configured to perform the operation S204 described above, which is not described herein again.
The analysis module 650 is configured to input the sentence vector into the emotion tendency analysis model, and output an emotion tendency analysis result of the text to be analyzed. In an embodiment, the analysis module 650 may be configured to perform the operation S205 described above, which is not described herein again.
According to an embodiment of the present disclosure, any plurality of the feature extraction module 610, the feature concatenation module 620, the preprocessing module 630, the aggregation module 640, and the analysis module 650 may be combined into one module to be implemented, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the feature extraction module 610, the feature concatenation module 620, the preprocessing module 630, the aggregation module 640, and the analysis module 650 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or in any one of three implementations of software, hardware, and firmware, or in a suitable combination of any of them. Alternatively, at least one of the feature extraction module 610, the feature concatenation module 620, the pre-processing module 630, the aggregation module 640 and the analysis module 650 may be at least partially implemented as a computer program module, which when executed, may perform a corresponding function.
FIG. 7 schematically shows a block diagram of an electronic device suitable for the emotion tendency analysis method of text according to an embodiment of the present disclosure.
As shown in fig. 7, an electronic device 700 according to an embodiment of the present disclosure includes a processor 701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The processor 701 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 701 may also include on-board memory for caching purposes. The processor 701 may comprise a single processing unit or a plurality of processing units for performing the different actions of the method flows according to embodiments of the present disclosure.
In the RAM 703, various programs and data necessary for the operation of the electronic device 700 are stored. The processor 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. The processor 701 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 702 and/or the RAM 703. It is noted that the programs may also be stored in one or more memories other than the ROM 702 and RAM 703. The processor 701 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
Electronic device 700 may also include input/output (I/O) interface 705, which input/output (I/O) interface 705 is also connected to bus 704, according to an embodiment of the present disclosure. The electronic device 700 may also include one or more of the following components connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is mounted into the storage section 708 as necessary.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 702 and/or the RAM 703 and/or one or more memories other than the ROM 702 and the RAM 703 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated in the flowchart. When the computer program product runs on a computer system, the program code causes the computer system to implement the method provided by the embodiments of the present disclosure.
When executed by the processor 701, the computer program performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure. According to embodiments of the present disclosure, the systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical or magnetic storage device. In another embodiment, the computer program may be transmitted in the form of a signal on a network medium, distributed, and downloaded and installed via the communication section 709 and/or installed from the removable medium 711. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to wireless or wired media, or any suitable combination of the foregoing.
In accordance with embodiments of the present disclosure, the computer programs provided by the embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, they may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, the "C" language, and the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure may be combined in various ways, even if such combinations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined without departing from the spirit or teaching of the present disclosure. All such combinations are within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that measures from different embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and their equivalents. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to fall within the scope of the present disclosure.

Claims (10)

1. A method for emotion tendency analysis of a text, comprising:
respectively extracting semantic features, part-of-speech features and co-occurrence word features from a text to be analyzed, wherein the text to be analyzed comprises text content and comment content associated with the text content;
splicing the semantic features, the part-of-speech features and the co-occurrence word features to obtain spliced features;
preprocessing the spliced features to obtain word sequence features in the text to be analyzed;
aggregating the word sequence features in the text to be analyzed to obtain a sentence vector of the text to be analyzed; and
inputting the sentence vector into an emotion tendency analysis model, and outputting an emotion tendency analysis result of the text to be analyzed.
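For concreteness, the following is a minimal sketch of the five steps of claim 1; the use of PyTorch, the toy feature dimensions, and the module names are illustrative assumptions, not the patent's reference implementation.

    # Minimal sketch of the claim 1 pipeline (assumptions: PyTorch, toy
    # dimensions; the individual sub-steps are elaborated in claims 2-5).
    import torch
    import torch.nn as nn

    class SentimentPipeline(nn.Module):
        def __init__(self, sem_dim=768, pos_dim=32, cooc_dim=32, hidden=256, n_classes=3):
            super().__init__()
            spliced_dim = sem_dim + pos_dim + cooc_dim            # width after splicing
            self.preprocess = nn.Linear(spliced_dim, hidden)      # stand-in for claim 2's preprocessing
            self.gru = nn.GRU(hidden, hidden, batch_first=True)   # aggregation, cf. claim 4
            self.classifier = nn.Linear(hidden, n_classes)        # emotion tendency analysis model

        def forward(self, sem, pos, cooc):
            x = torch.cat([sem, pos, cooc], dim=-1)   # splice the three feature streams
            x = torch.tanh(self.preprocess(x))        # word sequence features
            _, h = self.gru(x)                        # aggregate into a sentence vector
            return self.classifier(h[-1])             # emotion tendency logits

    # Toy usage: a batch of 2 texts, 10 tokens each.
    sem = torch.randn(2, 10, 768)    # semantic features
    pos = torch.randn(2, 10, 32)     # part-of-speech features
    cooc = torch.randn(2, 10, 32)    # co-occurrence word features
    print(SentimentPipeline()(sem, pos, cooc).shape)  # torch.Size([2, 3])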
2. The method according to claim 1, wherein preprocessing the spliced features to obtain the word sequence features in the text to be analyzed comprises:
performing a linear activation transformation on the spliced features to generate linearly activated features;
performing a nonlinear activation transformation on the spliced features to generate nonlinearly activated features;
obtaining a weight vector by using an attention mechanism based on the spliced features; and
combining the linearly activated features and the nonlinearly activated features according to the weight vector to obtain the word sequence features in the text to be analyzed.
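One plausible reading of claim 2's preprocessing is a highway-style gate: a weight vector blends a linear branch and a nonlinear branch of the spliced features. The sketch below assumes a sigmoid gate standing in for the attention-derived weight vector of claim 3; the exact form of the combination is an assumption.

    # Sketch of claim 2: a weight vector g blends linearly and nonlinearly
    # activated branches of the spliced features (assumed gating form).
    import torch
    import torch.nn as nn

    class GatedPreprocess(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.linear_branch = nn.Linear(dim, dim)     # linear activation transformation
            self.nonlinear_branch = nn.Linear(dim, dim)  # followed by tanh below
            self.gate = nn.Linear(dim, dim)              # stand-in for the attention-based weight vector

        def forward(self, spliced):
            lin = self.linear_branch(spliced)                    # linearly activated features
            nonlin = torch.tanh(self.nonlinear_branch(spliced))  # nonlinearly activated features
            g = torch.sigmoid(self.gate(spliced))                # weight vector in (0, 1)
            return g * nonlin + (1 - g) * lin                    # word sequence features

    x = torch.randn(2, 10, 64)           # batch of spliced features
    print(GatedPreprocess(64)(x).shape)  # torch.Size([2, 10, 64])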
3. The method of claim 2, wherein obtaining the weight vector by using the attention mechanism based on the spliced features comprises:
converting the spliced features into a query vector, a key vector, and a value vector according to different mapping matrices;
obtaining spliced features with different weight values by using the attention mechanism;
reducing the dimensionality of the spliced features with different weight values through a feed-forward layer to obtain dimension-reduced spliced features with different weight values; and
generating the weight vector from the dimension-reduced spliced features with different weight values through a nonlinear activation function.
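Claim 3's weight-vector computation can be sketched as scaled dot-product self-attention followed by a feed-forward dimension reduction and a nonlinear activation; the single attention head, the sigmoid activation, and the sizes below are assumptions.

    # Sketch of claim 3: self-attention over the spliced features, then a
    # feed-forward layer for dimension reduction, then a sigmoid.
    import math
    import torch
    import torch.nn as nn

    class AttentionWeightVector(nn.Module):
        def __init__(self, dim, reduced_dim):
            super().__init__()
            self.wq = nn.Linear(dim, dim)           # mapping matrix for the query vector
            self.wk = nn.Linear(dim, dim)           # mapping matrix for the key vector
            self.wv = nn.Linear(dim, dim)           # mapping matrix for the value vector
            self.ffn = nn.Linear(dim, reduced_dim)  # feed-forward layer: dimension reduction

        def forward(self, spliced):
            q, k, v = self.wq(spliced), self.wk(spliced), self.wv(spliced)
            scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
            weighted = torch.softmax(scores, dim=-1) @ v  # spliced features with different weights
            return torch.sigmoid(self.ffn(weighted))      # weight vector via nonlinear activation

    x = torch.randn(2, 10, 64)                     # spliced features
    print(AttentionWeightVector(64, 32)(x).shape)  # torch.Size([2, 10, 32])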
4. The method according to claim 1, wherein aggregating the word sequence features in the text to be analyzed to obtain the sentence vector of the text to be analyzed comprises:
inputting the word sequence features in the text to be analyzed into a gated recurrent unit; and
outputting the sentence vector of the text to be analyzed.
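Claim 4's aggregation maps directly onto a standard gated recurrent unit (GRU); in the sketch below, taking the final hidden state as the sentence vector is an assumption.

    # Sketch of claim 4: a GRU aggregates the word sequence features into
    # a fixed-size sentence vector.
    import torch
    import torch.nn as nn

    gru = nn.GRU(input_size=64, hidden_size=128, batch_first=True)
    word_seq = torch.randn(2, 10, 64)  # word sequence features for 2 texts
    _, h_n = gru(word_seq)
    sentence_vec = h_n[-1]             # sentence vector, shape (2, 128)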
5. The method as claimed in claim 1, wherein extracting the semantic features, the part-of-speech features, and the co-occurrence word features from the text to be analyzed respectively comprises:
inputting the text to be analyzed into a pre-trained language representation model, and outputting the semantic features;
embedding the words that appear in both the text content and the comment content to obtain the co-occurrence word features; and
embedding the part of speech of each word in the text to be analyzed to obtain the part-of-speech features.
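As one way to realize claim 5's three extractors, the sketch below uses the HuggingFace transformers library with bert-base-chinese as the pre-trained language representation model; the library choice, the co-occurrence flag construction, and the faked part-of-speech tag ids are all assumptions.

    # Sketch of claim 5: semantic, co-occurrence word, and part-of-speech
    # feature extraction (assumptions: transformers, bert-base-chinese).
    import torch
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
    bert = AutoModel.from_pretrained("bert-base-chinese")

    text = "这部电影非常好看"   # text content ("this movie is very good")
    comment = "确实好看"        # associated comment content ("indeed good")

    # Semantic features: contextual token representations from the
    # pre-trained language representation model.
    inputs = tokenizer(text, comment, return_tensors="pt")
    semantic = bert(**inputs).last_hidden_state      # shape (1, seq_len, 768)

    # Co-occurrence word features: embed a 0/1 flag marking tokens that
    # appear in both the text content and the comment content.
    shared = set(tokenizer.tokenize(text)) & set(tokenizer.tokenize(comment))
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    flags = torch.tensor([[1 if t in shared else 0 for t in tokens]])
    cooccurrence = nn.Embedding(2, 32)(flags)        # shape (1, seq_len, 32)

    # Part-of-speech features: embed one tag id per token; real tag ids
    # would come from an external POS tagger, faked as zeros here.
    pos_ids = torch.zeros_like(inputs["input_ids"])
    part_of_speech = nn.Embedding(60, 32)(pos_ids)   # shape (1, seq_len, 32)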
6. The method of claim 1, wherein the emotion tendency analysis model is obtained by pre-training, the pre-training comprising:
respectively extracting sample semantic features, sample part-of-speech features and sample co-occurrence word features from training samples, wherein the training samples comprise: a text content sample, a comment content sample associated with the text content sample, and an emotion tag;
splicing the sample semantic features, the sample part-of-speech features, and the sample co-occurrence word features to obtain sample spliced features;
preprocessing the sample spliced features to obtain sample word sequence features in the training samples;
aggregating the sample word sequence features in the training samples to obtain sample sentence vectors of the training samples;
inputting the sample sentence vectors into a classification model, and outputting an emotion tendency classification result of the training samples; and
adjusting parameters of the classification model based on the emotion tendency classification result and the emotion tag, and taking the trained classification model as the emotion tendency analysis model.
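Claim 6's pre-training can be sketched as an ordinary supervised loop over labeled samples; the SentimentPipeline class from the sketch after claim 1, the cross-entropy loss, the Adam optimizer, and the hypothetical train_batches loader are all assumptions.

    # Sketch of claim 6: pre-train the classification model on labeled
    # samples (train_batches is a hypothetical iterator yielding
    # (semantic, part-of-speech, co-occurrence, emotion_tag) tuples).
    import torch
    import torch.nn as nn

    model = SentimentPipeline()  # defined in the sketch after claim 1
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = nn.CrossEntropyLoss()

    for sem, pos, cooc, emotion_tag in train_batches:  # hypothetical loader
        logits = model(sem, pos, cooc)         # sample sentence vectors -> classification result
        loss = criterion(logits, emotion_tag)  # compare the result with the emotion tags
        optimizer.zero_grad()
        loss.backward()                        # adjust the classification model's parameters
        optimizer.step()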
7. An emotion tendency analysis device for a text, comprising:
a feature extraction module configured to respectively extract semantic features, part-of-speech features, and co-occurrence word features from a text to be analyzed, wherein the text to be analyzed comprises text content and comment content associated with the text content;
a feature splicing module configured to splice the semantic features, the part-of-speech features, and the co-occurrence word features to obtain spliced features;
a preprocessing module configured to preprocess the spliced features to obtain word sequence features in the text to be analyzed;
an aggregation module configured to aggregate the word sequence features in the text to be analyzed to obtain a sentence vector of the text to be analyzed; and
an analysis module configured to input the sentence vector into an emotion tendency analysis model and output an emotion tendency analysis result of the text to be analyzed.
8. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any one of claims 1 to 6.
9. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any one of claims 1 to 6.
10. A computer program product comprising a computer program which, when executed by a processor, carries out the method according to any one of claims 1 to 6.
CN202210433270.6A 2022-04-24 2022-04-24 Method, device, equipment and medium for analyzing emotion tendentiousness of text Pending CN114691836A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210433270.6A CN114691836A (en) 2022-04-24 2022-04-24 Method, device, equipment and medium for analyzing emotion tendentiousness of text

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210433270.6A CN114691836A (en) 2022-04-24 2022-04-24 Method, device, equipment and medium for analyzing emotion tendentiousness of text

Publications (1)

Publication Number Publication Date
CN114691836A true CN114691836A (en) 2022-07-01

Family

ID=82145235

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210433270.6A Pending CN114691836A (en) 2022-04-24 2022-04-24 Method, device, equipment and medium for analyzing emotion tendentiousness of text

Country Status (1)

Country Link
CN (1) CN114691836A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination