CN113536803A - Text information processing device and method, computer equipment and readable storage medium - Google Patents
- Publication number
- CN113536803A (application CN202010639599.9A)
- Authority
- CN
- China
- Prior art keywords
- text
- analyzed
- semantic feature
- word
- loss function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a text information processing device and method, a computer device and a readable storage medium. One embodiment of the device comprises: an encoder based on the BERT model, used for extracting semantic feature vectors of the text to be analyzed; and an information processing module, used for minimizing an objective function comprising a first loss function and a second loss function according to the semantic feature vector of the text to be analyzed, to obtain at least one aspect word contained in the text to be analyzed and the information polarity of each aspect word, wherein the objective of the first loss function is to label the start and end positions of the aspect words in the text to be analyzed, and the objective of the second loss function is to classify the information polarity of the aspect words. This implementation builds an end-to-end fine-grained information processing model that simultaneously realizes multi-aspect word extraction and information polarity classification, and can improve the accuracy and recall of fine-grained information processing.
Description
Technical Field
The invention relates to the technical field of text analysis, and more particularly to a text information processing apparatus and method, a computer device, and a readable storage medium.
Background
With the rise of online social networks, large numbers of users express their experiences and evaluations of life, events, products and the like by publishing text on the Internet. These textual expressions provide a data basis for text information processing research, which studies the attitudes and opinions that people express in text. Fine-grained information processing is one of its subdivided fields, studying attitudes and viewpoints at a fine granularity. It still faces many difficulties and challenges in task definition, data preparation and method effectiveness. First, classifying text information by fine-grained information polarity can be used to extract, from text, information words carrying attitudes and viewpoints, and the related research has great application value in public opinion monitoring. Most previous information classification research assumes that only one viewpoint exists in a text, ignoring the fact that social network texts often contain multiple viewpoints. Identifying all the viewpoints a text contains is challenging, especially for short texts. Second, relating the various viewpoints to their aspects is the aspect-level information processing problem, a study of fine-grained information that can be further divided into two categories: information processing for aspect words, and information processing for aspect categories. Designing a unified method that solves both aspect-level information processing problems simultaneously is challenging.
The rapid development of deep learning has provided new methods for fine-grained information processing. Google's open-source pre-trained language model BERT achieved state-of-the-art results on 11 natural language processing tasks. BERT stands for Bidirectional Encoder Representations from Transformers; it is a novel language model in that it trains a deep bidirectional representation by jointly conditioning on both directions in all Transformer layers. Pre-trained language models play an important role in many natural language processing tasks, such as the SQuAD question-answering task, named entity recognition, and opinion recognition. At present, there are two main strategies for applying a pre-trained language model to an NLP task: one is the feature-based language model, such as ELMo; the other is the fine-tuning-based language model, such as OpenAI GPT. Each type has advantages and disadvantages, and BERT combines the advantages of both, so that optimal results can be achieved on many downstream tasks. However, the existing BERT model suffers from low accuracy and recall when applied to information processing of texts such as restaurant-review social network texts.
Therefore, it is desirable to provide a new text information processing apparatus and method, a computer device, and a readable storage medium.
Disclosure of Invention
The invention aims to provide a text information processing device and method, a computer device and a readable storage medium, so as to solve at least one of the problems in the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme:
a first aspect of the present invention provides a text information processing apparatus comprising:
an encoder based on the BERT model, used for extracting semantic feature vectors of the text to be analyzed; and
an information processing module, used for minimizing an objective function comprising a first loss function and a second loss function according to the semantic feature vector of the text to be analyzed, to obtain at least one aspect word contained in the text to be analyzed and the information polarity of each aspect word, wherein the objective of the first loss function is to label the start and end positions of the aspect words in the text to be analyzed, and the objective of the second loss function is to classify the information polarity of the aspect words.
According to the text information processing device provided by the first aspect of the invention, an end-to-end fine-grained information processing model that simultaneously realizes multi-aspect word extraction and information polarity classification is constructed, and the accuracy and recall of fine-grained information processing can be improved.
Optionally, the first loss function and the second loss function are cross entropy loss functions, respectively.
Optionally, the first loss function is:
$$loss_{asp} = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i^{s}\log\hat{p}_i^{s} + (1-y_i^{s})\log(1-\hat{p}_i^{s}) + y_i^{e}\log\hat{p}_i^{e} + (1-y_i^{e})\log(1-\hat{p}_i^{e})\right]$$

wherein $\hat{p}^{s}$ is the probability distribution of the starting position of the aspect word over the positions of the text to be analyzed, $\hat{p}^{e}$ is the probability distribution of the ending position, and n is the word length of the text to be analyzed.
Optionally, the probability distribution $\hat{p}^{s}$ of the starting position of the aspect word at each position of the text to be analyzed is expressed as:

$$\hat{p}^{s} = \sigma(h^{L} W_{start} + b_{start})$$

wherein $W_{start}$ is a first trainable weight vector, $b_{start}$ is the first bias term, $\sigma$ is the sigmoid activation function, $S_{s}$ is the binary classification sequence labeling the starting positions, and $h^{L}$ is the semantic feature vector output by the L-th Transformer layer of the BERT model, the BERT model comprising L Transformer layers;

the probability distribution $\hat{p}^{e}$ of the ending position of the aspect word at each position of the text to be analyzed is expressed as:

$$\hat{p}^{e} = \sigma(h^{L} W_{end} + b_{end})$$

wherein $W_{end}$ is a second trainable weight vector, $b_{end}$ is the second bias term, and $S_{e}$ is the binary classification sequence labeling the ending positions.
This optional mode uses two binary classification sequences to judge the confidence that each sequence position of the semantic feature vector output by the L-th Transformer layer of the BERT model is a start or end position of an aspect word, so that the aspect words in the text to be analyzed can be extracted accurately.
Optionally, the second loss function is:
$$loss_{p} = -\sum_{c=1}^{k} y_{c}^{p}\log\hat{y}_{c}^{p}$$

wherein k is the number of classification labels of the information polarity; $y^{p}$ is the known correct classification label; $\hat{y}^{p}$ is the predicted probability of the information polarity classification of the aspect word, expressed as $\hat{y}^{p} = \mathrm{softmax}(W_{p}h_{p} + b_{p})$; $h_{p}$ is the composite semantic feature vector obtained by concatenating the semantic feature vector $h_{cls}$ corresponding to the mark symbol [CLS] in the semantic feature vector output by the L-th Transformer layer of the BERT model with the semantic feature vector $h_{asp}$ of the aspect word in that output; $W_{p}$ is a parameter matrix involved in training, $W_{p} \in R^{k \times H}$, H being the number of hidden-layer units, and all parameters of $W_{p}$ are jointly fine-tuned to achieve the minimization solution; $b_{p}$ is the third bias term.
In this optional mode, according to the extracted aspect words and their boundaries, the aspect word representation is obtained from the semantic feature vector output by the L-th Transformer layer of the BERT model acquired from the shared layer, and is concatenated with the overall semantic representation for information polarity classification, so that the information polarity of the aspect words in the text to be analyzed can be classified accurately.
The second aspect of the present invention provides a text information processing method, including:
adopting a BERT model as an encoder to extract semantic feature vectors of a text to be analyzed;
and, according to the semantic feature vector of the text to be analyzed, minimizing an objective function comprising a first loss function and a second loss function to obtain at least one aspect word contained in the text to be analyzed and the information polarity of each aspect word, wherein the objective of the first loss function is to label the start and end positions of the aspect words in the text to be analyzed, and the objective of the second loss function is to classify the information polarity of the aspect words.
Optionally, the first loss function and the second loss function are cross entropy loss functions, respectively.
Optionally, the first loss function is:
$$loss_{asp} = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i^{s}\log\hat{p}_i^{s} + (1-y_i^{s})\log(1-\hat{p}_i^{s}) + y_i^{e}\log\hat{p}_i^{e} + (1-y_i^{e})\log(1-\hat{p}_i^{e})\right]$$

wherein $\hat{p}^{s}$ is the probability distribution of the starting position of the aspect word over the positions of the text to be analyzed, $\hat{p}^{e}$ is the probability distribution of the ending position, and n is the word length of the text to be analyzed.
Optionally, the probability distribution $\hat{p}^{s}$ of the starting position of the aspect word at each position of the text to be analyzed is expressed as:

$$\hat{p}^{s} = \sigma(h^{L} W_{start} + b_{start})$$

wherein $W_{start}$ is a first trainable weight vector, $b_{start}$ is the first bias term, $\sigma$ is the sigmoid activation function, $S_{s}$ is the binary classification sequence labeling the starting positions, and $h^{L}$ is the semantic feature vector output by the L-th Transformer layer of the BERT model, the BERT model comprising L Transformer layers;

the probability distribution $\hat{p}^{e}$ of the ending position of the aspect word at each position of the text to be analyzed is expressed as:

$$\hat{p}^{e} = \sigma(h^{L} W_{end} + b_{end})$$

wherein $W_{end}$ is a second trainable weight vector, $b_{end}$ is the second bias term, and $S_{e}$ is the binary classification sequence labeling the ending positions.
Optionally, the second loss function is:
$$loss_{p} = -\sum_{c=1}^{k} y_{c}^{p}\log\hat{y}_{c}^{p}$$

wherein k is the number of classification labels of the information polarity; $y^{p}$ is the known correct classification label; $\hat{y}^{p}$ is the predicted probability of the information polarity classification of the aspect word, expressed as $\hat{y}^{p} = \mathrm{softmax}(W_{p}h_{p} + b_{p})$; $h_{p}$ is the composite semantic feature vector obtained by concatenating the semantic feature vector $h_{cls}$ corresponding to the mark symbol [CLS] in the semantic feature vector output by the L-th Transformer layer of the BERT model with the semantic feature vector $h_{asp}$ of the aspect word in that output; $W_{p}$ is a parameter matrix involved in training, $W_{p} \in R^{k \times H}$, H being the number of hidden-layer units, and all parameters of $W_{p}$ are jointly fine-tuned to achieve the minimization solution; $b_{p}$ is the third bias term.
A third aspect of the present invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the text information processing method provided by the second aspect of the present invention when executing the program.
A fourth aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, which, when executed by a processor, implements the text information processing method provided by the second aspect of the present invention.
The invention has the following beneficial effects:
according to the technical scheme, an end-to-end fine-grained information processing model capable of simultaneously realizing extraction of words and information polarity classification is constructed, and the accuracy and the recall rate of fine-grained information processing can be improved.
Drawings
The following detailed description of embodiments of the invention is provided in conjunction with the appended drawings:
fig. 1 is a schematic diagram illustrating an overall framework of an end-to-end fine-grained information processing model constructed by a text information processing apparatus according to an embodiment of the present invention.
Fig. 2 is a flowchart illustrating a text information processing method according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a computer system that implements a text information processing apparatus according to an embodiment of the present invention.
Detailed Description
In order to more clearly illustrate the invention, the invention is further described below with reference to preferred embodiments and the accompanying drawings. Similar parts in the figures are denoted by the same reference numerals. It is to be understood by persons skilled in the art that the following detailed description is illustrative and not restrictive, and is not to be taken as limiting the scope of the invention.
As shown in fig. 1, an embodiment of the present invention provides a text information processing apparatus including:
an encoder based on the BERT model, used for extracting semantic feature vectors of the text to be analyzed; and
an information processing module, used for minimizing an objective function comprising a first loss function and a second loss function according to the semantic feature vector of the text to be analyzed, to obtain at least one aspect word contained in the text to be analyzed and the information polarity of each aspect word, wherein the objective of the first loss function is to label the start and end positions of the aspect words in the text to be analyzed, and the objective of the second loss function is to classify the information polarity of the aspect words.
The text information processing device provided by this embodiment constructs an end-to-end fine-grained information processing model that simultaneously realizes multi-aspect word extraction and information polarity classification, and can improve the accuracy and recall of fine-grained information processing.
The end-to-end fine-grained information processing model constructed by the text information processing device of this embodiment uses the BERT model as an encoder to extract the semantic feature vector of the text to be analyzed as a shared layer, and can simultaneously perform, for the text to be analyzed, the two tasks of multi-aspect word extraction and information polarity classification based on those aspect words. Taking an English comment text as an example, for the restaurant review "It is sad that everything about this place was great except the steak (even the service and the decor)", the model can simultaneously extract the multiple aspect words in the text, namely service, decor and steak, together with their corresponding information polarities, namely positive, positive and negative. The text information processing device of this embodiment is also applicable to Chinese texts.
As shown in Table 1, unlike the existing aspect-based information processing model (ABSA), the end-to-end fine-grained information processing model (E2E-ABSA) scheme constructed by the text information processing device of this embodiment only needs the text to be analyzed (a sequence, for example a short APP comment) as input, and can output all the aspect words (Aspect) in the text together with their corresponding information polarities (Sentiment polarity):
TABLE 1
In some optional implementations of this embodiment, the extraction of the semantic feature vector of the text to be analyzed by the BERT-model-based encoder specifically comprises: at the data input end, the input text to be analyzed is first segmented by a tokenizer to obtain the tokenized sequence X = (x_1, x_2, ..., x_n); the dictionary of the tokenizer is consistent in size with that of the BERT model and contains 30,522 wordpieces (words or word fragments). The sequence X is then encoded into a token embedding, a segment embedding and a position embedding. The three vectors are added to give the total input embedding representation (input vector representation) $h^{0}$; the input vector $h^{0}$ then passes through the L Transformer layers of the BERT model to obtain the semantic feature vector $h^{L}$ of the text to be analyzed, wherein the input vector representation $h^{0}$ and the output of the i-th Transformer layer are:

$$h^{0} = XW_{t} + W_{s} + W_{p}$$
$$h^{i} = \mathrm{Transformer}(h^{i-1}),\quad i \in [1, L]$$

wherein $W_{t}$ is the word embedding matrix, $W_{p}$ is the position embedding matrix, $W_{s}$ is the sentence embedding matrix, and $h^{i}$ is the output (hidden-layer vector) of the i-th Transformer layer.
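The embedding-and-encoding step above can be sketched in miniature as follows (NumPy, with toy dimensions, random weights and a single segment; this illustrates the $h^{0} = XW_{t} + W_{s} + W_{p}$ composition and the layer recursion, and is not the actual BERT implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

n, H, vocab = 6, 16, 100           # sequence length, hidden size, toy vocabulary
token_ids = rng.integers(0, vocab, size=n)

W_t = rng.normal(size=(vocab, H))  # word (token) embedding matrix
W_s = rng.normal(size=(1, H))      # sentence (segment) embedding, single segment
W_p = rng.normal(size=(n, H))      # position embedding matrix

# h0 = X*Wt + Ws + Wp : total input embedding representation
h0 = W_t[token_ids] + W_s + W_p
assert h0.shape == (n, H)

def transformer_layer(h):
    """Stand-in for one Transformer layer: scaled self-attention plus residual."""
    scores = h @ h.T / np.sqrt(H)
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    return h + attn @ h

h = h0
for _ in range(4):                 # L = 4 toy layers instead of BERT's 12 or 24
    h = transformer_layer(h)
hL = h                             # shared semantic feature vectors h^L
```

The final `hL` plays the role of the shared coding layer from which both the extraction task and the classification task read.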
In some optional implementation manners of this embodiment, the information processing module is configured to execute two tasks, which are specifically as follows:
for the multifaceted word extraction (Multi-Aspect Extractor) task:
different from the existing sequence table labeling scheme, the embodiment adopts a double-pointer labeling mode to realize an Aspect word (Aspect) extraction task. The method specifically comprises the following steps:
as shown in FIG. 1, two binary 0/1 sequences S are usedsAnd SeMarking the starting and ending positions of the Aspect word (Aspect) in the text to be analyzed, specifically determining the starting and ending positions of the Aspect word (Aspect) by judging the possibility that an input sequence (semantic feature vector output by an L-th layer Transformer network of a BERT model) is 0/1 at each position by adopting two binary classification sequences, namely determining the starting and ending positions of the Aspect word (Aspect) by SsAnd SeThe confidence level of the starting and ending positions of the Aspect word (Aspect) at each position in the system is possible to determine the Aspect word (Aspect).
Wherein the probability distribution $\hat{p}^{s}$ (confidence) that the start position of a certain aspect word (Aspect) occurs at each position in $S_{s}$ is expressed as:

$$\hat{p}^{s} = \sigma(h^{L} W_{start} + b_{start})$$

wherein $W_{start}$ is the first trainable weight vector, $b_{start}$ is the first bias term, $\sigma$ is the sigmoid activation function, $S_{s}$ is the binary classification sequence labeling the start positions, and $h^{L}$ is the semantic feature vector (coded representation) output by the L-th Transformer layer of the BERT model;

similarly, the probability distribution $\hat{p}^{e}$ (confidence) that the end position of a certain aspect word (Aspect) occurs at each position in $S_{e}$ is expressed as:

$$\hat{p}^{e} = \sigma(h^{L} W_{end} + b_{end})$$

wherein $W_{end}$ is the second trainable weight vector, $b_{end}$ is the second bias term, and $S_{e}$ is the binary classification sequence labeling the end positions.

Finally, the two vectors $\hat{p}^{s}$ and $\hat{p}^{e}$ are obtained, and the objective function of the training (specifically, a binary cross entropy loss function) is:

$$loss_{asp} = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i^{s}\log\hat{p}_i^{s} + (1-y_i^{s})\log(1-\hat{p}_i^{s}) + y_i^{e}\log\hat{p}_i^{e} + (1-y_i^{e})\log(1-\hat{p}_i^{e})\right]$$
where n is the word length of the text to be analyzed (the number of words for an English text, the number of segmented words for a Chinese text).
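With these definitions, the boundary loss can be computed numerically as follows (a sketch with made-up label sequences and sigmoid outputs; the binary cross-entropy form over the start and end sequences matches the objective stated above):

```python
import numpy as np

def binary_ce(y, p, eps=1e-12):
    """Binary cross entropy averaged over the n token positions."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Toy start/end label sequences for a 6-token text (1 marks a boundary position).
y_start = np.array([0, 1, 0, 0, 1, 0])
y_end   = np.array([0, 1, 0, 0, 0, 1])
# Assumed sigmoid outputs p_hat at each position (illustrative values only).
p_start = np.array([0.1, 0.9, 0.2, 0.1, 0.8, 0.1])
p_end   = np.array([0.1, 0.8, 0.1, 0.2, 0.1, 0.9])

# loss_asp sums the binary cross entropies of the start and end predictions.
loss_asp = binary_ce(y_start, p_start) + binary_ce(y_end, p_end)
```

The closer each predicted probability is to its 0/1 label, the smaller `loss_asp` becomes, which is what the minimization solution drives toward.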
For example, for the text to be analyzed illustrated in FIG. 1, "It may be a bit packed on weekends, but the vibe is good and it's the best French food you will find in the area", the positions corresponding to "packed", "vibe" and "French" in the binary sequence labeling the start positions take the value 1 and all others 0; the positions corresponding to "packed", "vibe" and "food" in the binary sequence labeling the end positions take the value 1 and all others 0, whereby the aspect words "packed", "vibe" and "French food" can be extracted.
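The pairing of labeled start and end positions into aspect-word spans can be sketched as follows (a hypothetical helper that matches each start with the nearest end at or after it; the text does not specify the matching rule, so that choice is an assumption):

```python
def extract_spans(start_seq, end_seq):
    """Pair each start position with the nearest end position >= it."""
    ends = [j for j, e in enumerate(end_seq) if e == 1]
    spans = []
    for i, s in enumerate(start_seq):
        if s != 1:
            continue
        candidates = [j for j in ends if j >= i]
        if candidates:
            spans.append((i, candidates[0]))
    return spans

tokens = ["it", "may", "be", "a", "bit", "packed", "on", "weekends", ",",
          "but", "the", "vibe", "is", "good", "and", "it", "'s", "the",
          "best", "French", "food", "you", "will", "find"]
start = [0] * len(tokens)
end = [0] * len(tokens)
for w in ("packed", "vibe", "French"):   # value 1 at start positions
    start[tokens.index(w)] = 1
for w in ("packed", "vibe", "food"):     # value 1 at end positions
    end[tokens.index(w)] = 1

spans = extract_spans(start, end)
aspects = [" ".join(tokens[i:j + 1]) for i, j in spans]
# aspects -> ["packed", "vibe", "French food"]
```

Single-token aspect words appear as spans whose start and end coincide, while multi-token aspect words such as "French food" span from the start marker to the matched end marker.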
This way of realizing the aspect word extraction task uses two binary classification sequences to judge the confidence that each sequence position of the semantic feature vector output by the L-th Transformer layer of the BERT model is a start or end position of an aspect word, so that the aspect words in the text to be analyzed can be extracted accurately.
For the information polarity classification of aspect words (Sentiment Polarity Classifier) task:
for a fine-grained information processing task, the embodiment converts the information processing problem of the aspect words into a polarity classification problem, and the polarity of the aspect words is used as a classification label. In order to obtain the vector representation of the aspect words, the output result h is coded and output from the BERT model according to the boundaries of the aspect words (the starting and ending positions of the aspect words are the boundaries) obtained by the aspect word extraction taskLSemantic feature vector or coded representation h of mid-extraction aspect wordsaspAnd a semantic feature vector or a coded representation h integrating the overall semantic meaning and the meaning of the aspect wordpIn this embodiment, the BERT model is encoded and output as a special mark symbol [ CLS ]]Pooled outputs of corresponding final hidden states (i.e., by L-th layer transform of BERT model)Mark symbol [ CLS ] in semantic feature vector output by mer network]Corresponding semantic feature vector) hclsAnd according to aspect word boundaries from hLCoded representation h of extracted aspect wordsaspSplicing to obtain a comprehensive semantic representation hp:
hp=concatenate([hcls,hasp])
Wherein(s)i',ej) Representing the boundaries of the facet words.
Then, the predicted probability of the information polarity classification of the aspect word is expressed as:

$$\hat{y}^{p} = \mathrm{softmax}(W_{p}h_{p} + b_{p})$$

wherein $W_{p}$ is a parameter matrix involved in training, $W_{p} \in R^{k \times H}$, k is the number of classification labels of the information polarity, H is the number of hidden-layer units, and all parameters of $W_{p}$ are jointly fine-tuned to achieve the minimization solution; $b_{p}$ is the third bias term.

The cross entropy loss function is adopted as the objective function (loss function) of the polarity classification model:

$$loss_{p} = -\sum_{c=1}^{k} y_{c}^{p}\log\hat{y}_{c}^{p}$$

wherein $y^{p}$ is the known correct classification label and $\hat{y}^{p}$ is the predicted probability of the information polarity classification of the aspect word. By jointly fine-tuning all parameters of the parameter matrix $W_{p}$ to maximize the log-probability of the correct label, $loss_{p}$ is minimized.
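A minimal numerical sketch of this classification head follows (NumPy, random toy values; pooling the span into $h_{asp}$ as a mean over the boundary tokens is an assumption where the text leaves it unspecified, and the concatenation makes the effective classifier input width 2H):

```python
import numpy as np

rng = np.random.default_rng(1)
n, H, k = 6, 8, 3                     # tokens, hidden size, polarity labels

hL = rng.normal(size=(n, H))          # shared encoder output h^L (toy values)
h_cls = hL[0]                         # [CLS] vector: overall semantics
s, e = 2, 4                           # boundary (s, e) of one aspect word
h_asp = hL[s:e + 1].mean(axis=0)      # span representation (mean pool, assumed)

h_p = np.concatenate([h_cls, h_asp])  # aspect-aware composite vector, length 2H

W_p = rng.normal(size=(k, 2 * H))     # parameter matrix over the widened input
b_p = np.zeros(k)                     # third bias term
logits = W_p @ h_p + b_p
y_hat = np.exp(logits - logits.max())
y_hat /= y_hat.sum()                  # softmax over the k polarity labels

y_true = 2                            # index of the known correct label
loss_p = -np.log(y_hat[y_true])       # cross entropy for this aspect word
```

Jointly fine-tuning `W_p` (together with the encoder) raises `y_hat[y_true]` toward 1, which drives `loss_p` toward its minimum.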
In this way of realizing the information polarity classification task, the aspect word representation is obtained, according to the extracted aspect words and their boundaries, from the semantic feature vector output by the L-th Transformer layer of the BERT model acquired from the shared layer, and is concatenated with the overall semantic representation for information polarity classification, so that the information polarity of the aspect words in the text to be analyzed can be classified accurately.
In summary, the text information processing device provided in this embodiment constructs an end-to-end fine-grained information processing model with the overall framework shown in FIG. 1: the text to be analyzed is encoded with the BERT model to obtain the text representation vector $h^{L}$; all aspect words in the text and their boundaries $(s_{i}, e_{j})$ are extracted with two binary classifications, completing the multi-aspect word extraction task; through the aspect word boundaries and $h^{L}$, the coded (semantic) representation $h_{asp}$ of each aspect word is obtained; the overall semantic representation $h_{cls}$ is concatenated with the aspect word representation $h_{asp}$ to obtain the aspect-aware composite semantic vector $h_{p}$; and the information polarity of the corresponding aspect word in the text to be analyzed is predicted by a multi-class classification. Here $h^{L}$ serves as the shared coding layer, and the overall objective function of the two tasks is expressed as: $loss = loss_{asp} + loss_{p}$.
Finally, at the input end of the end-to-end fine-grained information processing model constructed by the text information processing device of this embodiment, only one text needs to be input, and all the aspect words in the text together with their corresponding information polarities are obtained, thereby realizing end-to-end fine-grained information processing.
The performance of the end-to-end fine-grained information processing model constructed by the text information processing device of this embodiment is further explained below with comparative experimental data.
As shown in Table 2, the LAPTOP dataset released in SemEval 2014 and the RESTAURANT datasets released in SemEval 2014/2015/2016 (the three years combined, denoted RESTAURANT (total)) are used as the experimental objects. These data give the start and end positions of each aspect word and one of three information polarities (positive: +, negative: -, neutral: 0) for the aspect word.
TABLE 2
Dataset | #Sent | #Targets | #+ | #- | #0 |
LAPTOP | 1869 | 2936 | 1326 | 990 | 620 |
RESTAURANT (total) | 3900 | 6603 | 4134 | 1538 | 931 |
To verify the validity of the scheme, the experimental results of the end-to-end fine-grained information processing model constructed by the text information processing device of this embodiment on the LAPTOP and RESTAURANT datasets are compared with those of several current end-to-end models with better performance, as shown in Table 3. The experiments use Google's open-source pre-trained BERT-Large model; the initial learning rate is 5e-5, reduced to 2e-5 from the second epoch; 50 epochs are trained, and the optimal model is saved during training:
TABLE 3
Experimental results show that, on both data sets, the end-to-end fine-grained information processing model constructed by the text information processing apparatus provided in this embodiment significantly outperforms several current end-to-end models such as UNIFIED, TAG-join and SPAN-join: the F1 value is improved by more than 4 percentage points on the LAPTOP data set and by more than 7 percentage points on the RESTAURANT data set, demonstrating the reliability and effectiveness of the end-to-end fine-grained information processing model constructed by the apparatus.
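The learning-rate schedule reported for the experiments above (5e-5 initially, reduced to 2e-5 from the second epoch, 50 epochs in total) can be sketched as a simple step function; the function and variable names are illustrative, not from the patent.

```python
def learning_rate(epoch: int, lr_init: float = 5e-5, lr_late: float = 2e-5) -> float:
    """Step schedule: 5e-5 for the first epoch, 2e-5 from the second epoch on."""
    return lr_init if epoch == 1 else lr_late

# 50 training epochs, as in the reported experiments (epochs numbered from 1).
schedule = [learning_rate(e) for e in range(1, 51)]
```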
As shown in fig. 2, another embodiment of the present invention provides a text information processing method, including:
S1, extracting semantic feature vectors of the text to be analyzed by using a BERT model as an encoder;
S2, according to the semantic feature vector of the text to be analyzed, performing minimum solution on an objective function including a first loss function and a second loss function to obtain at least one aspect word included in the text to be analyzed and the information polarity of the aspect word, wherein the objective of the first loss function is to label the start and end positions of the aspect word in the text to be analyzed, and the objective of the second loss function is to classify the information polarity of the aspect word.
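Step S2 requires turning the two per-position binary classifications into aspect-word spans. The patent does not spell out the decoding rule, so the heuristic below (match each above-threshold start position with the nearest following above-threshold end position) is purely an illustrative assumption.

```python
import numpy as np

def decode_spans(p_start, p_end, threshold=0.5):
    """Pair per-position start/end probabilities into (start, end) spans.

    The pairing rule (nearest following end) is an assumption for
    illustration; the source only states that two binary classifiers
    label the start and end positions of aspect words.
    """
    starts = [i for i, p in enumerate(p_start) if p >= threshold]
    ends = [j for j, p in enumerate(p_end) if p >= threshold]
    spans = []
    for s in starts:
        following = [j for j in ends if j >= s]
        if following:
            spans.append((s, following[0]))
    return spans

# Invented probabilities for a 5-word text with two aspect words.
p_start = np.array([0.9, 0.1, 0.2, 0.8, 0.1])
p_end = np.array([0.1, 0.7, 0.1, 0.1, 0.9])
spans = decode_spans(p_start, p_end)  # [(0, 1), (3, 4)]
```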
In some optional implementations of this embodiment, the first loss function and the second loss function are cross entropy loss functions, respectively.
In some optional implementations of this embodiment, the first loss function is:

loss_asp = -(1/n) Σ_{i=1..n} [ y_i^start·log(p_i^start) + (1 - y_i^start)·log(1 - p_i^start) + y_i^end·log(p_i^end) + (1 - y_i^end)·log(1 - p_i^end) ]

wherein p_i^start is the probability distribution of the starting position of the aspect word at each position i in the text to be analyzed, p_i^end is the corresponding probability distribution of the termination position, y_i^start and y_i^end are the binary sequences labeling the start and termination positions, and n is the word length of the text to be analyzed.
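A numeric sketch of the binary cross-entropy form of the first loss function, computed over toy start/end sequences; the gold labels and predicted probabilities are invented for illustration.

```python
import numpy as np

def binary_cross_entropy(y, p, eps=1e-12):
    """Mean binary cross-entropy over the n positions of the text."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Gold binary sequences marking start / end positions (n = 5 words, invented).
y_start = np.array([1.0, 0.0, 0.0, 1.0, 0.0])
y_end = np.array([0.0, 1.0, 0.0, 0.0, 1.0])
# Predicted probabilities p^start, p^end at each position (invented).
p_start = np.array([0.9, 0.1, 0.2, 0.8, 0.1])
p_end = np.array([0.1, 0.7, 0.1, 0.1, 0.9])

# loss_asp combines the start- and end-position cross-entropies.
loss_asp = binary_cross_entropy(y_start, p_start) + binary_cross_entropy(y_end, p_end)
```

Perfect predictions drive the loss toward zero, which is what the minimization solution over the objective function seeks.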
In some alternative implementations of the present embodiment,
the probability distribution p^start of the starting position of the aspect word at each position of the text to be analyzed is expressed as:

p^start = σ(W_start·h_L + b_start)

wherein W_start is a first trainable weight vector, b_start is the first bias term, σ is the sigmoid activation function, p^start is the process value of the binary sequence labeling the starting position, and h_L is the semantic feature vector output by the L-th layer Transformer network of the BERT model, the BERT model comprising an L-layer Transformer network;
the probability distribution p^end of the termination position of the aspect word at each position of the text to be analyzed is expressed as:

p^end = σ(W_end·h_L + b_end)

wherein W_end is a second trainable weight vector, b_end is the second bias term, and p^end is the process value of the binary sequence labeling the termination position.
In some optional implementations of this embodiment, the second loss function is:

loss_p = -Σ_{j=1..k} y_j·log(ŷ_j)

wherein k is the number of classification labels of the information polarity; y is the known correct classification label; ŷ is the predicted result probability of the information polarity classification of the aspect word, expressed as ŷ = softmax(W_p·h_p + b_p), where h_p is the comprehensive semantic feature vector obtained by splicing the semantic feature vector h_cls corresponding to the markup symbol [CLS] in the semantic feature vector output by the L-th layer Transformer network of the BERT model with the semantic feature vector h_asp of the aspect word in that same output; W_p is a parameter matrix involved in training, W_p ∈ R^(k×H), H being the number of hidden layer units, and all parameters of the parameter matrix W_p are jointly refined to achieve the minimization solution; b_p is the third bias term.
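The second loss function can likewise be sketched numerically: h_p is the splice of h_cls and h_asp, projected by the parameter matrix W_p, softmax-normalized, and scored by cross-entropy against the one-hot correct label. All values, and the use of 2H as the dimension of h_p, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

H, k = 8, 3                        # hidden units per vector, polarity classes
h_cls = rng.normal(size=H)         # sentence-level semantic vector (toy values)
h_asp = rng.normal(size=H)         # aspect-word semantic vector (toy values)
h_p = np.concatenate([h_cls, h_asp])    # spliced aspect-aware vector, dim 2H

W_p = rng.normal(size=(k, 2 * H))  # trainable parameter matrix
b_p = np.zeros(k)                  # third bias term

y_hat = softmax(W_p @ h_p + b_p)   # predicted polarity probability distribution
y = np.array([1.0, 0.0, 0.0])      # one-hot correct label (e.g. positive)
loss_p = float(-np.sum(y * np.log(y_hat + 1e-12)))   # cross-entropy loss_p
```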
It should be noted that the principle and workflow of the text information processing method provided in this embodiment are similar to those of the text information processing apparatus described above; for relevant parts, reference may be made to the foregoing description, which is not repeated here.
As shown in fig. 3, a computer system suitable for implementing the text information processing apparatus provided by the above-described embodiments includes a central processing unit (CPU) that can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) or a program loaded from a storage section into a random access memory (RAM). The RAM also stores various programs and data necessary for the operation of the computer system. The CPU, ROM, and RAM are connected to one another via a bus, and an input/output (I/O) interface is also connected to the bus.
The following components are connected to the I/O interface: an input section including a keyboard, a mouse, and the like; an output section including a Liquid Crystal Display (LCD), a speaker, and the like; a storage section including a hard disk and the like; and a communication section including a network interface card such as a LAN card or a modem. The communication section performs communication processing via a network such as the Internet. A drive is also connected to the I/O interface as needed. A removable medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive as necessary, so that a computer program read therefrom is installed into the storage section as needed.
In particular, the processes described in the above flowcharts may be implemented as computer software programs according to the present embodiment. For example, the present embodiments include a computer program product comprising a computer program tangibly embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section, and/or installed from a removable medium.
The flowchart and schematic diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to the present embodiments. In this regard, each block in the flowchart or schematic diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the schematic and/or flowchart illustration, and combinations of blocks in the schematic and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
On the other hand, the present embodiment also provides a nonvolatile computer storage medium, which may be the nonvolatile computer storage medium included in the apparatus in the foregoing embodiment, or may be a nonvolatile computer storage medium that exists separately and is not assembled into a terminal. The non-volatile computer storage medium stores one or more programs that, when executed by a device, cause the device to: adopting a BERT model as an encoder to extract semantic feature vectors of a text to be analyzed; and according to the semantic feature vector of the text to be analyzed, performing minimum solution on an objective function comprising a first loss function and a second loss function to obtain at least one aspect word contained in the text to be analyzed and the information polarity of the aspect word, wherein the objective of the first loss function is to label the starting and ending positions of the aspect word in the text to be analyzed, and the objective of the second loss function is to classify the information polarity of the aspect word.
It is to be noted that, in the description of the present invention, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
It should be understood that the above-mentioned embodiments of the present invention are only examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention, and it will be obvious to those skilled in the art that other variations and modifications can be made on the basis of the above description, and all embodiments cannot be exhaustive, and all obvious variations and modifications belonging to the technical scheme of the present invention are within the protection scope of the present invention.
Claims (12)
1. A text information processing apparatus characterized by comprising:
the coder based on the BERT model is used for extracting semantic feature vectors of the text to be analyzed;
and the information processing module is used for carrying out minimum solution on an objective function containing a first loss function and a second loss function according to the semantic feature vector of the text to be analyzed to obtain at least one aspect word contained in the text to be analyzed and the information polarity of the aspect word, wherein the objective of the first loss function is to label the starting and ending position of the aspect word in the text to be analyzed, and the objective of the second loss function is to classify the information polarity of the aspect word.
2. The apparatus of claim 1, wherein the first and second loss functions are cross-entropy loss functions, respectively.
4. The apparatus of claim 3,
the probability distribution p^start of the starting position of the aspect word at each position of the text to be analyzed is expressed as:

p^start = σ(W_start·h_L + b_start)

wherein W_start is a first trainable weight vector, b_start is the first bias term, σ is the sigmoid activation function, p^start is the process value of the two-class sequence labeling the starting position, and h_L is the semantic feature vector output by an L-th layer Transformer network of the BERT model, the BERT model comprising an L-layer Transformer network;
the probability distribution p^end of the termination position of the aspect word at each position of the text to be analyzed is expressed as:

p^end = σ(W_end·h_L + b_end).
5. The apparatus of claim 2, wherein the second loss function is:

loss_p = -Σ_{j=1..k} y_j·log(ŷ_j)

wherein k is the number of classification labels of the information polarity; y is the known correct classification label; ŷ is the predicted result probability of the information polarity classification of the aspect word, expressed as ŷ = softmax(W_p·h_p + b_p), where h_p is the comprehensive semantic feature vector obtained by splicing the semantic feature vector h_cls corresponding to the markup symbol [CLS] in the semantic feature vector output by the L-th layer Transformer network of the BERT model with the semantic feature vector h_asp of the aspect word in that same output; W_p is a parameter matrix involved in training, W_p ∈ R^(k×H), H being the number of hidden layer units, and all parameters of the parameter matrix W_p are jointly refined to achieve the minimization solution; b_p is the third bias term.
6. A text information processing method, comprising:
adopting a BERT model as an encoder to extract semantic feature vectors of a text to be analyzed;
and according to the semantic feature vector of the text to be analyzed, performing minimum solution on an objective function comprising a first loss function and a second loss function to obtain at least one aspect word contained in the text to be analyzed and the information polarity of the aspect word, wherein the objective of the first loss function is to label the starting and ending positions of the aspect word in the text to be analyzed, and the objective of the second loss function is to classify the information polarity of the aspect word.
7. The method of claim 6, wherein the first and second loss functions are cross-entropy loss functions, respectively.
9. The method of claim 8,
the probability distribution p^start of the starting position of the aspect word at each position of the text to be analyzed is expressed as:

p^start = σ(W_start·h_L + b_start)

wherein W_start is a first trainable weight vector, b_start is the first bias term, σ is the sigmoid activation function, p^start is the process value of the two-class sequence labeling the starting position, and h_L is the semantic feature vector output by an L-th layer Transformer network of the BERT model, the BERT model comprising an L-layer Transformer network;
the probability distribution p^end of the termination position of the aspect word at each position of the text to be analyzed is expressed as:

p^end = σ(W_end·h_L + b_end).
10. The method of claim 7, wherein the second loss function is:

loss_p = -Σ_{j=1..k} y_j·log(ŷ_j)

wherein k is the number of classification labels of the information polarity; y is the known correct classification label; ŷ is the predicted result probability of the information polarity classification of the aspect word, expressed as ŷ = softmax(W_p·h_p + b_p), where h_p is the comprehensive semantic feature vector obtained by splicing the semantic feature vector h_cls corresponding to the markup symbol [CLS] in the semantic feature vector output by the L-th layer Transformer network of the BERT model with the semantic feature vector h_asp of the aspect word in that same output; W_p is a parameter matrix involved in training, W_p ∈ R^(k×H), H being the number of hidden layer units, and all parameters of the parameter matrix W_p are jointly refined to achieve the minimization solution; b_p is the third bias term.
11. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 6-10 when executing the program.
12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 6-10.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010285983 | 2020-04-13 | ||
CN2020102859833 | 2020-04-13 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113536803A true CN113536803A (en) | 2021-10-22 |
CN113536803B CN113536803B (en) | 2024-08-13 |
Family
ID=78124125
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010639599.9A Active CN113536803B (en) | 2020-04-13 | 2020-07-06 | Text information processing device and method, computer device, and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113536803B (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107832400A (en) * | 2017-11-01 | 2018-03-23 | 山东大学 | A kind of method that location-based LSTM and CNN conjunctive models carry out relation classification |
US20190122145A1 (en) * | 2017-10-23 | 2019-04-25 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method, apparatus and device for extracting information |
CN109871444A (en) * | 2019-01-16 | 2019-06-11 | 北京邮电大学 | A kind of file classification method and system |
CN109918663A (en) * | 2019-03-04 | 2019-06-21 | 腾讯科技(深圳)有限公司 | A kind of semantic matching method, device and storage medium |
CN110046248A (en) * | 2019-03-08 | 2019-07-23 | 阿里巴巴集团控股有限公司 | Model training method, file classification method and device for text analyzing |
CN110222178A (en) * | 2019-05-24 | 2019-09-10 | 新华三大数据技术有限公司 | Text sentiment classification method, device, electronic equipment and readable storage medium storing program for executing |
CN110334210A (en) * | 2019-05-30 | 2019-10-15 | 哈尔滨理工大学 | A kind of Chinese sentiment analysis method merged based on BERT with LSTM, CNN |
CN110362817A (en) * | 2019-06-04 | 2019-10-22 | 中国科学院信息工程研究所 | A kind of viewpoint proneness analysis method and system towards product attribute |
CN110377740A (en) * | 2019-07-22 | 2019-10-25 | 腾讯科技(深圳)有限公司 | Feeling polarities analysis method, device, electronic equipment and storage medium |
CN110516245A (en) * | 2019-08-27 | 2019-11-29 | 蓝盾信息安全技术股份有限公司 | Fine granularity sentiment analysis method, apparatus, computer equipment and storage medium |
CN110866117A (en) * | 2019-10-25 | 2020-03-06 | 西安交通大学 | Short text classification method based on semantic enhancement and multi-level label embedding |
CN110955750A (en) * | 2019-11-11 | 2020-04-03 | 北京三快在线科技有限公司 | Combined identification method and device for comment area and emotion polarity, and electronic equipment |
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190122145A1 (en) * | 2017-10-23 | 2019-04-25 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method, apparatus and device for extracting information |
CN107832400A (en) * | 2017-11-01 | 2018-03-23 | 山东大学 | A kind of method that location-based LSTM and CNN conjunctive models carry out relation classification |
CN109871444A (en) * | 2019-01-16 | 2019-06-11 | 北京邮电大学 | A kind of file classification method and system |
CN109918663A (en) * | 2019-03-04 | 2019-06-21 | 腾讯科技(深圳)有限公司 | A kind of semantic matching method, device and storage medium |
CN110046248A (en) * | 2019-03-08 | 2019-07-23 | 阿里巴巴集团控股有限公司 | Model training method, file classification method and device for text analyzing |
CN110222178A (en) * | 2019-05-24 | 2019-09-10 | 新华三大数据技术有限公司 | Text sentiment classification method, device, electronic equipment and readable storage medium storing program for executing |
CN110334210A (en) * | 2019-05-30 | 2019-10-15 | 哈尔滨理工大学 | A kind of Chinese sentiment analysis method merged based on BERT with LSTM, CNN |
CN110362817A (en) * | 2019-06-04 | 2019-10-22 | 中国科学院信息工程研究所 | A kind of viewpoint proneness analysis method and system towards product attribute |
CN110377740A (en) * | 2019-07-22 | 2019-10-25 | 腾讯科技(深圳)有限公司 | Feeling polarities analysis method, device, electronic equipment and storage medium |
CN110516245A (en) * | 2019-08-27 | 2019-11-29 | 蓝盾信息安全技术股份有限公司 | Fine granularity sentiment analysis method, apparatus, computer equipment and storage medium |
CN110866117A (en) * | 2019-10-25 | 2020-03-06 | 西安交通大学 | Short text classification method based on semantic enhancement and multi-level label embedding |
CN110955750A (en) * | 2019-11-11 | 2020-04-03 | 北京三快在线科技有限公司 | Combined identification method and device for comment area and emotion polarity, and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN113536803B (en) | 2024-08-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112685565B (en) | Text classification method based on multi-mode information fusion and related equipment thereof | |
CN107680579B (en) | Text regularization model training method and device, and text regularization method and device | |
CN111985239B (en) | Entity identification method, entity identification device, electronic equipment and storage medium | |
CN110489555A (en) | A kind of language model pre-training method of combination class word information | |
CN113051356B (en) | Open relation extraction method and device, electronic equipment and storage medium | |
CN110472235A (en) | A kind of end-to-end entity relationship joint abstracting method towards Chinese text | |
CN112257452B (en) | Training method, training device, training equipment and training storage medium for emotion recognition model | |
CN111339260A (en) | BERT and QA thought-based fine-grained emotion analysis method | |
CN109086265A (en) | A kind of semanteme training method, multi-semantic meaning word disambiguation method in short text | |
CN113723105A (en) | Training method, device and equipment of semantic feature extraction model and storage medium | |
CN112188311B (en) | Method and apparatus for determining video material of news | |
CN110874411A (en) | Cross-domain emotion classification system based on attention mechanism fusion | |
CN113553412A (en) | Question and answer processing method and device, electronic equipment and storage medium | |
CN113392209A (en) | Text clustering method based on artificial intelligence, related equipment and storage medium | |
CN110633475A (en) | Natural language understanding method, device and system based on computer scene and storage medium | |
CN114416995A (en) | Information recommendation method, device and equipment | |
CN112749556B (en) | Multi-language model training method and device, storage medium and electronic equipment | |
CN114529903A (en) | Text refinement network | |
CN113761190A (en) | Text recognition method and device, computer readable medium and electronic equipment | |
CN111145914B (en) | Method and device for determining text entity of lung cancer clinical disease seed bank | |
WO2023134085A1 (en) | Question answer prediction method and prediction apparatus, electronic device, and storage medium | |
CN117351336A (en) | Image auditing method and related equipment | |
CN115129862A (en) | Statement entity processing method and device, computer equipment and storage medium | |
CN114417874A (en) | Chinese named entity recognition method and system based on graph attention network | |
CN114092931A (en) | Scene character recognition method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |