CN114281993B - A text empathy prediction system and method - Google Patents

A text empathy prediction system and method

Info

Publication number
CN114281993B
Authority
CN
China
Prior art keywords
empathy
polarity
private
feature
public
Prior art date
Legal status
Active
Application number
CN202111592897.8A
Other languages
Chinese (zh)
Other versions
CN114281993A (en)
Inventor
王上飞
李晨光
陈小平
Current Assignee
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date
Filing date
Publication date
Application filed by University of Science and Technology of China USTC

Priority to CN202111592897.8A
Publication of CN114281993A
Application granted
Publication of CN114281993B
Status: Active

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a text empathy prediction system and method. The system comprises: an empathy private feature encoder, which encodes the private features of empathy data into empathy private features; a polarity private feature encoder, which encodes the private features of polarity data into polarity private features; a public feature encoder, which encodes the public features of the empathy data and of the polarity data into empathy public features and polarity public features respectively; an empathy public-private feature fusion module, which fuses, by weighting, the empathy public and private features into a final empathy prediction feature representation; a polarity public-private feature fusion module, which fuses, by weighting, the polarity public and private features into a final polarity classification feature representation; an empathy predictor, which predicts an empathy label from the final empathy prediction feature representation; and a polarity classifier, which predicts a polarity label from the final polarity classification feature representation. Based on transfer learning, the system and method can realize small-scale text empathy prediction with the help of large-scale text sentiment classification data and improve prediction accuracy.

Description

A text empathy prediction system and method

Technical Field

The present invention relates to the field of natural language processing, and in particular to a text empathy prediction system and method.

Background Art

Empathy, as an important component of emotion, reflects the emotions people experience when facing others' hardships or witnessing others' situations. Empathy reflects people's emotional responses and feedback to what others experience, and empathy analysis is closely related to fields such as human-computer interaction and sentiment analysis. Identifying the empathy factors contained in text is therefore highly necessary and has strong research value.

However, the sample sizes of text-based empathy datasets are too small. The currently recognized open-source text empathy datasets each contain only a little more than one thousand samples, and existing mainstream empathy prediction methods all perform empathy prediction on the empathy datasets alone. Obviously, when the amount of data is insufficient, the generalization ability of the trained model is relatively poor and its prediction accuracy is not high. In sharp contrast, some other sentiment analysis tasks have very ample training data. Taking sentiment polarity classification as an example, this task has a large number of open-source datasets containing tens of thousands or even hundreds of thousands of training samples. Therefore, the poor accuracy of empathy prediction in text processing caused by the small size of open-source text empathy datasets is a problem that needs to be solved.

In view of this, the present invention is proposed.

Summary of the Invention

The purpose of the present invention is to provide a text empathy prediction system and method that can use text sentiment classification data with a large number of samples to assist text empathy prediction and improve the accuracy of empathy prediction in text processing, thereby solving the above technical problems in the prior art.

The purpose of the present invention is achieved through the following technical solutions:

An embodiment of the present invention provides a text empathy prediction system, comprising:

an empathy private feature encoder, a polarity private feature encoder, a public feature encoder, an empathy public-private feature fusion module, a polarity public-private feature fusion module, an empathy predictor and a polarity classifier; wherein,

the input of the empathy private feature encoder is the empathy data in an empathy dataset, and it encodes the private features of the input empathy data to obtain empathy private features;

the input of the polarity private feature encoder is the polarity data in a polarity dataset, and it encodes the private features of the input polarity data to obtain polarity private features;

the public feature encoder receives the empathy data in the empathy dataset and the polarity data in the polarity dataset respectively, encodes the public features of the empathy data to obtain empathy public features, and encodes the public features of the polarity data to obtain polarity public features;

the empathy public-private feature fusion module is connected to the outputs of the empathy private feature encoder and the public feature encoder respectively, and fuses, by weighting, the empathy private features output by the empathy private feature encoder and the empathy public features output by the public feature encoder into a final empathy prediction feature representation;

the polarity public-private feature fusion module is connected to the outputs of the polarity private feature encoder and the public feature encoder respectively, and fuses, by weighting, the polarity private features output by the polarity private feature encoder and the polarity public features output by the public feature encoder into a final polarity classification feature representation;

the empathy predictor is connected to the output of the empathy public-private feature fusion module, and predicts the corresponding empathy label from the final empathy prediction feature representation;

the polarity classifier is connected to the output of the polarity public-private feature fusion module, and predicts the corresponding polarity label from the final polarity classification feature representation.

An embodiment of the present invention also provides a text empathy prediction method, which uses the text empathy prediction system of the present invention as a text empathy prediction model and comprises:

Step 1: encoding the private features of the empathy data in the input empathy dataset to obtain empathy private features, and encoding the private features of the polarity data in the input polarity dataset to obtain polarity private features;

encoding the public features of the empathy data in the input empathy dataset to obtain empathy public features, and encoding the public features of the polarity data in the input polarity dataset to obtain polarity public features;

Step 2: fusing, by weighting, the empathy private features and empathy public features obtained in Step 1 into a final empathy prediction feature representation; and fusing, by weighting, the polarity private features and polarity public features obtained in Step 1 into a final polarity classification feature representation;

Step 3: performing empathy prediction on the final empathy prediction feature representation obtained in Step 2 to obtain the corresponding empathy label as a prediction result; and performing polarity classification on the final polarity classification feature representation to obtain the corresponding polarity label as a prediction result.

Compared with the prior art, the text empathy prediction system and method provided by the present invention have the following beneficial effects:

A sentiment polarity dataset containing a large number of samples is used to assist empathy prediction on a text empathy dataset with a small number of samples, and transfer learning is performed between two tasks whose prediction labels and dataset domains both differ, so that the interference caused by the domain difference and by the label difference can be eliminated at the same time. The method of the present invention introduces transfer learning into text empathy prediction for the first time and currently outperforms other methods in the field of text empathy prediction; because a large-scale sentiment polarity dataset is used to assist the empathy dataset, a better empathy prediction effect is achieved.

Brief Description of the Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings described below illustrate only some embodiments of the present invention; persons of ordinary skill in the art can obtain other drawings from these drawings without creative effort.

FIG. 1 is a schematic diagram of the structure of a text empathy prediction system provided by an embodiment of the present invention;

FIG. 2 is a schematic diagram of the structure of another text empathy prediction system provided by an embodiment of the present invention;

FIG. 3 is a schematic diagram of the structure of yet another text empathy prediction system provided by an embodiment of the present invention;

FIG. 4 is a schematic diagram of the structure of a text empathy prediction system with adversarial classification and anti-interference provided by an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below in combination with the specific content of the present invention; obviously, the described embodiments are only some of the embodiments of the present invention, not all of them, and they do not constitute a limitation of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

First, the terms that may be used in the present invention are explained as follows:

The term "and/or" means that either one or both can be realized; for example, "X and/or Y" covers three cases: "X", "Y", and "X and Y".

The terms "include", "comprise", "contain", "have" or other descriptions with similar meaning should be interpreted as non-exclusive inclusion. For example, including a certain technical feature element (such as a raw material, component, ingredient, carrier, dosage form, material, dimension, part, mechanism, device, step, procedure, method, reaction condition, processing condition, parameter, algorithm, signal, data, product or article, etc.) should be interpreted as including not only the technical feature element explicitly listed, but also other technical feature elements known in the art that are not explicitly listed.

The text empathy prediction system and method provided by the present invention are described in detail below. Content not described in detail in the embodiments of the present invention belongs to the prior art known to those skilled in the art. Where no specific conditions are specified in the embodiments of the present invention, the conventional conditions in the art or the conditions recommended by the manufacturer are followed. The reagents or instruments used in the embodiments for which no manufacturer is specified are all conventional products that are commercially available.

As shown in FIG. 1, an embodiment of the present invention provides a text empathy prediction system, comprising:

an empathy private feature encoder, a polarity private feature encoder, a public feature encoder, an empathy public-private feature fusion module, a polarity public-private feature fusion module, an empathy predictor and a polarity classifier; wherein,

the input of the empathy private feature encoder is the empathy data in an empathy dataset, and it encodes the private features of the input empathy data to obtain empathy private features;

the input of the polarity private feature encoder is the polarity data in a polarity dataset, and it encodes the private features of the input polarity data to obtain polarity private features;

the public feature encoder receives the empathy data in the empathy dataset and the polarity data in the polarity dataset respectively, encodes the public features of the empathy data to obtain empathy public features, and encodes the public features of the polarity data to obtain polarity public features;

the empathy public-private feature fusion module is connected to the outputs of the empathy private feature encoder and the public feature encoder respectively, and fuses, by weighting, the empathy private features output by the empathy private feature encoder and the empathy public features output by the public feature encoder into a final empathy prediction feature representation;

the polarity public-private feature fusion module is connected to the outputs of the polarity private feature encoder and the public feature encoder respectively, and fuses, by weighting, the polarity private features output by the polarity private feature encoder and the polarity public features output by the public feature encoder into a final polarity classification feature representation;

the empathy predictor is connected to the output of the empathy public-private feature fusion module, and predicts the corresponding empathy label from the final empathy prediction feature representation;

the polarity classifier is connected to the output of the polarity public-private feature fusion module, and predicts the corresponding polarity label from the final polarity classification feature representation.

As shown in FIG. 2, in the above text empathy prediction system,

the empathy private feature encoder also receives the polarity data in the polarity dataset and encodes the private features of the input polarity data to obtain empathy-polarity private features;

the polarity private feature encoder also receives the empathy data in the empathy dataset and encodes the private features of the input empathy data to obtain polarity-empathy private features;

the system further comprises a domain binary classifier, whose inputs are connected to the outputs of the empathy private feature encoder, the polarity private feature encoder and the public feature encoder respectively, and which performs binary classification on the feature encodings output by the public feature encoder in an adversarial classification loss manner.

As shown in FIG. 3, in the above text empathy prediction system,

the empathy private feature encoder also receives the polarity data in the polarity dataset and encodes the private features of the input polarity data to obtain empathy-polarity private features;

the polarity private feature encoder also receives the empathy data in the empathy dataset and encodes the private features of the input empathy data to obtain polarity-empathy private features;

the system further comprises an empathy hinge loss module and a polarity hinge loss module; wherein,

the empathy hinge loss module is connected to the output of the empathy predictor, and takes effect when the difference between the empathy loss L_em corresponding to the correct concatenation result and the empathy loss L_em' corresponding to the incorrect concatenation result is less than a preset difference; the correct concatenation result refers to the concatenation of the empathy private features and the empathy public features, and the incorrect concatenation result refers to the concatenation of the polarity private features and the empathy private features;

the polarity hinge loss module is connected to the output of the polarity classifier, and takes effect when the difference between the polarity loss L_po corresponding to the correct concatenation result and the polarity loss L_po' corresponding to the incorrect concatenation result is less than a preset difference; the correct concatenation result refers to the concatenation of the polarity private features and the polarity public features, and the incorrect concatenation result refers to the concatenation of the empathy private features and the polarity private features.

As shown in FIG. 4, in the above text empathy prediction system,

the empathy private feature encoder also receives the polarity data in the polarity dataset and encodes the private features of the input polarity data to obtain empathy-polarity private features;

the polarity private feature encoder also receives the empathy data in the empathy dataset and encodes the private features of the input empathy data to obtain polarity-empathy private features;

the system further comprises a domain binary classifier, whose inputs are connected to the outputs of the empathy private feature encoder, the polarity private feature encoder and the public feature encoder respectively, and which performs binary classification on the feature encodings output by the public feature encoder in an adversarial classification loss manner;

and/or an empathy hinge loss module and a polarity hinge loss module; wherein,

the empathy hinge loss module is connected to the output of the empathy predictor, and takes effect when the difference between the empathy loss L_em corresponding to the correct empathy concatenation result and the empathy loss L_em' corresponding to the incorrect empathy concatenation result is less than a preset difference; the correct empathy concatenation result refers to the concatenation of the empathy private features and the empathy public features, and the incorrect empathy concatenation result refers to the concatenation of the polarity private features and the empathy private features; this empathy hinge loss module enlarges the difference between the empathy loss L_em corresponding to the correct empathy concatenation result and the empathy loss L_em' corresponding to the incorrect empathy concatenation result;

the polarity hinge loss module is connected to the output of the polarity classifier, and takes effect when the difference between the polarity loss L_po corresponding to the correct polarity concatenation result and the polarity loss L_po' corresponding to the incorrect polarity concatenation result is less than a preset difference; the correct polarity concatenation result refers to the concatenation of the polarity private features and the polarity public features, and the incorrect polarity concatenation result refers to the concatenation of the empathy private features and the polarity private features. This polarity hinge loss module enlarges the difference between the polarity loss L_po corresponding to the correct polarity concatenation result and the polarity loss L_po' corresponding to the incorrect polarity concatenation result.

In the above text empathy prediction system, the domain binary classifier uses a fully connected network.

In the above text empathy prediction system, the empathy private feature encoder uses a Bi-LSTM type or BERT type feature encoder;

the polarity private feature encoder uses a Bi-LSTM type or BERT type feature encoder;

the public feature encoder uses a Bi-LSTM type or BERT type feature encoder;

the empathy public-private feature fusion module and the polarity public-private feature fusion module both use an attention network module;

the empathy predictor uses a fully connected network;

the polarity classifier uses a fully connected network.
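
Purely as an illustration, the following PyTorch sketch shows one way the seven components listed above could be wired together, assuming Bi-LSTM encoders, attention fusion and fully connected prediction heads; all class names, layer sizes and pooling choices here are assumptions and are not specified by the patent.

```python
import torch
import torch.nn as nn

class BiLSTMEncoder(nn.Module):
    """Sentence encoder: word embeddings -> Bi-LSTM -> mean-pooled vector."""
    def __init__(self, vocab_size, emb_dim=300, hidden=200):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)

    def forward(self, token_ids):                      # (B, T)
        h, _ = self.lstm(self.emb(token_ids))          # (B, T, 2*hidden)
        return h.mean(dim=1)                           # (B, 2*hidden)

class AttentionFusion(nn.Module):
    """Weights one sample's private and public feature vectors with attention."""
    def __init__(self, d):
        super().__init__()
        self.q_k = nn.Parameter(torch.randn(1, d))     # randomly initialised query vector q_k

    def forward(self, private, public):                # each (B, d)
        u = torch.stack([private, public], dim=1)      # stacked features, (B, 2, d)
        scores = torch.matmul(self.q_k, u.transpose(1, 2)) / u.size(-1) ** 0.5
        attn = torch.softmax(scores, dim=-1)           # weights over {private, public}
        return torch.matmul(attn, u).squeeze(1)        # fused feature, (B, d)

class EmpathyPredictionSystem(nn.Module):
    def __init__(self, vocab_size, d=400, num_empathy_out=1):
        super().__init__()
        self.emp_private = BiLSTMEncoder(vocab_size)   # empathy private feature encoder
        self.pol_private = BiLSTMEncoder(vocab_size)   # polarity private feature encoder
        self.public = BiLSTMEncoder(vocab_size)        # public feature encoder
        self.emp_fusion = AttentionFusion(d)           # empathy public-private fusion module
        self.pol_fusion = AttentionFusion(d)           # polarity public-private fusion module
        self.emp_predictor = nn.Linear(d, num_empathy_out)   # empathy predictor
        self.pol_classifier = nn.Linear(d, 2)                # polarity classifier

    def forward(self, s_em, s_po):
        f_em = self.emp_fusion(self.emp_private(s_em), self.public(s_em))
        f_po = self.pol_fusion(self.pol_private(s_po), self.public(s_po))
        return self.emp_predictor(f_em), self.pol_classifier(f_po)
```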

The systems shown in FIGS. 2, 3 and 4 above reduce the interference caused by dataset domain differences by designing an adversarial classification loss, and reduce the interference caused by different labels by providing the empathy and polarity hinge loss modules.

An embodiment of the present invention also provides a text empathy prediction method, which uses the above text empathy prediction system as a text empathy prediction model and comprises:

Step 1: encoding the private features of the empathy data in the input empathy dataset to obtain empathy private features, and encoding the private features of the polarity data in the input polarity dataset to obtain polarity private features;

encoding the public features of the empathy data in the input empathy dataset to obtain empathy public features, and encoding the public features of the polarity data in the input polarity dataset to obtain polarity public features;

Step 2: fusing, by weighting, the empathy private features and empathy public features obtained in Step 1 into a final empathy prediction feature representation; and fusing, by weighting, the polarity private features and polarity public features obtained in Step 1 into a final polarity classification feature representation;

Step 3: performing empathy prediction on the final empathy prediction feature representation obtained in Step 2 to obtain the corresponding empathy label as a prediction result; and performing polarity classification on the final polarity classification feature representation to obtain the corresponding polarity label as a prediction result.

In Step 2 of the above method, the final empathy prediction feature representation f_em and the final polarity classification feature representation f_po obtained by weighted fusion are given by formulas (1) and (2), respectively.

In formulas (1) and (2), q is q_k·U(s_em), where q_k is a hyperparameter vector of dimension R^(1×d) that is randomly initialized and used to obtain the initial value Q; T denotes the transpose, i.e., U(s_em)^T is the transpose of U(s_em), which facilitates the matrix product; d is the dimension of the feature vector; U(s_em) is the concatenation of the empathy private feature e_em(s_em) and the empathy public feature e_c(s_em), with U(s_em) ∈ R^(2×d), and s_em denotes empathy data in the empathy dataset; V(s_po) is the concatenation of the polarity private feature e_po(s_po) and the polarity public feature e_c(s_po), with V(s_po) ∈ R^(2×d), and s_po denotes polarity data in the polarity dataset.

In Step 3, the obtained final empathy prediction feature representation f_em is input into the empathy predictor g_em(), and the empathy label is predicted according to formula (3);

the obtained final polarity classification feature representation f_po is input into the polarity classifier g_po(), and the predicted polarity label is obtained according to formula (4).
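
The images of formulas (1) to (4) are not reproduced in this text. Based on the parameter description above (the query vector q_k, the transpose U(s_em)^T, the dimension d, and the concatenations U(s_em) and V(s_po)), a plausible scaled dot-product attention reading of the fusion and prediction steps, written in LaTeX notation, is:

\[ f_{em} = \operatorname{softmax}\!\left( \frac{q_k\, U(s_{em})^{\top}}{\sqrt{d}} \right) U(s_{em}) \qquad (1,\ \text{assumed}) \]
\[ f_{po} = \operatorname{softmax}\!\left( \frac{q_k\, V(s_{po})^{\top}}{\sqrt{d}} \right) V(s_{po}) \qquad (2,\ \text{assumed}) \]
\[ \hat{em} = g_{em}(f_{em}) \qquad (3,\ \text{assumed}), \qquad \hat{po} = g_{po}(f_{po}) \qquad (4,\ \text{assumed}) \]

The exact forms in the patent drawings may differ; this reconstruction only follows the quantities named in the surrounding text.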

In Step 3 of the above method, if the empathy task is a regression task, the training loss function of the empathy predictor is formula (5).

In formula (5), θ_eem, θ_ec and θ_gem are the parameters of the empathy private feature encoder e_em(), the public feature encoder e_c() and the empathy predictor g_em(), respectively; the formula also involves the predicted empathy label and the true empathy label em*.

The parameters θ_eem, θ_ec and θ_gem are set according to the specific model framework of the empathy private feature encoder e_em(), the public feature encoder e_c() and the empathy predictor g_em(); for example, a Bi-LSTM encoder corresponds to one set of θ_eem, θ_ec, θ_gem parameters, while other network models correspond to other sets of θ_eem, θ_ec, θ_gem parameters.

If the empathy task is a classification task, the training loss function of the empathy predictor is formula (6).

In formula (6), θ_eem, θ_ec and θ_gem are the parameters of the empathy private feature encoder e_em(), the public feature encoder e_c() and the empathy predictor g_em(), respectively; the formula also involves the predicted empathy label and the true empathy label em*; N is the number of empathy label classes when the empathy task is a classification task.

In Step 3 of the above method, the training loss function of the polarity classifier is formula (7).

In formula (7), θ_epo, θ_ec and θ_gpo are the parameters of the polarity private feature encoder e_po(), the public feature encoder e_c() and the polarity classifier g_po(), respectively; the formula also involves the predicted polarity label and the true polarity label po*.

The parameters θ_epo, θ_ec and θ_gpo of the polarity private feature encoder e_po(), the public feature encoder e_c() and the polarity classifier g_po() are determined in a manner similar to the θ_eem, θ_ec, θ_gem parameters above, and the description is not repeated here.
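
Formulas (5) to (7) are likewise not reproduced here. Standard loss forms consistent with the surrounding description, namely a mean squared error over a batch B for the regression case and cross-entropy losses for the classification cases, would be (these exact forms, including the hatted prediction symbols and the batch B, are assumptions):

\[ L_{em}(\theta_{eem},\theta_{ec},\theta_{gem}) = \frac{1}{|B|}\sum_{n\in B}\bigl(\hat{em}_{n} - em^{*}_{n}\bigr)^{2} \qquad (5,\ \text{assumed}) \]
\[ L_{em}(\theta_{eem},\theta_{ec},\theta_{gem}) = -\sum_{n\in B}\sum_{c=1}^{N} em^{*}_{n,c}\,\log \hat{em}_{n,c} \qquad (6,\ \text{assumed}) \]
\[ L_{po}(\theta_{epo},\theta_{ec},\theta_{gpo}) = -\sum_{m\in B}\Bigl[ po^{*}_{m}\log \hat{po}_{m} + (1-po^{*}_{m})\log\bigl(1-\hat{po}_{m}\bigr)\Bigr] \qquad (7,\ \text{assumed}) \]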

The above method further comprises using a domain binary classifier g_l() to discriminate, in an adversarial-loss manner, the source domains of the empathy private features produced by the empathy private feature encoder e_em() and of the polarity private features produced by the polarity private feature encoder e_po(); the training loss function of this domain binary classifier g_l() is L_cop, given by formula (8);

and using the domain binary classifier g_l() to discriminate, in an adversarial-loss manner, the source domains of the empathy public features and polarity public features produced by the public feature encoder e_c(); the corresponding training loss function is L_adv, given by formula (9).
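
Formulas (8) and (9) are also not reproduced. One plausible formulation, assuming a domain label d_i in {0, 1} (0 for empathy data, 1 for polarity data) and a binary cross-entropy objective, is that L_cop is an ordinary domain classification loss on the private features, while L_adv is a min-max (adversarial) objective on the public features:

\[ L_{cop} = -\sum_{i}\Bigl[(1-d_{i})\log\bigl(1 - g_{l}(e_{priv}(s_{i}))\bigr) + d_{i}\log g_{l}(e_{priv}(s_{i}))\Bigr] \qquad (8,\ \text{assumed}) \]

where e_priv is e_em for empathy samples and e_po for polarity samples, and

\[ L_{adv} = \min_{\theta_{ec}}\;\max_{\theta_{gl}}\; \sum_{i}\Bigl[(1-d_{i})\log\bigl(1 - g_{l}(e_{c}(s_{i}))\bigr) + d_{i}\log g_{l}(e_{c}(s_{i}))\Bigr] \qquad (9,\ \text{assumed}) \]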

and/or performing empathy prediction, via the empathy hinge loss module, when the difference between the empathy loss L_em corresponding to the correct empathy concatenation result and the empathy loss L_em' corresponding to the incorrect empathy concatenation result is less than a preset difference, where the correct empathy concatenation result refers to the concatenation of the empathy private features and the empathy public features, and the incorrect empathy concatenation result refers to the concatenation of the polarity private features and the empathy private features; and performing polarity prediction, via the polarity hinge loss module, when the difference between the polarity loss L_po corresponding to the correct polarity concatenation result and the polarity loss L_po' corresponding to the incorrect polarity concatenation result is less than a preset difference, where the correct polarity concatenation result refers to the concatenation of the polarity private features and the polarity public features, and the incorrect polarity concatenation result refers to the concatenation of the empathy private features and the polarity private features; wherein the training loss function of the empathy hinge loss module is formula (10).

In formula (10), the parameters have the following meanings: L_em is the empathy loss value corresponding to the correct empathy concatenation result; L_em' is the empathy loss value corresponding to the incorrect empathy concatenation result; δ1 is the preset difference required between the two different empathy concatenation results.

The training loss function of the polarity hinge loss module is formula (11).

In formula (11), the parameters have the following meanings: L_po is the polarity loss value corresponding to the correct polarity concatenation result; L_po' is the polarity loss value corresponding to the incorrect polarity concatenation result; δ2 is the preset difference required between the two different polarity concatenation results.
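
The images of formulas (10) and (11) are not reproduced, but the general hinge form loss_final = max(0, δ - loss) given in the detailed embodiment below suggests the following reconstruction, where the quantity inside the max is the gap between the incorrect-concatenation loss and the correct-concatenation loss (this reading is an assumption):

\[ L_{hin\_1} = \max\bigl(0,\; \delta_{1} - (L_{em}' - L_{em})\bigr) \qquad (10,\ \text{assumed}) \]
\[ L_{hin\_2} = \max\bigl(0,\; \delta_{2} - (L_{po}' - L_{po})\bigr) \qquad (11,\ \text{assumed}) \]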

The final hinge loss function of the text empathy prediction model is formula (12):

L_hin = L_hin_1 + L_hin_2 (12);

The final loss function L_o of the text empathy prediction model is formula (13):

L_o = λ1·L_em + λ2·L_po + λ3·L_cop + λ4·L_adv + λ5·L_hin (13);

In formula (13), λ1, λ2, λ3, λ4 and λ5 are the weights of the loss functions L_em, L_po, L_cop, L_adv and L_hin; in the specific experiments, the weight values λ1, λ2, λ3, λ4, λ5 are 1, 0.5, 2, 2 and 1.5, respectively.

The final loss function L_o above is obtained as a weighted combination of L_em, L_po, L_cop, L_adv and L_hin, and is the overall loss function that integrates the individual modules.

The above method reduces the interference caused by dataset domain differences through the adversarial classification loss, and reduces the interference caused by different labels through the hinge losses. The adversarial classification loss for reducing dataset domain differences corresponds to loss functions (8) and (9); the hinge losses for reducing the interference caused by different labels correspond to loss functions (10), (11) and (12).

Since polarity and empathy are both sub-attributes of emotion, and the label values of both depend largely on the emotional words appearing in the text, there is a strong correlation between text polarity data and text empathy data. Datasets in the text polarity domain often contain very large amounts of data, reaching tens of thousands or even hundreds of thousands of entries; compared with text empathy datasets containing only one or two thousand entries, they have the advantage of a large data volume. The text empathy prediction model and method of the present invention, assisted by text sentiment classification, use the polarity data to assist the empathy data through transfer learning and thereby achieve better results.

In summary, the text empathy prediction system and method of the embodiments of the present invention, which are assisted by text sentiment classification, use text sentiment classification to assist text empathy analysis; they consider not only the difference in dataset domains between the two tasks but also the difference in label spaces between the two tasks, and can therefore obtain more accurate text empathy prediction results.

In order to demonstrate the technical solution and the technical effects provided by the present invention more clearly, the text empathy prediction model and method based on text sentiment classification assistance provided by the embodiments of the present invention are described in detail below with a specific embodiment.

Embodiment

As shown in FIG. 1, an embodiment of the present invention provides a text empathy prediction system, which is a model that, based on transfer learning, uses large-scale text sentiment classification data to assist small-scale text empathy prediction. Specifically, because the empathy dataset contains very little empathy data, it cannot support training the empathy predictor into a neural network with strong generalization ability; transferable public features are therefore learned from the polarity classification task, which has a large amount of polarity data, to assist empathy prediction. The method by which this text empathy prediction model performs empathy prediction is as follows.

The datasets involved and the related definitions are as follows:

The empathy dataset is denoted D_em, D_em = {(s_em^1, em_1), ..., (s_em^n, em_n), ..., (s_em^N, em_N)}, where s_em^n (1 <= n <= N) and em_n respectively denote the input text and the empathy label of the n-th empathy sample;

The polarity dataset is denoted D_po, D_po = {(s_po^1, po_1), ..., (s_po^m, po_m), ..., (s_po^M, po_M)}, where s_po^m (1 <= m <= M) and po_m ∈ {0,1} respectively denote the input text and the polarity label of the m-th polarity sample;

The total number M of polarity samples in the polarity dataset is much larger than the total number N of empathy samples in the empathy dataset.

The main network framework of the text empathy prediction model of the present invention is shown in FIG. 1. The encoder part mainly includes three feature encoders, namely e_em(), e_po() and e_c(), where e_em() and e_po() are the private feature encoders of the empathy prediction task and the polarity classification task, used to encode the private features of the two tasks, and e_c() is the public feature encoder, used to encode the public features shared between the two tasks. Assuming that, in an input sample pair, the empathy data is denoted s_em and the polarity data s_po, the final encoding results are e_em(s_em), e_po(s_em), e_c(s_em), e_em(s_po), e_po(s_po) and e_c(s_po).

In the subsequent network architecture, since the public and private features may contribute differently to empathy prediction, the present invention first weights the public and private features through two public-private feature fusion modules with an attention architecture; in this way the public and private features are fused into the final feature representations by dynamic weighting, and the corresponding labels are predicted. At the same time, to address the interference caused by the differences in dataset domains and in label evaluation between the empathy prediction and polarity classification tasks, the present invention further reduces the interference caused by dataset domain differences through an adversarial classification loss, and reduces the interference caused by different labels by providing hinge loss modules (i.e., Hinge-loss modules, consisting of the empathy hinge loss module and the polarity hinge loss module). As a result, the transferable features learned by the feature encoders are applicable to different domains and labels, further disentangling the public and private features obtained for the two tasks.

The loss function of such a hinge loss module generally has the form loss_final = max(0, δ - loss), which means that when the original loss is small, loss_final is non-zero, and when the loss is large, loss_final is 0. In the present invention, this corresponds to the following: when the gap between the correct concatenation result and the incorrect combination is small, feature separation is not yet thorough, so loss_final is non-zero and the hinge loss module takes effect; if the gap between the correct concatenation result and the incorrect concatenation result is relatively large, feature separation is already good, loss_final is 0, and the hinge loss module has no effect.
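
A minimal PyTorch sketch of this hinge form, assuming the quantity inside the max is the gap between the incorrect-concatenation loss and the correct-concatenation loss (the function name is hypothetical):

```python
import torch

def hinge_gap_loss(loss_correct: torch.Tensor,
                   loss_wrong: torch.Tensor,
                   delta: float) -> torch.Tensor:
    # Non-zero only while the wrong concatenation is not yet worse than the
    # correct one by at least delta, i.e. while feature separation is still
    # incomplete; zero once the gap is large enough.
    return torch.clamp(delta - (loss_wrong - loss_correct), min=0.0)
```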

Specifically, each module is described as follows:

(I) For the input empathy data s_em and polarity data s_po, the empathy private features obtained by passing s_em through the empathy private feature encoder and the empathy public features obtained from the public feature encoder are e_em(s_em) and e_c(s_em), respectively; the polarity private features obtained by passing s_po through the polarity private feature encoder and the polarity public features obtained from the public feature encoder are e_po(s_po) and e_c(s_po), respectively. Since the domain private features and public features may benefit the final label prediction differently, the present invention dynamically weights the public and private features through the two public-private feature fusion modules with an attention architecture, thereby obtaining the final feature representations f_em and f_po after public-private feature fusion; the specific procedure is shown in formulas (1) and (2).

In formulas (1) and (2), d is the dimension of the feature vector; U(s_em) ∈ R^(2×d) is the concatenation of e_em(s_em) and e_c(s_em); and V(s_po) ∈ R^(2×d) is the concatenation of e_po(s_po) and e_c(s_po).

After the final feature representations of the empathy prediction task and the polarity classification task are obtained, they are fed into the empathy predictor and the polarity classifier respectively to obtain the corresponding predicted labels, i.e., the empathy label and the polarity label; f_em is input into the empathy predictor g_em(), and f_po is input into the polarity classifier g_po(); the specific processing is given by formulas (3) and (4).

In formulas (3) and (4), em is the empathy label, po is the polarity label, em* is the true empathy label, and po* is the true polarity label.
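
Purely as an illustration of formulas (1) to (4) under the reconstruction assumed earlier, the fusion and prediction steps for one empathy batch could look as follows in PyTorch; all tensor names, shapes and values are hypothetical:

```python
import torch
import torch.nn as nn

d = 400                                        # feature dimension
e_em_sem = torch.randn(16, d)                  # e_em(s_em): empathy private features
e_c_sem = torch.randn(16, d)                   # e_c(s_em): empathy public features
q_k = torch.randn(1, d)                        # randomly initialised query vector q_k
g_em = nn.Linear(d, 1)                         # empathy predictor (fully connected)

U = torch.stack([e_em_sem, e_c_sem], dim=1)    # U(s_em) in R^{2 x d}, per sample
scores = torch.matmul(q_k, U.transpose(1, 2)) / d ** 0.5        # attention scores, (B, 1, 2)
f_em = torch.matmul(torch.softmax(scores, dim=-1), U).squeeze(1)  # formula (1): fused feature
em_hat = g_em(f_em)                            # formula (3): predicted empathy label
```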

If the empathy task is a regression task, its training loss function is shown in formula (5); if it is a classification task, its training loss function is shown in formula (6).

The training loss function for polarity classification is shown in formula (7).

In formulas (5), (6) and (7), θ_eem, θ_epo, θ_ec, θ_gem and θ_gpo are the parameters of e_em(), e_po(), e_c(), g_em() and g_po(), respectively; N is the number of empathy label classes when the empathy task is a classification task.

(II) The present invention separates the learned features into domain-public features and domain-unique features by means of adversarial learning, so that the public features encoded by the public feature encoder are helpful and universally applicable for any dataset domain. Specifically, the present invention uses a domain binary classifier g_l() to discriminate the source domain of the features encoded by e_em(), e_po() and e_c(). For the features encoded by the two private feature encoders e_em() and e_po(), g_l() should be able to distinguish whether the features come from the empathy domain or the polarity domain, because these private features are unique to the two domains. The corresponding training process is shown in formula (8), and the corresponding loss function is L_cop.

For the features encoded by the public feature encoder e_c(), regardless of whether their source domain is the empathy domain or the polarity domain, g_l() should not be able to distinguish their source, i.e., the sources should be confused, because the public features should be features shared by both tasks. The specific training loss function is shown in formula (9), and the corresponding loss function is L_adv.
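
A sketch of how the two domain discrimination losses could be computed, assuming a fully connected domain binary classifier and, for the adversarial part on the public features, the common gradient reversal trick; the patent text does not specify the adversarial mechanism, so this is only one possible implementation with hypothetical names and shapes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -grad                              # reverse gradients flowing into the encoder

d = 400
g_l = nn.Linear(d, 2)                             # domain binary classifier (fully connected)

# hypothetical encoder outputs for one empathy/polarity sample pair
priv_em, priv_po = torch.randn(16, d), torch.randn(16, d)   # e_em(s_em), e_po(s_po)
pub_em, pub_po = torch.randn(16, d), torch.randn(16, d)     # e_c(s_em),  e_c(s_po)
dom_em = torch.zeros(16, dtype=torch.long)        # domain label 0 = empathy
dom_po = torch.ones(16, dtype=torch.long)         # domain label 1 = polarity

# (8) private features: g_l should tell the two domains apart
l_cop = F.cross_entropy(g_l(priv_em), dom_em) + F.cross_entropy(g_l(priv_po), dom_po)

# (9) public features: the encoder is trained to confuse g_l via gradient reversal
l_adv = (F.cross_entropy(g_l(GradReverse.apply(pub_em)), dom_em)
         + F.cross_entropy(g_l(GradReverse.apply(pub_po)), dom_po))
```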

(III) The present invention proposes a label training strategy to eliminate the label difference between the empathy task and the polarity task. Specifically, taking empathy prediction as an example, some transferable features are useful for empathy prediction while others are not. The present invention extracts the useful features with the public feature encoder, while the ineffective features are extracted by the private feature encoder of the other task. Based on this, the present invention proposes to use hinge loss modules (i.e., Hinge-loss modules, divided into an empathy hinge loss module and a polarity hinge loss module) to enlarge as much as possible the difference between the results corresponding to the public feature encoder plus the empathy private feature encoder and the results corresponding to the polarity private feature encoder plus the empathy private feature encoder.

Specifically, in formulas (1) and (2), f_em is the final fused feature and the corresponding loss is L_em, which is the result corresponding to the correct concatenation. In addition, the present invention defines f_em' = e_em(s_em) + e_po(s_em), and the prediction based on it is em' = g_em(f_em'), which yields a new empathy loss L_em'; L_em' is the result corresponding to the incorrect concatenation. The empathy hinge loss module enlarges the difference between L_em and L_em' as much as possible, so that the features helpful for empathy prediction are concentrated more in the public feature encoder, while the useless features are concentrated more in the private feature encoder of the other task. The polarity hinge loss module of the polarity classification task is implemented in the same way as for empathy prediction; the training objective and loss function of the final empathy hinge loss module are therefore shown in formula (10).

In formula (10), the parameters have the following meanings: L_em is the empathy loss value corresponding to the correct empathy concatenation result; L_em' is the empathy loss value corresponding to the incorrect empathy concatenation result; δ1 is the preset difference required between the two different empathy concatenation results.

The training loss function of the polarity hinge loss module is formula (11).

In formula (11), the parameters have the following meanings: L_po is the polarity loss value corresponding to the correct polarity concatenation result; L_po' is the polarity loss value corresponding to the incorrect polarity concatenation result; δ2 is the preset difference required between the two different polarity concatenation results.
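
Continuing the same illustrative assumptions, the correct and incorrect concatenation paths of the empathy branch and the resulting hinge term (10) could be sketched as follows; plain concatenation is used for both paths for simplicity, the margin value and all tensor names are hypothetical, and the regression loss of formula (5) is assumed:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d, delta_1 = 400, 0.1                             # delta_1 is a hypothetical margin value
e_em_sem = torch.randn(16, d)                     # e_em(s_em): empathy private features
e_c_sem = torch.randn(16, d)                      # e_c(s_em): empathy public features
e_po_sem = torch.randn(16, d)                     # e_po(s_em): other task's private encoder
em_true = torch.randn(16, 1)                      # true empathy labels (regression case)
g_em = nn.Linear(2 * d, 1)                        # empathy predictor over concatenated features

f_em = torch.cat([e_em_sem, e_c_sem], dim=-1)           # correct concatenation
f_em_wrong = torch.cat([e_em_sem, e_po_sem], dim=-1)    # incorrect concatenation f_em'

l_em = F.mse_loss(g_em(f_em), em_true)                  # L_em
l_em_wrong = F.mse_loss(g_em(f_em_wrong), em_true)      # L_em'
l_hin_1 = torch.clamp(delta_1 - (l_em_wrong - l_em), min=0.0)   # hinge term (10)
```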

The final hinge loss function of the text empathy prediction model is formula (12):

L_hin = L_hin_1 + L_hin_2 (12);

The final loss function L_o of the text empathy prediction model is a weighted combination of L_em, L_po, L_cop, L_adv and L_hin, as shown in formula (13):

L_o = λ1·L_em + λ2·L_po + λ3·L_cop + λ4·L_adv + λ5·L_hin (13);

where λ1, λ2, λ3, λ4 and λ5 correspond to the weights of the individual loss functions and are used to control the balance among the multiple loss terms; in the specific experiments, the weights λ1, λ2, λ3, λ4, λ5 take the values 1, 0.5, 2, 2 and 1.5, respectively.
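
Combining the individual losses according to formula (13) with the reported weights is then straightforward; a minimal sketch with placeholder loss values:

```python
import torch

# placeholder scalar losses standing in for L_em, L_po, L_cop, L_adv, L_hin
l_em, l_po, l_cop, l_adv, l_hin = (torch.tensor(v) for v in (0.8, 0.6, 0.7, 0.5, 0.2))

lambdas = (1.0, 0.5, 2.0, 2.0, 1.5)               # weights reported in the experiments
l_total = sum(w * l for w, l in zip(lambdas, (l_em, l_po, l_cop, l_adv, l_hin)))
```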

对本发明的模型及方法在两个数据集上进行了实验,并取得了初步的实验结果,实验结果证明本发明所提方法的有效性。The model and method of the present invention were experimented on two data sets, and preliminary experimental results were obtained. The experimental results proved the effectiveness of the method proposed in the present invention.

共情数据集部分,由Buchel所提出的共情数据集共包括1860条标注数据,主要来源于标注人员对于各类新闻的读后感。每一条数据样例都包括两个共情标签,分别是EC和PD,这两个标签值的取值范围是1-7。由Zhou所提出的共情数据集包括1000条标注数据,这些数据来源于Reddit论坛,主要内容为用户在该论坛上的发帖及对应帖子的回复。每条数据样例都被打上了一个共情标签,标签值的范围为1-5。In terms of empathy data sets, the empathy data set proposed by Buchel includes 1,860 annotated data, which mainly comes from the annotation personnel's reading experience of various news. Each data sample includes two empathy labels, EC and PD, and the value range of these two labels is 1-7. The empathy data set proposed by Zhou includes 1,000 annotated data, which comes from the Reddit forum. The main content is the user's posting on the forum and the reply to the corresponding post. Each data sample is marked with an empathy label, and the label value range is 1-5.

两个极性分类数据集中前者主要包括用户在twitter上所发布的各类推文,而后者则主要是用户对各种类别的电影的观后感及评论。前者包含7061条正向样本与3240条负向样本;后者则包括25000条正向样本与25000条负向样本。The former mainly includes various tweets posted by users on Twitter, while the latter mainly includes users' impressions and comments on various types of movies. The former contains 7061 positive samples and 3240 negative samples; the latter includes 25,000 positive samples and 25,000 negative samples.

为了保证评测标准的统一,对于Buchel所提出的共情数据集而言,依据该数据集中原定的评测标准,同样采取皮尔逊相关系数(PCC)作为评测标注,由于该数据集有EC与PD这两个标签,因此评测标准为PCC-EC与PCC-PD;而对于Zhou所提出的共情数据集,同样依据该数据集中原定的评测标准,采取MSE loss与R2来作为评测标准。由于所涉及工作为迁移学习工作,因此只需考虑在共情预测数据集上的实验结果,极性分类数据集上的实验结果在下文分析中不进行讨论。In order to ensure the uniformity of the evaluation criteria, for the empathy dataset proposed by Buchel, the Pearson correlation coefficient (PCC) is also used as the evaluation annotation according to the original evaluation criteria in the dataset. Since the dataset has two labels, EC and PD, the evaluation criteria are PCC-EC and PCC-PD; and for the empathy dataset proposed by Zhou, MSE loss and R2 are used as the evaluation criteria according to the original evaluation criteria in the dataset. Since the work involved is transfer learning, only the experimental results on the empathy prediction dataset need to be considered, and the experimental results on the polarity classification dataset will not be discussed in the following analysis.

Regarding the experimental parameters, two encoders were used: Bi-LSTM and BERT. When Bi-LSTM serves as the encoder, the hidden dimensions of both the forward and backward LSTMs are set to 200. For consistency, the bert-base-uncased model from BERT is used directly as the baseline encoding model. With Bi-LSTM as the encoder the learning rate is set to 0.001; with BERT as the encoder it is set to 0.00002; and the decay coefficient is uniformly set to 0.95. The dropout rate is set to 0.3, the training batch size is 16, and L2 regularization is used. During training, samples from the polarity dataset are paired with samples from the empathy dataset and fed into the network architecture in pairs. The framework is PyTorch and the Adam optimizer is used.
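A minimal sketch of an optimizer setup under these hyperparameters is shown below. It assumes the 0.95 decay coefficient is an exponential learning-rate decay and that L2 regularization is realised through Adam's weight_decay; the weight_decay value and the function name are illustrative assumptions.

```python
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import ExponentialLR

def build_optimizer(model: torch.nn.Module, encoder: str = "bilstm"):
    # Learning rates reported for the two encoder choices
    lr = 0.001 if encoder == "bilstm" else 0.00002  # BERT uses bert-base-uncased
    # Assumption: L2 regularization applied via Adam's weight_decay (value illustrative)
    optimizer = Adam(model.parameters(), lr=lr, weight_decay=1e-5)
    # Assumption: the 0.95 decay coefficient is applied as an exponential LR decay
    scheduler = ExponentialLR(optimizer, gamma=0.95)
    return optimizer, scheduler

# Usage sketch: batch size is 16; the dropout of 0.3 is set inside the model itself
# optimizer, scheduler = build_optimizer(model, encoder="bert")
```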

The comparison with related work falls into two categories. The first is work without transfer learning, i.e., methods that use only the empathy data for empathy prediction, such as FNN, CNN and RoBERTa. The second is other related work that applies transfer learning to empathy prediction, such as DATNet and ADV-SA, which is also compared. The experimental results are shown in the table.

Table 4. Experimental results compared with related work

Table 4 shows that the proposed model and method achieve the best results currently reported in this field on both Buchel's and Zhou's empathy datasets. Specifically, compared with approaches without transfer learning, such as CNN, FNN, RoBERTa, BERT and Random Forest, the proposed model performs markedly better. This is because the proposed transfer-learning model lets the small-scale empathy analysis dataset learn better common feature representations with the help of the large-scale polarity classification dataset, which improves empathy prediction. In addition, compared with transfer-learning approaches such as DATNet and ADV-SA, the proposed model also obtains better results. This is because the proposed model not only reduces the interference caused by the domain gap between the two tasks through adversarial learning, but also reduces the interference caused by the label gap between the two tasks through the designed hinge loss, so that the learned transferable common features remain effective across different domains and different labels.
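The adversarial component mentioned above can be realised in several ways; one common construction, given here only as a hedged illustration and not as the patent's exact adversarial losses, is a gradient reversal layer in front of the domain binary classifier.

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

def domain_adversarial_loss(domain_clf: torch.nn.Module,
                            common_feat: torch.Tensor,
                            domain_label: torch.Tensor) -> torch.Tensor:
    # The shared features pass through gradient reversal before the domain
    # binary classifier, pushing the public encoder toward
    # domain-indistinguishable representations; domain_label is a float 0/1 tensor.
    logits = domain_clf(GradReverse.apply(common_feat))
    return F.binary_cross_entropy_with_logits(logits, domain_label)
```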

The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that can readily be conceived by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the claims. The information disclosed in the Background section is intended only to deepen understanding of the general background of the present invention and should not be taken as an admission, or any form of suggestion, that this information constitutes prior art already known to a person skilled in the art.

Claims (9)

1. A text empathy prediction system, characterized by comprising: an empathy private feature encoder, a polarity private feature encoder, a public feature encoder, an empathy public-private feature fusion module, a polarity public-private feature fusion module, an empathy predictor and a polarity classifier; wherein,

the input of the empathy private feature encoder is the empathy data in an empathy dataset, and the encoder encodes the private features of the input empathy data to obtain empathy private features;

the input of the polarity private feature encoder is the polarity data in a polarity dataset, and the encoder encodes the private features of the input polarity data to obtain polarity private features;

the public feature encoder receives the empathy data in the empathy dataset and the polarity data in the polarity dataset respectively, encodes the public features of the empathy data to obtain empathy public features, and encodes the public features of the polarity data to obtain polarity public features;

the empathy public-private feature fusion module is connected to the outputs of the empathy private feature encoder and of the public feature encoder, and fuses, with weights, the empathy private features output by the empathy private feature encoder and the empathy public features output by the public feature encoder into a final empathy prediction feature expression;

the polarity public-private feature fusion module is connected to the outputs of the polarity private feature encoder and of the public feature encoder, and fuses, with weights, the polarity private features output by the polarity private feature encoder and the polarity public features output by the public feature encoder into a final polarity classification feature expression;

the empathy predictor is connected to the output of the empathy public-private feature fusion module and predicts the corresponding empathy label from the final empathy prediction feature expression;

the polarity classifier is connected to the output of the polarity public-private feature fusion module and predicts the corresponding polarity label from the final polarity classification feature expression.

2. The text empathy prediction system according to claim 1, characterized in that:

the empathy private feature encoder also takes as input the polarity data in the polarity dataset and encodes the private features of the input polarity data to obtain empathy-polarity private features;

the polarity private feature encoder also takes as input the empathy data in the empathy dataset and encodes the private features of the input empathy data to obtain polarity-empathy private features;

the system further comprises a domain binary classifier whose inputs are connected to the outputs of the empathy private feature encoder, the polarity private feature encoder and the public feature encoder, and which performs binary classification, with an adversarial classification loss, on the public feature encodings output by the public feature encoder;

and/or an empathy hinge loss module and a polarity hinge loss module; wherein,

the empathy hinge loss module is connected to the output of the empathy predictor and performs empathy prediction when the difference between the empathy prediction result Lem corresponding to the correct splicing result and the empathy prediction result Lem′ corresponding to the incorrect splicing result is less than a preset margin, the correct splicing result being the splice of the empathy private features with the empathy public features, and the incorrect splicing result being the splice of the polarity private features with the empathy private features;

the polarity hinge loss module is connected to the output of the polarity classifier and performs polarity prediction when the difference between the polarity prediction result Lpo corresponding to the correct splicing result and the polarity prediction result Lpo′ corresponding to the incorrect splicing result is less than a preset margin, the correct splicing result being the splice of the polarity private features with the polarity public features, and the incorrect splicing result being the splice of the empathy private features with the polarity private features.

3. The text empathy prediction system according to claim 2, characterized in that:

the polarity private feature encoder adopts a Bi-LSTM type or BERT type feature encoder;
the public feature encoder adopts a Bi-LSTM type or BERT type feature encoder;
the empathy public-private feature fusion module and the polarity public-private feature fusion module both adopt attention network modules;
the empathy predictor adopts a fully connected network;
the polarity classifier adopts a fully connected network;
the domain binary classifier adopts a fully connected network.

4. The text empathy prediction system according to claim 1 or 2, characterized in that:

the empathy private feature encoder adopts a Bi-LSTM type or BERT type feature encoder;
the polarity private feature encoder adopts a Bi-LSTM type or BERT type feature encoder;
the public feature encoder adopts a Bi-LSTM type or BERT type feature encoder;
the empathy public-private feature fusion module and the polarity public-private feature fusion module both adopt attention network modules;
the empathy predictor adopts a fully connected network;
the polarity classifier adopts a fully connected network.

5. A text empathy prediction method, characterized in that the text empathy prediction system according to any one of claims 1 to 4 is used as a text empathy prediction model, the method comprising:

Step 1: encoding the private features of the empathy data in the input empathy dataset to obtain empathy private features, and encoding the private features of the polarity data in the input polarity dataset to obtain polarity private features; encoding the public features of the empathy data in the input empathy dataset to obtain empathy public features, and encoding the public features of the polarity data in the input polarity dataset to obtain polarity public features;

Step 2: fusing, with weights, the empathy private features and the empathy public features obtained in Step 1 into a final empathy prediction feature expression; and fusing, with weights, the polarity private features and the polarity public features obtained in Step 1 into a final polarity prediction feature expression;

Step 3: performing empathy prediction on the final empathy prediction feature expression obtained in Step 2 to obtain the corresponding empathy label as a prediction result; and performing polarity classification on the final polarity classification feature expression to obtain the corresponding polarity label as a prediction result.

6. The text empathy prediction method according to claim 5, characterized in that, in Step 2, the final empathy prediction feature expression fem and the final polarity classification feature expression fpo obtained by weighted fusion are given by formulas (1) and (2), respectively;

in formulas (1) and (2), q is qk*U(sem), where qk is a hyperparameter vector of dimension R1*d whose initial value is obtained by random initialization; T denotes transposition, i.e. U(sem)T is the transpose of U(sem); d is the dimension of the feature vectors; U(sem) is the concatenation of the empathy private feature eem(sem) and the empathy public feature ec(sem), with U(sem)∈R2*d; sem is the empathy data in the empathy dataset; V(spo) is the concatenation of the polarity private feature epo(spo) and the polarity public feature ec(spo), with V(spo)∈R2*d; and spo is the polarity data in the polarity dataset;

in Step 3, the final empathy prediction feature expression fem is input into the empathy predictor gem() and the empathy label is predicted according to formula (3); the final polarity prediction feature expression fpo is input into the polarity classifier gpo() and the polarity label is predicted according to formula (4).

7. The text empathy prediction method according to claim 6, characterized in that, in Step 3, if the empathy task is a regression task, the training loss function of the empathy predictor is formula (5); in formula (5), θeem, θec and θgem are the parameters of the empathy private feature encoder eem(), the public feature encoder ec() and the empathy predictor gem(), respectively, and the predicted empathy label is compared with the true empathy label em*;

if the empathy task is a classification task, the training loss function of the empathy predictor is formula (6); in formula (6), θeem, θec and θgem are the parameters of the empathy private feature encoder eem(), the public feature encoder ec() and the empathy predictor gem(), respectively; the predicted empathy label is compared with the true empathy label em*; and N is the number of empathy label categories when the empathy task is a classification task.

8. The text empathy prediction method according to claim 6, characterized in that, in Step 3, the training loss function of the polarity classifier is formula (7); in formula (7), θepo, θec and θgpo are the parameters of the polarity private feature encoder epo(), the public feature encoder ec() and the polarity classifier gpo(), respectively, and the predicted polarity label is compared with the true polarity label po*.

9. The text empathy prediction method according to claim 6, characterized in that the method further comprises: using a domain binary classifier gl() to discriminate, in an adversarial-loss manner, the source domains of the empathy private features produced by the empathy private feature encoder eem() and of the polarity private features produced by the polarity private feature encoder epo(), the corresponding training loss function of the domain binary classifier gl() being Lcop; and using the domain binary classifier gl() to discriminate, in an adversarial-loss manner, the source domains of the empathy public features and the polarity public features produced by the public feature encoder ec(), the corresponding training loss function of the domain binary classifier gl() being Ladv;

and/or performing, through an empathy hinge loss module, empathy prediction when the difference between the prediction result Lem corresponding to the correct empathy splicing result and the prediction result Lem′ corresponding to the incorrect empathy splicing result in the empathy prediction output is less than a preset margin, the correct empathy splicing result being the splice of the empathy private features with the empathy public features, and the incorrect empathy splicing result being the splice of the polarity private features with the empathy private features; and performing, through a polarity hinge loss module, polarity prediction when the difference between the prediction result Lpo corresponding to the correct polarity splicing result and the prediction result Lpo′ corresponding to the incorrect polarity splicing result is less than a preset margin, the correct polarity splicing result being the splice of the polarity private features with the polarity public features, and the incorrect polarity splicing result being the splice of the empathy private features with the polarity private features; wherein,

the training loss function (10) of the empathy hinge loss module has the following parameters: Lem is the empathy loss value corresponding to the correct empathy splicing result; Lem′ is the empathy loss value corresponding to the incorrect empathy splicing result; δ1 is the required margin between the two different empathy splicing results;

the training loss function (11) of the polarity hinge loss module has the following parameters: Lpo is the polarity loss value corresponding to the correct polarity splicing result; Lpo′ is the polarity loss value corresponding to the incorrect polarity splicing result; δ2 is the required margin between the two different polarity splicing results;

the final hinge loss function (12) of the text empathy prediction model is:

Lhin = Lhin_1 + Lhin_2    (12);

the final loss function Lo of the text empathy prediction model is:

Lo = λ1*Lem + λ2*Lpo + λ3*Lcop + λ4*Ladv + λ5*Lhin    (13);

in formula (13), λ1, λ2, λ3, λ4 and λ5 are the weights of the loss functions Lem, Lpo, Lcop, Ladv and Lhin, and their values are 1, 0.5, 2, 2 and 1.5, respectively.
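Formulas (1) and (2) referenced in claim 6 are not reproduced in this text. The sketch below is a hedged PyTorch illustration of one plausible attention-based public-private fusion consistent with the parameter description (a learned query qk of dimension 1×d attending over U(s), the 2×d stack of a private and a public feature vector); the softmax normalization and the class name are assumptions, not the patent's exact formulation.

```python
import torch
import torch.nn.functional as F

class PublicPrivateFusion(torch.nn.Module):
    """Hedged sketch of an attention-based public-private feature fusion."""
    def __init__(self, d: int):
        super().__init__()
        # qk: hyperparameter/query vector of shape 1 x d, randomly initialized
        self.qk = torch.nn.Parameter(torch.randn(1, d))

    def forward(self, private_feat: torch.Tensor, public_feat: torch.Tensor) -> torch.Tensor:
        u = torch.stack([private_feat, public_feat], dim=0)   # U(s), shape 2 x d
        scores = self.qk @ u.transpose(0, 1)                  # qk * U(s)^T, shape 1 x 2
        weights = F.softmax(scores, dim=-1)                   # assumed normalization
        return (weights @ u).squeeze(0)                       # fused feature f, shape d
```

In this reading, the same module structure would serve both the empathy branch (producing fem from eem(sem) and ec(sem)) and the polarity branch (producing fpo from epo(spo) and ec(spo)).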
CN202111592897.8A 2021-12-23 2021-12-23 A text empathy prediction system and method Active CN114281993B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111592897.8A CN114281993B (en) 2021-12-23 2021-12-23 A text empathy prediction system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111592897.8A CN114281993B (en) 2021-12-23 2021-12-23 A text empathy prediction system and method

Publications (2)

Publication Number Publication Date
CN114281993A CN114281993A (en) 2022-04-05
CN114281993B true CN114281993B (en) 2024-03-29

Family

ID=80874968

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111592897.8A Active CN114281993B (en) 2021-12-23 2021-12-23 A text empathy prediction system and method

Country Status (1)

Country Link
CN (1) CN114281993B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114722207B (en) * 2022-06-07 2022-08-12 Guangdong Ocean University An information classification method and system for microblogs


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11763093B2 (en) * 2020-04-30 2023-09-19 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for a privacy preserving text representation learning framework

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111414476A (en) * 2020-03-06 2020-07-14 Harbin Institute of Technology Attribute-level emotion analysis method based on multi-task learning
US10878505B1 (en) * 2020-07-31 2020-12-29 Agblox, Inc. Curated sentiment analysis in multi-layer, machine learning-based forecasting model using customized, commodity-specific neural networks
CN113643046A (en) * 2021-08-17 2021-11-12 Ping An Life Insurance Company of China, Ltd. Empathy strategy recommendation method, device, equipment and medium suitable for virtual reality

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Cross-lingual sentiment classification based on shared space; Zhang Mengmeng; Information Technology and Informatization; 2020-05-28 (Issue 05); full text *
Cross-domain text sentiment classification based on a progressively optimized classification model; Zhang Jun; Wang Suge; Computer Science; 2016-07-15 (Issue 07); full text *

Also Published As

Publication number Publication date
CN114281993A (en) 2022-04-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant