Detailed Description
The embodiments of the present invention will be described below with reference to the accompanying drawings.
As previously mentioned, attempts are currently being made to build predictive models for text emotion analysis. According to one embodiment, text carrying emotional color is manually labeled to form labeled texts, and the labeled texts are then used as training samples to train a text emotion analysis model. Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in the present specification. As shown in FIG. 1, a training sample set is formed from manually annotated texts, and a computing platform then trains a text emotion analysis model on the training sample set. After the text emotion analysis model is obtained by training, text to be analyzed, such as a newly generated commodity evaluation, is uploaded to the computing platform, and the emotional color of the text is analyzed using the text emotion analysis model.
In the above supervised machine learning process, model training based on a large number of manually labeled texts is required. The labeling mode of the text determines what training samples are obtained, and further determines the training effect of the model. Therefore, the labeling mode of the training samples needs to be considered.
In one embodiment, several alternative emotion category labels can be provided directly to the annotator, so that the annotator marks the text with an emotion category label chosen from the alternatives, and the result serves as a training sample. The alternative labels are typically flat, requiring the annotator to clearly and unequivocally distinguish the emotional semantics of each label. However, in some cases, where the distinction between several alternative labels is small, it is difficult for annotators to label consistently. For example, in an emotion markup scenario, the provided alternative emotion labels may include angry, furious, annoyed, and the like. For the same sample to be labeled, annotator A may then choose the label angry while annotator B chooses the label furious. For another example, for the same sample, annotator A may first choose the label angry, yet on a second pass choose the label furious. Obviously, this labeling mode is very difficult for annotators, the consistency of the labeling data obtained in this way is poor, and consequently the accuracy and effectiveness of the emotion predictions made by the trained prediction model are low.
Based on this, the inventor proposes a new labeling mode that reduces the labeling difficulty for annotators while significantly improving the effectiveness and consistency of the labeling data, so that training samples with higher usability can be constructed for training a text emotion analysis model, improving the accuracy of the model's prediction results. In one example, instead of providing the alternative labels to the annotator, quantization indices predetermined based on the alternative labels may be provided, for example including quantization indices for the emotion tendency and emotion intensity of a certain emotion aspect; then, according to the data with which the annotator marks the text based on the quantization indices, an emotion category label of the text is determined, and a training sample is further constructed for training the prediction model. Specific implementations of this concept are described below.
FIG. 2 illustrates a flow diagram of a text emotion analysis method according to one embodiment, whose execution subject may be any device, apparatus, platform, or device cluster having computing or processing capabilities, such as the computing platform illustrated in FIG. 1. As shown in FIG. 2, the method comprises the following steps: step S210, obtaining labeling data obtained by labeling a first text, wherein the labeling data comprises a first emotion tendency selected from a plurality of alternative emotion tendencies for a first emotion aspect and a first emotion intensity selected from a plurality of alternative emotion intensities for the first emotion tendency; step S220, determining a first emotion category label corresponding to the combination of the first emotion tendency and the first emotion intensity based on a predetermined mapping relation between alternative combinations of the plurality of alternative emotion tendencies and the plurality of alternative emotion intensities and alternative emotion category labels; step S230, determining a first training sample based on the first text and the first emotion category label, for training a text emotion analysis model to perform emotion analysis on text to be analyzed.
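As an illustrative sketch only, the three steps above can be expressed in code. The label names and the mapping entries below are assumptions for demonstration, not values fixed by this specification:

```python
# Hypothetical sketch of steps S210-S230. The mapping entries and label
# names are illustrative assumptions, not prescribed by the specification.

# Predetermined mapping used in step S220:
# (emotion tendency, emotion intensity) -> emotion category label
MAPPING = {
    ("positive", 4): "surprise",
    ("positive", 3): "excitement",
    ("negative", 5): "angry",
}

def build_training_sample(text, tendency, intensity):
    """Steps S210-S230: take the labeling data of one text and return a
    (text, label) training sample for the emotion analysis model."""
    label = MAPPING[(tendency, intensity)]      # step S220: label lookup
    return {"text": text, "label": label}       # step S230: training sample

sample = build_training_sample("Great product!", "positive", 4)
```

A full implementation would collect such samples into a training set and feed them to the model-training step described later.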
In the above steps, the labeling data of the first text is obtained in step S210, the label of the first text is determined based on the obtained labeling data in step S220, and a first training sample corresponding to the first text is constructed in step S230 for training the text emotion analysis model.
Further, as can be seen from steps S210 and S220, in order to implement the method, it is necessary to determine in advance an emotion aspect, a plurality of alternative emotion tendencies and a plurality of alternative emotion intensities for that aspect, and to establish the mapping relation between alternative combinations of the plurality of alternative emotion tendencies and the plurality of alternative emotion intensities and the alternative emotion category labels. This part is first described by way of example.
In one embodiment, the demander of the text emotion analysis model may provide, according to actual requirements, the emotion labels that need to be identified for the target analysis data. In a specific embodiment, the above-mentioned demander may be an online or offline service platform providing a certain product or service, such as an e-commerce platform, a course selection platform, a news information platform, an offline physical store, and so on. In a specific embodiment, the target analysis data may be user evaluation data, dynamic data published by users on a social platform, and so on.
In one example, the news information platform may provide the following labels based on its emotion analysis needs for user comments: happiness, excitement, joy, surprise, depression, irritation, anger, sadness, calm, boredom, sullenness, slight support, strong support, affirmation, neutrality, questioning, resistance, rejection, and the like.
Further, one or more emotion aspects may be determined based on the labels provided by the demander, and the plurality of alternative emotion category labels included under each emotion aspect may be determined. In a specific embodiment, the determined emotion aspects may include: pleasure, optimism, approval, love, dislike, fear, praise, endorsement, self-blame, and the like.
In one example, the determined emotion aspects include pleasure and endorsement, and the respective emotion category labels are shown in Table 1 below:
TABLE 1
In this manner, one or more emotion aspects and alternative emotion category labels corresponding to each aspect may be determined based on the plurality of labels provided by the model demander for the target analysis data. Hereinafter, description will be given mainly of one emotion aspect (hereinafter, collectively referred to as a first emotion aspect) as an example.
Still further, a plurality of alternative emotion tendencies for the first emotion aspect may be determined from the plurality of alternative emotion labels corresponding to the first emotion aspect, wherein each alternative emotion label corresponds to one of the alternative emotion tendencies.
In a particular embodiment, the plurality of alternative emotion tendencies may include at least two of positive, negative, and no tendency. In one example, assuming the first emotion aspect is pleasure, the alternative emotion tendencies are specifically a pleasant tendency, an unpleasant tendency, and no obvious tendency toward either. In another example, assuming the first emotion aspect is endorsement, the alternative emotion tendencies are specifically an endorsement tendency, a disapproval tendency, and no obvious tendency toward either. In another specific embodiment, the plurality of alternative emotion tendencies may simply include having a tendency and having no tendency. In addition, in a particular embodiment, an alternative emotion tendency may be represented by numbers, letters, or other characters. In one example, positive, no tendency, and negative may be represented by 1, 0, and -1, respectively.
In a specific embodiment, the plurality of alternative emotion tendencies may be determined first, and then the alternative emotion tendency corresponding to each alternative emotion label determined, thereby associating each alternative label with an alternative emotion tendency. In another specific embodiment, the plurality of alternative emotion labels may be categorized first, and then the alternative emotion tendency corresponding to each category determined.
In one example, taking the emotion aspect pleasure shown in Table 1 as an example, Table 2 shows the determined plurality of alternative emotion tendencies, namely positive, negative, and no tendency, together with the alternative emotion category labels corresponding to each tendency.
TABLE 2
Thus, the plurality of alternative emotion tendencies, and the alternative emotion category labels corresponding to each tendency, can be obtained. In other words, the alternative emotion tendency corresponding to each alternative emotion category label can be obtained.
Still further, a plurality of alternative emotion intensities may be determined, and a mapping relation may be established between alternative combinations of the plurality of alternative emotion tendencies and the plurality of alternative emotion intensities and the alternative emotion category labels.
In a specific embodiment, drawing on business experience or life experience, the alternative emotion category labels under each alternative emotion tendency may first be ranked, where the ranking result reflects the degree of that tendency. The alternative emotion intensity corresponding to each alternative emotion category label is then set based on the ranking result. In one example, the expression form of the alternative emotion intensity is preferably one with a natural ordering, such as a single numerical value, a numerical range, or a letter.
In a specific example, assuming that the first emotion aspect is pleasure in Table 2 and that a certain alternative emotion tendency is positive, the ranking result reflects the degree of pleasure; for example, the ranking result may be: surprise, excitement, happiness, pleasure, indicating that surprise is more pleasant than excitement, excitement more pleasant than happiness, and so on. Further, in a more specific example, the correspondence between the alternative category labels and the alternative emotion intensities shown in Table 3 may be obtained.
TABLE 3

| Alternative emotion category label | Surprise | Excitement | Happiness | Pleasure |
| Alternative emotion intensity | a | b | c | d |
In another more specific example, the correspondence between the alternative category labels and the alternative emotion intensities shown in table 4 may be obtained.
TABLE 4

| Alternative emotion category label | Surprise | Excitement | Happiness | Pleasure |
| Alternative emotion intensity | 10 | 7-9 | 4-6 | 1-3 |
Therefore, the alternative emotion intensity corresponding to each alternative emotion label can be determined, realizing the quantification of emotion. It should be understood that the numbers of values or value ranges of the alternative emotion intensities corresponding to different alternative emotion tendencies may be completely the same, completely different, or partially the same.
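To make the quantification concrete, a Table 4-style assignment can be sketched as follows. The English label names are translations used for illustration, and the helper function is an assumption of this sketch rather than part of the specification:

```python
# Labels ranked from strongest to weakest degree of pleasure (Table 4),
# each paired with a descending numeric intensity range.
ranked_labels = ["surprise", "excitement", "happiness", "pleasure"]
ranges = [(10, 10), (7, 9), (4, 6), (1, 3)]

label_intensity = dict(zip(ranked_labels, ranges))

def label_for_intensity(value):
    """Map a chosen intensity value back to its emotion category label."""
    for label, (low, high) in label_intensity.items():
        if low <= value <= high:
            return label
    return None  # value outside all alternative intensity ranges
```

The numeric-range form lets one label cover several intensity values, which is why intensity ranges may differ per tendency, as noted above.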
In this way, the alternative emotion tendency and the alternative emotion intensity corresponding to each alternative emotion label can be determined, and the mapping relation between alternative combinations of the plurality of alternative emotion tendencies and the plurality of alternative emotion intensities for the first emotion aspect and the plurality of alternative emotion category labels can then be established. In one example, as shown in Table 5, the mapping relation includes: in the case where the emotion tendency for pleasure is positive and the emotion intensity is 4, the corresponding alternative emotion label is surprise.
TABLE 5
In the above, the process of establishing the mapping relation between alternative combinations of the plurality of alternative emotion tendencies and the plurality of alternative emotion intensities and the alternative emotion category labels has been described by way of example. It should be understood that the establishment procedure may be varied, as long as the mapping relation can be established. For instance, when determining the alternative emotion intensity corresponding to each alternative label, the intensities may be set directly without ranking. In a variation, a certain alternative emotion category label may be selected first, for example happiness, and its alternative emotion intensity set to 2; then another alternative emotion category label may be selected, for example pleasure, and, based on the intensity of 2 already set for happiness, the alternative emotion intensity of pleasure set to 1.
The steps involved in the method shown in fig. 2 are described next, specifically as follows:
First, in step S210, the labeling data obtained by labeling the first text is obtained.
It should be noted that, for convenience of description and understanding, in the embodiments of this specification, any one of a plurality of texts to be labeled is referred to as the first text. In one embodiment, the first text may be a word, a sentence, or a chapter.
In one embodiment, step S210 may include sub-steps S31 and S32 shown in FIG. 3. Specifically, first, in step S31, the text to be annotated is provided to an annotator, together with a plurality of alternative emotion tendencies for the first emotion aspect and a plurality of alternative emotion intensities for each alternative emotion tendency. The text to be annotated, the alternative emotion tendencies, and the alternative emotion intensities can be presented to the annotator in various forms. In a specific embodiment, the presentation may be based on an interactive interface. In another specific embodiment, the presentation may be made through an electronic document, such as a Word document or an Excel document. In yet another specific embodiment, the presentation may also be made through a paper document.
Further, in step S32, the first emotion tendency selected by the annotator from the plurality of alternative emotion tendencies and the first emotion intensity selected from the plurality of alternative emotion intensities for the first emotion tendency are acquired and used as the labeling data of the text. In a specific embodiment, the first emotion tendency and the first emotion intensity selected by the annotator via the interactive interface may be received as the labeling data. In another specific embodiment, the annotator enters the first emotion tendency and the first emotion intensity in the electronic document, and accordingly they can be read from the electronic document as the labeling data.
According to one example, the interactive interface shown in FIG. 4 includes: the text to be annotated, for example "This feature is even better!"; a first emotion aspect, namely pleasure; a plurality of alternative emotion tendencies, namely positive, negative, and no tendency; and a number of alternative emotion intensities, namely 1, 2, 3, and 4 for positive, 1, 2, and 3 for negative, and 1 and 2 for no tendency. In this way, the first emotion tendency (positive) and the first emotion intensity (4) selected by the annotator for the first emotion aspect can be obtained.
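The selection flow of the FIG. 4 example can be sketched as below. The alternative values mirror the example above, while the function and its validation logic are assumptions of this sketch:

```python
# Alternatives as presented in the FIG. 4 example: intensities 1-4 for
# positive, 1-3 for negative, and 1-2 for no tendency.
ALTERNATIVE_TENDENCIES = ("positive", "negative", "none")
ALTERNATIVE_INTENSITIES = {
    "positive": [1, 2, 3, 4],
    "negative": [1, 2, 3],
    "none": [1, 2],
}

def collect_annotation(tendency, intensity):
    """Accept a selection only if it is among the presented alternatives."""
    if tendency not in ALTERNATIVE_TENDENCIES:
        raise ValueError("tendency not among the alternatives")
    if intensity not in ALTERNATIVE_INTENSITIES[tendency]:
        raise ValueError("intensity not among the alternatives for this tendency")
    return {"tendency": tendency, "intensity": intensity}
```

Constraining the annotator to a small set of quantized choices is precisely what makes the resulting labeling data more consistent than free choice among near-synonymous labels.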
According to another example, the interactive interface shown in FIG. 5 includes: the text to be annotated, for example "Impressive, our work keeps getting better!"; a first emotion aspect, namely pleasure, with a plurality of alternative emotion tendencies (positive, negative, and no tendency) and a number of alternative emotion intensities (1, 2, and 3); and a second emotion aspect, namely endorsement, likewise with alternative emotion tendencies (positive, negative, and no tendency) and alternative emotion intensities (1, 2, and 3). In this way, the first emotion tendency (positive) and first emotion intensity (1) for the first emotion aspect, and the second emotion tendency (positive) and second emotion intensity (2) for the second emotion aspect, selected by the annotator, can be obtained.
According to yet another example, FIG. 6 shows labeling data recorded by an annotator in an Excel table, including, for the text numbered 3, an emotion tendency of -1 and an emotion intensity of 1. Thus, the recorded annotation data can be obtained by reading the Excel document.
In the above manner, the labeling data obtained by labeling the first text can be acquired.
Next, in step S220, the first emotion category label corresponding to the labeling data of the first text is determined based on the predetermined mapping relation.
In one example, based on the mapping relation shown in Table 5, assuming that the first emotion tendency included in the annotation data is negative and the first emotion intensity is 5, it may be determined that the corresponding first emotion category label is angry. In another example, assuming that the first emotion tendency is negative and the first emotion intensity is 6, it may be determined that the corresponding first emotion category label is furious. Thus, the corresponding emotion category label can be accurately located through the quantization indices.
In the above manner, the text label may be determined. Then, in step S230, a first training sample is determined based on the first text and the first emotion category label, and is used for training a text emotion analysis model to perform emotion analysis on text to be analyzed.
In one embodiment, the text emotion analysis model may be a multi-classification model. In another embodiment, the text emotion analysis model may include a plurality of classification models. In one embodiment, the algorithms on which the text emotion analysis model is based may include decision trees, naive Bayes, support vector machines (SVM), and the like. In another embodiment, the text emotion analysis model may employ a neural network model such as an RNN, LSTM, or GRU.
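As a toy illustration only (the specification leaves the model choice open to SVMs, naive Bayes, LSTMs, and so on), a minimal bag-of-words naive Bayes classifier can be trained on (text, label) samples of the kind produced in step S230. All sample texts and labels below are invented for demonstration:

```python
from collections import Counter, defaultdict
import math

def train(samples):
    """Count word frequencies per label over the training samples."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(model, text):
    """Return the label with the highest Laplace-smoothed posterior."""
    word_counts, label_counts = model
    total = sum(label_counts.values())
    vocab = {w for counts in word_counts.values() for w in counts}
    best_label, best_logp = None, float("-inf")
    for label in label_counts:
        logp = math.log(label_counts[label] / total)       # prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            logp += math.log((word_counts[label][word] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

samples = [
    ("love this product", "happiness"),
    ("great and fun", "happiness"),
    ("terrible and slow", "anger"),
    ("awful experience", "anger"),
]
model = train(samples)
```

A production system would replace this with a library model and a far larger training sample set, but the interface is the same: samples in, a text-to-label predictor out.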
Thus, the corresponding first emotion category label can be determined according to the labeling data of the first text, the first training sample can be determined from it, a number of other training samples can be determined similarly, a training sample set can be constructed, and the text emotion analysis model can then be trained.
In summary, with the text emotion analysis method disclosed in the embodiments of this specification, the labeling data has higher effectiveness and consistency, and so does the label data determined from it; consequently, the trained text emotion analysis model has high usability, and its prediction results have high confidence, high accuracy, and greater practical value.
The above embodiments mainly describe the text emotion analysis method shown in FIG. 2. Similarly, the embodiments of this specification also disclose a method for emotion analysis of pictures. Specifically, FIG. 7 shows a flowchart of a picture emotion analysis method according to an embodiment, whose execution subject may be any device, apparatus, platform, or device cluster having computing or processing capabilities. As shown in FIG. 7, the method comprises the following steps:
step S710, obtaining labeling data obtained by labeling a first picture, wherein the labeling data comprises a first emotion tendency selected from a plurality of alternative emotion tendencies for a first emotion aspect and a first emotion intensity selected from a plurality of alternative emotion intensities for the first emotion tendency; step S720, determining a first emotion category label corresponding to the combination of the first emotion tendency and the first emotion intensity based on a predetermined mapping relation between alternative combinations of the plurality of alternative emotion tendencies and the plurality of alternative emotion intensities and alternative emotion category labels; step S730, determining a first training sample based on the first picture and the first emotion category label, where the first training sample is used to train a picture emotion analysis model to perform emotion analysis on a picture to be analyzed.
In the above steps, in the embodiments of this specification, any one of the plurality of pictures to be labeled is referred to as the first picture. In one embodiment, the plurality of pictures may be pictures with emotional color. In a specific embodiment, pictures with facial expressions may be included; in one example, selfies of multiple persons. In another specific embodiment, pictures with body language may be included; in one example, whole-body photographs of animals or humans.
In one embodiment, the picture emotion analysis model may be a neural network model. In a specific embodiment, a CNN, a DNN, or the like may be employed.
In addition, for the description of the above steps S710 to S730, reference may also be made to the description of the above steps S210 to S230.
In summary, with the picture emotion analysis method disclosed in the embodiments of this specification, the labeling data has higher effectiveness and consistency, and so does the label data determined from it; consequently, the trained picture emotion analysis model has high usability, and its prediction results have high confidence, high accuracy, and greater practical value.
From another point of view, in the emotion analysis method shown in FIG. 2, the annotation data includes, for the preset first emotion aspect, the first emotion tendency selected by the annotator and the corresponding first emotion intensity. In other words, emotion aspect, emotion tendency, and emotion intensity have a hierarchical relationship: the three correspond to three levels and represent the emotion dimensions at those levels. Accordingly, the alternative dimension values corresponding to each emotion dimension can be provided to the annotator, where the alternative dimension values include the alternative emotion tendencies for the emotion tendency dimension and the alternative emotion intensities for the emotion intensity dimension.
Based on this, the annotation data may also include richer hierarchical information or other hierarchical structures. In one embodiment, the first emotion aspect may not be preset, but instead selected by the annotator from a plurality of alternative emotion aspects. In another embodiment, the plurality of emotion dimensions may include one or more of emotion aspect, emotion tendency, and emotion intensity. In yet another embodiment, the plurality of emotion dimensions may include emotion authenticity, emotion expression manner, and the like. In a specific embodiment, the alternative dimension values corresponding to emotion authenticity may include: genuine, feigned, and indistinguishable. In another specific embodiment, the alternative dimension values corresponding to emotion expression manner may include: restrained, naturally revealed, unrestrained, and the like.
Specifically, FIG. 8 illustrates a flowchart of a text emotion analysis method according to another embodiment, whose execution subject may be any device, apparatus, platform, or device cluster having computing or processing capabilities. As shown in FIG. 8, the method comprises the following steps:
step S810, obtaining labeling data obtained by labeling the first text, wherein the labeling data comprises a selected dimension value chosen from the alternative dimension values for each of a plurality of preset emotion dimensions at multiple levels; step S820, determining a first emotion category label corresponding to the combination of the selected dimension values based on a predetermined mapping relation between alternative combinations of the alternative dimension values of the emotion dimensions and alternative emotion category labels; step S830, determining a first training sample based on the first text and the first emotion category label, where the first training sample is used to train a text emotion analysis model to perform emotion analysis on text to be analyzed.
Regarding the above steps, it should be understood that a hierarchical relationship exists between the plurality of emotion dimensions, which are located at different levels. In one embodiment, the plurality of emotion dimensions may include, from high level to low level, an emotion aspect dimension, an emotion tendency dimension, and an emotion intensity dimension. Further, the emotion aspect dimension corresponds to a plurality of alternative emotion aspects; the emotion tendency dimension corresponds to a plurality of alternative emotion tendencies, including the alternative emotion tendencies corresponding to each alternative emotion aspect; and the emotion intensity dimension corresponds to a plurality of alternative emotion intensities, including the alternative emotion intensities corresponding to each alternative emotion tendency.
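The three-level hierarchy just described can be represented as a nested structure. The concrete aspects, tendencies, and intensity ranges below are illustrative assumptions:

```python
# Illustrative aspect -> tendency -> intensities hierarchy. The values
# are assumptions of this sketch, not fixed by the specification.
HIERARCHY = {
    "pleasure": {
        "positive": [1, 2, 3, 4],
        "negative": [1, 2, 3],
        "none": [1, 2],
    },
    "endorsement": {
        "positive": [1, 2, 3],
        "negative": [1, 2, 3],
        "none": [1, 2],
    },
}

def alternatives_for(aspect, tendency=None):
    """Return the alternatives at the next level below the given selection."""
    if tendency is None:
        return list(HIERARCHY[aspect])      # tendencies under an aspect
    return HIERARCHY[aspect][tendency]      # intensities under a tendency
```

An annotation interface can walk this structure level by level, presenting only the alternatives that remain valid after each selection.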
In another embodiment, the plurality of emotion dimensions may include, from high level to low level, an emotion state dimension and an emotion authenticity dimension. In a specific embodiment, the mapping between each alternative combination of the alternative dimension values of these two dimensions and the alternative emotion category labels is shown in Table 6:
TABLE 6
In one example, based on Table 6, assuming that the selected dimension values annotated for a text include positive and natural, the first category label, optimism, can be determined from the above mapping. A first training sample can then be determined for the text and used to train a text emotion analysis model for performing emotion analysis on text to be analyzed.
In the above-described embodiments, FIGS. 2 and 8 show emotion analysis methods in which a text-form sample is the target sample, and FIG. 7 shows an emotion analysis method in which a picture-form sample is the target sample. The methods shown in FIGS. 2, 7, and 8 can be further extended to more forms of target samples and more scenarios. On the one hand, the approach is not limited to emotion analysis scenarios but can be used for other classification scenarios, and is particularly suitable where the degree of distinction between classification labels is low, for example, classifying the field to which a knowledge point belongs, or classifying news content by subjectivity, viewpoint, and the like. On the other hand, the target sample may be text, a picture, audio, video, or the like.
Specifically, FIG. 9 illustrates a flowchart of a classification method for a target sample according to an embodiment, whose execution subject may be any device, apparatus, platform, or device cluster having computing or processing capabilities. As shown in FIG. 9, the method comprises the following steps:
step S910, obtaining labeling data obtained by labeling a first sample, wherein the labeling data comprises a selected category chosen from the alternative categories for each of a plurality of preset classification levels; step S920, determining a first class label corresponding to the combination of selected categories based on a predetermined mapping relation between alternative combinations of the alternative categories of the classification levels and alternative class labels; step S930, determining a first training sample based on the first sample and the first class label, where the first training sample is used to train a classification model to classify target samples to be classified.
It should be noted that, in one embodiment, the multiple classification levels mentioned above may correspond to different category levels, that is, different classification granularities. In a specific embodiment, the plurality of classification levels includes a primary category and a secondary category, where the alternative categories at the primary level include mathematics, physics, and chemistry, and the alternative categories at the secondary level include theory, application, acoustics, particles, lasers, inorganic, and organic. Further, the predetermined mapping relation may be as shown in Table 7:
TABLE 7
In one example, based on Table 7, assuming that the selected categories annotated for a paper include physics and particles, it can be determined that the first class label is particle physics. A first training sample may thus be determined for the paper and used to train a classification model for classifying target papers to be classified.
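The Table 7-style lookup can be sketched as follows. Only the (physics, particles) entry comes from the example above; the remaining entries are illustrative assumptions:

```python
# Two-level category mapping in the style of Table 7. Only the
# ("physics", "particles") entry is from the text; the rest are
# illustrative assumptions.
CATEGORY_MAP = {
    ("mathematics", "theory"): "theoretical mathematics",
    ("mathematics", "application"): "applied mathematics",
    ("physics", "acoustics"): "acoustics",
    ("physics", "particles"): "particle physics",
    ("physics", "lasers"): "laser physics",
    ("chemistry", "inorganic"): "inorganic chemistry",
    ("chemistry", "organic"): "organic chemistry",
}

def class_label(primary, secondary):
    """Step S920: resolve the selected categories to a class label."""
    return CATEGORY_MAP[(primary, secondary)]
```

The same pattern generalizes to any number of classification levels by extending the tuple key.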
In addition, for the description of step S910 to step S930 above, reference may also be made to the description of the foregoing related content.
In the above manner, with the classification method for target samples disclosed in the embodiments of this specification, the labeling data has higher effectiveness and consistency, and so does the label data determined from it; consequently, the trained classification model has high usability, and its prediction results have high confidence, high accuracy, and greater practical value.
It should be noted that FIG. 3 discloses a method for obtaining annotation data for text. According to another embodiment, this specification also discloses a method for obtaining annotation data for a picture. Specifically, FIG. 10 illustrates a flowchart of a method for obtaining annotation data according to another embodiment, whose execution subject may be any device, apparatus, platform, or device cluster having computing or processing capabilities. As shown in FIG. 10, the method includes the following steps:
Step S1010, providing the text or picture to be annotated to the annotator, together with a plurality of alternative emotion tendencies for the first emotion aspect and a plurality of alternative emotion intensities for each alternative emotion tendency. Step S1020, obtaining the first emotion tendency selected by the annotator from the plurality of alternative emotion tendencies and the first emotion intensity selected from the plurality of alternative emotion intensities, and using them as the annotation data of the text or picture.
For the above steps, in one embodiment, step S1010 may include: displaying an interactive interface that contains the text or picture to be annotated, the plurality of alternative emotion tendencies, and the plurality of alternative emotion intensities. Accordingly, step S1020 may include: receiving the first emotion tendency and the first emotion intensity selected by the annotator through the interactive interface.
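Steps S1010/S1020 can be sketched as a simple validate-and-record routine. This is a hedged illustration, not the disclosed implementation: the tendency and intensity sets below are assumptions (the specification only names positive, negative, and no tendency), and the function name is hypothetical.

```python
# Assumed alternatives: three tendencies, with intensity levels offered per
# tendency (the concrete level counts here are invented for illustration).
TENDENCIES = ["positive", "negative", "none"]
INTENSITIES = {"positive": [1, 2, 3], "negative": [1, 2, 3], "none": [0]}

def collect_annotation(item, tendency, intensity):
    """Validate the annotator's selections against the offered alternatives
    and return them as annotation data for the text or picture `item`."""
    if tendency not in TENDENCIES:
        raise ValueError(f"unknown tendency: {tendency}")
    if intensity not in INTENSITIES[tendency]:
        raise ValueError(f"intensity {intensity} not offered for tendency {tendency}")
    return {"item": item, "tendency": tendency, "intensity": intensity}

print(collect_annotation("great phone, very satisfied", "positive", 3))
```

In an interactive interface, the validation step is implicit because the annotator can only click among the displayed alternatives; in a programmatic pipeline it guards against malformed annotation records.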
Further, for the description of step S1010 and step S1020, reference may also be made to the foregoing description of step S31 and step S32.
Corresponding to the emotion analysis method shown in fig. 8, the present specification also discloses a method for obtaining annotation data. In particular, FIG. 11 illustrates a flowchart of a method for obtaining annotation data according to another embodiment, which may be executed by any device, apparatus, platform, or device cluster having computing or processing capabilities. As shown in fig. 11, the method includes the following steps:
Step S1110, providing the annotator with the text or picture to be annotated and preset candidate dimension values for each of a plurality of emotion dimensions at a plurality of levels. Step S1120, obtaining each selected dimension value chosen by the annotator from the candidate dimension values of each emotion dimension, and taking the selected dimension values as annotation data for the text or picture.
For the above steps, in one embodiment, step S1110 may include: displaying an interactive interface that contains the text or picture to be annotated and the candidate dimension values of each emotion dimension. Accordingly, step S1120 may include: receiving the selected dimension values chosen by the annotator through the interactive interface.
Further, for the description of step S1110 and step S1120, reference may also be made to the foregoing description of step S31 and step S32.
According to an embodiment of yet another aspect, the present specification further discloses a method for obtaining sample annotation data that is applicable to more scenarios. In particular, fig. 12 shows a flowchart of a method for obtaining annotation data according to yet another embodiment, which may be executed by any device, apparatus, platform, or device cluster having computing or processing capabilities. As shown in fig. 12, the method includes the following steps:
Step S1210, providing the annotator with the sample to be annotated and preset candidate categories for each of a plurality of classification levels. Step S1220, obtaining each selected category chosen by the annotator from the candidate categories of each classification level, and taking the selected categories as annotation data for the sample.
For the above steps, in one embodiment, step S1210 may include: displaying an interactive interface that contains the sample to be annotated and the candidate categories of each of the plurality of classification levels. Accordingly, step S1220 may include: receiving the selected categories chosen by the annotator through the interactive interface.
Further, for the description of step S1210 and step S1220, reference may also be made to the foregoing description of step S31 and step S32.
Corresponding to the methods described in the above embodiments, this specification also discloses various apparatuses, specifically as follows:
FIG. 13 shows a schematic block diagram of a text emotion analysis apparatus according to one embodiment. As shown, the apparatus 1300 includes:
an annotation data obtaining unit 1310 configured to obtain annotation data produced by annotating a first text, where the annotation data includes a first emotion tendency selected from a plurality of alternative emotion tendencies for a first emotion aspect, and a first emotion intensity selected from a plurality of alternative emotion intensities for the first emotion tendency; a category label determining unit 1320 configured to determine a first emotion category label corresponding to the combination of the first emotion tendency and the first emotion intensity, based on a predetermined mapping relationship between alternative combinations of the plurality of alternative emotion tendencies and the plurality of alternative emotion intensities, on the one hand, and alternative emotion category labels, on the other; and a training sample determining unit 1330 configured to determine, based on the first text and the first emotion category label, a first training sample for training a text emotion analysis model to perform emotion analysis on text to be analyzed.
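The mapping applied by unit 1320 can be sketched as a lookup from (tendency, intensity) combinations to category labels. The table below is hypothetical: the label names are loosely inspired by the angry/anger/annoyance example in this specification, but the concrete combinations and the function name are assumptions for illustration only.

```python
# Hypothetical predetermined mapping from alternative combinations of
# (emotion tendency, emotion intensity) to alternative emotion category labels.
COMBO_TO_LABEL = {
    ("negative", 1): "annoyance",
    ("negative", 2): "anger",
    ("negative", 3): "rage",
    ("positive", 1): "contentment",
    ("none", 0): "neutral",
}

def build_sample(text, tendency, intensity):
    """Units 1320 + 1330 in miniature: resolve the combination to a category
    label, then pair it with the text as one training sample."""
    label = COMBO_TO_LABEL[(tendency, intensity)]
    return {"text": text, "label": label}

print(build_sample("this is outrageous", "negative", 2)["label"])  # anger
```

The point of the indirection is that the annotator only ever chooses a tendency and an intensity, which are easy to distinguish; the fine-grained category label (anger vs. annoyance) is derived deterministically from the table rather than chosen directly.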
In one embodiment, the plurality of alternative emotion tendencies includes positive tendency, negative tendency, and no tendency.
In one embodiment, the first emotion aspect is one of the following: pleasure, optimism, access, love, humming, photophobia, approval, and self-responsibility.
In one embodiment, the number of the plurality of alternative emotion intensities for the first emotion tendency is determined based on the number of pre-classified alternative emotion category labels having the first emotion tendency under the first emotion aspect.
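This counting rule can be illustrated directly: the number of intensity levels offered for a tendency equals the number of pre-classified labels carrying that tendency. The label inventory below is an invented placeholder, as is the function name; only the one-label-per-intensity-level principle comes from the text.

```python
# Hypothetical pre-classified label inventory, keyed by
# (emotion aspect, emotion tendency).
PRECLASSIFIED_LABELS = {
    ("anger", "negative"): ["annoyance", "anger", "rage"],  # invented entries
    ("anger", "none"): ["neutral"],
}

def intensity_count(aspect, tendency):
    """Offer one alternative intensity level per pre-classified category
    label that has this tendency under this aspect."""
    return len(PRECLASSIFIED_LABELS.get((aspect, tendency), []))

print(intensity_count("anger", "negative"))  # 3
```

With three negative labels pre-classified under the hypothetical anger aspect, the annotator is offered three negative intensity levels, so every (tendency, intensity) combination maps onto exactly one label.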
Fig. 14 shows a schematic block diagram of a picture emotion analysis apparatus according to one embodiment. As shown, the apparatus 1400 includes:
an annotation data obtaining unit 1410 configured to obtain annotation data produced by annotating a first picture, where the annotation data includes a first emotion tendency selected from a plurality of alternative emotion tendencies for a first emotion aspect, and a first emotion intensity selected from a plurality of alternative emotion intensities for the first emotion tendency; a category label determining unit 1420 configured to determine a first emotion category label corresponding to the combination of the first emotion tendency and the first emotion intensity, based on a predetermined mapping relationship between alternative combinations of the plurality of alternative emotion tendencies and the plurality of alternative emotion intensities, on the one hand, and alternative emotion category labels, on the other; and a training sample determining unit 1430 configured to determine, based on the first picture and the first emotion category label, a first training sample for training a picture emotion analysis model to perform emotion analysis on a picture to be analyzed.
In one embodiment, the plurality of alternative emotion tendencies includes positive tendency, negative tendency, and no tendency.
In one embodiment, the first emotion aspect is one of the following: pleasure, optimism, access, love, humming, photophobia, approval, and self-responsibility.
In one embodiment, the number of the plurality of alternative emotion intensities for the first emotion tendency is determined based on the number of pre-classified alternative emotion category labels having the first emotion tendency under the first emotion aspect.
Fig. 15 shows a schematic block diagram of a text emotion analysis apparatus according to another embodiment. As shown in fig. 15, the apparatus 1500 includes: an annotation data obtaining unit 1510 configured to obtain annotation data produced by annotating a first text, where the annotation data includes, for each of a plurality of preset emotion dimensions at a plurality of levels, a selected dimension value chosen from the candidate dimension values of that dimension; a category label determining unit 1520 configured to determine a first emotion category label corresponding to the combination of the selected dimension values, based on a predetermined mapping relationship between candidate combinations of the candidate dimension values of the emotion dimensions and candidate emotion category labels; and a training sample determining unit 1530 configured to determine, based on the first text and the first emotion category label, a first training sample for training a text emotion analysis model to perform emotion analysis on text to be analyzed.
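For this multi-dimension variant, the mapping key is a tuple of one selected value per emotion dimension rather than a (tendency, intensity) pair. The sketch below assumes the three dimensions named in the next paragraph (aspect, tendency, intensity); the concrete dimension values, labels, and function name are invented for illustration.

```python
# Hypothetical mapping for apparatus 1500: a candidate combination of
# dimension values (aspect, tendency, intensity) maps to one category label.
DIM_COMBO_TO_LABEL = {
    ("anger", "negative", "high"): "rage",
    ("anger", "negative", "mid"): "anger",
    ("anger", "negative", "low"): "annoyance",
    ("anger", "none", "low"): "neutral",
}

def label_for(selected_values):
    """Unit 1520 in miniature: resolve the per-dimension selections,
    taken in a fixed dimension order, to one emotion category label."""
    return DIM_COMBO_TO_LABEL[tuple(selected_values)]

print(label_for(["anger", "negative", "mid"]))  # anger
```

Because the key is an ordered tuple, the interface must present the dimensions in a fixed order (or the selections must be normalized to one) before the lookup.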
In one embodiment, the plurality of emotion dimensions includes at least one of: emotion aspect, emotion tendency, emotion intensity.
Fig. 16 shows a schematic block diagram of a classification apparatus for target samples according to an embodiment. As shown in fig. 16, the apparatus 1600 includes: an annotation data obtaining unit 1610 configured to obtain annotation data produced by annotating a first sample, where the annotation data includes, for each of a plurality of preset classification levels, a selected category chosen from the candidate categories of that level; a category label determining unit 1620 configured to determine a first category label corresponding to the combination of the selected categories, based on a predetermined mapping relationship between candidate combinations of the categories of the classification levels and candidate category labels; and a training sample determining unit 1630 configured to determine, based on the first sample and the first category label, a first training sample for training a classification model to classify target samples to be classified.
FIG. 17 illustrates a schematic block diagram of an apparatus for acquiring annotation data, according to one embodiment. As shown in fig. 17, the apparatus 1700 includes:
a providing unit 1710 configured to provide the annotator with the text or picture to be annotated, a plurality of alternative emotion tendencies for the first emotion aspect, and a plurality of alternative emotion intensities for each alternative emotion tendency; an obtaining unit 1720 configured to obtain a first emotion tendency selected by the annotator from the plurality of alternative emotion tendencies and a first emotion intensity selected from the plurality of alternative emotion intensities for the first emotion tendency, and use the first emotion tendency and the first emotion intensity as annotation data for the text or picture.
In one embodiment, the providing unit 1710 is specifically configured to: display an interactive interface containing the text or picture to be annotated, the plurality of alternative emotion tendencies, and the plurality of alternative emotion intensities for each alternative emotion tendency; the obtaining unit 1720 is specifically configured to: receive the first emotion tendency and the first emotion intensity selected by the annotator through the interactive interface.
FIG. 18 shows a schematic block diagram of an apparatus for obtaining annotation data according to another embodiment. As shown in fig. 18, the apparatus 1800 includes: a providing unit 1810 configured to provide the annotator with the text or picture to be annotated and preset candidate dimension values for each of a plurality of emotion dimensions at a plurality of levels; an obtaining unit 1820 configured to obtain each selected dimension value chosen by the annotator from the candidate dimension values of each emotion dimension, and use the selected dimension values as annotation data for the text or picture.
FIG. 19 shows a schematic block diagram of an apparatus for acquiring annotation data, according to another embodiment. As shown in fig. 19, the apparatus 1900 includes:
a providing unit 1910 configured to provide the annotator with the sample to be annotated and preset candidate categories for each of a plurality of classification levels; an obtaining unit 1920 configured to obtain each selected category chosen by the annotator from the candidate categories of each classification level, and use the selected categories as annotation data for the sample.
In one embodiment, the providing unit 1910 is specifically configured to: display an interactive interface containing the sample to be annotated and the candidate categories of each of the plurality of classification levels; the obtaining unit 1920 is specifically configured to: receive the selected categories chosen by the annotator through the interactive interface.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 2, 3, 8, 9, 10, 11 and 12.
According to an embodiment of yet another aspect, there is also provided a computing device including a memory having executable code stored therein and a processor that, when executing the executable code, implements the method described in connection with fig. 2, 3, 8, 9, 10, 11 and 12.
Those skilled in the art will appreciate that in one or more of the examples described above, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The foregoing embodiments have been provided to illustrate the principles of the present invention in further detail and are not to be construed as limiting the scope of the invention; rather, any modifications, equivalents, improvements, and the like made on the basis of the teachings of the invention are intended to fall within its scope of protection.