CN115688766A - Alternative prompt answer generation method and device based on emotion analysis - Google Patents

Alternative prompt answer generation method and device based on emotion analysis

Info

Publication number
CN115688766A
Authority
CN
China
Prior art keywords
word
emotion
sentences
words
generating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211392664.8A
Other languages
Chinese (zh)
Inventor
Name withheld on request
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PEOPLE'S BANK OF CHINA NATIONAL CLEARING CENTER
Original Assignee
PEOPLE'S BANK OF CHINA NATIONAL CLEARING CENTER
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PEOPLE'S BANK OF CHINA NATIONAL CLEARING CENTER
Priority to CN202211392664.8A
Publication of CN115688766A

Landscapes

  • Machine Translation (AREA)

Abstract

The invention provides a method and a device for generating alternative prompt answers based on emotion analysis. The method comprises the following steps: generating high-dimensional word vectors according to a pre-generated emotion word and sentence set; segmenting the emotion word and sentence set to generate a word segmentation result; and generating alternative prompt answers according to the high-dimensional word vectors and the word segmentation result. The invention solves the problem that aspect information is ignored in existing methods. When emotion analysis is carried out on small sample data, the data are effectively augmented by methods such as back-translation and generation; the text is vectorized with a continuous bag-of-words model, which effectively captures the context information of words; the word vectors are clustered around seed words to find alternative prompt answers; and the best prompt answer is found from the speed of gradient descent of the model.

Description

Alternative prompt answer generation method and device based on emotion analysis
Technical Field
The invention relates to the technical field of natural language processing, in particular to a method and a device for generating alternative prompt answers based on emotion analysis.
Background
In recent years, natural language processing technology has developed rapidly. Beginning with BERT, fine-tuning a pre-trained model has become the conventional paradigm in the related art, namely "pre-train, fine-tune". Starting from GPT-3, however, a new prompt-based paradigm has attracted the attention of technicians and has become increasingly popular.
Take the text emotion classification task as an example: for the input "I love this movie.", a prompt of the form "The movie is" is appended to the input, and the pre-trained language model (PLM) then fills in an answer that expresses emotion, such as "great" or "terrible". Finally, the answer is converted into a label for emotion classification. By selecting an appropriate prompt, the prediction output of the model can be controlled, so that a pre-trained language model trained in a completely unsupervised manner can be used to solve various downstream tasks.
Compared with the "pre-train, fine-tune" approach, the prompt approach performs better on small sample data. When using a prompt, a person skilled in the art generally proceeds as follows:
1. A template is used to reconstruct the input data.
The template is usually a segment of natural language that contains an empty position, denoted by [mask].
For example, in the text sentiment classification task, assume that the input is "I love this movie." and the template is "It was [mask]."
The input statement then becomes "I love this movie. It was [mask]." Here "It was [mask]." is called the template.
2. A prompt answer is designed, and the model is made to predict the prompt answer.
The prompt answer is the word that fills the [mask] position in the template. For example, in emotion analysis, positive emotion can be represented by "great" and negative emotion by "terrible". For the input above, if the model predicts [mask] as "great", the emotional tendency of the sentence can be determined to be positive. The design of the prompt answers has a great influence on the accuracy of the model.
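The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `apply_template` and `label_from_answer` and the dictionary `ANSWER_TO_LABEL` are hypothetical, and the model call that would actually predict the word at [mask] is left out.

```python
# Hypothetical mapping from prompt answers to labels, using the words from the example.
ANSWER_TO_LABEL = {"great": "positive", "terrible": "negative"}


def apply_template(text: str, template: str = "It was [mask].") -> str:
    """Step 1: reconstruct the input by appending the template."""
    return f"{text} {template}"


def label_from_answer(predicted_answer: str) -> str:
    """Step 2: convert the word predicted for [mask] into a sentiment label."""
    return ANSWER_TO_LABEL[predicted_answer]


prompt = apply_template("I love this movie.")
print(prompt)
print(label_from_answer("great"))
```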
Attribute-level sentiment analysis: text sentiment analysis is the study of computing the opinions, evaluations, attitudes and sentiments that people express toward entities (including products, services, organizations, individuals, issues, events, topics and their attributes). Attribute-level sentiment analysis classifies opinions by aspect and identifies the sentiment associated with each aspect.
For example, the following restaurant review is negative from the environment aspect but positive from the service aspect:
"They are often crowded on weekends, but their service is efficient and accurate." Environment: negative; service: positive. Unlike sentence-level sentiment analysis, attribute-level sentiment analysis must be carried out for a specific aspect. When prompts are applied to attribute-level sentiment analysis, existing methods generally use the following template:
"The [ASPECT] is [mask]."
[ASPECT] indicates the aspect to be evaluated. For example, consider the following review:
"unfortunately, the food is outstanding, but everything else about this restaurant is the pits"
To evaluate the food aspect, the template is applied and the input takes the following form:
"unfortunately, the food is outstanding, but everything else about this restaurant is the pits. The food is [mask]."
When the prompt method is applied to attribute-level emotion analysis, the prompt answers used by existing methods are not adjusted according to the aspect being evaluated; for example, every positive prompt answer is represented by "great" and every negative prompt answer by "terrible". This practice ignores the important factor of the aspect, because the way emotional polarity is expressed tends to differ between aspects. For example, in the financial field, "goodness" and "profit" are more appropriate for evaluating stocks, while "loose" and "tight" are more appropriate for evaluating monetary policy. Using the same prompt answers for every aspect does not match language habits and fails to make good use of the knowledge learned by the pre-trained language model.
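The difference between a unified answer set and aspect-specific answers can be illustrated with a minimal sketch. The names `build_aspect_prompt`, `answers_for` and `ASPECT_ANSWERS` are hypothetical, and the "food" entry is an invented illustration rather than a value taken from the existing methods.

```python
# Unified answers used by existing methods, as described above.
DEFAULT_ANSWERS = {"positive": "great", "negative": "terrible"}

# Hypothetical aspect-specific answers; the "food" entry is illustrative only.
ASPECT_ANSWERS = {
    "food": {"positive": "delicious", "negative": "awful"},
}


def build_aspect_prompt(text: str, aspect: str) -> str:
    """Fill the aspect into the template 'The [ASPECT] is [mask].'."""
    return f"{text} The {aspect} is [mask]."


def answers_for(aspect: str) -> dict:
    """Return aspect-specific answers, falling back to the unified ones."""
    return ASPECT_ANSWERS.get(aspect, DEFAULT_ANSWERS)


review = ("unfortunately, the food is outstanding, "
          "but everything else about this restaurant is the pits.")
print(build_aspect_prompt(review, "food"))
print(answers_for("food"), answers_for("service"))
```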
Disclosure of Invention
The alternative prompt answer generation method and device based on emotion analysis provided by the invention solve the problem that aspect information is ignored in existing methods. When emotion analysis is carried out on small sample data, the data are effectively augmented by methods such as back-translation and generation; the text is vectorized with a continuous bag-of-words model, which effectively captures the context information of words; the word vectors are clustered around seed words to find alternative prompt answers; and the best prompt answer is found from the speed of gradient descent of the model.
In order to achieve the above object, the present invention provides a method for generating alternative prompted answers based on emotion analysis, including:
generating high-dimensional word vectors according to a pre-generated emotion word and sentence set;
segmenting words of the emotion word and sentence set to generate a segmentation result;
and generating alternative prompt answers according to the high-dimensional word vectors and the word segmentation results.
In one embodiment, the method for generating the emotion word and sentence set comprises the following steps:
and expanding the pre-acquired initial word and sentence set to generate the emotion word and sentence set.
In an embodiment, the expanding the pre-obtained initial word and sentence set to generate the emotion word and sentence set includes:
translating the emotion words and sentences in the initial word and sentence set, which are in a first natural language, into emotion words and sentences in a second natural language;
and translating the emotion words and sentences with the second natural language format back to the emotion words and sentences with the first natural language format so as to expand the initial words and sentences set.
In an embodiment, the expanding the pre-obtained initial word and sentence set to generate the emotion word and sentence set further includes:
and randomly deleting some of the words from the emotion words and sentences in the initial word and sentence set, and randomly generating, from the emotion words and sentences after deletion, words and sentences similar to the emotion words and sentences before deletion, so as to expand the initial word and sentence set.
In one embodiment, the generating a high-dimensional word vector according to a pre-generated emotion word and sentence set includes:
predicting the current central word of the emotion words and sentences according to the context in the emotion words and sentences set;
and generating the high-dimensional word vector according to the current central word and a pre-generated continuous bag-of-words model.
In an embodiment, the segmenting words for the emotion word and sentence set to generate a segmentation result includes:
and segmenting the emotion words and sentences in the expanded emotion word and sentence set to generate words and sentences with independent meanings.
In an embodiment, the generating an alternative prompt answer according to the high-dimensional word vector and the word segmentation result includes:
and clustering the high-dimensional word vectors and the word segmentation results to generate the alternative prompt answers.
In an embodiment, the clustering the high-dimensional word vector and the word segmentation result to generate the alternative prompt answer includes:
determining the number of clusters according to the positive and negative emotion attributes of the emotion words and sentences so as to determine each cluster;
randomly selecting an emotion word and sentence as an initial cluster center;
calculating the cosine distance from each emotion word to the center of the initial cluster class corresponding to the emotion word;
and determining the alternative prompt answers in the emotion word and sentence set according to the cosine distance.
In an embodiment, the determining the alternative prompt answer in the emotion word and sentence set according to the cosine distance includes:
selecting a preset number of emotion words and sentences from the emotion words and sentences in the emotion word and sentence set according to the sequence of the cosine distance from small to large so as to generate a training set;
training a prompt attribute level emotion analysis model by using the training set;
and selecting the emotion words and sentences corresponding to the fastest gradient descent in the training process as the alternative prompt answers.
In a second aspect, the present invention provides an alternative prompted answer generating device based on emotion analysis, including:
the high-dimensional word vector generating module is used for generating high-dimensional word vectors according to the emotion word and sentence sets generated in advance;
the word segmentation result generation module is used for segmenting words of the emotion word and sentence set to generate word segmentation results;
and the prompt answer generating module is used for generating alternative prompt answers according to the high-dimensional word vectors and the word segmentation results.
In one embodiment, the alternative prompted answer generating apparatus based on emotion analysis further includes: a word and sentence set generating module for generating the emotion word and sentence set, wherein the word and sentence set generating module comprises:
and the word and sentence set generating unit is used for expanding a pre-acquired initial word and sentence set so as to generate the emotion word and sentence set.
In one embodiment, the word and sentence set generating unit includes:
the second word and sentence translation unit is used for translating the emotion words and sentences in the initial word and sentence set, which are in the first natural language, into emotion words and sentences in the second natural language;
the first word and sentence translation unit is used for translating the emotion words and sentences in the second natural language back into emotion words and sentences in the first natural language so as to expand the initial word and sentence set.
In one embodiment, the word and sentence set generating unit further includes:
and the similar word and sentence generating unit is used for randomly deleting some of the words from the emotion words and sentences in the initial word and sentence set, and randomly generating, from the emotion words and sentences after deletion, words and sentences similar to the emotion words and sentences before deletion, so as to expand the initial word and sentence set.
In one embodiment, the high-dimensional word vector generation module includes:
the central word prediction unit is used for predicting the current central word of the emotion words and sentences according to the context in the emotion words and sentences in the emotion word and sentence set;
and the high-dimensional word vector generating unit is used for generating the high-dimensional word vector according to the current central word and a pre-generated continuous bag-of-words model.
In one embodiment, the word segmentation result generation module includes:
and the independent word and sentence generating module is used for segmenting the emotion words and sentences in the emotion word and sentence set after the emotion words and sentences are expanded so as to generate words and sentences with independent meanings.
In one embodiment, the prompted answer generating module includes:
and the word and sentence clustering unit is used for clustering the high-dimensional word vectors and the word segmentation results to generate the alternative prompt answers.
In one embodiment, the word and sentence clustering unit includes:
the cluster number determining unit is used for determining the number of clusters according to the positive and negative emotion attributes of the emotion words and sentences so as to determine each cluster;
a cluster center selecting unit for randomly selecting an emotion word and sentence as an initial cluster center;
the cluster calculating unit is used for calculating the cosine distance from each emotion word to the center of the initial cluster corresponding to the emotion word in each cluster;
and the prompt answer determining unit is used for determining the alternative prompt answers in the emotion word and sentence set according to the cosine distance.
In one embodiment, the prompted answer determining unit includes:
the training set generating unit is used for selecting a preset number of emotion words and sentences from the emotion word and sentence set in ascending order of cosine distance so as to generate a training set;
the model training unit is used for training a prompt attribute-level emotion analysis model by using the training set;
and the gradient selection unit is used for selecting the emotion words and sentences corresponding to the fastest gradient decrease in the training process as the alternative prompt answers.
In a third aspect, the present invention provides a computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of a method for generating alternative prompted answers based on sentiment analysis.
In a fourth aspect, the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the alternative prompted answer generation method based on emotion analysis when executing the program.
In a fifth aspect, the present invention provides a computer readable storage medium, on which a computer program is stored, the computer program, when being executed by a processor, implementing the steps of a method for generating alternative prompted answers based on sentiment analysis.
As can be seen from the above description, the alternative prompt answer generation method and device based on emotion analysis provided in the embodiments of the present invention first generate high-dimensional word vectors according to a pre-generated emotion word and sentence set; then segment the emotion word and sentence set to generate a word segmentation result; and finally generate alternative prompt answers according to the high-dimensional word vectors and the word segmentation result. The invention solves the problem that aspect information is ignored in existing methods. When emotion analysis is carried out on small sample data, the data are effectively augmented by methods such as back-translation and generation; the text is vectorized with a continuous bag-of-words model, which effectively captures the context information of words; the word vectors are clustered around seed words to find alternative prompt answers; and the best prompt answer is found from the speed of gradient descent of the model.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a first flowchart illustrating a method for generating alternative prompted answers based on emotion analysis according to an embodiment of the present invention;
FIG. 2 is a second flowchart illustrating a method for generating alternative prompted answers based on emotion analysis according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a step 400 according to an embodiment of the present invention;
FIG. 4 is a first flowchart illustrating step 401 according to an embodiment of the present invention;
FIG. 5 is a second flowchart illustrating step 401 according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating step 100 according to an embodiment of the present invention;
FIG. 7 is a flowchart illustrating step 200 according to an embodiment of the present invention;
FIG. 8 is a flowchart illustrating step 300 according to an embodiment of the present invention;
FIG. 9 is a flowchart illustrating step 301 according to an embodiment of the present invention;
FIG. 10 is a flowchart illustrating step 3014 according to an embodiment of the present invention;
FIG. 11 is a block diagram of the architecture of an alternative prompted answer generation method based on emotion analysis in an exemplary embodiment of the present invention;
FIG. 12 is a flowchart illustrating a method for generating alternative prompted answers based on sentiment analysis according to an embodiment of the present invention;
FIG. 13 is a first schematic structural diagram illustrating an alternative prompted answer generating device based on emotion analysis according to an embodiment of the present invention;
FIG. 14 is a second schematic structural diagram illustrating an alternative prompted answer generating device based on emotion analysis according to an embodiment of the present invention;
FIG. 15 is a schematic structural diagram of the word and sentence set generating module 40 according to the embodiment of the present invention;
FIG. 16 is a first schematic structural diagram of a word and sentence set generating unit 401 in the embodiment of the present invention;
FIG. 17 is a second schematic structural diagram of the word and sentence set generating unit 401 in the embodiment of the present invention;
FIG. 18 is a schematic structural diagram of the high-dimensional word vector generation module 10 according to the embodiment of the present invention;
FIG. 19 is a schematic structural diagram of the word segmentation result generation module 20 according to the embodiment of the present invention;
FIG. 20 is a schematic structural diagram of a prompt answer generating module 30 according to an embodiment of the present invention;
FIG. 21 is a schematic structural diagram of a word and sentence clustering unit 301 according to an embodiment of the present invention;
FIG. 22 is a schematic structural diagram of the prompt answer determining unit 3014 according to the embodiment of the present invention;
FIG. 23 is a schematic structural diagram of an electronic device in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
The embodiment of the invention provides a specific implementation of a method for generating alternative prompt answers based on emotion analysis, and with reference to fig. 1, the method specifically includes the following steps:
Step 100: generating high-dimensional word vectors according to the pre-generated emotion word and sentence set.
Specifically, a data modeling method is used to convert natural language into numeric vectors that a machine can process. A word vector is a common way of representing word features: the value of each dimension represents a feature with a certain semantic and grammatical interpretation, so each dimension of a word vector may be referred to as a word feature. Word vectors take various forms, of which the distributed representation is one. A distributed representation is a dense, low-dimensional, real-valued vector, and each of its dimensions represents a latent feature of the word that captures useful syntactic and semantic properties. The word "distributed" reflects this characteristic of the word vector: the different syntactic and semantic features of a word are distributed across its dimensions.
Step 200: segmenting the emotion word and sentence set to generate a word segmentation result.
According to specific requirements, the emotion words and sentences in the emotion word and sentence set are divided into a sequence of character strings (the elements of the sequence are generally called tokens or words); specifically, the text in the input data is divided into characters and words with independent meanings.
Preferably, step 200 uses a word segmentation method based on deep learning: the most basic vectorized atomic features are used directly as input, and after multiple layers of nonlinear transformation, the output layer predicts the label of the current word or the next action.
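To make the input and output of the segmentation step concrete, the following sketch uses forward maximum matching against a small dictionary. This is a simpler stand-in chosen for illustration, not the deep-learning segmenter preferred above; the function name `forward_max_match`, the vocabulary and the sample sentence are all hypothetical.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Split unspaced text into tokens by greedy longest-match lookup."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical dictionary and sentence ("the service is very attentive").
vocab = {"服务", "周到"}
tokens = forward_max_match("服务很周到", vocab)
print(tokens)
```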
Step 300: and generating alternative prompt answers according to the high-dimensional word vectors and the word segmentation results.
As can be seen from the above description, the alternative prompted answer generation method based on emotion analysis provided in the embodiment of the present invention includes: firstly, generating high-dimensional word vectors according to a pre-generated emotion word and sentence set; then, segmenting words of the emotion word and sentence set to generate a word segmentation result; and finally, generating alternative prompt answers according to the high-dimensional word vectors and the word segmentation results. The invention provides an aspect-oriented prompt answer generation method. The method comprises the key steps of data augmentation, word2vec training, vectorization of input text, generation of alternative prompt answers, generation of optimal prompt answers and the like. The invention solves the problem that the aspect information is neglected in the existing method. When emotion analysis is carried out on small sample data, the data are effectively augmented by methods such as translation and generation; vectorizing the text by using a continuous bag-of-words model, and effectively capturing context information of words; clustering word vectors according to seed words to find alternative prompt answers; and finding the best prompt answer through the gradient descending speed of the model.
In an embodiment, referring to fig. 2, the alternative prompted answer generation method based on emotion analysis further includes:
Step 400: generating the emotion word and sentence set. Referring further to fig. 3, step 400 includes:
Step 401: expanding the pre-acquired initial word and sentence set to generate the emotion word and sentence set.
The present application provides two embodiments of step 401. Referring to fig. 4, the first embodiment of step 401 includes:
Step 4011: translating the emotion words and sentences in the initial word and sentence set, which are in a first natural language, into emotion words and sentences in a second natural language;
Step 4012: translating the emotion words and sentences in the second natural language back into emotion words and sentences in the first natural language so as to expand the initial word and sentence set.
In step 4011 and step 4012, the input text is translated into another language with a translation tool and then translated back into the original language, yielding a new corpus that has the same label as the original text.
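The round trip in steps 4011 and 4012 can be sketched as follows. The translation tool is not specified in the text, so the two translators are passed in as callables; `back_translate`, `to_pivot` and `from_pivot` are hypothetical names, and the toy translators below are stand-ins that only imitate a paraphrasing round trip.

```python
def back_translate(samples, to_pivot, from_pivot):
    """Augment (text, label) samples by translating out and back.

    The round-trip text keeps the label of the original text.
    """
    augmented = list(samples)
    seen = {text for text, _ in samples}
    for text, label in samples:
        round_trip = from_pivot(to_pivot(text))
        if round_trip not in seen:  # keep only genuinely new sentences
            augmented.append((round_trip, label))
            seen.add(round_trip)
    return augmented


# Toy stand-ins for a real translation tool (assumption, for illustration).
def to_pivot(text):
    return text.upper()


def from_pivot(text):
    return text.lower().replace("efficient", "fast")


samples = [("Their service is efficient.", "positive")]
augmented = back_translate(samples, to_pivot, from_pivot)
print(augmented)
```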
Referring to fig. 5, a second embodiment of step 401 includes:
Step 4013: randomly deleting some of the words from the emotion words and sentences in the initial word and sentence set, and randomly generating, from the emotion words and sentences after deletion, words and sentences similar to the emotion words and sentences before deletion, so as to expand the initial word and sentence set.
Specifically, either of the following two models can be employed to generate sentences similar to the input sentence, thereby performing data augmentation.
BERT-based similar sentence generation
Words in the input sentence are randomly masked and then refilled by random sampling from an MLM (masked language model), thereby generating a sentence similar to the input sentence.
GPT2-based similar sentence generation
A GPT2 generation model is used to generate sentences similar to the input sentence.
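The masking half of the MLM-based approach can be sketched without any model. In the sketch below, `mask_random_words` and `generate_similar` are hypothetical names, the 15% mask ratio is an assumed default, and `fill_mask` is a caller-supplied callable standing in for the masked language model that would sample replacements.

```python
import random


def mask_random_words(sentence, mask_ratio=0.15, mask_token="[MASK]", rng=None):
    """Replace a random subset of words (at least one) with the mask token."""
    rng = rng or random.Random()
    words = sentence.split()
    if not words:
        return []
    n_mask = max(1, round(len(words) * mask_ratio))
    positions = set(rng.sample(range(len(words)), n_mask))
    return [mask_token if i in positions else w for i, w in enumerate(words)]


def generate_similar(sentence, fill_mask, **kwargs):
    """Mask the sentence, then let the supplied model stand-in fill the masks."""
    masked = mask_random_words(sentence, **kwargs)
    return " ".join(fill_mask(masked))


# Toy stand-in for an MLM: every mask is filled with the same word.
def toy_fill(tokens):
    return [t if t != "[MASK]" else "tasty" for t in tokens]


masked = mask_random_words("the food here is outstanding", rng=random.Random(0))
similar = generate_similar("the food here is outstanding", toy_fill,
                           rng=random.Random(0))
print(masked)
print(similar)
```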
In one embodiment, referring to fig. 6, step 100 comprises:
Step 101: predicting the current central word of the emotion words and sentences according to the context in the emotion words and sentences in the emotion word and sentence set;
Step 102: generating the high-dimensional word vector according to the current central word and a pre-generated continuous bag-of-words model.
In steps 101 to 102, the continuous bag-of-words (CBOW) model is used to train word2vec, that is, to map words into high-dimensional word vectors. CBOW predicts the current central word from its context words, so the word vectors trained by CBOW effectively capture the context information of the central word. The trained CBOW model is then used to map the word segmentation result into high-dimensional word vectors.
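How CBOW frames its prediction task can be shown by building the (context words, central word) training pairs it learns from. This sketch covers only the pair construction, not the network that is trained on the pairs; `cbow_pairs` and the window size are assumptions made for illustration.

```python
def cbow_pairs(tokens, window=2):
    """Build (context words, central word) training pairs for CBOW."""
    pairs = []
    for i, center in enumerate(tokens):
        # Context = up to `window` words on each side of the central word.
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            pairs.append((context, center))
    return pairs


pairs = cbow_pairs(["the", "food", "is", "outstanding"], window=1)
for context, center in pairs:
    print(context, "->", center)
```

In practice a library implementation would normally be used; for example, gensim's `Word2Vec` class trains CBOW when its `sg` parameter is 0.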
In one embodiment, referring to fig. 7, step 200 comprises:
step 201: and segmenting the emotion words in the emotion word and sentence set after the expansion to generate words and sentences with independent meanings.
The text in the input data is divided into words and phrases with independent meanings.
In one embodiment, referring to fig. 8, step 300 comprises:
Step 301: clustering the high-dimensional word vectors and the word segmentation results to generate the alternative prompt answers.
Texts belonging to the same aspect are converted into word vectors; specifically, the word vectors generated in the third step are clustered using the k-means method to generate the alternative prompt answers.
Further, referring to fig. 9, step 301 includes:
step 3011: determining the number of clusters according to the positive and negative emotion attributes of the emotion words and sentences so as to determine each cluster;
step 3012: randomly selecting an emotion word and sentence as an initial cluster center;
step 3013: calculating the cosine distance from each emotion word to the center of the initial cluster class corresponding to the emotion word;
step 3014: and determining the alternative prompt answers in the emotion word and sentence set according to the cosine distance.
In steps 3011 to 3014, K-means is a partition-based clustering method. Its principle is to first initialize k cluster centers, assign each sample to a cluster based on the computed distance between the sample and each center, and iterate until the distance between each sample and the center of the cluster to which it belongs is minimized. The specific method is as follows:
1. number of clusters k: k is the number of emotion polarities, for example, if the emotion polarities are positive and negative, k is 2.
2. Selecting an initial cluster center: for each emotion polarity of each aspect, an emotion word is preset as an initial cluster center. For example, for restaurant food evaluation, "good eating" and "hard eating" may be selected as the initial cluster center
3. And calculating the distance from each word vector in the data set to the cluster class center, and attributing the word vector to the class corresponding to the cluster class center with the minimum distance. The distance is cosine distance, and the target function is as follows:
Figure BDA0003932594100000101
4. For each cluster, recalculate the position of its cluster center.
5. Repeat steps 3 and 4 until the positions of the cluster centers no longer change, thereby determining the final cluster centers.
6. For each cluster, select the 5 words nearest to the cluster center as alternative prompt answers.
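The six steps above can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: the function names `cosine_distance`, `assign`, and `seeded_kmeans` are hypothetical, the cluster center is updated as a simple component-wise mean (an assumption, since the objective function is only given as a figure), and all vectors are assumed to be non-zero.

```python
import math

def cosine_distance(a, b):
    """Cosine distance 1 - cos(a, b); assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def assign(vectors, centers):
    """Step 3: index of the nearest cluster center for every vector."""
    return [min(range(len(centers)), key=lambda j: cosine_distance(v, centers[j]))
            for v in vectors]

def seeded_kmeans(words, vectors, seed_words, top_n=5, max_iter=100):
    """Cluster word vectors around preset seed words (hypothetical helper)."""
    # Steps 1-2: k equals the number of seed words; their vectors start as centers.
    centers = [list(vectors[words.index(s)]) for s in seed_words]
    for _ in range(max_iter):
        labels = assign(vectors, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                # Step 4: new center is the component-wise mean of the members.
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(old)  # keep the center of an empty cluster
        # Step 5: stop once the centers no longer move.
        if new_centers == centers:
            break
        centers = new_centers
    # Step 6: the top_n words closest to each final center are the candidates.
    labels = assign(vectors, centers)
    candidates = {}
    for j, seed in enumerate(seed_words):
        members = [(cosine_distance(v, centers[j]), w)
                   for w, v, lab in zip(words, vectors, labels) if lab == j]
        candidates[seed] = [w for _, w in sorted(members)[:top_n]]
    return candidates
```

Calling `seeded_kmeans(words, vectors, ["delicious", "unpalatable"])` with the word list and vectors of one aspect would return up to five candidate answers per polarity, keyed by the seed word.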
In one embodiment, referring to fig. 10, step 3014 includes:
step 30141: selecting a preset number of emotion words and sentences from the emotion words and sentences in the emotion word and sentence set according to the sequence of the cosine distance from small to large so as to generate a training set;
step 30142: training a prompt attribute-level emotion analysis model by using the training set;
step 30143: and selecting the emotion words and sentences corresponding to the fastest gradient decrease in the training process as the alternative prompt answers.
The alternative words generated in step (4) are used in turn as prompt answers to train a prompt-based attribute-level emotion analysis model, and the word with the fastest gradient descent during training is selected as the best prompt answer.
To further illustrate the present solution, the present application provides a specific application example of the alternative prompted answer generation method based on emotion analysis, see fig. 11 and fig. 12.
S1: inputting attribute-level emotion analysis small sample data;
Each sample consists of two items, a text and an aspect. For example, "They are often crowded on weekends, but their service is efficient and accurate" is the text, and "service" is the aspect. The data is small-sample data and is suitable for prompt-based training. Applied to a template, the example above becomes:
They are often crowded on weekends, but their service is efficient and accurate. The service is very [mask].
The value of [mask] is predefined, and its value range matches the emotion polarities to be judged.
If there are two emotion polarities, [mask] can take two values, one for each polarity, for example [Zhou] and [bad]; these two words are called emotion words. The task of the model is to predict whether [mask] is [Zhou] or [bad], and thereby predict the attribute-level emotion polarity of the text.
Because the value of [mask] is predefined and affects the prediction performance of the model, the invention aims to find the emotion word that performs best as the value of [mask].
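The template step in S1 can be illustrated with a short sketch. The helper names `build_prompt` and `polarity_of`, the exact English wording of the template, and the answer words "good" and "bad" are assumptions made for illustration; the source only specifies that the aspect is followed by a [mask] slot whose predefined values correspond to the emotion polarities.

```python
def build_prompt(text, aspect, mask_token="[mask]"):
    """Append a cloze-style template for the given aspect to the input text."""
    sentence = text.strip()
    if not sentence.endswith((".", "!", "?")):
        sentence += "."
    return f"{sentence} The {aspect} is very {mask_token}."

def polarity_of(predicted_word, answer_words):
    """Map the word predicted for the mask slot back to its emotion polarity."""
    for polarity, word in answer_words.items():
        if word == predicted_word:
            return polarity
    raise ValueError(f"{predicted_word!r} is not a predefined answer word")

# Hypothetical answer words: one predefined emotion word per polarity.
ANSWER_WORDS = {"positive": "good", "negative": "bad"}

prompt = build_prompt(
    "They are often crowded on weekends, but their service is efficient and accurate",
    "service",
)
```

A masked language model would then score each predefined answer word at the [mask] position, and `polarity_of` converts the winning word into the aspect-level polarity.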
S2: data augmentation.
The input data is small-sample data and contains a limited number of emotion words. To select the most suitable emotion word as the prompt answer, the data is first augmented. Two data augmentation approaches are used here:
1. Back translation
The input text is translated into another language by a translation tool and then translated back into the original language, yielding new corpus text with the same label as the original text.
2. Generation
Either of the following two models may be used to generate sentences similar to the input text: a BERT-based similar-sentence generation model or a GPT-2-based similar-sentence generation model.
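Back translation can be sketched as below. The source does not name a translation tool, so the two translators are passed in as plain callables (`to_pivot` and `to_source` are hypothetical parameter names), and the sample layout `(text, aspect, label)` is an assumption. The toy translators in the usage lines are stand-ins that only swap one word.

```python
def back_translate(samples, to_pivot, to_source):
    """Augment (text, aspect, label) samples by round-trip translation.

    to_pivot translates into the intermediate language and to_source translates
    back; both are caller-supplied callables standing in for a translation tool.
    """
    augmented = list(samples)
    seen = {text for text, _, _ in samples}
    for text, aspect, label in samples:
        paraphrase = to_source(to_pivot(text))
        if paraphrase not in seen:  # keep only genuinely new sentences
            seen.add(paraphrase)
            # The paraphrase inherits the aspect and label of the original text.
            augmented.append((paraphrase, aspect, label))
    return augmented

# Toy stand-ins for a real translation tool (illustration only).
samples = [("their service is efficient", "service", "positive")]
expanded = back_translate(
    samples,
    to_pivot=lambda t: t.replace("efficient", "EFFICACE"),
    to_source=lambda t: t.replace("EFFICACE", "effective"),
)
```

Passing the translators as arguments keeps the sketch independent of any particular translation service; a round trip that returns the original sentence unchanged adds nothing to the set.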
S3: word2vec training.
Preferably, word2vec is trained using the continuous bag-of-words (CBOW) model. CBOW predicts the current center word from its context words, so word vectors trained with CBOW effectively capture the context information of the center word.
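To make the CBOW training signal concrete, the sketch below builds the (context words, center word) pairs that a CBOW model learns from. The function name `cbow_pairs` and the `window` parameter are illustrative; the source does not specify a window size.

```python
def cbow_pairs(tokens, window=2):
    """Return (context_words, center_word) training pairs for CBOW.

    For every position, the words within `window` positions on either side
    form the context from which the center word is to be predicted.
    """
    pairs = []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:  # a one-word sentence has no context to learn from
            pairs.append((context, center))
    return pairs

pairs = cbow_pairs(["the", "service", "is", "good"], window=1)
```

In practice a library implementation would normally be used for the actual training; for example, gensim's `Word2Vec` class selects CBOW when `sg=0`.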
S4: performing word segmentation and vectorization on the augmented data.
For the data augmented in step S2, word segmentation and vectorization are performed on the texts belonging to the same aspect. The word segmentation result is mapped into high-dimensional word vectors using the CBOW model trained in step S3.
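Step S4 can be sketched as grouping texts by aspect, segmenting them, and looking up each token in the trained embedding table. The names `vectorize_by_aspect`, `tokenize`, and `embeddings` are hypothetical; whitespace splitting stands in for a real word segmenter, and a plain dictionary stands in for the trained CBOW model.

```python
from collections import defaultdict

def vectorize_by_aspect(samples, tokenize, embeddings):
    """Map every in-vocabulary token to its vector, grouped by aspect.

    samples: iterable of (text, aspect) pairs.
    tokenize: callable that segments a text into words.
    embeddings: mapping from word to vector (stand-in for the CBOW model).
    """
    grouped = defaultdict(dict)
    for text, aspect in samples:
        for token in tokenize(text):
            if token in embeddings:  # skip out-of-vocabulary tokens
                grouped[aspect][token] = embeddings[token]
    return dict(grouped)

# Toy embedding table and whitespace segmentation (illustration only).
toy_embeddings = {"service": [0.2, 0.8], "efficient": [0.9, 0.1], "slow": [0.1, 0.9]}
by_aspect = vectorize_by_aspect(
    [("service efficient", "service"), ("service slow today", "service")],
    tokenize=str.split,
    embeddings=toy_embeddings,
)
```

The per-aspect dictionaries produced here are the inputs that the clustering step would consume, one aspect at a time.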
S5: generating alternative prompt answers.
The texts belonging to the same aspect are converted into word vectors, and the word vectors generated in step S4 are clustered by the k-means method to generate the alternative prompt answers.
S6: generating the best prompt answer.
The alternative words generated in step S5 are used in turn as prompt answers to train a prompt-based attribute-level emotion analysis model, and the word with the fastest gradient descent during training is selected as the best prompt answer.
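The selection in S6 can be sketched as follows. The source does not define how the speed of gradient descent is measured, so this sketch assumes one possible reading: the average decrease in training loss per step over a recorded loss curve. The names `loss_drop_rate`, `pick_best_answer`, and `train_and_log` are hypothetical, and the loss curves in the usage lines are invented toy values, not experimental results.

```python
def loss_drop_rate(losses):
    """Average loss decrease per training step over a recorded loss curve."""
    if len(losses) < 2:
        raise ValueError("need at least two loss values")
    return (losses[0] - losses[-1]) / (len(losses) - 1)

def pick_best_answer(candidates, train_and_log):
    """Train once per candidate answer word and keep the fastest learner.

    train_and_log: callable that trains the prompt-based model with the given
    answer word and returns the list of training losses it observed.
    """
    rates = {word: loss_drop_rate(train_and_log(word)) for word in candidates}
    best = max(rates, key=rates.get)
    return best, rates

# Toy loss curves standing in for real training runs (illustration only).
toy_curves = {"good": [1.0, 0.6, 0.4], "nice": [1.0, 0.9, 0.85]}
best, rates = pick_best_answer(list(toy_curves), toy_curves.__getitem__)
```

Other measures, such as the loss after a fixed number of steps, would fit the same interface by replacing `loss_drop_rate`.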
Based on the same inventive concept, the embodiment of the present application further provides an alternative prompted answer generating device based on emotion analysis, which can be used to implement the methods described in the above embodiments, as described in the following embodiments. Because the device solves the problem on a principle similar to that of the alternative prompted answer generation method based on emotion analysis, the implementation of the device may refer to the implementation of the method, and repeated parts are not described again. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. While the system described in the embodiments below is preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.
The embodiment of the present invention provides a specific implementation manner of an alternative prompted answer generating device based on emotion analysis, which is capable of implementing an alternative prompted answer generating method based on emotion analysis, and with reference to fig. 13, the alternative prompted answer generating device based on emotion analysis specifically includes the following contents:
the high-dimensional word vector generation module 10 is used for generating high-dimensional word vectors according to a pre-generated emotion word and sentence set;
a word segmentation result generation module 20, configured to perform word segmentation on the emotion word and sentence set to generate a word segmentation result;
and a prompt answer generating module 30, configured to generate an alternative prompt answer according to the high-dimensional word vector and the word segmentation result.
In an embodiment, referring to fig. 14, the apparatus for generating an alternative prompted answer based on emotion analysis further includes: a phrase set generating module 40, configured to generate the emotion phrase set, referring to fig. 15, where the phrase set generating module 40 includes:
a phrase set generating unit 401, configured to expand a pre-obtained initial phrase set to generate the emotion phrase set.
In one embodiment, referring to fig. 16, the phrase set generating unit 401 includes:
a second word and sentence translating unit 4011, configured to translate the emotion words and sentences having the first natural language format in the initial word and sentence set into emotion words and sentences having the second natural language format;
a first sentence translation unit 4012, configured to translate the emotion sentences in the second natural language format back to the emotion sentences in the first natural language format to expand the initial set of sentences.
In an embodiment, referring to fig. 17, the phrase set generating unit 401 further includes:
a similar word and sentence generating unit 4013, configured to randomly delete some of the words in the emotion words and sentences in the initial word and sentence set, and to randomly generate, from the emotion words and sentences after deletion, words and sentences similar to the emotion words and sentences before deletion, so as to expand the initial word and sentence set.
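The delete-then-regenerate behaviour of unit 4013 can be sketched as below. This is one possible reading, stated as an assumption: deletion is modelled by replacing randomly chosen tokens with a mask token, and regeneration is delegated to a caller-supplied `fill_in` callable standing in for a BERT- or GPT-2-based generator. The names `mask_random_tokens` and `regenerate` and the `ratio` parameter are hypothetical.

```python
import random

def mask_random_tokens(tokens, ratio=0.15, mask_token="[MASK]", seed=None):
    """Replace a random subset of tokens with mask_token.

    Returns the masked token list and the sorted masked positions.
    """
    if not tokens:
        return [], []
    rng = random.Random(seed)
    count = min(len(tokens), max(1, round(len(tokens) * ratio)))
    positions = sorted(rng.sample(range(len(tokens)), count))
    masked = list(tokens)
    for pos in positions:
        masked[pos] = mask_token
    return masked, positions

def regenerate(tokens, fill_in, **mask_options):
    """Delete random tokens, then let fill_in propose a similar sentence.

    fill_in: callable taking (masked_tokens, positions) and returning a new
    token list; it stands in for a similar-sentence generation model.
    """
    masked, positions = mask_random_tokens(tokens, **mask_options)
    return fill_in(masked, positions)
```

Each regenerated sentence would be added to the initial word and sentence set with the label of the sentence it was derived from.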
In one embodiment, referring to fig. 18, the high-dimensional word vector generation module 10 includes:
a headword prediction unit 101, configured to predict a current headword of an emotion word and sentence according to a context in the emotion word and sentence set;
a high-dimensional word vector generating unit 102, configured to generate the high-dimensional word vector according to the current central word and a pre-generated continuous bag-of-words model.
In one embodiment, referring to fig. 19, the word segmentation result generation module 20 includes:
and an independent word and sentence generating module 201, configured to segment the emotion words and sentences in the augmented emotion word and sentence set to generate words and sentences with independent meanings.
In one embodiment, referring to fig. 20, the prompted answer generating module 30 includes:
and a word and sentence clustering unit 301, configured to cluster the high-dimensional word vector and the word segmentation result to generate the candidate prompt answer.
In one embodiment, referring to fig. 21, the phrase clustering unit 301 includes:
a cluster number determining unit 3011, configured to determine the number of clusters according to the positive and negative emotion attributes of the emotion words and phrases to determine each cluster;
a cluster center selecting unit 3012 configured to randomly select an emotion word and sentence as an initial cluster center;
a cluster calculating unit 3013, configured to calculate, in each cluster, a cosine distance from each emotion word to a center of the initial cluster corresponding to the emotion word;
and a prompt answer determining unit 3014, configured to determine the alternative prompt answer in the emotion word and sentence set according to the cosine distance.
In an embodiment, referring to fig. 22, the prompted answer determining unit 3014 includes:
a training combination generating unit 30141, configured to select a preset number of emotion words and phrases from among the emotion words and phrases in the emotion word and phrase set according to a sequence that the cosine distance is from small to large, so as to generate a training set;
the model training unit 30142 is configured to train a prompt attribute-level emotion analysis model by using the training set;
and the gradient selecting unit 30143 is configured to select, as the alternative prompt answer, an emotion word and sentence corresponding to the fastest gradient decrease in the training process.
As can be seen from the above description, the alternative prompted answer generating device based on emotion analysis provided in the embodiment of the present invention first generates high-dimensional word vectors according to a pre-generated emotion word and sentence set; then segments the emotion word and sentence set to generate a word segmentation result; and finally generates alternative prompt answers according to the high-dimensional word vectors and the word segmentation result. The invention provides an aspect-oriented prompt answer generation method whose key steps include data augmentation, word2vec training, vectorization of the input text, generation of alternative prompt answers, and generation of the best prompt answer. The invention solves the problem that aspect information is neglected in existing methods. When emotion analysis is carried out on small-sample data, the data is effectively augmented by methods such as back translation and generation; the text is vectorized using a continuous bag-of-words model, which effectively captures the context information of words; the word vectors are clustered around seed words to find alternative prompt answers; and the best prompt answer is found through the gradient descent speed of the model.
The embodiment of the present application further provides a specific implementation manner of an electronic device, which is capable of implementing all steps in the alternative prompted answer generation method based on emotion analysis in the foregoing embodiment, and referring to fig. 23, the electronic device specifically includes the following contents:
a processor (processor) 1201, a memory (memory) 1202, a communication Interface (Communications Interface) 1203, and a bus 1204;
the processor 1201, the memory 1202, and the communication interface 1203 complete communication with each other through the bus 1204; the communication interface 1203 is configured to implement information transmission between related devices, such as a server-side device and a client-side device.
The processor 1201 is configured to call the computer program in the memory 1202, and the processor executes the computer program to implement all the steps in the alternative prompted answer generation method based on emotion analysis in the above embodiments, for example, the processor executes the computer program to implement the following steps:
step 100: generating high-dimensional word vectors according to a pre-generated emotion word and sentence set;
step 200: segmenting words of the emotion word and sentence set to generate a segmentation result;
step 300: and generating alternative prompt answers according to the high-dimensional word vectors and the word segmentation results.
Embodiments of the present application further provide a computer-readable storage medium capable of implementing all steps of the alternative prompted answer generation method based on emotion analysis in the foregoing embodiments, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the computer program implements all steps of the alternative prompted answer generation method based on emotion analysis in the foregoing embodiments, for example, when the processor executes the computer program, the processor implements the following steps:
step 100: generating high-dimensional word vectors according to a pre-generated emotion word and sentence set;
step 200: segmenting words of the emotion word and sentence set to generate word segmentation results;
step 300: and generating alternative prompt answers according to the high-dimensional word vectors and the word segmentation results.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the hardware + program class embodiment, since it is substantially similar to the method embodiment, the description is simple, and the relevant points can be referred to the partial description of the method embodiment.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Although the present application provides method steps as in embodiments or flowcharts, additional or fewer steps may be included based on routine or non-inventive labor. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. When an actual apparatus or client product executes, it may execute sequentially or in parallel (e.g., in the context of parallel processors or multi-threaded processing) according to the embodiments or methods shown in the figures.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principle and the implementation mode of the invention are explained by applying specific embodiments in the invention, and the description of the embodiments is only used for helping to understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (21)

1. A method for generating alternative prompt answers based on emotion analysis is characterized by comprising the following steps:
generating high-dimensional word vectors according to a pre-generated emotion word and sentence set;
segmenting words of the emotion word and sentence set to generate a segmentation result;
and generating alternative prompt answers according to the high-dimensional word vectors and the word segmentation results.
2. The alternative prompted answer generation method of claim 1, wherein the method of generating the set of emotion words and sentences comprises:
and expanding the pre-acquired initial word and sentence set to generate the emotion word and sentence set.
3. The alternative prompted answer generating method of claim 2, wherein said expanding a pre-obtained initial set of words and sentences to generate said set of emotional words and sentences comprises:
translating the emotion words and sentences having the first natural language format in the initial word and sentence set into emotion words and sentences having a second natural language format;
and translating the emotion words and sentences with the second natural language format back to the emotion words and sentences with the first natural language format so as to expand the initial words and sentences set.
4. The alternative prompted answer generating method of claim 2, wherein said expanding a pre-obtained initial set of words and sentences to generate said set of emotional words and sentences further comprises:
and randomly deleting some of the words in the emotion words and sentences in the initial word and sentence set, and randomly generating, from the emotion words and sentences after deletion, words and sentences similar to the emotion words and sentences before deletion, so as to expand the initial word and sentence set.
5. The alternative prompted answer generating method of claim 1, wherein the generating of high-dimensional word vectors according to the pre-generated emotion word and sentence sets comprises:
predicting the current central word of the emotion words and sentences according to the context in the emotion words and sentences in the emotion word and sentence set;
and generating the high-dimensional word vector according to the current central word and a pre-generated continuous bag-of-words model.
6. The alternative prompted answer generating method of claim 2, wherein the segmenting the set of emotion words and sentences to generate segmented results comprises:
and segmenting the emotion words in the emotion word and sentence set after the expansion to generate words and sentences with independent meanings.
7. The method according to claim 1, wherein the generating of the alternative prompted answer according to the high-dimensional word vector and the word segmentation result comprises:
and clustering the high-dimensional word vectors and the word segmentation result to generate the alternative prompt answer.
8. The method according to claim 7, wherein the clustering the high-dimensional word vector and the segmentation result to generate the alternative prompted answer comprises:
determining the number of clusters according to the positive and negative emotion attributes of the emotion words and sentences so as to determine each cluster;
randomly selecting an emotion word and sentence as an initial cluster center;
calculating the cosine distance from each emotion word to the center of the initial cluster class corresponding to the emotion word;
and determining the alternative prompt answers in the emotion word and sentence set according to the cosine distance.
9. The method for generating alternative prompted answers of claim 8, wherein said determining said alternative prompted answers in said set of emotion words and sentences according to said cosine distance comprises:
selecting a preset number of emotion words and sentences from the emotion words and sentences in the emotion word and sentence set according to the sequence of cosine distances from small to large so as to generate a training set;
training a prompt attribute-level emotion analysis model by using the training set;
and selecting the emotion words and sentences corresponding to the fastest gradient descent in the training process as the alternative prompt answers.
10. An alternative prompted answer generating device based on emotion analysis is characterized by comprising:
the high-dimensional word vector generation module is used for generating high-dimensional word vectors according to the emotion word and sentence sets generated in advance;
the word segmentation result generation module is used for segmenting words of the emotion word and sentence set to generate word segmentation results;
and the prompt answer generating module is used for generating alternative prompt answers according to the high-dimensional word vectors and the word segmentation results.
11. The alternative prompted answer generating apparatus according to claim 10, further comprising: a word and sentence set generating module for generating the emotion word and sentence set, wherein the word and sentence set generating module comprises:
and the word and sentence set generating unit is used for expanding a pre-acquired initial word and sentence set so as to generate the emotion word and sentence set.
12. The alternative prompted answer generating apparatus according to claim 11, wherein the sentence set generating unit includes:
the second word and sentence translation unit is used for translating the emotion words and sentences having the first natural language format in the initial word and sentence set into emotion words and sentences having a second natural language format;
and the first word and sentence translation unit is used for translating the emotion words and sentences with the second natural language format back to the emotion words and sentences with the first natural language format so as to expand the initial word and sentence set.
13. The alternative prompted answer generating apparatus according to claim 11, wherein the sentence set generating unit further comprises:
and the similar word and sentence generating unit is used for randomly deleting some of the words in the emotion words and sentences in the initial word and sentence set, and randomly generating, from the emotion words and sentences after deletion, words and sentences similar to the emotion words and sentences before deletion, so as to expand the initial word and sentence set.
14. The alternative prompted answer generating apparatus of claim 10, wherein the high-dimensional word vector generating module comprises:
the central word prediction unit is used for predicting the current central word of the emotion words and sentences according to the context in the emotion words and sentences in the emotion word and sentence set;
and the high-dimensional word vector generating unit is used for generating the high-dimensional word vector according to the current central word and a pre-generated continuous bag-of-words model.
15. The alternative prompted answer generating apparatus of claim 11, wherein the word segmentation result generating module comprises:
and the independent word and sentence generating module is used for segmenting the emotion words and sentences in the emotion word and sentence set after the emotion words and sentences are expanded so as to generate words and sentences with independent meanings.
16. The alternative prompted answer generating apparatus according to claim 10, wherein the prompted answer generating module comprises:
and the word and sentence clustering unit is used for clustering the high-dimensional word vectors and the word segmentation results to generate the alternative prompt answers.
17. The alternative prompted answer generating apparatus of claim 16, wherein the word and sentence clustering unit comprises:
the cluster number determining unit is used for determining the number of clusters according to the positive and negative emotion attributes of the emotion words and sentences so as to determine each cluster;
a cluster center selecting unit for randomly selecting an emotion word and sentence as an initial cluster center;
the cluster calculating unit is used for calculating the cosine distance from each emotion word to the center of the initial cluster corresponding to the emotion word in each cluster;
and the prompt answer determining unit is used for determining the alternative prompt answers in the emotion word and sentence set according to the cosine distance.
18. The alternative prompted answer generating apparatus according to claim 17, wherein the prompted answer determining unit includes:
the training combination generating unit is used for selecting a preset number of emotion words and sentences from the emotion words and sentences in the emotion word and sentence set according to the sequence of the cosine distance from small to large so as to generate a training set;
the model training unit is used for training a prompt attribute level emotion analysis model by using the training set;
and the gradient selection unit is used for selecting the emotion words and sentences corresponding to the fastest gradient decrease in the training process as the alternative prompt answers.
19. A computer program product comprising computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the steps of the emotion analysis based alternative prompted answer generation method of any of claims 1 to 10.
20. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method for generating alternative prompted answers according to any one of claims 1 to 10 when the program is executed.
21. A computer-readable storage medium, on which a computer program is stored, the computer program, when being executed by a processor, implementing the steps of the alternative prompted answer generation method based on emotion analysis according to any one of claims 1 to 10.
CN202211392664.8A 2022-11-08 2022-11-08 Alternative prompt answer generation method and device based on emotion analysis Pending CN115688766A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211392664.8A CN115688766A (en) 2022-11-08 2022-11-08 Alternative prompt answer generation method and device based on emotion analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211392664.8A CN115688766A (en) 2022-11-08 2022-11-08 Alternative prompt answer generation method and device based on emotion analysis

Publications (1)

Publication Number Publication Date
CN115688766A true CN115688766A (en) 2023-02-03

Family

ID=85049629

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211392664.8A Pending CN115688766A (en) 2022-11-08 2022-11-08 Alternative prompt answer generation method and device based on emotion analysis

Country Status (1)

Country Link
CN (1) CN115688766A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination