CN114239575A - Statement analysis model construction method, statement analysis method, device, medium and computing equipment - Google Patents
Statement analysis model construction method, statement analysis method, device, medium and computing equipment
- Publication number
- CN114239575A (application CN202111565998.6A)
- Authority
- CN
- China
- Prior art keywords
- analysis model
- statement
- sentence
- word
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The embodiments of the invention provide a sentence analysis model construction method, a sentence analysis method, an apparatus, a medium and a computing device. The construction method comprises the following steps: obtaining a statement sample and a result label corresponding to the statement sample; inputting the statement sample into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, wherein the statement analysis model is constructed based on a text generation task; calculating an optimization loss based on the probability; and adjusting the statement analysis model based on the optimization loss. Because the sentence analysis model is built on a text generation task, the classification task of the prior art is converted into a generation task; even when several different emotion analysis tasks must be executed, a separate classification head does not need to be maintained for each task, which remarkably reduces the storage burden, and emotions that did not appear during training can still be analyzed, giving the model stronger generalization capability.
Description
Technical Field
The embodiments of the invention relate to the field of natural language processing, and in particular to a sentence analysis model construction method, a sentence analysis method, an apparatus, a medium and a computing device.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
Today, the internet has become an important carrier of information. In recent years especially, with the rise of e-commerce and social networking sites, a great number of comments have appeared on the internet, and these generally contain or imply certain emotions, either explicitly or implicitly. Studying the emotional information conveyed by such text helps in understanding and analyzing people's attitudes towards a thing or event; for example, text conveying emotional words such as "happy" or "pleased" expresses support for the thing or event, while text conveying emotional words such as "angry" or "annoyed" expresses opposition. Against this background, sentence analysis has become a research topic receiving more and more attention in the field of natural language processing.
At present, a classification model is generally adopted to analyze the emotion contained in a sentence: each specific emotion analysis task needs to maintain its own classification head, and as the number of tasks increases, the number of classification heads grows accordingly, which increases the storage burden. In addition, a prior-art classification head can only classify the emotions of its specific emotion analysis task, that is, it can only classify the emotions that were included in the training data used to train it, and therefore has no generalization ability.
Disclosure of Invention
In this context, embodiments of the present invention are intended to provide a sentence analysis model construction method, a sentence analysis method, an apparatus, a medium, and a computing device.
In a first aspect of the embodiments of the present invention, there is provided a method for constructing a statement analysis model, including:
obtaining a statement sample and a result label corresponding to the statement sample;
inputting the statement sample into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, wherein the statement analysis model is constructed based on a text generation task;
calculating an optimization loss based on the probability;
adjusting the statement analysis model based on the optimization loss.
In an embodiment of this embodiment, before the sentence sample is input into a sentence analysis model and processed, the method further includes:
performing word segmentation processing on the statement sample to obtain a word sequence of the statement sample;
inputting the statement sample into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, wherein the probability comprises the following steps:
and inputting the word sequence into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the word sequence.
In an embodiment of the present invention, performing word segmentation processing on the sentence sample to obtain a word sequence of the sentence sample includes:
converting the statement sample and the corresponding result label into a formatted statement sample according to a preset form;
and performing word segmentation processing on the formatted sentence samples to obtain word sequences of the formatted sentence samples.
In an embodiment of the present invention, the converting the statement sample and the corresponding result label into a formatted statement sample according to a preset form includes:
and converting the statement sample and the corresponding result label into a question-answer statement according to a preset form, wherein the statement sample is converted into a question, the result label is converted into an answer, and the answer is natural language.
In an embodiment of the present invention, the inputting the word sequence into a sentence analysis model for processing to obtain at least a probability that the sentence analysis model generates a corresponding result tag in a preset word list based on the word sequence includes:
inputting the word sequence into a sentence analysis model to obtain the probability that the sentence analysis model generates an ith word in a preset word list based on all words before the ith word in the word sequence, wherein the word sequence comprises n words and i ∈ [2, n];
and generating the probability of the corresponding result label in a preset word list based on the word sequence based on the statement analysis model.
In an embodiment of this embodiment, the calculating the optimization loss based on the probability includes:
calculating a plurality of corresponding optimization losses according to each probability with which the statement analysis model generates each word and the label result in the preset word list;
and combining a plurality of optimization losses, and optimizing and adjusting the statement analysis model.
In an embodiment of the present invention, the inputting the word sequence into a sentence analysis model for processing includes:
converting all words in the word sequence into corresponding word vectors;
and inputting the word vector into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the word sequence.
In one embodiment of the present embodiment, the type of the result tag includes one of an emotion tag and an intention tag.
In a second aspect of the embodiments of the present invention, there is provided a sentence analysis method, including:
constructing a statement analysis model by adopting the statement analysis model construction method in any one of the first aspect;
and inputting the sentence to be analyzed into the sentence analysis model, and generating the emotion of the sentence to be analyzed by the sentence analysis model.
In an embodiment of the present invention, the inputting the sentence to be analyzed into the sentence analysis model includes:
performing word segmentation processing on the sentence to be analyzed to obtain a word sequence of the sentence to be analyzed;
coding each word in the word sequence to obtain a word vector of each word;
and sequentially inputting the word vectors of all words into the sentence analysis model, and generating the emotion of the sentence to be analyzed by the sentence analysis model.
In a third aspect of the embodiments of the present invention, there is provided a sentence analysis model construction apparatus, including:
the preprocessing module is configured to obtain a statement sample and a result label corresponding to the statement sample;
the processing module is configured to input the statement sample into a statement analysis model for processing, and at least obtain the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, wherein the statement analysis model is constructed based on a text generation task; and
calculating an optimization loss based on the probability;
a model building module configured to adjust the statement analysis model based on the optimization loss.
In an embodiment of this embodiment, the preprocessing module is further configured to perform word segmentation on the sentence sample to obtain a word sequence of the sentence sample;
the processing module is further configured to input the word sequence into a sentence analysis model for processing, and obtain at least a probability that the sentence analysis model generates a corresponding result tag in a preset word list based on the word sequence.
In one embodiment of this embodiment, the preprocessing module at least includes:
the conversion unit is configured to convert the statement sample and the corresponding result label into a formatted statement sample according to a preset form;
and the word segmentation unit is configured to perform word segmentation processing on the formatted sentence samples to obtain a word sequence of the formatted sentence samples.
In an embodiment of the present invention, the conversion unit is further configured to convert the sentence sample and the corresponding result tag into a question-and-answer sentence according to a preset form, where the sentence sample is converted into a question, the result tag is converted into an answer, and the answer is a natural language.
In an embodiment of this embodiment, the processing module includes:
a word vector generation unit configured to convert all words in the word sequence into corresponding word vectors;
and the probability calculation unit is configured to input the word vector into a statement analysis model for processing, and at least obtain the probability that the statement analysis model generates a corresponding result label in a preset word list based on the word sequence.
In one embodiment of this embodiment, the processing module includes:
a sentence generation probability obtaining unit configured to input the word sequence into a sentence analysis model to obtain a probability that the sentence analysis model generates an ith word in a preset word list based on all words before the ith word in the word sequence, wherein the word sequence includes n words and i ∈ [2, n];
an emotion generation probability acquisition unit configured to acquire a probability that a corresponding result tag is generated in a preset word list based on the word sequence based on the sentence analysis model;
the optimization loss calculation unit is configured to calculate a plurality of corresponding optimization losses according to each probability with which the statement analysis model generates each word and the label result in the preset word list;
the model building module is further configured to combine a plurality of optimization penalties to optimize and adjust the statement analysis model.
In one embodiment of the present embodiment, the type of the result tag includes one of an emotion tag and an intention tag.
In a fourth aspect of the embodiments of the present invention, there is provided a sentence analysis apparatus that constructs a sentence analysis model using the sentence analysis model construction apparatus according to any one of the third aspects, including:
and the sentence analysis module is configured to input the sentence to be analyzed into the sentence analysis model, and the emotion of the sentence to be analyzed is generated by the sentence analysis model.
In an embodiment of this embodiment, the statement analysis module includes:
the word segmentation unit is configured to perform word segmentation processing on the sentence to be analyzed to obtain a word sequence of the sentence to be analyzed;
the encoding unit is configured to encode each word in the word sequence to obtain a word vector of each word;
and the analysis unit is configured to input word vectors of all words into the sentence analysis model in sequence, and the emotion of the sentence to be analyzed is generated by the sentence analysis model.
In a fifth aspect of embodiments of the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, is capable of implementing the method of any one of the first and second aspects.
In a sixth aspect of embodiments of the present invention, there is provided a computing device comprising: a processor; a memory for storing the processor-executable instructions; the processor configured to perform the method of any one of the first and second aspects.
According to the statement analysis model construction method, the statement analysis method, the apparatus, the medium and the computing device provided above, the statement analysis model construction method comprises the following steps: obtaining a statement sample and a result label corresponding to the statement sample; performing word segmentation processing on the statement sample to obtain a word sequence of the statement sample; inputting the word sequence into a sentence analysis model for processing, and at least obtaining the probability that the sentence analysis model generates a corresponding result label in a preset word list based on the word sequence, wherein the sentence analysis model is constructed based on a text generation task; calculating an optimization loss based on the probability; and adjusting the statement analysis model based on the optimization loss. Because the sentence analysis model is constructed based on a text generation task, the classification task of the prior art is converted into a generation task, and the text analysis model constructed based on the text generation task can model the process by which the sentence to be analyzed generates a result label from the word list, so the trained sentence analysis model can generate the corresponding result label for any sentence to be analyzed. Even if a plurality of different emotion analysis tasks need to be executed, a separate classification head does not need to be maintained for each task, which significantly reduces the storage burden and brings a better experience to the user. In addition, because the emotion or intention of the text is analyzed based on a text generation task, emotions or intentions that were not included in the training data can also be analyzed, giving the model stronger generalization capability.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
FIG. 1 is a diagram schematically illustrating a conventional analysis of emotion of a sentence by a classification model;
FIG. 2 schematically shows a diagram of a training sentence analysis model and a test sentence analysis model, according to one embodiment of the invention;
FIG. 3 is a flow chart diagram schematically illustrating a method for constructing a statement analysis model according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a sentence analysis model building apparatus according to an embodiment of the present invention;
FIG. 5 schematically shows a schematic of the structure of a medium according to an embodiment of the invention;
fig. 6 schematically shows a structural diagram of a computing device according to an embodiment of the present invention.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present invention will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the invention, and are not intended to limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present invention may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to the embodiment of the invention, a sentence analysis model construction method, a sentence analysis device, a medium and a computing device are provided.
Moreover, any number of elements in the drawings are by way of example and not by way of limitation, and any nomenclature is used solely for differentiation and not by way of limitation.
The principles and spirit of the present invention are explained in detail below with reference to several representative embodiments of the invention.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making. Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies; at the software level, it mainly includes computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It specializes in studying how a computer can simulate or implement human learning behaviour in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning and formal learning.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between humans and computers using natural language. Natural language processing is a science that integrates linguistics, computer science and mathematics. Research in this field therefore involves natural language, i.e. the language that people use every day, so it is closely related to the study of linguistics. NLP techniques typically include text processing, semantic understanding, machine translation, question answering and knowledge graphs.
Based on the technologies such as natural language processing and machine learning mentioned in the above artificial intelligence technology, the embodiment of the present application provides a method for analyzing emotion or intention contained in an input sentence based on the input sentence.
Summary of The Invention
The inventor finds that the existing text emotion analysis model based on the pre-training technology generally follows the following flow:
First, a language-model loss function is used to train on a large-scale unsupervised corpus to obtain a pre-trained model; then, for a specific downstream task (such as an emotion classification task in a specific scenario), the classification head required for classification is added on top of the pre-trained model weights, and the parameters of the whole model are fine-tuned.
The prior art mainly has the following defects:
1. In the model pre-training stage, the prior art often uses only unsupervised natural sentences as the pre-training corpus, and the pre-training loss function is only an Auto-Regressive Language Model loss function or a Masked Language Model loss function. When these pre-trained models are applied to downstream tasks, the gap between the pre-training phase and the fine-tuning phase is large, because the loss function used in the pre-training phase is not unified with the loss function used in the fine-tuning phase, which is a cross-entropy loss over the tag-set space. The traditional approach therefore has to discard the language model prediction head (Language Model Head) of the pre-training phase and re-initialize a classification head for the intended classification during the fine-tuning phase.
2. For each specific downstream task, a classification head related to that task needs to be added, so that as the number of tasks increases, the number of classification heads grows accordingly, which increases the storage burden. For example, as shown in FIG. 1, a model serving two emotion classification tasks needs two classification heads. Specifically, for two emotion classification data sets D1 and D2 (i.e., two emotion classification tasks), where data set D1 contains 5 emotion tags in total and data set D2 contains 3 emotion tags in total, a conventional method needs to maintain a 5-way classification head for data set D1 and a 3-way classification head for data set D2.
3. For each specific downstream task, a specific optimization function needs to be designed, and all the downstream tasks cannot be processed by using a uniform optimization target.
In order to solve the above problems discovered by the inventors, the inventors consider that the classification task can be converted into a generation task. That is, assuming that one sentence sample corresponds to one emotion result label, the occurrence probability of the sentence sample can first be subjected to conditional probability decomposition: the sentence sample is first converted into a word sequence (token sequence) by a word segmenter (tokenizer), and it is assumed that there are n words in the sequence: x1, x2, …, xn. The joint probability distribution P(x1, x2, …, xn) of these n words can then be decomposed into the following form:
P(x1,x2,…,xn)=P(x1)*P(x2|x1)*P(x3|x1,x2)*P(x4|x1,x2,x3)…P(xn|x1,x2,…,xn-1)
Therefore, a neural network model such as a transformer can be used to approximate each conditional probability distribution on the right-hand side of the above formula, namely P(xi|x1, x2, …, xi-1), so as to completely construct the mapping relation between the statement sample and the result label.
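For illustration only, this decomposition can be written as a short sketch; `cond_prob` is a hypothetical callable standing in for whatever autoregressive model supplies each conditional probability (it is not part of the embodiment itself):

```python
import math

def sentence_log_prob(words, cond_prob):
    """Decompose P(x1, x2, ..., xn) into a product of conditional probabilities.

    `cond_prob(prefix, word)` is assumed to be supplied by an autoregressive
    model (for example a transformer language model) and to return
    P(word | prefix). Log-probabilities are summed instead of multiplying raw
    probabilities to avoid numerical underflow.
    """
    log_p = 0.0
    for i, word in enumerate(words):
        prefix = words[:i]                        # all words before the i-th word
        log_p += math.log(cond_prob(prefix, word))
    return log_p

# toy usage with a hypothetical uniform model over a 5-word vocabulary
print(sentence_log_prob(["x1", "x2", "x3"], lambda prefix, word: 0.2))
```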
Therefore, a general statement analysis model pre-training scheme can be provided. As shown in fig. 2, the pre-training scheme can serve as a base for various text classification tasks; for different downstream tasks, the pre-trained base constructed based on this method can be fine-tuned using a unified task form, and a classification head does not need to be maintained for each downstream task.
Having described the general principles of the invention, various non-limiting embodiments of the invention are described in detail below.
Exemplary method
A method for construction of a sentence analysis model according to an exemplary embodiment of the present invention is described below with reference to fig. 3.
The embodiment of the invention provides a method for constructing a statement analysis model, which comprises the following steps:
step S110, obtaining a statement sample and a result label corresponding to the statement sample;
in an embodiment of the present invention, the sentence sample may be from a real sentence in the real world or a sentence generated by a digital technology such as a text generation model, which is not limited in this embodiment, and the sentence sample is a complete sentence, for example, "today's weather is really good. "accordingly," today's weather is really good. The result label of "may be" happy, "which is a word indicating happy mood. It is understood that the result labels may not be emotions expressed in natural language, for example, emotions expressed in other forms (indexes, keys, etc.), as long as in the subsequent training step, the input and output correspondence is converted into natural language, that is, the training data input into the sentence analysis model are all natural language sentences.
Next, step S120 is executed, the statement sample is input into a statement analysis model for processing, and at least a probability that the statement analysis model generates a corresponding result tag in a preset word list based on the statement sample is obtained, where the statement analysis model is constructed based on a text generation task;
In this embodiment, the sentence analysis model is constructed based on a text generation task; that is, the original classification task is converted into a generation task, and classifying the emotion of a sentence sample is converted into generating the corresponding emotion from the sentence sample. In the text generation task, when training the corresponding model, training data in natural-language form needs to be converted into vector form.
Thus, in an embodiment of the present embodiment, before the sentence sample is input into a sentence analysis model for processing, the method further includes performing word segmentation processing on the sentence sample to obtain a word sequence of the sentence sample;
In an embodiment of the present invention, before the sentence sample is input into the sentence analysis model for training, word segmentation processing is performed on the sentence sample. The word segmentation processing converts the sentence sample from a complete sentence into a word sequence; for example, "The weather is really good today." can be converted by a tokenizer into the word sequence {today, weather, really, good}.
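A minimal word-segmentation sketch follows; the jieba tokenizer is only an illustrative assumption, as the embodiment does not prescribe any particular segmenter:

```python
import jieba  # assumed tokenizer; any word segmenter can be substituted

sample = "今天天气真好。"              # "The weather is really good today."
word_sequence = jieba.lcut(sample)     # e.g. ['今天', '天气', '真', '好', '。']
print(word_sequence)
```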
In another embodiment of this embodiment, in order to facilitate the statement analysis model to model a relationship between a statement sample and a corresponding result tag, the statement sample and the corresponding result tag are converted into an integrated statement, so that the statement analysis model processes the statement sample and the corresponding tag as a whole, and the corresponding result tag is generated according to the statement sample, specifically, the statement sample is subjected to word segmentation to obtain a word sequence of the statement sample, where the method includes:
converting the statement sample and the corresponding result label into a formatted statement sample according to a preset form, wherein the last word of the formatted statement sample is the natural-language representation of the result label;
in this embodiment, the preset form may be a dialogue form, a question-and-answer form, or another form that may integrate the sentence sample and the corresponding result tag into a whole;
specifically, in an embodiment of the present invention, converting a sentence sample and a corresponding result tag into a question-and-answer form includes:
converting the statement sample and the corresponding result label into a question-answer statement according to a preset form, wherein the statement sample is converted into a question, the result label is converted into an answer, and the answer is natural language;
For example, a sentence sample and result label pair of "The weather is really good today." and "happy" can be converted in the above-mentioned manner into:
the question-answer sentence "The weather is really good today. What is the emotion of this sentence? Happy".
Or, for example, may be converted into a conversational format:
"today the weather is really good. I feel you happy. "
Or, for example, can be directly converted into:
"today the weather is really good and happy. "
It should be noted that in the above examples the result tag happens to be expressed directly in the natural-language word "happy"; in a real production environment the result tag may not be expressed directly in natural language. In that case, the result tag expressed in a non-natural-language form first needs to be converted into natural-language form, and the formatted sentence sample in the preset form is then constructed in combination with the corresponding sentence sample.
And performing word segmentation processing on the formatted sentence samples to obtain word sequences of the formatted sentence samples.
In the embodiment of the present invention, after the statement sample and the corresponding result tag are converted into the formatted statement sample according to the preset form, the steps in the previous embodiment still need to be executed to perform word segmentation processing on the formatted statement sample and obtain the corresponding word sequence. Specifically, the word segmentation processing may be performed by a tokenizer; the word segmentation process and its result are similar to those in the previous embodiment and are not described again in this embodiment.
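As a hedged sketch of this conversion-plus-segmentation step, the template strings below merely mirror the question-answer, dialogue and direct-concatenation examples given above and are not the only possible preset forms:

```python
import jieba

def format_and_segment(sentence, label, template="question_answer"):
    """Build a formatted sentence sample from a sample/label pair, then segment it.

    The label is assumed to already be in natural-language form ("开心" / "happy");
    labels in other forms (indexes, keys, ...) would first be mapped to words.
    """
    if template == "question_answer":
        formatted = f"{sentence}这句话的情感是什么？{label}"
    elif template == "dialogue":
        formatted = f"{sentence}我觉得你很{label}。"
    else:  # direct concatenation
        formatted = f"{sentence}{label}。"
    return jieba.lcut(formatted)   # word sequence of the formatted sample

print(format_and_segment("今天天气真好。", "开心"))
```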
It should be noted that, in another embodiment of the present application, the sentence analysis model may include a preprocessing module, which converts the sentence samples and/or result tags into corresponding word sequences, for example, a word segmenter may be preset in the sentence analysis model.
That is to say, this embodiment includes two statement analysis models with different structures: one statement analysis model can directly accept a complete statement as input, and its preprocessing module then performs word segmentation processing on the complete statement; the other statement analysis model cannot directly accept a complete statement as input, and requires the complete statement to be segmented into a word sequence first, after which it processes the word sequence. It can be understood that the word segmentation processing and word sequences in the present embodiment do not only refer to words of two or more characters but also to single characters; that is, the word segmentation processing may also be character segmentation processing, and the word sequence may also be a character sequence.
After introducing how to obtain the word sequence of the sentence sample and/or result label, the word sequence is input into a sentence analysis model for processing, and the processing comprises the following steps:
converting all words in the word sequence into corresponding word vectors;
In this embodiment, the method for converting all words in the word sequence into corresponding word vectors may be any one of a neural network, dimensionality reduction of a word co-occurrence matrix, a probabilistic model, or an interpretable knowledge-base method. For example, in an embodiment of this embodiment, any one of the following neural network language models may be used to convert the word sequence into the word vector of each word:
1. Neural Network Language Model (NNLM)
2. Log-Bilinear Language Model (LBL)
3. Recurrent Neural Network based Language Model (RNNLM)
4. The C&W model proposed by Collobert and Weston in 2008
5. The CBOW (Continuous Bag-of-Words) and Skip-gram models proposed by Mikolov et al.
It should be noted that, in the above embodiments, the sentence analysis model provided in this application does not have a function of converting words in a natural language form into word vectors, that is, the sentence analysis model can accept inputs in a word vector form.
In another embodiment of this embodiment, the sentence analysis model may also integrate the word vector generation function, that is, include some modules that adopt the above word vector conversion method, and thus, the input of the sentence analysis model may also be a sentence in a natural language form.
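As one concrete possibility (a plain trainable embedding table rather than any of the specific models listed above), word vectors might be produced roughly as follows; the vocabulary and dimension below are illustrative assumptions only:

```python
import torch
import torch.nn as nn

vocab = {"<unk>": 0, "今天": 1, "天气": 2, "真": 3, "好": 4, "。": 5, "开心": 6}
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=128)

def words_to_vectors(words):
    """Map a word sequence to a (len(words), 128) tensor of word vectors."""
    ids = torch.tensor([vocab.get(w, vocab["<unk>"]) for w in words])
    return embedding(ids)

vectors = words_to_vectors(["今天", "天气", "真", "好", "。"])
```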
After the word vectors of all words of the word sequence are obtained, the word vectors are input into a statement analysis model to be processed, and at least the probability that the statement analysis model generates corresponding result labels in a preset word list based on the word sequence is obtained.
In an embodiment of the present invention, it is assumed that the sentence analysis model is constructed based on a transformer model. After the word vector of each word of the word sequence of the sentence sample is input into the sentence analysis model, the stacked transformer blocks in the sentence analysis model compute a representation vector of the last word of the word sequence; the representation vector is then input into a linear mapping layer and mapped into an N-dimensional vector, where N is the size of the vocabulary (which may be a set of words constructed from a dictionary and containing all words that represent emotions); the N-dimensional vector is then input into a softmax layer to obtain a probability distribution over the preset word list, from which the probability of the natural-language word corresponding to the emotion tag is determined. In this embodiment, the sentence analysis model is equivalent to establishing a mapping between the complete sentence sample as a whole and the result tag. The input of the sentence analysis model may be only the sentence sample (the word vector of each word of its word sequence); in that case the representation vector of the last word of the sentence sample is determined and mapped to the probability, in the preset word list, of the natural-language word corresponding to the result tag. Alternatively, the input may be the formatted sentence sample (the word vector of each word of its word sequence) into which the sentence sample and the result tag have been converted; in that case the representation vector of the second-to-last word of the formatted sentence sample is determined and mapped to the probability, in the preset word list, of the last word of the formatted sentence sample (the natural-language word corresponding to the result tag).
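A sketch of the step just described, assuming a model that maps token ids to logits of shape (1, sequence length, vocabulary size), i.e. stacked transformer blocks followed by a linear mapping onto the preset word list; the function name and interface are illustrative, not prescribed by the embodiment:

```python
import torch
import torch.nn.functional as F

def label_probability(model, input_ids, label_id):
    """Probability that the model generates the result label after the input words.

    `model(input_ids)` is assumed to return logits of shape
    (1, seq_len, vocab_size); the logits at the last position are mapped by
    softmax to a probability distribution over the preset word list, from
    which the probability of the label word is read off.
    """
    with torch.no_grad():
        logits = model(input_ids)             # (1, seq_len, vocab_size)
    probs = F.softmax(logits[0, -1], dim=-1)  # distribution over the preset word list
    return probs[label_id].item()
```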
In another embodiment of this embodiment, in addition to modeling the relationship between the sentence sample and the corresponding result tag, the sentence analysis model may also model the generation probability of each word in the sentence sample; in this case the processing may further include:
inputting the word sequence into a sentence analysis model to obtain the probability that the sentence analysis model generates an ith word in a preset word list based on all words before the ith word in the word sequence, wherein the word sequence comprises n words and i ∈ [2, n];
in this embodiment, the sentence analysis model will calculate the generation probability of each word in the sentence sample in the preset vocabulary, that is, assuming that the word sequence of the sentence sample Y is { x1, x2, …, xn }, and xn represents the nth word in the sentence sample Y, then the sentence analysis model will calculate:
P(x2|x1),P(x3|x1,x2),P(x4|x1,x2,x3)…P(xn|x1,x2,…,xn-1)。
and the statement analysis model generates the probability of the corresponding result label in a preset word list based on the word sequence.
In this embodiment, the generating probability of each word in the preset word list is calculated in a similar manner as described in the previous embodiment, and is not described herein again.
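For concreteness, a sketch of computing these per-word generation probabilities in a single forward pass, under the same assumed model interface as in the previous sketch:

```python
import torch
import torch.nn.functional as F

def token_probabilities(model, input_ids):
    """Return P(x_i | x_1..x_{i-1}) for i = 2..n.

    The logits at position i-1 are assumed to predict the token at position i,
    as in a standard autoregressive (left-to-right) language model.
    """
    with torch.no_grad():
        logits = model(input_ids)                        # (1, n, vocab_size)
    log_probs = F.log_softmax(logits[:, :-1], dim=-1)    # predictions for x_2..x_n
    targets = input_ids[:, 1:].unsqueeze(-1)             # the words actually observed
    picked = log_probs.gather(-1, targets).squeeze(-1)   # (1, n-1)
    return picked.exp()
```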
Next, step S140 is executed to calculate an optimization loss based on the probability;
In the present embodiment, the optimization loss needs to be calculated according to the number of probability distributions obtained in step S130. In an embodiment of the present embodiment, if only the probability that the sentence analysis model generates the corresponding result label in the preset word list based on the word sequence is obtained in step S130, a cross-entropy loss may be calculated from that single probability and used as the optimization loss.
In another embodiment of the present invention, if the step S130 obtains the generation probability of each word in the sentence sample in the preset word list and the probability of the sentence analysis model generating the corresponding result tag in the preset word list based on the word sequence, a cross entropy loss may be calculated according to each probability, and then all the cross entropy losses are combined as the optimization loss.
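One possible way of combining the losses is sketched below: a cross-entropy term for every generated position of the formatted sample, averaged into a single optimization loss. Averaging is only one reasonable combination and is an assumption of this sketch, not a requirement of the method:

```python
import torch
import torch.nn.functional as F

def optimization_loss(model, input_ids):
    """Cross-entropy over every position of the formatted sample.

    In a formatted sample the last word is the result label, so this single
    term already contains both the per-word generation losses and the loss
    for generating the label; the mean is one possible way of combining them.
    """
    logits = model(input_ids)                                  # (1, n, vocab_size)
    shift_logits = logits[:, :-1].reshape(-1, logits.size(-1)) # predict x_2..x_n
    shift_targets = input_ids[:, 1:].reshape(-1)
    return F.cross_entropy(shift_logits, shift_targets)        # mean cross-entropy
```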
Finally, step S150 is executed to adjust the statement analysis model based on the optimization loss.
In various embodiments of the present disclosure, the type of the result tag includes one of an emotion tag and an intention tag. That is, the sentence analysis model trained in the present application may be applied to an application scenario of emotion analysis or intention analysis, which is not limited in this embodiment.
In the sentence analysis model construction method provided in the embodiments of this embodiment, the sentence analysis model is constructed based on a text generation task, converting the classification task of the prior art into a generation task. A text analysis model constructed based on a text generation task can model the process by which the sentence to be analyzed generates a result tag from the word list, so the trained sentence analysis model can generate the corresponding result tag for any sentence to be analyzed. Even if a plurality of different emotion analysis tasks need to be executed, a separate classification head does not need to be maintained for each task, which significantly reduces the storage burden and brings a better experience to the user.
In addition, when the sentence analysis model constructed according to the embodiments of this embodiment is used as the base model of an existing emotion analysis system, the base model can be used directly for zero-shot prediction without a task-specific data set; furthermore, where a task-specific data set can be collected, the base model can be fine-tuned on that data set, resulting in better performance on it.
In general, the statement analysis model construction method provided by the embodiments of this embodiment can unify different tasks into a single task form (i.e., text generation), and no additional weights need to be maintained for a brand-new task or data set, which saves storage overhead. The unified task form can also make better use of the capability of the pre-trained model and reduce the gap between pre-training and fine-tuning, thereby achieving better performance. In addition, because the embodiments of this embodiment construct the sentence analysis model with a text generation task, the model can analyze emotions or intentions that were not included in the training data and has stronger generalization capability.
In another aspect of this embodiment, a statement analysis method is further provided, including:
constructing a statement analysis model by adopting the construction method of the statement analysis model;
and inputting the sentence to be analyzed into the sentence analysis model, and generating the emotion of the sentence to be analyzed by the sentence analysis model.
In an embodiment of the present invention, the inputting the sentence to be analyzed into the sentence analysis model includes:
performing word segmentation processing on the sentence to be analyzed to obtain a word sequence of the sentence to be analyzed;
coding each word in the word sequence to obtain a word vector of each word;
and sequentially inputting the word vectors of all words into the sentence analysis model, and generating the emotion of the sentence to be analyzed by the sentence analysis model.
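Putting these inference steps together, a hedged sketch follows; jieba, the vocabulary dictionaries and greedy single-step decoding are all illustrative assumptions rather than requirements of the method:

```python
import torch
import torch.nn.functional as F
import jieba

def analyze_sentence(model, vocab, id_to_word, sentence):
    """Generate the emotion of a sentence to be analyzed.

    The sentence is segmented, mapped to token ids, and fed to the trained
    sentence analysis model; the word of the preset word list with the highest
    probability at the last position is returned as the generated emotion.
    """
    words = jieba.lcut(sentence)
    ids = torch.tensor([[vocab.get(w, vocab["<unk>"]) for w in words]])
    with torch.no_grad():
        logits = model(ids)                  # (1, len(words), vocab_size)
    probs = F.softmax(logits[0, -1], dim=-1)
    return id_to_word[int(probs.argmax())]
```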
Exemplary devices
Having described the method of the exemplary embodiment of the present invention, next, a sentence analysis model construction apparatus of an exemplary embodiment of the present invention will be described with reference to fig. 4, including:
a preprocessing module 410 configured to obtain a statement sample and a result tag corresponding to the statement sample;
a processing module 420 configured to input the statement sample into a statement analysis model for processing, so as to obtain at least a probability that the statement analysis model generates a corresponding result tag in a preset word list based on the statement sample, where the statement analysis model is constructed based on a text generation task; and
calculating an optimization loss based on the probability;
a model building module 430 configured to adjust the statement analysis model based on the optimization loss.
In an embodiment of this embodiment, the preprocessing module 410 is further configured to perform word segmentation on the sentence sample to obtain a word sequence of the sentence sample;
the processing module 420 is further configured to input the word sequence into a sentence analysis model for processing, so as to obtain at least a probability that the sentence analysis model generates a corresponding result tag in a preset word list based on the word sequence.
In an embodiment of this embodiment, the preprocessing module 410 at least includes:
the conversion unit is configured to convert the statement sample and the corresponding result label into a formatted statement sample according to a preset form;
and the word segmentation unit is configured to perform word segmentation processing on the formatted sentence samples to obtain a word sequence of the formatted sentence samples.
In an embodiment of the present invention, the conversion unit is further configured to convert the sentence sample and the corresponding result tag into a question-and-answer sentence according to a preset form, where the sentence sample is converted into a question, the result tag is converted into an answer, and the answer is a natural language.
In an embodiment of this embodiment, the processing module 420 includes:
a word vector generation unit configured to convert all words in the word sequence into corresponding word vectors;
and the probability calculation unit is configured to input the word vector into a statement analysis model for processing, and at least obtain the probability that the statement analysis model generates a corresponding result label in a preset word list based on the word sequence.
In an embodiment of this embodiment, the processing module 420 includes:
a sentence generation probability obtaining unit configured to input the word sequence into a sentence analysis model to obtain a probability that the sentence analysis model generates an ith word in a preset word list based on all words before the ith word in the word sequence, wherein the word sequence includes n words and i ∈ [2, n];
an emotion generation probability acquisition unit configured to acquire a probability that a corresponding result tag is generated in a preset word list based on the word sequence based on the sentence analysis model;
the optimization loss calculation unit is configured to calculate a plurality of corresponding optimization losses according to each probability with which the statement analysis model generates each word and the label result in the preset word list;
the model building module 430 is further configured to combine a plurality of optimization losses to optimize and adjust the statement analysis model.
In one embodiment of the present embodiment, the type of the result tag includes one of an emotion tag and an intention tag.
In a further aspect of the embodiments of the present invention, there is provided a sentence analyzing apparatus which constructs a sentence analyzing model using the sentence analyzing model constructing apparatus as described above, including:
and the sentence analysis module is configured to input the sentence to be analyzed into the sentence analysis model, and the emotion of the sentence to be analyzed is generated by the sentence analysis model.
In an embodiment of this embodiment, the statement analysis module includes:
the word segmentation unit is configured to perform word segmentation processing on the sentence to be analyzed to obtain a word sequence of the sentence to be analyzed;
the encoding unit is configured to encode each word in the word sequence to obtain a word vector of each word;
and the analysis unit is configured to input word vectors of all words into the sentence analysis model in sequence, and the emotion of the sentence to be analyzed is generated by the sentence analysis model.
Exemplary Medium
Having described the method and apparatus of the exemplary embodiments of this invention, a computer-readable storage medium of the exemplary embodiments of this invention is next described with reference to fig. 5. Fig. 5 shows a computer-readable storage medium in the form of an optical disc 50, on which a computer program (i.e., a program product) is stored; when executed by a processor, the computer program implements the steps described in the above method embodiments, for example: obtaining a sentence sample and a result tag corresponding to the sentence sample; inputting the statement sample into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, wherein the statement analysis model is constructed based on a text generation task; calculating an optimization loss based on the probability; and adjusting the statement analysis model based on the optimization loss. The specific implementation of each step is not repeated here.
It should be noted that examples of the computer-readable storage medium may also include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory, or other optical and magnetic storage media, which are not described in detail herein.
Exemplary computing device
Having described the method, medium, and apparatus of exemplary embodiments of the present invention, a computing device for text analysis of exemplary embodiments of the present invention is next described with reference to the drawings.
FIG. 6 illustrates a block diagram of an exemplary computing device 60 suitable for use in implementing embodiments of the present invention; the computing device 60 may be a computer system or a server. The computing device 60 shown in FIG. 6 is only one example and should not limit the scope of use or functionality of embodiments of the present invention.
As shown in fig. 6, components of computing device 60 may include, but are not limited to: one or more processors or processing units 601, a system memory 602, and a bus 603 that couples various system components including the system memory 602 and the processing unit 601.
The system memory 602 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)6021 and/or cache memory 6022. Computing device 60 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, ROM6023 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 6, but typically referred to as a "hard disk drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 603 by one or more data media interfaces. At least one program product may be included in system memory 602 with a set (e.g., at least one) of program modules configured to perform the functions of embodiments of the present invention.
A program/utility 6025 having a set (at least one) of program modules 6024 may be stored, for example, in the system memory 602, and such program modules 6024 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment. Program modules 6024 generally carry out the functions and/or methodologies of embodiments of the invention as described herein.
The processing unit 601 executes various functional applications and data processing by running programs stored in the system memory 602, for example: obtaining a sentence sample and a result tag corresponding to the sentence sample; inputting the statement sample into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, wherein the statement analysis model is constructed based on a text generation task; calculating an optimization loss based on the probability; and adjusting the statement analysis model based on the optimization loss. The specific implementation of each step is not repeated here.
It should be noted that although the above detailed description refers to several units/modules or sub-units/modules of the sentence analysis model construction apparatus and of the sentence analysis apparatus, such a division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more of the units/modules described above may be embodied in a single unit/module according to embodiments of the invention. Conversely, the features and functions of one unit/module described above may be further divided so as to be embodied by a plurality of units/modules.
Moreover, while the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
While the spirit and principles of the invention have been described with reference to several particular embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, nor does the division into aspects mean that features in these aspects cannot be combined to advantage; this division is merely for convenience of presentation. The invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Through the above description, the embodiments of the present invention provide the following technical solutions, but are not limited thereto:
1. a sentence analysis model construction method comprises the following steps:
obtaining a statement sample and a result label corresponding to the statement sample;
inputting the statement sample into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, wherein the statement analysis model is constructed based on a text generation task;
calculating an optimization loss based on the probability;
adjusting the statement analysis model based on the optimization loss.
2. The method for constructing a sentence analysis model according to scheme 1, wherein before the sentence sample is input into the sentence analysis model for processing, the method further comprises:
performing word segmentation processing on the statement sample to obtain a word sequence of the statement sample;
the inputting the statement sample into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, comprises:
and inputting the word sequence into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the word sequence.
3. The method for constructing a statement analysis model according to scheme 2, wherein performing word segmentation processing on the statement sample to obtain a word sequence of the statement sample includes:
converting the statement sample and the corresponding result label into a formatted statement sample according to a preset form;
and performing word segmentation processing on the formatted sentence samples to obtain word sequences of the formatted sentence samples.
4. The method for constructing a statement analysis model according to scheme 3, wherein the converting the statement sample and the corresponding result tag into a formatted statement sample according to a preset form includes:
and converting the statement sample and the corresponding result label into a question-answer statement according to a preset form, wherein the statement sample is converted into a question, the result label is converted into an answer, and the answer is natural language.
5. The method for constructing a sentence analysis model according to any of schemes 2 to 4, wherein the step of inputting the word sequence into the sentence analysis model for processing to obtain at least a probability that the sentence analysis model generates a corresponding result tag in a preset word list based on the word sequence includes:
inputting the word sequence into a sentence analysis model to obtain the probability that the sentence analysis model generates an ith word in a preset word list based on all words before the ith word in the word sequence, wherein the word sequence comprises n words and i ∈ [2, n];
and obtaining the probability that the statement analysis model generates the corresponding result label in the preset word list based on the word sequence.
6. The method for constructing a statement analysis model according to scheme 4, wherein the calculating of an optimization loss based on the probability comprises:
calculating a plurality of corresponding optimization losses from the probabilities with which the statement analysis model generates each word and the result label in the preset word list;
and combining the plurality of optimization losses to optimize and adjust the statement analysis model.
7. The method for constructing a sentence analysis model according to any of schemes 2 to 6, wherein the inputting the word sequence into the sentence analysis model for processing includes:
converting all words in the word sequence into corresponding word vectors;
and inputting the word vectors into the statement analysis model for processing, and obtaining at least the probability that the statement analysis model generates a corresponding result label in a preset word list based on the word sequence.
8. The sentence analysis model construction method according to any of the schemes 1 to 7, wherein the type of the result tag includes one of an emotion tag and an intention tag.
9. A statement analysis method, comprising:
adopting the sentence analysis model construction method according to any one of the schemes 1-8 to construct a sentence analysis model;
and inputting the sentence to be analyzed into the sentence analysis model, and generating the emotion of the sentence to be analyzed by the sentence analysis model.
10. The sentence analysis method according to scheme 9, wherein the inputting of the sentence to be analyzed into the sentence analysis model includes:
performing word segmentation processing on the sentence to be analyzed to obtain a word sequence of the sentence to be analyzed;
coding each word in the word sequence to obtain a word vector of each word;
and sequentially inputting the word vectors of all words into the sentence analysis model, and generating the emotion of the sentence to be analyzed by the sentence analysis model.
11. An apparatus for constructing a sentence analysis model, comprising:
the preprocessing module is configured to obtain a statement sample and a result label corresponding to the statement sample;
the processing module is configured to input the statement sample into a statement analysis model for processing, and at least obtain the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, wherein the statement analysis model is constructed based on a text generation task; and
calculate an optimization loss based on the probability;
a model building module configured to adjust the statement analysis model based on the optimization loss.
12. The apparatus for constructing a sentence analysis model according to scheme 11, wherein the preprocessing module is further configured to perform word segmentation processing on the sentence samples to obtain word sequences of the sentence samples;
the processing module is further configured to input the word sequence into a sentence analysis model for processing, and obtain at least a probability that the sentence analysis model generates a corresponding result tag in a preset word list based on the word sequence.
13. The apparatus for constructing a sentence analysis model according to scheme 12, wherein the preprocessing module at least includes:
the conversion unit is configured to convert the statement sample and the corresponding result label into a formatted statement sample according to a preset form;
and the word segmentation unit is configured to perform word segmentation processing on the formatted sentence samples to obtain a word sequence of the formatted sentence samples.
14. The apparatus for constructing a sentence analysis model according to scheme 13, wherein the conversion unit is further configured to convert the sentence sample and the corresponding result tag into a question-and-answer sentence according to a preset form, wherein the sentence sample is converted into a question, the result tag is converted into an answer, and the answer is in natural language.
15. The sentence analysis model construction apparatus according to any of schemes 12-14, wherein the processing module comprises:
a word vector generation unit configured to convert all words in the word sequence into corresponding word vectors;
and the probability calculation unit is configured to input the word vectors into the statement analysis model for processing, and obtain at least the probability that the statement analysis model generates a corresponding result label in a preset word list based on the word sequence.
16. The sentence analysis model construction apparatus according to any of schemes 12-15, wherein the processing module comprises:
a sentence generation probability obtaining unit configured to input the word sequence into a sentence analysis model to obtain the probability that the sentence analysis model generates an ith word in a preset word list based on all words before the ith word in the word sequence, wherein the word sequence includes n words and i ∈ [2, n];
an emotion generation probability acquisition unit configured to acquire the probability that the sentence analysis model generates the corresponding result tag in the preset word list based on the word sequence;
the optimization loss calculation unit is configured to calculate a plurality of corresponding optimization losses from the probabilities with which the statement analysis model generates each word and the result label in the preset word list;
the model building module is further configured to combine the plurality of optimization losses to optimize and adjust the statement analysis model.
17. The sentence analysis model construction apparatus of any of schemes 12-16, wherein the type of the result tag comprises one of an emotion tag and an intention tag.
18. A sentence analysis apparatus for constructing a sentence analysis model by using the sentence analysis model construction apparatus according to any of schemes 11 to 17, comprising:
and the sentence analysis module is configured to input the sentence to be analyzed into the sentence analysis model, and the emotion of the sentence to be analyzed is generated by the sentence analysis model.
19. The sentence analysis apparatus according to scheme 18, wherein the sentence analysis module includes:
the word segmentation unit is configured to perform word segmentation processing on the sentence to be analyzed to obtain a word sequence of the sentence to be analyzed;
the encoding unit is configured to encode each word in the word sequence to obtain a word vector of each word;
and the analysis unit is configured to input word vectors of all words into the sentence analysis model in sequence, and the emotion of the sentence to be analyzed is generated by the sentence analysis model.
20. A computer-readable storage medium storing program code which, when executed by a processor, implements a method as in one of schemes 1-10.
21. A computing device comprising a processor and a storage medium storing program code which, when executed by the processor, implements a method as in one of schemes 1-10.
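The sketches below illustrate, under stated assumptions, one possible realization of several of the technical solutions above; they are illustrative only and do not limit those solutions. For schemes 3 and 4, the conversion of a statement sample and its result label into a question-and-answer statement might look as follows, where the prompt wording is a hypothetical preset form chosen for this sketch:

```python
def to_qa_sample(sentence: str, label: str) -> str:
    # Hypothetical preset form: the sentence sample becomes a question and
    # the result label becomes a natural-language answer.
    question = f"Sentence: {sentence} What emotion does this sentence express?"
    answer = f"Answer: {label}"
    return f"{question} {answer}"

# to_qa_sample("The service was great.", "happy")
# -> "Sentence: The service was great. What emotion does this sentence express? Answer: happy"
```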
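For schemes 2 and 7, the word segmentation and word-vector conversion correspond to an ordinary tokenize-then-embed front end; the toy vocabulary and embedding size below are arbitrary assumptions:

```python
import torch
import torch.nn as nn

# Hypothetical preset word list; in practice this is the model's full vocabulary.
vocab = {"<unk>": 0, "the": 1, "service": 2, "was": 3, "great.": 4, "happy": 5}
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)

def words_to_vectors(sentence: str) -> torch.Tensor:
    words = sentence.lower().split()                      # naive word segmentation
    ids = [vocab.get(w, vocab["<unk>"]) for w in words]   # word sequence -> indices in the word list
    return embedding(torch.tensor(ids))                   # (len(words), 8) word vectors
```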
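Schemes 5 and 6 describe a language-modelling style objective. The exact loss is not spelled out in the text, so the following formulation is an assumption consistent with those schemes:

```latex
% Probability of the i-th word given all preceding words, for i \in [2, n]
p_i = P_\theta\left(w_i \mid w_1, \dots, w_{i-1}\right)

% Probability of generating the result label y after the word sequence
p_y = P_\theta\left(y \mid w_1, \dots, w_n\right)

% Combined optimization loss: the per-word and label losses are summed
\mathcal{L}(\theta) = -\sum_{i=2}^{n} \log p_i - \log p_y
```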
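For schemes 9 and 10, inference can be sketched as greedy generation over the preset word list; the model/tokenizer interface is again an assumption, and max_new_tokens is an arbitrary bound on the length of the generated emotion:

```python
import torch

@torch.no_grad()
def analyze_sentence(model, tokenizer, sentence, max_new_tokens=5):
    # The sentence to be analyzed is segmented and encoded by the tokenizer, and the
    # emotion is generated word by word from the preset word list (greedy decoding).
    prompt = f"Sentence: {sentence} What emotion does this sentence express? Answer:"
    ids = tokenizer.encode(prompt)
    prompt_len = len(ids)

    for _ in range(max_new_tokens):
        logits = model(torch.tensor([ids])).logits   # shape (1, len(ids), vocab_size)
        ids.append(int(logits[0, -1].argmax()))      # most probable next word

    return tokenizer.decode(ids[prompt_len:])
```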
Claims (10)
1. A sentence analysis model construction method comprises the following steps:
obtaining a statement sample and a result label corresponding to the statement sample;
inputting the statement sample into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, wherein the statement analysis model is constructed based on a text generation task;
calculating an optimization loss based on the probability;
adjusting the statement analysis model based on the optimization loss.
2. A method of constructing a statement analysis model according to claim 1 wherein, prior to inputting the statement sample into the statement analysis model for processing, the method further comprises:
performing word segmentation processing on the statement sample to obtain a word sequence of the statement sample;
the inputting the statement sample into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, comprises:
and inputting the word sequence into a statement analysis model for processing, and at least obtaining the probability that the statement analysis model generates a corresponding result label in a preset word list based on the word sequence.
3. The method for constructing a sentence analysis model according to claim 2, wherein the obtaining of the word sequence of the sentence sample by performing the word segmentation process on the sentence sample comprises:
converting the statement sample and the corresponding result label into a formatted statement sample according to a preset form;
and performing word segmentation processing on the formatted sentence samples to obtain word sequences of the formatted sentence samples.
4. The method for constructing a sentence analysis model according to claim 3, wherein the converting the sentence samples and the corresponding result labels into formatted sentence samples according to a preset form comprises:
and converting the statement sample and the corresponding result label into a question-answer statement according to a preset form, wherein the statement sample is converted into a question, the result label is converted into an answer, and the answer is natural language.
5. The method for constructing a sentence analysis model according to any of claims 2 to 4, wherein the inputting the word sequence into the sentence analysis model for processing to obtain at least a probability that the sentence analysis model generates a corresponding result tag in a preset word list based on the word sequence comprises:
inputting the word sequence into a sentence analysis model to obtain the probability that the sentence analysis model generates an ith word in a preset word list based on all words before the ith word in the word sequence, wherein the word sequence comprises n words and i ∈ [2, n];
and obtaining the probability that the statement analysis model generates the corresponding result label in the preset word list based on the word sequence.
6. A statement analysis method, comprising:
constructing a statement analysis model by adopting the construction method of the statement analysis model according to any one of claims 1 to 5;
and inputting the sentence to be analyzed into the sentence analysis model, and generating the emotion of the sentence to be analyzed by the sentence analysis model.
7. An apparatus for constructing a sentence analysis model, comprising:
the preprocessing module is configured to obtain a statement sample and a result label corresponding to the statement sample;
the processing module is configured to input the statement sample into a statement analysis model for processing, and at least obtain the probability that the statement analysis model generates a corresponding result label in a preset word list based on the statement sample, wherein the statement analysis model is constructed based on a text generation task; and
calculate an optimization loss based on the probability;
a model building module configured to adjust the statement analysis model based on the optimization loss.
8. A sentence analysis apparatus for constructing a sentence analysis model using the sentence analysis model construction apparatus according to claim 7, comprising:
and the sentence analysis module is configured to input the sentence to be analyzed into the sentence analysis model, and the emotion of the sentence to be analyzed is generated by the sentence analysis model.
9. A computer-readable storage medium storing program code which, when executed by a processor, implements a method according to one of claims 1 to 6.
10. A computing device comprising a processor and a storage medium storing program code which, when executed by the processor, implements the method of one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111565998.6A CN114239575B (en) | 2021-12-20 | 2021-12-20 | Statement analysis model construction method, statement analysis method, device, medium and computing equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114239575A true CN114239575A (en) | 2022-03-25 |
CN114239575B CN114239575B (en) | 2023-04-18 |
Family
ID=80759542
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111565998.6A Active CN114239575B (en) | 2021-12-20 | 2021-12-20 | Statement analysis model construction method, statement analysis method, device, medium and computing equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114239575B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090326923A1 (en) * | 2006-05-15 | 2009-12-31 | Panasonic Corporatioin | Method and apparatus for named entity recognition in natural language |
CN110795934A (en) * | 2019-10-31 | 2020-02-14 | 北京金山数字娱乐科技有限公司 | Sentence analysis model training method and device and sentence analysis method and device |
CN111401077A (en) * | 2020-06-02 | 2020-07-10 | 腾讯科技(深圳)有限公司 | Language model processing method and device and computer equipment |
WO2020237872A1 (en) * | 2019-05-24 | 2020-12-03 | 平安科技(深圳)有限公司 | Method and apparatus for testing accuracy of semantic analysis model, storage medium, and device |
CN112164391A (en) * | 2020-10-16 | 2021-01-01 | 腾讯科技(深圳)有限公司 | Statement processing method and device, electronic equipment and storage medium |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114842454A (en) * | 2022-06-27 | 2022-08-02 | 小米汽车科技有限公司 | Obstacle detection method, device, equipment, storage medium, chip and vehicle |
CN114842454B (en) * | 2022-06-27 | 2022-09-13 | 小米汽车科技有限公司 | Obstacle detection method, device, equipment, storage medium, chip and vehicle |
Also Published As
Publication number | Publication date |
---|---|
CN114239575B (en) | 2023-04-18 |
Similar Documents
Publication | Title |
---|---|
CN111488739B | Implicit chapter relation identification method for generating image enhancement representation based on multiple granularities |
CN111401084B | Method and device for machine translation and computer readable storage medium |
CN110609891A | Visual dialog generation method based on context awareness graph neural network |
CN113254610B | Multi-round conversation generation method for patent consultation |
CN110647612A | Visual conversation generation method based on double-visual attention network |
CN112905795A | Text intention classification method, device and readable medium |
CN110457718B | Text generation method and device, computer equipment and storage medium |
CN113987179A | Knowledge enhancement and backtracking loss-based conversational emotion recognition network model, construction method, electronic device and storage medium |
CN114676234A | Model training method and related equipment |
CN112541356A | Method and system for recognizing biomedical named entities |
CN112115687A | Problem generation method combining triples and entity types in knowledge base |
CN113065344A | Cross-corpus emotion recognition method based on transfer learning and attention mechanism |
CN113535953B | Meta learning-based few-sample classification method |
CN113901191A | Question-answer model training method and device |
CN118093834B | AIGC large model-based language processing question-answering system and method |
CN114168707A | Recommendation-oriented emotion type conversation method |
CN116432019A | Data processing method and related equipment |
CN113761868A | Text processing method and device, electronic equipment and readable storage medium |
CN115563314A | Knowledge graph representation learning method for multi-source information fusion enhancement |
CN118170668A | Test case generation method, device, storage medium and equipment |
CN115510230A | Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism |
CN112349294A | Voice processing method and device, computer readable medium and electronic equipment |
Mathur et al. | A scaled-down neural conversational model for chatbots |
CN111597816A | Self-attention named entity recognition method, device, equipment and storage medium |
CN114239575B (en) | Statement analysis model construction method, statement analysis method, device, medium and computing equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||