CN115495568A - Training method and device for dialogue model and dialogue response method and device - Google Patents

Training method and device for dialogue model and dialogue response method and device

Info

Publication number
CN115495568A
CN115495568A (application CN202211441290.4A)
Authority
CN
China
Prior art keywords
dialogue model
dialogue
professional
training
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211441290.4A
Other languages
Chinese (zh)
Other versions
CN115495568B (en)
Inventor
刘红丽
李峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202211441290.4A priority Critical patent/CN115495568B/en
Publication of CN115495568A publication Critical patent/CN115495568A/en
Priority to PCT/CN2023/086071 priority patent/WO2024103609A1/en
Application granted granted Critical
Publication of CN115495568B publication Critical patent/CN115495568B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a training method for a dialogue model, which comprises the following steps: training an original dialogue model with a general dialogue data set to obtain a general dialogue model; acquiring a preset professional keyword group and performing data screening on the general dialogue data set according to the professional keyword group; training the general dialogue model with the screened initial labeling data set to obtain an initial professional dialogue model; verifying the initial professional dialogue model with a verification data set and preset natural language processing evaluation indexes to obtain a verification score; judging whether the verification score is greater than a preset score threshold; and if so, determining the initial professional dialogue model as the target professional dialogue model. The method gives the trained target professional dialogue model both generality and professionalism, improving the user experience. The invention also discloses a training apparatus for the dialogue model, a dialogue response method and apparatus, an electronic device, and a computer-readable storage medium, which have corresponding technical effects.

Description

Training method and device for dialogue model and dialogue response method and device
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for training a dialogue model, a method and an apparatus for dialogue response, an electronic device, and a computer-readable storage medium.
Background
Human-computer dialogue has long been regarded by academia and industry as a fundamental application of Natural Language Processing (NLP). With the development of artificial intelligence technology, generation-based dialogue models, which are trained specifically on dialogue data, have become increasingly popular and achieve very good performance in open-domain dialogue. However, training a large dialogue model from scratch requires a large amount of multi-type dialogue data as a training corpus, which entails high cost and long training time.
Professional human-machine dialogue systems must also serve different conversational needs, including chitchat, common-sense question answering, professional question answering, and the like. For example, a medical robot chatting with a patient must answer questions of professional medical knowledge, handle everyday common-sense questions, and make small talk to soothe the patient's emotions. Most current professional dialogue models adopt a retrieval approach whose main principle is semantic matching, i.e., finding answers to the user's questions in a knowledge base. Although this technology is mature, it depends heavily on the corpus, its knowledge is one-sided, its replies are monotonous and rigid, it lacks generality and diversity, and the user experience is poor.
In summary, how to effectively solve the problems of existing dialogue response methods, such as monotonous and rigid replies, lack of generality and diversity, and poor user experience, is a problem that urgently needs to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a training method for a dialogue model that gives the trained target professional dialogue model both generality and professionalism and improves the user experience; another object of the present invention is to provide a training apparatus for a dialogue model, a dialogue response method and apparatus, an electronic device, and a computer-readable storage medium.
In order to solve the technical problems, the invention provides the following technical scheme:
a method of training a dialogue model, comprising:
training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model;
acquiring a preset professional keyword group, performing data screening on the general dialogue data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model;
carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
judging whether the verification score is larger than a preset score threshold value or not;
and if so, determining the initial professional dialogue model as a target professional dialogue model.
In a specific embodiment of the present invention, when it is determined that the verification score is equal to or less than the preset score threshold, the method further includes:
generating corresponding response data for each sample data in a preset unmarked pool by using the initial professional dialogue model;
respectively calculating the automatic evaluation score corresponding to each response data;
sorting the automatic evaluation scores by magnitude, and selecting a preset number of automatic evaluation scores from the low-scoring end;
outputting marking prompt information for marking the response data corresponding to each selected automatic evaluation score;
updating the initial labeling data set according to the labeling result to obtain an updated labeling data set;
training the initial professional dialogue model based on the updated labeling data set to obtain an updated professional dialogue model;
and carrying out verification operation on the updated professional dialogue model by using the verification data set to obtain a verification score, and repeatedly executing the step of judging whether the verification score is greater than a preset score threshold value.
In an embodiment of the present invention, after obtaining the updated annotation data set, the method further includes:
and updating the preset unmarked pool according to the updated marked data set.
In a specific embodiment of the present invention, the performing a verification operation on the initial professional dialogue model by using a verification data set and preset natural language processing evaluation indexes includes:

performing the verification operation on the initial professional dialogue model by combining the verification data set with the BLEU index, the ROUGE index, the PPL index, and the DISTINCT index through the following formula:

$$score = S_{BLEU} + S_{ROUGE} + S_{PPL} + S_{DISTINCT}$$

where $S_{BLEU}$ is the score of the initial professional dialogue model on the BLEU index, $S_{ROUGE}$ is the score of the initial professional dialogue model on the ROUGE index, $S_{PPL}$ is the score of the initial professional dialogue model on the PPL index, taken in the form of the reciprocal of the PPL value, $S_{DISTINCT}$ is the score of the initial professional dialogue model on the DISTINCT index, and $score$ is the verification score.
In a specific embodiment of the present invention, the method further comprises a process of calculating the score $S_{BLEU}$ of the initial professional dialogue model on the BLEU index, the calculation process comprising:

calculating the score $S_{BLEU}$ of the initial professional dialogue model on the BLEU index through the following formula:

$$S_{BLEU} = BP \cdot \exp\left(\sum_{n=1}^{N} w_n \log p_n\right), \qquad BP = \begin{cases} 1, & c > r \\ e^{1 - r/c}, & c \le r \end{cases}$$

where $c$ is the length of the machine translation, $r$ is the length of the shortest reference translation sentence, $p_n$ is the n-gram precision, $w_n$ is the n-gram weight, with $w_n = 1/N$ for any $n$, and $BP$ is a penalty factor.
In a specific embodiment of the present invention, the method further comprises a process of calculating the score $S_{ROUGE}$ of the initial professional dialogue model on the ROUGE index, the calculation process comprising:

calculating the score $S_{ROUGE}$ of the initial professional dialogue model on the ROUGE index through the following formula:

$$ROUGE\text{-}N = \frac{\sum_{S \in \{\mathrm{ref}\}} \sum_{gram_N \in S} Count_{match}(gram_N)}{\sum_{S \in \{\mathrm{ref}\}} \sum_{gram_N \in S} Count(gram_N)}$$

where $\{\mathrm{ref}\}$ denotes the set of reference translations, $gram_N$ denotes a combination of N words, and $Count(gram_N)$ denotes the count of that N-gram; the denominator of the formula counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.
In an embodiment of the present invention, the method further comprises a process of calculating the score of the initial professional dialogue model on the PPL index, the calculation process comprising:

$$PPL = \exp\left(-\frac{1}{N} \sum_{i=1}^{N} \log p(w_i \mid w_1, \ldots, w_{i-1})\right)$$

where $p(w_i \mid w_1, \ldots, w_{i-1})$ denotes the probability of predicting the i-th word from the preceding words and $N$ denotes the sentence length.
In an embodiment of the present invention, the method further comprises a process of calculating the score $S_{DISTINCT}$ of the initial professional dialogue model on the DISTINCT index, the calculation process comprising:

calculating the score $S_{DISTINCT}$ of the initial professional dialogue model on the DISTINCT index through the following formula:

$$Distinct(n) = \frac{Count(\mathrm{unique}\ ngram)}{Count(ngram)}$$

where $Count(\mathrm{unique}\ ngram)$ denotes the number of non-repeating n-grams in the reply and $Count(ngram)$ denotes the total number of n-gram words in the reply.
In an embodiment of the present invention, before training the original dialogue model by using the pre-acquired general dialogue data set, the method further includes:

respectively filtering the question-answer data and the chitchat data in the general dialogue data set.
In a specific embodiment of the present invention, training the original dialogue model with the pre-acquired general dialogue data set to obtain the general dialogue model includes:

inputting the general dialogue data set into the original dialogue model for iterative model training;

acquiring the current iteration number and the loss standard deviation obtained in the current round of iterative training;
determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation;
and if so, determining the dialogue model obtained by the iterative training as the general dialogue model.
In an embodiment of the present invention, determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation includes:

judging whether the current iteration number is greater than a first preset value and the loss standard deviation is smaller than a second preset value.
In an embodiment of the present invention, when it is determined that the current iteration number is greater than the first preset value and the loss standard deviation is greater than or equal to the second preset value, the method further includes:
judging whether the current iteration number is larger than a third preset value or not; wherein the third preset value is greater than the first preset value;
if yes, determining the dialogue model obtained by the iterative training in the current round as the general dialogue model;
and if not, inputting the general dialogue data set into a dialogue model obtained by the current iteration training for model iteration training, and repeatedly executing the steps of obtaining the current iteration number and the loss standard deviation obtained by the current iteration training.
In a specific embodiment of the present invention, the data screening of the general dialog data set according to the professional keyword group includes:
and performing data screening on the universal dialogue data set according to the professional keyword group by utilizing a DFA algorithm.
A dialogue response method is applied to a dialogue system containing a target professional dialogue model obtained by the training, and comprises the following steps:
receiving target question voice to be responded;
generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training a general dialogue model;
and performing output operation on the target response voice.
In one embodiment of the present invention, the method further comprises:
searching relevant answers from a database based on a preset retrieval algorithm when the target professional dialogue model fails to respond to the target question voice;
and carrying out voice output on the related answers.
A training apparatus of a dialogue model, comprising:
the general dialogue model acquisition module is used for training an original dialogue model with a pre-acquired general dialogue data set to obtain a general dialogue model;
the initial labeling data set determining module is used for acquiring a preset professional keyword group, performing data screening on the general dialogue data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
the initial professional dialogue model acquisition module is used for training the general dialogue model by utilizing the initial labeling data set to obtain an initial professional dialogue model;
the verification score obtaining module is used for carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
the judging module is used for judging whether the verification score is larger than a preset score threshold value or not;
and the target professional dialogue model determining module is used for determining the initial professional dialogue model as the target professional dialogue model when the verification score is larger than a preset score threshold value.
A dialog response device comprising:
the question voice receiving module is used for receiving target question voice to be responded;
the response voice generating module is used for generating target response voice corresponding to the target question voice by utilizing a target professional dialogue model obtained by training a general dialogue model;
and the response voice output module is used for outputting the target response voice.
An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the training method of the dialogue model or the dialogue response method described above when executing the computer program.
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the training method of the dialogue model or the dialogue response method described above.
The training method of the dialogue model provided by the invention trains the original dialogue model with the pre-acquired general dialogue data set to obtain the general dialogue model; acquires a preset professional keyword group, performs data screening on the general dialogue data set according to the professional keyword group, and determines the screened data set as the initial labeling data set; trains the general dialogue model with the initial labeling data set to obtain the initial professional dialogue model; performs a verification operation on the initial professional dialogue model with a verification data set and preset natural language processing evaluation indexes to obtain a verification score; judges whether the verification score is greater than a preset score threshold; and if so, determines the initial professional dialogue model as the target professional dialogue model.
According to the technical scheme, the target professional dialogue model applied to the specific dialogue scene is obtained by training in advance on the basis of the general dialogue model; the requirements for data volume and computing power are greatly reduced, the trained target professional dialogue model has both generality and professionalism, and the user experience is improved.
Correspondingly, the invention also provides a training device of the dialogue model, a dialogue response method and device, an electronic device and a computer readable storage medium corresponding to the training method of the dialogue model, which have the technical effects and are not described herein again.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flow chart of an implementation of a training method for a dialogue model according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating another embodiment of a method for training a dialogue model according to the present invention;
FIG. 3 is a flowchart illustrating an implementation of a dialog response method according to an embodiment of the present invention;
FIG. 4 is a block diagram of a training apparatus for a dialogue model according to an embodiment of the present invention;
FIG. 5 is a block diagram of a dialog response device according to an embodiment of the present invention;
FIG. 6 is a block diagram of an electronic device according to an embodiment of the invention;
fig. 7 is a schematic structural diagram of an electronic device provided in this embodiment.
Detailed Description
In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart of an implementation of a training method for a dialogue model according to an embodiment of the present invention, where the method may include the following steps:
S101: training the original dialogue model with the pre-acquired general dialogue data set to obtain the general dialogue model.
A general dialogue data set is collected in advance from public data sets and can be divided into two main categories: question answering and chitchat. The question-answer data can cover multiple fields such as common knowledge, facts, maternal and child care, medical care, law, insurance, aviation, psychology, traditional Chinese medicine, and epidemic information. The chitchat data can comprise multiple data sets such as microblog exchanges, TV-drama dialogue, Tieba forum discussions, Douban comments, and e-commerce conversations, covering everyday topics such as history, movies, weather, entertainment, and sports.
Specific examples of constructing the general dialogue data set are as follows:

Entry-interpretation prompt format: Title: "title", Article: "text". Raw corpus example: {"id": "0", "url": "https://xxx", "title": "Economics", "text": "Economics is a social science that studies the production, distribution, and consumption of goods and services …"}. Composed in the prompt format, this becomes: Title: "Economics", Article: "Economics is a social science that studies the production, distribution, and consumption of goods and services …".

Question-answer prompt format: Question: "title + desc" Answer: "answer". Raw corpus example: {"qid": 0, "title": "AlphaGo can only play Go; could it write novels and stories?", "desc": "No intelligent robot can currently engage in literary creation; if one could, what level of work could it write?", "answer": "AlphaGo can only play Go, because its design purpose, architecture, technical scheme, and training data are all built around the core task of playing Go …"}. Composed in the prompt format, this becomes: Question: "AlphaGo can only play Go; could it write novels and stories? No intelligent robot can currently engage in literary creation; if one could, what level of work could it write?" Answer: "AlphaGo can only play Go, because its design purpose, architecture, technical scheme, and training data are all built around the core task of playing Go …".

Reading-comprehension prompt format: Context: "context" Question: "question" Answer: "answer". Raw corpus example: {"id": "0", "context": "Cholelithiasis should be treated differently according to the circumstances; asymptomatic gallstones may be left untreated, but good dietary habits …", "question": "What type of gallstone may be left untreated?", "answer": "asymptomatic gallstones"}. Composed in the prompt format, this becomes: Context: "Cholelithiasis should be treated differently according to the circumstances; asymptomatic gallstones may be left untreated, but good dietary habits …" Question: "What type of gallstone may be left untreated?" Answer: "asymptomatic gallstones".

Single-turn or multi-turn dialogue prompt format: Conversation: "dialog1" "dialog2" "dialog3" …. Composed in the prompt format: Conversation: "Why aren't you streaming? I can't see you" "The stream is on" "I really like you" ….
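As an illustration, the following is a minimal sketch of composing such prompt strings from raw corpus records; the field names track the raw-corpus examples above, while the English prompt wording and the function names are assumptions (the original prompts are in Chinese).

```python
# A sketch of composing training samples in the prompt formats above.
# Field names follow the raw-corpus examples; the exact prompt wording
# is an assumption based on the translated patent text.

def compose_entry(sample):
    return f'Title: "{sample["title"]}", Article: "{sample["text"]}"'

def compose_qa(sample):
    return f'Question: "{sample["title"]} {sample["desc"]}" Answer: "{sample["answer"]}"'

def compose_reading(sample):
    return (f'Context: "{sample["context"]}" '
            f'Question: "{sample["question"]}" Answer: "{sample["answer"]}"')

def compose_dialogue(turns):
    return 'Conversation: ' + ' '.join(f'"{t}"' for t in turns)

raw = {"qid": 0, "title": "AlphaGo can only play Go;",
       "desc": "could it write novels?", "answer": "AlphaGo can only play Go, because ..."}
print(compose_qa(raw))
```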
And training the original dialogue model by utilizing the pre-acquired general dialogue data set to obtain the general dialogue model.
S102: acquiring a preset professional keyword group, performing data screening on the general dialogue data set according to the professional keyword group, and determining the screened data set as the initial labeling data set.
Professional dialogue data sets are generally annotated by experts; although the amount of data required is far smaller than that of a general dialogue data set, relying on expert annotation alone is time-consuming and labor-intensive, so a professional keyword group is preset. After the original dialogue model is trained with the general dialogue data set to obtain the general dialogue model, the preset professional keyword group is acquired, the general dialogue data set is screened according to the professional keyword group, and the screened data set is determined as the initial labeling data set. Obtaining the initial labeling data set by keyword screening of the general dialogue data set greatly improves the generation efficiency of the professional dialogue data set compared with purely manual annotation.
In a specific embodiment of the present invention, the data screening of the general dialogue data set according to the professional keyword group may include the following steps:

screening the general dialogue data set according to the professional keyword group using the DFA algorithm.

When the general dialogue data set is screened, the DFA (deterministic finite automaton) algorithm is used to screen it according to the professional keyword group. This makes full use of the DFA algorithm's efficient keyword matching, the same mechanism widely used for sensitive-word filtering.
In the embodiment of the invention, the process of using the DFA algorithm to match keywords and screen professional dialogue data out of the general dialogue data set can comprise the following steps:

(1) Experts provide the professional keyword group;

(2) A professional-word linked structure is built by creating a nested dictionary over the professional keyword group, with each keyword terminated by the specific character '\x00';

(3) Each group of dialogues in the general dialogue data set is traversed; the dialogue text is used as input to walk the professional-word structure, and if the specific character '\x00' is reached, the dialogue contains a professional keyword and is screened out.
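For illustration, a minimal sketch of steps (2) and (3) follows, assuming the nested-dictionary construction and the '\x00' end marker described above; the keyword list, function names, and sample dialogues are illustrative.

```python
# A sketch of DFA-style keyword screening using a nested-dictionary trie.

def build_keyword_trie(keywords):
    """Step (2): build a nested dictionary; '\\x00' marks the end of a keyword."""
    trie = {}
    for word in keywords:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node['\x00'] = True  # end-of-keyword marker
    return trie

def contains_keyword(trie, text):
    """Step (3): walk the trie from every position; True once an end marker is hit."""
    for start in range(len(text)):
        node = trie
        for ch in text[start:]:
            if ch not in node:
                break
            node = node[ch]
            if '\x00' in node:  # a professional keyword occurs in this dialogue
                return True
    return False

keywords = ["status light", "power supply"]   # expert-provided (step 1)
trie = build_keyword_trie(keywords)
dialogues = ["my status light stays on", "nice weather today"]
print([d for d in dialogues if contains_keyword(trie, d)])  # screened-out dialogues
```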
Although some professional dialogue data can be screened out through keyword matching, the professional dialogue contained in a general dialogue data set is usually limited, especially for niche specialities, so expert annotation is still needed. Data annotated by experts may involve privacy, so desensitization (hiding private information in the dialogues such as names, mobile phone numbers, and email addresses) must be added. As with the construction of the general dialogue data set, the professional dialogue data set is composed in the prompt formats of Table 1.
A specific example of building a server professional dialogue data set is as follows:

Intelligent customer service for servers belongs to multi-turn dialogue; the dialogue content is, for example: "Hello, how may I help you?" "The status light staying on is related to the power supply and does not affect normal operation of the server; 'status' is an aggregate indicator that lights up when a problem occurs, and plugging in all 4 power circuits is recommended." "There is no way to connect all 4 power feeds on site; is there any way to keep the status light from staying on?" "Yes, with a command the power policy can be flashed to two feeds."
S103: and training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model.
After the initial labeling data set is obtained, the general dialogue model is trained with the initial labeling data set to obtain the initial professional dialogue model.
S104: and carrying out verification operation on the initial professional dialogue model by utilizing the verification data set and preset natural language processing evaluation indexes to obtain a verification score.
After the initial professional dialogue model is obtained by training, a verification operation is performed on it using the verification data set and the preset natural language processing evaluation indexes to obtain a verification score. The verification score predicts the response performance of the initial professional dialogue model to spoken questions.
S105: and judging whether the verification score is larger than a preset score threshold value, if so, executing the step S106, and if not, continuing training the initial professional dialogue model.
A score threshold is preset. After the initial professional dialogue model is verified using the verification data set and the preset natural language processing evaluation indexes, whether the verification score is greater than the preset score threshold is judged. If so, the model is fully trained and step S106 is executed; otherwise, the initial professional dialogue model needs further training.
S106: and determining the initial professional dialogue model as a target professional dialogue model.
When the verification score is determined to be greater than the preset score threshold, the model is fully trained, and the initial professional dialogue model is determined as the target professional dialogue model. The target professional dialogue model and all expert-annotated data accumulated so far can also be output. Judging whether the professional dialogue model is fully trained against the preset score threshold ensures that the trained target professional dialogue model has a good capability of generating answers to spoken questions.
According to the technical scheme, the target professional dialogue model applied to the specific dialogue scene is obtained by training in advance on the basis of the general dialogue model; the requirements for data volume and computing power are greatly reduced, the trained target professional dialogue model has both generality and professionalism, and the user experience is improved.
Referring to fig. 2, fig. 2 is a flowchart of another implementation of a training method of a dialogue model in an embodiment of the present invention, where the method may include the following steps:
S201: training the original dialogue model with the pre-acquired general dialogue data set to obtain the general dialogue model.
In an embodiment of the present invention, before step S201, the method for training a dialogue model may further include the following steps:

respectively filtering the question-answer data and the chitchat data in the general dialogue data set.

After the general dialogue data set is obtained, the question-answer data and the chitchat data in it are filtered respectively. For example, since the overall noise of the question-answer data is low, simple filtering suffices, including removing dialogues containing sensitive words, removing dialogues with too few words, removing dialogues whose question is identical to its answer, removing meaningless characters from the corpus, and the like. Because the chitchat data is much noisier overall, strict filtering is required: removing dialogues containing sensitive words, removing dialogues with too few words, removing dialogues consisting of only one sentence, removing dialogues containing no Chinese characters, deleting advertisement dialogues, deleting duplicate dialogues, removing meaningless characters from the corpus, and the like. Training the original dialogue model on the filtered general dialogue data set avoids interference from useless data, reduces model-training complexity, improves training efficiency, and improves the accuracy of the trained model.
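A minimal sketch of this kind of rule-based filtering follows; the word-count threshold, sensitive-word list, and sample data are illustrative assumptions, and only a subset of the rules listed above is shown.

```python
import re

MIN_CHARS = 5  # assumed minimum length; the exact threshold is not given in the text

def filter_chitchat(dialogues, sensitive_words):
    """Strict filtering for chitchat data, applying a subset of the rules above."""
    seen, kept = set(), []
    for turns in dialogues:
        text = "".join(turns)
        if any(w in text for w in sensitive_words):
            continue                                    # sensitive words
        if len(text) < MIN_CHARS or len(turns) < 2:
            continue                                    # too short / single sentence
        if not re.search(r"[\u4e00-\u9fff]", text):
            continue                                    # no Chinese characters
        if text in seen:
            continue                                    # duplicate dialogue
        seen.add(text)
        kept.append([re.sub(r"\s+", " ", t).strip() for t in turns])  # clean characters
    return kept

print(filter_chitchat([["你好呀", "你好, 今天天气不错"], ["hi"]], ["广告"]))
```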
In order to make the training effect better, the data set can be composed according to different categories in a certain prompt format, as shown in Table 1.

Table 1: prompt format for each data category (reproduced as an image in the original publication).

A fixed prompt format reduces subsequent processing work.
In one embodiment of the present invention, step S201 may include the following steps:
the method comprises the following steps: inputting the general dialogue data set into an original dialogue model to carry out model iterative training;
step two: acquiring the current iteration number and the loss standard deviation obtained in the current round of iterative training;
step three: determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation, if so, executing a fourth step, and if not, executing a fifth step;
step four: determining a dialogue model obtained by the iterative training of the current round as a general dialogue model;
step five: and inputting the general dialogue data set into the dialogue model obtained by the iterative training of the current round to carry out model iterative training, and returning to execute the second step.
For convenience of description, the above five steps may be combined for illustration.
The process of training the original dialogue model with the general dialogue data set to obtain the general dialogue model can comprise: inputting the general dialogue data set into the original dialogue model for iterative training; acquiring the current iteration number and the loss standard deviation obtained in the current round of iterative training; and determining whether the model training cutoff condition is reached according to the two. If it is reached, the model obtained by the current training can already give good spoken responses to general questions, and the dialogue model obtained in the current round of iterative training is determined as the general dialogue model. If it is not, the model cannot yet respond well to general questions; the general dialogue data set is input into the dialogue model obtained in the current round for further iterative training, the current iteration number and loss standard deviation are acquired again, and the model is continuously optimized through multiple training iterations.
It should be noted that the model training cutoff condition may be set and adjusted according to an actual situation, which is not limited in the embodiment of the present invention, and may be set as an upper limit of the number of iterations, or may be set as a loss threshold.
In an embodiment of the present invention, determining whether the model training cutoff condition is reached according to the current iteration number and the loss standard deviation may include the following steps:
the method comprises the following steps: inputting the general dialogue data set into an original dialogue model to carry out model iterative training;
step two: acquiring the current iteration number and the loss standard deviation obtained in the current round of iterative training;

step three: judging whether the current iteration number is greater than the first preset value and the loss standard deviation is smaller than the second preset value; if so, executing step four; if not, i.e., when the current iteration number is greater than the first preset value but the loss standard deviation is greater than or equal to the second preset value, executing step five;
step four: determining a dialogue model obtained by the iterative training of the current round as a general dialogue model;
step five: judging whether the current iteration number is larger than a third preset value or not, if so, returning to execute the fourth step, and if not, executing the sixth step;
wherein the third preset value is greater than the first preset value;
step six: and inputting the general dialogue data set into the dialogue model obtained by the iterative training of the current round to carry out model iterative training, and returning to execute the second step.
For convenience of description, the above six steps may be combined for illustration.
The hyper-parameters in model training are preset and may include the iteration number, the minimum pre-training iteration number $T_{min}$ obtained from pre-training (i.e., the first preset value), and the loss standard deviation threshold $\sigma_{th}$ (i.e., the second preset value); the loss standard deviation $\sigma$ denotes the standard deviation of the losses of the latest ten iterations. After the current iteration number $t$ and the loss standard deviation of the current round of iterative training are obtained, whether the current iteration number is greater than the first preset value and the loss standard deviation is smaller than the second preset value is judged, i.e., whether

$$t > T_{min} \quad \text{and} \quad \sigma < \sigma_{th},$$

thereby determining whether the model training cutoff condition has been reached. Judging the training stage by combining the current iteration number with the loss standard deviation ensures that a model meeting the cutoff condition has been iterated a sufficient number of times, improving model performance.
The preset hyper-parameters in model training may further include the maximum pre-training iteration number $T_{max}$ obtained from pre-training (i.e., the third preset value), which is greater than the first preset value, i.e., $T_{max} > T_{min}$. When the current iteration number is greater than the first preset value but the loss standard deviation is greater than or equal to the second preset value, whether the current iteration number is greater than the third preset value is judged. If so, the loss is decreasing only slowly and the model has been trained close to a global optimum, so the dialogue model obtained in the current round of iterative training is determined as the general dialogue model. If not, the model needs further training: the general dialogue data set is input into the dialogue model obtained in the current round for another round of iterative model training, the current iteration number and the loss standard deviation of that round are acquired again, and whether the model training cutoff condition has been reached is judged on the new data. These steps repeat until the preset cutoff condition is reached, yielding a general dialogue model that can respond well to general spoken questions.
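A minimal sketch of this stopping rule follows; the preset values, the stand-in training function, and the loss values are illustrative assumptions.

```python
import random
import statistics

def train_one_round():
    """Placeholder for one round of iterative training; returns the round's loss."""
    return random.uniform(0.0, 0.05)

T_MIN, T_MAX, SIGMA_TH = 100, 500, 1e-3   # first / third / second preset values
losses, t = [], 0
while True:
    t += 1
    losses.append(train_one_round())
    if t <= T_MIN:
        continue                               # minimum iteration count not reached
    sigma = statistics.stdev(losses[-10:])     # std of the latest ten losses
    if sigma < SIGMA_TH or t > T_MAX:          # cutoff condition reached
        break                                  # current model becomes the general model
print(t)
```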
S202: acquiring a preset professional keyword group, performing data screening on the general dialogue data set according to the professional keyword group, and determining the screened data set as the initial labeling data set.
S203: and training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model.
S204: and carrying out verification operation on the initial professional dialogue model by utilizing the verification data set and preset natural language processing evaluation indexes to obtain a verification score.
In a specific embodiment of the present invention, verifying the initial professional dialogue model by using the verification data set and the preset natural language processing evaluation indexes may include the following steps:

performing the verification operation on the initial professional dialogue model by combining the verification data set with the BLEU index, the ROUGE index, the PPL index, and the DISTINCT index through the following formula:

$$score = S_{BLEU} + S_{ROUGE} + S_{PPL} + S_{DISTINCT}$$

where $S_{BLEU}$ is the score of the initial professional dialogue model on the BLEU index, $S_{ROUGE}$ is its score on the ROUGE index, $S_{PPL}$ is its score on the PPL index, taken in the form of the reciprocal of the PPL value (the smaller this term, the worse the model's generation), $S_{DISTINCT}$ is its score on the DISTINCT index, and $score$ is the verification score. When the initial professional dialogue model is verified, the verification operation thus combines the verification data set with these four indexes and computes the verification score as above.
Adopting the four indexes BLEU, ROUGE, PPL, and DISTINCT to comprehensively judge the model's performance on the verification data set ensures the accuracy and recall of the generated answers while guaranteeing the fluency and diversity of the model's generation.
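As an illustration, a minimal sketch of combining the four indexes follows. The equal-weight sum is an assumption (the patent gives the combination formula only as an image), and the threshold value is illustrative; the patent does state that the PPL term enters as a reciprocal.

```python
# A sketch of the four-index verification score.

def verification_score(s_bleu, s_rouge, ppl, s_distinct):
    s_ppl = 1.0 / ppl  # reciprocal form: lower perplexity yields a higher score
    return s_bleu + s_rouge + s_ppl + s_distinct

SCORE_THRESHOLD = 1.0  # illustrative preset score threshold
if verification_score(0.32, 0.41, 8.5, 0.56) > SCORE_THRESHOLD:
    print("initial professional dialogue model accepted as the target model")
```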
In an embodiment of the present invention, the training method of the dialogue model may further comprise a process of calculating the score $S_{BLEU}$ of the initial professional dialogue model on the BLEU index, which may include:

calculating the score $S_{BLEU}$ of the initial professional dialogue model on the BLEU index through the following formula:

$$S_{BLEU} = BP \cdot \exp\left(\sum_{n=1}^{N} w_n \log p_n\right), \qquad BP = \begin{cases} 1, & c > r \\ e^{1 - r/c}, & c \le r \end{cases}$$

where $c$ is the length of the machine translation, $r$ is the length of the shortest reference translation sentence, $p_n$ is the n-gram precision, $w_n$ is the n-gram weight, and $BP$ is a penalty factor.

The core idea of BLEU is to compare the degree of overlap between the n-grams of the candidate translation and those of the reference translations; the higher the overlap, the higher the translation quality is considered to be. In practice, $N = 1$ to $4$ is usually taken and a weighted average is computed; the weights are typically set uniformly, i.e., $w_n = 1/N$ for any $n$. $BP$ is a penalty factor: if the candidate translation is shorter than the shortest reference translation, $BP$ is smaller than 1. The 1-gram precision of BLEU indicates how faithful the decoded text is to the source, while the higher-order n-grams indicate the fluency of the translation.
In an embodiment of the present invention, the training method of the dialogue model may further comprise a process of calculating the score $S_{ROUGE}$ of the initial professional dialogue model on the ROUGE index, which may include:

calculating the score $S_{ROUGE}$ of the initial professional dialogue model on the ROUGE index through the following formula:

$$ROUGE\text{-}N = \frac{\sum_{S \in \{\mathrm{ref}\}} \sum_{gram_N \in S} Count_{match}(gram_N)}{\sum_{S \in \{\mathrm{ref}\}} \sum_{gram_N \in S} Count(gram_N)}$$

where $\{\mathrm{ref}\}$ denotes the set of reference translations (in practical applications there may be several), $gram_N$ denotes a combination of N words, and $Count(gram_N)$ denotes the count of that N-gram. The denominator of the formula counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.

ROUGE-N focuses on recall rather than precision: it measures how many n-gram phrases of the reference sentences appear in the output. 'N' refers to the n-gram; the computation resembles BLEU, except that BLEU is precision-based while ROUGE is recall-based. ROUGE-N is mainly used to count recall over N-grams, and a ROUGE-N score can be computed for each N with the formula above.
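A minimal sketch of ROUGE-N as defined above follows, again with whitespace tokenization; matching is clipped per n-gram.

```python
from collections import Counter

def rouge_n(candidate, references, n=2):
    def grams(text):
        toks = text.split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    cand = grams(candidate)
    matched = total = 0
    for ref in references:
        for g, c in grams(ref).items():
            total += c                   # denominator: n-grams in the references
            matched += min(c, cand[g])   # numerator: n-grams shared with the candidate
    return matched / total if total else 0.0

print(rouge_n("the cat sat on the mat", ["the cat is on the mat"], n=1))
```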
In an embodiment of the present invention, the training method of the dialogue model may further comprise a process of calculating the score of the initial professional dialogue model on the PPL index.

PPL refers to perplexity in a language model, an index measuring whether a sentence is fluent. It is defined as:

$$PPL = \exp\left(-\frac{1}{N} \sum_{i=1}^{N} \log p(w_i \mid w_1, \ldots, w_{i-1})\right)$$

where $p(w_i \mid w_1, \ldots, w_{i-1})$ denotes the probability of predicting the i-th word from the preceding words and $N$ denotes the sentence length. The smaller the PPL value, the more natural the model's generation and the more fluent the sentence. Evaluating reply quality through PPL avoids replies generated by the model being garbled or having their word order reversed.
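A minimal sketch of this perplexity computation follows, given the model's per-word conditional probabilities for one sentence; the probability values are illustrative.

```python
import math

def perplexity(token_probs):
    """PPL = exp(-1/N * sum of log p(w_i | preceding words))."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

print(perplexity([0.25, 0.60, 0.90, 0.40]))  # smaller PPL = more fluent sentence
```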
In one embodiment of the invention, the method may further comprise a process of calculating the score $S_{DISTINCT}$ of the initial professional dialogue model on the DISTINCT index, which includes:

calculating the score $S_{DISTINCT}$ of the initial professional dialogue model on the DISTINCT index through the following formula:

$$Distinct(n) = \frac{Count(\mathrm{unique}\ ngram)}{Count(ngram)}$$

where $Count(\mathrm{unique}\ ngram)$ denotes the number of non-repeating n-grams in the reply and $Count(ngram)$ denotes the total number of n-gram words in the reply.

The Distinct evaluation index judges the diversity of machine replies, i.e., whether a large number of generic, repetitive replies are produced. A larger $Distinct(n)$ value indicates greater diversity in the generated replies.
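A minimal sketch of Distinct-n as defined above follows; the sample reply is illustrative.

```python
def distinct_n(reply, n=2):
    """Unique n-grams in the reply divided by the total number of n-grams."""
    toks = reply.split()
    grams = [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    return len(set(grams)) / len(grams) if grams else 0.0

print(distinct_n("i like tea and i like tea"))  # repetition lowers the score
```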
S205: judging whether the verification score is greater than the preset score threshold; if so, executing step S206, and if not, executing step S207.
S206: and determining the initial professional dialogue model as a target professional dialogue model.
S207: and generating corresponding response data aiming at each sample data in the preset unmarked pool by utilizing the initial professional dialogue model.
When the verification score is determined to be less than or equal to the preset score threshold, the model needs further training, and the initial professional dialogue model is used to generate corresponding response data for each sample datum in the preset unlabeled pool.
S208: and respectively calculating the automatic evaluation scores corresponding to the response data.
After corresponding response data are generated for each sample datum in the preset unlabeled pool using the initial professional dialogue model, the automatic evaluation score corresponding to each response is calculated. For example, the automatic evaluation score can be calculated from the PPL index and the DISTINCT index as:

$$score_{auto} = \frac{1}{PPL} + Distinct(n)$$

thereby obtaining the automatic evaluation score corresponding to each response.
S209: sorting the automatic evaluation scores by magnitude, and selecting a preset number of automatic evaluation scores from the low-scoring end.
After the automatic evaluation scores corresponding to the response data are calculated, they are sorted by magnitude and a preset number of them are selected from the low-scoring end, for example the N lowest automatic evaluation scores.
S210: outputting annotation prompt information for annotating the response data corresponding to each selected automatic evaluation score.
After the preset number of automatic evaluation scores are selected from the low-scoring end, annotation prompt information is output for annotating the response data corresponding to the selected scores, thereby prompting expert annotation of the response data corresponding to the N lowest automatic evaluation scores.
S211: and updating the initial labeling data set according to the labeling result to obtain an updated labeling data set.
After the annotation prompt information for annotating the response data corresponding to each selected automatic evaluation score is output, the annotation results are obtained and the initial labeling data set is updated accordingly to obtain the updated labeling data set, thereby effectively annotating the data for which the current professional dialogue model generates poor responses.
In an embodiment of the present invention, after step S211, the method for training a dialogue model may further include the following steps:
and updating the preset unmarked pool according to the updated marked data set.
And after the updated marked data set is obtained, updating the preset unmarked pool according to the updated marked data set, thereby realizing the timely updating of the unmarked sample data in the preset unmarked pool.
S212: and training the initial professional dialogue model based on the updated labeling data set to obtain the updated professional dialogue model.
And after the initial annotation data set is updated according to the annotation result to obtain an updated annotation data set, training the initial professional dialogue model based on the updated annotation data set to obtain an updated professional dialogue model.
The embodiment of the invention adopts an active-learning approach, reducing the amount of expert-annotated samples as much as possible while limiting the impact on model performance: the 'difficult samples' that yield the largest improvement in model performance are continuously selected from the preset unlabeled pool, thereby improving the model.
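A minimal sketch of one selection round of this active-learning loop (steps S207-S212) follows; `generate`, `auto_score`, and `label_fn` are stand-ins for the professional model's generation step, the PPL/DISTINCT-based automatic score, and the expert annotation step, all of which are assumed interfaces the patent does not specify.

```python
def select_hard_samples(generate, auto_score, unlabeled_pool, k):
    """Generate a reply per sample and return the k samples whose replies score lowest."""
    return sorted(unlabeled_pool, key=lambda x: auto_score(generate(x)))[:k]

def active_learning_round(generate, auto_score, pool, labeled_set, label_fn, k=100):
    hard = select_hard_samples(generate, auto_score, pool, k)   # S207-S209
    labeled_set.extend(label_fn(x) for x in hard)               # S210-S211: expert annotation
    pool[:] = [x for x in pool if x not in hard]                # update the unlabeled pool
    return labeled_set                                          # used for S212 retraining
```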
S213: and performing verification operation on the updated professional dialogue model by using the verification data set to obtain a verification score, and returning to execute the step S205.
After the initial professional dialogue model is trained on the updated labeling data set to obtain the updated professional dialogue model, the updated professional dialogue model is verified with the verification data set to obtain a verification score, and the procedure returns to the step of judging whether the verification score is greater than the preset score threshold. These steps repeat until the calculated verification score is greater than the preset score threshold, yielding a target professional dialogue model that responds well to received spoken questions.
Referring to fig. 3, fig. 3 is a flowchart of an implementation of a dialog response method in an embodiment of the present invention, applied to a dialog system including a target specialized dialog model obtained by the preceding training, where the method may include the following steps:
S301: and receiving target question voice to be responded to.
When a user needs to conduct a dialogue in a given scenario, the user outputs target question voice to the dialogue response control center, and the dialogue response control center receives the target question voice to be responded to.
The dialogue response control center may be a processor deployed with a dialogue model.
The target question voice may be chit-chat, common-knowledge question answering, professional question answering, and the like.
S302: and generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training the general dialogue model.
The general dialogue model is obtained by training in advance; for example, model training may be performed on the general dialogue data set based on a large model, where the large model may be based on the Transformer structure and suited to generation tasks, such as a GPT (Generative Pre-Training) model or a BERT (Bidirectional Encoder Representations from Transformers) model. The target professional dialogue model is then obtained by training based on the general dialogue model. After the target question voice to be responded to is received, the target response voice corresponding to the target question voice is generated using the target professional dialogue model obtained by training the general dialogue model.
By retraining on the basis of a large model, the requirements on data volume and computing power are greatly reduced, and the two-stage training scheme gives the trained target professional dialogue model both universality and professionality.
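A minimal sketch of the two-stage scheme, assuming a hypothetical .fit training interface (the patent does not prescribe a framework):

```python
def two_stage_training(original_model, general_dataset, professional_dataset):
    """Stage 1: train the original large model on the general dialogue data
    set to obtain the general dialogue model. Stage 2: continue training the
    general dialogue model on the much smaller professional labeled data set
    to obtain the professional dialogue model."""
    general_model = original_model.fit(general_dataset)            # stage 1
    professional_model = general_model.fit(professional_dataset)   # stage 2
    return professional_model
```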
S303: and performing output operation on the target response voice.
And after generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training the general dialogue model, outputting the target response voice, thereby realizing the response to the target question voice.
Because the model training process requires more resources than the model application process, more resources may be allocated to model training in advance and relatively fewer resources to model application. For example, eight or more GPUs (Graphics Processing Units) with 80 GB of memory each may be pre-allocated for model training, and one or more 80 GB GPUs for model application.
According to the technical scheme, the target professional dialogue model applied to a specific dialogue scene is obtained in advance by training based on the general dialogue model, which greatly reduces the requirements on data volume and computing power; the trained target professional dialogue model has both universality and professionality, improving the user experience.
In an embodiment of the present invention, the dialog response method may further include the steps of:
step one: when the target professional dialogue model fails to respond to the target question voice, searching a relevant answer from a database based on a preset retrieval algorithm;
step two: and outputting the relevant answer as voice.
For convenience of description, the above two steps may be combined for illustration.
The embodiment of the invention presets a fallback scheme: a professional database is constructed from the professional data set, and when the target professional dialogue model fails to respond to the target question voice, that is, when the output of the target professional dialogue model is empty, a relevant answer is searched from the database based on a preset retrieval algorithm and output as voice. This optimizes the application process of the professional dialogue model, ensures that a user's question voice does not go unanswered, and improves the user experience.
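A minimal sketch of this fallback, assuming hypothetical model.generate and retriever.search interfaces (not part of the patent):

```python
def answer_with_fallback(model, retriever, question):
    """Try the target professional dialogue model first; if its output is
    empty (it failed to respond), search the professional database with the
    preset retrieval algorithm and return that answer instead."""
    response = model.generate(question)
    if not response:                           # model output is empty
        response = retriever.search(question)  # fallback: database retrieval
    return response                            # synthesized to speech downstream
```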
Corresponding to the above method embodiment, the present invention further provides a training apparatus for a dialogue model, and the training apparatus for a dialogue model described below and the training method for a dialogue model described above may be referred to in correspondence.
Referring to fig. 4, fig. 4 is a block diagram illustrating a training apparatus for a dialogue model according to an embodiment of the present invention, where the training apparatus for a dialogue model may include:
a general dialogue model obtaining module 41, configured to train an original dialogue model by using a pre-obtained general dialogue dataset to obtain a general dialogue model;
an initial labeling data set determining module 42, configured to obtain a preset professional keyword group, perform data screening on the general conversation data set according to the professional keyword group, and determine a data set obtained by the screening as an initial labeling data set;
an initial professional dialogue model obtaining module 43, configured to train the general dialogue model by using the initial annotation data set, so as to obtain an initial professional dialogue model;
a verification score obtaining module 44, configured to perform a verification operation on the initial professional dialogue model by using the verification data set and a preset natural language processing evaluation index, so as to obtain a verification score;
a judging module 45, configured to judge whether the verification score is greater than a preset score threshold;
and the target professional dialogue model determining module 46 is used for determining the initial professional dialogue model as the target professional dialogue model when the verification score is larger than the preset score threshold value.
According to the technical scheme, the target professional dialogue model applied to a specific dialogue scene is obtained in advance by training based on the general dialogue model, which greatly reduces the requirements on data volume and computing power; the trained target professional dialogue model has both universality and professionality, improving the user experience.
In an embodiment of the present invention, the training device for dialogue model may further include:
the response data generation module is used for generating corresponding response data for each sample data in the preset unmarked pool by using the initial professional dialogue model when the verification score is determined to be less than or equal to the preset score threshold value;
the automatic evaluation score calculation module is used for respectively calculating the automatic evaluation scores corresponding to the response data;
the automatic evaluation score selection module is used for sorting the automatic evaluation scores and selecting a preset number of automatic evaluation scores from the end with smaller scores;
the labeling prompt information output module is used for outputting labeling prompt information for labeling the response data corresponding to the selected automatic evaluation scores;
the annotation data set updating module is used for updating the initial annotation data set according to the annotation result to obtain an updated annotation data set;
the professional dialogue model updating module is used for training the initial professional dialogue model based on the updated labeled data set to obtain an updated professional dialogue model;
and the repeated execution module is used for carrying out verification operation on the updated professional dialogue model by utilizing the verification data set to obtain a verification score and repeatedly executing the step of judging whether the verification score is greater than a preset score threshold value.
In an embodiment of the present invention, the training device for dialogue model may further include:
and the unmarked pool updating module is used for updating the preset unmarked pool according to the updated marked data set after the updated marked data set is obtained.
In a specific embodiment of the present invention, the verification score obtaining module 44 is specifically configured to perform a verification operation on the initial professional dialogue model by combining the verification data set, the BLEU index, the ROUGE index, the PPL index, and the DISTINCT index according to the following formulas:
$$\mathrm{Score} = S_{\mathrm{BLEU}} + S_{\mathrm{ROUGE}} + S_{\mathrm{PPL}} + S_{\mathrm{DISTINCT}}$$
wherein $S_{\mathrm{BLEU}}$ is the score of the initial professional dialogue model on the BLEU index, $S_{\mathrm{ROUGE}}$ is the score of the initial professional dialogue model on the ROUGE index, $S_{\mathrm{PPL}}$ is the score of the initial professional dialogue model on the PPL index, obtained in the form of the reciprocal of the PPL value so that a larger value is better, $S_{\mathrm{DISTINCT}}$ is the score of the initial professional dialogue model on the DISTINCT index, and $\mathrm{Score}$ is the verification score.
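As a sketch of how the four indices might be combined (the equal-weight sum in the function below is an assumption; the patent states only that the four indices are combined, with PPL entering as a reciprocal):

```python
def verification_score(s_bleu, s_rouge, ppl, s_distinct):
    """Combine the four automatic indices into a single verification score.
    PPL enters as its reciprocal so that a larger value is always better;
    the equal weighting is an assumed choice."""
    return s_bleu + s_rouge + (1.0 / ppl) + s_distinct
```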
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the BLEU index, for calculating the score $S_{\mathrm{BLEU}}$ of the initial professional dialogue model on the BLEU index by the following formula:

$$\mathrm{BLEU} = BP \cdot \exp\Big(\sum_{n=1}^{N} w_n \log p_n\Big), \qquad BP = \begin{cases} 1, & l_c > l_r \\ e^{\,1 - l_r/l_c}, & l_c \le l_r \end{cases}$$

wherein $l_c$ is the length of the machine translation, $l_r$ is the shortest length among the reference translations, $p_n$ is the precision of the n-gram, $w_n$ is the weight of the n-gram, with $w_n = 1/N$ for any n, and $BP$ is the penalty factor.
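A minimal, self-contained sketch of this BLEU computation over tokenized sentences (a simplified illustration, not the patent's implementation):

```python
import math
from collections import Counter

def bleu(candidate, references, max_n=4):
    """Minimal BLEU sketch: geometric mean of n-gram precisions with
    w_n = 1/N for every n, multiplied by the brevity penalty BP."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    log_p_sum = 0.0
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        best = Counter()
        for ref in references:               # clipped counts per reference
            ref_counts = ngrams(ref, n)
            for gram in cand:
                best[gram] = max(best[gram], ref_counts[gram])
        overlap = sum(min(count, best[gram]) for gram, count in cand.items())
        total = max(sum(cand.values()), 1)
        # clamp to avoid log(0) when there is no n-gram overlap
        log_p_sum += (1.0 / max_n) * math.log(max(overlap, 1e-9) / total)
    l_c = max(len(candidate), 1)                  # machine translation length
    l_r = min(len(ref) for ref in references)     # shortest reference length
    bp = 1.0 if l_c > l_r else math.exp(1 - l_r / l_c)
    return bp * math.exp(log_p_sum)
```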
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the ROUGE index, for calculating the score $S_{\mathrm{ROUGE}}$ of the initial professional dialogue model on the ROUGE index by the following formula:

$$\mathrm{ROUGE\text{-}N} = \frac{\sum_{S \in \{\mathrm{reference\ translations}\}} \sum_{gram_N \in S} Count_{match}(gram_N)}{\sum_{S \in \{\mathrm{reference\ translations}\}} \sum_{gram_N \in S} Count(gram_N)}$$

wherein $\{\mathrm{reference\ translations}\}$ denotes the set of reference translations and $gram_N$ denotes a combination of N words; the denominator counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.
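A minimal sketch of this ROUGE-N recall over tokenized sentences (a simplified illustration under the formula above):

```python
from collections import Counter

def rouge_n(candidate, references, n=2):
    """Minimal ROUGE-N sketch: N-gram recall of the machine translation
    (candidate) against the set of reference translations."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate)
    matched = 0   # numerator: N-grams shared with the candidate
    total = 0     # denominator: all N-grams in the references
    for ref in references:
        ref_counts = ngrams(ref)
        total += sum(ref_counts.values())
        matched += sum(min(count, cand[gram]) for gram, count in ref_counts.items())
    return matched / total if total else 0.0
```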
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the PPL index, for calculating the score $S_{\mathrm{PPL}}$ of the initial professional dialogue model on the PPL index by the following formula:

$$\mathrm{PPL} = \exp\Big(-\frac{1}{N}\sum_{i=1}^{N} \log p(w_i \mid w_1, \ldots, w_{i-1})\Big)$$

wherein $p(w_i \mid w_1, \ldots, w_{i-1})$ denotes the probability of predicting the i-th word from the preceding words, and N denotes the sentence length.
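A minimal sketch of this perplexity computation, assuming per-token log-probabilities are available from the model:

```python
import math

def perplexity(token_log_probs):
    """Minimal PPL sketch: given log p(w_i | w_1..w_{i-1}) for each of the
    N words of a sentence, PPL = exp(-(1/N) * sum of log-probabilities)."""
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n) if n else float("inf")
```

For example, perplexity([math.log(0.25)] * 4) returns 4.0, since every word was predicted with probability 0.25.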
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the DISTINCT index, for calculating the score $S_{\mathrm{DISTINCT}}$ of the initial professional dialogue model on the DISTINCT index by the following formula:

$$\mathrm{DISTINCT}(n) = \frac{Count(\mathrm{unique\ } ngram)}{Count(ngram)}$$

wherein $Count(\mathrm{unique\ } ngram)$ denotes the number of non-repeating n-grams in the reply, and $Count(ngram)$ denotes the total number of n-grams in the reply.
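A minimal sketch of this DISTINCT-n ratio over a tokenized reply (a simplified illustration of the formula above):

```python
def distinct_n(tokens, n=1):
    """Minimal DISTINCT-n sketch: the ratio of unique n-grams to the total
    number of n-grams in a reply; higher values mean a more diverse reply."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(grams)) / len(grams) if grams else 0.0
```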
In an embodiment of the present invention, the training device for dialogue model may further include:
and the data filtering module is used for respectively filtering question-answer data and chatting data in the universal dialogue data set before training the original dialogue model by using the pre-acquired universal dialogue data set.
In one embodiment of the present invention, the general dialogue model obtaining module 41 includes:
the iterative training submodule is used for inputting the general dialogue data set into an original dialogue model to carry out model iterative training;
the loss standard deviation obtaining submodule is used for obtaining the current iteration number and the loss standard deviation obtained by the iteration training of the current round;
the training cutoff judgment submodule is used for determining whether a model training cutoff condition is met according to the current iteration number and the loss standard deviation;
and the general dialogue model determining submodule is used for determining the dialogue model obtained by the iteration training in the current round as the general dialogue model when the model training cut-off condition is determined to be reached according to the current iteration number and the loss standard deviation.
In an embodiment of the present invention, the training cutoff determination sub-module is a module for determining whether the current iteration number is greater than a first preset value and the loss standard deviation is smaller than a second preset value.
In an embodiment of the present invention, the training device for dialogue model may further include:
the iteration number counting submodule is used for judging whether the current iteration number is greater than a third preset value or not when the current iteration number is determined to be greater than a first preset value and the loss standard deviation is greater than or equal to a second preset value; wherein the third preset value is greater than the first preset value;
the general dialogue model determining submodule is also used for determining the dialogue model obtained by the iteration training in the current round as the general dialogue model when the current iteration number is larger than a third preset value;
and the iterative training submodule is also used for inputting the general dialogue data set into the dialogue model obtained by the iterative training of the current round to perform model iterative training when the current iterative number is less than or equal to a third preset value, and repeatedly executing the step of obtaining the current iterative number and the loss standard deviation obtained by the iterative training of the current round.
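A minimal sketch of the cutoff logic implemented by these submodules, assuming a hypothetical model.train_one_round interface that runs one round of iterative training and returns that round's losses (the interface and round granularity are assumptions):

```python
import statistics

def train_general_model(model, dataset, first_cap, std_threshold, hard_cap):
    """Stop once the iteration count exceeds first_cap and the standard
    deviation of the current round's losses falls below std_threshold;
    if the loss has not flattened, keep training until hard_cap
    (hard_cap > first_cap) is exceeded."""
    iterations = 0
    while True:
        round_losses = model.train_one_round(dataset)  # one round of training
        iterations += len(round_losses)
        loss_std = statistics.pstdev(round_losses)
        if iterations > first_cap and loss_std < std_threshold:
            return model   # training cutoff condition reached
        if iterations > hard_cap:
            return model   # safety cutoff: third preset value exceeded
```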
In an embodiment of the present invention, the initial labeled data set determining module 42 is specifically a module for performing data screening on the general dialogue data set according to the professional keyword set by using a DFA algorithm.
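A minimal sketch of DFA-based keyword screening (a standard trie-style construction; the keyword list and helper names are placeholders, and the patent does not detail the exact automaton):

```python
def build_dfa(keywords):
    """Build a trie-style DFA from the professional keyword set."""
    root = {}
    for word in keywords:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["#end"] = True            # marks a complete keyword
    return root

def contains_keyword(dfa, text):
    """Scan the text through the DFA; True if any keyword occurs in it."""
    for start in range(len(text)):
        node = dfa
        for ch in text[start:]:
            if ch not in node:
                break
            node = node[ch]
            if "#end" in node:
                return True
    return False

# Screening sketch: keep only dialogues that contain a professional keyword.
dfa = build_dfa(["keyword1", "keyword2"])  # placeholder professional keywords
filtered = [d for d in ["a keyword1 b", "plain chat"] if contains_keyword(dfa, d)]
```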
Corresponding to the above method embodiment, the present invention further provides a dialog response device, and the dialog response device described below and the dialog response method described above may be referred to correspondingly.
Referring to fig. 5, fig. 5 is a block diagram of a dialog response device according to an embodiment of the present invention, where the dialog response device may include:
a question voice receiving module 51, configured to receive a target question voice to be responded;
a response speech generation module 52, configured to generate a target response speech corresponding to the target question speech by using a target professional dialogue model obtained based on training of the general dialogue model;
and a response voice output module 53, configured to perform an output operation on the target response voice.
According to the technical scheme, the target professional dialogue model applied to a specific dialogue scene is obtained in advance by training based on the general dialogue model, which greatly reduces the requirements on data volume and computing power; the trained target professional dialogue model has both universality and professionality, improving the user experience.
In an embodiment of the present invention, the dialog response device may further include:
the answer searching module is used for searching related answers from the database based on a preset retrieval algorithm when the target professional dialogue model fails to respond to the target question voice;
and the voice output module is used for performing voice output on the related answers.
Corresponding to the above method embodiment, referring to fig. 6, fig. 6 is a schematic diagram of an electronic device provided by the present invention, which may include:
a memory 332 for storing a computer program;
a processor 322 for implementing the steps of the training method or the dialogue response method of the dialogue model of the above method embodiments when executing the computer program.
Specifically, referring to fig. 7, fig. 7 is a schematic structural diagram of an electronic device provided in this embodiment, whose structure may vary considerably with configuration and performance. It may include one or more processors (CPU) 322 and a memory 332, where the memory 332 stores one or more computer programs 342 or data 344. The memory 332 may be transient storage or persistent storage. The program stored in the memory 332 may include one or more modules (not shown), and each module may include a series of instruction operations on the data processing apparatus. Further, the processor 322 may be configured to communicate with the memory 332 to execute a series of instruction operations in the memory 332 on the electronic device 301.
The electronic device 301 may also include one or more power sources 326, one or more wired or wireless network interfaces 350, one or more input-output interfaces 358, and/or one or more operating systems 341.
The steps in the above-described dialog response method may be implemented by the structure of the electronic device.
Corresponding to the above method embodiment, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, can implement the following steps:
training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model; acquiring a preset professional keyword group, performing data screening on the universal conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set; training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model; verifying the initial professional dialogue model by using a verification data set and preset natural language processing evaluation indexes to obtain a verification score; judging whether the verification score is larger than a preset score threshold value or not; if so, determining the initial professional dialogue model as a target professional dialogue model;
or, alternatively,
receiving target question voice to be responded; generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training a general dialogue model; and performing output operation on the target response voice.
The computer-readable storage medium may include: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
For the introduction of the computer-readable storage medium provided by the present invention, please refer to the above method embodiments, which are not described herein again.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other. The device, the electronic device and the computer-readable storage medium disclosed by the embodiments correspond to the method disclosed by the embodiments, so that the description is simple, and the relevant points can be referred to the description of the method part.
The principle and the implementation of the present invention are explained in the present application by using specific examples, and the above description of the embodiments is only used to help understanding the technical solution and the core idea of the present invention. It should be noted that, for those skilled in the art, without departing from the principle of the present invention, it is possible to make various improvements and modifications to the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims (19)

1. A method for training a dialogue model, comprising:
training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model;
acquiring a preset professional keyword group, performing data screening on the universal conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model;
carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
judging whether the verification score is larger than a preset score threshold value or not;
and if so, determining the initial professional dialogue model as a target professional dialogue model.
2. The training method of a dialogue model according to claim 1, wherein when it is determined that the verification score is equal to or less than the preset score threshold, the method further comprises:
generating corresponding response data for each sample data in a preset unmarked pool by using the initial professional dialogue model;
respectively calculating the automatic evaluation score corresponding to each response data;
sorting the automatic evaluation scores by size, and selecting a preset number of automatic evaluation scores from the end with smaller scores;
outputting labeling prompt information for labeling the response data corresponding to each selected automatic evaluation score;
updating the initial labeling data set according to the labeling result to obtain an updated labeling data set;
training the initial professional dialogue model based on the updated labeling data set to obtain an updated professional dialogue model;
and carrying out verification operation on the updated professional dialogue model by using the verification data set to obtain a verification score, and repeatedly executing the step of judging whether the verification score is greater than a preset score threshold value.
3. The method for training a dialogue model of claim 2, further comprising, after obtaining the updated annotation data set:
and updating the preset unmarked pool according to the updated labeled data set.
4. The training method of dialogue model according to claim 1, wherein the performing a verification operation on the initial professional dialogue model using a verification data set and a preset evaluation index of natural language processing comprises:
and carrying out verification operation on the initial professional dialogue model by combining the verification data set, the BLEU index, the ROUGE index, the PPL index and the DISTINCT index through the following formula:
$$\mathrm{Score} = S_{\mathrm{BLEU}} + S_{\mathrm{ROUGE}} + S_{\mathrm{PPL}} + S_{\mathrm{DISTINCT}}$$
wherein $S_{\mathrm{BLEU}}$ is the score of the initial professional dialogue model on the BLEU index, $S_{\mathrm{ROUGE}}$ is the score of the initial professional dialogue model on the ROUGE index, $S_{\mathrm{PPL}}$ is the score of the initial professional dialogue model on the PPL index, in the form of the reciprocal of the PPL value, $S_{\mathrm{DISTINCT}}$ is the score of the initial professional dialogue model on the DISTINCT index, and $\mathrm{Score}$ is the verification score.
5. The method for training a dialogue model according to claim 4, further comprising calculating a score $S_{\mathrm{BLEU}}$ of the initial professional dialogue model on the BLEU index, wherein the calculation process of $S_{\mathrm{BLEU}}$ comprises:
calculating the score of the initial professional dialogue model on the BLEU index by the following formula:

$$\mathrm{BLEU} = BP \cdot \exp\Big(\sum_{n=1}^{N} w_n \log p_n\Big), \qquad BP = \begin{cases} 1, & l_c > l_r \\ e^{\,1 - l_r/l_c}, & l_c \le l_r \end{cases}$$

wherein $l_c$ is the length of the machine translation, $l_r$ is the shortest length among the reference translations, $p_n$ is the precision of the n-gram, $w_n$ is the weight of the n-gram, with $w_n = 1/N$ for any n, and $BP$ is the penalty factor.
6. The method for training a dialogue model according to claim 4, further comprising calculating a score $S_{\mathrm{ROUGE}}$ of the initial professional dialogue model on the ROUGE index, wherein the calculation process of $S_{\mathrm{ROUGE}}$ comprises:
calculating the score of the initial professional dialogue model on the ROUGE index by the following formula:

$$\mathrm{ROUGE\text{-}N} = \frac{\sum_{S \in \{\mathrm{reference\ translations}\}} \sum_{gram_N \in S} Count_{match}(gram_N)}{\sum_{S \in \{\mathrm{reference\ translations}\}} \sum_{gram_N \in S} Count(gram_N)}$$

wherein $\{\mathrm{reference\ translations}\}$ denotes the set of reference translations, $gram_N$ denotes a combination of N words, and $Count_{match}(gram_N)$ denotes the number of matched N-grams; the denominator counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.
7. The method for training a dialogue model according to claim 4, further comprising calculating a score $S_{\mathrm{PPL}}$ of the initial professional dialogue model on the PPL index, wherein the calculation process of $S_{\mathrm{PPL}}$ comprises:

$$\mathrm{PPL} = \exp\Big(-\frac{1}{N}\sum_{i=1}^{N} \log p(w_i \mid w_1, \ldots, w_{i-1})\Big)$$

wherein $p(w_i \mid w_1, \ldots, w_{i-1})$ denotes the probability of predicting the i-th word from the preceding words, and N denotes the sentence length.
8. The method for training a dialogue model according to claim 4, further comprising calculating a score $S_{\mathrm{DISTINCT}}$ of the initial professional dialogue model on the DISTINCT index, wherein the calculation process of $S_{\mathrm{DISTINCT}}$ comprises:
calculating the score of the initial professional dialogue model on the DISTINCT index by the following formula:

$$\mathrm{DISTINCT}(n) = \frac{Count(\mathrm{unique\ } ngram)}{Count(ngram)}$$

wherein $Count(\mathrm{unique\ } ngram)$ denotes the number of non-repeating n-grams in the reply, and $Count(ngram)$ denotes the total number of n-grams in the reply.
9. The method for training a dialogue model according to claim 1, further comprising, before training an original dialogue model using a pre-acquired common dialogue dataset:
and respectively filtering the question-answer data and the chatting data in the general dialogue data set.
10. The method for training a dialogue model according to claim 1, wherein training an original dialogue model using a pre-acquired universal dialogue dataset to obtain a universal dialogue model comprises:
inputting the general dialogue data set into the original dialogue model for model iterative training;
obtaining a current iteration number and a loss standard deviation obtained by the iteration training of the current iteration number;
determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation;
and if so, determining the dialogue model obtained by the iterative training as the general dialogue model.
11. The method of claim 10, wherein determining whether a model training cutoff condition is met based on the current iteration number and the loss standard deviation comprises:
and judging whether the current iteration number is larger than a first preset value or not and the loss standard deviation is smaller than a second preset value.
12. The method for training a dialogue model of claim 11, wherein when it is determined that the current iteration number is greater than the first preset value and the loss standard deviation is greater than or equal to the second preset value, the method further comprises:
judging whether the current iteration number is larger than a third preset value or not; wherein the third preset value is greater than the first preset value;
if yes, determining the dialogue model obtained by the iterative training as the general dialogue model;
and if not, inputting the general dialogue data set into a dialogue model obtained by the current iteration training for model iteration training, and repeatedly executing the steps of obtaining the current iteration number and the loss standard deviation obtained by the current iteration training.
13. The method for training a dialogue model according to claim 1, wherein the step of performing data screening on the general dialogue data set according to the professional keyword group comprises:
and screening the data of the general dialogue data set according to the professional keyword set by utilizing a DFA algorithm.
14. A dialogue response method applied to a dialogue system including a target professional dialogue model trained according to any one of claims 1 to 13, comprising:
receiving target question voice to be responded;
generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training a general dialogue model;
and performing output operation on the target response voice.
15. The dialog response method of claim 14 further comprising:
searching a relevant answer from a database based on a preset retrieval algorithm when the target professional dialogue model fails to respond to the target question voice;
and carrying out voice output on the related answers.
16. An apparatus for training a dialogue model, comprising:
the general dialogue model acquisition module is used for training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model;
the initial labeling data set determining module is used for acquiring a preset professional keyword group, performing data screening on the general conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
the initial professional dialogue model obtaining module is used for training the general dialogue model by utilizing the initial labeling data set to obtain an initial professional dialogue model;
the verification score obtaining module is used for carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
the judging module is used for judging whether the verification score is larger than a preset score threshold value or not;
and the target professional dialogue model determining module is used for determining the initial professional dialogue model as the target professional dialogue model when the verification score is larger than a preset score threshold value.
17. A dialog response device comprising:
the question voice receiving module is used for receiving target question voice to be responded;
the response voice generation module is used for generating target response voice corresponding to the target question voice by utilizing a target professional dialogue model obtained by training a general dialogue model;
and the response voice output module is used for outputting the target response voice.
18. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the training method of the dialogue model according to any one of claims 1 to 13 or the dialogue response method according to any one of claims 14 to 15 when executing the computer program.
19. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the training method of the dialogue model according to any one of claims 1 to 13 or the dialogue response method according to any one of claims 14 to 15.
CN202211441290.4A 2022-11-17 2022-11-17 Training method and device for dialogue model, dialogue response method and device Active CN115495568B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211441290.4A CN115495568B (en) 2022-11-17 2022-11-17 Training method and device for dialogue model, dialogue response method and device
PCT/CN2023/086071 WO2024103609A1 (en) 2022-11-17 2023-04-04 Dialogue-model training method and apparatus, and dialogue response method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211441290.4A CN115495568B (en) 2022-11-17 2022-11-17 Training method and device for dialogue model, dialogue response method and device

Publications (2)

Publication Number Publication Date
CN115495568A true CN115495568A (en) 2022-12-20
CN115495568B CN115495568B (en) 2023-08-22

Family

ID=85116091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211441290.4A Active CN115495568B (en) 2022-11-17 2022-11-17 Training method and device for dialogue model, dialogue response method and device

Country Status (2)

Country Link
CN (1) CN115495568B (en)
WO (1) WO2024103609A1 (en)



Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188331B (en) * 2019-06-03 2023-05-26 腾讯科技(深圳)有限公司 Model training method, dialogue system evaluation method, device, equipment and storage medium
US11561969B2 (en) * 2020-03-30 2023-01-24 Adobe Inc. Utilizing logical-form dialogue generation for multi-turn construction of paired natural language queries and query-language representations
CN115495568B (en) * 2022-11-17 2023-08-22 苏州浪潮智能科技有限公司 Training method and device for dialogue model, dialogue response method and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108897797A (en) * 2018-06-12 2018-11-27 腾讯科技(深圳)有限公司 Update training method, device, storage medium and the electronic equipment of dialog model
WO2021049199A1 (en) * 2019-09-13 2021-03-18 Mitsubishi Electric Corporation System and method for a dialogue response generation system
CN114968788A (en) * 2022-05-27 2022-08-30 浙江大学 Method, apparatus, medium, and device for automatically evaluating programming capability of artificial intelligence algorithm

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024103609A1 (en) * 2022-11-17 2024-05-23 苏州元脑智能科技有限公司 Dialogue-model training method and apparatus, and dialogue response method and apparatus
CN116127035A (en) * 2023-01-03 2023-05-16 北京百度网讯科技有限公司 Dialogue method, training method and training device for dialogue model
CN116127035B (en) * 2023-01-03 2023-12-08 北京百度网讯科技有限公司 Dialogue method, training method and training device for dialogue model
CN116432665A (en) * 2023-06-15 2023-07-14 北京中关村科金技术有限公司 Dialogue model construction method, text generation method, device, system and equipment
CN116432665B (en) * 2023-06-15 2023-10-10 北京中关村科金技术有限公司 Dialogue model construction method, text generation method, device, system and equipment
CN117828063A (en) * 2024-01-10 2024-04-05 广东数业智能科技有限公司 Psychological field data generation and model training method and device and storage medium
CN117828063B (en) * 2024-01-10 2024-05-17 广东数业智能科技有限公司 Psychological field data generation and model training method and device and storage medium

Also Published As

Publication number Publication date
WO2024103609A1 (en) 2024-05-23
CN115495568B (en) 2023-08-22

Similar Documents

Publication Publication Date Title
CN107944027B (en) Method and system for creating semantic key index
CN115495568A (en) Training method and device for dialogue model and dialogue response method and device
CN110096567B (en) QA knowledge base reasoning-based multi-round dialogue reply selection method and system
CN109815336B (en) Text aggregation method and system
CN108846138B (en) Question classification model construction method, device and medium fusing answer information
CN112417127B (en) Dialogue model training and dialogue generation methods, devices, equipment and media
CN110633359B (en) Sentence equivalence judgment method and device
CN1571013A (en) Method and device for predicting word error rate from text
CN111026840B (en) Text processing method, device, server and storage medium
WO2024066920A1 (en) Processing method and apparatus for dialogue in virtual scene, and electronic device, computer program product and computer storage medium
CN110347802A (en) A kind of text analyzing method and device
CN114020906A (en) Chinese medical text information matching method and system based on twin neural network
CN110727769B (en) Corpus generation method and device and man-machine interaction processing method and device
CN114003682A (en) Text classification method, device, equipment and storage medium
CN116049387A (en) Short text classification method, device and medium based on graph convolution
CN116910220A (en) Multi-round dialogue interaction processing method, device, equipment and storage medium
CN112905772A (en) Semantic correlation analysis method and device and related products
WO2023169301A1 (en) Text processing method and apparatus, and electronic device
CN111401069A (en) Intention recognition method and intention recognition device for conversation text and terminal
CN115203388A (en) Machine reading understanding method and device, computer equipment and storage medium
CN115408500A (en) Question-answer consistency evaluation method and device, electronic equipment and medium
CN111159339A (en) Text matching processing method and device
Enayet et al. An analysis of dialogue act sequence similarity across multiple domains
CN117453895B (en) Intelligent customer service response method, device, equipment and readable storage medium
CN116010583B (en) Cascade coupling knowledge enhancement dialogue generation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant