CN115495568A - Training method and device for dialogue model and dialogue response method and device - Google Patents

Training method and device for dialogue model and dialogue response method and device

Info

Publication number
CN115495568A
CN115495568A (application CN202211441290.4A)
Authority
CN
China
Prior art keywords
dialogue model
dialogue
professional
training
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211441290.4A
Other languages
Chinese (zh)
Other versions
CN115495568B (en)
Inventor
刘红丽
李峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202211441290.4A priority Critical patent/CN115495568B/en
Publication of CN115495568A publication Critical patent/CN115495568A/en
Priority to PCT/CN2023/086071 priority patent/WO2024103609A1/en
Application granted granted Critical
Publication of CN115495568B publication Critical patent/CN115495568B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a training method for a dialogue model, which comprises the following steps: training an original dialogue model with a general dialogue data set to obtain a general dialogue model; acquiring a preset professional keyword group and performing data screening on the general dialogue data set according to the professional keyword group; training the general dialogue model with the screened initial labeling data set to obtain an initial professional dialogue model; verifying the initial professional dialogue model with a verification data set and preset natural language processing evaluation indexes to obtain a verification score; judging whether the verification score is greater than a preset score threshold; and if so, determining the initial professional dialogue model as the target professional dialogue model. The method gives the trained target professional dialogue model both generality and professionalism, improving the user experience. The invention also discloses a training apparatus for the dialogue model, a dialogue response method and apparatus, an electronic device, and a computer-readable storage medium, which have corresponding technical effects.

Description

Training method and device for dialogue model and dialogue response method and device
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for training a dialogue model, a method and an apparatus for dialogue response, an electronic device, and a computer-readable storage medium.
Background
Human-computer dialogue has long been regarded by academia and industry as a fundamental application of Natural Language Processing (NLP). With the development of artificial intelligence technology, generation-based dialogue models, which are trained specifically on dialogue data, have become increasingly popular and achieve very good performance in open-domain dialogue. However, training a large dialogue model from scratch requires a large amount of multi-type dialogue data as a training corpus, which entails high cost and long training time.
Professional human-machine dialogue systems must also serve different conversational needs, including chitchat, common-sense question answering, professional question answering, and the like. For example, a medical robot chatting with a patient must answer questions of professional medical knowledge, handle everyday common-sense questions, and make small talk to soothe the patient's emotions. Most current professional dialogue models adopt a retrieval approach whose main principle is semantic matching, i.e., finding answers to the user's questions in a knowledge base. Although this technology is mature, it depends heavily on the corpus, its knowledge is one-sided, its replies are monotonous and rigid, it lacks generality and diversity, and the user experience is poor.
In summary, how to effectively solve the problems of existing dialogue response methods, such as monotonous and rigid replies, lack of generality and diversity, and poor user experience, is a problem that urgently needs to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a training method for a dialogue model that gives the trained target professional dialogue model both generality and professionalism and improves the user experience; another object of the present invention is to provide a training apparatus for a dialogue model, a dialogue response method and apparatus, an electronic device, and a computer-readable storage medium.
In order to solve the technical problems, the invention provides the following technical scheme:
a method of training a dialogue model, comprising:
training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model;
acquiring a preset professional keyword group, performing data screening on the general dialogue data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model;
carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
judging whether the verification score is larger than a preset score threshold value or not;
and if so, determining the initial professional dialogue model as a target professional dialogue model.
In a specific embodiment of the present invention, when it is determined that the verification score is equal to or less than the preset score threshold, the method further includes:
generating corresponding response data for each sample data in a preset unmarked pool by using the initial professional dialogue model;
respectively calculating the automatic evaluation score corresponding to each response data;
sorting the automatic evaluation scores by magnitude, and selecting a preset number of automatic evaluation scores from the low-scoring end;
outputting marking prompt information for marking the response data corresponding to each selected automatic evaluation score;
updating the initial labeling data set according to the labeling result to obtain an updated labeling data set;
training the initial professional dialogue model based on the updated labeling data set to obtain an updated professional dialogue model;
and carrying out verification operation on the updated professional dialogue model by using the verification data set to obtain a verification score, and repeatedly executing the step of judging whether the verification score is greater than a preset score threshold value.
In an embodiment of the present invention, after obtaining the updated annotation data set, the method further includes:
and updating the preset unmarked pool according to the updated marked data set.
In a specific embodiment of the present invention, the performing a verification operation on the initial professional dialogue model by using a verification data set and preset natural language processing evaluation indexes includes:

performing the verification operation on the initial professional dialogue model by combining the verification data set with the BLEU index, the ROUGE index, the PPL index, and the DISTINCT index through the following formula:

$$score = S_{BLEU} + S_{ROUGE} + S_{PPL} + S_{DISTINCT}$$

where $S_{BLEU}$ is the score of the initial professional dialogue model on the BLEU index, $S_{ROUGE}$ is the score of the initial professional dialogue model on the ROUGE index, $S_{PPL}$ is the score of the initial professional dialogue model on the PPL index, taken in the form of the reciprocal of the PPL value, $S_{DISTINCT}$ is the score of the initial professional dialogue model on the DISTINCT index, and $score$ is the verification score.
In a specific embodiment of the present invention, the method further comprises a process of calculating the score $S_{BLEU}$ of the initial professional dialogue model on the BLEU index, the calculation process comprising:

calculating the score $S_{BLEU}$ of the initial professional dialogue model on the BLEU index through the following formula:

$$S_{BLEU} = BP \cdot \exp\left(\sum_{n=1}^{N} w_n \log p_n\right), \qquad BP = \begin{cases} 1, & c > r \\ e^{1 - r/c}, & c \le r \end{cases}$$

where $c$ is the length of the machine translation, $r$ is the length of the shortest reference translation sentence, $p_n$ is the n-gram precision, $w_n$ is the n-gram weight, with $w_n = 1/N$ for any $n$, and $BP$ is a penalty factor.
In a specific embodiment of the present invention, the method further comprises a process of calculating the score $S_{ROUGE}$ of the initial professional dialogue model on the ROUGE index, the calculation process comprising:

calculating the score $S_{ROUGE}$ of the initial professional dialogue model on the ROUGE index through the following formula:

$$ROUGE\text{-}N = \frac{\sum_{S \in \{\mathrm{ref}\}} \sum_{gram_N \in S} Count_{match}(gram_N)}{\sum_{S \in \{\mathrm{ref}\}} \sum_{gram_N \in S} Count(gram_N)}$$

where $\{\mathrm{ref}\}$ denotes the set of reference translations, $gram_N$ denotes a combination of N words, and $Count(gram_N)$ denotes the count of that N-gram; the denominator of the formula counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.
In an embodiment of the present invention, the method further comprises a process of calculating the score of the initial professional dialogue model on the PPL index, the calculation process comprising:

$$PPL = \exp\left(-\frac{1}{N} \sum_{i=1}^{N} \log p(w_i \mid w_1, \ldots, w_{i-1})\right)$$

where $p(w_i \mid w_1, \ldots, w_{i-1})$ denotes the probability of predicting the i-th word from the preceding words and $N$ denotes the sentence length.
In an embodiment of the present invention, the method further comprises a process of calculating the score $S_{DISTINCT}$ of the initial professional dialogue model on the DISTINCT index, the calculation process comprising:

calculating the score $S_{DISTINCT}$ of the initial professional dialogue model on the DISTINCT index through the following formula:

$$Distinct(n) = \frac{Count(\mathrm{unique}\ ngram)}{Count(ngram)}$$

where $Count(\mathrm{unique}\ ngram)$ denotes the number of non-repeating n-grams in the reply and $Count(ngram)$ denotes the total number of n-gram words in the reply.
In an embodiment of the present invention, before training the original dialogue model by using the pre-acquired general dialogue data set, the method further includes:

respectively filtering the question-answer data and the chitchat data in the general dialogue data set.
In a specific embodiment of the present invention, training the original dialogue model with the pre-acquired general dialogue data set to obtain the general dialogue model includes:

inputting the general dialogue data set into the original dialogue model for iterative model training;

acquiring the current iteration number and the loss standard deviation obtained in the current round of iterative training;
determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation;
and if so, determining the dialogue model obtained by the iterative training as the general dialogue model.
In an embodiment of the present invention, determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation includes:

judging whether the current iteration number is greater than a first preset value and the loss standard deviation is smaller than a second preset value.
In an embodiment of the present invention, when it is determined that the current iteration number is greater than the first preset value and the loss standard deviation is greater than or equal to the second preset value, the method further includes:
judging whether the current iteration number is larger than a third preset value or not; wherein the third preset value is greater than the first preset value;
if yes, determining the dialogue model obtained by the iterative training in the current round as the general dialogue model;
and if not, inputting the general dialogue data set into a dialogue model obtained by the current iteration training for model iteration training, and repeatedly executing the steps of obtaining the current iteration number and the loss standard deviation obtained by the current iteration training.
In a specific embodiment of the present invention, the data screening of the general dialog data set according to the professional keyword group includes:
and performing data screening on the universal dialogue data set according to the professional keyword group by utilizing a DFA algorithm.
A dialogue response method is applied to a dialogue system containing a target professional dialogue model obtained by the training, and comprises the following steps:
receiving target question voice to be responded;
generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training a general dialogue model;
and performing output operation on the target response voice.
In one embodiment of the present invention, the method further comprises:
searching relevant answers from a database based on a preset retrieval algorithm when the target professional dialogue model fails to respond to the target question voice;
and carrying out voice output on the related answers.
A training apparatus of a dialogue model, comprising:
the general dialogue model acquisition module is used for training an original dialogue model with a pre-acquired general dialogue data set to obtain a general dialogue model;
the initial labeling data set determining module is used for acquiring a preset professional keyword group, performing data screening on the general dialogue data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
the initial professional dialogue model acquisition module is used for training the general dialogue model by utilizing the initial labeling data set to obtain an initial professional dialogue model;
the verification score obtaining module is used for carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
the judging module is used for judging whether the verification score is larger than a preset score threshold value or not;
and the target professional dialogue model determining module is used for determining the initial professional dialogue model as the target professional dialogue model when the verification score is larger than a preset score threshold value.
A dialog response device comprising:
the question voice receiving module is used for receiving target question voice to be responded;
the response voice generating module is used for generating target response voice corresponding to the target question voice by utilizing a target professional dialogue model obtained by training a general dialogue model;
and the response voice output module is used for outputting the target response voice.
An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the training method of the dialogue model or the dialogue response method described above when executing the computer program.
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the training method of the dialogue model or the dialogue response method described above.
The training method of the dialogue model provided by the invention trains the original dialogue model with the pre-acquired general dialogue data set to obtain the general dialogue model; acquires a preset professional keyword group, performs data screening on the general dialogue data set according to the professional keyword group, and determines the screened data set as the initial labeling data set; trains the general dialogue model with the initial labeling data set to obtain the initial professional dialogue model; performs a verification operation on the initial professional dialogue model with a verification data set and preset natural language processing evaluation indexes to obtain a verification score; judges whether the verification score is greater than a preset score threshold; and if so, determines the initial professional dialogue model as the target professional dialogue model.
According to the technical scheme, the target professional dialogue model applied to the specific dialogue scene is obtained by training in advance on the basis of the general dialogue model; the requirements for data volume and computing power are greatly reduced, the trained target professional dialogue model has both generality and professionalism, and the user experience is improved.
Correspondingly, the invention also provides a training device of the dialogue model, a dialogue response method and device, an electronic device and a computer readable storage medium corresponding to the training method of the dialogue model, which have the technical effects and are not described herein again.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flow chart of an implementation of a training method for a dialogue model according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating another embodiment of a method for training a dialogue model according to the present invention;
FIG. 3 is a flowchart illustrating an implementation of a dialog response method according to an embodiment of the present invention;
FIG. 4 is a block diagram of a training apparatus for a dialogue model according to an embodiment of the present invention;
FIG. 5 is a block diagram of a dialog response device according to an embodiment of the present invention;
FIG. 6 is a block diagram of an electronic device according to an embodiment of the invention;
fig. 7 is a schematic structural diagram of an electronic device provided in this embodiment.
Detailed Description
In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart of an implementation of a training method for a dialogue model according to an embodiment of the present invention, where the method may include the following steps:
S101: training the original dialogue model with the pre-acquired general dialogue data set to obtain the general dialogue model.
A general dialogue data set is collected in advance from public data sets and can be divided into two main categories: question answering and chitchat. The question-answer data can cover multiple fields such as common knowledge, facts, maternal and child care, medical care, law, insurance, aviation, psychology, traditional Chinese medicine, and epidemic information. The chitchat data can comprise multiple data sets such as microblog exchanges, TV-drama dialogue, Tieba forum discussions, Douban comments, and e-commerce conversations, covering everyday topics such as history, movies, weather, entertainment, and sports.
Specific examples of constructing the general dialogue data set are as follows:

Entry-interpretation prompt format: Title: "title", Article: "text". Raw corpus example: {"id": "0", "url": "https://xxx", "title": "Economics", "text": "Economics is a social science that studies the production, distribution, and consumption of goods and services …"}. Composed in the prompt format, this becomes: Title: "Economics", Article: "Economics is a social science that studies the production, distribution, and consumption of goods and services …".

Question-answer prompt format: Question: "title + desc" Answer: "answer". Raw corpus example: {"qid": 0, "title": "AlphaGo can only play Go; could it write novels and stories?", "desc": "No intelligent robot can currently engage in literary creation; if one could, what level of work could it write?", "answer": "AlphaGo can only play Go, because its design purpose, architecture, technical scheme, and training data are all built around the core task of playing Go …"}. Composed in the prompt format, this becomes: Question: "AlphaGo can only play Go; could it write novels and stories? No intelligent robot can currently engage in literary creation; if one could, what level of work could it write?" Answer: "AlphaGo can only play Go, because its design purpose, architecture, technical scheme, and training data are all built around the core task of playing Go …".

Reading-comprehension prompt format: Context: "context" Question: "question" Answer: "answer". Raw corpus example: {"id": "0", "context": "Cholelithiasis should be treated differently according to the circumstances; asymptomatic gallstones may be left untreated, but good dietary habits …", "question": "What type of gallstone may be left untreated?", "answer": "asymptomatic gallstones"}. Composed in the prompt format, this becomes: Context: "Cholelithiasis should be treated differently according to the circumstances; asymptomatic gallstones may be left untreated, but good dietary habits …" Question: "What type of gallstone may be left untreated?" Answer: "asymptomatic gallstones".

Single-turn or multi-turn dialogue prompt format: Conversation: "dialog1" "dialog2" "dialog3" …. Composed in the prompt format: Conversation: "Why aren't you streaming? I can't see you" "The stream is on" "I really like you" ….
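As an illustration, the following is a minimal sketch of composing such prompt strings from raw corpus records; the field names track the raw-corpus examples above, while the English prompt wording and the function names are assumptions (the original prompts are in Chinese).

```python
# A sketch of composing training samples in the prompt formats above.
# Field names follow the raw-corpus examples; the exact prompt wording
# is an assumption based on the translated patent text.

def compose_entry(sample):
    return f'Title: "{sample["title"]}", Article: "{sample["text"]}"'

def compose_qa(sample):
    return f'Question: "{sample["title"]} {sample["desc"]}" Answer: "{sample["answer"]}"'

def compose_reading(sample):
    return (f'Context: "{sample["context"]}" '
            f'Question: "{sample["question"]}" Answer: "{sample["answer"]}"')

def compose_dialogue(turns):
    return 'Conversation: ' + ' '.join(f'"{t}"' for t in turns)

raw = {"qid": 0, "title": "AlphaGo can only play Go;",
       "desc": "could it write novels?", "answer": "AlphaGo can only play Go, because ..."}
print(compose_qa(raw))
```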
And training the original dialogue model by utilizing the pre-acquired general dialogue data set to obtain the general dialogue model.
S102: acquiring a preset professional keyword group, performing data screening on the general dialogue data set according to the professional keyword group, and determining the screened data set as the initial labeling data set.
Professional dialogue data sets are generally annotated by experts; although the amount of data required is far smaller than that of a general dialogue data set, relying on expert annotation alone is time-consuming and labor-intensive, so a professional keyword group is preset. After the original dialogue model is trained with the general dialogue data set to obtain the general dialogue model, the preset professional keyword group is acquired, the general dialogue data set is screened according to the professional keyword group, and the screened data set is determined as the initial labeling data set. Obtaining the initial labeling data set by keyword screening of the general dialogue data set greatly improves the generation efficiency of the professional dialogue data set compared with purely manual annotation.
In a specific embodiment of the present invention, the data screening of the general dialogue data set according to the professional keyword group may include the following steps:

screening the general dialogue data set according to the professional keyword group using the DFA algorithm.

When the general dialogue data set is screened, the DFA (deterministic finite automaton) algorithm is used to screen it according to the professional keyword group. This makes full use of the DFA algorithm's efficient keyword matching, the same mechanism widely used for sensitive-word filtering.
In the embodiment of the invention, the process of using the DFA algorithm to match keywords and screen professional dialogue data out of the general dialogue data set can comprise the following steps:

(1) Experts provide the professional keyword group;

(2) A professional-word linked structure is built by creating a nested dictionary over the professional keyword group, with each keyword terminated by the specific character '\x00';

(3) Each group of dialogues in the general dialogue data set is traversed; the dialogue text is used as input to walk the professional-word structure, and if the specific character '\x00' is reached, the dialogue contains a professional keyword and is screened out.
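For illustration, a minimal sketch of steps (2) and (3) follows, assuming the nested-dictionary construction and the '\x00' end marker described above; the keyword list, function names, and sample dialogues are illustrative.

```python
# A sketch of DFA-style keyword screening using a nested-dictionary trie.

def build_keyword_trie(keywords):
    """Step (2): build a nested dictionary; '\\x00' marks the end of a keyword."""
    trie = {}
    for word in keywords:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node['\x00'] = True  # end-of-keyword marker
    return trie

def contains_keyword(trie, text):
    """Step (3): walk the trie from every position; True once an end marker is hit."""
    for start in range(len(text)):
        node = trie
        for ch in text[start:]:
            if ch not in node:
                break
            node = node[ch]
            if '\x00' in node:  # a professional keyword occurs in this dialogue
                return True
    return False

keywords = ["status light", "power supply"]   # expert-provided (step 1)
trie = build_keyword_trie(keywords)
dialogues = ["my status light stays on", "nice weather today"]
print([d for d in dialogues if contains_keyword(trie, d)])  # screened-out dialogues
```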
Although some professional dialogue data can be screened out through keyword matching, the professional dialogue contained in a general dialogue data set is usually limited, especially for niche specialities, so expert annotation is still needed. Data annotated by experts may involve privacy, so desensitization (hiding private information in the dialogues such as names, mobile phone numbers, and email addresses) must be added. As with the construction of the general dialogue data set, the professional dialogue data set is composed in the prompt formats of Table 1.
A specific example of building a server professional dialogue data set is as follows:

Intelligent customer service for servers belongs to multi-turn dialogue; the dialogue content is, for example: "Hello, how may I help you?" "The status light staying on is related to the power supply and does not affect normal operation of the server; 'status' is an aggregate indicator that lights up when a problem occurs, and plugging in all 4 power circuits is recommended." "There is no way to connect all 4 power feeds on site; is there any way to keep the status light from staying on?" "Yes, with a command the power policy can be flashed to two feeds."
S103: and training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model.
After the initial labeling data set is obtained, the general dialogue model is trained with the initial labeling data set to obtain the initial professional dialogue model.
S104: and carrying out verification operation on the initial professional dialogue model by utilizing the verification data set and preset natural language processing evaluation indexes to obtain a verification score.
After the initial professional dialogue model is obtained by training, a verification operation is performed on it using the verification data set and the preset natural language processing evaluation indexes to obtain a verification score. The verification score predicts the response performance of the initial professional dialogue model to spoken questions.
S105: and judging whether the verification score is larger than a preset score threshold value, if so, executing the step S106, and if not, continuing training the initial professional dialogue model.
A score threshold is preset. After the initial professional dialogue model is verified using the verification data set and the preset natural language processing evaluation indexes, whether the verification score is greater than the preset score threshold is judged. If so, the model is fully trained and step S106 is executed; otherwise, the initial professional dialogue model needs further training.
S106: and determining the initial professional dialogue model as a target professional dialogue model.
When the verification score is determined to be greater than the preset score threshold, the model is fully trained, and the initial professional dialogue model is determined as the target professional dialogue model. The target professional dialogue model and all expert-annotated data accumulated so far can also be output. Judging whether the professional dialogue model is fully trained against the preset score threshold ensures that the trained target professional dialogue model has a good capability of generating answers to spoken questions.
According to the technical scheme, the target professional dialogue model applied to the specific dialogue scene is obtained by training in advance on the basis of the general dialogue model; the requirements for data volume and computing power are greatly reduced, the trained target professional dialogue model has both generality and professionalism, and the user experience is improved.
Referring to fig. 2, fig. 2 is a flowchart of another implementation of a training method of a dialogue model in an embodiment of the present invention, where the method may include the following steps:
S201: training the original dialogue model with the pre-acquired general dialogue data set to obtain the general dialogue model.
In an embodiment of the present invention, before step S201, the method for training a dialogue model may further include the following steps:

respectively filtering the question-answer data and the chitchat data in the general dialogue data set.

After the general dialogue data set is obtained, the question-answer data and the chitchat data in it are filtered respectively. For example, since the overall noise of the question-answer data is low, simple filtering suffices, including removing dialogues containing sensitive words, removing dialogues with too few words, removing dialogues whose question is identical to its answer, removing meaningless characters from the corpus, and the like. Because the chitchat data is much noisier overall, strict filtering is required: removing dialogues containing sensitive words, removing dialogues with too few words, removing dialogues consisting of only one sentence, removing dialogues containing no Chinese characters, deleting advertisement dialogues, deleting duplicate dialogues, removing meaningless characters from the corpus, and the like. Training the original dialogue model on the filtered general dialogue data set avoids interference from useless data, reduces model-training complexity, improves training efficiency, and improves the accuracy of the trained model.
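A minimal sketch of this kind of rule-based filtering follows; the word-count threshold, sensitive-word list, and sample data are illustrative assumptions, and only a subset of the rules listed above is shown.

```python
import re

MIN_CHARS = 5  # assumed minimum length; the exact threshold is not given in the text

def filter_chitchat(dialogues, sensitive_words):
    """Strict filtering for chitchat data, applying a subset of the rules above."""
    seen, kept = set(), []
    for turns in dialogues:
        text = "".join(turns)
        if any(w in text for w in sensitive_words):
            continue                                    # sensitive words
        if len(text) < MIN_CHARS or len(turns) < 2:
            continue                                    # too short / single sentence
        if not re.search(r"[\u4e00-\u9fff]", text):
            continue                                    # no Chinese characters
        if text in seen:
            continue                                    # duplicate dialogue
        seen.add(text)
        kept.append([re.sub(r"\s+", " ", t).strip() for t in turns])  # clean characters
    return kept

print(filter_chitchat([["你好呀", "你好, 今天天气不错"], ["hi"]], ["广告"]))
```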
In order to make the training effect better, the data set can be composed according to different categories in a certain prompt format, as shown in Table 1.

Table 1: prompt format for each data category (reproduced as an image in the original publication).

A fixed prompt format reduces subsequent processing work.
In one embodiment of the present invention, step S201 may include the following steps:
the method comprises the following steps: inputting the general dialogue data set into an original dialogue model to carry out model iterative training;
step two: acquiring the current iteration number and the loss standard deviation obtained in the current round of iterative training;
step three: determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation, if so, executing a fourth step, and if not, executing a fifth step;
step four: determining a dialogue model obtained by the iterative training of the current round as a general dialogue model;
step five: and inputting the general dialogue data set into the dialogue model obtained by the iterative training of the current round to carry out model iterative training, and returning to execute the second step.
For convenience of description, the above five steps may be combined for illustration.
The process of training the original dialogue model with the general dialogue data set to obtain the general dialogue model can comprise: inputting the general dialogue data set into the original dialogue model for iterative training; acquiring the current iteration number and the loss standard deviation obtained in the current round of iterative training; and determining whether the model training cutoff condition is reached according to the two. If it is reached, the model obtained by the current training can already give good spoken responses to general questions, and the dialogue model obtained in the current round of iterative training is determined as the general dialogue model. If it is not, the model cannot yet respond well to general questions; the general dialogue data set is input into the dialogue model obtained in the current round for further iterative training, the current iteration number and loss standard deviation are acquired again, and the model is continuously optimized through multiple training iterations.
It should be noted that the model training cutoff condition may be set and adjusted according to an actual situation, which is not limited in the embodiment of the present invention, and may be set as an upper limit of the number of iterations, or may be set as a loss threshold.
In an embodiment of the present invention, determining whether the model training cutoff condition is reached according to the current iteration number and the loss standard deviation may include the following steps:
the method comprises the following steps: inputting the general dialogue data set into an original dialogue model to carry out model iterative training;
step two: acquiring the current iteration number and the loss standard deviation obtained in the current round of iterative training;

step three: judging whether the current iteration number is greater than the first preset value and the loss standard deviation is smaller than the second preset value; if so, executing step four; if not, i.e., when the current iteration number is greater than the first preset value but the loss standard deviation is greater than or equal to the second preset value, executing step five;
step four: determining a dialogue model obtained by the iterative training of the current round as a general dialogue model;
step five: judging whether the current iteration number is larger than a third preset value or not, if so, returning to execute the fourth step, and if not, executing the sixth step;
wherein the third preset value is greater than the first preset value;
step six: and inputting the general dialogue data set into the dialogue model obtained by the iterative training of the current round to carry out model iterative training, and returning to execute the second step.
For convenience of description, the above six steps may be combined for illustration.
The hyper-parameters in model training are preset and may include the iteration number, the minimum pre-training iteration number $T_{min}$ obtained from pre-training (i.e., the first preset value), and the loss standard deviation threshold $\sigma_{th}$ (i.e., the second preset value); the loss standard deviation $\sigma$ denotes the standard deviation of the losses of the latest ten iterations. After the current iteration number $t$ and the loss standard deviation of the current round of iterative training are obtained, whether the current iteration number is greater than the first preset value and the loss standard deviation is smaller than the second preset value is judged, i.e., whether

$$t > T_{min} \quad \text{and} \quad \sigma < \sigma_{th},$$

thereby determining whether the model training cutoff condition has been reached. Judging the training stage by combining the current iteration number with the loss standard deviation ensures that a model meeting the cutoff condition has been iterated a sufficient number of times, improving model performance.
The preset hyper-parameters in model training may further include the maximum pre-training iteration number $T_{max}$ obtained from pre-training (i.e., the third preset value), which is greater than the first preset value, i.e., $T_{max} > T_{min}$. When the current iteration number is greater than the first preset value but the loss standard deviation is greater than or equal to the second preset value, whether the current iteration number is greater than the third preset value is judged. If so, the loss is decreasing only slowly and the model has been trained close to a global optimum, so the dialogue model obtained in the current round of iterative training is determined as the general dialogue model. If not, the model needs further training: the general dialogue data set is input into the dialogue model obtained in the current round for another round of iterative model training, the current iteration number and the loss standard deviation of that round are acquired again, and whether the model training cutoff condition has been reached is judged on the new data. These steps repeat until the preset cutoff condition is reached, yielding a general dialogue model that can respond well to general spoken questions.
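A minimal sketch of this stopping rule follows; the preset values, the stand-in training function, and the loss values are illustrative assumptions.

```python
import random
import statistics

def train_one_round():
    """Placeholder for one round of iterative training; returns the round's loss."""
    return random.uniform(0.0, 0.05)

T_MIN, T_MAX, SIGMA_TH = 100, 500, 1e-3   # first / third / second preset values
losses, t = [], 0
while True:
    t += 1
    losses.append(train_one_round())
    if t <= T_MIN:
        continue                               # minimum iteration count not reached
    sigma = statistics.stdev(losses[-10:])     # std of the latest ten losses
    if sigma < SIGMA_TH or t > T_MAX:          # cutoff condition reached
        break                                  # current model becomes the general model
print(t)
```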
S202: acquiring a preset professional keyword group, performing data screening on the general dialogue data set according to the professional keyword group, and determining the screened data set as the initial labeling data set.
S203: and training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model.
S204: and carrying out verification operation on the initial professional dialogue model by utilizing the verification data set and preset natural language processing evaluation indexes to obtain a verification score.
In a specific embodiment of the present invention, verifying the initial professional dialogue model by using the verification data set and the preset natural language processing evaluation indexes may include the following steps:

performing the verification operation on the initial professional dialogue model by combining the verification data set with the BLEU index, the ROUGE index, the PPL index, and the DISTINCT index through the following formula:

$$score = S_{BLEU} + S_{ROUGE} + S_{PPL} + S_{DISTINCT}$$

where $S_{BLEU}$ is the score of the initial professional dialogue model on the BLEU index, $S_{ROUGE}$ is its score on the ROUGE index, $S_{PPL}$ is its score on the PPL index, taken in the form of the reciprocal of the PPL value (the smaller this term, the worse the model's generation), $S_{DISTINCT}$ is its score on the DISTINCT index, and $score$ is the verification score. When the initial professional dialogue model is verified, the verification operation thus combines the verification data set with these four indexes and computes the verification score as above.
Adopting the four indexes BLEU, ROUGE, PPL, and DISTINCT to comprehensively judge the model's performance on the verification data set ensures the accuracy and recall of the generated answers while guaranteeing the fluency and diversity of the model's generation.
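As an illustration, a minimal sketch of combining the four indexes follows. The equal-weight sum is an assumption (the patent gives the combination formula only as an image), and the threshold value is illustrative; the patent does state that the PPL term enters as a reciprocal.

```python
# A sketch of the four-index verification score.

def verification_score(s_bleu, s_rouge, ppl, s_distinct):
    s_ppl = 1.0 / ppl  # reciprocal form: lower perplexity yields a higher score
    return s_bleu + s_rouge + s_ppl + s_distinct

SCORE_THRESHOLD = 1.0  # illustrative preset score threshold
if verification_score(0.32, 0.41, 8.5, 0.56) > SCORE_THRESHOLD:
    print("initial professional dialogue model accepted as the target model")
```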
In an embodiment of the present invention, the training method of the dialogue model may further comprise a process of calculating the score $S_{BLEU}$ of the initial professional dialogue model on the BLEU index, which may include:

calculating the score $S_{BLEU}$ of the initial professional dialogue model on the BLEU index through the following formula:

$$S_{BLEU} = BP \cdot \exp\left(\sum_{n=1}^{N} w_n \log p_n\right), \qquad BP = \begin{cases} 1, & c > r \\ e^{1 - r/c}, & c \le r \end{cases}$$

where $c$ is the length of the machine translation, $r$ is the length of the shortest reference translation sentence, $p_n$ is the n-gram precision, $w_n$ is the n-gram weight, and $BP$ is a penalty factor.

The core idea of BLEU is to compare the degree of overlap between the n-grams of the candidate translation and those of the reference translations; the higher the overlap, the higher the translation quality is considered to be. In practice, $N = 1$ to $4$ is usually taken and a weighted average is computed; the weights are typically set uniformly, i.e., $w_n = 1/N$ for any $n$. $BP$ is a penalty factor: if the candidate translation is shorter than the shortest reference translation, $BP$ is smaller than 1. The 1-gram precision of BLEU indicates how faithful the decoded text is to the source, while the higher-order n-grams indicate the fluency of the translation.
In an embodiment of the present invention, the training method of the dialogue model may further comprise a process of calculating the score $S_{ROUGE}$ of the initial professional dialogue model on the ROUGE index, which may include:

calculating the score $S_{ROUGE}$ of the initial professional dialogue model on the ROUGE index through the following formula:

$$ROUGE\text{-}N = \frac{\sum_{S \in \{\mathrm{ref}\}} \sum_{gram_N \in S} Count_{match}(gram_N)}{\sum_{S \in \{\mathrm{ref}\}} \sum_{gram_N \in S} Count(gram_N)}$$

where $\{\mathrm{ref}\}$ denotes the set of reference translations (in practical applications there may be several), $gram_N$ denotes a combination of N words, and $Count(gram_N)$ denotes the count of that N-gram. The denominator of the formula counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.

ROUGE-N focuses on recall rather than precision: it measures how many n-gram phrases of the reference sentences appear in the output. 'N' refers to the n-gram; the computation resembles BLEU, except that BLEU is precision-based while ROUGE is recall-based. ROUGE-N is mainly used to count recall over N-grams, and a ROUGE-N score can be computed for each N with the formula above.
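A minimal sketch of ROUGE-N as defined above follows, again with whitespace tokenization; matching is clipped per n-gram.

```python
from collections import Counter

def rouge_n(candidate, references, n=2):
    def grams(text):
        toks = text.split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    cand = grams(candidate)
    matched = total = 0
    for ref in references:
        for g, c in grams(ref).items():
            total += c                   # denominator: n-grams in the references
            matched += min(c, cand[g])   # numerator: n-grams shared with the candidate
    return matched / total if total else 0.0

print(rouge_n("the cat sat on the mat", ["the cat is on the mat"], n=1))
```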
In an embodiment of the present invention, the training method of the dialogue model may further comprise a process of calculating the score of the initial professional dialogue model on the PPL index.

PPL refers to perplexity in a language model, an index measuring whether a sentence is fluent. It is defined as:

$$PPL = \exp\left(-\frac{1}{N} \sum_{i=1}^{N} \log p(w_i \mid w_1, \ldots, w_{i-1})\right)$$

where $p(w_i \mid w_1, \ldots, w_{i-1})$ denotes the probability of predicting the i-th word from the preceding words and $N$ denotes the sentence length. The smaller the PPL value, the more natural the model's generation and the more fluent the sentence. Evaluating reply quality through PPL avoids replies generated by the model being garbled or having their word order reversed.
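A minimal sketch of this perplexity computation follows, given the model's per-word conditional probabilities for one sentence; the probability values are illustrative.

```python
import math

def perplexity(token_probs):
    """PPL = exp(-1/N * sum of log p(w_i | preceding words))."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

print(perplexity([0.25, 0.60, 0.90, 0.40]))  # smaller PPL = more fluent sentence
```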
In one embodiment of the invention, the method may further comprise a process of calculating the score $S_{DISTINCT}$ of the initial professional dialogue model on the DISTINCT index, which includes:

calculating the score $S_{DISTINCT}$ of the initial professional dialogue model on the DISTINCT index through the following formula:

$$Distinct(n) = \frac{Count(\mathrm{unique}\ ngram)}{Count(ngram)}$$

where $Count(\mathrm{unique}\ ngram)$ denotes the number of non-repeating n-grams in the reply and $Count(ngram)$ denotes the total number of n-gram words in the reply.

The Distinct evaluation index judges the diversity of machine replies, i.e., whether a large number of generic, repetitive replies are produced. A larger $Distinct(n)$ value indicates greater diversity in the generated replies.
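A minimal sketch of Distinct-n as defined above follows; the sample reply is illustrative.

```python
def distinct_n(reply, n=2):
    """Unique n-grams in the reply divided by the total number of n-grams."""
    toks = reply.split()
    grams = [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    return len(set(grams)) / len(grams) if grams else 0.0

print(distinct_n("i like tea and i like tea"))  # repetition lowers the score
```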
S205: judging whether the verification score is greater than the preset score threshold; if so, executing step S206, and if not, executing step S207.
S206: and determining the initial professional dialogue model as a target professional dialogue model.
S207: and generating corresponding response data aiming at each sample data in the preset unmarked pool by utilizing the initial professional dialogue model.
When the verification score is determined to be less than or equal to the preset score threshold, the model needs further training, and the initial professional dialogue model is used to generate corresponding response data for each sample datum in the preset unlabeled pool.
S208: and respectively calculating the automatic evaluation scores corresponding to the response data.
After corresponding response data are generated for each sample datum in the preset unlabeled pool using the initial professional dialogue model, the automatic evaluation score corresponding to each response is calculated. For example, the automatic evaluation score can be calculated from the PPL index and the DISTINCT index as:

$$score_{auto} = \frac{1}{PPL} + Distinct(n)$$

thereby obtaining the automatic evaluation score corresponding to each response.
S209: sorting the automatic evaluation scores by magnitude, and selecting a preset number of automatic evaluation scores from the low-scoring end.
After the automatic evaluation scores corresponding to the response data are calculated, they are sorted by magnitude and a preset number of them are selected from the low-scoring end, for example the N lowest automatic evaluation scores.
S210: outputting annotation prompt information for annotating the response data corresponding to each selected automatic evaluation score.
After the preset number of automatic evaluation scores are selected from the low-scoring end, annotation prompt information is output for annotating the response data corresponding to the selected scores, thereby prompting expert annotation of the response data corresponding to the N lowest automatic evaluation scores.
S211: and updating the initial labeling data set according to the labeling result to obtain an updated labeling data set.
After the annotation prompt information for annotating the response data corresponding to each selected automatic evaluation score is output, the annotation results are obtained and the initial labeling data set is updated accordingly to obtain the updated labeling data set, thereby effectively annotating the data for which the current professional dialogue model generates poor responses.
In an embodiment of the present invention, after step S211, the method for training a dialogue model may further include the following steps:
and updating the preset unmarked pool according to the updated marked data set.
And after the updated marked data set is obtained, updating the preset unmarked pool according to the updated marked data set, thereby realizing the timely updating of the unmarked sample data in the preset unmarked pool.
S212: and training the initial professional dialogue model based on the updated labeling data set to obtain the updated professional dialogue model.
And after the initial annotation data set is updated according to the annotation result to obtain an updated annotation data set, training the initial professional dialogue model based on the updated annotation data set to obtain an updated professional dialogue model.
The embodiment of the invention adopts an active-learning approach, reducing the amount of expert-annotated samples as much as possible while limiting the impact on model performance: the 'difficult samples' that yield the largest improvement in model performance are continuously selected from the preset unlabeled pool, thereby improving the model.
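A minimal sketch of one selection round of this active-learning loop (steps S207-S212) follows; `generate`, `auto_score`, and `label_fn` are stand-ins for the professional model's generation step, the PPL/DISTINCT-based automatic score, and the expert annotation step, all of which are assumed interfaces the patent does not specify.

```python
def select_hard_samples(generate, auto_score, unlabeled_pool, k):
    """Generate a reply per sample and return the k samples whose replies score lowest."""
    return sorted(unlabeled_pool, key=lambda x: auto_score(generate(x)))[:k]

def active_learning_round(generate, auto_score, pool, labeled_set, label_fn, k=100):
    hard = select_hard_samples(generate, auto_score, pool, k)   # S207-S209
    labeled_set.extend(label_fn(x) for x in hard)               # S210-S211: expert annotation
    pool[:] = [x for x in pool if x not in hard]                # update the unlabeled pool
    return labeled_set                                          # used for S212 retraining
```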
S213: and performing verification operation on the updated professional dialogue model by using the verification data set to obtain a verification score, and returning to execute the step S205.
After the initial professional dialogue model is trained on the updated labeling data set to obtain the updated professional dialogue model, the updated professional dialogue model is verified with the verification data set to obtain a verification score, and the procedure returns to the step of judging whether the verification score is greater than the preset score threshold. These steps repeat until the calculated verification score is greater than the preset score threshold, yielding a target professional dialogue model that responds well to received spoken questions.
Referring to fig. 3, fig. 3 is a flowchart of an implementation of a dialog response method in an embodiment of the present invention, applied to a dialog system including a target specialized dialog model obtained by the preceding training, where the method may include the following steps:
S301: and receiving target question voice to be responded to.
When a user needs to conduct a dialogue in a given scenario, the user outputs target question voice to the dialogue response control center, and the dialogue response control center receives the target question voice to be responded to.
The dialogue response control center may be a processor deployed with a dialogue model.
The target question voice may be chit-chat, common-knowledge question answering, professional question answering, and the like.
S302: and generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training the general dialogue model.
The general dialogue model is obtained by training in advance; for example, model training may be performed on the general dialogue data set based on a large model, where the large model may be based on the Transformer structure and suited to generation tasks, such as a GPT (Generative Pre-Training) model or a BERT (Bidirectional Encoder Representations from Transformers) model. The target professional dialogue model is then obtained by training based on the general dialogue model. After the target question voice to be responded to is received, the target response voice corresponding to the target question voice is generated using the target professional dialogue model obtained by training the general dialogue model.
By retraining on the basis of a large model, the requirements on data volume and computing power are greatly reduced, and the two-stage training scheme gives the trained target professional dialogue model both universality and professionality.
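A minimal sketch of the two-stage scheme, assuming a hypothetical .fit training interface (the patent does not prescribe a framework):

```python
def two_stage_training(original_model, general_dataset, professional_dataset):
    """Stage 1: train the original large model on the general dialogue data
    set to obtain the general dialogue model. Stage 2: continue training the
    general dialogue model on the much smaller professional labeled data set
    to obtain the professional dialogue model."""
    general_model = original_model.fit(general_dataset)            # stage 1
    professional_model = general_model.fit(professional_dataset)   # stage 2
    return professional_model
```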
S303: and performing output operation on the target response voice.
And after generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training the general dialogue model, outputting the target response voice, thereby realizing the response to the target question voice.
Because the model training process requires more resources than the model application process, more resources may be allocated to model training in advance and relatively fewer resources to model application. For example, eight or more GPUs (Graphics Processing Units) with 80 GB of memory each may be pre-allocated for model training, and one or more 80 GB GPUs for model application.
According to the technical scheme, the target professional dialogue model applied to a specific dialogue scene is obtained in advance by training based on the general dialogue model, which greatly reduces the requirements on data volume and computing power; the trained target professional dialogue model has both universality and professionality, improving the user experience.
In an embodiment of the present invention, the dialog response method may further include the steps of:
step one: when the target professional dialogue model fails to respond to the target question voice, searching a relevant answer from a database based on a preset retrieval algorithm;
step two: and outputting the relevant answer as voice.
For convenience of description, the above two steps may be combined for illustration.
The embodiment of the invention presets a fallback scheme: a professional database is constructed from the professional data set, and when the target professional dialogue model fails to respond to the target question voice, that is, when the output of the target professional dialogue model is empty, a relevant answer is searched from the database based on a preset retrieval algorithm and output as voice. This optimizes the application process of the professional dialogue model, ensures that a user's question voice does not go unanswered, and improves the user experience.
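A minimal sketch of this fallback, assuming hypothetical model.generate and retriever.search interfaces (not part of the patent):

```python
def answer_with_fallback(model, retriever, question):
    """Try the target professional dialogue model first; if its output is
    empty (it failed to respond), search the professional database with the
    preset retrieval algorithm and return that answer instead."""
    response = model.generate(question)
    if not response:                           # model output is empty
        response = retriever.search(question)  # fallback: database retrieval
    return response                            # synthesized to speech downstream
```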
Corresponding to the above method embodiment, the present invention further provides a training apparatus for a dialogue model, and the training apparatus for a dialogue model described below and the training method for a dialogue model described above may be referred to in correspondence.
Referring to fig. 4, fig. 4 is a block diagram illustrating a training apparatus for a dialogue model according to an embodiment of the present invention, where the training apparatus for a dialogue model may include:
a general dialogue model obtaining module 41, configured to train an original dialogue model by using a pre-obtained general dialogue dataset to obtain a general dialogue model;
an initial labeling data set determining module 42, configured to obtain a preset professional keyword group, perform data screening on the general conversation data set according to the professional keyword group, and determine a data set obtained by the screening as an initial labeling data set;
an initial professional dialogue model obtaining module 43, configured to train the general dialogue model by using the initial annotation data set, so as to obtain an initial professional dialogue model;
a verification score obtaining module 44, configured to perform a verification operation on the initial professional dialogue model by using the verification data set and a preset natural language processing evaluation index, so as to obtain a verification score;
a judging module 45, configured to judge whether the verification score is greater than a preset score threshold;
and the target professional dialogue model determining module 46 is used for determining the initial professional dialogue model as the target professional dialogue model when the verification score is larger than the preset score threshold value.
According to the technical scheme, the target professional dialogue model applied to a specific dialogue scene is obtained in advance by training based on the general dialogue model, which greatly reduces the requirements on data volume and computing power; the trained target professional dialogue model has both universality and professionality, improving the user experience.
In an embodiment of the present invention, the training device for dialogue model may further include:
the response data generation module is used for generating corresponding response data for each sample data in the preset unmarked pool by using the initial professional dialogue model when the verification score is determined to be less than or equal to the preset score threshold value;
the automatic evaluation score calculation module is used for respectively calculating the automatic evaluation scores corresponding to the response data;
the automatic evaluation score selection module is used for sorting the automatic evaluation scores and selecting a preset number of automatic evaluation scores from the end with smaller scores;
the labeling prompt information output module is used for outputting labeling prompt information for labeling the response data corresponding to the selected automatic evaluation scores;
the annotation data set updating module is used for updating the initial annotation data set according to the annotation result to obtain an updated annotation data set;
the professional dialogue model updating module is used for training the initial professional dialogue model based on the updated labeled data set to obtain an updated professional dialogue model;
and the repeated execution module is used for carrying out verification operation on the updated professional dialogue model by utilizing the verification data set to obtain a verification score and repeatedly executing the step of judging whether the verification score is greater than a preset score threshold value.
In an embodiment of the present invention, the training device for dialogue model may further include:
and the unmarked pool updating module is used for updating the preset unmarked pool according to the updated marked data set after the updated marked data set is obtained.
In a specific embodiment of the present invention, the verification score obtaining module 44 is specifically configured to perform a verification operation on the initial professional dialogue model by combining the verification data set, the BLEU index, the ROUGE index, the PPL index, and the DISTINCT index according to the following formulas:
$$\mathrm{Score} = S_{\mathrm{BLEU}} + S_{\mathrm{ROUGE}} + S_{\mathrm{PPL}} + S_{\mathrm{DISTINCT}}$$
wherein $S_{\mathrm{BLEU}}$ is the score of the initial professional dialogue model on the BLEU index, $S_{\mathrm{ROUGE}}$ is the score of the initial professional dialogue model on the ROUGE index, $S_{\mathrm{PPL}}$ is the score of the initial professional dialogue model on the PPL index, obtained in the form of the reciprocal of the PPL value so that a larger value is better, $S_{\mathrm{DISTINCT}}$ is the score of the initial professional dialogue model on the DISTINCT index, and $\mathrm{Score}$ is the verification score.
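As a sketch of how the four indices might be combined (the equal-weight sum in the function below is an assumption; the patent states only that the four indices are combined, with PPL entering as a reciprocal):

```python
def verification_score(s_bleu, s_rouge, ppl, s_distinct):
    """Combine the four automatic indices into a single verification score.
    PPL enters as its reciprocal so that a larger value is always better;
    the equal weighting is an assumed choice."""
    return s_bleu + s_rouge + (1.0 / ppl) + s_distinct
```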
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the BLEU index, for calculating the score $S_{\mathrm{BLEU}}$ of the initial professional dialogue model on the BLEU index by the following formula:

$$\mathrm{BLEU} = BP \cdot \exp\Big(\sum_{n=1}^{N} w_n \log p_n\Big), \qquad BP = \begin{cases} 1, & l_c > l_r \\ e^{\,1 - l_r/l_c}, & l_c \le l_r \end{cases}$$

wherein $l_c$ is the length of the machine translation, $l_r$ is the shortest length among the reference translations, $p_n$ is the precision of the n-gram, $w_n$ is the weight of the n-gram, with $w_n = 1/N$ for any n, and $BP$ is the penalty factor.
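A minimal, self-contained sketch of this BLEU computation over tokenized sentences (a simplified illustration, not the patent's implementation):

```python
import math
from collections import Counter

def bleu(candidate, references, max_n=4):
    """Minimal BLEU sketch: geometric mean of n-gram precisions with
    w_n = 1/N for every n, multiplied by the brevity penalty BP."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    log_p_sum = 0.0
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        best = Counter()
        for ref in references:               # clipped counts per reference
            ref_counts = ngrams(ref, n)
            for gram in cand:
                best[gram] = max(best[gram], ref_counts[gram])
        overlap = sum(min(count, best[gram]) for gram, count in cand.items())
        total = max(sum(cand.values()), 1)
        # clamp to avoid log(0) when there is no n-gram overlap
        log_p_sum += (1.0 / max_n) * math.log(max(overlap, 1e-9) / total)
    l_c = max(len(candidate), 1)                  # machine translation length
    l_r = min(len(ref) for ref in references)     # shortest reference length
    bp = 1.0 if l_c > l_r else math.exp(1 - l_r / l_c)
    return bp * math.exp(log_p_sum)
```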
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the ROUGE index, for calculating the score $S_{\mathrm{ROUGE}}$ of the initial professional dialogue model on the ROUGE index by the following formula:

$$\mathrm{ROUGE\text{-}N} = \frac{\sum_{S \in \{\mathrm{reference\ translations}\}} \sum_{gram_N \in S} Count_{match}(gram_N)}{\sum_{S \in \{\mathrm{reference\ translations}\}} \sum_{gram_N \in S} Count(gram_N)}$$

wherein $\{\mathrm{reference\ translations}\}$ denotes the set of reference translations and $gram_N$ denotes a combination of N words; the denominator counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.
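A minimal sketch of this ROUGE-N recall over tokenized sentences (a simplified illustration under the formula above):

```python
from collections import Counter

def rouge_n(candidate, references, n=2):
    """Minimal ROUGE-N sketch: N-gram recall of the machine translation
    (candidate) against the set of reference translations."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate)
    matched = 0   # numerator: N-grams shared with the candidate
    total = 0     # denominator: all N-grams in the references
    for ref in references:
        ref_counts = ngrams(ref)
        total += sum(ref_counts.values())
        matched += sum(min(count, cand[gram]) for gram, count in ref_counts.items())
    return matched / total if total else 0.0
```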
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the PPL index, for calculating the score $S_{\mathrm{PPL}}$ of the initial professional dialogue model on the PPL index by the following formula:

$$\mathrm{PPL} = \exp\Big(-\frac{1}{N}\sum_{i=1}^{N} \log p(w_i \mid w_1, \ldots, w_{i-1})\Big)$$

wherein $p(w_i \mid w_1, \ldots, w_{i-1})$ denotes the probability of predicting the i-th word from the preceding words, and N denotes the sentence length.
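A minimal sketch of this perplexity computation, assuming per-token log-probabilities are available from the model:

```python
import math

def perplexity(token_log_probs):
    """Minimal PPL sketch: given log p(w_i | w_1..w_{i-1}) for each of the
    N words of a sentence, PPL = exp(-(1/N) * sum of log-probabilities)."""
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n) if n else float("inf")
```

For example, perplexity([math.log(0.25)] * 4) returns 4.0, since every word was predicted with probability 0.25.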
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the DISTINCT index, for calculating the score $S_{\mathrm{DISTINCT}}$ of the initial professional dialogue model on the DISTINCT index by the following formula:

$$\mathrm{DISTINCT}(n) = \frac{Count(\mathrm{unique\ } ngram)}{Count(ngram)}$$

wherein $Count(\mathrm{unique\ } ngram)$ denotes the number of non-repeating n-grams in the reply, and $Count(ngram)$ denotes the total number of n-grams in the reply.
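A minimal sketch of this DISTINCT-n ratio over a tokenized reply (a simplified illustration of the formula above):

```python
def distinct_n(tokens, n=1):
    """Minimal DISTINCT-n sketch: the ratio of unique n-grams to the total
    number of n-grams in a reply; higher values mean a more diverse reply."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(grams)) / len(grams) if grams else 0.0
```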
In an embodiment of the present invention, the training device for dialogue model may further include:
and the data filtering module is used for respectively filtering question-answer data and chatting data in the universal dialogue data set before training the original dialogue model by using the pre-acquired universal dialogue data set.
In one embodiment of the present invention, the general dialogue model obtaining module 41 includes:
the iterative training submodule is used for inputting the general dialogue data set into an original dialogue model to carry out model iterative training;
the loss standard deviation obtaining submodule is used for obtaining the current iteration number and the loss standard deviation obtained by the iteration training of the current round;
the training cutoff judgment submodule is used for determining whether a model training cutoff condition is met according to the current iteration number and the loss standard deviation;
and the general dialogue model determining submodule is used for determining the dialogue model obtained by the iteration training in the current round as the general dialogue model when the model training cut-off condition is determined to be reached according to the current iteration number and the loss standard deviation.
In an embodiment of the present invention, the training cutoff determination sub-module is a module for determining whether the current iteration number is greater than a first preset value and the loss standard deviation is smaller than a second preset value.
In an embodiment of the present invention, the training device for dialogue model may further include:
the iteration number counting submodule is used for judging whether the current iteration number is greater than a third preset value or not when the current iteration number is determined to be greater than a first preset value and the loss standard deviation is greater than or equal to a second preset value; wherein the third preset value is greater than the first preset value;
the general dialogue model determining submodule is also used for determining the dialogue model obtained by the iteration training in the current round as the general dialogue model when the current iteration number is larger than a third preset value;
and the iterative training submodule is also used for inputting the general dialogue data set into the dialogue model obtained by the iterative training of the current round to perform model iterative training when the current iterative number is less than or equal to a third preset value, and repeatedly executing the step of obtaining the current iterative number and the loss standard deviation obtained by the iterative training of the current round.
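A minimal sketch of the cutoff logic implemented by these submodules, assuming a hypothetical model.train_one_round interface that runs one round of iterative training and returns that round's losses (the interface and round granularity are assumptions):

```python
import statistics

def train_general_model(model, dataset, first_cap, std_threshold, hard_cap):
    """Stop once the iteration count exceeds first_cap and the standard
    deviation of the current round's losses falls below std_threshold;
    if the loss has not flattened, keep training until hard_cap
    (hard_cap > first_cap) is exceeded."""
    iterations = 0
    while True:
        round_losses = model.train_one_round(dataset)  # one round of training
        iterations += len(round_losses)
        loss_std = statistics.pstdev(round_losses)
        if iterations > first_cap and loss_std < std_threshold:
            return model   # training cutoff condition reached
        if iterations > hard_cap:
            return model   # safety cutoff: third preset value exceeded
```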
In an embodiment of the present invention, the initial labeled data set determining module 42 is specifically a module for performing data screening on the general dialogue data set according to the professional keyword set by using a DFA algorithm.
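A minimal sketch of DFA-based keyword screening (a standard trie-style construction; the keyword list and helper names are placeholders, and the patent does not detail the exact automaton):

```python
def build_dfa(keywords):
    """Build a trie-style DFA from the professional keyword set."""
    root = {}
    for word in keywords:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["#end"] = True            # marks a complete keyword
    return root

def contains_keyword(dfa, text):
    """Scan the text through the DFA; True if any keyword occurs in it."""
    for start in range(len(text)):
        node = dfa
        for ch in text[start:]:
            if ch not in node:
                break
            node = node[ch]
            if "#end" in node:
                return True
    return False

# Screening sketch: keep only dialogues that contain a professional keyword.
dfa = build_dfa(["keyword1", "keyword2"])  # placeholder professional keywords
filtered = [d for d in ["a keyword1 b", "plain chat"] if contains_keyword(dfa, d)]
```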
Corresponding to the above method embodiment, the present invention further provides a dialog response device, and the dialog response device described below and the dialog response method described above may be referred to correspondingly.
Referring to fig. 5, fig. 5 is a block diagram of a dialog response device according to an embodiment of the present invention, where the dialog response device may include:
a question voice receiving module 51, configured to receive a target question voice to be responded;
a response speech generation module 52, configured to generate a target response speech corresponding to the target question speech by using a target professional dialogue model obtained based on training of the general dialogue model;
and a response voice output module 53, configured to perform an output operation on the target response voice.
According to the technical scheme, the target professional dialogue model applied to a specific dialogue scene is obtained in advance by training based on the general dialogue model, which greatly reduces the requirements on data volume and computing power; the trained target professional dialogue model has both universality and professionality, improving the user experience.
In an embodiment of the present invention, the dialog response device may further include:
the answer searching module is used for searching related answers from the database based on a preset retrieval algorithm when the target professional dialogue model fails to respond to the target question voice;
and the voice output module is used for performing voice output on the related answers.
Corresponding to the above method embodiment, referring to fig. 6, fig. 6 is a schematic diagram of an electronic device provided by the present invention, which may include:
a memory 332 for storing a computer program;
a processor 322 for implementing the steps of the training method or the dialogue response method of the dialogue model of the above method embodiments when executing the computer program.
Specifically, referring to fig. 7, fig. 7 is a schematic structural diagram of an electronic device provided in this embodiment, whose structure may vary considerably with configuration and performance. It may include one or more processors (CPU) 322 and a memory 332, where the memory 332 stores one or more computer programs 342 or data 344. The memory 332 may be transient storage or persistent storage. The program stored in the memory 332 may include one or more modules (not shown), and each module may include a series of instruction operations on the data processing apparatus. Further, the processor 322 may be configured to communicate with the memory 332 to execute a series of instruction operations in the memory 332 on the electronic device 301.
The electronic device 301 may also include one or more power sources 326, one or more wired or wireless network interfaces 350, one or more input-output interfaces 358, and/or one or more operating systems 341.
The steps in the above-described dialog response method may be implemented by the structure of the electronic device.
Corresponding to the above method embodiment, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, can implement the following steps:
training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model; acquiring a preset professional keyword group, performing data screening on the universal conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set; training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model; verifying the initial professional dialogue model by using a verification data set and preset natural language processing evaluation indexes to obtain a verification score; judging whether the verification score is larger than a preset score threshold value or not; if so, determining the initial professional dialogue model as a target professional dialogue model;
or, alternatively,
receiving target question voice to be responded; generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training a general dialogue model; and performing output operation on the target response voice.
The computer-readable storage medium may include: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
For the introduction of the computer-readable storage medium provided by the present invention, please refer to the above method embodiments, which are not described herein again.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other. The device, the electronic device and the computer-readable storage medium disclosed by the embodiments correspond to the method disclosed by the embodiments, so that the description is simple, and the relevant points can be referred to the description of the method part.
The principle and the implementation of the present invention are explained in the present application by using specific examples, and the above description of the embodiments is only used to help understanding the technical solution and the core idea of the present invention. It should be noted that, for those skilled in the art, without departing from the principle of the present invention, it is possible to make various improvements and modifications to the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims (19)

1. A method for training a dialogue model, comprising:
training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model;
acquiring a preset professional keyword group, performing data screening on the universal conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model;
carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
judging whether the verification score is larger than a preset score threshold value or not;
and if so, determining the initial professional dialogue model as a target professional dialogue model.
2. The training method of a dialogue model according to claim 1, wherein when it is determined that the verification score is equal to or less than the preset score threshold, the method further comprises:
generating corresponding response data for each sample data in a preset unmarked pool by using the initial professional dialogue model;
respectively calculating the automatic evaluation score corresponding to each response data;
sorting the automatic evaluation scores by size, and selecting a preset number of automatic evaluation scores from the end with smaller scores;
outputting labeling prompt information for labeling the response data corresponding to each selected automatic evaluation score;
updating the initial labeling data set according to the labeling result to obtain an updated labeling data set;
training the initial professional dialogue model based on the updated labeling data set to obtain an updated professional dialogue model;
and carrying out verification operation on the updated professional dialogue model by using the verification data set to obtain a verification score, and repeatedly executing the step of judging whether the verification score is greater than a preset score threshold value.
3. The method for training a dialogue model of claim 2, further comprising, after obtaining the updated annotation data set:
and updating the preset unmarked pool according to the updated labeled data set.
4. The training method of dialogue model according to claim 1, wherein the performing a verification operation on the initial professional dialogue model using a verification data set and a preset evaluation index of natural language processing comprises:
and carrying out verification operation on the initial professional dialogue model by combining the verification data set, the BLEU index, the ROUGE index, the PPL index and the DISTINCT index through the following formula:
$$\mathrm{Score} = S_{\mathrm{BLEU}} + S_{\mathrm{ROUGE}} + S_{\mathrm{PPL}} + S_{\mathrm{DISTINCT}}$$
wherein $S_{\mathrm{BLEU}}$ is the score of the initial professional dialogue model on the BLEU index, $S_{\mathrm{ROUGE}}$ is the score of the initial professional dialogue model on the ROUGE index, $S_{\mathrm{PPL}}$ is the score of the initial professional dialogue model on the PPL index, in the form of the reciprocal of the PPL value, $S_{\mathrm{DISTINCT}}$ is the score of the initial professional dialogue model on the DISTINCT index, and $\mathrm{Score}$ is the verification score.
5. The method for training a dialogue model according to claim 4, further comprising calculating a score $S_{\mathrm{BLEU}}$ of the initial professional dialogue model on the BLEU index, wherein the calculation process of $S_{\mathrm{BLEU}}$ comprises:
calculating the score of the initial professional dialogue model on the BLEU index by the following formula:

$$\mathrm{BLEU} = BP \cdot \exp\Big(\sum_{n=1}^{N} w_n \log p_n\Big), \qquad BP = \begin{cases} 1, & l_c > l_r \\ e^{\,1 - l_r/l_c}, & l_c \le l_r \end{cases}$$

wherein $l_c$ is the length of the machine translation, $l_r$ is the shortest length among the reference translations, $p_n$ is the precision of the n-gram, $w_n$ is the weight of the n-gram, with $w_n = 1/N$ for any n, and $BP$ is the penalty factor.
6. The method for training a dialogue model according to claim 4, further comprising calculating a score $S_{\mathrm{ROUGE}}$ of the initial professional dialogue model on the ROUGE index, wherein the calculation process of $S_{\mathrm{ROUGE}}$ comprises:
calculating the score of the initial professional dialogue model on the ROUGE index by the following formula:

$$\mathrm{ROUGE\text{-}N} = \frac{\sum_{S \in \{\mathrm{reference\ translations}\}} \sum_{gram_N \in S} Count_{match}(gram_N)}{\sum_{S \in \{\mathrm{reference\ translations}\}} \sum_{gram_N \in S} Count(gram_N)}$$

wherein $\{\mathrm{reference\ translations}\}$ denotes the set of reference translations, $gram_N$ denotes a combination of N words, and $Count_{match}(gram_N)$ denotes the number of matched N-grams; the denominator counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.
7. The method for training a dialogue model according to claim 4, further comprising calculating a score $S_{\mathrm{PPL}}$ of the initial professional dialogue model on the PPL index, wherein the calculation process of $S_{\mathrm{PPL}}$ comprises:

$$\mathrm{PPL} = \exp\Big(-\frac{1}{N}\sum_{i=1}^{N} \log p(w_i \mid w_1, \ldots, w_{i-1})\Big)$$

wherein $p(w_i \mid w_1, \ldots, w_{i-1})$ denotes the probability of predicting the i-th word from the preceding words, and N denotes the sentence length.
8. The method for training a dialogue model according to claim 4, further comprising calculating a score $S_{\mathrm{DISTINCT}}$ of the initial professional dialogue model on the DISTINCT index, wherein the calculation process of $S_{\mathrm{DISTINCT}}$ comprises:
calculating the score of the initial professional dialogue model on the DISTINCT index by the following formula:

$$\mathrm{DISTINCT}(n) = \frac{Count(\mathrm{unique\ } ngram)}{Count(ngram)}$$

wherein $Count(\mathrm{unique\ } ngram)$ denotes the number of non-repeating n-grams in the reply, and $Count(ngram)$ denotes the total number of n-grams in the reply.
9. The method for training a dialogue model according to claim 1, further comprising, before training an original dialogue model using a pre-acquired common dialogue dataset:
and respectively filtering the question-answer data and the chatting data in the general dialogue data set.
10. The method for training a dialogue model according to claim 1, wherein training an original dialogue model using a pre-acquired universal dialogue dataset to obtain a universal dialogue model comprises:
inputting the general dialogue data set into the original dialogue model for model iterative training;
obtaining a current iteration number and a loss standard deviation obtained by the iteration training of the current iteration number;
determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation;
and if so, determining the dialogue model obtained by the iterative training as the general dialogue model.
11. The method of claim 10, wherein determining whether a model training cutoff condition is met based on the current iteration number and the loss standard deviation comprises:
and judging whether the current iteration number is larger than a first preset value or not and the loss standard deviation is smaller than a second preset value.
12. The method for training a dialogue model of claim 11, wherein when it is determined that the current iteration number is greater than the first preset value and the loss standard deviation is greater than or equal to the second preset value, the method further comprises:
judging whether the current iteration number is larger than a third preset value or not; wherein the third preset value is greater than the first preset value;
if yes, determining the dialogue model obtained by the iterative training as the general dialogue model;
and if not, inputting the general dialogue data set into a dialogue model obtained by the current iteration training for model iteration training, and repeatedly executing the steps of obtaining the current iteration number and the loss standard deviation obtained by the current iteration training.
13. The method for training a dialogue model according to claim 1, wherein the step of performing data screening on the general dialogue data set according to the professional keyword group comprises:
and screening the data of the general dialogue data set according to the professional keyword set by utilizing a DFA algorithm.
14. A dialogue response method applied to a dialogue system including a target professional dialogue model trained according to any one of claims 1 to 13, comprising:
receiving target question voice to be responded;
generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training a general dialogue model;
and performing output operation on the target response voice.
15. The dialog response method of claim 14 further comprising:
searching a relevant answer from a database based on a preset retrieval algorithm when the target professional dialogue model fails to respond to the target question voice;
and carrying out voice output on the related answers.
16. An apparatus for training a dialogue model, comprising:
the general dialogue model acquisition module is used for training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model;
the initial labeling data set determining module is used for acquiring a preset professional keyword group, performing data screening on the general conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
the initial professional dialogue model obtaining module is used for training the general dialogue model by utilizing the initial labeling data set to obtain an initial professional dialogue model;
the verification score obtaining module is used for carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
the judging module is used for judging whether the verification score is larger than a preset score threshold value or not;
and the target professional dialogue model determining module is used for determining the initial professional dialogue model as the target professional dialogue model when the verification score is larger than a preset score threshold value.
17. A dialog response device comprising:
the question voice receiving module is used for receiving target question voice to be responded;
the response voice generation module is used for generating target response voice corresponding to the target question voice by utilizing a target professional dialogue model obtained by training a general dialogue model;
and the response voice output module is used for outputting the target response voice.
18. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the training method of the dialogue model according to any one of claims 1 to 13 or the dialogue response method according to any one of claims 14 to 15 when executing the computer program.
19. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the training method of the dialogue model according to any one of claims 1 to 13 or the dialogue response method according to any one of claims 14 to 15.
CN202211441290.4A 2022-11-17 2022-11-17 Training method and device for dialogue model, dialogue response method and device Active CN115495568B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211441290.4A CN115495568B (en) 2022-11-17 2022-11-17 Training method and device for dialogue model, dialogue response method and device
PCT/CN2023/086071 WO2024103609A1 (en) 2022-11-17 2023-04-04 Dialogue-model training method and apparatus, and dialogue response method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211441290.4A CN115495568B (en) 2022-11-17 2022-11-17 Training method and device for dialogue model, dialogue response method and device

Publications (2)

Publication Number Publication Date
CN115495568A true CN115495568A (en) 2022-12-20
CN115495568B CN115495568B (en) 2023-08-22

Family

ID=85116091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211441290.4A Active CN115495568B (en) 2022-11-17 2022-11-17 Training method and device for dialogue model, dialogue response method and device

Country Status (2)

Country Link
CN (1) CN115495568B (en)
WO (1) WO2024103609A1 (en)



Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188331B (en) * 2019-06-03 2023-05-26 腾讯科技(深圳)有限公司 Model training method, dialogue system evaluation method, device, equipment and storage medium
US11561969B2 (en) * 2020-03-30 2023-01-24 Adobe Inc. Utilizing logical-form dialogue generation for multi-turn construction of paired natural language queries and query-language representations
CN115495568B (en) * 2022-11-17 2023-08-22 苏州浪潮智能科技有限公司 Training method and device for dialogue model, dialogue response method and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108897797A (en) * 2018-06-12 2018-11-27 腾讯科技(深圳)有限公司 Update training method, device, storage medium and the electronic equipment of dialog model
WO2021049199A1 (en) * 2019-09-13 2021-03-18 Mitsubishi Electric Corporation System and method for a dialogue response generation system
CN114968788A (en) * 2022-05-27 2022-08-30 浙江大学 Method, apparatus, medium, and device for automatically evaluating programming capability of artificial intelligence algorithm

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024103609A1 (en) * 2022-11-17 2024-05-23 苏州元脑智能科技有限公司 Dialogue-model training method and apparatus, and dialogue response method and apparatus
CN116127035A (en) * 2023-01-03 2023-05-16 北京百度网讯科技有限公司 Dialogue method, training method and training device for dialogue model
CN116127035B (en) * 2023-01-03 2023-12-08 北京百度网讯科技有限公司 Dialogue method, training method and training device for dialogue model
CN116432665A (en) * 2023-06-15 2023-07-14 北京中关村科金技术有限公司 Dialogue model construction method, text generation method, device, system and equipment
CN116432665B (en) * 2023-06-15 2023-10-10 北京中关村科金技术有限公司 Dialogue model construction method, text generation method, device, system and equipment
CN117828063A (en) * 2024-01-10 2024-04-05 广东数业智能科技有限公司 Psychological field data generation and model training method and device and storage medium
CN117828063B (en) * 2024-01-10 2024-05-17 广东数业智能科技有限公司 Psychological field data generation and model training method and device and storage medium

Also Published As

Publication number Publication date
WO2024103609A1 (en) 2024-05-23
CN115495568B (en) 2023-08-22

Similar Documents

Publication Publication Date Title
CN107944027B (en) Method and system for creating semantic key index
CN115495568A (en) Training method and device for dialogue model and dialogue response method and device
CN110096567B (en) QA knowledge base reasoning-based multi-round dialogue reply selection method and system
CN109815336B (en) Text aggregation method and system
CN108846138B (en) Question classification model construction method, device and medium fusing answer information
CN112417127B (en) Dialogue model training and dialogue generation methods, devices, equipment and media
CN110633359B (en) Sentence equivalence judgment method and device
CN1571013A (en) Method and device for predicting word error rate from text
CN111026840B (en) Text processing method, device, server and storage medium
WO2024066920A1 (en) Processing method and apparatus for dialogue in virtual scene, and electronic device, computer program product and computer storage medium
CN110347802A (en) A kind of text analyzing method and device
CN114020906A (en) Chinese medical text information matching method and system based on twin neural network
CN110727769B (en) Corpus generation method and device and man-machine interaction processing method and device
CN114003682A (en) Text classification method, device, equipment and storage medium
CN116049387A (en) Short text classification method, device and medium based on graph convolution
CN116910220A (en) Multi-round dialogue interaction processing method, device, equipment and storage medium
CN112905772A (en) Semantic correlation analysis method and device and related products
WO2023169301A1 (en) Text processing method and apparatus, and electronic device
CN111401069A (en) Intention recognition method and intention recognition device for conversation text and terminal
CN115203388A (en) Machine reading understanding method and device, computer equipment and storage medium
CN115408500A (en) Question-answer consistency evaluation method and device, electronic equipment and medium
CN111159339A (en) Text matching processing method and device
Enayet et al. An analysis of dialogue act sequence similarity across multiple domains
CN117453895B (en) Intelligent customer service response method, device, equipment and readable storage medium
CN116010583B (en) Cascade coupling knowledge enhancement dialogue generation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant