CN115495568A - Training method and device for dialogue model and dialogue response method and device - Google Patents
- Publication number
- CN115495568A CN115495568A CN202211441290.4A CN202211441290A CN115495568A CN 115495568 A CN115495568 A CN 115495568A CN 202211441290 A CN202211441290 A CN 202211441290A CN 115495568 A CN115495568 A CN 115495568A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
Abstract
The invention discloses a training method for a dialogue model, comprising the following steps: training an original dialogue model with a general dialogue data set to obtain a general dialogue model; acquiring a preset group of professional keywords and screening the general dialogue data set against it; training the general dialogue model with the screened initial labeling data set to obtain an initial professional dialogue model; verifying the initial professional dialogue model with a verification data set and preset natural language processing evaluation indexes to obtain a verification score; judging whether the verification score is greater than a preset score threshold; and, if so, determining the initial professional dialogue model as the target professional dialogue model. The method makes the trained target professional dialogue model both general and professional, improving the user experience. The invention also discloses a training apparatus for the dialogue model, a dialogue response method and apparatus, an electronic device, and a computer-readable storage medium, with corresponding technical effects.
Description
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for training a dialogue model, a method and an apparatus for dialogue response, an electronic device, and a computer-readable storage medium.
Background
Human-computer dialogue has long been regarded by academia and industry as a fundamental application of Natural Language Processing (NLP). With the development of artificial intelligence technology, generative dialogue models, which are trained specifically on dialogue data, have become increasingly popular and achieve very good performance in open-domain dialogue. However, training a large dialogue model from scratch requires a large amount of multi-type dialogue data as a training corpus, which is costly and slow.
Professional human-machine dialogue systems also face varied chat needs, including casual chat, common-sense question answering, professional question answering, and so on. For example, a medical robot must answer questions about professional medical knowledge, handle everyday common-sense questions that arise while chatting with a patient, and make small talk to ease the patient's emotions. Most current professional dialogue models use a retrieval approach whose main principle is semantic matching: finding answers to the user's questions in a knowledge base. Although this technology is mature, it depends heavily on the corpus, its knowledge is one-sided, its replies are repetitive and rigid, it lacks generality and diversity, and the user experience is poor.
In summary, how to effectively solve the problems of existing dialogue response methods — repetitive and rigid replies, lack of generality and diversity, and poor user experience — is an urgent problem for those skilled in the art.
Disclosure of Invention
The invention aims to provide a training method for a dialogue model that makes the trained target professional dialogue model both general and professional, improving the user experience; another object of the present invention is to provide a training apparatus for the dialogue model, a dialogue response method and apparatus, an electronic device, and a computer-readable storage medium.
In order to solve the technical problems, the invention provides the following technical scheme:
a method of training a dialogue model, comprising:
training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model;
acquiring a preset professional keyword group, performing data screening on the universal conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model;
carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
judging whether the verification score is larger than a preset score threshold value or not;
and if so, determining the initial professional dialogue model as a target professional dialogue model.
In a specific embodiment of the present invention, when the verification score is determined to be less than or equal to the preset score threshold, the method further includes:
generating, with the initial professional dialogue model, corresponding response data for each sample in a preset unmarked pool;
calculating the automatic evaluation score corresponding to each piece of response data;
sorting the automatic evaluation scores and selecting a preset number of scores from the low end;
outputting labeling prompt information for labeling the response data corresponding to each selected automatic evaluation score;
updating the initial labeling data set according to the labeling results to obtain an updated labeling data set;
training the initial professional dialogue model on the updated labeling data set to obtain an updated professional dialogue model;
and verifying the updated professional dialogue model with the verification data set to obtain a verification score, then repeating the step of judging whether the verification score is greater than the preset score threshold.
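The loop above is essentially active learning: score every response the current model generates for the unmarked pool, and route the worst-scoring samples to expert labeling. A minimal Python sketch of the selection step (the pool contents, scoring function, and budget below are illustrative stand-ins, not names from the patent):

```python
def select_for_labeling(unmarked_pool, score_fn, budget):
    """Pick the `budget` samples whose generated responses score lowest,
    i.e. the ones the current model handles worst."""
    scored = [(score_fn(sample), sample) for sample in unmarked_pool]
    scored.sort(key=lambda pair: pair[0])          # ascending: worst first
    return [sample for _, sample in scored[:budget]]

# toy usage with hypothetical automatic-evaluation scores
pool = ["q1", "q2", "q3", "q4"]
fake_scores = {"q1": 0.9, "q2": 0.2, "q3": 0.5, "q4": 0.1}
picked = select_for_labeling(pool, fake_scores.get, 2)
print(picked)  # the two lowest-scoring samples: ['q4', 'q2']
```

Selecting from the low-score end concentrates the expert-annotation budget on the samples where the model currently performs worst.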
In an embodiment of the present invention, after obtaining the updated annotation data set, the method further includes:
and updating the preset unmarked pool according to the updated marked data set.
In a specific embodiment of the present invention, the performing a verification operation on the initial professional dialogue model by using a verification data set and a preset natural language processing evaluation index includes:
and carrying out verification operation on the initial professional dialogue model by combining the verification data set, the BLEU index, the ROUGE index, the PPL index and the DISTINCT index through the following formula:
S = S_BLEU + S_ROUGE + S_PPL + S_DISTINCT
wherein S_BLEU is the score of the initial professional dialogue model on the BLEU index, S_ROUGE is the score of the initial professional dialogue model on the ROUGE index, S_PPL is the score of the initial professional dialogue model on the PPL index, taken in the form of the reciprocal of the PPL value, S_DISTINCT is the score of the initial professional dialogue model on the DISTINCT index, and S is the verification score.
In a specific embodiment of the present invention, the method further comprises a process of calculating the score S_BLEU of the initial professional dialogue model on the BLEU index, which comprises:
calculating the score of the initial professional dialogue model on the BLEU index through the following formula:
BLEU = BP · exp( Σ_{n=1}^{N} w_n · log p_n ), with BP = 1 if c > r, and BP = e^(1 − r/c) otherwise
wherein c is the length of the machine-translated sentence, r is the shortest length among the reference translation sentences, p_n is the n-gram precision, w_n is the weight of the n-gram (uniform for every n, i.e. w_n = 1/N), and BP is the brevity penalty factor.
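The BLEU calculation described here can be sketched in Python. This is a plain sentence-level implementation of the standard BLEU definition (modified n-gram precision, uniform weights, brevity penalty), offered as an illustration rather than the patent's own code:

```python
import math
from collections import Counter

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU: BP * exp(sum_n w_n * log p_n),
    with uniform weights w_n = 1/max_n."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(candidate[i:i + n])
                              for i in range(len(candidate) - n + 1))
        max_ref = Counter()  # clipped counts across all references
        for ref in references:
            ref_ngrams = Counter(tuple(ref[i:i + n])
                                 for i in range(len(ref) - n + 1))
            for g, c in ref_ngrams.items():
                max_ref[g] = max(max_ref[g], c)
        matched = sum(min(c, max_ref[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        if matched == 0:
            return 0.0                      # any zero precision zeroes BLEU
        log_precisions.append(math.log(matched / total))
    c = len(candidate)
    r = min(len(ref) for ref in references)  # shortest reference length
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)

cand = "the cat sat on the mat".split()
refs = ["the cat sat on the mat".split()]
print(bleu(cand, refs))  # identical sentences score 1.0
```

Production systems would normally add smoothing for short sentences; the sketch keeps the bare formula.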
In a specific embodiment of the present invention, the method further comprises a process of calculating the score S_ROUGE of the initial professional dialogue model on the ROUGE index, which comprises:
calculating the score of the initial professional dialogue model on the ROUGE index through the following formula:
ROUGE-N = ( Σ_{S ∈ {reference translations}} Σ_{gram_N ∈ S} Count_match(gram_N) ) / ( Σ_{S ∈ {reference translations}} Σ_{gram_N ∈ S} Count(gram_N) )
wherein {reference translations} denotes the set of reference translations, gram_N denotes a combination of N words, the denominator counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.
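The ROUGE-N recall described here — matched reference N-grams over total reference N-grams — can be sketched as follows (an illustrative implementation, not code from the patent):

```python
from collections import Counter

def rouge_n(candidate, references, n=1):
    """ROUGE-N recall: matched reference n-grams over
    total reference n-grams, summed across all references."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate)
    matched = total = 0
    for ref in references:
        ref_ngrams = ngrams(ref)
        total += sum(ref_ngrams.values())
        matched += sum(min(c, cand[g]) for g, c in ref_ngrams.items())
    return matched / total if total else 0.0

ref = "the cat sat on the mat".split()
hyp = "the cat was on the mat".split()
print(rouge_n(hyp, [ref], n=1))  # 5 of 6 reference unigrams matched
```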
In an embodiment of the present invention, the method further comprises a process of calculating the score S_PPL of the initial professional dialogue model on the PPL index:
PPL = exp( − (1/N) Σ_{i=1}^{N} log p(w_i | w_1, …, w_{i−1}) )
wherein p(w_i | w_1, …, w_{i−1}) denotes the probability of predicting the i-th word from the preceding words, and N denotes the sentence length.
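The perplexity formula reduces to exponentiating the negative mean log-probability of the tokens. A small sketch (the token log-probabilities would come from the dialogue model; the values here are illustrative):

```python
import math

def perplexity(token_log_probs):
    """PPL: exp of the negative mean log-probability of each
    word given its left context."""
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n)

# a model assigning probability 0.25 to every token of a 4-token sentence
logps = [math.log(0.25)] * 4
print(perplexity(logps))  # uniform 1/4 probabilities give PPL = 4.0
```

Because lower perplexity is better, the verification score uses its reciprocal so that all four index scores point in the same direction.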
In an embodiment of the present invention, the method further comprises a process of calculating the score S_DISTINCT of the initial professional dialogue model on the DISTINCT index, which comprises:
calculating the score of the initial professional dialogue model on the DISTINCT index through the following formula:
DISTINCT-n = Count(unique n-grams) / Count(total n-grams)
wherein Count(unique n-grams) denotes the number of non-repeating n-grams in the reply and Count(total n-grams) denotes the total number of n-grams in the reply.
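The DISTINCT-n ratio can be computed directly from the reply's token list; a minimal sketch:

```python
def distinct_n(tokens, n=1):
    """DISTINCT-n: unique n-grams over total n-grams in the reply."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(grams)) / len(grams) if grams else 0.0

reply = "i am fine i am fine".split()
print(distinct_n(reply, 1))  # 3 unique of 6 unigrams -> 0.5
```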
In an embodiment of the present invention, before training the original dialogue model by using the pre-acquired common dialogue dataset, the method further includes:
and respectively filtering the question-answer data and the chatting data in the general dialogue data set.
In a specific embodiment of the present invention, training an original dialog model with a pre-acquired universal dialog dataset to obtain a universal dialog model, includes:
inputting the general dialogue data set into the original dialogue model for model iterative training;
obtaining a current iteration number and a loss standard deviation obtained by the iteration training of the current iteration number;
determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation;
and if so, determining the dialogue model obtained by the iterative training as the general dialogue model.
In an embodiment of the present invention, determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation includes:
and judging whether the current iteration number is larger than a first preset value or not and the loss standard deviation is smaller than a second preset value.
In an embodiment of the present invention, when it is determined that the current iteration number is greater than the first preset value and the loss standard deviation is greater than or equal to the second preset value, the method further includes:
judging whether the current iteration number is larger than a third preset value or not; wherein the third preset value is greater than the first preset value;
if yes, determining the dialogue model obtained by the iterative training in the current round as the general dialogue model;
and if not, inputting the general dialogue data set into a dialogue model obtained by the current iteration training for model iteration training, and repeatedly executing the steps of obtaining the current iteration number and the loss standard deviation obtained by the current iteration training.
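The training-cutoff logic above — stop when the loss standard deviation stabilizes after a minimum number of iterations, or unconditionally at a hard iteration cap — can be sketched as follows. All three threshold values are illustrative; the patent leaves the first, second, and third preset values unspecified:

```python
import statistics

def should_stop(iteration, recent_losses, min_iters=1000,
                std_threshold=0.01, max_iters=5000, window=10):
    """Cutoff check: stop once the loss has stabilised after a
    minimum number of iterations, or at the hard iteration cap."""
    if iteration > max_iters:
        return True                               # third preset value: hard cap
    if iteration > min_iters and len(recent_losses) >= window:
        loss_std = statistics.stdev(recent_losses[-window:])
        return loss_std < std_threshold           # converged
    return False

# losses drifting by 0.001 per step: std ~0.003 < 0.01, so training stops
print(should_stop(1200, [0.50 + 0.001 * i for i in range(10)]))
```

Combining an iteration floor with a variance check avoids stopping on an early, accidental plateau, while the cap bounds total training cost.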
In a specific embodiment of the present invention, the data screening of the general dialog data set according to the professional keyword group includes:
and performing data screening on the universal dialogue data set according to the professional keyword group by utilizing a DFA algorithm.
A dialogue response method, applied to a dialogue system containing a target professional dialogue model obtained by the training described above, comprises the following steps:
receiving a target question voice to be responded to;
generating a target response voice corresponding to the target question voice by using the target professional dialogue model obtained by training the general dialogue model;
and outputting the target response voice.
In one embodiment of the present invention, the method further comprises:
searching relevant answers from a database based on a preset retrieval algorithm when the target professional dialogue model fails to respond to the target question voice;
and carrying out voice output on the related answers.
A training apparatus of a dialogue model, comprising:
the universal dialogue model acquisition module is used for training an original dialogue model by utilizing a pre-acquired universal dialogue data set to obtain a universal dialogue model;
the initial labeling data set determining module is used for acquiring a preset professional keyword group, performing data screening on the general conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
the initial professional dialogue model acquisition module is used for training the general dialogue model by utilizing the initial labeling data set to obtain an initial professional dialogue model;
the verification score obtaining module is used for carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
the judging module is used for judging whether the verification score is larger than a preset score threshold value or not;
and the target professional dialogue model determining module is used for determining the initial professional dialogue model as the target professional dialogue model when the verification score is larger than a preset score threshold value.
A dialog response device comprising:
the question voice receiving module is used for receiving target question voice to be responded;
the response voice generating module is used for generating target response voice corresponding to the target question voice by utilizing a target professional dialogue model obtained by training a general dialogue model;
and the response voice output module is used for outputting the target response voice.
An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the training method or the dialogue response method of the dialogue model as described above when the computer program is executed.
A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the training method or the dialogue response method of the dialogue model as previously described.
The training method of the dialogue model provided by the invention utilizes the pre-acquired general dialogue data set to train the original dialogue model to obtain the general dialogue model; acquiring a preset professional keyword group, performing data screening on the universal conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set; training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model; carrying out verification operation on the initial professional dialogue model by using a verification data set and preset natural language processing evaluation indexes to obtain a verification score; judging whether the verification score is larger than a preset score threshold value or not; and if so, determining the initial professional dialogue model as the target professional dialogue model.
According to the technical scheme, the target professional dialogue model applied to the specific dialogue scene is obtained through training based on the general dialogue model in advance, the requirements for data volume and computing power are greatly reduced, the trained target professional dialogue model has universality and professionality, and the user experience is improved.
Correspondingly, the invention also provides a training apparatus for the dialogue model, a dialogue response method and apparatus, an electronic device, and a computer-readable storage medium corresponding to the above training method, all having the corresponding technical effects, which are not described herein again.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of an implementation of a training method for a dialogue model according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating another embodiment of a method for training a dialogue model according to the present invention;
FIG. 3 is a flowchart illustrating an implementation of a dialog response method according to an embodiment of the present invention;
FIG. 4 is a block diagram of a training apparatus for a dialogue model according to an embodiment of the present invention;
FIG. 5 is a block diagram of a dialog response device according to an embodiment of the present invention;
FIG. 6 is a block diagram of an electronic device according to an embodiment of the invention;
fig. 7 is a schematic structural diagram of an electronic device provided in this embodiment.
Detailed Description
In order that those skilled in the art will better understand the solutions of the present invention, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart of an implementation of a training method for a dialogue model according to an embodiment of the present invention, where the method may include the following steps:
s101: and training the original dialogue model by using the pre-acquired general dialogue data set to obtain the general dialogue model.
A general dialogue data set is collected in advance from public data sets and can be divided into two main categories: question answering and chat. The question-and-answer data can cover multiple fields such as common knowledge, facts, mother-and-baby care, medical treatment, law, insurance, aviation, psychology, traditional Chinese medicine, and epidemic information. The chat data can comprise data sets such as microblog discussions, drama dialogue, forum discussions, Douban comments, and e-commerce conversations, covering everyday topics such as history, movies, weather, entertainment, and sports.
Specific examples of constructing a universal dialog data set are as follows:
the format of the vocabulary entry interpretation class prompt is as follows: "title", article: "text". Original corpus example { "id": "0", "url": https: // xxx, "title": "economics", "text": "economics is a social science of research on production, distribution and consumption of products and services \8230; \8230 }, which is composed in the format of prompt: title: "economics", article: "economics is a social science of research into the production, distribution, and consumption of products and services \8230;".
Question-answer prompt format: ask: "title + desc" answer: "answer". Original corpus example: { "qid": 0, "title": "AlphaGo only plays Go; could AlphaGo write a novel?", "desc": "There is no intelligent robot able to engage in literary creation now; if there were, what level of work could it write?", "answer": "AlphaGo only plays Go, because its design purpose, architecture, technical scheme, and training data are all built around playing Go…" }, which is composed in the prompt format as: ask: "AlphaGo only plays Go; could AlphaGo write a novel? There is no intelligent robot able to engage in literary creation now; if there were, what level of work could it write?" answer: "AlphaGo only plays Go, because its design purpose, architecture, technical scheme, and training data are all built around playing Go…".
Reading-comprehension prompt format: context: "context" question: "question" answer: "answer". Original corpus example: { "id": "0", "context": "The treatment of cholelithiasis should be handled case by case; asymptomatic gallstones may be left untreated, but good eating habits…", "question": "What type of gallstone may be left untreated?", "answer": "asymptomatic gallstones" }, which is composed in the prompt format as: context: "The treatment of cholelithiasis should be handled case by case; asymptomatic gallstones may be left untreated, but good eating habits…" question: "What type of gallstone may be left untreated?" answer: "asymptomatic gallstones".
Single-round or multi-round dialogue prompt format: conversation: "dialog1" "dialog2" "dialog3" …. After composition in the prompt format: conversation: "How come you're not streaming? I can't see you" "The stream hasn't started yet" "I like you" …
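The four prompt formats above can be composed mechanically from raw corpus records. The sketch below is hypothetical: the field names follow the corpus examples above, but the dispatch order and the English keyword labels ('title', 'ask', 'context', 'conversation') are assumptions for illustration:

```python
def compose_prompt(record):
    """Compose a training prompt from a raw corpus record, following the
    four formats above (entry, question-answer, reading, dialogue)."""
    if "context" in record:                       # reading-comprehension
        return (f'context: "{record["context"]}" question: '
                f'"{record["question"]}" answer: "{record["answer"]}"')
    if "title" in record and "answer" in record:  # question-answer
        q = record["title"] + record.get("desc", "")
        return f'ask: "{q}" answer: "{record["answer"]}"'
    if "title" in record and "text" in record:    # entry-interpretation
        return f'title: "{record["title"]}", article: "{record["text"]}"'
    if "dialog" in record:                        # single/multi-turn dialogue
        return "conversation: " + " ".join(f'"{t}"' for t in record["dialog"])
    raise ValueError("unrecognised record format")

print(compose_prompt({"title": "economics", "text": "a social science..."}))
```

Normalising every record type into one flat prompt string lets a single generative model train on all four corpus categories at once.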
And training the original dialogue model by utilizing the pre-acquired general dialogue data set to obtain the general dialogue model.
S102: and acquiring a preset professional keyword group, performing data screening on the universal conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set.
Professional dialogue data sets are generally annotated by experts. Although the amount of data required is far smaller than for a general dialogue data set, relying on expert annotation alone is time-consuming and labor-intensive, so a group of professional keywords is preset. After training the original dialogue model with the general dialogue data set to obtain the general dialogue model, the preset professional keyword group is acquired, the general dialogue data set is screened against it, and the screened data set is determined and recorded as the initial labeling data set. Obtaining the initial labeling data set by screening the general dialogue data set with the preset professional key phrases greatly improves the generation efficiency of the professional dialogue data set compared with purely manual labeling.
In a specific embodiment of the present invention, the data screening of the general dialogue data set according to the professional keyword group may include the following step:
screening the general dialogue data set against the professional keyword group using a DFA (deterministic finite automaton) algorithm.
When the general dialogue data set is screened, the DFA algorithm screens it against the professional keyword group, making full use of the DFA algorithm's efficient keyword matching while also providing sensitive-word filtering.
The process of using the DFA algorithm to match keywords and screen professional dialogue data out of the general dialogue data set in the embodiment of the invention may comprise the following steps:
(1) Experts provide the professional keyword group;
(2) A professional-word trie (terminated with the specific character '\x00') is constructed by building a nested dictionary from the professional keyword group;
(3) Each group of dialogues in the general dialogue data set is traversed; the dialogue is used as input to walk the professional-word trie, and if the specific character '\x00' is reached, the group of dialogues contains a professional keyword and is screened out.
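Steps (1)–(3) can be sketched in Python. The nested-dictionary trie terminated by '\x00' follows the description above; the function names are illustrative:

```python
END = "\x00"  # terminator marking the end of a keyword, as in step (2)

def build_trie(keywords):
    """Step (2): nest the professional keywords into a dictionary trie."""
    root = {}
    for word in keywords:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root

def contains_keyword(text, trie):
    """Step (3): scan the dialogue from each position; reaching the
    terminator means a professional keyword occurs in the text."""
    for start in range(len(text)):
        node = trie
        for ch in text[start:]:
            if END in node:
                return True
            if ch not in node:
                break
            node = node[ch]
        if END in node:       # keyword ending exactly at end of text
            return True
    return False

trie = build_trie(["gallstone", "cholelithiasis"])
print(contains_keyword("asymptomatic gallstones need no treatment", trie))
```

Because the trie advances one character per input character, screening a dialogue is linear in its length times the longest keyword, regardless of how many keywords the experts provide.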
Although some professional dialogue data can be screened out through keyword matching, the professional dialogue contained in a general dialogue data set is usually limited, especially in niche professions, so expert annotation is still needed. Data annotated by experts involves privacy, so desensitization (hiding private information such as names, mobile phone numbers, and e-mail addresses in the conversations) must be added. As with the construction of the general dialogue data set, the professional dialogue data set is composed in the prompt format of Table 1.
A specific example of building a server professional dialogue data set is as follows:
Intelligent customer service for servers involves multi-round conversations, for example: "Hello, what can I help you with?" "The status light is related to the power supply and does not affect normal operation of the server; 'status' is an aggregate indicator that lights up when the machine has a problem, and plugging in all 4 power circuits is suggested." "There is no way to plug in 4-way power on site; is there no way to turn the status light off?" "There is: with the corresponding command, flash the power policy for two supplies."
S103: and training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model.
After the initial labeling data set is obtained, the general dialogue model is trained with it to obtain the initial professional dialogue model.
S104: and carrying out verification operation on the initial professional dialogue model by utilizing the verification data set and preset natural language processing evaluation indexes to obtain a verification score.
After the initial professional dialogue model is obtained by training, it is verified with the verification data set and the preset natural language processing evaluation indexes to obtain the verification score. The verification score predicts the response performance of the initial professional dialogue model on question voice.
S105: and judging whether the verification score is larger than a preset score threshold value, if so, executing the step S106, and if not, continuing training the initial professional dialogue model.
And presetting a score threshold, after the initial professional dialogue model is verified by using the verification data set and the preset natural language processing evaluation index, judging whether the verification score is greater than the preset score threshold, if so, indicating that the model is trained, and executing the step S106, otherwise, indicating that the initial professional dialogue model needs to be trained continuously.
S106: and determining the initial professional dialogue model as a target professional dialogue model.
When the verification score is determined to be larger than the preset score threshold, the model is trained, and the initial professional dialogue model is determined as the target professional dialogue model. The target professional dialogue model and all the expert-labeled data collected so far may also be output. Whether the professional dialogue model has finished training is judged according to the preset score threshold, so that the trained target professional dialogue model has good answer generation capability for question voice.
According to the technical scheme, the target professional dialogue model applied to the specific dialogue scene is obtained through training based on the general dialogue model in advance, the requirements for data volume and computing power are greatly reduced, the trained target professional dialogue model has universality and professionality, and the user experience is improved.
Referring to fig. 2, fig. 2 is a flowchart of another implementation of a training method of a dialogue model in an embodiment of the present invention, where the method may include the following steps:
S201: and training the original dialogue model by using the pre-acquired general dialogue data set to obtain the general dialogue model.
In an embodiment of the present invention, before step S201, the method for training a dialogue model may further include the following steps:
and respectively filtering the question-answer data and the chatting data in the universal dialogue data set.
After the general dialogue data set is obtained, the question-answer data and the chit-chat data in the general dialogue data set are filtered separately. For example, since the overall noise of the question-answer data set is low, simple filtering may be performed, including removing conversations containing sensitive words, removing conversations with too few words, removing conversations in which the question is the same as the answer, removing meaningless characters in the corpus, and the like. Because the overall noise of the chit-chat data set is large, strict filtering is required; the filtering methods include removing conversations containing sensitive words, removing conversations with too few words, removing conversations having only one sentence, removing conversations not containing Chinese characters, deleting advertising conversations, deleting repeated conversations, removing meaningless characters in the corpus, and the like. Training the original dialogue model with the filtered general dialogue data set avoids interference from useless data, reduces model training complexity, improves model training efficiency, and improves the accuracy of the trained model.
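A minimal sketch of the simple question-answer filtering rules above; the sensitive-word list and the minimum-length threshold are assumed placeholders (neither is specified in the text), and the stricter chit-chat rules (single-sentence, no-Chinese-character, advertisement and duplicate removal) would be added analogously.

```python
SENSITIVE_WORDS = {"badword"}  # placeholder lexicon; the real list is not given in the text
MIN_WORDS = 4                  # assumed minimum-length threshold

def keep_qa_pair(question, answer):
    """Simple filtering for the low-noise question-answer data."""
    text = question + " " + answer
    # Remove conversations containing sensitive words.
    if any(w in text for w in SENSITIVE_WORDS):
        return False
    # Remove conversations with too few words, or where the question equals the answer.
    if len(question.split()) < MIN_WORDS or question.strip() == answer.strip():
        return False
    return True
```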
In order to make the training effect better, the data set can be composed according to different categories and a certain prompt format, as follows:
TABLE 1
And the subsequent processing work is reduced through a fixed prompt format.
In one embodiment of the present invention, step S201 may include the following steps:
the method comprises the following steps: inputting the general dialogue data set into an original dialogue model to carry out model iterative training;
step two: obtaining a current iteration number and a loss standard deviation obtained by the iteration training of the current iteration number;
step three: determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation, if so, executing a fourth step, and if not, executing a fifth step;
step four: determining a dialogue model obtained by the iterative training of the current round as a general dialogue model;
step five: and inputting the general dialogue data set into the dialogue model obtained by the iterative training of the current round to carry out model iterative training, and returning to execute the second step.
For convenience of description, the above five steps may be combined for illustration.
The process of training the original dialogue model with the general dialogue data set to obtain the general dialogue model may include the following. The general dialogue data set is input into the original dialogue model for iterative model training. The current iteration number and the loss standard deviation obtained in the current round of iterative training are obtained. Whether the model training cutoff condition is reached is determined according to the current iteration number and the loss standard deviation. If so, the model obtained in the current round of training can already give a good voice response to general questions, and the dialogue model obtained in the current round of iterative training is determined as the general dialogue model. If not, the model obtained in the current round of training cannot yet give a good voice response to general questions, so the general dialogue data set is input into the dialogue model obtained in the current round for further iterative training, the current iteration number and the loss standard deviation are obtained again, and the model is continuously optimized through multiple training iterations.
It should be noted that the model training cutoff condition may be set and adjusted according to an actual situation, which is not limited in the embodiment of the present invention, and may be set as an upper limit of the number of iterations, or may be set as a loss threshold.
In an embodiment of the present invention, determining whether the model training cutoff condition is reached according to the current iteration number and the loss standard deviation may include the following steps:
the method comprises the following steps: inputting the general dialogue data set into an original dialogue model to carry out model iterative training;
step two: obtaining a current iteration number and a loss standard deviation obtained by the iteration training of the current iteration number;
step three: judging whether the current iteration number is greater than the first preset value and the loss standard deviation is less than the second preset value; if so, executing step four, and if not, executing step five;
step four: determining a dialogue model obtained by the iterative training of the current round as a general dialogue model;
step five: judging whether the current iteration number is larger than a third preset value or not, if so, returning to execute the fourth step, and if not, executing the sixth step;
wherein the third preset value is greater than the first preset value;
step six: and inputting the general dialogue data set into the dialogue model obtained by the iterative training of the current round to carry out model iterative training, and returning to execute the second step.
For convenience of description, the above six steps may be combined for illustration.
The hyper-parameters in model training are preset. The hyper-parameters may include a minimum pre-training iteration number obtained by pre-training (i.e., the first preset value) and a loss standard deviation threshold (i.e., the second preset value), where the loss standard deviation is the standard deviation of the loss over the latest ten iterations. After the current iteration number and the loss standard deviation obtained in the current round of iterative training are obtained, whether the current iteration number is greater than the first preset value and the loss standard deviation is less than the second preset value is judged, thereby determining whether the model training cutoff condition has been reached. Judging the model training stage by combining the current iteration number and the loss standard deviation ensures that a model meeting the cutoff condition has been iterated a certain number of times, which improves model performance.
The preset hyper-parameters in model training may further include a maximum pre-training iteration number obtained by pre-training (i.e., the third preset value), which is greater than the first preset value. When the current iteration number is greater than the first preset value and the loss standard deviation is greater than or equal to the second preset value, whether the current iteration number is greater than the third preset value is judged. If so, the loss value is decreasing slowly and the model has been trained close to the global optimum, so the dialogue model obtained in the current round of iterative training is determined as the general dialogue model. If not, the model needs further training: the general dialogue data set is input into the dialogue model obtained in the current round for another round of iterative model training, the current iteration number and the loss standard deviation are obtained again, and whether the model training cutoff condition is reached is judged based on the data of the current round of iteration. The above steps are repeated until the preset model training cutoff condition is reached, thereby obtaining a general dialogue model that responds well to general question voice.
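The cutoff logic above can be sketched as follows; the ten-iteration window is taken from the description, while the minimum/maximum iteration numbers and the standard deviation threshold are treated as supplied hyper-parameters.

```python
import statistics

def should_stop(iteration, recent_losses, min_iters, std_threshold, max_iters):
    """Training cutoff: stop once past the minimum iteration count when the loss
    standard deviation over the last ten iterations is below the threshold, or
    unconditionally once the maximum iteration count is exceeded."""
    loss_std = statistics.stdev(recent_losses[-10:]) if len(recent_losses) >= 2 else float("inf")
    if iteration > min_iters and loss_std < std_threshold:
        return True
    return iteration > max_iters
```

Each training round would call this with the current iteration number and loss history, determining the model of the current round as the general dialogue model when it returns True.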
S202: and acquiring a preset professional keyword group, performing data screening on the universal conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set.
S203: and training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model.
S204: and carrying out verification operation on the initial professional dialogue model by utilizing the verification data set and preset natural language processing evaluation indexes to obtain a verification score.
In a specific embodiment of the present invention, the verifying the initial professional dialogue model by using the verification data set and the preset evaluation index of natural language processing may include the following steps:
and (3) carrying out verification operation on the initial professional dialogue model by combining a verification data set, a BLEU index, a ROUGE index, a PPL index and a DISTINCT index through the following formula:
wherein ,to score the initial professional dialogue model on the BLEU index,to score the initial professional dialogue model on the route index,the score of the initial professional dialogue model on the PPL index is obtained in the form of reciprocal of the PPL index score,to score the initial professional dialogue model on the DISTINCT index,to verify the score.
When the initial professional dialogue model is verified, the verification operation may be performed on the initial professional dialogue model by combining the verification data set, the BLEU index, the ROUGE index, the PPL index and the DISTINCT index. The verification score may be calculated as follows:

Score = S_BLEU + S_ROUGE + S_PPL + S_DISTINCT

wherein S_BLEU is the score of the initial professional dialogue model on the BLEU index, S_ROUGE is the score of the initial professional dialogue model on the ROUGE index, S_PPL is the score of the initial professional dialogue model on the PPL index in the form of the reciprocal of the PPL value (the smaller S_PPL is, the worse the model's generation), and S_DISTINCT is the score of the initial professional dialogue model on the DISTINCT index.
The performance of the model on the verification data set is comprehensively judged by adopting four indexes of BLEU, ROUGE, PPL and DISTINCT. The method ensures the accuracy and recall rate of the generated answers while ensuring the smoothness and diversity of the model generation.
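One way to realize this four-index combination is sketched below; since the patent's exact formula is an image lost from the text, an unweighted sum with the PPL score in reciprocal form is assumed.

```python
def verification_score(s_bleu, s_rouge, ppl, s_distinct):
    """Combine the four indices; the PPL term enters as a reciprocal so that a
    lower (better) perplexity yields a higher score. The unweighted sum is an
    assumption, not the patent's exact weighting."""
    return s_bleu + s_rouge + 1.0 / ppl + s_distinct
```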
In an embodiment of the present invention, the training method of the dialogue model may further include calculating the score of the initial professional dialogue model on the BLEU index. The calculation process may include the following steps:
the score of the initial professional dialogue model on the BLEU index is calculated by the following formula:

BLEU = BP · exp(Σ_{n=1}^{N} w_n · log p_n)

BP = 1 if c > r; BP = e^{1 − r/c} if c ≤ r

wherein c is the length of the machine translation, r is the length of the shortest reference translation sentence, p_n is the n-gram precision, w_n is the n-gram weight with w_n = 1/N for any n, and BP is a penalty factor.
The core idea of the BLEU is to compare the degree of coincidence of the n-grams in the candidate translation and the reference translation, and the higher the degree of coincidence, the higher the quality of the translation is considered. In practice, N =1 to 4 is usually taken, and then weighted average is performed.
wherein c is the length of the machine translation, r is the length of the shortest reference translation sentence, p_n is the n-gram precision, and w_n is the n-gram weight, usually set uniformly, i.e., w_n = 1/N for any n. BP is a penalty factor; if the length of the translation is smaller than the shortest reference translation, BP is smaller than 1. The 1-gram precision of BLEU indicates the degree to which the translation is faithful to the source text, while the other n-grams reflect the fluency of the translation.
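A self-contained sketch of the BLEU computation described above (uniform weights w_n = 1/N, brevity penalty based on the shortest reference); tokenized word lists are assumed as input.

```python
import math
from collections import Counter

def bleu(candidate, references, max_n=4):
    """BLEU with uniform n-gram weights and a brevity penalty."""
    log_sum = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
        # Clip each candidate n-gram count by its maximum count over all references.
        max_ref = Counter()
        for ref in references:
            ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
            for g, c in ref_ngrams.items():
                max_ref[g] = max(max_ref[g], c)
        matched = sum(min(c, max_ref[g]) for g, c in cand_ngrams.items())
        p_n = matched / max(sum(cand_ngrams.values()), 1)
        if p_n == 0:
            return 0.0
        log_sum += (1.0 / max_n) * math.log(p_n)  # uniform weight w_n = 1/N
    c_len = len(candidate)
    r_len = min(len(r) for r in references)  # shortest reference length
    bp = 1.0 if c_len > r_len else math.exp(1 - r_len / c_len)  # penalty factor
    return bp * math.exp(log_sum)
```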
In an embodiment of the present invention, the training method of the dialogue model may further include calculating the score of the initial professional dialogue model on the ROUGE index. The calculation process may include:
calculating the score of the initial professional dialogue model on the ROUGE index by the following formula:

ROUGE-N = Σ_{S ∈ {reference translations}} Σ_{gram_N ∈ S} Count_match(gram_N) / Σ_{S ∈ {reference translations}} Σ_{gram_N ∈ S} Count(gram_N)

wherein {reference translations} denotes the set of reference translations, gram_N represents a combination of N words, and Count(gram_N) denotes the number of N-grams counted. The denominator of the formula counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.
ROUGE-N focuses on recall rather than precision: it measures how many N-gram phrases in the reference translations appear in the output. "N" refers to the N-gram, computed in a similar manner to BLEU, except that BLEU is based on precision while ROUGE is based on recall. ROUGE-N is mainly used to count recall on N-grams. For each N, a ROUGE-N score can be calculated with the following formula:

ROUGE-N = Σ_{S ∈ {reference translations}} Σ_{gram_N ∈ S} Count_match(gram_N) / Σ_{S ∈ {reference translations}} Σ_{gram_N ∈ S} Count(gram_N)

where {reference translations} denotes the set of reference translations (in practical applications there may be several), gram_N represents a combination of N words, and Count(gram_N) denotes the number of N-grams counted. The denominator of the formula counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.
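The ROUGE-N recall above can be sketched as follows, again over tokenized word lists:

```python
from collections import Counter

def rouge_n(candidate, references, n=1):
    """ROUGE-N: n-gram recall of the machine translation against the references."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    match, total = 0, 0
    for ref in references:
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        total += sum(ref_ngrams.values())                              # denominator: n-grams in references
        match += sum(min(c, cand[g]) for g, c in ref_ngrams.items())   # numerator: shared n-grams
    return match / total if total else 0.0
```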
In an embodiment of the present invention, the training method of the dialogue model may further include calculating the score of the initial professional dialogue model on the PPL index, where the PPL value is computed by the following formula:

PPL = exp(−(1/N) Σ_{i=1}^{N} log p(w_i | w_1, …, w_{i−1}))

wherein p(w_i | w_1, …, w_{i−1}) represents the probability of predicting the i-th word from the preceding words, and N represents the sentence length.
PPL refers to perplexity in a language model; perplexity is an index measuring whether a sentence is fluent. It is defined as:

PPL = exp(−(1/N) Σ_{i=1}^{N} log p(w_i | w_1, …, w_{i−1}))

wherein p(w_i | w_1, …, w_{i−1}) represents the probability of predicting the i-th word from the preceding words, and N represents the sentence length. The smaller the PPL value, the more natural the model's generation and the more fluent the sentence. Evaluating reply quality through PPL avoids situations where replies generated by the model are disordered or have the word order reversed.
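Given the per-token probabilities assigned by a language model, the perplexity definition above can be computed as:

```python
import math

def perplexity(token_probs):
    """PPL = exp(-(1/N) * sum(log p_i)), where p_i is the probability the model
    assigns to the i-th word given the preceding words."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)
```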
In one embodiment of the invention, the method may further include calculating the score of the initial professional dialogue model on the DISTINCT index. The score of the initial professional dialogue model on the DISTINCT index is calculated by the following formula:

Distinct(n) = Count(unique n-gram) / Count(n-gram)

wherein Count(unique n-gram) denotes the number of non-repeating n-grams in the reply, and Count(n-gram) denotes the total number of n-grams in the reply.
The DISTINCT evaluation index judges the diversity of machine replies, i.e., whether a large number of generic and repetitive replies occur. DISTINCT is defined as follows:

Distinct(n) = Count(unique n-gram) / Count(n-gram)

wherein Count(unique n-gram) denotes the number of non-repeating n-grams in the reply, and Count(n-gram) denotes the total number of n-grams in the reply. A larger Distinct(n) indicates greater diversity in the generated replies.
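A sketch of the Distinct-n ratio above over a tokenized reply:

```python
def distinct_n(tokens, n=1):
    """Distinct-n: ratio of unique n-grams to total n-grams in a reply."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0
```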
S205: and judging whether the verification score is larger than a preset score threshold value, if so, executing the step S206, and if not, executing the step S207.
S206: and determining the initial professional dialogue model as a target professional dialogue model.
S207: and generating corresponding response data aiming at each sample data in the preset unmarked pool by utilizing the initial professional dialogue model.
When the verification score is determined to be less than or equal to the preset score threshold, the model needs further training, and the initial professional dialogue model is used to generate corresponding response data for each sample data in the preset unlabeled pool.
S208: and respectively calculating the automatic evaluation scores corresponding to the response data.
After corresponding response data is generated for each sample data in the preset unlabeled pool by using the initial professional dialogue model, the automatic evaluation score corresponding to each piece of response data is calculated. For example, the automatic evaluation score may be calculated from the PPL index and the DISTINCT index as follows:

Score_auto = S_PPL + S_DISTINCT

wherein S_PPL is the score on the PPL index in reciprocal form and S_DISTINCT is the score on the DISTINCT index, thereby obtaining the automatic evaluation score corresponding to each piece of response data.
S209: sorting the automatic evaluation scores by magnitude, and selecting a preset number of automatic evaluation scores from the lower-scoring end.
After the automatic evaluation scores corresponding to each piece of response data are calculated, the automatic evaluation scores are sorted and a preset number of them are selected from the lower-scoring end, for example, the lowest N automatic evaluation scores.
S210: and outputting labeling prompt information for labeling the response data corresponding to the selected automatic evaluation scores.
After the preset number of automatic evaluation scores are selected from the lower-scoring end, labeling prompt information for labeling the response data corresponding to the selected automatic evaluation scores is output, thereby prompting expert labeling of the response data corresponding to the lowest N automatic evaluation scores.
S211: and updating the initial labeling data set according to the labeling result to obtain an updated labeling data set.
After the labeling prompt information for labeling the response data corresponding to the selected automatic evaluation scores is output, the labeling result is obtained, and the initial labeling data set is updated according to the labeling result to obtain an updated labeling data set, thereby effectively labeling the data for which the current professional dialogue model generates poor response data.
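The selection of the lowest-scoring responses for expert labeling can be sketched as follows; the function name is illustrative:

```python
def select_for_labeling(samples, scores, n):
    """Pick the n samples whose generated responses received the lowest automatic
    evaluation scores; these 'difficult samples' are sent for expert labeling."""
    ranked = sorted(zip(scores, range(len(samples))))  # ascending by score
    return [samples[i] for _, i in ranked[:n]]
```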
In an embodiment of the present invention, after step S211, the method for training a dialogue model may further include the following steps:
and updating the preset unmarked pool according to the updated marked data set.
And after the updated marked data set is obtained, updating the preset unmarked pool according to the updated marked data set, thereby realizing the timely updating of the unmarked sample data in the preset unmarked pool.
S212: and training the initial professional dialogue model based on the updated labeling data set to obtain the updated professional dialogue model.
And after the initial annotation data set is updated according to the annotation result to obtain an updated annotation data set, training the initial professional dialogue model based on the updated annotation data set to obtain an updated professional dialogue model.
The embodiment of the present invention adopts an active learning approach, which reduces the number of expert-labeled samples as much as possible while limiting the impact on model performance. The "difficult samples" that yield the largest improvement in model performance are continuously selected from the preset unlabeled pool, thereby improving model performance.
S213: and performing verification operation on the updated professional dialogue model by using the verification data set to obtain a verification score, and returning to execute the step S205.
Training the initial professional dialogue model based on the updated labeling data set to obtain the updated professional dialogue model, verifying the updated professional dialogue model by using the verification data set to obtain a verification score, returning to the step of judging whether the verification score is larger than a preset score threshold value, and repeating the steps until the verification score obtained by calculation is larger than the preset score threshold value, thereby obtaining a target professional dialogue model capable of well responding to the received questioning voice.
Referring to fig. 3, fig. 3 is a flowchart of an implementation of a dialog response method in an embodiment of the present invention, applied to a dialog system including a target specialized dialog model obtained by the preceding training, where the method may include the following steps:
S301: and receiving target question voice to be responded.
When a user needs to perform scene dialogue, target question voice is output to the dialogue response control center, and the dialogue response control center receives the target question voice to be responded.
The dialogue response control center may be a processor deployed with a dialogue model.
The target questioning voice can be chat, common knowledge question and answer, professional question and answer and the like.
S302: and generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training the general dialogue model.
The general dialogue model is obtained through training in advance; for example, model training may be performed on the general dialogue data set based on a large model, where the large model may be based on the Transformer structure and suitable for generation tasks, such as a GPT (Generative Pre-Training) model or a BERT (Bidirectional Encoder Representations from Transformers) model. The target professional dialogue model is obtained by training based on the general dialogue model. After the target question voice to be responded to is received, the target response voice corresponding to the target question voice is generated by using the target professional dialogue model obtained by training the general dialogue model.
By retraining on the basis of the large model, the requirements for data volume and computing power are greatly reduced, and a two-stage training model mode is adopted, so that the trained target professional dialogue model has universality and professionality at the same time.
S303: and performing output operation on the target response voice.
And after generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training the general dialogue model, outputting the target response voice, thereby realizing the response to the target question voice.
Because the model training process requires more resources than the model application process, more resources may be allocated to model training in advance and relatively fewer resources to model application. For example, eight or more GPUs (Graphics Processing Units) with 80 GB of memory may be assigned to model training, and one or more 80 GB GPUs to model application.
According to the technical scheme, the target professional dialogue model applied to the specific dialogue scene is obtained through training based on the general dialogue model in advance, the requirements for data volume and computing power are greatly reduced, the trained target professional dialogue model has universality and professionality, and the use experience of a user is improved.
In an embodiment of the present invention, the dialog response method may further include the steps of:
the method comprises the following steps: searching a relevant answer from a database based on a preset retrieval algorithm when the target professional dialogue model fails to respond to the target question voice;
step two: and carrying out voice output on the related answers.
For convenience of description, the above two steps may be combined for illustration.
The embodiment of the invention presets a fallback scheme: a professional database is constructed using the professional data set, and when the target professional dialogue model fails to respond to the target question voice, that is, when the output of the target professional dialogue model is empty, a relevant answer is retrieved from the database based on a preset retrieval algorithm and output as voice. This optimizes the application process of the professional dialogue model, ensures that the user's question voice is never left unanswered, and improves the user experience.
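A minimal stand-in for the fallback retrieval described above; the patent does not specify the retrieval algorithm, so simple keyword overlap against a database of (question, answer) pairs is assumed here.

```python
def fallback_answer(question, database):
    """When the dialogue model returns an empty reply, retrieve the database
    entry with the most keyword overlap with the question. Returns None when
    nothing overlaps at all."""
    q_tokens = set(question.lower().split())
    best, best_overlap = None, 0
    for entry_q, entry_a in database:
        overlap = len(q_tokens & set(entry_q.lower().split()))
        if overlap > best_overlap:
            best, best_overlap = entry_a, overlap
    return best
```

In practice the database lookup would likely use an inverted index or vector search, but the control flow (model output empty, then retrieve, then speak the answer) is the same.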
Corresponding to the above method embodiment, the present invention further provides a training apparatus for a dialogue model, and the training apparatus for a dialogue model described below and the training method for a dialogue model described above may be referred to in correspondence.
Referring to fig. 4, fig. 4 is a block diagram illustrating a training apparatus for a dialogue model according to an embodiment of the present invention, where the training apparatus for a dialogue model may include:
a general dialogue model obtaining module 41, configured to train an original dialogue model by using a pre-obtained general dialogue dataset to obtain a general dialogue model;
an initial labeling data set determining module 42, configured to obtain a preset professional keyword group, perform data screening on the general conversation data set according to the professional keyword group, and determine a data set obtained by the screening as an initial labeling data set;
an initial professional dialogue model obtaining module 43, configured to train the general dialogue model by using the initial annotation data set, so as to obtain an initial professional dialogue model;
a verification score obtaining module 44, configured to perform a verification operation on the initial professional dialogue model by using the verification data set and a preset natural language processing evaluation index, so as to obtain a verification score;
a judging module 45, configured to judge whether the verification score is greater than a preset score threshold;
and the target professional dialogue model determining module 46 is used for determining the initial professional dialogue model as the target professional dialogue model when the verification score is larger than the preset score threshold value.
According to the technical scheme, the target professional dialogue model applied to the specific dialogue scene is obtained through training based on the general dialogue model in advance, the requirements for data volume and computing power are greatly reduced, the trained target professional dialogue model has universality and professionality, and the use experience of a user is improved.
In an embodiment of the present invention, the training device for dialogue model may further include:
the response data generation module is used for generating corresponding response data for each sample data in the preset unmarked pool by using the initial professional dialogue model when the verification score is determined to be less than or equal to the preset score threshold value;
the automatic evaluation score calculation module is used for respectively calculating the automatic evaluation scores corresponding to the response data;
the automatic evaluation score selection module is used for sorting the automatic evaluation scores and selecting a preset number of automatic evaluation scores from the end with smaller scores;
the labeling prompt information output module is used for outputting labeling prompt information for labeling the response data corresponding to the selected automatic evaluation scores;
the annotation data set updating module is used for updating the initial annotation data set according to the annotation result to obtain an updated annotation data set;
the professional dialogue model updating module is used for training the initial professional dialogue model based on the updated labeled data set to obtain an updated professional dialogue model;
and the repeated execution module is used for carrying out verification operation on the updated professional dialogue model by utilizing the verification data set to obtain a verification score and repeatedly executing the step of judging whether the verification score is greater than a preset score threshold value.
In an embodiment of the present invention, the training device for dialogue model may further include:
and the unmarked pool updating module is used for updating the preset unmarked pool according to the updated marked data set after the updated marked data set is obtained.
In a specific embodiment of the present invention, the verification score obtaining module 44 is specifically configured to perform a verification operation on the initial professional dialogue model by combining the verification data set, the BLEU index, the ROUGE index, the PPL index, and the DISTINCT index through the following formula:

Score = S_BLEU + S_ROUGE + S_PPL + S_DISTINCT

wherein S_BLEU is the score of the initial professional dialogue model on the BLEU index, S_ROUGE is the score on the ROUGE index, S_PPL is the score on the PPL index in the form of the reciprocal of the PPL value, S_DISTINCT is the score on the DISTINCT index, and Score is the verification score.
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the BLEU index for calculating the score of the initial professional dialogue model on the BLEU index by the following formula:
BLEU = BP · exp( Σ_{n=1}^{N} w_n · log p_n ), with BP = 1 if c > r and BP = e^(1 − r/c) if c ≤ r;

wherein c is the length of the machine-translated sentence, r is the shortest length of the reference translation sentences, p_n is the precision of the n-gram, w_n is the weight of the n-gram (w_n = 1/N for any n), and BP is a brevity penalty factor.
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the ROUGE index for calculating the score of the initial professional dialogue model on the ROUGE index by the following formula:
ROUGE-N = [ Σ_{S ∈ {reference translations}} Σ_{gram_N ∈ S} Count_match(gram_N) ] / [ Σ_{S ∈ {reference translations}} Σ_{gram_N ∈ S} Count(gram_N) ]

wherein {reference translations} denotes the set of reference translations, gram_N represents a combination of N words, the denominator of the formula counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the PPL index for calculating the score of the initial professional dialogue model on the PPL index by the following formula:
PPL = exp( −(1/N) · Σ_{i=1}^{N} log p(w_i | w_1, …, w_{i−1}) )

wherein p(w_i | w_1, …, w_{i−1}) represents the probability of predicting the i-th word from the preceding words, and N represents the sentence length.
In an embodiment of the present invention, the training device for dialogue model may further include:
a score calculating module on the DISTINCT index for calculating the score of the initial professional dialogue model on the DISTINCT index by the following formula:
DISTINCT-n = Count(unique n-grams) / Count(total n-grams)

wherein Count(unique n-grams) indicates the number of non-repeating n-grams in the reply, and Count(total n-grams) represents the total number of n-gram words in the reply.
In an embodiment of the present invention, the training device for dialogue model may further include:
and the data filtering module is used for respectively filtering the question-answer data and the chitchat data in the general dialogue data set before training the original dialogue model with the pre-acquired general dialogue data set.
In one embodiment of the present invention, the general dialogue model obtaining module 41 includes:
the iterative training submodule is used for inputting the general dialogue data set into an original dialogue model to carry out model iterative training;
the loss standard deviation obtaining submodule is used for obtaining the current iteration number and the loss standard deviation obtained by the iteration training of the current round;
the training cutoff judgment submodule is used for determining whether a model training cutoff condition is met according to the current iteration number and the loss standard deviation;
and the general dialogue model determining submodule is used for determining the dialogue model obtained by the iteration training in the current round as the general dialogue model when the model training cut-off condition is determined to be reached according to the current iteration number and the loss standard deviation.
In an embodiment of the present invention, the training cutoff determination sub-module is a module for determining whether the current iteration number is greater than a first preset value and the loss standard deviation is smaller than a second preset value.
In an embodiment of the present invention, the training device for dialogue model may further include:
the iteration number counting submodule is used for judging whether the current iteration number is greater than a third preset value or not when the current iteration number is determined to be greater than a first preset value and the loss standard deviation is greater than or equal to a second preset value; wherein the third preset value is greater than the first preset value;
the general dialogue model determining submodule is also used for determining the dialogue model obtained by the iteration training in the current round as the general dialogue model when the current iteration number is larger than a third preset value;
and the iterative training submodule is also used for inputting the general dialogue data set into the dialogue model obtained by the iterative training of the current round to perform model iterative training when the current iterative number is less than or equal to a third preset value, and repeatedly executing the step of obtaining the current iterative number and the loss standard deviation obtained by the iterative training of the current round.
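The cutoff logic implemented by these submodules — a convergence test on the loss standard deviation plus a hard iteration cap — can be sketched as follows; the preset values are illustrative placeholders, not values from the patent:

```python
def should_stop(iteration, loss_std, first=100, second=0.01, third=500):
    """Training-cutoff check sketched from the description.

    Training stops when it has both run long enough (iteration > first)
    and converged (loss_std < second), or when a hard iteration cap
    (third, with third > first) is exceeded regardless of convergence.
    """
    if iteration > first and loss_std < second:
        return True   # converged: the loss has stabilized
    if iteration > third:
        return True   # hard cap reached: stop anyway
    return False      # keep iterating
```

The caller would evaluate this after each training round and, on `False`, feed the general dialogue data set back into the current-round model for another iteration.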
In an embodiment of the present invention, the initial labeled data set determining module 42 is specifically a module for performing data screening on the general dialogue data set according to the professional keyword set by using a DFA algorithm.
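DFA-style keyword screening is commonly realized as a trie scan, where the trie encodes the automaton's state transitions. A minimal sketch follows; the keyword examples are illustrative, not from the patent:

```python
def build_trie(keywords):
    """Build a trie (the DFA's transition structure) over the keyword set."""
    root = {}
    for word in keywords:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["#end"] = True  # marks a complete keyword
    return root

def contains_keyword(text, trie):
    """Scan from every start position; accept on any complete keyword match."""
    for i in range(len(text)):
        node = trie
        for ch in text[i:]:
            if ch not in node:
                break
            node = node[ch]
            if "#end" in node:
                return True
    return False

# Screen a general dialogue data set with professional keywords
# ("server", "raid" here are illustrative examples).
trie = build_trie(["server", "raid"])
kept = [s for s in ["restart the server", "nice weather today"]
        if contains_keyword(s, trie)]
```

Sentences that match at least one professional keyword form the initial labeling data set.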
Corresponding to the above method embodiment, the present invention further provides a dialog response device, and the dialog response device described below and the dialog response method described above may be referred to correspondingly.
Referring to fig. 5, fig. 5 is a block diagram of a dialog response device according to an embodiment of the present invention, where the dialog response device may include:
a question voice receiving module 51, configured to receive a target question voice to be responded;
a response speech generation module 52, configured to generate a target response speech corresponding to the target question speech by using a target professional dialogue model obtained based on training of the general dialogue model;
and a response voice output module 53, configured to perform an output operation on the target response voice.
According to the technical scheme, the target professional dialogue model applied to a specific dialogue scene is trained in advance on the basis of the general dialogue model, which greatly reduces the requirements for data volume and computing power; the trained target professional dialogue model combines generality with professional-domain competence, improving the user experience.
In an embodiment of the present invention, the dialog response device may further include:
the answer searching module is used for searching related answers from the database based on a preset retrieval algorithm when the target professional dialogue model fails to respond to the target question voice;
and the voice output module is used for performing voice output on the related answers.
Corresponding to the above method embodiment, referring to fig. 6, fig. 6 is a schematic diagram of an electronic device provided by the present invention, which may include:
a memory 332 for storing a computer program;
a processor 322 for implementing the steps of the training method or the dialogue response method of the dialogue model of the above method embodiments when executing the computer program.
Specifically, referring to fig. 7, fig. 7 is a schematic structural diagram of an electronic device provided in this embodiment. The device may vary considerably in configuration and performance, and may include a processor (CPU) 322 (e.g., one or more processors) and a memory 332, where the memory 332 stores one or more computer programs 342 or data 344. The memory 332 may be transient storage or persistent storage. A program stored in the memory 332 may include one or more modules (not shown), and each module may include a series of instruction operations on the data processing apparatus. Further, the processor 322 may be configured to communicate with the memory 332 to execute the series of instruction operations in the memory 332 on the electronic device 301.
The electronic device 301 may also include one or more power sources 326, one or more wired or wireless network interfaces 350, one or more input-output interfaces 358, and/or one or more operating systems 341.
The steps in the above-described dialog response method may be implemented by the structure of the electronic device.
Corresponding to the above method embodiment, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, can implement the following steps:
training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model; acquiring a preset professional keyword group, performing data screening on the universal conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set; training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model; verifying the initial professional dialogue model by using a verification data set and preset natural language processing evaluation indexes to obtain a verification score; judging whether the verification score is larger than a preset score threshold value or not; if so, determining the initial professional dialogue model as a target professional dialogue model;
or, alternatively,
receiving target question voice to be responded; generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training a general dialogue model; and performing output operation on the target response voice.
The computer-readable storage medium may include: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
For the introduction of the computer-readable storage medium provided by the present invention, please refer to the above method embodiments, which are not described herein again.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other. The device, the electronic device and the computer-readable storage medium disclosed by the embodiments correspond to the method disclosed by the embodiments, so that the description is simple, and the relevant points can be referred to the description of the method part.
The principle and the implementation of the present invention are explained in the present application by using specific examples, and the above description of the embodiments is only used to help understanding the technical solution and the core idea of the present invention. It should be noted that, for those skilled in the art, without departing from the principle of the present invention, it is possible to make various improvements and modifications to the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
Claims (19)
1. A method for training a dialogue model, comprising:
training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model;
acquiring a preset professional keyword group, performing data screening on the universal conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
training the general dialogue model by using the initial labeling data set to obtain an initial professional dialogue model;
carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
judging whether the verification score is larger than a preset score threshold value or not;
and if so, determining the initial professional dialogue model as a target professional dialogue model.
2. The training method of a dialogue model according to claim 1, wherein when it is determined that the verification score is equal to or less than the preset score threshold, the method further comprises:
generating corresponding response data for each sample data in a preset unmarked pool by using the initial professional dialogue model;
respectively calculating the automatic evaluation score corresponding to each response data;
sorting the automatic evaluation scores by magnitude, and selecting a preset number of automatic evaluation scores starting from the lower-scoring end;
outputting labeling prompt information for labeling the response data corresponding to each selected automatic evaluation score;
updating the initial labeling data set according to the labeling result to obtain an updated labeling data set;
training the initial professional dialogue model based on the updated labeling data set to obtain an updated professional dialogue model;
and carrying out verification operation on the updated professional dialogue model by using the verification data set to obtain a verification score, and repeatedly executing the step of judging whether the verification score is greater than a preset score threshold value.
3. The method for training a dialogue model of claim 2, further comprising, after obtaining the updated annotation data set:
and updating the preset unmarked pool according to the updated labeled data set.
4. The training method of dialogue model according to claim 1, wherein the performing a verification operation on the initial professional dialogue model using a verification data set and a preset evaluation index of natural language processing comprises:
and carrying out verification operation on the initial professional dialogue model by combining the verification data set, the BLEU index, the ROUGE index, the PPL index and the DISTINCT index through the following formula:
S = S_BLEU + S_ROUGE + S_PPL + S_DISTINCT

wherein S_BLEU is the score of the initial professional dialogue model on the BLEU index, S_ROUGE is the score of the initial professional dialogue model on the ROUGE index, S_PPL is the score of the initial professional dialogue model on the PPL index, in the form of the reciprocal of the raw PPL value, S_DISTINCT is the score of the initial professional dialogue model on the DISTINCT index, and S is the verification score.
5. The method for training a dialogue model according to claim 4, wherein the calculation process of the score of the initial professional dialogue model on the BLEU index comprises:
calculating the score of the initial professional dialogue model on the BLEU index by the following formula:
BLEU = BP · exp( Σ_{n=1}^{N} w_n · log p_n ), with BP = 1 if c > r and BP = e^(1 − r/c) if c ≤ r;
wherein c is the length of the machine-translated sentence, r is the shortest length of the reference translation sentences, p_n is the precision of the n-gram, w_n is the weight of the n-gram (w_n = 1/N for any n), and BP is a brevity penalty factor.
6. The method for training a dialogue model according to claim 4, wherein the calculation process of the score of the initial professional dialogue model on the ROUGE index comprises:
calculating the score of the initial professional dialogue model on the ROUGE index by the following formula:
ROUGE-N = [ Σ_{S ∈ {reference translations}} Σ_{gram_N ∈ S} Count_match(gram_N) ] / [ Σ_{S ∈ {reference translations}} Σ_{gram_N ∈ S} Count(gram_N) ]
wherein {reference translations} denotes the set of reference translations, gram_N represents a combination of N words, the denominator of the formula counts the number of N-grams in all the reference translations, and the numerator counts the number of N-grams shared by the reference translations and the machine translation.
7. The method for training a dialogue model according to claim 4, wherein the calculation process of the score of the initial professional dialogue model on the PPL index comprises:
calculating the score of the initial professional dialogue model on the PPL index by the following formula:
PPL = exp( −(1/N) · Σ_{i=1}^{N} log p(w_i | w_1, …, w_{i−1}) )
wherein p(w_i | w_1, …, w_{i−1}) represents the probability of predicting the i-th word from the preceding words, and N represents the sentence length.
8. The method for training a dialogue model according to claim 4, wherein the calculation process of the score of the initial professional dialogue model on the DISTINCT index comprises:
calculating the score of the initial professional dialogue model on the DISTINCT index by the following formula:
DISTINCT-n = Count(unique n-grams) / Count(total n-grams)
wherein Count(unique n-grams) indicates the number of non-repeating n-grams in the reply, and Count(total n-grams) represents the total number of n-gram words in the reply.
9. The method for training a dialogue model according to claim 1, further comprising, before training an original dialogue model using a pre-acquired common dialogue dataset:
and respectively filtering the question-answer data and the chitchat data in the general dialogue data set.
10. The method for training a dialogue model according to claim 1, wherein training an original dialogue model using a pre-acquired universal dialogue dataset to obtain a universal dialogue model comprises:
inputting the general dialogue data set into the original dialogue model for model iterative training;
obtaining a current iteration number and a loss standard deviation obtained by the iteration training of the current iteration number;
determining whether a model training cutoff condition is reached according to the current iteration number and the loss standard deviation;
and if so, determining the dialogue model obtained by the iterative training as the general dialogue model.
11. The method of claim 10, wherein determining whether a model training cutoff condition is met based on the current iteration number and the loss standard deviation comprises:
and judging whether the current iteration number is larger than a first preset value or not and the loss standard deviation is smaller than a second preset value.
12. The method for training a dialogue model of claim 11, wherein when it is determined that the current iteration number is greater than the first preset value and the loss standard deviation is greater than or equal to the second preset value, the method further comprises:
judging whether the current iteration number is larger than a third preset value or not; wherein the third preset value is greater than the first preset value;
if yes, determining the dialogue model obtained by the iterative training as the general dialogue model;
and if not, inputting the general dialogue data set into a dialogue model obtained by the current iteration training for model iteration training, and repeatedly executing the steps of obtaining the current iteration number and the loss standard deviation obtained by the current iteration training.
13. The method for training a dialogue model according to claim 1, wherein the step of performing data screening on the general dialogue data set according to the professional keyword group comprises:
and screening the data of the general dialogue data set according to the professional keyword set by utilizing a DFA algorithm.
14. A dialogue response method applied to a dialogue system including a target professional dialogue model trained according to any one of claims 1 to 13, comprising:
receiving target question voice to be responded;
generating target response voice corresponding to the target question voice by using a target professional dialogue model obtained by training a general dialogue model;
and performing output operation on the target response voice.
15. The dialog response method of claim 14 further comprising:
searching a relevant answer from a database based on a preset retrieval algorithm when the target professional dialogue model fails to respond to the target question voice;
and carrying out voice output on the related answers.
16. An apparatus for training a dialogue model, comprising:
the general dialogue model acquisition module is used for training an original dialogue model by utilizing a pre-acquired general dialogue data set to obtain a general dialogue model;
the initial labeling data set determining module is used for acquiring a preset professional keyword group, performing data screening on the general conversation data set according to the professional keyword group, and determining the screened data set as an initial labeling data set;
the initial professional dialogue model obtaining module is used for training the general dialogue model by utilizing the initial labeling data set to obtain an initial professional dialogue model;
the verification score obtaining module is used for carrying out verification operation on the initial professional dialogue model by utilizing a verification data set and preset natural language processing evaluation indexes to obtain a verification score;
the judging module is used for judging whether the verification score is larger than a preset score threshold value or not;
and the target professional dialogue model determining module is used for determining the initial professional dialogue model as the target professional dialogue model when the verification score is larger than a preset score threshold value.
17. A dialog response device comprising:
the question voice receiving module is used for receiving target question voice to be responded;
the response voice generation module is used for generating target response voice corresponding to the target question voice by utilizing a target professional dialogue model obtained by training a general dialogue model;
and the response voice output module is used for outputting the target response voice.
18. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the training method of the dialogue model according to any one of claims 1 to 13 or the dialogue response method according to any one of claims 14 to 15 when executing the computer program.
19. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the training method of the dialogue model according to any one of claims 1 to 13 or the dialogue response method according to any one of claims 14 to 15.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211441290.4A CN115495568B (en) | 2022-11-17 | 2022-11-17 | Training method and device for dialogue model, dialogue response method and device |
PCT/CN2023/086071 WO2024103609A1 (en) | 2022-11-17 | 2023-04-04 | Dialogue-model training method and apparatus, and dialogue response method and apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211441290.4A CN115495568B (en) | 2022-11-17 | 2022-11-17 | Training method and device for dialogue model, dialogue response method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115495568A true CN115495568A (en) | 2022-12-20 |
CN115495568B CN115495568B (en) | 2023-08-22 |
Family
ID=85116091
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211441290.4A Active CN115495568B (en) | 2022-11-17 | 2022-11-17 | Training method and device for dialogue model, dialogue response method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115495568B (en) |
WO (1) | WO2024103609A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116127035A (en) * | 2023-01-03 | 2023-05-16 | 北京百度网讯科技有限公司 | Dialogue method, training method and training device for dialogue model |
CN116432665A (en) * | 2023-06-15 | 2023-07-14 | 北京中关村科金技术有限公司 | Dialogue model construction method, text generation method, device, system and equipment |
CN117828063A (en) * | 2024-01-10 | 2024-04-05 | 广东数业智能科技有限公司 | Psychological field data generation and model training method and device and storage medium |
WO2024103609A1 (en) * | 2022-11-17 | 2024-05-23 | 苏州元脑智能科技有限公司 | Dialogue-model training method and apparatus, and dialogue response method and apparatus |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108897797A (en) * | 2018-06-12 | 2018-11-27 | 腾讯科技(深圳)有限公司 | Update training method, device, storage medium and the electronic equipment of dialog model |
WO2021049199A1 (en) * | 2019-09-13 | 2021-03-18 | Mitsubishi Electric Corporation | System and method for a dialogue response generation system |
CN114968788A (en) * | 2022-05-27 | 2022-08-30 | 浙江大学 | Method, apparatus, medium, and device for automatically evaluating programming capability of artificial intelligence algorithm |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110188331B (en) * | 2019-06-03 | 2023-05-26 | 腾讯科技(深圳)有限公司 | Model training method, dialogue system evaluation method, device, equipment and storage medium |
US11561969B2 (en) * | 2020-03-30 | 2023-01-24 | Adobe Inc. | Utilizing logical-form dialogue generation for multi-turn construction of paired natural language queries and query-language representations |
CN115495568B (en) * | 2022-11-17 | 2023-08-22 | 苏州浪潮智能科技有限公司 | Training method and device for dialogue model, dialogue response method and device |
2022
- 2022-11-17: CN application CN202211441290.4A filed; granted as CN115495568B (status: Active)
2023
- 2023-04-04: WO application PCT/CN2023/086071 filed
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024103609A1 (en) * | 2022-11-17 | 2024-05-23 | 苏州元脑智能科技有限公司 | Dialogue-model training method and apparatus, and dialogue response method and apparatus |
CN116127035A (en) * | 2023-01-03 | 2023-05-16 | 北京百度网讯科技有限公司 | Dialogue method, training method and training device for dialogue model |
CN116127035B (en) * | 2023-01-03 | 2023-12-08 | 北京百度网讯科技有限公司 | Dialogue method, training method and training device for dialogue model |
CN116432665A (en) * | 2023-06-15 | 2023-07-14 | 北京中关村科金技术有限公司 | Dialogue model construction method, text generation method, device, system and equipment |
CN116432665B (en) * | 2023-06-15 | 2023-10-10 | 北京中关村科金技术有限公司 | Dialogue model construction method, text generation method, device, system and equipment |
CN117828063A (en) * | 2024-01-10 | 2024-04-05 | 广东数业智能科技有限公司 | Psychological field data generation and model training method and device and storage medium |
CN117828063B (en) * | 2024-01-10 | 2024-05-17 | 广东数业智能科技有限公司 | Psychological field data generation and model training method and device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2024103609A1 (en) | 2024-05-23 |
CN115495568B (en) | 2023-08-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107944027B (en) | Method and system for creating semantic key index | |
CN115495568A (en) | Training method and device for dialogue model and dialogue response method and device | |
CN110096567B (en) | QA knowledge base reasoning-based multi-round dialogue reply selection method and system | |
CN109815336B (en) | Text aggregation method and system | |
CN108846138B (en) | Question classification model construction method, device and medium fusing answer information | |
CN112417127B (en) | Dialogue model training and dialogue generation methods, devices, equipment and media | |
CN110633359B (en) | Sentence equivalence judgment method and device | |
CN1571013A (en) | Method and device for predicting word error rate from text | |
CN111026840B (en) | Text processing method, device, server and storage medium | |
WO2024066920A1 (en) | Processing method and apparatus for dialogue in virtual scene, and electronic device, computer program product and computer storage medium | |
CN110347802A (en) | A kind of text analyzing method and device | |
CN114020906A (en) | Chinese medical text information matching method and system based on twin neural network | |
CN110727769B (en) | Corpus generation method and device and man-machine interaction processing method and device | |
CN114003682A (en) | Text classification method, device, equipment and storage medium | |
CN116049387A (en) | Short text classification method, device and medium based on graph convolution | |
CN116910220A (en) | Multi-round dialogue interaction processing method, device, equipment and storage medium | |
CN112905772A (en) | Semantic correlation analysis method and device and related products | |
WO2023169301A1 (en) | Text processing method and apparatus, and electronic device | |
CN111401069A (en) | Intention recognition method and intention recognition device for conversation text and terminal | |
CN115203388A (en) | Machine reading understanding method and device, computer equipment and storage medium | |
CN115408500A (en) | Question-answer consistency evaluation method and device, electronic equipment and medium | |
CN111159339A (en) | Text matching processing method and device | |
Enayet et al. | An analysis of dialogue act sequence similarity across multiple domains | |
CN117453895B (en) | Intelligent customer service response method, device, equipment and readable storage medium | |
CN116010583B (en) | Cascade coupling knowledge enhancement dialogue generation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||