CN117421573A - Training method, device and storage medium for question-answer retrieval model


Info

Publication number
CN117421573A
Authority
CN
China
Prior art keywords: question, text, answer, long, answering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311457081.3A
Other languages
Chinese (zh)
Inventor
吴光鹏
李娇
薛智慧
余小军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Original Assignee
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Topsec Technology Co Ltd, Beijing Topsec Network Security Technology Co Ltd, Beijing Topsec Software Co Ltd filed Critical Beijing Topsec Technology Co Ltd
Priority to CN202311457081.3A
Publication of CN117421573A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The embodiment of the application provides a training method, a training device and a storage medium for a question-answer retrieval model. The method comprises the following steps: determining a training data set, wherein the training data set comprises a plurality of long question-answer texts; sequentially inputting each long question-answer text to an encoder, and determining a first loss value of the training data set based on the output of the encoder and a first loss function; splitting each long question-answer text to obtain a plurality of short question-answer texts; determining a second loss value of the training data set based on a second loss function and all of the short question-answer texts; inputting the plurality of answer sentences to the encoder and the adversarial network, so as to determine a third loss value of the training data set from the output of the encoder, the output of the adversarial network and a third loss function; determining a total loss value of the training data set according to the first loss value, the second loss value and the third loss value; and adjusting the weight coefficients according to the total loss value until training of the question-answer retrieval model is completed, thereby improving model performance and the accuracy of subsequent question-answer retrieval.

Description

Training method, device and storage medium for question-answer retrieval model
Technical Field
The present application relates to the field of natural language processing technologies, and in particular, to a training method and apparatus for a question-answer retrieval model, and a storage medium.
Background
The question-answering scenario is a human-machine interaction mode based on natural language that can provide relevant answers to users' questions. Currently, when training a question-answer retrieval model, positive and negative samples are first constructed, and the model is trained with an InfoNCE loss (a contrastive learning loss function). Then, information entropy minimization can be carried out, and loss training can be performed by means of contrastive learning. Next, positive and negative samples can be reconstructed, and the minimum information entropy is calculated. Finally, sentence representation learning can be performed, and the model is evaluated on downstream tasks.
However, for long texts, the answer lengths differ, and if the model is trained directly on long texts, the loss function fluctuates greatly. Especially after reconstruction, the loss function of the long-text negative samples fluctuates even more under constant temperature parameters and batch-size control. Moreover, after loss training with contrastive learning, a contrastive loss function is calculated, and the value chosen for its weight has a large influence on the result. Therefore, with the prior-art technical scheme of training a question-answer retrieval model on long texts, the performance of the trained question-answer retrieval model is low, and the accuracy of subsequent question-answer retrieval is low.
Disclosure of Invention
The embodiment of the application aims to provide a training method, a training device and a storage medium for a question-answer retrieval model, so as to solve the problem that question-answer retrieval models in the prior art have low performance.
To achieve the above object, a first aspect of the present application provides a training method for a question-answer retrieval model, the question-answer retrieval model comprising an encoder and an adversarial network, the method comprising:
determining a training data set, wherein the training data set comprises a plurality of long question-answer texts, each long question-answer text carries a hierarchical label of the file in which it is located, each long question-answer text comprises a question and an answer, and the answer comprises at least one answer sentence;
sequentially inputting each long question-answer text to an encoder, and determining a first loss value of the training data set based on the output of the encoder and a first loss function;
for each long question-answer text, splitting the long question-answer text to obtain a plurality of short question-answer texts, wherein each short question-answer text comprises the question and one answer sentence of the answer;
determining a second loss value of the training data set based on a second loss function and all of the short question-answer texts;
inputting the plurality of answer sentences included in each long question-answer text to the encoder and the adversarial network, so as to determine a third loss value of the training data set through the output of the encoder, the output of the adversarial network and a third loss function;
determining a total loss value of the training data set according to the first loss value, the second loss value and the third loss value; and
adjusting the weight coefficients of the first loss function, the second loss function and the third loss function respectively when the total loss value does not meet a preset condition, and returning to the step of determining the training data set until the redetermined total loss value meets the preset condition.
In an embodiment of the present application, determining the second loss value of the training data set based on the second loss function and all of the short question-answer texts includes: cleaning the answer sentence in each short question-answer text; determining the cosine similarity between the question and the answer sentence in each cleaned short question-answer text; arbitrarily selecting one short question-answer text from the plurality of cleaned short question-answer texts included in each long question-answer text as a target text; for each target text, determining a word frequency value and an inverse text frequency index of each keyword in the question included in the target text under each hierarchical label; and determining the second loss value of the training data set, based on the second loss function, according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts and all the cosine similarities.
In an embodiment of the present application, determining, based on the second loss function, the second loss value of the training data set according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts and all the cosine similarities includes: determining a total word frequency value and a total inverse text frequency index of the training data set according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts; for each long question-answer text, determining the cosine similarity with the largest value among the plurality of cosine similarities; and determining the second loss value, based on the second loss function, according to the total word frequency value, the total inverse text frequency index and the maximum cosine similarity of each long question-answer text.
In the embodiment of the application, each long question-answering text carries a primary label of the primary file and a secondary label of the secondary file, and the expression of the second loss function is defined by the formula (1):
where Loss_1 refers to the second loss value of the second loss function, QA refers to the total number of long question-answer texts, K refers to the total number of keywords in the question included in each target text, λ_21 refers to the first weighting weight corresponding to the primary label, TF_w^1 refers to the word frequency value of each keyword under the primary label, IDF_w^1 refers to the inverse text frequency index of each keyword under the primary label, λ_22 refers to the second weighting weight corresponding to the secondary label, TF_w^2 refers to the word frequency value of each keyword under the secondary label, IDF_w^2 refers to the inverse text frequency index of each keyword under the secondary label, λ_2 refers to the weight coefficient of the second loss function, i refers to the i-th cleaned short question-answer text and ranges from 1 to Z, Z refers to the total number of cleaned short question-answer texts, Q refers to the vector corresponding to the question in each cleaned short question-answer text, y(i) refers to the vector corresponding to the answer sentence in each cleaned short question-answer text, and sim(Q, y(i)) refers to the cosine similarity between the question in each cleaned short question-answer text and its answer sentence.
In an embodiment of the present application, the adversarial network includes a generator and a discriminator, and inputting the plurality of answer sentences included in each long question-answer text to the encoder and the adversarial network to determine the third loss value of the training data set through the output of the encoder, the output of the adversarial network and the third loss function includes: for the plurality of answer sentences corresponding to each long question-answer text, masking each answer sentence according to a preset mask proportion to obtain a plurality of masked answer sentences; inputting each masked answer sentence to the generator so that the generator outputs a new answer sentence corresponding to each masked answer sentence; for the plurality of answer sentences corresponding to each long question-answer text, inputting each answer sentence to the encoder so that the encoder outputs a first text vector corresponding to each answer sentence; inputting the second text vector of each new answer sentence and the first text vector of the corresponding answer sentence to the discriminator so that the discriminator outputs a discrimination result for each new answer sentence; for all the new answer sentences corresponding to each long question-answer text, determining the accuracy of the generator for each long question-answer text according to the discrimination results of all the new answer sentences; and determining the third loss value of the training data set based on the third loss function and the overall accuracy.
In the embodiment of the present application, the expression of the third loss function is defined by formula (2):
where Loss_2 refers to the third loss value of the third loss function, λ_3 refers to the weight coefficient of the third loss function, QA refers to the total number of long question-answer texts, A refers to the total number of answer sentences corresponding to each long question-answer text, N refers to the number of words to be masked in each answer sentence corresponding to each long question-answer text, P refers to the accuracy with which the generator outputs all the new answer sentences, m_x refers to the x-th masked word in each sentence, and A_y refers to the y-th answer sentence of each long question-answer text.
In an embodiment of the present application, sequentially inputting each long question-answer text to the encoder and determining the first loss value of the training data set based on the output of the encoder and the first loss function includes: for each long question-answer text, inputting the long question-answer text to the encoder a preset number of times so that the encoder sequentially outputs text vectors corresponding to the long question-answer text; determining a vector pair consisting of text vectors of the same long question-answer text as a positive example, and determining a vector pair consisting of text vectors of different long question-answer texts as a negative example; and determining the first loss value of the training data set, based on the first loss function, according to the number of positive examples and the number of negative examples corresponding to all the long question-answer texts.
In the embodiment of the present application, the expression of the first loss function is shown by formula (3):
where Loss_3 refers to the first loss value of the first loss function, λ_1 refers to the weight coefficient of the first loss function, QA refers to the total number of long question-answer texts, the positive-example term refers to the number of positive examples generated by the m-th long question-answer text, (k_m, k_n) refers to the number of negative examples generated by the m-th long question-answer text and the n-th long question-answer text, τ refers to the temperature coefficient, and q refers to the similarity calculation parameter.
A second aspect of the present application provides a training device for a question-answer search model, including:
a memory configured to store instructions; and
a processor configured to invoke the instructions from the memory and, when executing the instructions, to implement the training method for the question-answer retrieval model described above.
A third aspect of the present application provides a machine-readable storage medium having stored thereon instructions for causing a machine to perform the training method for a question-answer retrieval model described above.
According to the above technical scheme, the influence of the whole long question-answer text on the training of the question-answer retrieval model is considered: each long question-answer text is sequentially input to the encoder, and the first loss value of the training data set is determined based on the output of the encoder and the first loss function. Considering the divergence and representativeness of each answer sentence in the answer of a long question-answer text, the long question-answer text is split into a plurality of short question-answer texts, and the second loss value of the training data set is determined based on the second loss function and all of the short question-answer texts. Considering that overly concentrated masks cause the loss function to fluctuate greatly, the plurality of answer sentences included in each long question-answer text are input to the encoder and the adversarial network, so that the third loss value of the training data set is determined through the output of the encoder, the output of the adversarial network and the third loss function. The total loss value of the training data set is then determined according to the first loss value, the second loss value and the third loss value, and the weight coefficients of the loss functions are adjusted based on the total loss value. The model is thus trained comprehensively, the trained question-answer retrieval model performs better, and the accuracy of subsequent question-answer retrieval is improved.
Additional features and advantages of embodiments of the present application will be set forth in the detailed description that follows.
Drawings
The accompanying drawings are included to provide a further understanding of embodiments of the present application and are incorporated in and constitute a part of this specification, illustrate embodiments of the present application and together with the description serve to explain, without limitation, the embodiments of the present application. In the drawings:
FIG. 1 schematically illustrates a flow diagram of a training method for a question-answer retrieval model according to an embodiment of the present application;
FIG. 2 schematically illustrates a flow diagram of a training method for a question-answer retrieval model according to another embodiment of the present application;
fig. 3 schematically shows an internal structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it should be understood that the specific implementations described herein are only for illustrating and explaining the embodiments of the present application, and are not intended to limit the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present application based on the embodiments herein.
It should be noted that, if directional indications (such as up, down, left, right, front and rear) are involved in the embodiments of the present application, the directional indications are merely used to explain the relative positional relationship, movement conditions and the like between the components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indications change correspondingly.
In addition, if descriptions of "first", "second" and the like appear in the embodiments of the present application, such descriptions are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when the technical solutions are contradictory or cannot be realized, the combination should be regarded as nonexistent and outside the protection scope of the present application.
Fig. 1 schematically shows a flow diagram of a training method for a question-answer retrieval model according to an embodiment of the present application. As shown in fig. 1, in an embodiment of the present application, there is provided a training method for a question-answer retrieval model, the question-answer retrieval model including an encoder and an adversarial network, the method including the following steps:
Step 101: a training data set is determined, wherein the training data set comprises a plurality of long question-answer texts, each long question-answer text carries a hierarchical label of a file where the long question-answer text is located, each long question-answer text comprises a question and an answer, and the answer at least comprises one answer sentence.
The processor may obtain a plurality of long question-answer texts from a corpus to determine the training data set. Each long question-answer text includes a question and an answer, and the answer includes at least one answer sentence. Each long question-answer text carries a hierarchical label of the file in which it is located; the hierarchical label may be the file name of that file. For example, if the file W1 includes a file W2, and the file W2 includes a long question-answer text QA_1, then the files in which the long question-answer text is located include the file W1 and the file W2, and the corresponding hierarchical labels include a label of the file W1 and a label of the file W2.
Step 102: each long question-answer text is input to the encoder in turn, and a first loss value of the training data set is determined based on the output of the encoder and the first loss function.
The processor may input each long question-answer text in turn to the encoder and may determine a first penalty value for the training data set based on the output of the encoder and the first penalty function. The encoder may include BERT and RoBERTa, among others.
In an embodiment of the present application, sequentially inputting each long question-answer text to the encoder and determining the first loss value of the training data set based on the output of the encoder and the first loss function includes: for each long question-answer text, inputting the long question-answer text to the encoder a preset number of times so that the encoder sequentially outputs text vectors corresponding to the long question-answer text; determining a vector pair consisting of text vectors of the same long question-answer text as a positive example, and determining a vector pair consisting of text vectors of different long question-answer texts as a negative example; and determining the first loss value of the training data set, based on the first loss function, according to the number of positive examples and the number of negative examples corresponding to all the long question-answer texts.
For each long question-answer text, the processor may input the long question-answer text to the encoder a preset number of times, so that the encoder sequentially outputs text vectors corresponding to the long question-answer text. The preset number of times can be customized according to the actual situation; for example, it may be 2. The processor may determine a vector pair consisting of text vectors of the same long question-answer text as a positive example, and may determine a vector pair consisting of text vectors of different long question-answer texts as a negative example. The processor may then determine the first loss value of the training data set, based on the first loss function, according to the number of positive examples and the number of negative examples corresponding to all the long question-answer texts. In this scheme, the encoder is optimized through contrastive learning with a similarity-measure-based first loss function, so that the encoder exhibits better generalization and transfer capability.
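For illustration only (this sketch is not part of the original disclosure), the double-pass encoding and the construction of positive and negative pairs described above can be outlined as follows; the encoder checkpoint, [CLS] pooling, placeholder texts and temperature value are assumptions.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

# Encoding the same batch twice with dropout active yields two slightly different
# vectors per text: the two vectors of one text form a positive pair, and vectors
# of different texts act as negatives.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")
encoder.train()  # keep dropout active so the two passes differ

def encode(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    return encoder(**batch).last_hidden_state[:, 0]  # [CLS] vectors

long_qa_texts = ["Q1 A1 A2 A3", "Q2 B1 B2"]  # placeholder long question-answer texts
z1, z2 = encode(long_qa_texts), encode(long_qa_texts)  # two passes over the same texts
sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1)  # pairwise similarities
tau = 0.05  # assumed temperature coefficient
loss = F.cross_entropy(sim / tau, torch.arange(len(long_qa_texts)))  # diagonal = positives
```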
In the embodiment of the present application, the expression of the first loss function is defined by formula (3):
where Loss_3 refers to the first loss value of the first loss function, λ_1 refers to the weight coefficient of the first loss function, QA refers to the total number of long question-answer texts, the positive-example term refers to the number of positive examples generated by the m-th long question-answer text, (k_m, k_n) refers to the number of negative examples generated by the m-th long question-answer text and the n-th long question-answer text, τ refers to the temperature coefficient, and q refers to the similarity calculation parameter.
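A plausible InfoNCE-style form consistent with the variable definitions above (an assumption, not the patent's own formula (3)) is:

```latex
\mathrm{Loss}_3
  = -\,\lambda_1\,\frac{1}{QA}\sum_{m=1}^{QA}
    \log \frac{\exp\!\bigl(\operatorname{sim}(q, k_m^{+})/\tau\bigr)}
              {\sum_{n=1}^{QA}\exp\!\bigl(\operatorname{sim}(k_m, k_n)/\tau\bigr)}
```

Here k_m^+ stands for the positive example generated by the m-th long question-answer text; the exact placement of λ_1 and of the positive and negative counts in the original formula may differ.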
Step 103: and splitting the long question-answer text aiming at each long question-answer text to obtain a plurality of short question-answer texts, wherein each short question-answer text comprises one answer sentence in the questions and the answers.
If fuzzy matching of questions and answers is adopted, multiple answers with high similarity may exist for the same question, and if a long question-answer text includes an overly long answer, the answer may be too divergent. To ensure the representativeness of each answer-sentence vector mapping in the long question-answer text, the processor may, for each long question-answer text, split the long question-answer text to obtain a plurality of short question-answer texts, each short question-answer text including the question and one answer sentence of the answer. In one embodiment, the processor may identify the periods included in the answer of the long question-answer text and split the answer into a plurality of answer sentences according to the periods. The question of the long question-answer text may then be combined with each answer sentence to obtain the plurality of short question-answer texts. For example, for a long question-answer text QA_1 (Q1+A1+A2+A3), the corresponding short question-answer texts include: Q1+A1; Q1+A2; Q1+A3.
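As a rough illustration (not part of the original disclosure), the period-based split can be sketched as follows; the period characters handled and the concatenation format are assumptions.

```python
import re

def split_long_qa(question: str, answer: str) -> list[str]:
    """Split a long answer into answer sentences at periods and pair each
    sentence with the question, yielding several short question-answer texts."""
    sentences = [s.strip() for s in re.split(r"[。.]", answer) if s.strip()]
    return [f"{question}{s}" for s in sentences]

# split_long_qa("Q1", "A1。A2。A3。") -> ["Q1A1", "Q1A2", "Q1A3"]
```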
Step 104: a second penalty value for the training data set is determined based on the second penalty function and the entire short question-answer text.
The processor may determine a second penalty value for the training dataset based on the second penalty function and the entire short question-answer text.
In an embodiment of the present application, determining the second loss value of the training data set based on the second loss function and all of the short question-answer texts includes: cleaning the answer sentence in each short question-answer text; determining the cosine similarity between the question and the answer sentence in each cleaned short question-answer text; arbitrarily selecting one short question-answer text from the plurality of cleaned short question-answer texts included in each long question-answer text as a target text; for each target text, determining a word frequency value and an inverse text frequency index of each keyword in the question included in the target text under each hierarchical label; and determining the second loss value of the training data set, based on the second loss function, according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts and all the cosine similarities.
The processor may perform a cleaning process on the answer sentence in each short question-answer text to obtain each cleaned short question-answer text. In one embodiment, the processor may clean away preset auxiliary words, adverbs, punctuation and the like in the answer sentence of each short question-answer text, so that the features of the answer sentence are highlighted. The processor can determine the cosine similarity between the question and the answer sentence in each cleaned short question-answer text, so as to avoid deviation caused by overly large length differences between the short question-answer texts and to improve the comparability and stability between them. The processor may arbitrarily select one short question-answer text from the plurality of cleaned short question-answer texts included in each long question-answer text as the target text. For each target text, the processor may determine a word frequency value and an inverse text frequency index of each keyword in the question included in the target text under each hierarchical label.
For example, consider the target texts Q1+A1* and Q1+A2*. If the words of Q1+A1* under the primary label of its primary file are X, X and Y, then the word frequency value of the keyword X included in Q1 under the primary label is 2/3; if the words of Q1+A1* under the secondary label of its secondary file are X, Y and Z, then the word frequency value of the keyword X included in Q1 under the secondary label is 1/3. If the words of Q1+A2* under the primary label of its primary file are X, Y and Y, then the inverse text frequency index of the keyword X included in Q1 under the primary label is the ratio of the total number of primary files to the number of primary files containing the keyword X, namely 2/2. Having determined the plurality of word frequency values and the plurality of inverse text frequency indexes for each target text, the processor may determine the second loss value of the training data set, based on the second loss function, according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts and all the cosine similarities.
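A minimal sketch of the per-level word frequency value and inverse text frequency index matching the worked example above (2/3 and 2/2); the function and argument names are hypothetical and not part of the original disclosure.

```python
from collections import Counter

def tf_idf_per_level(keyword: str, target_words: list[str], level_docs: list[list[str]]):
    """Word frequency value and inverse text frequency index of `keyword` at one
    hierarchy level: occurrences / words in the target text, and total files at
    this level / files at this level containing the keyword."""
    counts = Counter(target_words)
    tf = counts[keyword] / len(target_words)          # e.g. 2/3 for X in [X, X, Y]
    containing = sum(1 for doc in level_docs if keyword in doc)
    idf = len(level_docs) / max(containing, 1)        # e.g. 2/2 in the example above
    return tf, idf

tf, idf = tf_idf_per_level("X", ["X", "X", "Y"], [["X", "X", "Y"], ["X", "Y", "Y"]])
```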
In an embodiment of the present application, determining, based on the second loss function, the second loss value of the training data set according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts and all the cosine similarities includes: determining a total word frequency value and a total inverse text frequency index of the training data set according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts; for each long question-answer text, determining the cosine similarity with the largest value among the plurality of cosine similarities; and determining the second loss value, based on the second loss function, according to the total word frequency value, the total inverse text frequency index and the maximum cosine similarity of each long question-answer text.
The processor may determine the total word frequency value and the total inverse text frequency index of the training data set according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts. For each long question-answer text, the processor may determine the cosine similarity with the largest value among the plurality of cosine similarities. The processor may then determine the second loss value, based on the second loss function, according to the total word frequency value, the total inverse text frequency index and the maximum cosine similarity of each long question-answer text.
In the embodiment of the application, each long question-answering text carries a primary label of the primary file and a secondary label of the secondary file, and the expression of the second loss function is defined by the formula (1):
where Loss_1 refers to the second loss value of the second loss function, QA refers to the total number of long question-answer texts, K refers to the total number of keywords in the question included in each target text, λ_21 refers to the first weighting weight corresponding to the primary label, the word frequency value of each keyword under the primary label is N_1/M_1, where N_1 refers to the number of occurrences of the keyword under the primary label and M_1 refers to the total number of words under the primary label, the inverse text frequency index of each keyword under the primary label is D/D_w1, where D refers to the total number of primary labels and D_w1 refers to the number of primary labels containing the keyword, λ_22 refers to the second weighting weight corresponding to the secondary label, the word frequency value of each keyword under the secondary label is N_2/M_2, where N_2 refers to the number of occurrences of the keyword under the secondary label and M_2 refers to the total number of words under the secondary label, the inverse text frequency index of each keyword under the secondary label is D/D_w2, where D_w2 refers to the number of secondary labels containing the keyword, λ_2 refers to the weight coefficient of the second loss function, i refers to the i-th cleaned short question-answer text and ranges from 1 to Z, Z refers to the total number of cleaned short question-answer texts, Q refers to the vector corresponding to the question in each cleaned short question-answer text, y(i) refers to the vector corresponding to the answer sentence in each cleaned short question-answer text, and sim(Q, y(i)) refers to the cosine similarity between the question in each cleaned short question-answer text and its answer sentence.
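One possible combination of the quantities defined above, offered only as an assumption and not as the patent's own formula (1), is:

```latex
\mathrm{Loss}_1
  = \lambda_2\,\frac{1}{QA}\sum_{QA}\!\left[
      \sum_{w=1}^{K}\left(
        \lambda_{21}\,\frac{N_1}{M_1}\cdot\frac{D}{D_{w1}}
        + \lambda_{22}\,\frac{N_2}{M_2}\cdot\frac{D}{D_{w2}}
      \right)
      + \max_{1\le i\le Z}\operatorname{sim}\bigl(Q,\,y(i)\bigr)
    \right]
```

Whether the hierarchical TF-IDF term and the maximum cosine similarity are combined exactly this way cannot be recovered from the text; only the individual quantities are defined above.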
Step 105: the plurality of questions included in each long question-and-answer text are input to the encoder and the challenge network to determine a third loss value of the training data set via an output of the encoder, an output of the challenge network, and a third loss function.
Wherein the countermeasure network is a gan network. The processor may input a plurality of questions included in each long question-and-answer text to the encoder and the countermeasure network to determine a third loss value of the training data set through an output of the encoder, an output of the countermeasure network, and a third loss function.
In an embodiment of the present application, the adversarial network includes a generator and a discriminator, and inputting the plurality of answer sentences included in each long question-answer text to the encoder and the adversarial network to determine the third loss value of the training data set through the output of the encoder, the output of the adversarial network and the third loss function includes: for the plurality of answer sentences corresponding to each long question-answer text, masking each answer sentence according to a preset mask proportion to obtain a plurality of masked answer sentences; inputting each masked answer sentence to the generator so that the generator outputs a new answer sentence corresponding to each masked answer sentence; for the plurality of answer sentences corresponding to each long question-answer text, inputting each answer sentence to the encoder so that the encoder outputs a first text vector corresponding to each answer sentence; inputting the second text vector of each new answer sentence and the first text vector of the corresponding answer sentence to the discriminator so that the discriminator outputs a discrimination result for each new answer sentence; for all the new answer sentences corresponding to each long question-answer text, determining the accuracy of the generator for each long question-answer text according to the discrimination results of all the new answer sentences; and determining the third loss value of the training data set based on the third loss function and the overall accuracy.
The adversarial network includes a generator and a discriminator. For the plurality of answer sentences corresponding to each long question-answer text, the processor may mask each answer sentence according to a preset mask proportion to obtain a plurality of masked answer sentences. The preset mask proportion can be customized according to the actual situation. The processor may input each masked answer sentence to the generator so that the generator outputs a new answer sentence corresponding to each masked answer sentence. For example, the preset mask proportion may be 15%; in this case, the masked answer sentences input to the generator allow the generator to output new answer sentences with higher integrity and representativeness.
For the plurality of answer sentences corresponding to each long question-answer text, the processor may input each answer sentence to the encoder so that the encoder outputs a first text vector corresponding to each answer sentence. The processor may determine a second text vector of each new answer sentence, and may input the second text vector of each new answer sentence and the first text vector of the corresponding answer sentence to the discriminator so that the discriminator outputs a discrimination result for each new answer sentence. The discrimination result indicates whether the generated new answer sentence is consistent with the input answer sentence. For all the new answer sentences corresponding to each long question-answer text, the processor may determine the accuracy of the generator for each long question-answer text according to the discrimination results of all the new answer sentences. The processor may then determine the third loss value of the training data set based on the third loss function and the overall accuracy.
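For illustration only (not part of the original disclosure), the 15% masking of answer sentences and the accuracy computed from the discriminator's judgements can be sketched as follows; the token-level masking and the 0/1 judgement encoding are assumptions.

```python
import random

def mask_answer_sentence(tokens: list[str], mask_ratio: float = 0.15,
                         mask_token: str = "[MASK]") -> list[str]:
    """Randomly mask a preset proportion of the tokens in one answer sentence."""
    n_mask = max(1, int(len(tokens) * mask_ratio))
    positions = set(random.sample(range(len(tokens)), n_mask))
    return [mask_token if i in positions else t for i, t in enumerate(tokens)]

def generator_accuracy(judgements: list[int]) -> float:
    """Share of regenerated answer sentences the discriminator judged consistent
    with the original answer sentence (1 = consistent, 0 = inconsistent)."""
    return sum(judgements) / len(judgements)
```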
In this scheme, splitting the answer avoids overly long answers for the same question, and the readability and robustness of the model can be further improved even when the masking of long question-answer texts fluctuates.
In the embodiment of the present application, the expression of the third loss function is defined by formula (2):
where Loss_2 refers to the third loss value of the third loss function, λ_3 refers to the weight coefficient of the third loss function, QA refers to the total number of long question-answer texts, A refers to the total number of answer sentences corresponding to each long question-answer text, N refers to the number of words to be masked in each answer sentence corresponding to each long question-answer text, P refers to the accuracy with which the generator outputs all the new answer sentences, m_x refers to the x-th masked word in each sentence, and A_y refers to the y-th answer sentence of each long question-answer text.
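A cross-entropy-style form consistent with the variables above, offered only as an assumption and not as the patent's own formula (2), is:

```latex
\mathrm{Loss}_2
  = -\,\lambda_3\,\frac{1}{QA}\sum_{QA}\frac{1}{A}\sum_{y=1}^{A}\frac{1}{N}\sum_{x=1}^{N}
      P\,\log p\!\bigl(m_x \mid A_y\bigr)
```

Here p(m_x | A_y) would denote the probability assigned to the x-th masked word of the y-th answer sentence; how P enters the original formula cannot be recovered from the text.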
Step 106: a total loss value of the training data set is determined based on the first loss value, the second loss value, and the third loss value.
The processor may sum the first, second, and third loss values to determine a summed value as a total loss value for the training data set.
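In the notation of the loss functions above (Loss_3 is the first loss value, Loss_1 the second and Loss_2 the third), the sum described here can be written as:

```latex
\mathrm{Loss}_{\mathrm{total}} = \mathrm{Loss}_3 + \mathrm{Loss}_1 + \mathrm{Loss}_2
```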
Step 107: and respectively adjusting the weight coefficients of the first loss function, the second loss function and the third loss function under the condition that the total loss value does not meet the preset condition, and returning to the step of determining the training data set until the redetermined total loss value meets the preset condition.
In case the total loss value does not meet the preset condition, the processor may adjust the weight coefficients of the first, second and third loss functions, respectively, and return to the step of determining the training data set. In one embodiment, after returning to the step of determining the training data set, other long question-and-answer text may be added to the training data set to construct a new training data set, and the question-and-answer retrieval model may be retrained with the new training data set. The long question-answer text in the training data set can be updated or replaced, and the updated training data set is adopted to train the question-answer retrieval model again. The question-answer retrieval model can be trained again by adopting the last training data set without processing the training data set. After the question-answer retrieval model is retrained by the determined training data set, the total loss value of the training data set can be redetermined until the redetermined total loss value meets the preset condition, and then the completion of the question-answer retrieval model training can be determined.
In one embodiment, when the total loss value reaches a preset value, or the number of training times or the number of iterations for the question-answer search model reaches a preset number, it may be determined that the total loss value satisfies a preset condition, and at this time, it may be determined that the question-answer search model training is completed. And under the condition that the total loss value does not reach a preset value or the training frequency or iteration frequency aiming at the question-answer search model does not reach the preset frequency, determining that the total loss value does not meet the preset condition.
In one embodiment, when the total loss value does not meet the preset condition, the processor may determine a weight coefficient having the greatest influence on the total loss value from the weight coefficients of the first loss function, the second loss function and the third loss function in a back propagation manner, and may adjust the weight coefficient of each loss function according to the influence degree, or may adjust the weight coefficient having the greatest influence first, and then adjust the weight coefficients of other loss functions until the total loss value meets the preset condition.
As shown in FIG. 2, a schematic diagram of another training method for a question-answer retrieval model is provided.
For a long question-answer text (Q+A1+A2+A3), it includes the question Q and the answer A1+A2+A3, where A1, A2 and A3 are the three answer sentences of the answer. The primary label of the primary file in which the long question-answer text is located is T1 (the file name of the primary file), and the label of the secondary file in which it is located is T11 (the file name of the secondary file). The question-answer retrieval model can be trained in three parts: (1) combining the question and the answer, encoding twice and performing contrastive learning; (2) splitting the answer and balancing weights by hierarchy, and calculating Euclidean-cosine similarity; (3) splitting the answer into separate sentences and employing a cross-entropy loss function on negative examples.
For part (1) above, the long question-answer text (Q+A1+A2+A3) may be input to the encoder twice to obtain two text vector representations: embedding1 and embedding2. The two text vector representations can be used as a pair of positive examples, and the similarity-measure loss function is optimized based on contrastive learning, so that the encoder exhibits better generalization and transfer capability. Weighting may then be performed, with the weighting coefficient treated as a hyper-parameter to reflect rationality and flexibility. The similarity-measure loss function is formula (3):
for the above part (2), the answers in the long question-answering text (q+a1+a2+a3) can be split into multiple answers, and the question Q is combined with each answer to obtain multiple short question-answering texts: q+a1; q+a2; q+a3. Then, a cleaning mechanism can be adopted to clean each short question-answering text so as to obtain a plurality of cleaned short question-answering texts: Q+A1 * ;Q+A2 * ;Q+A3 * . In one embodiment, preset terms, adverbs and punctuations in each sentence can be cleaned to highlight important features in each sentence. Then, the product of the number of times of vocabulary appearance and the reciprocal of the ratio of the total word number to the document number of the hierarchy can be calculated according to the hierarchy, then the weight is divided into weight-balancing devices, and different weights are given to different hierarchies according to the importance and the length, so that the method is obtained: QA1 *21 T1 *22 T11 * ,QA2 *21 T1 *22 T11 * ,QA3 *21 T1 *22 T11 * . In order to avoid calculation deviation caused by overlarge length difference of short question-answering text, euclidean-cosine similarity can be adopted and weighted by maximum value, so that comparability and stability of the short question-answering text can be improved. That is, the similarity of the long text per sentence answer to its question and the loss value at the time of answer divergence may be determined using the formula (1), the formula (1) being:
for the above part (3), the answer may be split into a plurality of answers, and mask processing, i.e., mask replacement, may be performed respectively, where the mask default ratio is 15% of the total length. Thereafter, the masked words in each sentence may be replaced by a generator to generate a new sentence. Then, the discriminator can judge the new and old sentences as negative examples, obtain the loss value through cross entropy, and average and weight the loss value of the question-answer pair. That is, the loss value of the portion may be determined using equation (2), where equation (2) is:
in one embodiment, step1: question-answer pairs in the form of xlsx or txt can be input, and questions and answers therein can be taken out; step2: combining the questions and the answers, inputting the questions and the answers into an interpreter twice to obtain a pair of correction examples, and performing contrast learning; step3: the relative word frequency value of each level document can be calculated and multiplied by the inverse document value (rareness), and then weight coefficients can be respectively set according to different levels so as to reflect the requirements, standards, lengths, positions and importance of key words; step4: for filename segmentation, text preprocessing can be performed to remove auxiliary words, adverbs and punctuation, and then cleaning operation can be performed. Then, the same operation can be carried out on the names of the secondary catalogs, each question and the answer sentence corresponding to the question serve as a pair of positive examples, and an European-cosine function can be calculated; step5: the answers are split according to sentences, mask operation is carried out by using masks respectively, then answers are generated, the answers are compared to serve as a plurality of pairs of negative examples, and cross entropy of each pair is averaged. step6: step3, step4, and step5 are weighted and the model is trained to minimize the total loss function value.
According to the above technical scheme, the influence of the whole long question-answer text on the training of the question-answer retrieval model is considered: each long question-answer text is sequentially input to the encoder, and the first loss value of the training data set is determined based on the output of the encoder and the first loss function. Considering the divergence and representativeness of each answer sentence in the answer of a long question-answer text, the long question-answer text is split into a plurality of short question-answer texts, and the second loss value of the training data set is determined based on the second loss function and all of the short question-answer texts. Considering that overly concentrated masks cause the loss function to fluctuate greatly, the plurality of answer sentences included in each long question-answer text are input to the encoder and the adversarial network, so that the third loss value of the training data set is determined through the output of the encoder, the output of the adversarial network and the third loss function. The total loss value of the training data set is then determined according to the first loss value, the second loss value and the third loss value, and the weight coefficients of the loss functions are adjusted based on the total loss value. The model is thus trained comprehensively, the trained question-answer retrieval model performs better, and the accuracy of subsequent question-answer retrieval is improved.
Fig. 1 and fig. 2 are schematic flow diagrams of a training method for a question-answer retrieval model in one embodiment. It should be understood that, although the steps in the flowcharts of fig. 1 and fig. 2 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 1 and fig. 2 may include multiple sub-steps or stages that are not necessarily performed at the same moment but may be performed at different moments, and the order in which these sub-steps or stages are performed is not necessarily sequential; they may be performed in turn or alternately with at least a portion of other steps or of the sub-steps or stages of other steps.
In one embodiment, there is provided a training apparatus for a question-answer retrieval model, comprising:
a memory configured to store instructions; and
a processor configured to invoke the instructions from the memory and, when executing the instructions, to implement the training method for the question-answer retrieval model described above.
In one embodiment, a storage medium is provided having a program stored thereon that when executed by a processor implements the training method for question and answer retrieval model described above.
In one embodiment, a processor is provided for running a program, wherein the program runs on performing the training method for the question-answer retrieval model described above.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 3. The computer device includes a processor a01, a network interface a02, a memory (not shown) and a database (not shown) connected by a system bus. Wherein the processor a01 of the computer device is adapted to provide computing and control capabilities. The memory of the computer device includes internal memory a03 and nonvolatile storage medium a04. The nonvolatile storage medium a04 stores an operating system B01, a computer program B02, and a database (not shown in the figure). The internal memory a03 provides an environment for the operation of the operating system B01 and the computer program B02 in the nonvolatile storage medium a04. The database of the computer device is used for storing data such as total loss values. The network interface a02 of the computer device is used for communication with an external terminal through a network connection. The computer program B02 is executed by the processor a01 to implement a training method for a question-answer retrieval model.
It will be appreciated by those skilled in the art that the structure shown in fig. 3 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
The embodiment of the application provides a device, which comprises a processor, a memory and a program stored in the memory and runnable on the processor, and the processor implements the following steps when executing the program: determining a training data set, wherein the training data set comprises a plurality of long question-answer texts, each long question-answer text carries a hierarchical label of the file in which it is located, each long question-answer text comprises a question and an answer, and the answer comprises at least one answer sentence; sequentially inputting each long question-answer text to an encoder, and determining a first loss value of the training data set based on the output of the encoder and a first loss function; for each long question-answer text, splitting the long question-answer text to obtain a plurality of short question-answer texts, wherein each short question-answer text comprises the question and one answer sentence of the answer; determining a second loss value of the training data set based on a second loss function and all of the short question-answer texts; inputting the plurality of answer sentences included in each long question-answer text to the encoder and the adversarial network, so as to determine a third loss value of the training data set through the output of the encoder, the output of the adversarial network and a third loss function; determining a total loss value of the training data set according to the first loss value, the second loss value and the third loss value; and adjusting the weight coefficients of the first loss function, the second loss function and the third loss function respectively when the total loss value does not meet a preset condition, and returning to the step of determining the training data set until the redetermined total loss value meets the preset condition.
In an embodiment of the present application, determining the second loss value of the training data set based on the second loss function and all of the short question-answer texts includes: cleaning the answer sentence in each short question-answer text; determining the cosine similarity between the question and the answer sentence in each cleaned short question-answer text; arbitrarily selecting one short question-answer text from the plurality of cleaned short question-answer texts included in each long question-answer text as a target text; for each target text, determining a word frequency value and an inverse text frequency index of each keyword in the question included in the target text under each hierarchical label; and determining the second loss value of the training data set, based on the second loss function, according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts and all the cosine similarities.
In an embodiment of the present application, determining, based on the second loss function, the second loss value of the training data set according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts and all the cosine similarities includes: determining a total word frequency value and a total inverse text frequency index of the training data set according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts; for each long question-answer text, determining the cosine similarity with the largest value among the plurality of cosine similarities; and determining the second loss value, based on the second loss function, according to the total word frequency value, the total inverse text frequency index and the maximum cosine similarity of each long question-answer text.
In the embodiment of the application, each long question-answering text carries a primary label of the primary file and a secondary label of the secondary file, and the expression of the second loss function is defined by the formula (1):
where Loss_1 refers to the second loss value of the second loss function, QA refers to the total number of long question-answer texts, K refers to the total number of keywords in the question included in each target text, λ_21 refers to the first weighting weight corresponding to the primary label, TF_w^1 refers to the word frequency value of each keyword under the primary label, IDF_w^1 refers to the inverse text frequency index of each keyword under the primary label, λ_22 refers to the second weighting weight corresponding to the secondary label, TF_w^2 refers to the word frequency value of each keyword under the secondary label, IDF_w^2 refers to the inverse text frequency index of each keyword under the secondary label, λ_2 refers to the weight coefficient of the second loss function, i refers to the i-th cleaned short question-answer text and ranges from 1 to Z, Z refers to the total number of cleaned short question-answer texts, Q refers to the vector corresponding to the question in each cleaned short question-answer text, y(i) refers to the vector corresponding to the answer sentence in each cleaned short question-answer text, and sim(Q, y(i)) refers to the cosine similarity between the question in each cleaned short question-answer text and its answer sentence.
In an embodiment of the present application, the adversarial network includes a generator and a discriminator, and inputting the plurality of answer sentences included in each long question-answer text to the encoder and the adversarial network to determine the third loss value of the training data set through the output of the encoder, the output of the adversarial network and the third loss function includes: for the plurality of answer sentences corresponding to each long question-answer text, masking each answer sentence according to a preset mask proportion to obtain a plurality of masked answer sentences; inputting each masked answer sentence to the generator so that the generator outputs a new answer sentence corresponding to each masked answer sentence; for the plurality of answer sentences corresponding to each long question-answer text, inputting each answer sentence to the encoder so that the encoder outputs a first text vector corresponding to each answer sentence; inputting the second text vector of each new answer sentence and the first text vector of the corresponding answer sentence to the discriminator so that the discriminator outputs a discrimination result for each new answer sentence; for all the new answer sentences corresponding to each long question-answer text, determining the accuracy of the generator for each long question-answer text according to the discrimination results of all the new answer sentences; and determining the third loss value of the training data set based on the third loss function and the overall accuracy.
In the embodiment of the present application, the expression of the third loss function is defined by formula (2):

wherein Loss_2 refers to the third loss value of the third loss function; λ_3 refers to the weight coefficient of the third loss function; QA refers to the total number of long question-answer texts; A refers to the total number of answer sentences corresponding to each long question-answering text; N refers to the number of words to be masked in each answer sentence corresponding to each long question-answering text; P refers to the accuracy with which the generator outputs all the new answer sentences; m_x refers to the x-th masked word in each answer sentence; and A_y refers to the y-th answer sentence of each long question-answering text.
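Since formula (2) is likewise not reproduced here, the following is only a hedged sketch of how a third loss value could be derived from the generator accuracies collected per long question-answer text; the negative-log form and the weighting are assumptions, not the exact expression of this application.

```python
import math

def third_loss(per_text_accuracies, lam3=1.0):
    """Hypothetical third loss: average negative log-accuracy of the generator
    over all long question-answer texts (higher accuracy -> lower loss)."""
    qa = max(len(per_text_accuracies), 1)
    total = sum(-math.log(max(p, 1e-6)) for p in per_text_accuracies)
    return lam3 * total / qa
```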
In an embodiment of the present application, sequentially inputting each long question-answer text to an encoder, and determining a first loss value of a training data set based on the output of the encoder and a first loss function includes: for each long question-answering text, sequentially inputting the long question-answering text into the encoder a preset number of times, so that the encoder sequentially outputs text vectors corresponding to the long question-answering text; determining a vector pair consisting of text vectors of the same long question-answering text as a positive example, and determining a vector pair consisting of text vectors of different long question-answering texts as a negative example; and determining a first loss value of the training data set based on the first loss function according to the number of positive examples and the number of negative examples corresponding to all the long question-answering texts.
In the embodiment of the present application, the expression of the first loss function is defined by formula (3):

wherein Loss_3 refers to the first loss value of the first loss function; λ_1 refers to the weight coefficient of the first loss function; QA refers to the total number of long question-answer texts; the positive-example count refers to the number of positive examples generated by the m-th long question-answering text; (k_m, k_n) refers to the number of negative examples generated by the m-th long question-answering text and the n-th long question-answering text; τ refers to the temperature coefficient; and q refers to the similarity calculation parameter.
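The positive/negative construction above matches a dropout-based contrastive setup, so the sketch below uses an InfoNCE-style objective with temperature τ as an illustration; the encoder interface, the number of passes, and the cross-entropy form are assumptions rather than the exact formula (3).

```python
import torch
import torch.nn.functional as F

def first_loss(encoder, texts, tau=0.05, lam1=1.0, passes=2):
    """Hypothetical contrastive first loss: each long question-answer text is
    encoded `passes` times (dropout makes the vectors differ); vectors from the
    same text form positive pairs, vectors from different texts form negative
    pairs, and the pairs are scored with an InfoNCE-style objective."""
    encoder.train()                                   # keep dropout active
    views = [encoder(texts) for _ in range(passes)]   # each: (QA, dim)
    z1 = F.normalize(views[0], dim=-1)
    z2 = F.normalize(views[1], dim=-1)
    sims = z1 @ z2.T / tau                            # (QA, QA) similarity matrix
    labels = torch.arange(z1.size(0))                 # diagonal = positive pairs
    return lam1 * F.cross_entropy(sims, labels)
```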
The present application also provides a computer program product adapted to execute a program which, when run on a data processing apparatus, performs the steps of the above training method for a question-answer retrieval model.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (10)

1. A training method for a question-answer retrieval model, the question-answer retrieval model comprising an encoder and a countermeasure network, the training method comprising:
determining a training data set, wherein the training data set comprises a plurality of long question-answer texts, each long question-answer text carries a hierarchical label of a file where the long question-answer text is located, each long question-answer text comprises a question and an answer, and the answer at least comprises one answer sentence;
sequentially inputting each long question-answer text to the encoder, and determining a first loss value of the training data set based on the output of the encoder and a first loss function;
splitting the long question-answering text for each long question-answering text to obtain a plurality of short question-answering texts, wherein each short question-answering text comprises the question and one answer sentence of the answer;
determining a second loss value of the training data set based on a second loss function and all the short question-answering texts;
inputting a plurality of questions included in each long question-and-answer text to the encoder and the countermeasure network to determine a third loss value of the training dataset through an output of the encoder, an output of the countermeasure network, and a third loss function;
determining a total loss value of the training data set from the first loss value, the second loss value, and the third loss value;
and respectively adjusting the weight coefficients of the first loss function, the second loss function and the third loss function under the condition that the total loss value does not meet the preset condition, and returning to the step of determining the training data set until the redetermined total loss value meets the preset condition.
2. The training method for a question-answer retrieval model according to claim 1, wherein said determining a second loss value of said training data set based on a second loss function and all the short question-answering texts comprises:
cleaning the answers in each short question and answer text;
determining cosine similarity between the questions and the answers in each cleaned short question-answer text;
randomly selecting one short question-answering text from a plurality of cleaned short question-answering texts included in each long question-answering text as a target text;
determining a word frequency value and an inverse text frequency index of each keyword in a question included in each target text under each hierarchical label;
and determining a second loss value of the training data set according to a plurality of word frequency values and a plurality of inverse text frequency indexes corresponding to all target texts and all cosine similarity based on the second loss function.
3. The training method for a question-answering retrieval model according to claim 2, wherein the determining, based on the second loss function, the second loss value of the training data set according to a plurality of word frequency values and a plurality of inverse text frequency indexes corresponding to all target texts, and all cosine similarities includes:
determining a total word frequency value and a total inverse text frequency index of the training data set according to the plurality of word frequency values and the plurality of inverse text frequency indexes corresponding to all the target texts;
for each long question-answering text, determining the cosine similarity with the largest value among the plurality of cosine similarities; and
determining the second loss value based on the second loss function according to the total word frequency value, the total inverse text frequency index, and the maximum cosine similarity of each long question-answering text.
4. The training method for a question-answer retrieval model according to claim 1, wherein each long question-answer text carries a primary label of a primary file and a secondary label of a secondary file, and the expression of the second loss function is defined by formula (1):

wherein Loss_1 refers to the second loss value of the second loss function; QA refers to the total number of long question-answer texts; K refers to the total number of keywords in the question included in each target text; λ_21 refers to the first weighting weight corresponding to the primary label; tf_k^(1) and idf_k^(1) refer to the word frequency value and the inverse text frequency index of each keyword under the primary label; λ_22 refers to the second weighting weight corresponding to the secondary label; tf_k^(2) and idf_k^(2) refer to the word frequency value and the inverse text frequency index of each keyword under the secondary label; λ_2 refers to the weight coefficient of the second loss function; i refers to the i-th cleaned short question-answering text, with i ranging from 1 to Z; Z refers to the total number of cleaned short question-answering texts; Q refers to the vector corresponding to the question in each cleaned short question-answering text; y(i) refers to the vector corresponding to the answer in each cleaned short question-answering text; and sim(Q, y(i)) refers to the cosine similarity between the question in each cleaned short question-answering text and its answer.
5. The training method for a question-answer retrieval model according to claim 1, wherein the countermeasure network includes a generator and a discriminator, and the inputting a plurality of questions included in each long question-answer text to the encoder and the countermeasure network to determine a third loss value of the training data set by the output of the encoder, the output of the countermeasure network, and the third loss function includes:
masking, for a plurality of answer sentences corresponding to each long question-answering text, each answer sentence according to a preset mask proportion to obtain a plurality of masked answer sentences;
inputting each masked answer sentence to the generator so that the generator outputs a new answer sentence corresponding to each masked answer sentence;
inputting, for the plurality of answer sentences corresponding to each long question-answering text, each answer sentence to the encoder so that the encoder outputs a first text vector corresponding to each answer sentence;
inputting the second text vector of each new answer sentence and the first text vector of the corresponding answer sentence to the discriminator so that the discriminator outputs a discrimination result for each new answer sentence;
determining, for all the new answer sentences corresponding to each long question-answering text, the accuracy of the generator for the long question-answering text according to the discrimination results of all the new answer sentences; and
determining a third loss value of the training data set based on the third loss function and the overall accuracy.
6. The training method for a question-answer retrieval model according to claim 5, wherein the expression of the third loss function is defined by formula (2):

wherein Loss_2 refers to the third loss value of the third loss function; λ_3 refers to the weight coefficient of the third loss function; QA refers to the total number of long question-answer texts; A refers to the total number of answer sentences corresponding to each long question-answering text; N refers to the number of words to be masked in each answer sentence corresponding to each long question-answering text; P refers to the accuracy with which the generator outputs all the new answer sentences; m_x refers to the x-th masked word in each answer sentence; and A_y refers to the y-th answer sentence of each long question-answering text.
7. The training method for a question-answer retrieval model according to claim 1, wherein said sequentially inputting each long question-answer text to the encoder and determining a first loss value of the training data set based on an output of the encoder and a first loss function comprises:
for each long question-answering text, sequentially inputting the long question-answering text into the encoder a preset number of times, so that the encoder sequentially outputs text vectors corresponding to the long question-answering text;
determining a vector pair consisting of text vectors of the same long question-answering text as a positive example, and determining a vector pair consisting of text vectors of different long question-answering texts as a negative example;
and determining a first loss value of the training data set based on the first loss function according to the number of positive examples and the number of negative examples corresponding to all long question-answering texts.
8. The training method for a question-answer retrieval model according to claim 7, wherein the expression of the first loss function is defined by formula (3):

wherein Loss_3 refers to the first loss value of the first loss function; λ_1 refers to the weight coefficient of the first loss function; QA refers to the total number of long question-answer texts; the positive-example count refers to the number of positive examples generated by the m-th long question-answering text; (k_m, k_n) refers to the number of negative examples generated by the m-th long question-answering text and the n-th long question-answering text; τ refers to the temperature coefficient; and q refers to the similarity calculation parameter.
9. A training device for a question-answer retrieval model, the training device comprising:
a memory configured to store instructions; and
a processor configured to invoke the instructions from the memory and, when executing the instructions, to implement the training method for a question-answer retrieval model according to any one of claims 1 to 8.
10. A machine-readable storage medium having stored thereon instructions for causing a machine to perform the training method for a question-answer retrieval model according to any one of claims 1 to 8.
CN202311457081.3A 2023-11-03 2023-11-03 Training method, device and storage medium for question-answer retrieval model Pending CN117421573A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311457081.3A CN117421573A (en) 2023-11-03 2023-11-03 Training method, device and storage medium for question-answer retrieval model


Publications (1)

Publication Number Publication Date
CN117421573A true CN117421573A (en) 2024-01-19

Family

ID=89530938

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311457081.3A Pending CN117421573A (en) 2023-11-03 2023-11-03 Training method, device and storage medium for question-answer retrieval model

Country Status (1)

Country Link
CN (1) CN117421573A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination