CN112580343A - Model generation method, question and answer quality judgment method, device, equipment and medium - Google Patents

Model generation method, question and answer quality judgment method, device, equipment and medium

Info

Publication number
CN112580343A
CN112580343A (application number CN202011210093.2A)
Authority
CN
China
Prior art keywords
question
answer
answer pair
sample
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011210093.2A
Other languages
Chinese (zh)
Inventor
张新松
柴琛林
李航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN202011210093.2A priority Critical patent/CN112580343A/en
Publication of CN112580343A publication Critical patent/CN112580343A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • G06F40/35Discourse or dialogue representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present disclosure disclose a model generation method, a question-answer quality judgment method, a device, equipment and a medium. The model generation method includes: obtaining a preset number of question-answer pair sample sets, and parsing the sample data in each set to obtain word vectors; inputting the word vectors into a pre-trained language model to obtain a representation vector for each question-answer pair sample; determining, based on the representation vectors, the likelihood that each sample's predicted label is the correct label, and computing a loss function from the predicted label of the sample with the highest likelihood; and generating a question-answer pair quality judgment model once the loss function result satisfies a preset condition. The method addresses the low accuracy of question-answer quality judgment when the question-answer content is long or complex, enables quality judgment for complex question-answer pairs, and improves the accuracy of the judgment.

Description

Model generation method, question and answer quality judgment method, device, equipment and medium
Technical Field
Embodiments of the present disclosure relate to the field of computer applications, and in particular to a model generation method, a question-answer quality judgment method, a device, equipment and a medium.
Background
With the development of the internet, many interactive question-and-answer platforms have appeared on which users can acquire or share knowledge. Such platforms audit the answers submitted for each question and display the answers that pass review.
In the prior art, the quality of question-answer pairs is generally judged using manual rules, feature engineering, similarity matching, semantic consistency, and the like. However, among approved question-answer pairs whose text is long or complex, a large number of low-quality answers remain; answers that do not actually address their questions are therefore common, and the accuracy of question-answer quality judgment needs further improvement.
BRIEF SUMMARY OF THE PRESENT DISCLOSURE
Embodiments of the present disclosure provide a model generation method, a question-answer quality judgment method, and corresponding devices, equipment and media, so that question-answer quality can be judged even when the question-answer content is complex, improving the accuracy of the judgment.
In a first aspect, an embodiment of the present disclosure provides a model generation method, where the method includes:
obtaining a preset number of question-answer pair sample sets, and analyzing sample data in each question-answer pair sample set to obtain word vectors;
inputting the word vectors into a pre-training language model to obtain the expression vector of each question-answer pair sample;
determining the likelihood probability that the prediction label of each question-answer pair sample is a correct label based on the expression vector, and calculating a loss function according to the prediction label of the question-answer pair sample corresponding to the maximum value in the likelihood probability;
and generating a question-answer pair quality judgment model when the loss function result meets a preset condition.
In a second aspect, an embodiment of the present disclosure further provides a method for determining question and answer quality, where the method includes:
obtaining a question-answer pair text to be judged, and generating a question-answer pair quality judgment model based on the model generation method of any embodiment of the disclosure;
and inputting the question-answer pair text to be judged into the question-answer pair quality judgment model to obtain a quality judgment result of the question-answer pair to be judged.
In a third aspect, an embodiment of the present disclosure further provides a model generation apparatus, where the apparatus includes:
the first sample analysis module is used for acquiring a preset number of question-answer pair sample sets and analyzing sample data in each question-answer pair sample set to obtain word vectors;
the second sample analysis module is used for inputting the word vectors into a pre-training language model to obtain the expression vector of each question and answer pair sample;
the third sample analysis module is used for determining the likelihood probability that the prediction label of each question-answer pair sample is a correct label based on the expression vector, and calculating a loss function according to the prediction label of the question-answer pair sample corresponding to the maximum value in the likelihood probability;
and the model generation module is used for generating a question-answer pair quality judgment model when the loss function result meets a preset condition.
In a fourth aspect, an embodiment of the present disclosure further provides a device for determining question and answer quality, where the device includes:
the text and model acquisition module is used for acquiring a question-answer pair text to be judged and generating a question-answer pair quality judgment model based on the model generation method of any one embodiment of the disclosure;
and the question-answer quality judgment module is used for inputting the question-answer pair text to be judged into the question-answer pair quality judgment model to obtain a quality judgment result of the question-answer pair to be judged.
In a fifth aspect, an embodiment of the present disclosure further provides an electronic device, where the electronic device includes:
one or more processors;
a memory for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the model generation method or the question and answer quality determination method according to any one of the embodiments of the present disclosure.
In a sixth aspect, the embodiments of the present disclosure further provide a computer storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the model generation method or the question-answer quality determination method as described in any one of the embodiments of the present disclosure.
In the disclosed scheme, a set containing several question-answer pair samples is used as the unit of sample input. The samples in the set are semantically analyzed, the probability that each sample's predicted label is the true label is computed, the sample with the highest probability is selected as the preferred training datum for adjusting the model parameters, and the question-answer pair quality judgment model is generated once the loss function result satisfies a preset condition. This solves the prior-art problem of low judgment accuracy on complex question-answer pairs, strengthens the stability of the generated model, and yields higher judgment accuracy when the question-answer content is complex.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
FIG. 1 is a flowchart of a model generation method in the first embodiment of the present disclosure;
FIG. 2 is a flowchart of a model generation method in the second embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of a question-answer quality determination model in the second embodiment of the present disclosure;
FIG. 4 is a flowchart of a question-answer quality determination method in the third embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a model generation apparatus in the fourth embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a question-answer quality determination device in the fifth embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device in the sixth embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Example one
Fig. 1 is a flowchart of a model generation method provided in an embodiment of the present disclosure. This embodiment is applicable to training a question-answer quality determination model on a large amount of question-answer data. The method may be executed by a model generation apparatus, implemented by software and/or hardware in an electronic device.
As shown in fig. 1, the model generation method provided in the embodiment of the present disclosure includes the following steps:
s110, obtaining a preset number of question-answer pair sample sets, and analyzing sample data in each question-answer pair sample set to obtain word vectors.
Specifically, question-answer pair sample data refers to question data and answer data from a question-answering and knowledge-sharing platform; one question may have multiple answers, so the question together with each of its answers forms several question-answer pair data items. Question-answer pair sample data can also come from the questions and answers in a book: for example, for an after-class question in a language textbook, the answers include the standard answer in the teaching plan and the answer given by each student, and each question with one corresponding answer forms a question-answer pair sample.
The question-answer pair samples in this embodiment are labeled in advance: each sample carries either an "answers the question" label or a "does not answer the question" label. Labeling may be done manually, i.e., a person reads a question-answer pair and assigns its label, or the label may be obtained from an authoritative, trusted platform.
Further, each sample set contains question-answer pair samples that share the same label. The number of samples per set can be determined from the computational load of the model generation process, the capacity of the hardware, or indicators such as training efficiency. When the sample data is parsed, the multiple samples in one set are analyzed together, similar to batch processing. The concrete analysis proceeds as follows. Each question-answer sample is first segmented into words; for example, "Is today Arbor Day? No, tomorrow is" yields the semantic segmentation "today / is / Arbor Day / no / tomorrow / is". Then the vocabulary of the pre-trained language model is queried with the segmentation result to initialize a representation vector for each word, e.g., "today" maps to its corresponding vector in the vocabulary.
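The segmentation-and-lookup step can be sketched as follows. This is a minimal illustration, not the patent's actual resources: the tiny vocabulary, the whitespace segmenter, the 4-dimensional vectors, and the shared unknown-word vector are all illustrative stand-ins.

```python
# Illustrative stand-in for a pre-trained language model's vocabulary:
# each known word maps to its (here 4-dimensional) representation vector.
TOY_VOCAB = {
    "today":    [0.1, 0.2, 0.0, 0.3],
    "is":       [0.0, 0.1, 0.1, 0.0],
    "arbor":    [0.4, 0.0, 0.2, 0.1],
    "day":      [0.2, 0.3, 0.0, 0.0],
    "tomorrow": [0.3, 0.1, 0.4, 0.2],
}
UNK = [0.0, 0.0, 0.0, 0.0]  # shared vector for out-of-vocabulary words

def segment(text):
    """Stand-in word segmenter: lowercase whitespace split with punctuation stripped."""
    return [w.strip("?,.!").lower() for w in text.split() if w.strip("?,.!")]

def to_word_vectors(text):
    """Map a question-answer text to initial word vectors via vocabulary lookup."""
    return [TOY_VOCAB.get(word, UNK) for word in segment(text)]

vectors = to_word_vectors("Is today Arbor Day? No, tomorrow is")
```

These initialized word vectors are what gets fed into the pre-trained language model in step S120.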
A pre-trained language model is obtained by self-supervised learning: a large amount of unlabeled text is fed into the model, yielding a general-purpose pre-trained model that benefits subsequent text processing. The vocabulary of a mature pre-trained language model amounts to a rich word-vector database; mapping the segmentation results into this vocabulary brings in the world knowledge learned from high-frequency co-occurrence, which helps both the semantic understanding of the question-answer samples and the subsequent question-answer quality judgment.
Pre-trained language models include autoregressive language models and autoencoding language models, and either may be chosen according to the characteristics of the application scenario; in this embodiment, the pre-trained language model is applied to question-answer quality judgment. The goal of pre-training is to learn, from a large-scale unlabeled corpus, a representation of text that carries rich semantic information. This semantic representation is then fine-tuned on a specific natural language processing task and finally applied to that task. In this embodiment, the natural language processing task is determining whether the answer of a question-answer pair actually answers the question.
S120, inputting the word vectors into the pre-trained language model to obtain the representation vector of each question-answer pair sample.
Through the mapping from each sample's word segmentation result to the pre-trained language model's vocabulary in the preceding step, a word vector is obtained for each word in the segmentation result. The word vectors of each question-answer pair sample are then input into the pre-trained language model to obtain the sample's semantic representation vector.
In this embodiment, the question and the answer of a question-answer pair are spliced together and processed as a whole. Compared with modeling the question and the answer separately, this better captures the meaningful lexical dependencies between them, and yields higher accuracy in complex question-answering situations. Specifically, a "complex question-answering situation" is one where the question text and the answer text are strongly correlated on the surface; in such cases it is difficult for the prior art to determine whether the answer fails to address the question. For example: "What is the efficacy of black rice?" answered with "Purple rice is a variety of rice belonging to the waxy (glutinous) class, divided into purple polished round-grained rice and purple waxy rice; generally, purple polished glutinous rice is black polished glutinous rice. Black rice is a processed black-rice product belonging to the japonica class, with both indica and japonica grain types, and is non-glutinous."
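A toy lexical-overlap score (Jaccard similarity, used here only as a stand-in for the prior-art similarity-matching signals mentioned in the background) shows why such surface-correlated pairs are hard: an off-topic answer about purple rice and an on-topic answer about black rice's efficacy can score identically. The token lists are illustrative.

```python
def jaccard(a_tokens, b_tokens):
    """Lexical overlap between two token lists: |intersection| / |union|."""
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b)

question = ["black", "rice", "efficacy"]
# Off-topic answer (talks about what purple/black rice *is*, not its efficacy):
answer_off = ["purple", "rice", "black", "variety"]
# On-topic answer (actually states an efficacy):
answer_on = ["black", "rice", "nourishes", "yin"]

score_off = jaccard(question, answer_off)  # high overlap despite dodging the question
score_on = jaccard(question, answer_on)
```

Both answers share the words "black" and "rice" with the question, so lexical overlap alone cannot separate them; that is what motivates encoding the concatenated question-answer text jointly, so attention over the whole sequence can model the semantic relation.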
S130, determining the likelihood probability that the prediction label of each question-answer pair sample is the correct label based on the expression vector, and calculating a loss function according to the prediction label of the question-answer pair sample corresponding to the maximum value in the likelihood probability.
Specifically, the representation vector of each question-answer pair sample in the set can be input into a fully connected neural network whose activation function is Softmax. The fully connected network computes, for each sample, the probability that its predicted label matches the true label.
Furthermore, the probability values of the samples within one question-answer pair sample set can be ranked, and the question-answer pair with the highest probability is used as the preferred training sample for adjusting the network parameters. Labeling question-answer sample data is expensive and hard to quality-control, since deciding whether an answer addresses its question is often subjective. The training data therefore cannot guarantee that every label is correct, and contains a considerable amount of noise, i.e., mislabeled samples. In this embodiment, the loss function is computed only on the sample whose predicted-label probability is highest, and the model parameters are adjusted accordingly; this improves the model and effectively suppresses noise during the training stage (i.e., the model generation process).
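The select-then-compute-loss step can be sketched as below. This is a minimal sketch assuming a binary softmax output per sample and a negative log-likelihood loss; the patent fixes only that the loss is computed on the sample with the maximum likelihood, so the concrete loss formula and the toy logits here are illustrative.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def bag_loss(bag_logits, bag_label):
    """Multi-instance step: within a set ("bag") of samples sharing one label,
    keep only the sample whose predicted probability for that label is highest,
    and return its negative log-likelihood as the loss used to update the model."""
    probs = [softmax(l)[bag_label] for l in bag_logits]
    best = max(range(len(probs)), key=lambda i: probs[i])
    return -math.log(probs[best]), best

# Three question-answer samples sharing label 1 ("answers the question"):
loss, chosen = bag_loss([[0.2, 1.1], [1.5, 0.3], [0.1, 2.0]], bag_label=1)
```

Because only the most confident sample in each set contributes to the loss, mislabeled (noisy) samples in the same set are less likely to drive the parameter updates.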
S140, generating a question-answer pair quality judgment model when the loss function result meets a preset condition.
The model generation process continuously adjusts the model parameters through learning from samples until the model reaches an optimized state. Specifically, in this embodiment, the loss function is computed on the question-answer pair sample whose prediction has the highest confidence; once the loss function converges to the preset numerical condition, the model parameters are fixed, i.e., the question-answer pair quality judgment model is generated.
In the technical solution of this embodiment, a set containing several question-answer pair samples is used as the unit of sample input; the samples in the set are semantically analyzed, the probability that each sample's predicted label is the true label is computed, the sample with the highest probability is selected as the preferred training datum for adjusting the model parameters, and the question-answer pair quality judgment model is generated when the loss function result satisfies the preset condition. This solves the prior-art problem of low judgment accuracy on complex question-answer pairs, strengthens the stability of the generated model, and yields higher judgment accuracy when the question-answer content is complex.
Example two
On the basis of the above embodiment, this embodiment further describes the model generation process from the perspective of the model's structure. It belongs to the same inventive concept as the model training method above; for technical details not described here, refer to the previous embodiment.
Fig. 2 shows a flowchart of a model generation method provided in the second embodiment of the present disclosure, where the model generation method provided in the second embodiment of the present disclosure includes the following steps:
s210, inputting a group of question-answer pair samples containing a preset number by taking the group as a unit through an input layer of the model to be trained.
The input layer of the model to be trained receives the objects the model processes. In this embodiment, the model to be trained is a question-answer quality determination model, and its input is question-answer pair text data. During training, a group of question-answer pair samples is the unit of training data, and all pairs in a group share the same label. The number of pairs per group may be the same or different; the specific number is a hyper-parameter, i.e., a parameter set before the learning process starts rather than obtained by model training. Hyper-parameters generally need to be tuned, which can improve the performance and effectiveness of training.
Illustratively, in the model structure diagram shown in fig. 3, a group of question-answer pairs sharing a label is input, e.g., "What is the efficacy of black rice? Purple rice is …… a kind of rice" and "What is the efficacy of black rice? Black rice …… nourishes yin and tonifies the kidney."
S220, performing semantic word segmentation on each question-answer pair through a representation layer of the model to be trained, and determining a representation vector of each word in a word segmentation result.
The representation layer of the model to be trained initializes the input question-answer pair text into a representation vector for each word, putting the text into the format expected by the pre-trained language model before it is input into that model.
Specifically, each question-answer sample is first segmented into words; then the vocabulary of the pre-trained language model is queried with the segmentation result to initialize each word's representation vector. Through this word-to-vocabulary mapping, the word-vector representation of the question-answer text shown in the representation layer of fig. 3 is obtained. The first token of every sequence is always the special classification token [CLS], where each token corresponds to a word vector; when a sentence pair is packed into one sequence, the two sentences are separated by the special token [SEP].
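The [CLS]/[SEP] packing convention described above can be sketched as follows. The segment-id scheme (0 for the question part, 1 for the answer part) follows the common BERT-style convention and is an assumption here; the patent text itself specifies only the [CLS] and [SEP] placement.

```python
def pack_qa_pair(question_tokens, answer_tokens):
    """Concatenate a question and its answer into one model input sequence:
    [CLS] q1 ... qN [SEP] a1 ... aM [SEP], with segment ids marking the parts."""
    tokens = ["[CLS]"] + question_tokens + ["[SEP]"] + answer_tokens + ["[SEP]"]
    # [CLS] + question + first [SEP] belong to segment 0; answer + final [SEP] to segment 1.
    segment_ids = [0] * (len(question_tokens) + 2) + [1] * (len(answer_tokens) + 1)
    return tokens, segment_ids

tokens, segment_ids = pack_qa_pair(
    ["what", "is", "black", "rice"],
    ["black", "rice", "nourishes", "yin"],
)
```

The encoder then attends over this single packed sequence, which is what lets it model dependencies between question words and answer words directly.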
S230, calculating the representation vector of each question-answer pair sample through the pre-trained language model coding layer of the model to be trained.
The coding layer of the pre-trained language model is the Encoder of a Transformer layer; the Transformer layer is built by stacking multiple encoders and decoders and converts the input corpus into feature vectors. Here the input corpus is the word-vector representation of each question-answer pair sample, and the resulting feature vector represents the semantics of the corresponding question-answer text.
S240, in a prediction layer of the model to be trained, determining the likelihood probability that the prediction label of each question-answer pair sample is the correct label based on the expression vector, selecting a target question-answer pair sample through a multi-example learning mechanism, and calculating a loss function.
The prediction layer predicts the label of each question-answer sample to obtain a predicted label. Specifically, a fully connected layer is stacked on top of the pre-trained language model to compute the probability of each label class, and the computed result feeds the loss function.
Model training is the process of adjusting the parameters of the model to be trained. This embodiment adopts a multi-instance learning mechanism: the probability values of the samples within one question-answer pair sample set are ranked, and the pair with the highest probability is used as the preferred training sample for adjusting the parameters of the model to be trained. Labeling question-answer sample data is expensive and hard to quality-control, since deciding whether an answer addresses its question is often subjective; the training labels therefore cannot be guaranteed correct and contain a considerable amount of noise, i.e., mislabeled samples. Computing the loss function only on the sample whose predicted-label probability is highest improves the model and effectively suppresses noise during the training stage (i.e., the model generation process).
As shown in fig. 3, the prediction layer takes as input the vector representations of question-answer pairs output by the pre-trained language model coding layer (e.g., r_t, r_m); each representation is fed into a fully connected network to obtain the probability that the pair's predicted label is the true label (e.g., p_t, p_m). Note that in the training phase multiple representation vectors are input into the fully connected network, their number matching the number of question-answer pairs in the group fed to the input layer, producing correspondingly many probability values; the representation vector of the sample pair with the highest probability is then selected for the loss computation, the model parameters are adjusted, and the trained question-answer quality judgment model is finally obtained. In the application phase, in most cases only one question-answer pair is judged at a time, i.e., only one final representation vector is input into the fully connected network, and the model output result is obtained directly.
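The single-pair application path can be sketched as below: one representation vector goes through a fully connected layer plus softmax, and the arg-max class is returned as the label. The weights, vector dimension, and label wording are illustrative stand-ins, not trained values from the patent.

```python
import math

LABELS = ["does not answer the question", "answers the question"]

def predict(rep, weights, bias):
    """rep: the question-answer pair's representation vector from the encoder;
    weights: one row of linear-layer weights per label class; bias: one per class."""
    logits = [sum(w * x for w, x in zip(row, rep)) + b
              for row, b in zip(weights, bias)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    probs = [e / sum(exps) for e in exps]
    return LABELS[probs.index(max(probs))], probs

# Toy 3-dimensional representation and hand-picked weights:
label, probs = predict([0.5, -0.2, 0.8],
                       weights=[[0.1, 0.4, -0.3], [-0.2, 0.1, 0.6]],
                       bias=[0.0, 0.1])
```

At inference there is no max-over-bag step; the multi-instance selection applies only while training on groups of samples.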
S250, when the loss function result meets a preset condition, training of the model to be trained is complete and the question-answer pair quality judgment model is obtained.
When the loss function converges and the preset condition is met, the training process of the model to be trained is complete and the question-answer quality judgment model is obtained. When there is a question-answer pair whose quality needs to be judged, it can be input into the question-answer quality judgment model to obtain its label, i.e., "answers the question" or "does not answer the question".
According to the technical scheme of this embodiment, the question-answer quality judgment model to be trained is provided with an input layer, a representation layer, a pre-trained language model coding layer and a prediction layer; the input question-answer pair text is processed step by step to obtain the prediction labels of the question-answer pairs, the stability of the question-answer quality judgment model is guaranteed through the multi-instance learning mechanism in the prediction layer, and the accuracy of the model's output results is improved. This solves the problem in the prior art of low accuracy in judging the question-answer quality of complicated question-answer pairs, enables quality judgment even when the question-answer content is complicated, and improves the accuracy of question-answer quality judgment.
EXAMPLE III
Fig. 4 is a flowchart illustrating a question and answer quality determination method provided in the third embodiment of the present disclosure, which may be executed by a question and answer quality determination device implemented by software and/or hardware in a mobile terminal.
As shown in fig. 4, the question answering quality determination method includes the following steps:
S310, obtaining a question-answer pair text to be judged, and generating a question-answer pair quality judgment model based on the model generation method of any embodiment of the present disclosure.
Specifically, the question-answer pair text may be question data and answer data from a question-answer and knowledge-sharing platform. Based on the question-answer pair quality judgment model generated by the model generation method in any of the above embodiments, it can be judged whether a question-answer pair is one in which the question is not answered.
The question-answer pair quality judgment model analyzes and judges the whole question and answer text, can capture effective lexical dependencies between questions and answers, and can better judge whether a question is answered in complex situations. For example, consider the question-answer pair with the question "What is the efficacy of black rice?" and the answer "Purple rice is a variety of rice belonging to the waxy rice class, divided into purple polished round-grained rice and purple waxy rice; generally, purple polished glutinous rice is black polished glutinous rice. Black rice is a black rice processing product belonging to the japonica rice class, with both indica and japonica grain types, and is non-glutinous rice." Here the answer describes rice varieties but never addresses the efficacy asked about, so the pair should be judged as not answered.
Moreover, the question-answer quality judgment model is generated by training on top of a pre-trained language model. Leveraging the pre-trained model's vocabulary, its rich learnable representations, its better capture of the text's semantic features, and the help of high-frequency co-occurrence world knowledge, the model is assisted in judging whether question-answer pairs are answered. For example, the model can learn associations such as "nutrients"-"protein, DHA" and "milk powder"-"baby", and thereby correctly judge the pair with the question "What nutrients are in milk powder?" and the answer "Protein: whey protein contains sleep-regulating neurotransmitters, which aid infants' sleep and promote infant brain development." as answered.
In addition, during training of the question-answer quality judgment model, the multi-instance learning technique reduces the influence of training-set noise on the model, improving the model's stability and the confidence of its output results.
And S320, inputting the question-answer pair text to be judged into the question-answer pair quality judgment model to obtain a quality judgment result of the question-answer pair to be judged.
The question-answer pair text to be judged, i.e., the pair for which it must be determined whether the question is answered, is input into the question-answer pair quality judgment model to obtain the judgment result, namely "question answered" or "question not answered".
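The single-pair application flow of S320 can be sketched as follows. This is a hypothetical wrapper: the "[SEP]" separator, the 0.5 threshold, and the stub models are assumptions for illustration, not the disclosed API.

```python
def judge(model, question, answer, threshold=0.5):
    """Hypothetical inference wrapper around a trained judgment model.

    `model` maps the concatenated question-answer text to a probability
    that the question is answered; separator and threshold are
    illustrative assumptions.
    """
    prob = model(question + " [SEP] " + answer)
    return "question answered" if prob >= threshold else "question not answered"

# Stub models standing in for the trained quality judgment model:
good = judge(lambda text: 0.8, "What nutrients are in milk powder?", "Protein ...")
bad = judge(lambda text: 0.2, "What is the efficacy of black rice?", "Purple rice ...")
```

Because only one pair is judged at a time in application, the wrapper feeds a single representation through the model and thresholds the resulting probability into one of the two labels.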
According to the technical scheme of this embodiment of the disclosure, the question-answer pair to be judged is input into the question-answer quality judgment model obtained by the model generation method of the above embodiment, so that the judgment result of the question-answer pair to be judged is obtained with higher confidence in its accuracy. This solves the problem in the prior art of low accuracy in judging the question-answer quality of complicated question-answer pairs, enables quality judgment even when the question-answer content is complicated, and improves the accuracy of question-answer quality judgment.
Example four
Fig. 5 is a schematic structural diagram of a model generation device according to a fourth embodiment of the present disclosure. The model generation device of this embodiment may implement the model generation method according to the above embodiments.
As shown in fig. 5, the model generation apparatus in the embodiment of the present disclosure includes: a first sample analysis module 410, a second sample analysis module 420, a third sample analysis module 430, and a model generation module 440.
The first sample analysis module 410 is configured to obtain a preset number of question-answer pair sample sets, and analyze sample data in each question-answer pair sample set to obtain a word vector; the second sample analysis module 420 is configured to input the word vector into a pre-training language model to obtain a representation vector of each question and answer pair sample; a third sample analysis module 430, configured to determine, based on the representation vector, a likelihood probability that a prediction label of each question-answer pair sample is a correct label, and calculate a loss function according to the prediction label of the question-answer pair sample corresponding to a maximum value in the likelihood probabilities; and the model generating module 440 is configured to generate a question-answer pair quality judgment model when the loss function result meets a preset condition.
According to the above technical scheme, a set containing a plurality of question-answer pair samples is used as the sample input unit; semantic analysis is performed on the question-answer pair samples in the set, the probability that the prediction label of each question-answer pair in the set is the true label is calculated, the question-answer pair with the highest probability is selected as the optimal model training data to adjust the model parameters, and the question-answer pair quality judgment model is generated when the loss function result meets the preset condition. This solves the problem in the prior art of low accuracy in judging the quality of complicated question-answer pairs, strengthens the stability of the generated question-answer pair quality judgment model, and yields higher judgment accuracy when the question-answer content is complicated.
Optionally, the first sample analysis module is specifically configured to: acquire question-answer pair texts with the same label; and divide the question-answer pair texts with the same label into a preset number of question-answer pair sample sets, wherein the labels comprise "question answered" labels and "question not answered" labels.
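The grouping performed here can be sketched as follows. The triple layout, the label strings, and the fixed bag size are illustrative assumptions; only the constraint that a sample set contains same-label pairs comes from the text.

```python
def build_bags(labeled_pairs, bag_size):
    """Split same-label question-answer texts into fixed-size sample sets.

    labeled_pairs: (question, answer, label) triples. Only pairs sharing
    a label go into the same set, matching the first sample analysis
    module's grouping step; trailing incomplete sets are dropped.
    """
    by_label = {}
    for question, answer, label in labeled_pairs:
        by_label.setdefault(label, []).append((question, answer))
    bags = []
    for label, pairs in by_label.items():
        for i in range(0, len(pairs), bag_size):
            chunk = pairs[i:i + bag_size]
            if len(chunk) == bag_size:  # drop the trailing incomplete set
                bags.append((label, chunk))
    return bags

samples = [
    ("q1", "a1", "question answered"),
    ("q2", "a2", "question answered"),
    ("q3", "a3", "question not answered"),
    ("q4", "a4", "question answered"),
    ("q5", "a5", "question answered"),
]
bags = build_bags(samples, bag_size=2)
```

Each resulting set carries one bag-level label, which is what the multi-instance loss later compares against the per-sample predictions.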
Optionally, the first sample analysis module is further configured to: and matching the question-answer pair sample with the word list of the pre-training language model to obtain a word vector corresponding to the sample data in each question-answer pair sample set.
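The vocabulary-matching step can be sketched as a simple lookup. The unknown-token convention and the toy vocabulary are assumptions; a real pre-trained language model would supply its own word list.

```python
def to_word_ids(words, vocab, unk_id=0):
    """Map segmented words to ids in the pre-trained model's vocabulary.

    Out-of-vocabulary words fall back to `unk_id`; the id convention
    and the toy vocabulary below are illustrative assumptions.
    """
    return [vocab.get(w, unk_id) for w in words]

# Toy word list of a hypothetical pre-trained language model:
vocab = {"[UNK]": 0, "black": 5, "rice": 9}
word_ids = to_word_ids(["black", "rice", "efficacy"], vocab)
```

The resulting ids index into the pre-trained model's embedding table, yielding the word vectors that the second sample analysis module then encodes.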
Optionally, the third sample analysis module is specifically configured to: input each representation vector into a fully-connected network to obtain the likelihood probability that the prediction label of each corresponding question-answer pair sample is the correct classification label.
Optionally, the third sample analysis module implements corresponding functions through a multi-instance learning mechanism in the model generation process.
Optionally, the second sample analysis module implements a corresponding function through a pre-training language model coding layer of the question-answer pair quality judgment model.
The model generation device provided by this embodiment of the disclosure belongs to the same inventive concept as the model generation method provided by the above embodiments; technical details not described in detail here can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.
EXAMPLE five
Fig. 6 is a schematic structural diagram of a question and answer quality determination device according to a fifth embodiment of the present disclosure, which is applicable to the case of determining whether the question in a question-answer pair is answered.
As shown in fig. 6, the device for determining question and answer quality in the embodiment of the present disclosure includes: a text and model acquisition module 510 and a question-answer quality determination module 520.
The text and model obtaining module 510 is configured to obtain a question-answer pair text to be determined, and generate a question-answer pair quality determination model based on a model generation method according to any embodiment of the present disclosure; and the question-answer quality judging module 520 is configured to input the question-answer pair text to be judged into the question-answer pair quality judging model, so as to obtain a quality judging result of the question-answer pair to be judged.
According to the technical scheme of this embodiment of the disclosure, the question-answer pair to be judged is input into the question-answer quality judgment model obtained by the model generation method of the above embodiment, so that the judgment result of the question-answer pair to be judged is obtained with higher confidence in its accuracy. This solves the problem in the prior art of low accuracy in judging the question-answer quality of complicated question-answer pairs, enables quality judgment even when the question-answer content is complicated, and improves the accuracy of question-answer quality judgment.
The question and answer quality judging device provided by this embodiment of the disclosure belongs to the same inventive concept as the question and answer quality judging method provided by the above embodiment; technical details not described in detail here can be found in the above embodiment, and this embodiment has the same beneficial effects as the above embodiment.
EXAMPLE six
Referring now to FIG. 7, a block diagram of an electronic device 600 suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 7, electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage device 606 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic device 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 604 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 606 including, for example, magnetic tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 7 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such embodiments, the computer program may be downloaded and installed from a network through the communication device 609, or installed from the storage device 606, or installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network Protocol, such as HTTP (HyperText Transfer Protocol), and may interconnect with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtaining a preset number of question-answer pair sample sets, and analyzing sample data in each question-answer pair sample set to obtain word vectors; inputting the word vectors into a pre-training language model to obtain the expression vector of each question-answer pair sample; determining the likelihood probability that the prediction label of each question-answer pair sample is a correct label based on the expression vector, and calculating a loss function according to the prediction label of the question-answer pair sample corresponding to the maximum value in the likelihood probability; and generating a question-answer pair quality judgment model when the loss function result meets a preset condition.
Alternatively, the computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
obtaining a question-answer pair text to be judged, and generating a question-answer pair quality judgment model based on the model generation method of any embodiment of the disclosure; and inputting the question-answer pair text to be judged into the question-answer pair quality judgment model to obtain a quality judgment result of the question-answer pair to be judged.
Computer program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including but not limited to an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. Where the name of a unit does not in some cases constitute a limitation of the unit itself, for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, [ example one ] there is provided a model generation method comprising:
obtaining a preset number of question-answer pair sample sets, and analyzing sample data in each question-answer pair sample set to obtain word vectors;
inputting the word vectors into a pre-training language model to obtain the expression vector of each question-answer pair sample;
determining the likelihood probability that the prediction label of each question-answer pair sample is a correct label based on the expression vector, and calculating a loss function according to the prediction label of the question-answer pair sample corresponding to the maximum value in the likelihood probability;
and generating a question-answer pair quality judgment model when the loss function result meets a preset condition.
In accordance with one or more embodiments of the present disclosure, [ example two ] there is provided the method of example one, further comprising:
the obtaining of the preset number of question-answer pair sample sets includes:
acquiring question-answer pair texts with the same labels;
dividing the question-answer pair texts with the same labels into a preset number of question-answer pair sample sets, wherein the labels comprise "question answered" labels and "question not answered" labels.
In accordance with one or more embodiments of the present disclosure, [ example three ] there is provided the method of example one, further comprising:
analyzing sample data in each question-answer pair sample set to obtain a word vector, wherein the method comprises the following steps:
performing semantic word segmentation on the sample data to obtain word segmentation results;
and matching the word segmentation result with a word list of the pre-training language model to obtain a word vector corresponding to sample data in each question-answer pair sample set.
According to one or more embodiments of the present disclosure, [ example four ] there is provided the method of example three, further comprising:
the determining the likelihood probability that the predictive label of each sample data pair is the correct classification label based on the representation vector comprises:
and inputting each expression vector into a full-connection network to obtain the likelihood probability that the prediction label of each corresponding question-answer pair sample data pair is the correct classification label.
In accordance with one or more embodiments of the present disclosure, [ example five ] there is provided the method of example one, further comprising:
the likelihood probability that the prediction label of each question-answer pair sample is the correct label is determined based on the expression vector, and the loss function is calculated according to the prediction label of the question-answer pair sample corresponding to the maximum value in the likelihood probability, and the method is realized through a multi-example learning mechanism in the model generation process.
In accordance with one or more embodiments of the present disclosure, [ example six ] there is provided the method of example one, further comprising:
and inputting the word vectors into a pre-training language model to obtain the expression vector of each question-answer pair sample, wherein the expression vector is realized through a pre-training language model coding layer of the question-answer pair quality judgment model.
According to one or more embodiments of the present disclosure, [ example seven ] there is provided a question and answer quality determination method including:
obtaining a question-answer pair text to be judged, and generating a question-answer pair quality judgment model based on the model generation method in any example;
and inputting the question-answer pair text to be judged into the question-answer pair quality judgment model to obtain a quality judgment result of the question-answer pair to be judged.
According to one or more embodiments of the present disclosure, [ example eight ] there is provided a model generation apparatus comprising:
the first sample analysis module is used for acquiring a preset number of question-answer pair sample sets and analyzing sample data in each question-answer pair sample set to obtain word vectors;
the second sample analysis module is used for inputting the word vectors into a pre-training language model to obtain the expression vector of each question and answer pair sample;
the third sample analysis module is used for determining the likelihood probability that the prediction label of each question-answer pair sample is a correct label based on the expression vector, and calculating a loss function according to the prediction label of the question-answer pair sample corresponding to the maximum value in the likelihood probability;
and the model generation module is used for generating a question-answer pair quality judgment model when the loss function result meets a preset condition.
In accordance with one or more embodiments of the present disclosure, [ example nine ] there is provided the apparatus of example eight, further comprising:
the first sample analysis module is configured to:
acquiring question-answer pair texts with the same labels;
dividing the question-answer pair texts with the same labels into a preset number of question-answer pair sample sets, wherein the labels comprise "question answered" labels and "question not answered" labels.
In accordance with one or more embodiments of the present disclosure, [ example ten ] there is provided the apparatus of example eight, further comprising:
the first sample analysis module is further configured to:
performing semantic word segmentation on the sample data to obtain word segmentation results;
and matching the word segmentation result with a word list of the pre-training language model to obtain a word vector corresponding to sample data in each question-answer pair sample set.
In accordance with one or more embodiments of the present disclosure, [ example eleven ] there is provided the apparatus of example ten, further comprising:
a third sample analysis module to:
and inputting each representation vector into a fully-connected network to obtain the likelihood probability that the prediction label of each corresponding question-answer pair sample is the correct classification label.
In accordance with one or more embodiments of the present disclosure, [ example twelve ] there is provided the apparatus of example eight, further comprising:
the likelihood probability that the prediction label of each question-answer pair sample is the correct label is determined based on the expression vector, and the loss function is calculated according to the prediction label of the question-answer pair sample corresponding to the maximum value in the likelihood probability, and the method is realized through a multi-example learning mechanism in the model generation process.
In accordance with one or more embodiments of the present disclosure, [ example thirteen ] provides the apparatus of example eight, further comprising:
and inputting the word vectors into a pre-training language model to obtain the expression vector of each question-answer pair sample, wherein the expression vector is realized through a pre-training language model coding layer of the question-answer pair quality judgment model.
According to one or more embodiments of the present disclosure, [ example fourteen ] there is provided the question-answer quality determination device, further comprising:
the text and model acquisition module is used for acquiring a question-answer pair text to be judged and generating a question-answer pair quality judgment model based on the model generation method of any one of the above examples;
and the question-answer quality judgment module is used for inputting the question-answer pair text to be judged into the question-answer pair quality judgment model to obtain a quality judgment result of the question-answer pair to be judged.
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other embodiments in which any combination of the features described above or their equivalents does not depart from the spirit of the disclosure. For example, the above features and (but not limited to) the features disclosed in this disclosure having similar functions are replaced with each other to form the technical solution.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (11)

1. A method of model generation, comprising:
obtaining a preset number of question-answer pair sample sets, and analyzing sample data in each question-answer pair sample set to obtain word vectors;
inputting the word vectors into a pre-training language model to obtain the expression vector of each question-answer pair sample;
determining, based on the expression vector, the likelihood probability that the prediction label of each question-answer pair sample is a correct label, and calculating a loss function according to the prediction label of the question-answer pair sample corresponding to the maximum likelihood probability;
and generating a question-answer pair quality judgment model when the loss function result meets a preset condition.
2. The method of claim 1, wherein obtaining a preset number of sample sets of question-answer pairs comprises:
acquiring question-answer pair texts with the same labels;
dividing the question-answer pair texts with the same labels into a preset number of question-answer pair sample sets, wherein the labels comprise matched question-answer labels and unmatched question-answer labels.
3. The method of claim 1, wherein analyzing the sample data in each question-answer pair sample set to obtain word vectors comprises:
performing semantic word segmentation on the sample data to obtain word segmentation results;
and matching the word segmentation result with a word list of the pre-training language model to obtain a word vector corresponding to sample data in each question-answer pair sample set.
4. The method of claim 3, wherein determining, based on the expression vector, the likelihood probability that the prediction label of each question-answer pair sample is a correct classification label comprises:
inputting each expression vector into a fully-connected network to obtain the likelihood probability that the prediction label of the corresponding question-answer pair sample is the correct classification label.
5. The method according to any one of claims 1 to 4, wherein the determining of the likelihood probability that the prediction label of each question-answer pair sample is a correct label based on the expression vector, and the calculating of the loss function according to the prediction label of the question-answer pair sample corresponding to the maximum likelihood probability, are implemented by a multi-instance learning mechanism during model generation.
6. The method according to any one of claims 1 to 4, wherein the inputting of the word vectors into the pre-trained language model to obtain the expression vector of each question-answer pair sample is performed by a pre-trained language model encoding layer of the question-answer pair quality judgment model.
7. A question-answer quality judgment method is characterized by comprising the following steps:
obtaining a question-answer pair text to be judged, and generating a question-answer pair quality judgment model based on the model generation method of any one of claims 1 to 6;
and inputting the question-answer pair text to be judged into the question-answer pair quality judgment model to obtain a quality judgment result of the question-answer pair to be judged.
8. A model generation apparatus, comprising:
the first sample analysis module is used for acquiring a preset number of question-answer pair sample sets and analyzing sample data in each question-answer pair sample set to obtain word vectors;
the second sample analysis module is used for inputting the word vectors into a pre-training language model to obtain the expression vector of each question-answer pair sample;
the third sample analysis module is used for determining the likelihood probability that the prediction label of each question-answer pair sample is a correct label based on the expression vector, and calculating a loss function according to the prediction label of the question-answer pair sample corresponding to the maximum value in the likelihood probability;
and the model generation module is used for generating a question-answer pair quality judgment model when the loss function result meets a preset condition.
9. A question-answer quality judgment device, characterized by comprising:
the text and model acquisition module is used for acquiring a question-answer pair text to be judged and generating a question-answer pair quality judgment model based on the model generation method of any one of claims 1 to 6;
and the question-answer quality judgment module is used for inputting the question-answer pair text to be judged into the question-answer pair quality judgment model to obtain a quality judgment result of the question-answer pair to be judged.
10. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the model generation method of any one of claims 1-6, or the question-answer quality judgment method of claim 7.
11. A computer storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the model generation method according to any one of claims 1 to 6 or the question-answer quality judgment method according to claim 7.
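Taken together, claims 1 to 6 describe a multi-instance training step: encode every question-answer pair in a sample set, score each one with a fully-connected layer, keep the sample whose predicted label has the maximum likelihood, and compute the loss on that sample alone. The following is a minimal illustrative sketch of that selection step only; the pre-trained language model encoder is stubbed with random representation vectors, and all names (`bags`, `fc_w`, `bag_loss`, the sigmoid scoring) are assumptions for illustration, not details taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_dim = 8

# Each "bag" stands for one question-answer pair sample set sharing a
# single label (1 = answer matches the question, 0 = it does not).
# The pre-trained language model encoder is stubbed out: random vectors
# play the role of the expression vectors.
bags = [
    {"reprs": rng.normal(size=(4, hidden_dim)), "label": 1},
    {"reprs": rng.normal(size=(3, hidden_dim)), "label": 0},
]

# Fully-connected layer mapping an expression vector to the likelihood
# that the predicted label is the correct label.
fc_w = rng.normal(size=hidden_dim)
fc_b = 0.0

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bag_loss(bag):
    """Multi-instance step: score every sample in the bag, keep the
    sample with the maximum likelihood for the bag's label, and compute
    the cross-entropy loss on that sample alone."""
    p_pos = sigmoid(bag["reprs"] @ fc_w + fc_b)        # P(label = 1) per sample
    p_label = p_pos if bag["label"] == 1 else 1.0 - p_pos
    best = np.argmax(p_label)                          # most confident sample
    return -np.log(p_label[best] + 1e-12)

total_loss = sum(bag_loss(b) for b in bags) / len(bags)
print(total_loss)
```

Selecting only the maximum-likelihood sample per set is the usual multi-instance assumption: each set is expected to contain at least one sample that genuinely carries its label, so the most confidently scored sample drives the gradient while noisier samples in the same set are ignored.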
CN202011210093.2A 2020-11-03 2020-11-03 Model generation method, question and answer quality judgment method, device, equipment and medium Pending CN112580343A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011210093.2A CN112580343A (en) 2020-11-03 2020-11-03 Model generation method, question and answer quality judgment method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN112580343A true CN112580343A (en) 2021-03-30

Family

ID=75120053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011210093.2A Pending CN112580343A (en) 2020-11-03 2020-11-03 Model generation method, question and answer quality judgment method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN112580343A (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107980130A (en) * 2017-11-02 2018-05-01 深圳前海达闼云端智能科技有限公司 Automatic answering method, apparatus, storage medium and electronic device
CN109657041A (en) * 2018-12-04 2019-04-19 南京理工大学 Automatic question generation method based on deep learning
CN110110054A (en) * 2019-03-22 2019-08-09 北京中科汇联科技股份有限公司 Method for obtaining question-answer pairs from unstructured text based on deep learning
CN110196980A (en) * 2019-06-05 2019-09-03 北京邮电大学 Domain transfer method based on convolutional networks for Chinese word segmentation tasks
CN110222152A (en) * 2019-05-29 2019-09-10 北京邮电大学 Question answer acquisition method and system based on machine reading comprehension
CN110287296A (en) * 2019-05-21 2019-09-27 平安科技(深圳)有限公司 Question answer selection method, apparatus, computer device and storage medium
CN111078875A (en) * 2019-12-03 2020-04-28 哈尔滨工程大学 Method for extracting question-answer pairs from semi-structured document based on machine learning
CN111090736A (en) * 2018-10-24 2020-05-01 马上消费金融股份有限公司 Question-answering model training method, question-answering method, device and computer storage medium
CN111125295A (en) * 2019-11-14 2020-05-08 中国农业大学 Method and system for obtaining food safety question answers based on LSTM
CN111241304A (en) * 2020-01-16 2020-06-05 平安科技(深圳)有限公司 Answer generation method based on deep learning, electronic device and readable storage medium
CN111259115A (en) * 2020-01-15 2020-06-09 车智互联(北京)科技有限公司 Training method and device for content authenticity detection model and computing equipment
CN111340117A (en) * 2020-02-27 2020-06-26 支付宝(杭州)信息技术有限公司 CTC model training method, data processing method, device and storage medium
CN111444724A (en) * 2020-03-23 2020-07-24 腾讯科技(深圳)有限公司 Medical question-answer quality testing method and device, computer equipment and storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113420542A (en) * 2021-06-23 2021-09-21 平安科技(深圳)有限公司 Dialog generation method and device, electronic equipment and storage medium
CN113420542B (en) * 2021-06-23 2023-04-18 平安科技(深圳)有限公司 Dialogue generation method, device, electronic equipment and storage medium
CN115580487A (en) * 2022-11-21 2023-01-06 深圳市华曦达科技股份有限公司 Method and device for constructing network abnormal flow detection model

Similar Documents

Publication Publication Date Title
CN111444340B (en) Text classification method, device, equipment and storage medium
CN111090987B (en) Method and apparatus for outputting information
CN111930992B (en) Neural network training method and device and electronic equipment
CN112164391B (en) Statement processing method, device, electronic equipment and storage medium
CN110188202B (en) Training method and device of semantic relation recognition model and terminal
CN111027331A (en) Method and apparatus for evaluating translation quality
CN111666416B (en) Method and device for generating semantic matching model
CN114565104A (en) Language model pre-training method, result recommendation method and related device
CN111259647A (en) Question and answer text matching method, device, medium and electronic equipment based on artificial intelligence
CN111563390B (en) Text generation method and device and electronic equipment
CN111382261B (en) Abstract generation method and device, electronic equipment and storage medium
CN112883968B (en) Image character recognition method, device, medium and electronic equipment
CN112580343A (en) Model generation method, question and answer quality judgment method, device, equipment and medium
CN112632283A (en) Model generation method, text classification method, device, equipment and medium
CN115270717A (en) Method, device, equipment and medium for detecting vertical position
CN111400454A (en) Abstract generation method and device, electronic equipment and storage medium
CN113051933A (en) Model training method, text semantic similarity determination method, device and equipment
CN112069786A (en) Text information processing method and device, electronic equipment and medium
CN109657046B (en) Content analysis processing method and device, electronic equipment and storage medium
CN116975221A (en) Text reading and understanding method, device, equipment and storage medium
CN116957006A (en) Training method, device, equipment, medium and program product of prediction model
CN112182179B (en) Entity question-answer processing method and device, electronic equipment and storage medium
CN112101015A (en) Method and device for identifying multi-label object
CN113421551B (en) Speech recognition method, speech recognition device, computer readable medium and electronic equipment
CN113343668B (en) Method and device for solving selected questions, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination