CN114691850A - Method for generating question-answer pairs, training method and device of neural network model - Google Patents

Method for generating question-answer pairs, training method and device of neural network model

Info

Publication number
CN114691850A
Authority
CN
China
Prior art keywords
network model
answer
text data
data
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210352786.8A
Other languages
Chinese (zh)
Inventor
崔震
张士存
聂砂
罗奕康
熊衍琴
朱志鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd
Priority to CN202210352786.8A
Publication of CN114691850A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval of unstructured textual data
    • G06F16/33: Querying
    • G06F16/332: Query formulation
    • G06F16/3329: Natural language query formulation or dialogue systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The present disclosure provides a method for generating question-answer pairs using a neural network model, which can be applied to the field of financial technology. The method comprises the following steps: inputting first target text data into a first network model to obtain an answer extraction result corresponding to the first target text data; inputting second target text data and the answer extraction result into a second network model to obtain a question result corresponding to the second target text data; and generating a question-answer pair based on the answer extraction result and the question result. The disclosure also provides a training method, apparatus, device, storage medium, and program product for the neural network model.

Description

Question-answer pair generating method, and training method and device of neural network model
Technical Field
The present disclosure relates to the field of artificial intelligence technology, applicable to the field of financial technology, and more particularly to a method for generating question-answer pairs using a neural network model, a training method for a neural network model, an apparatus, an electronic device, a storage medium, and a program product.
Background
With the rise of intelligent question answering, the construction of knowledge bases has become particularly important. A knowledge base should contain rich question-answer pairs to support intelligent question answering.
However, constructing a current intelligent question-answering knowledge base requires manually extracting question-answer pairs from a corpus based on experience. Different personnel extract different question-answer pairs, which tends to limit the scope of knowledge base construction; moreover, manual processing is inefficient, and the extracted question-answer pairs are often incomplete.
Disclosure of Invention
In view of the foregoing, the present disclosure provides methods, apparatuses, devices, media, and program products for generating question-answer pairs using a neural network model that improve question-answer pair extraction efficiency.
According to a first aspect of the present disclosure, there is provided a method of generating question-answer pairs using a neural network model, the neural network model comprising a first network model and a second network model, the method comprising: inputting first target text data into the first network model to obtain an answer extraction result corresponding to the first target text data; inputting second target text data and the answer extraction result into the second network model to obtain a question result corresponding to the second target text data; and generating question-answer pairs based on the answer extraction result and the question result.
According to an embodiment of the present disclosure, the first target text data and/or the second target text data are obtained by: acquiring text data, wherein the text data comprises text data of a standard document; acquiring relevant information of the standard document, wherein the relevant information comprises the standard document and format information corresponding to the standard document; performing text decomposition processing on the text data based on the relevant information of the standard document to obtain a text decomposition processing result corresponding to the text data; and obtaining one or more of the first target text data and the second target text data based on the text decomposition processing result corresponding to the text data; wherein the text decomposition process comprises a regular-expression method.
According to an embodiment of the present disclosure, the first network model comprises a language characterization model and the second network model comprises a sequence-to-sequence model.
According to an embodiment of the present disclosure, the sequence-to-sequence model includes a language characterization model and a language model.
A second aspect of the present disclosure provides a training method for a neural network model, comprising: constructing a neural network model to be trained based on a language representation model and a language model; parsing text data to obtain title data, body data corresponding to the title data, and answer data; and using the title data, the body data corresponding to the title data, and the answer data as sample input data, training the neural network model to be trained using a machine learning algorithm.
According to an embodiment of the present disclosure, the answer data includes an answer extraction result, which is obtained by: using the text data as sample input data; and training a language representation model to be trained using a machine learning algorithm on the sample input data to obtain an answer extraction result corresponding to the sample input data.
A third aspect of the present disclosure provides an apparatus for generating question-answer pairs using a neural network model, the neural network model comprising a first network model and a second network model, the apparatus comprising: the answer determining module is used for inputting first target text data into the first network model to obtain an answer extraction result corresponding to the first target text data; the question determining module is used for inputting second target text data and the answer extraction result into the second network model to obtain a question result corresponding to the second target text data; and the question-answer pair generating module is used for generating question-answer pairs based on the answer extraction results and the question results.
A fourth aspect of the present disclosure provides a training apparatus for a neural network model, comprising: a building module for constructing a neural network model to be trained based on a language representation model and a language model; a parsing module for parsing text data to obtain title data, body data corresponding to the title data, and answer data; and a training module for using the title data, the body data corresponding to the title data, and the answer data as sample input data and training the neural network model to be trained using a machine learning algorithm.
A fifth aspect of the present disclosure provides an electronic device, comprising: one or more processors; memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the above-disclosed methods.
A sixth aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-disclosed method.
A seventh aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the method disclosed above.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, which proceeds with reference to the accompanying drawings, in which:
fig. 1 schematically illustrates an application scenario diagram of a method of generating question-answer pairs using a neural network model, a training method of a neural network model, an apparatus, a device, a medium, and a program product according to an embodiment of the present disclosure;
fig. 2 schematically illustrates a flow chart of a method of generating question-answer pairs using a neural network model in accordance with an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart for extracting answers using a language characterization model according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow chart of a method of training a neural network model in accordance with an embodiment of the present disclosure;
FIG. 5 schematically shows an execution diagram of a sequence-to-sequence model according to an embodiment of the present disclosure;
fig. 6 schematically shows a block diagram of a structure of an apparatus for generating question-answer pairs using a neural network model according to an embodiment of the present disclosure;
FIG. 7 schematically shows a block diagram of a training apparatus for a neural network model according to an embodiment of the present disclosure; and
fig. 8 schematically shows a block diagram of an electronic device adapted to implement a method of generating question-answer pairs using a neural network model and/or a training method of a neural network model according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have a alone, B alone, C alone, a and B together, a and C together, B and C together, and/or A, B, C together, etc.).
Embodiments of the present disclosure provide a method and apparatus for generating question-answer pairs using a neural network model, the neural network model comprising a first network model and a second network model, the method comprising: inputting first target text data into the first network model to obtain an answer extraction result corresponding to the first target text data; inputting second target text data and the answer extraction result into the second network model to obtain a question result corresponding to the second target text data; and generating a question-answer pair based on the answer extraction result and the question result.
Fig. 1 schematically illustrates an application scenario diagram of a method for generating question-answer pairs using a neural network model, a training method of the neural network model, an apparatus, a device, a medium, and a program product according to an embodiment of the present disclosure.
As shown in fig. 1, the application scenario 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. Network 104 is the medium used to provide communication links between terminal devices 101, 102, 103 and server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the method for generating question-answer pairs using a neural network model and/or the training method of the neural network model provided by the embodiments of the present disclosure may be generally performed by the server 105. Accordingly, the apparatus for generating question-answer pairs using a neural network model and/or the training apparatus for the neural network model provided by the embodiments of the present disclosure may be generally disposed in the server 105. The method for generating question-answer pairs using a neural network model and/or the training method of the neural network model provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the apparatus for generating question-answer pairs using a neural network model and/or the training apparatus of the neural network model provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
A method for generating question-answer pairs using a neural network model according to the disclosed embodiments will be described in detail below with reference to fig. 2, based on the scenario described in fig. 1.
Fig. 2 schematically shows a flow chart of a method of generating question-answer pairs using a neural network model according to an embodiment of the present disclosure.
As shown in fig. 2, the embodiment includes operations S210 to S230, and the method for generating question-answer pairs using a neural network model may be performed by a server, the neural network model including a first network model and a second network model.
In operation S210, the first target text data is input to the first network model, and an answer extraction result corresponding to the first target text data is obtained.
In operation S220, the second target text data and the answer extraction result are input to the second network model, and a question result corresponding to the second target text data is obtained.
In operation S230, a question-answer pair is generated based on the answer extraction result and the question result.
The first target text data and the second target text data may include text data from an official document corpus, and may include preprocessed text data; the preprocessing may include data cleaning, data deduplication, and the like.
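As a minimal sketch of the preprocessing just named (the function name and the whitespace-normalization rule standing in for "cleaning" are illustrative assumptions, not taken from the disclosure):

```python
import re

def preprocess_corpus(docs):
    """Clean and de-duplicate raw corpus texts.

    Hypothetical helper: the disclosure only names data cleaning and
    de-duplication as example preprocessing steps; whitespace
    normalization stands in for "cleaning" here."""
    cleaned, seen = [], set()
    for doc in docs:
        text = re.sub(r"\s+", " ", doc).strip()  # normalize whitespace
        if text and text not in seen:            # drop empty and duplicate texts
            seen.add(text)
            cleaned.append(text)
    return cleaned
```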
The first network model and the second network model may each be a neural network model trained using a machine learning algorithm.
Unlike methods that extract question-answer pairs (questions and answers) from text data directly, the method for generating question-answer pairs using a neural network model provided in this embodiment divides the extraction of question-answer pairs into multiple stages. Repeated analysis and practice show that the answer part of a question-answer pair is extracted from the original body text in the text data, while the question part may derive from the original body text or from the answer; that is, the question is related to both the original text and the answer. Therefore, the method provided in this embodiment divides the extraction of question-answer pairs into three stages: a first stage, text data → answer; a second stage, (text data, answer) → question; and a third stage, (answer extraction result, question result) → question-answer pair.
Unlike methods that locate answer ranges according to keywords, the method for generating question-answer pairs using a neural network model provided in this embodiment obtains the answer extraction result corresponding to the first target text data directly from the first network model. The answer extraction process therefore needs no manually supplied keywords and avoids depending on the part of speech of keywords, which would restrict extraction to answers of only certain aspects; consequently, the scope of knowledge base construction is not limited.
In the method for generating question-answer pairs using a neural network model provided in this embodiment, the second target text data and the answer extraction result are input into the second network model to obtain the question result corresponding to the second target text data. This handles cases in which a question's related keywords lie outside the answer content; in document processing, for example, the question's related keywords may appear in the article title. If the question were generated from the answer alone, an incorrect generation result would easily follow. Using the second target text data together with the answer extraction result as input data for the network model therefore avoids incorrect question generation results and does not limit the scope of knowledge base construction.
In the method for generating question-answer pairs using a neural network model provided in this embodiment, the extraction of question-answer pairs is divided into multiple stages, decoupling answer generation from question generation: the first target text data is input into the first network model to obtain the answer extraction result corresponding to the first target text data; the second target text data and the answer extraction result are input into the second network model to obtain the question result corresponding to the second target text data; and question-answer pairs are generated based on the answer extraction result and the question result. This avoids making answer generation depend on the part of speech of keywords, which would restrict extraction to answers of only certain aspects and limit the scope of knowledge base construction; using the second target text data and the answer extraction result as input data for the network model avoids incorrect question generation results; and using network models improves the extraction efficiency of question-answer pairs and makes the extracted pairs more complete.
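The three decoupled stages above can be sketched as a driver function; `answer_model` and `question_model` are placeholders for the trained first and second network models, and the callable interface is an assumption for illustration only:

```python
def generate_qa_pairs(first_text, second_text, answer_model, question_model):
    """Stage 1: text -> answers; stage 2: (text, answer) -> question;
    stage 3: pair each generated question with its answer."""
    answers = answer_model(first_text)               # stage 1: answer extraction
    qa_pairs = []
    for ans in answers:
        question = question_model(second_text, ans)  # stage 2: question generation
        qa_pairs.append((question, ans))             # stage 3: assemble the pair
    return qa_pairs
```

With stub models the data flow can be checked without any trained network, which is the point of decoupling the stages.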
The first target text data and/or the second target text data are obtained by: acquiring text data, wherein the text data comprises text data of a standard document; acquiring relevant information of the standard document, wherein the relevant information comprises the standard document and format information corresponding to the standard document; performing text decomposition processing on the text data based on the relevant information of the standard document to obtain a text decomposition processing result corresponding to the text data; and obtaining one or more of the first target text data and the second target text data based on the text decomposition processing result corresponding to the text data; wherein the text decomposition process comprises a regular-expression method.
The standard document may be one issued by a standards committee, such as the national standards committee or an industry standards committee, or another standard document such as a regulatory official document. The format information corresponding to the standard document may include information about the document's format requirements; a standard document has fixed format requirements. Examples include term information and term-interpretation information: such a document contains a dedicated terms chapter giving standard definitions for certain proper nouns, i.e., terms. Cover-page requirement information includes first-level title requirements, second-level title requirements, page-setup requirements, and header and footer requirements, which may specifically include font requirements, paragraph requirements, and tab-related requirements.
Take, for example, a standard document such as a regulatory official document, which includes a document title, a table of contents, a body, and attachments. The corpus, i.e., the text data, is obtained so that one or more of the first target text data and the second target text data can be derived from it. Relevant information of the standard document is obtained: the official document corpus is written in a fixed format (for example, a regulatory official document comprises a document title, a table of contents, a body, attachments, and the like), and the body content comprises first-level titles (e.g., "Chapter 1 XXX"), second-level titles (e.g., "Article 1 XXX"), and the specific content of each item. Text decomposition processing is then performed on the text data based on the relevant information of the standard document, and one or more of the first target text data and the second target text data are obtained from the corresponding decomposition result.
For example, a regular-expression method is used to decompose the body content and represent it as a tree structure, as shown in Table 1:
TABLE 1 decomposition treatment comparison Table
In the method for generating question-answer pairs using a neural network model provided in this embodiment, text decomposition processing is performed on the text data using the relevant information of the standard document, and one or more of the first target text data and the second target text data are obtained by regular-expression matching. Because a standard document has fixed format requirements, this enables more intelligent text decomposition processing; moreover, compared with a parser, the regular-expression method is better at capturing character strings and can obtain one or more of the first target text data and the second target text data quickly and flexibly.
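A regular-expression decomposition of a fixed-format body into a tree structure like that of Table 1 might look as follows; the English heading pattern ("Chapter N") is an illustrative stand-in for the fixed format a real standard document would use:

```python
import re

# Assumed first-level title pattern; a real regulatory document's fixed
# format (e.g. Chinese chapter/article numbering) needs its own regex.
CHAPTER = re.compile(r"^Chapter\s+\d+\b")

def decompose(body):
    """Decompose body text into a {first-level title: [item lines]} tree."""
    tree, current = {}, None
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        if CHAPTER.match(line):         # a first-level title opens a new branch
            current = line
            tree[current] = []
        elif current is not None:
            tree[current].append(line)  # item content under the current chapter
    return tree
```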
The first network model includes a language characterization model and the second network model includes a sequence-to-sequence model.
The language representation model may include a BERT model, i.e., a Bidirectional Encoder Representations from Transformers model. BERT learns good feature representations for words by running a self-supervised learning method over a large corpus; self-supervised learning refers to supervised learning that runs on data without human labels. In a downstream NLP task, BERT's feature representations can be used directly as the word-embedding features of that task. BERT thus provides a model for transfer learning to other tasks: it can be fine-tuned or frozen according to the task and then used as a feature extractor. BERT's biggest characteristic is that it abandons the traditional RNN and CNN and, through its attention mechanism, converts the distance between two words at any positions to 1, effectively solving the thorny long-range dependency problem in NLP.
Fig. 3 schematically shows a flowchart of extracting answers using a language representation model according to an embodiment of the present disclosure. Referring to fig. 3: the text data 310 is processed to obtain sentences sen1 320, sen2 321, and senn 322; the sentences sen1 320, sen2 321, and senn 322 are input into the language representation model and converted into vectors Vec1 340, Vec2 341, and Vecn 342; based on Vec1 340, Vec2 341, and Vecn 342, multi-label classification 350 is performed to obtain the answer extraction result 360.
For example, the body text is first segmented into sentences sen and input into the BERT model, then converted into vec vectors. To facilitate batch training of the model, a padding operation may be performed on the input text to pad it into m × n codes, where m is the uniform length of sentences and n is the uniform number of sentences in the article. After the BERT model, these become 768 × n sentence vectors. Suppose the sequence of text sentences to be recognized has length n, and assume that each answer entity to be recognized is a continuous fragment of the sequence, unlimited in length, and that entities may nest within one another (two entities may intersect). How many "candidate entities" does the sequence contain? The answer is n(n+1)/2: a sequence of n sentences has n(n+1)/2 distinct continuous subsequences, and these subsequences contain all possible answers. The true answer combination is then picked out of the n(n+1)/2 "candidate answers" by treating this as an "n(n+1)/2 choose k" multi-label classification problem, where k is the number of answers. If sen3 and sen4 are answer sentences, the matrix positions equal to 1 determine the first and last sentences of the answer; that is, the answer extraction result is obtained.
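The candidate-answer enumeration above can be made concrete; the span indices and the 0/1 label vector here are illustrative, with the real labels coming from the trained multi-label classifier:

```python
def candidate_spans(n):
    """All contiguous sentence spans (i, j) with 0 <= i <= j < n.
    A sequence of n sentences has n*(n+1)//2 such candidate answers."""
    return [(i, j) for i in range(n) for j in range(i, n)]

def pick_answers(spans, labels):
    """Keep the spans the multi-label output marks with 1; a 1 fixes the
    answer's first sentence i and last sentence j."""
    return [span for span, y in zip(spans, labels) if y == 1]
```

For instance, with n = 3 sentences there are 3·4/2 = 6 candidate spans, and a single 1 in the label vector selects the answer's start and end sentences.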
The sequence-to-sequence model may include a Seq2Seq model. For example, the sequence-to-sequence model includes a language representation model and a language model; that is, a Seq2Seq (sequence-to-sequence) model is constructed from BERT (a language representation model) plus UniLM (a unified pre-trained language model for natural language understanding and generation).
The language model UniLM is a multi-layer Transformer network that can complete three pre-training objectives simultaneously, including a sequence-to-sequence training mode, so it performs well on NLG tasks.
In the method for generating a question-answer pair using a neural network model provided by this embodiment, the first network model includes a language representation model, and the second network model includes a sequence-to-sequence model, which is beneficial to obtaining answer extraction results and question results in different training stages, thereby generating the question-answer pair.
The sequence-to-sequence model includes a language characterization model and a language model.
The language representation model may include a BERT model, i.e., a Bidirectional Encoder Representations from Transformers model. The language model may include a UniLM model, i.e., a Unified Language Model Pre-training for Natural Language Understanding and Generation.
According to the method for generating question-answer pairs using a neural network model provided in this embodiment, the question result can be obtained quickly and accurately through the sequence-to-sequence model built from the language representation model and the language model.
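UniLM's sequence-to-sequence pre-training mode is realized through an attention mask rather than a separate decoder: source positions attend bidirectionally among themselves, while target positions attend to the whole source and causally to earlier target positions. A sketch of such a mask (NumPy used purely for illustration; exact mask construction in UniLM differs in implementation detail):

```python
import numpy as np

def unilm_seq2seq_mask(src_len, tgt_len):
    """UniLM-style seq2seq attention mask (1 = may attend).

    Rows are query positions, columns key positions; the first src_len
    positions form the source segment, the rest the target segment."""
    n = src_len + tgt_len
    mask = np.zeros((n, n), dtype=int)
    mask[:, :src_len] = 1                # every position sees the full source
    for t in range(src_len, n):
        mask[t, src_len:t + 1] = 1       # target is causal within itself
    return mask
```

With src_len=2 and tgt_len=2, source rows see only the two source positions, while the last target row sees everything up to itself.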
Fig. 4 schematically shows a flow chart of a method of training a neural network model according to an embodiment of the present disclosure.
As shown in fig. 4, the embodiment includes operations S410 to S430, and the training method of the neural network model may be performed by a server.
In operation S410, a neural network model to be trained is constructed based on the language characterization model and the language model.
In operation S420, the text data is parsed to obtain title data, body data corresponding to the title data, and answer data.
In operation S430, the header data, the text data corresponding to the header data, and the answer data are used as sample input data, and a machine learning algorithm is used to train the neural network model to be trained.
The neural network model to be trained may include a Seq2Seq model, i.e., a sequence-to-sequence model. The language representation model may include a BERT model, i.e., a Bidirectional Encoder Representations from Transformers model. The language model may include a UniLM model, i.e., a Unified Language Model Pre-trained for Natural Language Understanding and Generation. UniLM is a multi-layer Transformer network that is pre-trained on three objectives simultaneously, including a sequence-to-sequence objective, so it performs well on NLG tasks.
Since the text data is written in a fixed format, it is parsed to obtain the title data, the body data corresponding to the title data, and the answer data. Using the title data, the corresponding body data, and the answer data together as sample input data covers both situations in which the keywords of a question appear in the chapter title and those in which they appear in some sentence of the answer, and yields varied question styles. For example, when the text is truncated, a chapter is represented as "chapter title" + "first sentence of the chapter" + "last sentence of the chapter", and the answer as "some sentence of the answer". A threshold may also be set, so that text shorter than the threshold is not truncated. A machine learning algorithm is then used to train the neural network model to be trained; during training, when answers are nested, the answer with the longest text is kept.
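The truncation and nested-answer rules above might be sketched as follows; the function name, the threshold value, and the string-containment test for nesting are illustrative assumptions, not taken from the embodiment:

```python
# Sketch of sample construction for the question generator.
def build_sample(chapter_name, sentences, answers, min_len=30):
    """Build generator input as 'chapter name' + first + last sentence,
    keep short texts whole, and keep only the longest of nested answers."""
    text = " ".join(sentences)
    if len(text) >= min_len:  # truncate only chapters above the threshold
        text = f"{chapter_name} {sentences[0]} {sentences[-1]}"
    # When one answer is contained in another, keep the longer one.
    kept = [a for a in answers
            if not any(a != b and a in b for b in answers)]
    return text, kept

text, kept = build_sample(
    "Chapter 1",
    ["First sentence here.", "Middle.", "Last sentence here."],
    ["sen3", "sen3 sen4"])
```

Here the nested answer "sen3" is dropped in favor of the longer "sen3 sen4", matching the longest-answer rule described above.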
Fig. 5 schematically shows an execution diagram of a sequence-to-sequence model according to an embodiment of the present disclosure. Referring to fig. 5, text data 510 is acquired and processed, for example parsed, to obtain header data 520, text data 521 corresponding to the header data, and answer data 522. The header data 520, the text data 521 corresponding to the header data, and the answer data 522 are then used as sample input data and input to a neural network model to be trained that is constructed based on a language representation model and a language model, such as the sequence-to-sequence model 530; a machine learning algorithm is used to train the sequence-to-sequence model 530, so as to obtain a question result 540 output by the sequence-to-sequence model 530.
According to the training method of the neural network model provided by this embodiment, the header data, the text data corresponding to the header data, and the answer data are used as sample input data, and a machine learning algorithm is used to train the neural network model to be trained. This effectively handles the case where the keywords of a question do not appear in the chapter itself, since the neural network model can generate the question from the header data, the corresponding text data, and the answer data together; the question result corresponding to the second target text data can then be obtained by inputting the second target text data and the answer extraction result into the second network model.
The answer data comprises answer extraction results; the answer extraction result is obtained by the following operations: taking the text data as sample input data; and training the language representation model to be trained by adopting a machine learning algorithm according to the sample input data to obtain an answer extraction result corresponding to the sample input data.
The answer data may include answer extraction results. An answer extraction result can be obtained using a trained language representation model, i.e., a language representation model trained with a machine learning algorithm, with text data used as sample input data and the answer extraction result corresponding to that sample input data used as the output result.
The training method of the neural network model provided by this embodiment is beneficial to quickly obtaining answer data, that is, an answer extraction result is used as the answer data, and the answer extraction result can be directly obtained by using the trained language representation model.
Fig. 6 schematically shows a block diagram of a structure of an apparatus for generating question-answer pairs using a neural network model according to an embodiment of the present disclosure.
As shown in fig. 6, the apparatus 600 for generating question-answer pairs using a neural network model of this embodiment includes an answer determining module 610, a question determining module 620 and a question-answer generating module 630.
The answer determining module 610 is configured to input the first target text data to the first network model, and obtain an answer extraction result corresponding to the first target text data; a question determining module 620, configured to input the second target text data and the answer extraction result to the second network model, and obtain a question result corresponding to the second target text data; and a question-answer pair generation module 630 for generating a question-answer pair based on the answer extraction result and the question result.
In some embodiments, the first target text data, and/or the second target text data, is obtained by: acquiring text data; wherein the text data comprises text data of a standard document; acquiring relevant information of a standard document, wherein the relevant information comprises: a standard document and format information corresponding to the standard document; performing text decomposition processing on the text data based on the relevant information of the standard document to obtain a text decomposition processing result corresponding to the text data; and obtaining one or more of the first target text data and the second target text data based on a text decomposition processing result corresponding to the text data; wherein the text decomposition process comprises a regularization method.
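The regular-expression based ("regularization method") text decomposition described here could be sketched as below; the chapter-heading pattern is a hypothetical example, since the actual format information of the standard document is not specified in this embodiment:

```python
import re

# Sketch of regex-based decomposition of a standard document whose
# chapters are headed like "Chapter 1 Title" (assumed format).
def decompose(document):
    """Split a standard document into (chapter heading, body) pairs."""
    parts = re.split(r"(Chapter \d+[^\n]*)\n", document)
    # re.split with a capturing group keeps the headings in the result:
    # ['', heading1, body1, heading2, body2, ...]
    chapters = []
    for i in range(1, len(parts) - 1, 2):
        chapters.append((parts[i].strip(), parts[i + 1].strip()))
    return chapters

doc = "Chapter 1 Scope\nBody one.\nChapter 2 Terms\nBody two.\n"
chapters = decompose(doc)
```

In practice the pattern would be built from the format information that accompanies the standard document, so that each target text segment aligns with one chapter.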
In some embodiments, the first network model comprises a language characterization model and the second network model comprises a sequence-to-sequence model.
In some embodiments, the sequence-to-sequence model includes a language characterization model and a language model.
Any of the answer determining module 610, question determining module 620, and question and answer pair generating module 630 may be combined into one module or any one of them may be split into multiple modules according to an embodiment of the present disclosure. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the answer determining module 610, the question determining module 620, and the question and answer pair generating module 630 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or may be implemented in any one of three implementations of software, hardware, and firmware, or in any suitable combination of any of them. Alternatively, at least one of the answer determining module 610, the question determining module 620 and the question-and-answer pair generating module 630 may be at least partially implemented as a computer program module that, when executed, may perform corresponding functions.
Fig. 7 schematically shows a block diagram of a training apparatus of a neural network model according to an embodiment of the present disclosure.
As shown in fig. 7, the training apparatus 700 of the neural network model of this embodiment includes a building module 710, a parsing module 720, and a training module 730.
The building module 710 is used for building a neural network model to be trained based on the language representation model and the language model; the analysis module 720 is used for analyzing the text data to obtain the title data, the text data corresponding to the title data and the answer data; and a training module 730, configured to train the neural network model to be trained by using a machine learning algorithm, with the header data, and the text data and answer data corresponding to the header data as sample input data.
In some embodiments, the answer data comprises answer extraction results; the answer extraction result is obtained by the following operations: taking the text data as sample input data; and training a language representation model to be trained by adopting a machine learning algorithm according to the sample input data to obtain an answer extraction result corresponding to the sample input data.
Any of the building module 710, the parsing module 720, and the training module 730 may be combined in one module or any of them may be split into multiple modules according to embodiments of the present disclosure. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the building module 710, the parsing module 720 and the training module 730 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware by any other reasonable manner of integrating or packaging a circuit, or in any one of three implementations of software, hardware and firmware, or in a suitable combination of any of them. Alternatively, at least one of the building module 710, the parsing module 720 and the training module 730 may be at least partially implemented as a computer program module, which when executed, may perform a corresponding function.
Fig. 8 schematically shows a block diagram of an electronic device adapted to implement a method of generating question-answer pairs using a neural network model and/or a training method of a neural network model according to an embodiment of the present disclosure.
As shown in fig. 8, an electronic device 800 according to an embodiment of the present disclosure includes a processor 801 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. The processor 801 may include, for example, a general purpose microprocessor (e.g., CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., Application Specific Integrated Circuit (ASIC)), among others. The processor 801 may also include onboard memory for caching purposes. The processor 801 may include a single processing unit or multiple processing units for performing different actions of the method flows according to embodiments of the present disclosure.
In the RAM803, various programs and data necessary for the operation of the electronic apparatus 800 are stored. The processor 801, ROM802, and RAM803 are connected to each other by a bus 804. The processor 801 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM802 and/or RAM 803. Note that the programs may also be stored in one or more memories other than the ROM802 and RAM 803. The processor 801 may also perform various operations of method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
Electronic device 800 may also include input/output (I/O) interface 805, input/output (I/O) interface 805 also connected to bus 804, according to an embodiment of the present disclosure. Electronic device 800 may also include one or more of the following components connected to I/O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage portion 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 810 as necessary, so that a computer program read out therefrom is mounted on the storage section 808 as necessary.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement a method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include one or more memories other than the ROM 802 and/or RAM 803 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated in the flow chart. When the computer program product runs in a computer system, the program code is used for causing the computer system to implement the method for generating question-answer pairs using a neural network model and the training method of the neural network model provided by the embodiment of the disclosure.
The computer program performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure when executed by the processor 801. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted in the form of a signal, distributed over a network medium, downloaded and installed via communications portion 809, and/or installed from removable media 811. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In accordance with embodiments of the present disclosure, program code for carrying out computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, and the like. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that various combinations and/or sub-combinations of the features recited in the various embodiments and/or claims of the present disclosure can be made, even if such combinations or sub-combinations are not expressly recited in the present disclosure. In particular, various combinations and/or sub-combinations of the features recited in the various embodiments and/or claims of the present disclosure may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or sub-combinations are within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (11)

1. A method of generating question-answer pairs using a neural network model, the neural network model comprising a first network model and a second network model, the method comprising:
inputting first target text data into the first network model to obtain an answer extraction result corresponding to the first target text data;
inputting second target text data and the answer extraction result into the second network model to obtain a question result corresponding to the second target text data; and
generating question-answer pairs based on the answer extraction results and the question results.
2. The method of claim 1, wherein the first target text data, and/or the second target text data, is obtained by:
acquiring text data; wherein the text data comprises text data of a standard document;
acquiring relevant information of a standard document, wherein the relevant information comprises: a standard document and format information corresponding to the standard document;
performing text decomposition processing on the text data based on the relevant information of the standard document to obtain a text decomposition processing result corresponding to the text data; and
obtaining one or more of the first target text data and the second target text data based on a text decomposition processing result corresponding to the text data;
wherein the text decomposition process comprises a regularization method.
3. The method of claim 1 or 2, wherein the first network model comprises a language characterization model and the second network model comprises a sequence-to-sequence model.
4. The method of claim 3, wherein the sequence-to-sequence model includes a language characterization model and a language model.
5. A method of training a neural network model, comprising:
constructing a neural network model to be trained based on the language representation model and the language model;
analyzing the text data to obtain title data, text data corresponding to the title data and answer data; and
taking the header data, the text data corresponding to the header data, and the answer data as sample input data, and training the neural network model to be trained by adopting a machine learning algorithm.
6. The method of claim 5, wherein the answer data comprises answer extraction results; the answer extraction result is obtained by the following operations:
taking the text data as sample input data; and
training a language representation model to be trained by adopting a machine learning algorithm according to the sample input data, to obtain an answer extraction result corresponding to the sample input data.
7. An apparatus for generating question-answer pairs using a neural network model, the neural network model comprising a first network model and a second network model, the apparatus comprising:
the answer determining module is used for inputting first target text data into the first network model to obtain an answer extraction result corresponding to the first target text data;
the question determining module is used for inputting second target text data and the answer extraction result into the second network model to obtain a question result corresponding to the second target text data; and
the question-answer pair generating module, for generating question-answer pairs based on the answer extraction result and the question result.
8. An apparatus for training a neural network model, comprising:
the building module is used for building a neural network model to be trained based on the language representation model and the language model;
the analysis module is used for analyzing the text data to obtain title data, and text data and answer data corresponding to the title data; and
the training module, for taking the header data, the text data corresponding to the header data, and the answer data as sample input data and training the neural network model to be trained by adopting a machine learning algorithm.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-6.
10. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any one of claims 1 to 6.
11. A computer program product comprising a computer program which, when executed by a processor, implements a method according to any one of claims 1 to 6.
CN202210352786.8A 2022-03-31 2022-03-31 Method for generating question-answer pairs, training method and device of neural network model Pending CN114691850A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210352786.8A CN114691850A (en) 2022-03-31 2022-03-31 Method for generating question-answer pairs, training method and device of neural network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210352786.8A CN114691850A (en) 2022-03-31 2022-03-31 Method for generating question-answer pairs, training method and device of neural network model

Publications (1)

Publication Number Publication Date
CN114691850A true CN114691850A (en) 2022-07-01

Family

ID=82142373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210352786.8A Pending CN114691850A (en) 2022-03-31 2022-03-31 Method for generating question-answer pairs, training method and device of neural network model

Country Status (1)

Country Link
CN (1) CN114691850A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115080722A (en) * 2022-08-19 2022-09-20 科大讯飞股份有限公司 Question generation method, question generation device, and storage medium
CN115080722B (en) * 2022-08-19 2023-02-17 科大讯飞股份有限公司 Question generation method, question generation device, and storage medium

Similar Documents

Publication Publication Date Title
US11151177B2 (en) Search method and apparatus based on artificial intelligence
US11062089B2 (en) Method and apparatus for generating information
US11232140B2 (en) Method and apparatus for processing information
US20190243886A1 (en) Methods and systems for improving machine learning performance
CN106874467B (en) Method and apparatus for providing search results
US20210326714A1 (en) Method for question-and-answer service, question-and-answer service system and storage medium
US20190377788A1 (en) Methods and systems for language-agnostic machine learning in natural language processing using feature extraction
US20220027569A1 (en) Method for semantic retrieval, device and storage medium
CN106960030B (en) Information pushing method and device based on artificial intelligence
CN111428010B (en) Man-machine intelligent question-answering method and device
US20190251087A1 (en) Method and apparatus for providing aggregate result of question-and-answer information
CN107861954B (en) Information output method and device based on artificial intelligence
US9619209B1 (en) Dynamic source code generation
US20170220327A1 (en) Dynamic source code generation
CN108268450B (en) Method and apparatus for generating information
CN114861889B (en) Deep learning model training method, target object detection method and device
CN112214601A (en) Social short text sentiment classification method and device and storage medium
US20220121668A1 (en) Method for recommending document, electronic device and storage medium
CN110738056B (en) Method and device for generating information
CN109325197B (en) Method and device for extracting information
CN107766498B (en) Method and apparatus for generating information
CN116955561A (en) Question answering method, question answering device, electronic equipment and storage medium
CN113360660A (en) Text type identification method and device, electronic equipment and storage medium
CN118070072A (en) Problem processing method, device, equipment and storage medium based on artificial intelligence
CN112307738B (en) Method and device for processing text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination