CN114077654A - Question answering method, device, equipment and storage medium

Question answering method, device, equipment and storage medium

Info

Publication number
CN114077654A
Authority
CN
China
Prior art keywords
target
text information
question
model
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010812055.8A
Other languages
Chinese (zh)
Inventor
黄伟文
黄华新
罗朝彤
薛蓉蓉
陈晓鸿
陈庆
邹伟政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Information Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Information Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202010812055.8A priority Critical patent/CN114077654A/en
Publication of CN114077654A publication Critical patent/CN114077654A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a question answering method, device, equipment, and storage medium, relating to the technical field of information processing. The question answering method includes the following steps: receiving a target question; inputting the target question and a plurality of pieces of preset sub-text information into a similarity matching model, and determining the target preset sub-text information matched with the target question; inputting the target question and the target preset sub-text information into a question-answering model, and determining a target starting position and a target ending position in the target preset sub-text information; and determining the text information between the target starting position and the target ending position as the target answer corresponding to the target question, and outputting the target answer. With the embodiments of the application, the questions and answers in question-answer pairs no longer need to be set manually.

Description

Question answering method, device, equipment and storage medium
Technical Field
The present application belongs to the field of information processing technologies, and in particular relates to a question answering method, device, equipment, and storage medium.
Background
A question-answering system is a system that receives a question and returns the corresponding answer. With the continuous development of science and technology, question-answering systems are being applied in an increasingly wide range of fields, such as navigation, shopping guidance, education, and search.
A question-answering system generally includes a pre-constructed question-answer pair knowledge base containing a plurality of question-answer pairs, each composed of a question and its corresponding answer. After receiving a question, the question-answering system can search the knowledge base for the answer corresponding to that question and output it.
However, the question-answer pairs in such a knowledge base require the question and answer in each pair to be set manually.
Disclosure of Invention
The embodiments of the application provide a question answering method, device, equipment, and storage medium with which the questions and answers in question-answer pairs do not need to be set manually.
In order to solve the technical problem, the present application is implemented as follows:
in a first aspect, an embodiment of the present application provides a question answering method, including:
receiving a target question;
inputting the target question and a plurality of pieces of preset sub-text information into a similarity matching model, and determining target preset sub-text information matched with the target question;
inputting the target question and the target preset sub-text information into a question-answering model, and determining a target starting position and a target ending position in the target preset sub-text information;
and determining the text information between the target starting position and the target ending position as a target answer corresponding to the target question, and outputting the target answer.
In a second aspect, an embodiment of the present application provides a question answering device, including:
a receiving module for receiving a target question;
the first determining module is used for inputting the target question and the plurality of pieces of preset sub-text information into the similarity matching model and determining the target preset sub-text information matched with the target question;
the second determining module is used for inputting the target question and the target preset sub-text information into the question-answering model and determining a target starting position and a target ending position in the target preset sub-text information;
and the output module is used for determining the text information between the target starting position and the target ending position as a target answer corresponding to the target question and outputting the target answer.
In a third aspect, an embodiment of the present application provides an apparatus, including: a processor and a memory storing computer program instructions;
the processor, when executing the computer program instructions, implements a question-answering method as in the first aspect.
In a fourth aspect, embodiments of the present application provide a computer storage medium having computer program instructions stored thereon, where the computer program instructions, when executed by a processor, implement the question answering method according to the first aspect.
In the embodiment of the application, the target question and the plurality of preset sub-text information are input into the similarity matching model, so that the target preset sub-text information matched with the target question can be determined; then, the target question and the target preset sub-text information are input into the question-answering model to determine a target starting position and a target ending position in the target preset sub-text information; finally, the text information between the target starting position and the target ending position can be determined as the target answer corresponding to the target question and output. Because the target answer is simply the text information between the target starting position and the target ending position in the target preset sub-text information, and the target question does not need to be manually associated with the target preset sub-text information, the corresponding answer can be output after a question is received without manually setting questions and answers.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic view of a scenario provided by an embodiment of the present application;
FIG. 2 is a schematic flow chart of a question answering method according to another embodiment of the present application;
FIG. 3 is a schematic flow chart of constructing a knowledge base of question-answer pairs according to another embodiment of the present application;
FIG. 4 is a schematic flow chart of a method for constructing a question answering system according to another embodiment of the present application;
fig. 5 is a schematic structural diagram of a question answering device according to another embodiment of the present application;
fig. 6 is a schematic structural diagram of an apparatus according to another embodiment of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application will be described in detail below, and in order to make objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by illustrating examples thereof.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As described in the background section, the related-art question-answering system, which uses question-answer pairs in a knowledge base, requires the questions and answers in each question-answer pair to be set manually, which consumes considerable labor cost. In addition, for questions not included in the question-answer pairs, the related-art question-answering system cannot output an answer, so its application scenarios are narrow.
In order to solve the problems of the prior art, embodiments of the present application provide a question answering method, device, apparatus, and storage medium. First, a question answering method provided in the embodiment of the present application is described below.
The question answering method provided in the embodiments of the application may be executed by a question answering device, which may be a single server or a service cluster composed of multiple servers and may be deployed in a machine room managed by the service provider operating the question-answering system. A corresponding application scenario may be as shown in fig. 1: the question-answering apparatus 100 may receive a target question and then output the target answer corresponding to it through a pre-configured similarity matching model and question-answering model. The output target answer is the text information between a starting position and an ending position in the preset sub-text information, and the target question does not need to be manually associated with the preset sub-text information, so questions and answers do not need to be set manually.
The question answering method provided in the embodiments of the application relies on a similarity matching model and a question-answering model trained in advance, so these two models are introduced first, followed by the question answering method itself.
Optionally, the training process of the similarity matching model may specifically be as follows: obtaining a Bidirectional Encoder Representations from Transformers (BERT) model and a first training sample; adding a first fully-connected layer after the BERT model to obtain a first initial model; and iteratively training the first initial model on the first training sample to obtain the similarity matching model.
In some embodiments, the first initial model may be iteratively trained using the first training sample, resulting in a similarity matching model.
Specifically, the training sample used to train the similarity matching model may be referred to as the first training sample and may be constructed from a general-purpose Chinese corpus. The first training sample may include a plurality of questions, the answer corresponding to each question, and the text information containing those answers. For the first initial model, a fully-connected layer, which may be referred to as the first fully-connected layer, may be added after the Bidirectional Encoder Representations from Transformers (BERT) model to obtain the first initial model.
In some embodiments, the BERT model may have a plurality of inputs, and the input related to the similarity matching model may be referred to as the first input, which is also the input of the first initial model; the first input may be a plurality of pieces of sub-text information obtained by dividing the text information in advance. The output of the BERT model associated with the similarity matching model may be referred to as the first output, which serves as the input of the first fully-connected layer; the first output may be the vector information of the sub-text information. Accordingly, the output of the first fully-connected layer may be the output of the first initial model, i.e., the probability that the sub-text information contains the answer to the question.
It should be noted that the sub-text information may be vectorized without word segmentation processing.
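To make the architecture of the first initial model concrete, a minimal sketch is given below. It assumes PyTorch and the Hugging Face transformers library, a bert-base-chinese checkpoint, and the use of the vector of the first ([CLS]) token as the passage representation; the class and variable names are illustrative, not details disclosed by this application.

    import torch
    import torch.nn as nn
    from transformers import BertModel

    class FirstInitialModel(nn.Module):
        # BERT followed by a first fully-connected layer that scores the
        # probability that a piece of sub-text information contains the
        # answer to the question (an assumed realization, for illustration).
        def __init__(self, bert_name="bert-base-chinese"):
            super().__init__()
            self.bert = BertModel.from_pretrained(bert_name)
            self.fc1 = nn.Linear(self.bert.config.hidden_size, 1)

        def forward(self, input_ids, attention_mask):
            out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
            cls_vec = out.last_hidden_state[:, 0]     # vector of the first token
            return torch.sigmoid(self.fc1(cls_vec))   # P(sub-text contains answer)

Iterative training on the first training sample would then fit the first fully-connected layer (and fine-tune BERT) with an ordinary binary cross-entropy loss.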
Optionally, the training process of the question-answering model may specifically be as follows: obtaining a Bidirectional Encoder Representations from Transformers (BERT) model and a second training sample; adding a second fully-connected layer after the BERT model to obtain a second initial model; and iteratively training the second initial model on the second training sample to obtain the question-answering model.
In some embodiments, the second initial model may be iteratively trained using a second training sample to obtain a question-and-answer model.
Specifically, the training sample used to train the question-answering model may be referred to as the second training sample and may likewise be constructed from a general-purpose Chinese corpus. The second training sample may include a plurality of questions, the answer corresponding to each question, the text information containing the answers, and the starting position and ending position of each answer in the text information. For the second initial model, a fully-connected layer, which may be referred to as the second fully-connected layer, may be added after the BERT model to obtain the second initial model.
In some embodiments, the input of the BERT model related to the question-answering model may be referred to as the second input, which is also the input of the second initial model; the second input may be the question and the sub-text information. Accordingly, the output of the BERT model related to the question-answering model may be referred to as the second output, which serves as the input of the second fully-connected layer; the second output is the vector information of the question and the sub-text information. The output of the second fully-connected layer, which is also the output of the second initial model, is the starting position and ending position of the answer to the question in the sub-text information, together with the probabilities of those positions.
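Under the same assumptions (PyTorch plus transformers; names illustrative), the second initial model might be sketched as follows, with the second fully-connected layer emitting start- and end-position logits for every token of the (question, sub-text) pair.

    import torch.nn as nn
    from transformers import BertModel

    class SecondInitialModel(nn.Module):
        # BERT followed by a second fully-connected layer that emits, for
        # each token, logits for being the start and end of the answer span.
        def __init__(self, bert_name="bert-base-chinese"):
            super().__init__()
            self.bert = BertModel.from_pretrained(bert_name)
            self.fc2 = nn.Linear(self.bert.config.hidden_size, 2)

        def forward(self, input_ids, attention_mask):
            out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
            logits = self.fc2(out.last_hidden_state)     # (batch, seq_len, 2)
            start_logits, end_logits = logits.unbind(dim=-1)
            return start_logits, end_logits              # each (batch, seq_len)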
The question answering method provided by the embodiment of the application is introduced below. As shown in fig. 2, the question answering method provided in the embodiment of the present application includes the following steps:
s210, receiving the target problem.
In some embodiments, the target question may be any question; it may or may not be one of the questions in the training samples described above.
S220, inputting the target question and the plurality of preset sub-text information into the similarity matching model, and determining the target preset sub-text information matched with the target question.
In some embodiments, the preset sub-text information may be obtained by dividing an original document, where the original document may be a paper, a periodical, or an article on the internet.
Optionally, the preset sub-text information may be obtained based on the text information overlapping rate, and the corresponding processing may be as follows: acquiring original text information; and dividing the original text information into a plurality of preset sub-text information according to a preset proportion.
In some embodiments, the original text information may be the original document mentioned above. After the original text information is obtained, it may be divided into a plurality of preset sub-text information according to a preset ratio, namely the text information overlapping rate between the Nth preset sub-text information and the (N-1)th preset sub-text information, where N is a positive integer and N ≥ 2.
In this way, after receiving the target question, the target question and the plurality of preset sub-text information may be input to the similarity matching model to determine the target preset sub-text information matching the target question.
Specifically, considering that the BERT model limits its input length, i.e., 512 characters cannot be exceeded, and that the similarity matching model is built on the BERT model, the data amount of each piece of preset sub-text information may be set on the principle of not exceeding 512 characters, e.g., 256 characters or 512 characters.
It should be noted that there may be some long answers to the questions, and in this case, the data size of the preset sub-text information may be increased appropriately.
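As a concrete illustration of this splitting step, the sketch below divides a document into fixed-size chunks with a configurable overlap rate; the chunk size of 512 and the overlap rate of 0.25 are assumed example values, not ones specified by the application.

    def split_with_overlap(text, chunk_size=512, overlap=0.25):
        # Divide the original text information into preset sub-text chunks so
        # that the Nth chunk shares roughly overlap * chunk_size characters
        # with the (N-1)th chunk; an answer straddling a boundary therefore
        # still appears whole in at least one chunk.
        step = max(1, int(chunk_size * (1 - overlap)))
        chunks = []
        for start in range(0, len(text), step):
            chunks.append(text[start:start + chunk_size])
            if start + chunk_size >= len(text):
                break
        return chunks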
Optionally, the target preset sub-text information may be determined based on the probability that each piece of preset sub-text information contains the answer to the target question; correspondingly, the specific processing in step S220 may be as follows: inputting the target question and the plurality of preset sub-text information into the similarity matching model to obtain the probability that each piece of preset sub-text information contains the answer to the target question; and determining the preset sub-text information with the highest probability of containing the answer as the target preset sub-text information.
In some embodiments, the similarity matching model may output, for each piece of preset sub-text information, the probability that it contains the answer to the target question. In this way, the target question and the plurality of preset sub-text information can be input into the similarity matching model to obtain these probabilities, and the preset sub-text information with the highest probability of containing the answer can then be determined as the target preset sub-text information.
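This matching step might be realized as below, assuming a model with the interface of the FirstInitialModel sketched earlier and a Hugging Face tokenizer; pairing the question with each chunk in a single encoded sequence is an assumption about the exact input layout.

    import torch

    @torch.no_grad()
    def best_matching_chunk(question, chunks, sim_model, tokenizer):
        # Score every piece of preset sub-text information and return the
        # one with the highest probability of containing the answer.
        scores = []
        for chunk in chunks:
            enc = tokenizer(question, chunk, truncation=True,
                            max_length=512, return_tensors="pt")
            scores.append(sim_model(enc["input_ids"], enc["attention_mask"]).item())
        return chunks[scores.index(max(scores))]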
And S230, inputting the target question and the target preset sub-text information into a question-answering model, and determining a target starting position and a target ending position in the target preset sub-text information.
In some embodiments, after the target preset sub-text information is determined, the target question and the target preset sub-text information may be input to the question-and-answer model to determine a target start position and a target end position in the target preset sub-text information.
Optionally, the target starting position and target ending position may be determined based on the probabilities of candidate starting and ending positions of the answer to the target question in the target preset sub-text information; correspondingly, the specific processing in step S230 may be as follows: inputting the target question and the target preset sub-text information into the question-answering model to obtain the probabilities of the starting and ending positions of the answer to the target question in the target preset sub-text information; and determining the starting position and ending position with the maximum probabilities as the target starting position and the target ending position.
In some embodiments, the question-answering model may output the probabilities of the starting and ending positions of the answer to a given question in the target preset sub-text information. In this way, the target question and the target preset sub-text information can be input into the question-answering model to obtain these probabilities, and the starting position and ending position corresponding to the maximum probabilities can then be determined as the target starting position and the target ending position.
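The position selection might look as follows; this is a sketch under the same assumptions, and constraining the end position to come at or after the start position is a common convention that the application does not spell out.

    import torch

    @torch.no_grad()
    def extract_span(question, passage, qa_model, tokenizer):
        # Return the (start, end) token positions with the highest
        # probabilities, the end constrained to follow the start.
        enc = tokenizer(question, passage, truncation=True,
                        max_length=512, return_tensors="pt")
        start_logits, end_logits = qa_model(enc["input_ids"], enc["attention_mask"])
        start = int(start_logits[0].argmax())
        end = int(end_logits[0, start:].argmax()) + start
        return start, end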
And S240, determining the text information between the target starting position and the target ending position as a target answer corresponding to the target question, and outputting the target answer.
In some embodiments, the target answer may be the text information between the target starting position and the target ending position. In this way, the text information between those two positions can be output as the target answer.
In the embodiment of the application, the target question and the plurality of preset sub-text information are input into the similarity matching model, so that the target preset sub-text information matched with the target question can be determined; then, the target question and the target preset sub-text information are input into the question-answering model to determine a target starting position and a target ending position in the target preset sub-text information; finally, the text information between the target starting position and the target ending position can be determined as the target answer corresponding to the target question and output. Because the target answer is simply the text information between the target starting position and the target ending position in the target preset sub-text information, and the target question does not need to be manually associated with the target preset sub-text information, the corresponding answer can be output after a question is received without manually setting questions and answers.
In order to better understand the question answering method provided in the embodiment of the present application, a scenario embodiment is provided below.
First, a BERT model, which may be referred to as M0, is obtained.
Then, a knowledge base of question-answer pairs for training is constructed, wherein the knowledge base comprises questions, articles for extracting answers, the answers and positions of the answers in the articles. As shown in fig. 3, the question-answer pair knowledge base for training may be constructed according to a flow of "collecting articles from the Web → preprocessing data → designing questions → designing answers → constructing a question-answer pair knowledge base for training".
Then, the question-answer pair knowledge base may be vectorized to construct a training set X0 in the input format S0: "[CLS]" + question + "[SEP]" + article from which the answer to the question is extracted + "[SEP]". Here the output corresponding to [CLS] can be used for text classification, and [SEP] is a delimiter between the question and the article. The label format is "(position where the answer starts in the text, position where the answer ends in the text)".
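For illustration, encoding one training example in the S0 format with a Hugging Face fast tokenizer might look like the sketch below; char_to_token maps the answer's character span in the article to token-level label positions. The helper name and the bert-base-chinese checkpoint are assumptions.

    from transformers import BertTokenizerFast

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")

    def encode_s0(question, article, ans_start_char, ans_end_char):
        # Build "[CLS] question [SEP] article [SEP]" and convert the answer's
        # character span in the article into token positions for the label.
        enc = tokenizer(question, article, truncation=True,
                        max_length=512, return_tensors="pt")
        # sequence_index=1 selects the article; None means the answer was
        # truncated away and the example should be skipped.
        start_tok = enc.char_to_token(ans_start_char, sequence_index=1)
        end_tok = enc.char_to_token(ans_end_char - 1, sequence_index=1)
        return enc, (start_tok, end_tok)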
Next, a fully-connected layer W1 is added after the BERT model. The input of W1 is the output vectors C0 of M0 corresponding to question + "[SEP]" + article + "[SEP]", and the output of W1 is (batch_size, seq_len, 2), where batch_size is the batch size and 2 denotes the two dimensions used to output the start point and end point of the answer. In addition, a matrix transpose layer may be added after W1 to transpose its output to (batch_size, 2, seq_len), so that the final output is the position where the answer starts and the position where it ends in the article. The training set X0 can then be used to train and tune this model, yielding the question-answering model M1.
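The W1 head together with the transpose layer could be sketched as below; the hidden size of 768 is the standard BERT-base value and is assumed here.

    import torch.nn as nn

    class W1Head(nn.Module):
        # Linear layer over BERT's token vectors giving (batch_size, seq_len, 2),
        # then a transpose to (batch_size, 2, seq_len) so that the two rows hold
        # the start-position and end-position distributions over the article.
        def __init__(self, hidden_size=768):
            super().__init__()
            self.w1 = nn.Linear(hidden_size, 2)

        def forward(self, token_vectors):       # C0: (batch, seq_len, hidden)
            logits = self.w1(token_vectors)     # (batch, seq_len, 2)
            return logits.transpose(1, 2)       # (batch, 2, seq_len)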
Then, a fully-connected layer W2 may be added after M0, where the input of W2 is the output vector C1 of M0 corresponding to the special classification token [CLS], and the output of W2 is (batch_size, 1). The training-set text may then be divided into multiple 512-character paragraphs, with the question and a paragraph as inputs and the probability of finding the answer to the question in that paragraph as output, yielding the similarity matching model M2.
It should be noted that M1 and M2 may be trained in either order.
Then, a production knowledge base X1 may be constructed, wherein the documents in the production knowledge base X1 should be unprocessed original documents, and specifically, the text may be divided into text segments of 512 characters, that is, the above-mentioned preset sub-text information, for the model to extract answers from.
Then, the actually input question and the 512-character text segments in the production knowledge base X1 can be fed into the similarity matching model M2, which calculates the similarity between each text segment and the question, yielding the 512-character text segment most similar to the question, denoted TEXT0.
Then, the question and TEXT0 can be encoded according to the input format S0 and input into the question-answering model M1 to obtain the starting position and ending position of the answer corresponding to the question in TEXT0, from which the answer is extracted.
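Putting M1 and M2 together, an end-to-end sketch of this flow, reusing the helpers sketched earlier, could read as follows; token_to_chars offsets are relative to the passage because it is the second sequence of the pair, and the sketch assumes the predicted span falls inside the passage rather than on a special token.

    def answer(question, chunks, sim_model, qa_model, tokenizer):
        # Pick the most similar text segment with M2, locate the answer span
        # with M1, then slice the answer text out of the segment.
        passage = best_matching_chunk(question, chunks, sim_model, tokenizer)
        start, end = extract_span(question, passage, qa_model, tokenizer)
        enc = tokenizer(question, passage, truncation=True,
                        max_length=512, return_tensors="pt")
        return passage[enc.token_to_chars(start).start:
                       enc.token_to_chars(end).end]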
It should be noted that, as shown in fig. 4, the question-answering model M1 and the similarity matching model M2 may be exposed to a Web system in the form of an application program interface to build an intelligent question-answering system.
In this way, the question-answer knowledge base does not need to be defined manually, which reduces labor cost and makes the approach better suited to large-scale application; moreover, the 512-character limitation of the BERT model is circumvented, answers can be extracted from texts of any length, and the application scenarios of the question-answering system are greatly widened.
Based on the question answering method provided by the embodiment, correspondingly, the application also provides a specific implementation mode of the question answering device. Please see the examples below.
Referring to fig. 5, the question answering device provided in the embodiment of the present application includes the following modules:
a receiving module 510 for receiving a target question;
a first determining module 520, configured to input the target question and the plurality of preset sub-text information into the similarity matching model, and determine target preset sub-text information matched with the target question;
a second determining module 530, configured to input the target question and the target preset sub-text information into the question-and-answer model, and determine a target starting position and a target ending position in the target preset sub-text information;
and the output module 540 is configured to determine text information between the target starting position and the target ending position as a target answer corresponding to the target question, and output the target answer.
In the embodiment of the application, the target question and the plurality of preset sub-text information are input into the similarity matching model, so that the target preset sub-text information matched with the target question can be determined; then, the target question and the target preset sub-text information are input into the question-answering model to determine a target starting position and a target ending position in the target preset sub-text information; finally, the text information between the target starting position and the target ending position can be determined as the target answer corresponding to the target question and output. Because the target answer is simply the text information between the target starting position and the target ending position in the target preset sub-text information, and the target question does not need to be manually associated with the target preset sub-text information, the corresponding answer can be output after a question is received without manually setting questions and answers.
Optionally, the apparatus further comprises a first training module configured to:
acquiring a Bidirectional Encoder Representations from Transformers (BERT) model and a first training sample, wherein the first training sample comprises a plurality of questions, the answers corresponding to the questions, and text information including the answers;
adding a first fully-connected layer after the BERT model to obtain a first initial model, wherein the first input of the BERT model is the input of the first initial model, the first input is a plurality of pieces of sub-text information obtained by dividing the text information in advance, the first output of the BERT model is the input of the first fully-connected layer, the first output is the vector information of the sub-text information, the output of the first fully-connected layer is the probability that the sub-text information contains the answer to the question, and the output of the first fully-connected layer is the output of the first initial model;
and performing iterative training on the first initial model according to the first training sample to obtain a similarity matching model.
Optionally, the apparatus further comprises a second training module, configured to:
acquiring a Bidirectional Encoder Representations from Transformers (BERT) model and a second training sample, wherein the second training sample comprises a plurality of questions, the answers corresponding to the questions, text information including the answers, and the starting position and ending position of each answer in the text information;
adding a second fully-connected layer after the BERT model to obtain a second initial model, wherein the second input of the BERT model is the input of the second initial model, the second input is the question and the sub-text information, the second output of the BERT model is the input of the second fully-connected layer, the second output is the vector information of the question and the sub-text information, the output of the second fully-connected layer is the starting position and ending position of the answer to the question in the sub-text information together with the probabilities of those positions, and the output of the second fully-connected layer is the output of the second initial model;
and performing iterative training on the second initial model according to the second training sample to obtain a question-answer model.
Optionally, the first determining module is further configured to:
inputting the target question and the plurality of preset sub-text information into the similarity matching model to obtain the probability that each piece of preset sub-text information contains the answer to the target question;
and determining the preset sub-text information with the highest probability of containing the answer to the target question as the target preset sub-text information.
Optionally, the apparatus further includes an obtaining module, configured to:
acquiring original text information;
dividing the original text information into a plurality of preset sub-text information according to a preset proportion, wherein the preset proportion is the text information overlapping rate of the Nth preset sub-text information and the (N-1) th preset sub-text information, N is a positive integer, and N is larger than or equal to 2.
Each module in the question answering device provided in fig. 5 has a function of implementing each step in the embodiment shown in fig. 2, and achieves the same technical effect as the question answering method shown in fig. 2, and is not described herein again for brevity.
Fig. 6 is a schematic hardware structure diagram of a device implementing various embodiments of the present application.
The device may comprise a processor 601 and a memory 602 in which computer program instructions are stored.
Specifically, the processor 601 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
Memory 602 may include mass storage for data or instructions. By way of example, and not limitation, memory 602 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 602 may include removable or non-removable (or fixed) media, where appropriate. The memory 602 may be internal or external to the integrated gateway disaster recovery device, where appropriate. In a particular embodiment, the memory 602 is a non-volatile solid-state memory. In a particular embodiment, the memory 602 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor 601 implements any one of the question answering methods described above by reading and executing the computer program instructions stored in the memory 602.
In one example, the device may also include a communication interface 603 and a bus 610. As shown in fig. 6, the processor 601, the memory 602, and the communication interface 603 are connected via a bus 610 to complete communication therebetween.
The communication interface 603 is mainly used for implementing communication between modules, apparatuses, units and/or devices in the embodiments of the present application.
Bus 610 includes hardware, software, or both coupling the device's components to one another. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a low-pin-count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local (VLB) bus, or another suitable bus, or a combination of two or more of these. Bus 610 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated.
The device may perform the question answering method in the embodiments of the present application, thereby implementing the question answering method described in conjunction with fig. 2.
An embodiment of the present application further provides a computer-readable storage medium, where the computer storage medium has computer program instructions stored thereon; when executed by a processor, the computer program instructions implement the processes of the above-described embodiment of the question-answering method, and can achieve the same technical effects, and are not described herein again to avoid repetition.
It is to be understood that the present application is not limited to the particular arrangements and instrumentality described above and shown in the attached drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications, and additions or change the order between the steps after comprehending the spirit of the present application.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
Aspects of the present disclosure are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware for performing the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As described above, only the specific embodiments of the present application are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered within the scope of the present application.

Claims (13)

1. A question-answering method, characterized in that it comprises:
receiving a target question;
inputting the target question and a plurality of preset sub-text information into a similarity matching model, and determining target preset sub-text information matched with the target question;
inputting the target question and the target preset sub-text information into a question-answering model, and determining a target starting position and a target ending position in the target preset sub-text information;
and determining the text information between the target starting position and the target ending position as a target answer corresponding to the target question, and outputting the target answer.
2. The method of claim 1, wherein prior to receiving the target issue, the method further comprises:
obtaining a Bidirectional Encoder Representations from Transformers (BERT) model and a first training sample, wherein the first training sample comprises a plurality of questions, answers corresponding to the questions, and text information including the answers;
adding a first fully-connected layer after the BERT model to obtain a first initial model, wherein a first input of the BERT model is the input of the first initial model, the first input is a plurality of pieces of sub-text information obtained by dividing the text information in advance, a first output of the BERT model is the input of the first fully-connected layer, the first output is vector information of the sub-text information, the output of the first fully-connected layer is the probability that the sub-text information contains the answer to the question, and the output of the first fully-connected layer is the output of the first initial model;
and performing iterative training on the first initial model according to the first training sample to obtain the similarity matching model.
3. The method of claim 1, wherein prior to receiving the target issue, the method further comprises:
obtaining a Bidirectional Encoder Representations from Transformers (BERT) model and a second training sample, wherein the second training sample comprises a plurality of questions, answers corresponding to the questions, text information including the answers, and the starting positions and ending positions of the answers in the text information;
adding a second fully-connected layer after the BERT model to obtain a second initial model, wherein a second input of the BERT model is the input of the second initial model, the second input is the question and the sub-text information, a second output of the BERT model is the input of the second fully-connected layer, the second output is vector information of the question and the sub-text information, the output of the second fully-connected layer is the starting position and ending position of the answer to the question in the sub-text information together with the probabilities of the starting position and the ending position, and the output of the second fully-connected layer is the output of the second initial model;
and performing iterative training on the second initial model according to the second training sample to obtain the question-answer model.
4. The method according to claim 1, wherein the inputting the target question and a plurality of preset sub-text information into a similarity matching model, and determining the target preset sub-text information matching the target question comprises:
inputting the target question and the plurality of preset sub-text information into the similarity matching model to obtain the probability that each piece of preset sub-text information contains the answer to the target question;
and determining the preset sub-text information with the highest probability of containing the answer to the target question as the target preset sub-text information.
5. The method according to claim 1 or 4, wherein before inputting the target question and a plurality of preset sub-text information into a similarity matching model, the method further comprises:
acquiring original text information;
dividing the original text information into a plurality of preset sub-text information according to a preset proportion, wherein the preset proportion is the text information overlapping rate of the Nth preset sub-text information and the (N-1) th preset sub-text information, N is a positive integer, and N is larger than or equal to 2.
6. The method according to claim 1, wherein the inputting the target question and the target preset sub-text information into a question-and-answer model, and the determining the target starting position and the target ending position in the target preset sub-text information comprises:
inputting the target question and the target preset sub-text information into the question-answering model to obtain the probabilities of the starting position and the ending position of the answer to the target question in the target preset sub-text information;
and determining the starting position and the ending position of the target preset sub-text information corresponding to the maximum probability as the target starting position and the target ending position.
7. A question answering device, characterized in that the device comprises:
a receiving module for receiving a target question;
the first determining module is used for inputting the target question and a plurality of preset sub-text information into a similarity matching model and determining the target preset sub-text information matched with the target question;
the second determining module is used for inputting the target question and the target preset sub-text information into a question-answering model and determining a target starting position and a target ending position in the target preset sub-text information;
and the output module is used for determining the text information between the target starting position and the target ending position as a target answer corresponding to the target question and outputting the target answer.
8. The apparatus of claim 7, further comprising a first training module to:
obtaining a Bidirectional Encoder Representations from Transformers (BERT) model and a first training sample, wherein the first training sample comprises a plurality of questions, answers corresponding to the questions, and text information including the answers;
adding a first fully-connected layer after the BERT model to obtain a first initial model, wherein a first input of the BERT model is the input of the first initial model, the first input is a plurality of pieces of sub-text information obtained by dividing the text information in advance, a first output of the BERT model is the input of the first fully-connected layer, the first output is vector information of the sub-text information, the output of the first fully-connected layer is the probability that the sub-text information contains the answer to the question, and the output of the first fully-connected layer is the output of the first initial model;
and performing iterative training on the first initial model according to the first training sample to obtain the similarity matching model.
9. The apparatus of claim 7, further comprising a second training module to:
obtaining a Bidirectional Encoder Representations from Transformers (BERT) model and a second training sample, wherein the second training sample comprises a plurality of questions, answers corresponding to the questions, text information including the answers, and the starting positions and ending positions of the answers in the text information;
adding a second fully-connected layer after the BERT model to obtain a second initial model, wherein a second input of the BERT model is the input of the second initial model, the second input is the question and the sub-text information, a second output of the BERT model is the input of the second fully-connected layer, the second output is vector information of the question and the sub-text information, the output of the second fully-connected layer is the starting position and ending position of the answer to the question in the sub-text information together with the probabilities of the starting position and the ending position, and the output of the second fully-connected layer is the output of the second initial model;
and performing iterative training on the second initial model according to the second training sample to obtain the question-answer model.
10. The apparatus of claim 7, wherein the first determining module is further configured to:
inputting the target question and the plurality of preset sub-text information into the similarity matching model to obtain the probability that each piece of preset sub-text information contains the answer to the target question;
and determining the preset sub-text information with the highest probability of containing the answer to the target question as the target preset sub-text information.
11. The apparatus of claim 7 or 10, further comprising an acquisition module configured to:
acquiring original text information;
dividing the original text information into a plurality of preset sub-text information according to a preset proportion, wherein the preset proportion is the text information overlapping rate of the Nth preset sub-text information and the (N-1) th preset sub-text information, N is a positive integer, and N is larger than or equal to 2.
12. An apparatus, characterized in that the apparatus comprises: a processor and a memory storing computer program instructions; the processor, when executing the computer program instructions, implements the question answering method according to any one of claims 1 to 6.
13. A computer storage medium having computer program instructions stored thereon that, when executed by a processor, implement the question answering method according to any one of claims 1 to 6.
CN202010812055.8A 2020-08-13 2020-08-13 Question answering method, device, equipment and storage medium Pending CN114077654A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010812055.8A CN114077654A (en) 2020-08-13 2020-08-13 Question answering method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010812055.8A CN114077654A (en) 2020-08-13 2020-08-13 Question answering method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114077654A 2022-02-22

Family

ID=80280552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010812055.8A Pending CN114077654A (en) 2020-08-13 2020-08-13 Question answering method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114077654A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination