CN115033674A - Question-answer matching method, question-answer matching device, electronic equipment and storage medium - Google Patents


Info

Publication number: CN115033674A
Authority: CN (China)
Prior art keywords: question, answer, data, target, initial
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202210687408.5A
Other languages: Chinese (zh)
Inventor: 张炜
Current Assignee: Ping An Life Insurance Company of China Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original Assignee: Ping An Life Insurance Company of China Ltd
Application filed by: Ping An Life Insurance Company of China Ltd
Priority to: CN202210687408.5A
Publication of: CN115033674A (legal status: pending)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3346 Query execution using probabilistic model
    • G06F 16/335 Filtering based on additional data, e.g. user or group profiles
    • G06F 16/35 Clustering; Classification
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/205 Parsing
    • G06F 40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a question-answer matching method, a question-answer matching device, electronic equipment and a storage medium, and belongs to the technical field of artificial intelligence. The method comprises the following steps: acquiring original sentence data to be processed; extracting context characteristics of original sentence data to obtain initial sentence data; carrying out sequence annotation on the initial sentence data to obtain target question data and candidate answer data; the target question data comprises at least one target question, and the candidate answer data comprises at least one candidate answer; constructing an initial question-answer pair according to the target question and the candidate answers; performing feature extraction on the initial question-answer pair through a question-answer matching model to obtain question-answer semantic features; and performing matching probability calculation on the question-answer semantic features to obtain a question-answer matching value, and screening the initial question-answer pair according to the question-answer matching value to obtain a target question-answer pair, wherein the target question-answer pair comprises a target question and a target answer corresponding to the target question. The method and the device can improve the matching accuracy of the questions and the answers.

Description

Question-answer matching method, question-answer matching device, electronic equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a question and answer matching method, a question and answer matching device, an electronic device, and a storage medium.
Background
In the field of intelligent prediction, answers can generally be predicted from questions. At present, prediction is usually realized by extracting matching answers from a database according to the questions, but the matching degree between the predicted answers and the questions is not high enough. How to improve the matching accuracy of questions and answers has therefore become an urgent technical problem to be solved.
Disclosure of Invention
The embodiment of the application mainly aims to provide a question-answer matching method, a question-answer matching device, electronic equipment and a storage medium, and aims to improve the matching accuracy of questions and answers.
In order to achieve the above object, a first aspect of the embodiments of the present application provides a question-answer matching method, where the method includes:
acquiring original sentence data to be processed;
extracting the context characteristics of the original sentence data to obtain initial sentence data;
carrying out sequence labeling on the initial sentence data to obtain target question data and candidate answer data; the target question data comprises at least one target question, the candidate answer data comprises at least one candidate answer, and each candidate answer is used for solving one target question;
constructing an initial question-answer pair according to the target question and the candidate answers;
performing feature extraction on the initial question-answer pair through a preset question-answer matching model to obtain question-answer semantic features;
and performing matching probability calculation on the question-answer semantic features through the question-answer matching model to obtain a question-answer matching value, and screening the initial question-answer pair according to the question-answer matching value to obtain a target question-answer pair, wherein the target question-answer pair comprises a target question and a target answer corresponding to the target question.
In some embodiments, the step of extracting the context feature of the original sentence data to obtain the initial sentence data includes:
performing feature embedding processing on the original sentence data to obtain a sentence embedding vector;
and extracting context characteristics of the sentence embedding vector through a preset attention mechanism model to obtain the initial sentence data.
In some embodiments, the step of performing feature embedding processing on the original sentence data to obtain a sentence embedding vector includes:
performing word segmentation processing on the original sentence data to obtain an original word segment;
carrying out word embedding processing on the original word segment to obtain an original word embedding vector;
and splicing the original word embedded vectors to obtain the sentence embedded vectors.
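The three steps above (word segmentation, word embedding, and splicing the word embedding vectors into a sentence embedding vector) can be sketched as follows. The whitespace tokenizer, the vocabulary, and the embedding dimension are illustrative assumptions, not part of the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 8  # illustrative embedding dimension

# Hypothetical vocabulary with randomly initialized word embeddings
vocab = {"what": 0, "is": 1, "nlp": 2, "a": 3, "field": 4, "[UNK]": 5}
embedding_table = rng.standard_normal((len(vocab), EMB_DIM))

def tokenize(sentence):
    """Word segmentation: split the raw sentence into original word segments."""
    return sentence.lower().split()

def embed_words(tokens):
    """Word embedding: map each segment to its embedding vector."""
    ids = [vocab.get(t, vocab["[UNK]"]) for t in tokens]
    return embedding_table[ids]            # shape (num_tokens, EMB_DIM)

def sentence_embedding(sentence):
    """Splice (concatenate) the word embedding vectors into one sentence vector."""
    return embed_words(tokenize(sentence)).reshape(-1)

vec = sentence_embedding("What is NLP")
print(vec.shape)  # (3 * EMB_DIM,) = (24,)
```

In practice the embedding table would come from a trained model rather than random initialization; the sketch only shows the data flow from segments to a spliced sentence vector.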
In some embodiments, the step of performing sequence annotation on the initial sentence data to obtain target question data and candidate answer data includes:
predicting the position of the initial sentence data according to a preset first function and a BIO label to obtain a question position label and an answer position label of the initial sentence data;
and segmenting the initial sentence data according to the question position tags to obtain the target question data, and segmenting the initial sentence data according to the answer position tags to obtain the candidate answer data.
In some embodiments, the step of segmenting the initial sentence data according to the question position tags to obtain the target question data, and segmenting the initial sentence data according to the answer position tags to obtain the candidate answer data includes:
extracting a problem starting label and a problem ending label in the problem position label, and segmenting the initial sentence data according to the problem starting label and the problem ending label to obtain the target problem data;
and extracting an answer starting label and an answer ending label in the answer position labels, and segmenting the initial sentence data according to the answer starting label and the answer ending label to obtain the candidate answer data.
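The segmentation by start and end labels described above can be sketched with a standard BIO scheme. The tag names ("B-Q"/"I-Q" for question tokens, "B-A"/"I-A" for answer tokens, "O" for other tokens) and the example sentence are illustrative assumptions:

```python
def extract_spans(tokens, tags, span_type):
    """Cut out sub-sequences whose BIO tags mark a span of the given type.

    A B- tag acts as the start label; the span ends where the run of
    I- tags stops (i.e. at an O tag or a tag of another type).
    """
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == f"B-{span_type}":          # start label: open a new span
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == f"I-{span_type}" and current:
            current.append(token)            # still inside the span
        else:                                # end of span reached
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans

tokens = ["what", "is", "AI", "?", "AI", "simulates", "human", "intelligence", "."]
tags   = ["B-Q",  "I-Q", "I-Q", "I-Q", "B-A", "I-A",      "I-A",   "I-A",         "O"]

questions = extract_spans(tokens, tags, "Q")  # target question data
answers   = extract_spans(tokens, tags, "A")  # candidate answer data
print(questions, answers)
```

Here the question span and the answer span are split out of the same labeled sequence, matching the description of segmenting the initial sentence data into target question data and candidate answer data.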
In some embodiments, the step of performing feature extraction on the initial question-answer pair through a preset question-answer matching model to obtain question-answer semantic features includes:
coding the initial question-answer pair through the question-answer matching model to obtain a question-answer coding vector;
and carrying out normalization processing on the question-answer encoding vector to obtain question-answer semantic features, wherein the question-answer semantic features are characterization features used for characterizing sentence context semantic information.
In some embodiments, the step of performing matching probability calculation on the question-answer semantic features through the question-answer matching model to obtain a question-answer matching value, and performing screening processing on the initial question-answer pair according to the question-answer matching value to obtain a target question-answer pair includes:
performing matching probability calculation on the question-answer semantic features through a second function of the question-answer matching model to obtain a question-answer matching value;
and taking the initial question-answer pair with the maximum question-answer matching value as the target question-answer pair.
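A minimal sketch of the matching-value computation and selection described above, assuming the "second function" is a softmax over raw pair scores (the scores and pair labels are invented for illustration):

```python
import numpy as np

def softmax(z):
    """Normalized exponential: maps scores to (0, 1) values summing to 1."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

# Hypothetical raw matching scores produced by the matching model
# for three initial question-answer pairs built from one target question.
pairs = ["(Q, A1)", "(Q, A2)", "(Q, A3)"]
logits = np.array([0.2, 2.5, -1.0])

match_values = softmax(logits)                # question-answer matching values
best = pairs[int(np.argmax(match_values))]    # pair with the maximum matching value
print(best)  # (Q, A2)
```

The pair with the maximum matching value is kept as the target question-answer pair, mirroring the screening step above.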
To achieve the above object, a second aspect of the embodiments of the present application proposes a question-answer matching apparatus, including:
the acquisition module is used for acquiring original sentence data to be processed;
the first feature extraction module is used for extracting the context features of the original sentence data to obtain initial sentence data;
the sequence marking module is used for carrying out sequence marking on the initial sentence data to obtain target question data and candidate answer data; the target question data comprises at least one target question, the candidate answer data comprises at least one candidate answer, and each candidate answer is used for solving one target question;
the construction module is used for constructing an initial question-answer pair according to the target question and the candidate answer;
the second feature extraction module is used for performing feature extraction on the initial question-answer pair through a preset question-answer matching model to obtain question-answer semantic features;
and the computing module is used for performing matching probability computation on the question and answer semantic features through the question and answer matching model to obtain a question and answer matching value, and screening the initial question and answer pair according to the question and answer matching value to obtain a target question and answer pair, wherein the target question and answer pair comprises a target question and a target answer corresponding to the target question.
In order to achieve the above object, a third aspect of the embodiments of the present application provides an electronic device, which includes a memory, a processor, a program stored on the memory and executable on the processor, and a data bus for implementing connection communication between the processor and the memory, wherein the program, when executed by the processor, implements the method of the first aspect.
To achieve the above object, a fourth aspect of the embodiments of the present application proposes a storage medium, which is a computer-readable storage medium for computer-readable storage, and stores one or more programs, which are executable by one or more processors to implement the method of the first aspect.
The question-answer matching method, the question-answer matching device, the electronic equipment and the storage medium provided by the application acquire original sentence data to be processed; the original sentence data is subjected to context feature extraction to obtain the initial sentence data, so that the context semantic information of the original sentence can be well reserved, and the semantic integrity of the original sentence is improved. Further, carrying out sequence labeling on the initial sentence data to obtain target question data and candidate answer data; the target question data comprises at least one target question, the candidate answer data comprises at least one candidate answer, each candidate answer is used for answering one target question, the target question and the candidate answer in the initial sentence data can be split into independent text data according to the labeling information, therefore, the target question and the candidate answer are paired to construct initial question-answer pairs, and each initial question-answer pair comprises one target question and one candidate answer. 
Finally, performing feature extraction on the initial question-answer pairs through a preset question-answer matching model to obtain question-answer semantic features, thereby extracting question-answer characterization information of the initial question-answer pairs, performing matching probability calculation on the question-answer semantic features to obtain question-answer matching values, and performing screening processing on the initial question-answer pairs according to the question-answer matching values to obtain target question-answer pairs, wherein the target question-answer pairs comprise target questions and target answers corresponding to the target questions, and the method can better capture complete semantic information of the target questions and candidate answers, improve the matching effect of the target questions and the target answers, and improve the matching accuracy of the questions and the answers.
Drawings
Fig. 1 is a flowchart of a question-answer matching method provided in an embodiment of the present application;
fig. 2 is a flowchart of step S102 in fig. 1;
fig. 3 is a flowchart of step S201 in fig. 2;
fig. 4 is a flowchart of step S103 in fig. 1;
FIG. 5 is a flowchart of step S402 in FIG. 4;
fig. 6 is a flowchart of step S105 in fig. 1;
FIG. 7 is a flowchart of step S106 in FIG. 1;
fig. 8 is a schematic structural diagram of a question-answer matching device according to an embodiment of the present application;
fig. 9 is a schematic hardware structure diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that although functional blocks are partitioned in a schematic diagram of an apparatus and a logical order is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the partitioning of blocks in the apparatus or the order in the flowchart. The terms first, second and the like in the description and in the claims, and the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
First, several terms referred to in the present application are explained:
Artificial Intelligence (AI): a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems, among others. Artificial intelligence can simulate the information processes of human consciousness and thinking. Artificial intelligence is also a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results.
Natural Language Processing (NLP): NLP uses computers to process, understand and use human languages (such as Chinese and English). It belongs to a branch of artificial intelligence and is a cross-discipline between computer science and linguistics, also commonly called computational linguistics. Natural language processing includes parsing, semantic analysis, discourse understanding, and the like. Natural language processing is commonly used in the technical fields of machine translation, character recognition of handwriting and print, speech recognition and text-to-speech conversion, information intent recognition, information extraction and filtering, text classification and clustering, public opinion analysis and viewpoint mining, and involves data mining, machine learning, knowledge acquisition, knowledge engineering, artificial intelligence research, linguistic research related to language computation, and other areas related to language processing.
Information Extraction: a text processing technology that extracts fact information of specified types, such as entities, relations and events, from natural language text and outputs it as structured data. Information extraction is a technique for extracting specific information from text data. Text data is composed of specific units, such as sentences, paragraphs and chapters, and text information is composed of smaller specific units, such as words, phrases, sentences and paragraphs, or combinations of these units. Extracting noun phrases, names of people, names of places and so on from text data is text information extraction; of course, the information extracted by a text information extraction technology can be of various types.
Encoding (encoder): converts the input sequence into a vector of fixed length.
Embedding: an Embedding Layer is a word embedding learned jointly with a neural network model on a specific natural language processing task. In this method, the words in the cleaned text are first one-hot encoded, and the size or dimension of the vector space is specified as part of the model, for example 50, 100 or 300 dimensions. The vectors are initialized with small random numbers. The Embedding Layer sits at the front end of the neural network and is trained in a supervised fashion with the back-propagation algorithm. The encoded words are mapped to word vectors; if a multi-layer perceptron (MLP) is used, the word vectors are concatenated before being input to the model, and if a recurrent neural network (RNN) is used, each word may be input as one element of the sequence. This way of learning an embedding layer requires a large amount of training data and may be slow, but it can learn an embedding tailored both to the specific text data and to the NLP task. An embedding is a vector representation: a low-dimensional vector is used to represent an object, which can be a word, a commodity, a movie, and so on. A property of embedding vectors is that objects whose vectors are close together have similar meanings; for example, the embedding of "The Avengers" is close to the embedding of "Iron Man", but far from the embedding of an unrelated film. Embedding is essentially a mapping from a semantic space to a vector space that preserves, as far as possible, the relations of the original samples in the semantic space; for example, two words with similar semantics also lie relatively close together in the vector space.
Embedding can encode an object with a low-dimensional vector while preserving its meaning. It is widely applied in machine learning: in the process of constructing a machine learning model, an object is encoded into a low-dimensional dense vector and then passed to a DNN (deep neural network) to improve efficiency.
Long Short-Term Memory network (LSTM): a type of recurrent neural network specially designed to solve the long-term dependence problem of the general RNN (recurrent neural network). All RNNs have the form of a chain of repeated neural network modules; in the standard RNN, this repeated structural block has only a very simple structure, e.g. a single tanh layer. An LSTM is a neural network containing LSTM blocks, which are sometimes described in the literature as intelligent network units because they can remember values over varying lengths of time: gates in the block determine whether an input is important enough to be remembered and whether it should be output.
Bi-directional Long Short-Term Memory (Bi-LSTM): formed by combining a forward LSTM and a backward LSTM, and commonly used in natural language processing tasks to model context information. Bi-LSTM combines, on the basis of LSTM, the information of the input sequence in both the forward and backward directions. For the output at time t, the forward LSTM layer carries information of time t and earlier times in the input sequence, and the backward LSTM layer carries information of time t and later times. If the output of the forward LSTM layer at time t is recorded as X and the output of the backward LSTM layer at time t is recorded as Y, the vectors output by the two LSTM layers can be combined by addition, averaging, concatenation, and so on.
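The combining step described above (merging the forward output X and the backward output Y by addition, averaging, or concatenation) can be sketched with placeholder per-timestep outputs; the shapes and values below are illustrative, standing in for what actual LSTM layers would produce:

```python
import numpy as np

# Hypothetical per-timestep outputs of the forward and backward LSTM layers
# for a sequence of length 4 with hidden size 3 (values are illustrative).
forward_out  = np.arange(12.0).reshape(4, 3)   # X: forward LSTM output at each time t
backward_out = np.ones((4, 3))                 # Y: backward LSTM output at each time t

# The two outputs can be combined by addition, averaging, or concatenation:
added    = forward_out + backward_out          # shape (4, 3)
averaged = (forward_out + backward_out) / 2    # shape (4, 3)
concat   = np.concatenate([forward_out, backward_out], axis=-1)  # shape (4, 6)
print(concat.shape)
```

Concatenation doubles the feature dimension while addition and averaging keep it unchanged, which is why downstream layer sizes depend on the combining method chosen.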
Word segmenter (Tokenizer): the text is divided into individual entities (usually words).
Softmax function: the Softmax function is a normalized exponential function that "compresses" a K-dimensional vector z containing arbitrary real numbers into another K-dimensional real vector σ (z) such that each element ranges between (0,1) and the sum of all elements is 1, which is commonly used in multi-classification problems.
Normalization: normalization takes two forms, one is to map a number to a decimal between (0, 1), and the other is to turn a dimensional expression into a dimensionless one. Its main purpose is convenience of data processing: mapping data into the range 0 to 1 makes subsequent processing more convenient and faster, and it belongs to the field of digital signal processing.
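The first form above, mapping numbers into the range 0 to 1, is commonly done by min-max scaling; a minimal sketch (the input values are invented for illustration):

```python
def min_max_normalize(values):
    """Map data into the [0, 1] range: x' = (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: the scale is degenerate, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_normalize([2.0, 4.0, 10.0]))  # [0.0, 0.25, 1.0]
```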
In the field of intelligent prediction, answers can generally be predicted from questions. At present, prediction is usually realized by extracting matching answers from a database according to the questions; in a specific question-answer scenario, this approach usually predicts an answer for each question individually, and the matching degree between the predicted answers and the questions is not high enough. How to improve the matching accuracy of questions and answers has therefore become an urgent technical problem to be solved.
Based on this, the embodiment of the application provides a question-answer matching method, a question-answer matching device, an electronic device and a storage medium, aiming at improving the matching accuracy of questions and answers.
The question-answer matching method, the question-answer matching device, the electronic device, and the storage medium provided in the embodiments of the present application are specifically described in the following embodiments, and first, the question-answer matching method in the embodiments of the present application is described.
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence base technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The embodiment of the application provides a question-answer matching method, and relates to the technical field of artificial intelligence. The question and answer matching method provided by the embodiment of the application can be applied to a terminal, a server side and software running in the terminal or the server side. In some embodiments, the terminal may be a smartphone, tablet, laptop, desktop computer, or the like; the server side can be configured into an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, and cloud servers for providing basic cloud computing services such as cloud service, a cloud database, cloud computing, cloud functions, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN (content delivery network) and big data and artificial intelligence platforms; the software may be an application or the like that implements a question-and-answer matching method, but is not limited to the above form.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Fig. 1 is an optional flowchart of a question-answer matching method provided in an embodiment of the present application, and the method in fig. 1 may include, but is not limited to, step S101 to step S106.
Step S101, obtaining original sentence data to be processed;
step S102, extracting context characteristics of original sentence data to obtain initial sentence data;
step S103, carrying out sequence annotation on the initial sentence data to obtain target question data and candidate answer data; the target question data comprises at least one target question, the candidate answer data comprises at least one candidate answer, and each candidate answer is used for solving one target question;
step S104, constructing an initial question-answer pair according to the target question and the candidate answer;
step S105, performing feature extraction on the initial question-answer pair through a preset question-answer matching model to obtain question-answer semantic features;
and step S106, performing matching probability calculation on the question-answer semantic features through a question-answer matching model to obtain a question-answer matching value, and screening the initial question-answer pair according to the question-answer matching value to obtain a target question-answer pair, wherein the target question-answer pair comprises a target question and a target answer corresponding to the target question.
In steps S101 to S106 illustrated in the embodiment of the present application, original sentence data to be processed is obtained; the context characteristics of the original sentence data are extracted to obtain the initial sentence data, so that the context semantic information of the original sentence can be well kept, and the semantic integrity of the original sentence is improved. Further, carrying out sequence labeling on the initial sentence data to obtain target question data and candidate answer data; the target question data comprises at least one target question, the candidate answer data comprises at least one candidate answer, each candidate answer is used for answering one target question, the target question and the candidate answer in the initial sentence data can be split into independent text data according to the labeling information, therefore, the target question and the candidate answer are paired to construct initial question-answer pairs, and each initial question-answer pair comprises one target question and one candidate answer. 
Finally, performing feature extraction on the initial question-answer pairs through a preset question-answer matching model to obtain question-answer semantic features, thereby extracting question-answer characterization information of the initial question-answer pairs, performing matching probability calculation on the question-answer semantic features to obtain question-answer matching values, and performing screening processing on the initial question-answer pairs according to the question-answer matching values to obtain target question-answer pairs, wherein the target question-answer pairs comprise target questions and target answers corresponding to the target questions, and the method can better capture complete semantic information of the target questions and candidate answers, improve the matching effect of the target questions and the target answers, and improve the matching accuracy of the questions and the answers.
In step S101 of some embodiments, the original sentence data to be processed may be obtained by writing a web crawler that performs targeted crawling after a data source is set. The original sentence data may also be acquired by other means, which are not limited here. The original sentence data may be text data and mainly comprises two kinds of sentences, namely question sentences and answer sentences.
Referring to fig. 2, in some embodiments, step S102 may include, but is not limited to, step S201 to step S202:
step S201, performing feature embedding processing on original sentence data to obtain sentence embedding vectors;
and S202, extracting context characteristics of the sentence embedding vectors through a preset attention mechanism model to obtain initial sentence data.
In step S201 of some embodiments, feature embedding is performed on the original sentence data through a preset pre-trained model, and this feature embedding covers both the question sentence part and the answer sentence part of the original sentence data, wherein the preset pre-trained model may be a RoBERTa model or the like. For example, the original sentence data is feature-embedded through a RoBERTa model to obtain initial word embedding data and initial sentence embedding data corresponding to the original sentence data; multi-head attention calculation is then performed on the initial word embedding data to obtain candidate sentence embedding data, and the candidate sentence embedding data and the initial sentence embedding data are spliced to obtain a sentence embedding vector.
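As a minimal sketch of this embed-and-splice step, the toy functions below pool word embedding vectors into a candidate sentence embedding with single-head dot-product attention and splice it with the initial sentence embedding. The function names, the mean-vector query, and the toy dimensions are illustrative assumptions, not the RoBERTa model's actual multi-head computation.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_sentence_vector(word_vecs):
    """Pool word embedding vectors into one candidate sentence embedding
    with toy single-head dot-product attention (mean vector as the query)."""
    dim = len(word_vecs[0])
    query = [sum(v[i] for v in word_vecs) / len(word_vecs) for i in range(dim)]
    scores = softmax([sum(q * w for q, w in zip(query, vec)) / math.sqrt(dim)
                      for vec in word_vecs])
    return [sum(s * vec[i] for s, vec in zip(scores, word_vecs))
            for i in range(dim)]

def build_sentence_embedding(word_vecs, initial_sentence_vec):
    # splice the candidate sentence embedding with the pre-trained model's
    # initial sentence embedding, as described for step S201
    return attention_sentence_vector(word_vecs) + initial_sentence_vec
```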
In step S202 of some embodiments, the preset attention mechanism model may be constructed based on an LSTM or Bi-LSTM algorithm. Taking the Bi-LSTM algorithm as an example, the sentence embedding vector is first encoded from left to right by the attention mechanism model to obtain a first feature vector a, and then encoded from right to left to obtain a second feature vector b; the first feature vector and the second feature vector are then combined, for example by vector addition or vector averaging, so that the context semantic information of the sentence embedding vector can be better captured, thereby obtaining the initial sentence data, where the initial sentence data is mainly CLS feature data.
It should be noted that the CLS feature is mainly a representation of information obtained at a sentence level through an attention mechanism, and in different tasks, the CLS feature may be used to represent context information in a specific environment, and in this embodiment of the present application, the CLS feature may represent context information of questions and answers in a question-answer matching environment.
The steps S201 to S202 can better retain the context semantic information of the original sentence, thereby improving the semantic integrity of the original sentence.
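The left-to-right and right-to-left encoding of step S202 can be illustrated with a deliberately simplified single-unit recurrence in place of the full Bi-LSTM; the recurrence h_t = tanh(w·x_t + u·h_{t-1}) and its fixed weights are illustrative assumptions, not the embodiment's trained parameters.

```python
import math

def rnn_pass(seq, w=0.5, u=0.3):
    """Toy single-unit recurrent pass: h_t = tanh(w * x_t + u * h_{t-1})."""
    h, states = 0.0, []
    for x in seq:
        h = math.tanh(w * x + u * h)
        states.append(h)
    return states

def bidirectional_encode(seq):
    forward = rnn_pass(seq)               # left-to-right: first feature vector a
    backward = rnn_pass(seq[::-1])[::-1]  # right-to-left: second feature vector b
    # combine by element-wise averaging, one of the manners named in step S202
    return [(a + b) / 2 for a, b in zip(forward, backward)]
```

For a palindromic input the averaged states are symmetric, which makes the two-direction combination easy to check.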
Referring to fig. 3, in some embodiments, step S201 may include, but is not limited to, step S301 to step S303:
step S301, performing word segmentation processing on original sentence data to obtain an original word segment;
step S302, carrying out word embedding processing on the original word segment to obtain an original word embedding vector;
and step S303, splicing the original word embedded vectors to obtain sentence embedded vectors.
In step S301 of some embodiments, a preset Jieba tokenizer is used to segment the original sentence data. Specifically, the dictionary file of the Jieba tokenizer is first loaded to obtain each word and its occurrence count; the dictionary file is then traversed, a directed acyclic graph of all possible segmentations of the original sentence data is constructed through full character string matching, and the maximum probability over all paths from each Chinese character node in the original sentence data to the end of the sentence is calculated, the end position of the corresponding Chinese character segment in the directed acyclic graph being recorded together with the maximum probability. Finally, the core word segments of the original sentence data are cut out according to the node paths to obtain the original word segments. In addition, original sentence data having no corresponding word in the dictionary file may be processed by a statistical method to obtain original word segments.
In step S302 of some embodiments, a word embedding process is performed on the original word segment, so as to implement mapping of the original word segment from a semantic space to a vector space, and obtain an original word embedding vector.
In step S303 of some embodiments, the original word embedding vectors are spliced according to the sequence of the original word segments in the entire original sentence data, so as to obtain a complete long vector, which is the sentence embedding vector.
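Steps S301 to S303 can be sketched end to end with a toy deterministic embedding table in place of a trained one; the hash-style embedding and the 4-dimensional vectors are illustrative assumptions, and a real system would look the segments up in a learned embedding matrix.

```python
def embed_word(word, dim=4):
    """Toy deterministic word embedding, a stand-in for a trained embedding table."""
    seed = sum(ord(c) for c in word)
    return [((seed * (i + 3)) % 97) / 97.0 for i in range(dim)]

def sentence_embedding(segments, dim=4):
    # splice the word embedding vectors in the order the segments appear in
    # the original sentence, yielding one long sentence embedding vector
    vec = []
    for seg in segments:
        vec.extend(embed_word(seg, dim))
    return vec
```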
Referring to fig. 4, in some embodiments, step S103 may include, but is not limited to, step S401 to step S402:
step S401, predicting the position of initial sentence data according to a preset first function and a BIO label to obtain a question position label and an answer position label of the initial sentence data;
step S402, the initial sentence data is segmented according to the question position labels to obtain target question data, and the initial sentence data is segmented according to the answer position labels to obtain candidate answer data.
In step S401 of some embodiments, when the position of the initial sentence data is predicted according to the preset first function, the first function may adopt a classification function such as a softmax function, and a joint labeling manner, namely the BIO labeling manner, may also be introduced, in which different sentence portions of each initial sentence are labeled with different tags. In BIO labeling, B-X generally means that the segment where the portion is located belongs to X and is at a beginning position, I-X generally means that the segment where the portion is located belongs to X and is at an intermediate position, and O generally means that the portion does not belong to any labeled segment.
Specifically, according to the BIO labeling manner, preset BIO tags are set as question position tags and answer position tags, wherein the question position tags include a question start tag Bq, a question intermediate tag Iq, and a question end tag Oq, and the answer position tags include an answer start tag Ba, an answer intermediate tag Ia, and an answer end tag Oa. Position probability calculation is performed on the initial sentence data through a softmax function to obtain the probability distribution (namely a position probability vector) of each preset position tag corresponding to the initial sentence data. The probability distribution reflects the degree of matching between a sentence fragment of the initial sentence data and a preset position tag: a larger position probability vector indicates a higher probability that the sentence fragment belongs to that preset position tag. Therefore, for each sentence fragment, the preset position tag corresponding to the position probability vector with the maximum value is selected as its position tag, thereby obtaining the question position tags and the answer position tags of the initial sentence data.
It should be noted that each target question includes question content and a question position tag, the question position tag is used for marking the position of the question content, each candidate answer includes answer content and an answer position tag, and the answer position tag is used for marking the position of the answer content.
For example, consider an initial sentence: all articles have advantages and disadvantages; please, with respect to the content and results of this text, point out the advantages and disadvantages of the text. A further question: what is the core idea of the article? According to the first function and the preset position tags, the sentence can be split into sentence fragments, and each sentence fragment together with its position tag forms a labeled pair of the form <sentence fragment n, position tag n>, namely <all articles have advantages and disadvantages, Bq1>, <please, with respect to the content and results of this text, Iq1>, <point out the advantages and disadvantages of the text, Iq2>, <a further question, Oq1>, <what is the core idea of the article?, Bq2>. In this way, it can be obtained intuitively that the target questions in the initial sentence are "please, with respect to the content and results of this text, point out the advantages and disadvantages of the text" and "what is the core idea of the article?".
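The tag prediction of step S401 can be sketched as an argmax over a per-fragment softmax distribution; the fragment scores below are made-up inputs standing in for the output of a trained classifier, and the tag inventory follows the Bq/Iq/Oq/Ba/Ia/Oa scheme described above.

```python
import math

POSITION_TAGS = ["Bq", "Iq", "Oq", "Ba", "Ia", "Oa"]

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def predict_position_tags(fragment_scores):
    """Pick the preset position tag with the maximum probability for each fragment."""
    tagged = []
    for fragment, scores in fragment_scores:
        probs = softmax(scores)  # the position probability vector
        tagged.append((fragment, POSITION_TAGS[probs.index(max(probs))]))
    return tagged
```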
Referring to fig. 5, in some embodiments, step S402 may include, but is not limited to, steps S501 to S502:
step S501, extracting a problem starting label and a problem ending label in problem position labels, and segmenting initial sentence data according to the problem starting label and the problem ending label to obtain target problem data;
step S502, extracting an answer start tag and an answer end tag from the answer position tags, and segmenting the initial sentence data according to the answer start tag and the answer end tag to obtain candidate answer data.
In step S501 of some embodiments, the question position tags include a question start tag Bq, a question intermediate tag Iq, and a question end tag Oq. Since a target question may be formed between the question start tag Bq and the question end tag Oq, the initial sentence data is segmented according to the question start tag Bq and the question end tag Oq, and the sentence fragment between the question start tag Bq and the question end tag Oq is intercepted to obtain target question data.
In step S502 of some embodiments, the answer position tags include an answer start tag Ba, an answer intermediate tag Ia, and an answer end tag Oa. Since a candidate answer may be formed between the answer start tag Ba and the answer end tag Oa, the initial sentence data is segmented according to the answer start tag Ba and the answer end tag Oa, and the sentence fragment between the answer start tag Ba and the answer end tag Oa is intercepted to obtain candidate answer data.
Through the steps S501 to S502, while the context semantic information of the original sentence is retained, the initial sentence data can be split into a plurality of individual target questions and candidate answers according to the question position tags and the answer position tags, so that the matching accuracy of the questions and the answers is improved.
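The interception of steps S501 and S502 can be sketched as a single span-extraction helper that serves both the question tags (Bq/Oq) and the answer tags (Ba/Oa); the helper name and the space-joined output are illustrative assumptions.

```python
def extract_spans(tagged_fragments, start_tag, end_tag):
    """Collect the fragments between a start tag and the matching end tag."""
    spans, current, active = [], [], False
    for fragment, tag in tagged_fragments:
        if tag == start_tag:
            current, active = [fragment], True
        elif active and tag == end_tag:
            spans.append(" ".join(current))  # the end-tag fragment itself is excluded
            current, active = [], False
        elif active:
            current.append(fragment)
    if active and current:  # a span still open at the end of the sentence
        spans.append(" ".join(current))
    return spans
```

Calling `extract_spans(tagged, "Bq", "Oq")` would yield target question data and `extract_spans(tagged, "Ba", "Oa")` candidate answer data.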
In step S104 of some embodiments, when an initial question-answer pair is constructed according to the target question and the candidate answers, the target question and each candidate answer are paired to form a one-to-many mapping relationship between the target question and the candidate answers, so as to obtain an initial question-answer pair, where each initial question-answer pair includes one target question and one candidate answer.
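The pairing in step S104 amounts to a Cartesian product of target questions and candidate answers; the sketch below assumes plain strings for both, and the function name is illustrative.

```python
from itertools import product

def build_initial_pairs(target_questions, candidate_answers):
    """Pair every target question with every candidate answer (one-to-many),
    so each resulting initial pair holds one question and one answer."""
    return [(q, a) for q, a in product(target_questions, candidate_answers)]
```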
Before step S105 of some embodiments, the question-answer matching method further includes training a question-answer matching model in advance. The question-answer matching model may be constructed based on a RoBERTa model and includes a coding layer and a linear layer, where the coding layer is mainly used to encode the input question-answer pairs and capture their CLS features, and the linear layer is mainly used to perform probability calculation on the CLS features to determine the degree of correlation between the question and the answer in each question-answer pair. When the question-answer matching model is trained, sample question-answer pairs are input into the question-answer matching model, and the model loss is calculated through a loss function of the question-answer matching model, where the loss function may be a common cross-entropy loss function; meanwhile, a gradient descent method may be adopted to back-propagate the model loss, and the model parameters of the question-answer matching model are adjusted according to the model loss so as to train the question-answer matching model.
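The training loop can be illustrated with a toy linear matcher trained by cross-entropy gradient descent, standing in for the RoBERTa encoder plus linear layer; the two-class sigmoid formulation, the feature vectors, the learning rate, and the epoch count are all illustrative assumptions rather than the embodiment's actual configuration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_matcher(samples, epochs=200, lr=0.5):
    """Fit a toy linear related/unrelated matcher with binary cross-entropy.

    samples: list of (feature_vector, label) pairs, label 1 = related, 0 = not.
    """
    dim = len(samples[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in samples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = p - y  # gradient of binary cross-entropy w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b
```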
Referring to fig. 6, in some embodiments, step S105 includes, but is not limited to, steps S601 to S602:
step S601, encoding the initial question-answer pair through a question-answer matching model to obtain a question-answer encoding vector;
step S602, carrying out normalization processing on the question-answer encoding vector to obtain question-answer semantic features, wherein the question-answer semantic features are representation features used for representing context semantic information of sentences.
In step S601 of some embodiments, the initial question-answer pair may be subjected to word embedding processing by a transformer algorithm of a question-answer matching model, so as to implement mapping of the initial question-answer pair from a semantic space to a vector space, and the relationship of the initial question-answer pair in the semantic space can be maintained in the vector space, thereby obtaining a question-answer encoding vector in an embedded form.
In step S602 of some embodiments, the question-answer encoding vector is normalized by the question-answer matching model, and the question-answer encoding vector is converted from a dimensional form to a question-answer semantic feature in a dimensionless representation form, where the question-answer semantic feature is a CLS feature that can be used to characterize semantic context information of a question-answer pair in a question-answer matching scenario.
In some embodiments, the specific process of encoding and normalizing the initial question-answer pair by the question-answer matching model may be represented as shown in formula (1):
h_cls = BERT(X)    formula (1)
where h_cls is the question-answer semantic feature, X is the input initial question-answer pair, and BERT denotes the operation process of encoding and normalization.
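A minimal sketch of extracting and normalizing the CLS representation, assuming the encoder returns one vector per token with the [CLS] position first and taking L2 normalization as the dimensionless form; a real implementation would obtain the token vectors from the question-answer matching model itself.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length, one common dimensionless normalization."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cls_feature(token_vectors):
    # the first output position ([CLS]) summarizes the whole question-answer
    # pair; normalizing it yields the question-answer semantic feature h_cls
    return l2_normalize(token_vectors[0])
```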
Referring to fig. 7, in some embodiments, step S106 may include, but is not limited to, step S701 to step S702:
step S701, performing matching probability calculation on the question-answer semantic features through a second function of the question-answer matching model to obtain a question-answer matching value;
step S702, the initial question-answer pair with the maximum question-answer matching value is used as a target question-answer pair.
In step S701 of some embodiments, the second function may be a classification function such as a softmax function or a sigmoid function. Taking the softmax function as an example, matching probability calculation is performed on the question-answer semantic features through the softmax function to obtain the probability distribution of each question-answer semantic feature over preset classification tags, and this probability distribution is the question-answer matching value. If the question-answer matching value is greater than a preset matching threshold, the target question and the candidate answer of the initial question-answer pair corresponding to the question-answer semantic feature are related; if the question-answer matching value is less than or equal to the preset matching threshold, they are not related. In this way, the degree of correlation between the target question and the candidate answer of each initial question-answer pair can be conveniently obtained.
Further, in order to represent whether the target question of the initial question-answer pair is related to the candidate answer, the initial question-answer pair may be classified according to a question-answer matching value and a preset matching threshold, and the classification label of the initial question-answer pair with the question-answer matching value greater than the preset matching threshold is labeled as related and is represented by numeral 1; and labeling the classification labels of the initial question-answer pairs with the question-answer matching values smaller than or equal to the preset matching threshold as irrelevant, and representing the irrelevant labels by using a numeral 0.
In step S702 of some embodiments, since a larger question-answer matching value indicates a higher degree of correlation between the target question of the initial question-answer pair and the candidate answer, in the initial question-answer pair with the classification label of 1, the initial question-answer pair with the largest question-answer matching value is selected as the target question-answer pair, where the target question-answer pair includes the target question and the target answer corresponding to the target question.
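Steps S701 and S702 reduce to thresholding the matching values and keeping the maximum; the sketch below assumes the matching values have already been computed, and the helper name and the 0.5 default threshold are illustrative.

```python
def select_target_pair(scored_pairs, threshold=0.5):
    """Keep pairs labeled related (matching value above the threshold, i.e.
    classification label 1) and return the one with the maximum value.

    scored_pairs: list of ((question, answer), matching_value) tuples.
    """
    related = [(pair, score) for pair, score in scored_pairs if score > threshold]
    if not related:
        return None  # no initial question-answer pair was labeled related
    return max(related, key=lambda ps: ps[1])[0]
```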
According to the question-answer matching method, original sentence data to be processed are obtained; the original sentence data is subjected to context feature extraction to obtain the initial sentence data, so that the context semantic information of the original sentence can be well reserved, and the semantic integrity of the original sentence is improved. Further, carrying out sequence labeling on the initial sentence data to obtain target question data and candidate answer data; the target question data comprises at least one target question, the candidate answer data comprises at least one candidate answer, each candidate answer is used for answering one of the target questions, the target question and the candidate answer in the initial sentence data can be split into independent text data according to the labeling information, therefore, the target question and the candidate answer are paired to construct initial question-answer pairs, and each initial question-answer pair comprises one target question and one candidate answer. Finally, performing feature extraction on the initial question-answer pairs through a preset question-answer matching model to obtain question-answer semantic features, thereby extracting question-answer characterization information of the initial question-answer pairs, performing matching probability calculation on the question-answer semantic features to obtain question-answer matching values, and performing screening processing on the initial question-answer pairs according to the question-answer matching values to obtain target question-answer pairs, wherein the target question-answer pairs comprise target questions and target answers corresponding to the target questions, and the method can better capture complete semantic information of the target questions and candidate answers, improve the matching effect of the target questions and the target answers, and improve the matching accuracy of the questions and the answers.
Referring to fig. 8, an embodiment of the present application further provides a question-answer matching device, which can implement the question-answer matching method described above, and the device includes:
an obtaining module 801, configured to obtain original sentence data to be processed;
a first feature extraction module 802, configured to perform context feature extraction on original sentence data to obtain initial sentence data;
a sequence labeling module 803, configured to perform sequence labeling on the initial sentence data to obtain target question data and candidate answer data; the target question data comprises at least one target question, the candidate answer data comprises at least one candidate answer, and each candidate answer is used for answering one target question;
a construction module 804, configured to construct an initial question-answer pair according to the target question and the candidate answer;
a second feature extraction module 805, configured to perform feature extraction on the initial question-answer pair through a preset question-answer matching model, to obtain question-answer semantic features;
the calculating module 806 is configured to perform matching probability calculation on the question-answer semantic features through the question-answer matching model to obtain a question-answer matching value, and perform screening processing on the initial question-answer pair according to the question-answer matching value to obtain a target question-answer pair, where the target question-answer pair includes a target question and a target answer corresponding to the target question.
In some embodiments, the first feature extraction module 802 comprises:
the embedding unit is used for carrying out feature embedding processing on original sentence data to obtain a sentence embedding vector;
and the extraction unit is used for extracting the context characteristics of the sentence embedding vector through a preset attention mechanism model to obtain initial sentence data.
In some embodiments, the embedding unit comprises:
the word segmentation subunit is used for carrying out word segmentation processing on the original sentence data to obtain an original word segment;
the word embedding subunit is used for carrying out word embedding processing on the original word segment to obtain an original word embedding vector;
and the splicing subunit is used for splicing the original word embedded vectors to obtain sentence embedded vectors.
In some embodiments, the sequence annotation module 803 includes:
the position prediction unit is used for predicting the position of the initial sentence data according to a preset first function and the BIO label to obtain a question position label and an answer position label of the initial sentence data;
and the segmentation unit is used for segmenting the initial sentence data according to the question position tags to obtain target question data, and segmenting the initial sentence data according to the answer position tags to obtain candidate answer data.
In some embodiments, the segmentation unit comprises:
the first segmentation subunit is used for extracting a problem starting label and a problem ending label from the problem position labels, and segmenting the initial sentence data according to the problem starting label and the problem ending label to obtain target problem data;
and the second segmentation subunit is used for extracting an answer start tag and an answer end tag from the answer position tags, and segmenting the initial sentence data according to the answer start tag and the answer end tag to obtain the candidate answer data.
In some embodiments, the second feature extraction module 805 comprises:
the encoding unit is used for encoding the initial question-answer pair through the question-answer matching model to obtain a question-answer encoding vector;
and the normalization unit is used for performing normalization processing on the question-answer encoding vector to obtain question-answer semantic features, wherein the question-answer semantic features are characterization features used for characterizing the context semantic information of the sentence.
In some embodiments, the calculation module 806 includes:
the probability calculation unit is used for performing matching probability calculation on the question-answer semantic features through a second function of the question-answer matching model to obtain a question-answer matching value;
and the screening unit is used for taking the initial question-answer pair with the maximum question-answer matching value as a target question-answer pair.
The specific implementation of the question-answer matching device is basically the same as the specific implementation of the question-answer matching method, and is not described herein again.
An embodiment of the present application further provides an electronic device, where the electronic device includes: the system comprises a memory, a processor, a program stored on the memory and capable of running on the processor, and a data bus for realizing connection communication between the processor and the memory, wherein the program realizes the question-answer matching method when being executed by the processor. The electronic equipment can be any intelligent terminal including a tablet computer, a vehicle-mounted computer and the like.
Referring to fig. 9, fig. 9 illustrates a hardware structure of an electronic device according to another embodiment, where the electronic device includes:
the processor 901 may be implemented by a general-purpose CPU (central processing unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute a relevant program to implement the technical solution provided in the embodiment of the present application;
the memory 902 may be implemented in the form of a Read Only Memory (ROM), a static storage device, a dynamic storage device, or a Random Access Memory (RAM). The memory 902 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 902 and called by the processor 901 to execute the question-answer matching method of the embodiments of the present application;
an input/output interface 903 for implementing information input and output;
a communication interface 904, configured to implement communication interaction between the device and another device, where communication may be implemented in a wired manner (e.g., USB, network cable, etc.), or in a wireless manner (e.g., mobile network, WIFI, bluetooth, etc.);
a bus 905 that transfers information between various components of the device (e.g., the processor 901, the memory 902, the input/output interface 903, and the communication interface 904);
wherein the processor 901, the memory 902, the input/output interface 903 and the communication interface 904 are communicatively connected to each other within the device via a bus 905.
The embodiment of the present application further provides a storage medium, which is a computer-readable storage medium for computer-readable storage, where the storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement the above question and answer matching method.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The question and answer matching method, the question and answer matching device, the electronic equipment and the storage medium provided by the embodiment of the application acquire original sentence data to be processed; the context characteristics of the original sentence data are extracted to obtain the initial sentence data, so that the context semantic information of the original sentence can be well kept, and the semantic integrity of the original sentence is improved. Further, carrying out sequence labeling on the initial sentence data to obtain target question data and candidate answer data; the target question data comprises at least one target question, the candidate answer data comprises at least one candidate answer, each candidate answer is used for answering one target question, the target question and the candidate answer in the initial sentence data can be split into independent text data according to the labeling information, therefore, the target question and the candidate answer are paired to construct initial question-answer pairs, and each initial question-answer pair comprises one target question and one candidate answer. 
Finally, performing feature extraction on the initial question-answer pairs through a preset question-answer matching model to obtain question-answer semantic features, thereby extracting question-answer characterization information of the initial question-answer pairs, performing matching probability calculation on the question-answer semantic features to obtain question-answer matching values, and performing screening processing on the initial question-answer pairs according to the question-answer matching values to obtain target question-answer pairs, wherein the target question-answer pairs comprise target questions and target answers corresponding to the target questions, and the method can better capture complete semantic information of the target questions and candidate answers, improve the matching effect of the target questions and the target answers, and improve the matching accuracy of the questions and the answers.
The embodiments described in the embodiments of the present application are for more clearly illustrating the technical solutions of the embodiments of the present application, and do not constitute a limitation to the technical solutions provided in the embodiments of the present application, and it is obvious to those skilled in the art that the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems with the evolution of technology and the emergence of new application scenarios.
It will be appreciated by those skilled in the art that the solutions shown in fig. 1-7 are not intended to limit the embodiments of the present application and may include more or fewer steps than those shown, or some of the steps may be combined, or different steps may be included.
The above described embodiments of the apparatus are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, and functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like (if any) in the description of the present application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the application described herein may be implemented in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that, in this application, "at least one" means one or more and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: only A, only B, or both A and B, where A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "At least one of the following" and similar expressions refer to any combination of the listed items, including any combination of single items or plural items. For example, "at least one of a, b, or c" may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c may each be singular or plural.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division of units described above is only one type of logical function division, and other divisions may be used in practice; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes multiple instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing programs, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings, and the scope of the claims of the embodiments of the present application is not limited thereto. Any modifications, equivalents and improvements that may occur to those skilled in the art without departing from the scope and spirit of the embodiments of the present application are intended to be within the scope of the claims of the embodiments of the present application.

Claims (10)

1. A question-answer matching method, characterized in that the method comprises:
acquiring original sentence data to be processed;
extracting the context characteristics of the original sentence data to obtain initial sentence data;
carrying out sequence labeling on the initial sentence data to obtain target question data and candidate answer data; the target question data comprises at least one target question, the candidate answer data comprises at least one candidate answer, and each candidate answer is used for solving one target question;
constructing an initial question-answer pair according to the target question and the candidate answers;
performing feature extraction on the initial question-answer pair through a preset question-answer matching model to obtain question-answer semantic features;
and performing matching probability calculation on the question-answer semantic features through the question-answer matching model to obtain a question-answer matching value, and screening the initial question-answer pair according to the question-answer matching value to obtain a target question-answer pair, wherein the target question-answer pair comprises a target question and a target answer corresponding to the target question.
2. The question-answer matching method according to claim 1, wherein the step of extracting the context feature of the original sentence data to obtain the initial sentence data comprises:
performing feature embedding processing on the original sentence data to obtain a sentence embedding vector;
and extracting context characteristics of the sentence embedding vector through a preset attention mechanism model to obtain the initial sentence data.
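The claims do not fix the architecture of the "preset attention mechanism model" in claim 2; as one hedged reading, a minimal scaled dot-product self-attention over the sentence embedding vectors (with identity projections for brevity; a real model would learn Q/K/V projections) could look like:

```python
import numpy as np

def self_attention(x):
    # x: (seq_len, dim) matrix of token embeddings for one sentence.
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax per token
    return weights @ x                               # context-mixed token vectors

emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
ctx = self_attention(emb)
print(ctx.shape)  # (3, 2)
```

Each output row mixes information from the whole sentence, which is one way the "context characteristics" of claim 2 could be realized.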
3. The question-answer matching method according to claim 2, wherein the step of performing feature embedding processing on the original sentence data to obtain a sentence embedding vector comprises:
performing word segmentation processing on the original sentence data to obtain an original word segment;
carrying out word embedding processing on the original word segment to obtain an original word embedding vector;
and splicing the original word embedded vectors to obtain the sentence embedded vectors.
4. The question-answer matching method according to claim 1, wherein the step of performing sequence labeling on the initial sentence data to obtain target question data and candidate answer data comprises:
predicting the position of the initial sentence data according to a preset first function and a BIO label to obtain a question position label and an answer position label of the initial sentence data;
and segmenting the initial sentence data according to the question position tags to obtain the target question data, and segmenting the initial sentence data according to the answer position tags to obtain the candidate answer data.
5. The question-answer matching method according to claim 4, wherein the step of segmenting the initial sentence data according to the question position tags to obtain the target question data, and segmenting the initial sentence data according to the answer position tags to obtain the candidate answer data comprises:
extracting a problem starting label and a problem ending label in the problem position label, and segmenting the initial sentence data according to the problem starting label and the problem ending label to obtain the target problem data;
and extracting an answer starting label and an answer ending label in the answer position labels, and segmenting the initial sentence data according to the answer starting label and the answer ending label to obtain the candidate answer data.
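Claims 4-5 name BIO labels and start/end labels without fixing a tag set; as an illustrative sketch under the assumed scheme B-Q/I-Q for question tokens and B-A/I-A for answer tokens, spans can be cut out of the labeled sentence like this:

```python
# Recover question and answer spans from per-token BIO labels.
# A B-{prefix} label opens a span (start label); consecutive I-{prefix}
# labels extend it; any other label closes it (end boundary).

def extract_spans(tokens, labels, prefix):
    spans, current = [], []
    for tok, lab in zip(tokens, labels):
        if lab == f"B-{prefix}":
            if current:
                spans.append(" ".join(current))
            current = [tok]
        elif lab == f"I-{prefix}" and current:
            current.append(tok)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans

tokens = ["when", "is", "renewal", "renewal", "is", "in", "june"]
labels = ["B-Q", "I-Q", "I-Q", "B-A", "I-A", "I-A", "I-A"]
questions = extract_spans(tokens, labels, "Q")  # ['when is renewal']
answers = extract_spans(tokens, labels, "A")    # ['renewal is in june']
print(questions, answers)
```

The predicted labels would come from the first function of claim 4; only the segmentation-by-position-label step is shown here.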
6. The question-answer matching method according to claim 1, wherein the step of performing feature extraction on the initial question-answer pair through a preset question-answer matching model to obtain question-answer semantic features comprises:
coding the initial question-answer pair through the question-answer matching model to obtain a question-answer coding vector;
and carrying out normalization processing on the question-answer encoding vector to obtain question-answer semantic features, wherein the question-answer semantic features are characterization features used for characterizing sentence context semantic information.
7. The question-answer matching method according to any one of claims 1 to 6, wherein the step of performing matching probability calculation on the question-answer semantic features through the question-answer matching model to obtain a question-answer matching value, and performing screening processing on the initial question-answer pair according to the question-answer matching value to obtain a target question-answer pair comprises:
performing matching probability calculation on the question-answer semantic features through a second function of the question-answer matching model to obtain a question-answer matching value;
and taking the initial question-answer pair with the maximum question-answer matching value as the target question-answer pair.
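The "second function" of claim 7 is not specified; assuming it is a softmax over raw pair scores (with the scores themselves hard-coded here in place of real model outputs), the matching-value computation and maximum-value screening could be sketched as:

```python
import math

def softmax(scores):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

pairs = [("Q1", "A1"), ("Q1", "A2"), ("Q1", "A3")]
raw_scores = [0.2, 2.1, -0.5]                  # toy stand-ins for model outputs
match_values = softmax(raw_scores)             # question-answer matching values
target_pair = pairs[match_values.index(max(match_values))]
print(target_pair)  # ('Q1', 'A2')
```

Because softmax is monotonic, taking the pair with the maximum matching value is equivalent to taking the pair with the maximum raw score; the probabilities simply make the values comparable across pairs.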
8. A question-answer matching apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring original sentence data to be processed;
the first characteristic extraction module is used for extracting the context characteristics of the original sentence data to obtain initial sentence data;
the sequence marking module is used for carrying out sequence marking on the initial sentence data to obtain target question data and candidate answer data; the target question data comprises at least one target question, the candidate answer data comprises at least one candidate answer, and each candidate answer is used for solving one of the target questions;
the construction module is used for constructing an initial question-answer pair according to the target question and the candidate answer;
the second feature extraction module is used for performing feature extraction on the initial question-answer pair through a preset question-answer matching model to obtain question-answer semantic features;
and the calculation module is used for performing matching probability calculation on the question-answer semantic features through the question-answer matching model to obtain a question-answer matching value, and screening the initial question-answer pair according to the question-answer matching value to obtain a target question-answer pair, wherein the target question-answer pair comprises a target question and a target answer corresponding to the target question.
9. An electronic device, characterized in that the electronic device comprises a memory, a processor, a program stored on the memory and executable on the processor, and a data bus for enabling connection communication between the processor and the memory, the program, when executed by the processor, implementing the steps of the question-answer matching method according to any one of claims 1 to 7.
10. A storage medium that is a computer-readable storage medium for computer-readable storage, characterized in that the storage medium stores one or more programs that are executable by one or more processors to implement the steps of the question-answer matching method according to any one of claims 1 to 7.
CN202210687408.5A 2022-06-17 2022-06-17 Question-answer matching method, question-answer matching device, electronic equipment and storage medium Pending CN115033674A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210687408.5A CN115033674A (en) 2022-06-17 2022-06-17 Question-answer matching method, question-answer matching device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210687408.5A CN115033674A (en) 2022-06-17 2022-06-17 Question-answer matching method, question-answer matching device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115033674A true CN115033674A (en) 2022-09-09

Family

ID=83125249

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210687408.5A Pending CN115033674A (en) 2022-06-17 2022-06-17 Question-answer matching method, question-answer matching device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115033674A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116204726A (en) * 2023-04-28 2023-06-02 杭州海康威视数字技术股份有限公司 Data processing method, device and equipment based on multi-mode model


Similar Documents

Publication Publication Date Title
CN113704428B (en) Intelligent inquiry method, intelligent inquiry device, electronic equipment and storage medium
CN114358007A (en) Multi-label identification method and device, electronic equipment and storage medium
CN114519356B (en) Target word detection method and device, electronic equipment and storage medium
CN114722069A (en) Language conversion method and device, electronic equipment and storage medium
CN114897060B (en) Training method and device for sample classification model, and sample classification method and device
CN114722826B (en) Model training method and device, electronic equipment and storage medium
CN114240552A (en) Product recommendation method, device, equipment and medium based on deep clustering algorithm
CN114416995A (en) Information recommendation method, device and equipment
CN115510232A (en) Text sentence classification method and classification device, electronic equipment and storage medium
CN114841146A (en) Text abstract generation method and device, electronic equipment and storage medium
CN114613462A (en) Medical data processing method and device, electronic equipment and storage medium
CN114637847A (en) Model training method, text classification method and device, equipment and medium
CN114358020A (en) Disease part identification method and device, electronic device and storage medium
CN115033674A (en) Question-answer matching method, question-answer matching device, electronic equipment and storage medium
CN116702743A (en) Text similarity detection method and device, electronic equipment and storage medium
CN116341553A (en) Named entity recognition method and device, electronic equipment and storage medium
CN114722774B (en) Data compression method, device, electronic equipment and storage medium
CN114398903B (en) Intention recognition method, device, electronic equipment and storage medium
CN115730051A (en) Text processing method and device, electronic equipment and storage medium
CN115795007A (en) Intelligent question-answering method, intelligent question-answering device, electronic equipment and storage medium
CN114936274A (en) Model training method, dialogue generating device, dialogue training equipment and storage medium
CN115270746A (en) Question sample generation method and device, electronic equipment and storage medium
CN114090778A (en) Retrieval method and device based on knowledge anchor point, electronic equipment and storage medium
CN114998041A (en) Method and device for training claim settlement prediction model, electronic equipment and storage medium
CN115204300A (en) Data processing method, device and storage medium for text and table semantic interaction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination