CN113159187A - Classification model training method and device, and target text determining method and device - Google Patents

Classification model training method and device, and target text determining method and device

Info

Publication number
CN113159187A
Authority
CN
China
Prior art keywords
sample
text
question
target
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110442474.1A
Other languages
Chinese (zh)
Inventor
戴淑敏
李长亮
李小龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Software Co Ltd
Beijing Kingsoft Digital Entertainment Co Ltd
Original Assignee
Beijing Kingsoft Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Software Co Ltd filed Critical Beijing Kingsoft Software Co Ltd
Priority to CN202110442474.1A
Publication of CN113159187A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/335 Filtering based on additional data, e.g. user or group profiles
    • G06F 16/35 Clustering; Classification
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/24 Classification techniques
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The method comprises the steps of acquiring a target question, inputting the target question into a search database, and obtaining at least one initial text corresponding to the target question; and inputting the target question and the at least one initial text into a classification model to obtain the probability that the at least one initial text contains a target answer corresponding to the target question. The target question is first input into the search database, and a plurality of initial texts corresponding to the target question are obtained through the search database, realizing a coarse first-stage text recall; the initial texts recalled in the first stage are then further screened through a pre-trained classification model, so that more accurate target texts most relevant to the target question are screened out from the plurality of initial texts.

Description

Classification model training method and device, and target text determining method and device
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a classification model training method and apparatus, a target text determination method and apparatus, a computing device, and a computer-readable storage medium.
Background
In the technical field of information retrieval, common text recall methods mainly comprise text matching recall, label recall and semantic recall. Text matching recall is based on a keyword statistical analysis method, Term Frequency-Inverse Document Frequency (TF-IDF): the text (Doc) most relevant to the keywords in a user's question sentence (Query) is matched in a corpus. Label recall matches the most relevant recall texts according to the labels of the texts in the corpus. Semantic recall calculates the text most relevant to the question sentence through semantic similarity; the common semantic matching recall is mainly representation-based semantic matching, in which the user's question sentence and the text are each represented as a semantic vector, and the semantic similarity between the semantic vector of the question sentence and that of the text is then calculated.
However, the semantic vectors learned by this semantic matching recall method have limitations: there is no interaction between the question sentence and the recalled text, and no context information is considered, so the matching precision is not high. Therefore, how to improve the matching precision between the question sentence and the recalled text has become a problem to be solved urgently.
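For concreteness, the following is a minimal sketch (not part of the patent; the corpus, query and all names are invented for illustration) of the representation-based recall described above, using TF-IDF vectors and cosine similarity. Each text is encoded independently of the query, which is exactly the lack of interaction criticized here; replacing the TF-IDF vectors with neural sentence embeddings yields the common representation-based semantic recall, with the same limitation.

```python
# Minimal sketch of representation-based recall: TF-IDF vectors + cosine
# similarity. Each text is encoded independently of the query, so there is
# no query-text interaction -- the limitation the application addresses.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Demand structure and supply conditions of the economy have changed ...",
    "Annual rainfall statistics for the year 2016 ...",
]
query = "Why did economic growth slow down from 2012 to 2016?"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(corpus)      # one vector per text
query_vector = vectorizer.transform([query])        # query encoded alone

scores = cosine_similarity(query_vector, doc_vectors)[0]
ranked = sorted(zip(scores, corpus), reverse=True)  # most similar first
print(ranked[0])
```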
Disclosure of Invention
In view of the above, embodiments of the present application provide a classification model training method and apparatus, a target text determination method and apparatus, a computing device, and a computer-readable storage medium, so as to solve technical defects in the prior art.
According to a first aspect of embodiments of the present application, there is provided a classification model training method, including:
acquiring a training data set, wherein the training data set comprises a sample question and a sample answer corresponding to the sample question;
constructing a training sample corresponding to the sample question based on the sample question and a sample text of the sample question obtained by searching a database;
and training a classification model based on the training samples and the sample labels corresponding to the training samples to obtain a trained classification model.
According to a second aspect of the embodiments of the present application, there is provided a target text determination method, including:
acquiring a target question, inputting the target question into a search database, and acquiring at least one initial text corresponding to the target question;
inputting the target question and the at least one initial text into a classification model, and obtaining the probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method;
and determining a target text containing a target answer corresponding to the target question from the at least one initial text based on the probability.
According to a third aspect of the embodiments of the present application, there is provided a classification model training apparatus, including:
a training data acquisition module configured to acquire a training data set, wherein the training data set includes a sample question and a sample answer corresponding to the sample question;
a training sample construction module configured to construct a training sample corresponding to the sample question based on the sample question and a sample text of the sample question obtained by searching a database;
and a model training module configured to train a classification model based on the training samples and the sample labels corresponding to the training samples to obtain a trained classification model.
According to a fourth aspect of embodiments of the present application, there is provided a target text determination apparatus, including:
the question acquisition module is configured to acquire a target question and input the target question into a search database to acquire at least one initial text corresponding to the target question;
a probability obtaining module configured to input the target question and the at least one initial text into a classification model, and obtain a probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method;
a text determination module configured to determine a target text containing a target answer corresponding to the target question from the at least one initial text based on the probability.
According to a fifth aspect of embodiments of the present application, there is provided a computing device comprising a memory, a processor and computer instructions stored on the memory and executable on the processor, the processor executing the steps of the classification model training method or the steps of the target text determination method when the computer instructions are executed.
According to a sixth aspect of embodiments of the present application, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the classification model training method or the steps of the target text determination method.
The target text determination method comprises: acquiring a target question, inputting the target question into a search database, and obtaining at least one initial text corresponding to the target question; inputting the target question and the at least one initial text into a classification model, and obtaining the probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method; and determining, from the at least one initial text based on the probability, a target text containing the target answer corresponding to the target question. The target question is first input into the search database, and a plurality of initial texts corresponding to the target question are obtained through the search database, realizing a coarse first-stage text recall; the initial texts recalled in the first stage are then further screened through a pre-trained classification model, so that more accurate target texts most relevant to the target question are screened out from the plurality of initial texts.
In addition, when the initial texts recalled in the first stage are screened through the classification model, the target question is spliced with each initial text and input into the classification model. The classification model calculates the similarity between the word vector at each position of the spliced text and the other word vectors in the text, which is equivalent to an interactive calculation between every two word vectors in the spliced text; by referring to the features of the word vectors at all positions around each word vector, the context information of the spliced text is taken into account, the matching precision between the target question and the initial texts is improved, and the target text corresponding to the target question can be obtained more accurately.
Drawings
FIG. 1 is a block diagram of a computing device provided by an embodiment of the present application;
FIG. 2 is a flowchart of a classification model training method according to an embodiment of the present application;
FIG. 3 is a flowchart of a target text determination method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a classification model training apparatus according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a target text determination apparatus according to an embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. This application, however, can be implemented in many ways other than those described herein, and those skilled in the art can make similar modifications without departing from the spirit of this application; the application is therefore not limited to the specific implementations disclosed below.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used herein in one or more embodiments to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first can also be referred to as a second and, similarly, a second can also be referred to as a first without departing from the scope of one or more embodiments of the present description.
First, the terms involved in one or more embodiments of the present application are explained.
Embedding: namely, embedded representation. Word embedding is an essential link for a computer to process text: input natural-language symbols are mapped through a numerical matrix to vectors of fixed length, so that a complex text problem is converted into a mathematical problem.
Transformer model: a neural network model based on the attention mechanism for solving sequence problems, mainly divided into an encoder and a decoder; the encoder and decoder are similar in basic structure, each consisting of multi-head self-attention layers and fully connected layers. Compared with traditional recurrent neural network models for sequence problems, the Transformer model can capture text information over longer distances.
BERT model: BERT, i.e., Bidirectional Encoder Representations from Transformers, refers to the encoder portion of a bidirectional Transformer model. It is a self-encoding language model that captures word-level and sentence-level representations through two pre-training tasks: a masked language model and next sentence prediction.
ALBERT model: a lightweight BERT model that aims to solve the problem that current pre-trained models have too many parameters. Compared with the BERT model, ALBERT mainly makes three improvements: (1) factorization of the embedding matrix; (2) cross-layer parameter sharing, i.e., multiple layers use the same parameters; (3) Next Sentence Prediction (NSP) is replaced by Sentence-Order Prediction (SOP); specifically, its positive training samples are the same as in NSP, but its negative training samples are constructed by selecting two consecutive sentences in a document and swapping their order.
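To make the SOP pre-training task concrete, here is a small illustrative sketch (an invented example, not taken from the patent or from the ALBERT implementation): a positive pair keeps two consecutive sentences in document order, and a negative pair swaps them.

```python
# Illustrative sketch of Sentence-Order Prediction (SOP) pair construction:
# a positive pair keeps two consecutive sentences in document order, a
# negative pair swaps them. Sentences and labels are invented examples.
def make_sop_pairs(sentences):
    pairs = []
    for first, second in zip(sentences, sentences[1:]):
        pairs.append(((first, second), 1))  # correct order -> positive
        pairs.append(((second, first), 0))  # swapped order -> negative
    return pairs

doc = ["The model is pre-trained on large corpora.",
       "It is then fine-tuned on the downstream task."]
for (a, b), label in make_sop_pairs(doc):
    print(label, "|", a, "|", b)
```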
Recall: in fields such as search and recommendation, content relevant to or of interest to the user is returned according to the user's search question, user behavior, and the like. Common recall methods include text matching recall, label recall, semantic recall, and the like.
Semantic matching recall: word-embedding encoding is performed on the user's question sentences and on the text corpora, and the semantic similarity between the semantic vector of a question sentence and the semantic vectors of the text corpora is calculated through a vector similarity calculation method, thereby realizing semantic matching recall.
Elasticsearch: a non-relational database and a near real-time search platform; there is only a slight delay from the moment a document is indexed until the document can be searched. It is scalable and highly available, and aims to let users quickly query the data they need.
In the present application, a classification model training method and apparatus, a target text determination method and apparatus, a computing device, and a computer-readable storage medium are provided, which are described in detail in the following embodiments one by one.
FIG. 1 shows a block diagram of a computing device 100, according to an embodiment of the present description. The components of the computing device 100 include, but are not limited to, memory 110 and processor 120. The processor 120 is coupled to the memory 110 via a bus 130 and a database 150 is used to store data.
Computing device 100 also includes an access device 140 that enables computing device 100 to communicate via one or more networks 160. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. Access device 140 may include one or more of any type of network interface (e.g., a Network Interface Card (NIC)), whether wired or wireless, such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 100 and other components not shown in FIG. 1 may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 1 is for purposes of example only and is not limiting as to the scope of the description. Those skilled in the art may add or replace other components as desired.
Computing device 100 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smartphone), a wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 100 may also be a mobile or stationary server.
Wherein the processor 120 may perform the steps of the method shown in fig. 2. Fig. 2 is a flowchart illustrating a classification model training method provided according to an embodiment of the present application, which specifically includes the following steps.
Step 202: obtaining a training data set, wherein the training data set comprises a sample question and a sample answer corresponding to the sample question.
In practical applications, in order to ensure the training effect of the classification model, the training data set includes a plurality of sample questions and the sample answers corresponding to the sample questions, where each sample question and its corresponding sample answer form a query (sample question)-answer pair; the training data set thus includes a plurality of query-answer pairs.
The sample questions can be of any type and any length, and the sample answer corresponding to each sample question can be understood as the standard answer to that sample question. For example, a sample question is: "Why did the economic growth of our country slow down during the period from 2012 to 2016?"; the sample answer corresponding to the sample question is: "Because the demand structure and supply conditions that supported the past high-speed growth of the Chinese economy have changed, the slowdown of China's economic growth during 2012 to 2016 is actually the result of a change in the development stage, rather than a so-called periodic short-term change."
Step 204: constructing a training sample corresponding to the sample question based on the sample question and the sample text of the sample question obtained by searching a database.
The search database may be an Elasticsearch database, and may also be other text databases with a search function, which is not limited in this application.
Specifically, after the sample questions in the training data set and the sample answers corresponding to the sample questions are obtained, each sample question is input into the Elasticsearch database, and a preset number of sample texts corresponding to the sample question are retrieved through the Elasticsearch database; the preset number may be set according to the actual training requirements of the classification model, for example, to 100, 200, and so on.
In practical applications, when the Elasticsearch database performs a sample text search based on the sample question, the search may be implemented based on keywords in the sample question. That is, the Elasticsearch database extracts keywords from the sample question and retrieves, for example, 100 initial sample texts corresponding to those keywords. If several of the extracted keywords occur in one initial sample text at the same time, the probability that this initial sample text is retrieved is high, and it ranks near the front of the 100 initial sample texts. In other words, the 100 initial sample texts retrieved from the Elasticsearch database based on the sample question are all associated with the sample question, so that more accurate positive training samples can subsequently be constructed based on the retrieved initial sample texts and the sample question.
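As an illustration of this first-stage retrieval, the following hedged sketch uses the official Elasticsearch Python client; the endpoint, index name and field name are assumptions for the example, not details given in the application.

```python
# Sketch of first-stage coarse recall: retrieve the top-100 texts whose
# "content" field matches the keywords of the sample question. Endpoint,
# index name and field name are assumed for illustration.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local instance

def recall_initial_texts(question: str, size: int = 100):
    resp = es.search(
        index="sample_corpus",                   # assumed index name
        query={"match": {"content": question}},  # keyword matching
        size=size,
    )
    # Hits come back ranked: texts sharing more keywords score higher.
    return [hit["_source"]["content"] for hit in resp["hits"]["hits"]]
```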
In this embodiment of the present application, the classification model is trained in a supervised manner, and the training samples include positive training samples and negative training samples. The constructing of the training sample corresponding to the sample question based on the sample question and the sample text of the sample question obtained by searching the database then includes:
inputting the sample question into a search database to obtain an initial sample text corresponding to the sample question;
matching the sample answer corresponding to the sample question with the initial sample text, and taking the initial sample text with the matching similarity greater than or equal to a preset similarity threshold as a first sample text;
constructing a positive training sample corresponding to the sample question based on the sample question and the first sample text;
and constructing a negative training sample corresponding to the sample question based on the sample question and other sample texts which are obtained from the search database and are different from the first sample text.
In specific implementation, each sample question is first input into the search database, and a preset number of initial sample texts corresponding to each sample question are retrieved through the search database. The sample answer corresponding to each sample question is then matched with the initial sample texts corresponding to that sample question, and the initial sample texts whose matching similarity is greater than or equal to a preset similarity threshold are taken as first sample texts. Finally, a positive training sample corresponding to each sample question is constructed based on the sample question and its first sample texts. Meanwhile, a negative training sample corresponding to each sample question is constructed based on the sample question and other sample texts, obtained from the search database, that are different from the first sample texts corresponding to that sample question.
The preset similarity threshold may be set according to actual needs, for example, set to 80% or 90%.
Specifically, when the sample answer corresponding to the sample question is matched against an initial sample text to determine the similarity between them, the similarity between the sample answer and each initial sample text may be calculated via the edit distance between the two character strings.
For example: string 1 is "I am now in Beijing" and string 2 is "I am in Beijing"; string 1 has one extra word, so the edit distance between the two strings is 1.
The edit distance between two character strings is the number of single-step operations (insertions, deletions or replacements) needed to convert one string into the other; the number of steps taken is the edit distance between the two strings, and the smaller the distance, the higher the similarity.
The sample answer and the initial sample text are each treated as a character string; the number of steps needed to convert the initial sample text into the sample answer is the edit distance between them, and the similarity between the sample answer and the initial sample text is calculated from this edit distance.
In practical applications, a fuzzy string matching tool may be used to match the sample answer corresponding to the sample question against each initial sample text and thereby calculate the similarity between the sample answer and each initial sample text; another similarity calculation tool may also be used, which is not limited herein.
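A minimal sketch of this similarity computation, using Python's standard-library difflib as one possible fuzzy matching tool (a dedicated Levenshtein library would serve equally well), might look as follows; the strings and the 0.8 threshold are illustrative.

```python
# Sketch: similarity between the sample answer and each initial sample text,
# using difflib's ratio as a stand-in for an edit-distance-based score.
# Texts below are invented; the 0.8 threshold mirrors the 80% in the example.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()  # 1.0 means identical

sample_answer = "I am in Beijing"
initial_texts = ["I am now in Beijing", "The weather is sunny today"]

first_sample_texts = [t for t in initial_texts
                      if similarity(sample_answer, t) >= 0.8]  # threshold
print(first_sample_texts)
```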
For example, the construction of the training samples corresponding to a sample question a is described in detail below, taking the preset number as 100 and the preset similarity threshold as 80%.
First, the sample question a is input into the Elasticsearch database, and 100 initial sample texts corresponding to the sample question a are retrieved through the Elasticsearch database.
Then, the standard answer corresponding to the sample question a is matched with each of the 100 initial sample texts, the similarity between the standard answer corresponding to the sample question a and each initial sample text is calculated, and the initial sample texts with a similarity greater than or equal to 80% are taken as first sample texts.
And finally, constructing a positive training sample corresponding to the sample question a based on the sample question a and each first sample text.
Meanwhile, negative training samples corresponding to the sample question a are constructed based on the sample question a and other sample texts, obtained from the Elasticsearch database, that are different from the first sample texts (namely, texts obtained from the Elasticsearch database that are not first sample texts).
In the embodiments of the present application, the training of the classification model depends on training samples. In practical applications, the more training samples, the better the training effect of the classification model; however, training samples currently depend on manual labeling, the cost of manual labeling is very high, and labeling a large number of training samples greatly increases labor costs. In the method provided herein, starting from a small number of manually labeled query-answer pairs, a plurality of initial sample texts associated with each sample question are obtained from the search database through the sample question; positive training samples are constructed based on the sample question and each associated initial sample text, and negative training samples are constructed based on the sample question and other sample texts different from the associated initial sample texts. In this way, training samples that work well for the classification model can be constructed quickly, which improves the training effect of the subsequent classification model.
Specifically, the constructing a negative training sample corresponding to the sample question based on the sample question and other sample texts, which are obtained from the search database and are different from the first sample text, includes:
matching the sample answer corresponding to the sample question with the initial sample text, and taking the initial sample text with the matching similarity smaller than a preset similarity threshold as a second sample text;
obtaining a third sample text different from the initial sample text from the search database based on the sample question;
determining a fourth sample text from the initial sample texts corresponding to other sample questions different from the sample question;
and constructing a negative training sample corresponding to the sample question based on the sample question and the second sample text, the third sample text and/or the fourth sample text.
In specific implementation, there are various ways to construct the negative training samples of the classification model: a negative training sample can be formed by combining the sample question with an initial sample text whose similarity is smaller than the preset similarity threshold; other sample texts, different from the sample texts corresponding to the current sample question, can be retrieved from the search database and combined with the sample question to form negative training samples; and a negative training sample can be formed by combining the sample question with a sample text corresponding to any sample question different from the current one. These ways can also be combined in twos or threes to form negative training samples.
Following the above example, the standard answer corresponding to the sample question a is matched with each of the 100 initial sample texts, the similarity between the standard answer corresponding to the sample question a and each initial sample text is calculated, and the initial sample texts with a similarity of less than 80% are taken as second sample texts.
Meanwhile, based on the sample question a, a preset number of texts that are different from the 100 initial sample texts corresponding to the sample question a are obtained from the Elasticsearch database as third sample texts; the preset number here can also be set according to actual needs and is not limited here.
A preset number of initial sample texts are selected, as fourth sample texts, from the 100 initial sample texts corresponding to a sample question b different from the sample question a; the 100 initial sample texts corresponding to the sample question b may likewise be initial sample texts obtained from the Elasticsearch database, and the preset number here can also be set according to actual needs, which is not limited herein.
After the second sample texts, third sample texts and fourth sample texts are obtained, the sample question a can be combined with each second sample text to construct negative training samples corresponding to the sample question a; the sample question a can be combined with each third sample text to construct negative training samples corresponding to the sample question a; the sample question a can be combined with each fourth sample text to construct negative training samples corresponding to the sample question a; or the sample question a can be combined with each of the second sample texts, third sample texts and fourth sample texts to construct negative training samples corresponding to the sample question a. The specific combination used for the negative training samples may be set according to the practical application and is not limited herein.
In practical applications, three situations can be considered when constructing negative training samples: sample texts that are associated with the sample question but whose similarity is smaller than the preset similarity threshold, combined with the sample question; sample texts that are not associated with the sample question but are obtained from the search database, combined with the sample question; and sample texts that are not necessarily related to the sample question but contain the sample answers of other sample questions, combined with the sample question. By covering these three situations, the classification model is exposed during training to various kinds of negative training samples, achieving richer and more differentiated learning and greatly improving the training effect of the classification model.
In addition, the same sample text is represented differently under different sample questions, so in order that the trained classification model can better consider context information based on interactive semantic matching between a sample question and the corresponding sample text, the classification model should, during training, grasp the semantic focus through the interaction between the sample question and the sample text (namely, the interaction between every two word vectors in the text obtained by splicing the sample question and the corresponding sample text), including through training samples formed between a sample question and the sample texts corresponding to other sample questions, thereby enabling the subsequent classification model to recall texts accurately. During the training of the classification model, therefore, the sample question and the corresponding sample text can be spliced together and used as a training sample, in the following specific implementation:
constructing a positive training sample corresponding to the sample question based on the sample question and the first sample text, including:
splicing the sample question with the first sample text;
and taking the result obtained after the sample question and the first sample text are spliced as a positive training sample corresponding to the sample question, and adding a corresponding first label to the positive training sample.
And constructing a negative training sample corresponding to the sample question based on the sample question and the second sample text, the third sample text and/or the fourth sample text, including:
splicing the sample question with the second sample text, the third sample text and/or the fourth sample text;
and taking the sample question and a result spliced with the second sample text, the third sample text and/or the fourth sample text as a negative training sample corresponding to the sample question, and adding a corresponding second label to the negative training sample.
Wherein the first label may be a label indicating that the training sample is a positive training sample, e.g., 1; the second label may be a label indicating that the training sample is a negative training sample, e.g., 0.
In practical applications, when positive training samples are constructed, each sample question is spliced with each of its corresponding first sample texts, each splicing result is taken as a positive training sample, and the first label of each positive training sample is: 1. Each sample question is spliced with the second sample texts, third sample texts and/or fourth sample texts corresponding to the sample question, each splicing result is taken as a negative training sample, and the second label of each negative training sample is: 0. When the classification model is subsequently trained, accurate training of the classification model can be achieved based on the positive training samples, the negative training samples, the first labels, and the second labels.
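Putting the above together, a minimal sketch of assembling the labeled training samples could look as follows; the plain-space splicing and all names are illustrative assumptions (in practice the model's tokenizer inserts its own separator between question and text).

```python
# Sketch: build labeled training samples by splicing the sample question with
# each sample text. Positive samples (first sample texts) get label 1;
# negative samples (second/third/fourth sample texts) get label 0.
def build_samples(question, first_texts, negative_texts):
    samples = []
    for text in first_texts:
        samples.append((question + " " + text, 1))   # positive, first label
    for text in negative_texts:
        samples.append((question + " " + text, 0))   # negative, second label
    return samples

samples = build_samples(
    "Why did economic growth slow down from 2012 to 2016?",
    first_texts=["Because demand structure and supply conditions changed ..."],
    negative_texts=["Annual rainfall statistics for 2016 ..."],
)
```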
In specific implementation, when the classification model is trained, if the number of training samples is too large, the training process of the classification model is burdened, and if the number is too small, the training precision of the classification model is affected. In addition, in order to further improve the accuracy of the constructed training samples and the training precision of the classification model, after the sample texts corresponding to a sample question are retrieved from the search database, the initially retrieved sample texts may be further screened according to the semantics of the sample question, in the following specific implementation:
the inputting the sample question into a search database to obtain an initial sample text corresponding to the sample question includes:
inputting the sample question into a search database to obtain at least one sample text to be screened corresponding to the sample question;
and performing semantic analysis on the sample question, and screening an initial sample text corresponding to the sample question from the at least one sample text to be screened based on a semantic analysis result.
In practical applications, when a sample question is input into the search database to retrieve sample texts, the retrieval of the sample texts corresponding to the sample question is realized only through keyword matching; it can therefore happen that, although many keywords of the sample question match keywords in a certain sample text, the sample text is not, semantically, a text containing the sample answer to the sample question.
To avoid this situation and improve the accuracy of the subsequently constructed training samples, when the initial sample texts are obtained, each sample question is first input into the search database to obtain a plurality of sample texts to be screened corresponding to each sample question; semantic analysis is then performed on each sample question, and the initial sample texts suitable for the sample question are screened out from the corresponding plurality of sample texts to be screened based on the semantic analysis result of each sample question.
Step 206: training a classification model based on the training samples and the sample labels corresponding to the training samples to obtain a trained classification model, wherein the classification model outputs the probability that a sample text of the sample question, obtained by searching the database, contains the sample answer corresponding to the sample question.
In specific implementation, the classification model comprises an input layer, an encoding layer and a classification layer;
correspondingly, the training of a classification model based on the training samples and the sample labels corresponding to the training samples to obtain a trained classification model includes:
inputting the training sample into the classification model through the input layer, and obtaining a coding vector of the training sample through the encoding layer;
inputting the coding vector of the training sample into the classification layer to obtain the initial probability of the training sample;
calculating a loss value based on the initial probability of the training sample and the sample label;
and adjusting parameters of the classification model according to the loss value, and continuing training the classification model until a training stopping condition is reached.
The classification model includes but is not limited to the ALBERT model; other binary classification models in which the sample question and the sample text can interact during model training and which consider context information based on interactive semantic matching can also be used. For ease of understanding, the ALBERT model is used as the classification model in the following explanation.
Specifically, the constructed positive training samples and negative training samples are input into the classification model through the input layer of the classification model, and the coding vectors (i.e., hidden-layer vectors) of the training samples are obtained through the encoding layer (e.g., embedding). The coding vector of each training sample is then input into the downstream binary classification task layer, where a preset linear expression converts the multi-dimensional coding vector into a two-dimensional vector, each element of which represents an initial probability corresponding to the training sample. Finally, a loss value is calculated based on the initial probability of the training sample and the sample label, the network parameters of the classification model are adjusted according to the loss value, and training of the classification model continues until the training stop condition is reached.
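As one possible realization of this training procedure, the following hedged sketch fine-tunes an ALBERT binary classifier with the Hugging Face transformers library; the checkpoint name, learning rate and example pair are assumptions for illustration, not details specified by the application.

```python
# Sketch of one training step for the ALBERT-based binary classifier.
# Checkpoint and hyperparameters are assumptions. The tokenizer splices the
# question/text pair; the classification head yields two logits (contains
# answer / does not), trained with cross-entropy loss.
import torch
from transformers import AutoTokenizer, AlbertForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")   # assumed
model = AlbertForSequenceClassification.from_pretrained(
    "albert-base-v2", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

question = "Why did economic growth slow down from 2012 to 2016?"
text = "Because demand structure and supply conditions changed ..."
inputs = tokenizer(question, text, truncation=True, return_tensors="pt")
labels = torch.tensor([1])                # 1 = positive training sample

outputs = model(**inputs, labels=labels)  # loss = cross-entropy
outputs.loss.backward()                   # adjust parameters by the loss
optimizer.step()
optimizer.zero_grad()
```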
In the embodiments of this specification, the training samples of the classification model can first be constructed quickly and accurately based on the sample questions and the search database, saving training time for the subsequent classification model training; the classification model is then trained on training samples constructed by splicing each sample question with its corresponding sample text, so that the sample question and the sample text interact during training. In subsequent applications the classification model can thus properly consider the question and the context information of the corresponding text, accurately grasp the semantic focus, and accurately predict the probability that a text contains the answer to a question.
Wherein the processor 120 may perform the steps of the method shown in fig. 3. Fig. 3 is a schematic flowchart illustrating a target text determination method according to an embodiment of the present application, which specifically includes the following steps.
Step 302: acquiring a target question, inputting the target question into a search database, and acquiring at least one initial text corresponding to the target question.
The target question includes but is not limited to questions of any length and any type. Following the above example, the target question may be: "Why did the economic growth of our country slow down during the period from 2012 to 2016?"
For searching the database, reference may be made to the above embodiments, which are not described herein again.
Specifically, a target question is obtained and input into a search database, and a plurality of initial texts corresponding to the target question are obtained through the search database.
In specific implementation, the inputting the target question into a search database to obtain at least one initial text corresponding to the target question includes:
inputting the target question into a search database to obtain at least one text to be screened corresponding to the target question;
and performing semantic analysis on the target question, and screening at least one initial text corresponding to the target question from the at least one text to be screened based on a semantic analysis result.
In practical applications, in order to reduce the workload of the classification model and improve the accuracy of the initial texts corresponding to the target question obtained from the search database, after the target question is acquired, it is input into the search database to obtain a plurality of texts to be screened corresponding to the target question. Semantic analysis is then performed on the target question, and the initial texts matching the target question are screened out from the plurality of texts to be screened based on the semantic analysis result of the target question.
Step 304: inputting the target question and the at least one initial text into a classification model to obtain the probability that the at least one initial text contains a target answer corresponding to the target question.
The classification model is obtained by the classification model training method.
Specifically, after the target question and the plurality of initial texts corresponding to the target question are obtained, the target question and each corresponding initial text are respectively input into the classification model, and the probability that each initial text contains the target answer corresponding to the target question is obtained.
In specific implementation, the inputting the target question and the at least one initial text into a classification model to obtain a probability that the at least one initial text contains a target answer corresponding to the target question includes:
and splicing the target question and each initial text in the at least one initial text, inputting each spliced result into a classification model, and obtaining the probability that each initial text contains the target answer corresponding to the target question.
The target question is: "Why did the economic growth of our country slow down during the period from 2012 to 2016?" The initial text corresponding to the target question is: "Because the demand structure and supply conditions that supported the past high-speed growth of the Chinese economy have changed, the slowdown of China's economic growth during 2012 to 2016 is actually the result of a change in the development stage, rather than a so-called periodic short-term change." The result of splicing the target question with this initial text is then: "Why did the economic growth of our country slow down during the period from 2012 to 2016? Because the demand structure and supply conditions that supported the past high-speed growth of the Chinese economy have changed, the slowdown of China's economic growth during 2012 to 2016 is actually the result of a change in the development stage, rather than a so-called periodic short-term change."
This spliced result is input into the pre-trained classification model, and the probability that the initial text contains the target answer corresponding to the target question is obtained; for example, the probability is 0.3.
In practical applications, after the target question and each corresponding initial text are input into the classification model, the classification model outputs both the probability that each initial text contains the target answer corresponding to the target question and the probability that it does not; since this application screens the target texts based only on the probability of containing the target answer, only the former probability, obtained through the classification model, is discussed here.
In the embodiments of this specification, the search database first realizes a coarse recall of the initial texts corresponding to the target question; the coarsely recalled initial texts are then combined with the target question and input into the classification model, and by calculating the semantic similarity between each initial text and the target question within the classification model, the texts containing the target answer to the target question are further screened from the initial texts, ensuring the accuracy of the subsequently obtained target texts.
Step 306: determining, from the at least one initial text based on the probability, a target text containing the target answer corresponding to the target question.
In specific implementation, the determining, from the at least one initial text based on the probability, a target text including a target answer corresponding to the target question includes:
and performing descending order arrangement on all the initial texts in the at least one initial text based on the probability, and acquiring a preset number of initial texts after descending order arrangement from high to low to serve as target texts containing target answers corresponding to the target questions.
Specifically, after the probability of each initial text corresponding to the target question is obtained, the initial texts are sorted from high to low based on the probability, and a preset number of initial texts are then selected, according to preset requirements, as the target texts containing the target answer corresponding to the target question. For example, the top 10 or top 15 initial texts after descending sorting are selected as the target texts containing the target answer corresponding to the target question.
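A hedged sketch of this inference-and-ranking step (reusing a fine-tuned classifier and tokenizer such as those in the training sketch above; top_k and all names are illustrative) is:

```python
# Sketch: score each initial text with the trained classifier, take the
# probability of the "contains the answer" class (index 1), sort descending
# and keep the top-k texts as target texts. top_k is an illustrative choice.
import torch

def rank_target_texts(model, tokenizer, question, initial_texts, top_k=10):
    scored = []
    for text in initial_texts:
        inputs = tokenizer(question, text, truncation=True,
                           return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        prob = torch.softmax(logits, dim=-1)[0, 1].item()  # P(contains answer)
        scored.append((prob, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)    # descending order
    return [text for _, text in scored[:top_k]]
```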
The target text determination method provided by the embodiments of the present application comprises: acquiring a target question, inputting the target question into a search database, and obtaining at least one initial text corresponding to the target question; inputting the target question and the at least one initial text into a classification model, and obtaining the probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method; and determining, from the at least one initial text based on the probability, a target text containing the target answer corresponding to the target question. The target question is first input into the search database, and a plurality of initial texts corresponding to the target question are obtained through the search database, realizing a coarse first-stage text recall; the initial texts recalled in the first stage are then further screened through the pre-trained classification model, so that more accurate target texts most relevant to the target question are screened out from the plurality of initial texts.
In addition, when the initial texts recalled in the first stage are screened through the classification model, the target question is combined with each initial text and input into the classification model. The classification model calculates the features of each word in the text obtained by splicing the target question with each initial text, and the mutual influence between every two words, realizing interaction between the target question and each initial text within the classification model. Context information can thus be well considered, the matching precision between the target question and the initial texts is improved, and the target text corresponding to the target question can be obtained more accurately.
Corresponding to the above method embodiment, the present specification further provides an embodiment of a classification model training apparatus, and fig. 4 shows a schematic structural diagram of a classification model training apparatus according to an embodiment of the present specification. As shown in fig. 4, the apparatus includes:
a training data obtaining module 402 configured to obtain a training data set, where the training data set includes a sample question and a sample answer corresponding to the sample question;
a training sample construction module 404 configured to construct a training sample corresponding to the sample question based on the sample question and a sample text of the sample question obtained by searching a database;
a model training module 406 configured to train a classification model based on the training samples and the sample labels corresponding to the training samples to obtain a trained classification model.
Optionally, the training sample construction module 404 is further configured to:
inputting the sample question into a search database to obtain an initial sample text corresponding to the sample question;
matching the sample answer corresponding to the sample question with the initial sample text, and taking the initial sample text with the matching similarity greater than or equal to a preset similarity threshold as a first sample text;
constructing a positive training sample corresponding to the sample question based on the sample question and the first sample text;
and constructing a negative training sample corresponding to the sample question based on the sample question and other sample texts which are obtained from the search database and are different from the first sample text.
Optionally, the training sample construction module 404 is further configured to:
matching the sample answer corresponding to the sample question with the initial sample text, and taking the initial sample text with the matching similarity smaller than a preset similarity threshold as a second sample text;
obtaining a third sample text different from the initial sample text from the search database based on the sample question;
determining a fourth sample text from the initial sample texts corresponding to other sample questions different from the sample question;
and constructing a negative training sample corresponding to the sample question based on the sample question and the second sample text, the third sample text and/or the fourth sample text.
Optionally, the training sample construction module 404 is further configured to:
inputting the sample question into a search database to obtain at least one sample text to be screened corresponding to the sample question;
and performing semantic analysis on the sample question, and screening an initial sample text corresponding to the sample question from the at least one sample text to be screened based on a semantic analysis result.
Optionally, the training sample construction module 404 is further configured to:
splicing the sample question with the first sample text;
and taking the result obtained after the sample question and the first sample text are spliced as a positive training sample corresponding to the sample question, and adding a corresponding first label to the positive training sample.
Optionally, the training sample construction module 404 is further configured to:
splicing the sample question with the second sample text, the third sample text and/or the fourth sample text;
and taking the sample question and a result spliced with the second sample text, the third sample text and/or the fourth sample text as a negative training sample corresponding to the sample question, and adding a corresponding second label to the negative training sample.
Optionally, the classification model comprises an input layer, an encoding layer and a classification layer;
accordingly, the model training module 406 is further configured to:
inputting the training sample into the classification model through the input layer, and obtaining a coding vector of the training sample through the coding layer;
inputting the coding vector of the training sample into the classification layer to obtain the initial probability of the training sample;
calculating a loss value based on the initial probability of the training sample and the sample label;
and adjusting parameters of the classification model according to the loss value, and continuing training the classification model until a training stopping condition is reached.
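A minimal PyTorch sketch of this training loop is given below; the toy embedding-bag encoder stands in for whatever encoding layer the embodiment uses, and the tokenization, batching and stop condition are all assumptions.

```python
import torch
from torch import nn

class ToyClassifier(nn.Module):
    """Input layer (token ids) -> encoding layer -> classification layer."""
    def __init__(self, vocab_size: int, dim: int = 64):
        super().__init__()
        self.encoder = nn.EmbeddingBag(vocab_size, dim)  # stand-in encoder
        self.classifier = nn.Linear(dim, 1)              # classification layer

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) -> one logit per spliced sample
        return self.classifier(self.encoder(token_ids)).squeeze(-1)

def train(model: nn.Module, batches, epochs: int = 3, lr: float = 1e-3):
    """batches yields (token_ids, labels) tensor pairs."""
    loss_fn = nn.BCEWithLogitsLoss()  # loss between initial probability and label
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):           # stop condition assumed: fixed epoch count
        for token_ids, labels in batches:
            loss = loss_fn(model(token_ids), labels.float())
            optimizer.zero_grad()
            loss.backward()           # adjust parameters according to the loss
            optimizer.step()
```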
The classification model training device provided in the embodiment of the present specification can quickly and accurately construct the training samples of the classification model based on the sample questions and the search database, saving training time for the subsequent classification model training. Because each training sample is constructed by splicing a sample question with a corresponding sample text, the question and the text interact while the classification model is trained; in subsequent application, the classification model can therefore take the question and the context information of the corresponding text into account, accurately grasp the semantic focus, and accurately predict the probability that a text contains the answer to a question.
Corresponding to the above method embodiment, the present specification further provides an embodiment of a target text determination apparatus, and fig. 5 shows a schematic structural diagram of a target text determination apparatus according to an embodiment of the present specification.
As shown in fig. 5, the apparatus includes:
a question obtaining module 502 configured to obtain a target question, and input the target question into a search database to obtain at least one initial text corresponding to the target question;
a probability obtaining module 504 configured to input the target question and the at least one initial text into a classification model, and obtain the probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method described above;
a text determination module 506 configured to determine a target text containing a target answer corresponding to the target question from the at least one initial text based on the probability.
Optionally, the probability obtaining module 504 is further configured to:
and splicing the target question and each initial text in the at least one initial text, inputting each spliced result into a classification model, and obtaining the probability that each initial text contains the target answer corresponding to the target question.
Optionally, the text determination module 506 is further configured to:
and arranging all the initial texts in the at least one initial text in descending order based on the probability, and taking a preset number of the highest-ranked initial texts as the target texts containing the target answer corresponding to the target question.
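This rerank-and-select step might be sketched as follows; `score` abstracts a call to the trained classification model, and the splicing format matches the assumed form used during training.

```python
# Hypothetical sketch: splice the target question with each recalled text,
# score each pair, sort in descending order, keep a preset number.
from typing import Callable, List

def rerank(question: str, initial_texts: List[str],
           score: Callable[[str], float], top_k: int = 3) -> List[str]:
    spliced = [f"[CLS] {question} [SEP] {text} [SEP]" for text in initial_texts]
    probs = [score(s) for s in spliced]  # probability text contains the answer
    order = sorted(range(len(initial_texts)),
                   key=lambda i: probs[i], reverse=True)  # descending order
    return [initial_texts[i] for i in order[:top_k]]      # preset number kept
```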
Optionally, the question obtaining module 502 is further configured to:
inputting the target question into a search database to obtain at least one text to be screened corresponding to the target question;
and performing semantic analysis on the target question, and screening at least one initial text corresponding to the target question from the at least one text to be screened based on a semantic analysis result.
With the target text determination device, a target question is acquired and input into a search database to obtain at least one initial text corresponding to the target question; the target question and the at least one initial text are input into a classification model to obtain the probability that the at least one initial text contains the target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method described above; and the target text containing the target answer corresponding to the target question is determined from the at least one initial text based on the probability. The target question is first input into the search database, and a plurality of initial texts corresponding to the target question are obtained through the search database, realizing a first-stage coarse text recall; the initial texts recalled in the first stage are then further screened by the pre-trained classification model, so that the more accurate target texts most relevant to the target question are screened out from the plurality of initial texts.
In addition, when the initial texts recalled in the first stage are screened by the classification model, the target question is combined with each initial text before being input into the classification model. The classification model computes the features of every word in the text spliced from the target question and each initial text, as well as the interactions between the words, so that the target question and each initial text interact inside the classification model. Context information is thereby well taken into account, the matching precision between the target question and the initial texts is improved, and the target text corresponding to the target question is obtained more accurately.
It should be noted that the components of the device claims should be understood as the functional modules necessary for implementing the steps of the program flow or the steps of the method, and that each functional module is not necessarily defined by an actual functional division or separation. A device claim defined by such a set of functional modules should be understood as a functional-module framework that implements the solution mainly through the computer program described in the specification, rather than as a physical device that implements the solution mainly through hardware.
An embodiment of the present application further provides a computing device, which includes a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the processor, when executing the computer instructions, implements the steps of the classification model training method or the steps of the target text determination method.
The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device belongs to the same concept as the technical solutions of the classification model training method and the target text determination method; for details of the technical solution of the computing device that are not described here, reference may be made to the descriptions of the technical solutions of the classification model training method and the target text determination method.
An embodiment of the present application further provides a computer readable storage medium storing computer instructions, which when executed by a processor, implement the steps of the classification model training method or the steps of the target text determination method.
The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the computer-readable storage medium belongs to the same concept as the technical solutions of the classification model training method and the target text determination method; for details of the technical solution of the computer-readable storage medium that are not described here, reference may be made to the descriptions of the technical solutions of the classification model training method and the target text determination method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in each jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
It should be noted that, for the sake of simplicity, the above method embodiments are described as a series of acts or combinations of acts, but those skilled in the art should understand that the present application is not limited by the described order of acts, since some steps may be performed in other orders or simultaneously according to the present application. Furthermore, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments, and that the acts and modules involved are not necessarily required by the present application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present application disclosed above are intended only to aid in the explanation of the application. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and the practical application, to thereby enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.

Claims (15)

1. A classification model training method is characterized by comprising the following steps:
acquiring a training data set, wherein the training data set comprises a sample question and a sample answer corresponding to the sample question;
constructing a training sample corresponding to the sample question based on the sample question and a sample text of the sample question obtained by searching a database;
and training a classification model based on the training samples and the sample labels corresponding to the training samples to obtain the classification model.
2. The method for training the classification model according to claim 1, wherein the constructing the training sample corresponding to the sample question based on the sample question and the sample text of the sample question obtained by searching a database comprises:
inputting the sample question into a search database to obtain an initial sample text corresponding to the sample question;
matching the sample answer corresponding to the sample question with the initial sample text, and taking the initial sample text with the matching similarity greater than or equal to a preset similarity threshold as a first sample text;
constructing a positive training sample corresponding to the sample question based on the sample question and the first sample text;
and constructing a negative training sample corresponding to the sample question based on the sample question and other sample texts which are obtained from the search database and are different from the first sample text.
3. The method for training classification models according to claim 2, wherein the constructing negative training samples corresponding to the sample question based on the sample question and other sample texts, which are different from the first sample text, and are obtained from the search database comprises:
matching the sample answer corresponding to the sample question with the initial sample text, and taking the initial sample text with the matching similarity smaller than a preset similarity threshold as a second sample text;
obtaining a third sample text different from the initial sample text from the search database based on the sample question;
determining a fourth sample text from the initial sample texts corresponding to other sample questions different from the sample question;
and constructing a negative training sample corresponding to the sample question based on the sample question and the second sample text, the third sample text and/or the fourth sample text.
4. The method for training classification models according to claim 2, wherein the inputting the sample question into a search database to obtain an initial sample text corresponding to the sample question comprises:
inputting the sample question into a search database to obtain at least one sample text to be screened corresponding to the sample question;
and performing semantic analysis on the sample question, and screening an initial sample text corresponding to the sample question from the at least one sample text to be screened based on a semantic analysis result.
5. The method for training the classification model according to claim 2, wherein the constructing the positive training sample corresponding to the sample question based on the sample question and the first sample text comprises:
splicing the sample question with the first sample text;
and taking the result obtained after the sample question and the first sample text are spliced as a positive training sample corresponding to the sample question, and adding a corresponding first label to the positive training sample.
6. The method for training the classification model according to claim 3, wherein the constructing negative training samples corresponding to the sample question based on the sample question and the second sample text, the third sample text and/or the fourth sample text comprises:
splicing the sample question with the second sample text, the third sample text and/or the fourth sample text;
and taking the sample question and a result spliced with the second sample text, the third sample text and/or the fourth sample text as a negative training sample corresponding to the sample question, and adding a corresponding second label to the negative training sample.
7. The method for training the classification model according to any one of claims 1 to 6, wherein the classification model comprises an input layer, an encoding layer and a classification layer;
correspondingly, the training a classification model based on the training sample and the sample label corresponding to the training sample to obtain the classification model includes:
inputting the training sample into the classification model through the input layer, and obtaining a coding vector of the training sample through the coding layer;
inputting the coding vector of the training sample into the classification layer to obtain the initial probability of the training sample;
calculating a loss value based on the initial probability of the training sample and the sample label;
and adjusting parameters of the classification model according to the loss value, and continuing training the classification model until a training stopping condition is reached, wherein the classification model outputs the probability that the sample text of the sample question obtained by searching a database contains the sample answer corresponding to the sample question.
8. A method for determining a target text, comprising:
acquiring a target question, inputting the target question into a search database, and acquiring at least one initial text corresponding to the target question;
inputting the target question and the at least one initial text into a classification model, and obtaining the probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method of any one of claims 1 to 7;
and determining a target text containing a target answer corresponding to the target question from the at least one initial text based on the probability.
9. The method for determining target text according to claim 8, wherein the inputting the target question and the at least one initial text into a classification model to obtain a probability that the at least one initial text contains a target answer corresponding to the target question comprises:
and splicing the target question and each initial text in the at least one initial text, inputting each spliced result into a classification model, and obtaining the probability that each initial text contains the target answer corresponding to the target question.
10. The method for determining target text according to claim 8, wherein the determining the target text containing the target answer corresponding to the target question from the at least one initial text based on the probability comprises:
and arranging all the initial texts in the at least one initial text in descending order based on the probability, and taking a preset number of the highest-ranked initial texts as the target texts containing the target answer corresponding to the target question.
11. The method for determining the target text according to any one of claims 8 to 10, wherein the inputting the target question into a search database to obtain at least one initial text corresponding to the target question comprises:
inputting the target question into a search database to obtain at least one text to be screened corresponding to the target question;
and performing semantic analysis on the target question, and screening at least one initial text corresponding to the target question from the at least one text to be screened based on a semantic analysis result.
12. A classification model training apparatus, comprising:
a training data acquisition module configured to acquire a training data set, wherein the training data set includes a sample question and a sample answer corresponding to the sample question;
a training sample construction module configured to construct a training sample corresponding to the sample question based on the sample question and a sample text of the sample question obtained by searching a database;
and the model training module is configured to train a classification model based on the training samples and the sample labels corresponding to the training samples to obtain the classification model.
13. A target text determination apparatus, comprising:
the question acquisition module is configured to acquire a target question and input the target question into a search database to acquire at least one initial text corresponding to the target question;
a probability obtaining module configured to input the target question and the at least one initial text into a classification model, and obtain a probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method according to any one of claims 1 to 7;
a text determination module configured to determine a target text containing a target answer corresponding to the target question from the at least one initial text based on the probability.
14. A computing device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the processor when executing the computer instructions performs the steps of the classification model training method of any one of claims 1-7 or the steps of the target text determination method of any one of claims 8-11.
15. A computer readable storage medium storing computer instructions, which when executed by a processor, perform the steps of the classification model training method according to any one of claims 1 to 7 or the steps of the target text determination method according to any one of claims 8 to 11.
CN202110442474.1A 2021-04-23 2021-04-23 Classification model training method and device, and target text determining method and device Pending CN113159187A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110442474.1A CN113159187A (en) 2021-04-23 2021-04-23 Classification model training method and device, and target text determining method and device


Publications (1)

Publication Number Publication Date
CN113159187A true CN113159187A (en) 2021-07-23

Family

ID=76870131

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110442474.1A Pending CN113159187A (en) 2021-04-23 2021-04-23 Classification model training method and device, and target text determining method and device

Country Status (1)

Country Link
CN (1) CN113159187A (en)


Patent Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170011116A1 (en) * 2015-07-07 2017-01-12 Google Inc. Generating elements of answer-seeking queries and elements of answers
US20180018576A1 (en) * 2016-07-12 2018-01-18 International Business Machines Corporation Text Classifier Training
US20190371299A1 (en) * 2017-02-28 2019-12-05 Huawei Technologies Co., Ltd. Question Answering Method and Apparatus
WO2018186445A1 (en) * 2017-04-06 2018-10-11 株式会社Nttドコモ Dialogue system
CN107491518A (en) * 2017-08-15 2017-12-19 北京百度网讯科技有限公司 Method and apparatus, server, storage medium are recalled in one kind search
US20190163691A1 (en) * 2017-11-30 2019-05-30 CrowdCare Corporation Intent Based Dynamic Generation of Personalized Content from Dynamic Sources
US20190325069A1 (en) * 2018-04-18 2019-10-24 Microsoft Technology Licensing, Llc Impression-tailored computer search result page visual structures
CN108846138A (en) * 2018-07-10 2018-11-20 苏州大学 A kind of the problem of fusion answer information disaggregated model construction method, device and medium
WO2020174826A1 (en) * 2019-02-25 2020-09-03 日本電信電話株式会社 Answer generating device, answer learning device, answer generating method, and answer generating program
CN110163281A (en) * 2019-05-20 2019-08-23 腾讯科技(深圳)有限公司 Statement classification model training method and device
CN110287296A (en) * 2019-05-21 2019-09-27 平安科技(深圳)有限公司 A kind of problem answers choosing method, device, computer equipment and storage medium
CN110750616A (en) * 2019-10-16 2020-02-04 网易(杭州)网络有限公司 Retrieval type chatting method and device and computer equipment
CN110795548A (en) * 2019-10-25 2020-02-14 招商局金融科技有限公司 Intelligent question answering method, device and computer readable storage medium
CN111125295A (en) * 2019-11-14 2020-05-08 中国农业大学 Method and system for obtaining food safety question answers based on LSTM
CN111259127A (en) * 2020-01-15 2020-06-09 浙江大学 Long text answer selection method based on transfer learning sentence vector
CN111309887A (en) * 2020-02-24 2020-06-19 支付宝(杭州)信息技术有限公司 Method and system for training text key content extraction model
CN111639486A (en) * 2020-04-30 2020-09-08 深圳壹账通智能科技有限公司 Paragraph searching method and device, electronic equipment and storage medium
CN111858895A (en) * 2020-07-30 2020-10-30 阳光保险集团股份有限公司 Sequencing model determining method, sequencing device and electronic equipment
CN112131366A (en) * 2020-09-23 2020-12-25 腾讯科技(深圳)有限公司 Method, device and storage medium for training text classification model and text classification
CN112464641A (en) * 2020-10-29 2021-03-09 平安科技(深圳)有限公司 BERT-based machine reading understanding method, device, equipment and storage medium
CN112417126A (en) * 2020-12-02 2021-02-26 车智互联(北京)科技有限公司 Question answering method, computing equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHANG DONG; LI SHOUSHAN; ZHOU GUODONG: "Semi-supervised question classification method based on answer assistance", Computer Engineering and Science, vol. 37, no. 12, 15 December 2015 (2015-12-15), pages 2352-2357 *
HU BAOSHUN; WANG DALING; YU GE; MA TING: "Answer extraction algorithm based on syntactic structure feature analysis and classification technology", Chinese Journal of Computers, vol. 31, no. 4, pages 662-676 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113609248A (en) * 2021-08-20 2021-11-05 北京金山数字娱乐科技有限公司 Word weight generation model training method and device and word weight generation method and device
CN113961765A (en) * 2021-10-21 2022-01-21 北京百度网讯科技有限公司 Searching method, device, equipment and medium based on neural network model
CN113961765B (en) * 2021-10-21 2023-12-19 北京百度网讯科技有限公司 Searching method, searching device, searching equipment and searching medium based on neural network model

Similar Documents

Publication Publication Date Title
CN111753060B (en) Information retrieval method, apparatus, device and computer readable storage medium
CN108875051A (en) Knowledge mapping method for auto constructing and system towards magnanimity non-structured text
CN107798624B (en) Technical label recommendation method in software question-and-answer community
CN111159485B (en) Tail entity linking method, device, server and storage medium
CN111368042A (en) Intelligent question and answer method and device, computer equipment and computer storage medium
CN111259127A (en) Long text answer selection method based on transfer learning sentence vector
CN112100326B (en) Anti-interference question and answer method and system integrating retrieval and machine reading understanding
CN113220832A (en) Text processing method and device
CN113159187A (en) Classification model training method and device, and target text determining method and device
CN112463944A (en) Retrieval type intelligent question-answering method and device based on multi-model fusion
CN115203421A (en) Method, device and equipment for generating label of long text and storage medium
CN116304066A (en) Heterogeneous information network node classification method based on prompt learning
CN114691864A (en) Text classification model training method and device and text classification method and device
Ni et al. Enhancing Cloud-Based Large Language Model Processing with Elasticsearch and Transformer Models
CN116932736A (en) Patent recommendation method based on combination of user requirements and inverted list
CN117131383A (en) Method for improving search precision drainage performance of double-tower model
CN116842934A (en) Multi-document fusion deep learning title generation method based on continuous learning
CN114417863A (en) Word weight generation model training method and device and word weight generation method and device
CN115757723A (en) Text processing method and device
CN114547313A (en) Resource type identification method and device
CN114003706A (en) Keyword combination generation model training method and device
CN113392647B (en) Corpus generation method, related device, computer equipment and storage medium
CN114942981A (en) Question-answer query method and device, electronic equipment and computer readable storage medium
CN113961686A (en) Question-answer model training method and device, question-answer method and device
CN114818727A (en) Key sentence extraction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination