WO2021056710A1 - Multi-round question-and-answer identification method, device, computer apparatus, and storage medium - Google Patents

Multi-round question-and-answer identification method, device, computer apparatus, and storage medium

Info

Publication number
WO2021056710A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
feature
positive
negative
word segmentation
Prior art date
Application number
PCT/CN2019/116924
Other languages
French (fr)
Chinese (zh)
Inventor
邓悦
金戈
徐亮
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology (Shenzhen) Co., Ltd. (平安科技(深圳)有限公司)
Publication of WO2021056710A1 publication Critical patent/WO2021056710A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a multi-round question and answer recognition method, device, computer equipment and storage medium.
  • Traditional multi-round question answering models mainly concatenate the dialogue information of the previous rounds directly and treat it as a single sentence input.
  • The embodiments of this application provide a multi-round question and answer recognition method, device, computer equipment, and storage medium, so as to solve the problem that the recognition accuracy of traditional multi-round question answering models is low, which affects the accuracy and efficiency of the information users obtain from such models.
  • a method for multiple rounds of question and answer recognition including:
  • the user history question, the user history answer, and the user current question are imported into a pre-trained target multi-round question answering model, where the target multi-round question answering model includes a coding unit, a long short-term memory unit, and a fully connected unit;
  • the encoding unit performs vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain a first vector feature corresponding to the user history question, a second vector feature corresponding to the user history answer, and a third vector feature corresponding to the user's current question;
  • the first vector feature, the second vector feature, and the third vector feature are imported into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature;
  • the target semantic feature is imported into the fully connected unit for similarity calculation, and the recognition result with the largest similarity is output.
  • a multi-round question and answer recognition device including:
  • the first obtaining module is used to obtain user historical questions, user historical answers, and user current questions from the user database;
  • the import module is used to import the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model includes a coding unit, a long short-term memory unit, and a fully connected unit;
  • the conversion module is configured to perform vector feature conversion processing on the user history question, the user history answer, and the user current question through the encoding unit to obtain the first vector feature corresponding to the user history question, the second vector feature corresponding to the user history answer, and the third vector feature corresponding to the user's current question;
  • an extraction module, configured to import the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature;
  • the output module is used to import the target semantic feature into the fully connected unit for similarity calculation, and output the recognition result with the largest similarity.
  • a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor implements the steps of the above multi-round question and answer recognition method when executing the computer-readable instructions.
  • a non-volatile computer-readable storage medium storing computer-readable instructions that, when executed by a processor, implement the steps of any of the foregoing multi-round question and answer recognition methods.
  • FIG. 1 is a flowchart of a method for identifying multiple rounds of question and answer provided by an embodiment of the present application
  • FIG. 2 is a flowchart of training a target multi-round question answering model in the multi-round question answering recognition method provided by an embodiment of the present application;
  • FIG. 3 is a flowchart of step S2 in the multi-round question and answer recognition method provided by an embodiment of the present application;
  • FIG. 4 is a flowchart of step S21 in the multi-round question and answer recognition method provided by an embodiment of the present application;
  • FIG. 5 is a flowchart of step S3 in the multi-round question and answer recognition method provided by the embodiment of the present application.
  • FIG. 6 is a flowchart of step S6 in the method for identifying multiple rounds of question and answer provided by an embodiment of the present application
  • FIG. 7 is a schematic diagram of a multi-round question and answer recognition device provided by an embodiment of the present application.
  • Fig. 8 is a basic structural block diagram of a computer device provided by an embodiment of the present application.
  • the multi-round question and answer recognition method provided in this application is applied to the server, and the server can be implemented by an independent server or a server cluster composed of multiple servers.
  • a method for multi-round question and answer recognition is provided, which includes the following steps:
  • S101 Obtain user historical questions, user historical answers, and user current questions from a user database.
  • the user history questions, user history answers, and user current questions are directly obtained from the user database, where the user database refers to a database specifically used to store user history questions, user history answers, and user current questions.
  • S102 Import user history questions, user history answers, and user current questions into a pre-trained target multi-round question answering model, where the target multi-round question answering model includes a coding unit, a long short-term memory unit, and a fully connected unit.
  • In this embodiment, the pre-trained target multi-round question answering model refers to a neural network model obtained by training a neural network on a training data set specified by the user; in a multi-round question and answer scenario, given the user's current question after multiple rounds of question and answer, it can quickly identify the user's current answer corresponding to that question.
  • Specifically, the user history questions, user history answers, and user current question obtained in step S101 are imported directly into the pre-trained target multi-round question answering model.
  • S103 Perform vector feature conversion processing on the user history question, the user history answer, and the user current question through the coding unit to obtain the first vector feature corresponding to the user history question, the second vector feature corresponding to the user history answer, and the third vector feature corresponding to the user's current question.
  • The coding unit contains a vector conversion port for performing vector feature conversion processing on user history questions, user history answers, and user current questions. These are imported directly into the vector conversion port for vector feature conversion processing, yielding the first vector feature corresponding to the user's historical questions, the second vector feature corresponding to the user's historical answers, and the third vector feature corresponding to the user's current question.
  • S104 Import the first vector feature, the second vector feature, and the third vector feature into the long and short-term memory unit for semantic feature extraction to obtain the target semantic feature.
  • The long short-term memory unit contains a semantic feature port used to extract semantic features from the first vector feature, the second vector feature, and the third vector feature.
  • The vector features are imported into the semantic feature port in the long short-term memory unit for semantic feature extraction, and the target semantic feature is obtained.
  • the fully connected unit contains a preset classifier, and the target semantic feature is imported into the fully connected unit.
  • The preset classifier is used to calculate the similarity of the target semantic feature and output the recognition result with the greatest similarity; that is, the recognition result is the answer corresponding to the user's current question.
  • the classifier is specially used for similarity calculation.
  • In summary, the coding unit in the target multi-round question answering model is used for vector feature conversion processing to obtain the first vector feature, the second vector feature, and the third vector feature; the long short-term memory unit performs semantic feature extraction on these vector features to obtain the target semantic feature; and the fully connected unit calculates the similarity of the target semantic feature and outputs the recognition result with the greatest similarity.
  • By using the pre-trained target multi-round question answering model, the recognition result corresponding to the user's current question can be determined quickly and accurately based on the user's historical questions, historical answers, and current question.
  • Because the pre-trained target multi-round question answering model uses the long short-term memory unit for semantic feature extraction, the information interaction between the user's historical questions, historical answers, and current question is strengthened, so the recognition accuracy of the target multi-round question answering model is higher, thereby improving the accuracy and efficiency of the information users obtain from the target multi-round question answering model.
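  • To make the pipeline concrete, the following is a minimal sketch of how such a model could be composed, assuming PyTorch; the class name, layer sizes, and the use of an embedding layer as the coding unit are illustrative assumptions, not details from this application.

```python
import torch
import torch.nn as nn

class TargetQAModel(nn.Module):
    """Coding unit -> long short-term memory unit -> fully connected unit."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_answers=1000):
        super().__init__()
        self.encoder = nn.Embedding(vocab_size, embed_dim)             # coding unit
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)   # LSTM unit
        self.fc = nn.Linear(hidden_dim, num_answers)                   # fully connected unit

    def forward(self, history_q, history_a, current_q):
        # Concatenate the three token-id sequences along the time axis so the
        # LSTM can model the interaction between history and current question.
        tokens = torch.cat([history_q, history_a, current_q], dim=1)
        vectors = self.encoder(tokens)          # first/second/third vector features
        _, (h_n, _) = self.lstm(vectors)        # target semantic feature
        similarities = self.fc(h_n[-1])         # similarity per candidate answer
        return similarities.argmax(dim=-1)      # recognition result with largest similarity

model = TargetQAModel(vocab_size=30000)
hq = torch.randint(0, 30000, (1, 20))   # dummy token ids for a history question
ha = torch.randint(0, 30000, (1, 20))   # dummy token ids for a history answer
cq = torch.randint(0, 30000, (1, 10))   # dummy token ids for the current question
print(model(hq, ha, cq))                # index of the best-matching candidate answer
```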
  • the multi-round question and answer recognition method further includes the following steps:
  • S1 Obtain historical questions, historical answers, and current questions from the preset sample library as positive samples, and obtain current answers as negative samples. Specifically, the label information in the preset sample library is detected: when label one, label two, or label three is detected, the historical question corresponding to label one, the historical answer corresponding to label two, and the current question corresponding to label three are obtained, and the historical questions, historical answers, and current question are determined as a positive sample; when label four is detected, the current answer corresponding to label four is obtained and determined as a negative sample.
  • the preset sample library refers to a database dedicated to storing different tag information and data information corresponding to the tag information.
  • the label information includes label one, label two, label three, and label four.
  • the data information includes historical questions, historical answers, current questions, and current answers: the data information corresponding to label one is a historical question, the data information corresponding to label two is a historical answer, the data information corresponding to label three is a current question, and the data information corresponding to label four is a current answer.
  • each historical question has its corresponding historical answer, and there are at least five historical question-answer pairs.
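  • As an illustration, sample assembly from such a label scheme might look like the following Python sketch; the label constants and record format are assumptions for demonstration only.

```python
# Hypothetical label codes for records in the preset sample library.
LABEL_HISTORICAL_QUESTION = 1
LABEL_HISTORICAL_ANSWER = 2
LABEL_CURRENT_QUESTION = 3
LABEL_CURRENT_ANSWER = 4

def build_samples(records):
    """records: iterable of (label, text) pairs read from the sample library."""
    positive = {"historical_questions": [], "historical_answers": [], "current_question": None}
    negative = []
    for label, text in records:
        if label == LABEL_HISTORICAL_QUESTION:
            positive["historical_questions"].append(text)
        elif label == LABEL_HISTORICAL_ANSWER:
            positive["historical_answers"].append(text)
        elif label == LABEL_CURRENT_QUESTION:
            positive["current_question"] = text
        elif label == LABEL_CURRENT_ANSWER:
            negative.append(text)   # the current answer forms the negative sample
    return positive, negative

records = [(1, "Q1"), (2, "A1"), (1, "Q2"), (2, "A2"), (3, "Q_now"), (4, "A_now")]
positive_sample, negative_sample = build_samples(records)
```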
  • S2 Import the positive samples and the negative samples into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector features corresponding to the negative samples, where the initial multi-round question answering model includes an encoding layer, a long short-term memory network, and a convolutional network.
  • Specifically, the positive and negative samples are imported into the conversion database in the coding layer for vector feature conversion processing, and the positive vector features corresponding to the positive samples and the negative vector features corresponding to the negative samples are obtained after the vector feature conversion processing.
  • S3 Perform semantic feature extraction on the positive vector feature and the negative vector feature through the long and short-term memory network, and obtain the first semantic feature corresponding to the positive vector feature and the second semantic feature corresponding to the negative vector feature.
  • the positive vector features and negative vector features are respectively imported into the semantic feature database for semantic feature extraction, and the first semantic feature corresponding to the positive vector feature and the second semantic feature corresponding to the negative vector feature are obtained after the semantic feature extraction.
  • The long short-term memory (LSTM) network is a recurrent neural network specially designed to solve the long-term dependency problem of general recurrent neural networks; all recurrent neural networks have the form of a chain of repeated neural network modules.
  • S4 Query the standard question matching the first semantic feature from the preset standard library, and obtain the standard answer vector corresponding to the standard question.
  • Specifically, the legal semantic feature that is the same as the first semantic feature is queried from the preset standard library; when such a legal semantic feature is found, the legal question corresponding to it is taken as the standard question, and the standard answer vector corresponding to the target legal question that is the same as the standard question is extracted from the preset vector library.
  • the preset standard library refers to a database specifically used to store different legal semantic features and legal questions corresponding to the legal semantic features, and the preset standard library is preset to have the same legal semantic feature as the first semantic feature.
  • the preset vector library refers to a database specially used to store the target legal questions that are the same as the legal questions in the preset standard library and the standard answer vectors corresponding to the target legal questions.
  • S5 Import the second semantic feature into the convolutional network for convolution processing to obtain the target vector, where the convolutional network includes a preset convolution kernel.
  • Specifically, the second semantic feature obtained in step S3 is subjected to convolution processing using the preset convolution kernel in the convolutional network to obtain the target vector after convolution processing.
  • the preset convolution kernel refers to a kernel function that is set according to the actual needs of the user to convert the second semantic feature into a target vector.
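  • As a rough illustration of this convolution step, assuming PyTorch, a one-dimensional convolution with a preset kernel can reduce the second semantic feature to a fixed-size target vector; the channel sizes, kernel size, and average pooling are illustrative choices, not specified by this application.

```python
import torch
import torch.nn as nn

# Treat the second semantic feature as a (batch, channels, length) tensor;
# a 1-D convolution with a preset kernel followed by average pooling over the
# length axis yields a fixed-size target vector.
second_semantic_feature = torch.randn(1, 256, 30)   # dummy feature sequence

conv = nn.Conv1d(in_channels=256, out_channels=128, kernel_size=3, padding=1)
target_vector = conv(second_semantic_feature).mean(dim=2)   # pool over the length axis
print(target_vector.shape)   # torch.Size([1, 128])
```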
  • S6 Perform loss calculation according to the standard answer vector and the target vector to obtain the loss value.
  • the standard answer vector and the target vector are imported into the preset loss calculation port for loss calculation processing, and the loss value after the loss calculation processing is output.
  • the preset loss calculation port refers to a processing port specially used for loss calculation.
  • S7 Compare the loss value with the preset threshold; if the loss value is greater than the preset threshold, iteratively update the initial multi-round question answering model until the loss value is less than or equal to the preset threshold, and take the updated initial multi-round question answering model as the target multi-round question answering model.
  • Specifically, the loss value obtained in step S6 is compared with the preset threshold. If the loss value is greater than the preset threshold, the preset loss function is used to iteratively adjust the initial parameters of each network layer in the initial multi-round question answering model; if the loss value is less than or equal to the preset threshold, the iteration is stopped, and the initial multi-round question answering model corresponding to that loss value is determined as the target multi-round question answering model.
  • It should be noted that the initial parameters are only parameters preset to facilitate the calculation of the initial multi-round question answering model, so there is necessarily an error between the standard answer vector obtained from the positive and negative samples and the target vector. This error information needs to be passed back, layer by layer, to each network layer in the initial multi-round question answering model, and each network layer adjusts its preset initial parameters accordingly to obtain a target multi-round question answering model with a better recognition effect.
  • the initial multi-round question answering model is iteratively updated until the loss value is less than or equal to the preset threshold, and the target multi-round question answering model is obtained.
  • By using the long short-term memory network to extract the first semantic feature and the second semantic feature, the information interaction between the context information in the positive and negative samples is strengthened, which effectively improves the accuracy and efficiency of model training. Comparing the loss value against the preset threshold improves the accuracy of model training, further improves the training efficiency and recognition accuracy of the target multi-round question answering model, and ensures the accuracy and efficiency of the information users obtain from the target multi-round question answering model.
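  • A minimal training-loop sketch of this compare-and-iterate procedure, assuming PyTorch, is shown below; the loss helper mirrors the cosine-plus-cross-entropy calculation of formulas (1) and (2) described later, and the threshold, epoch cap, and label semantics are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def qa_loss(standard_answer_vec, target_vec, label):
    # Cosine similarity as the predicted match probability (cf. formula (1)),
    # scored with cross-entropy against the true match label (cf. formula (2)).
    q = F.cosine_similarity(standard_answer_vec, target_vec, dim=-1)
    q = q.clamp(1e-6, 1 - 1e-6)   # keep the probability in a valid range
    return F.binary_cross_entropy(q, label)

def train_until_converged(model, optimizer, batches, threshold=0.05, max_epochs=100):
    """Iteratively update the initial model until the loss reaches the threshold."""
    for _ in range(max_epochs):
        epoch_loss = 0.0
        for standard_answer_vec, target_vec, label in batches:
            loss = qa_loss(standard_answer_vec, target_vec, label)
            optimizer.zero_grad()
            loss.backward()    # pass the error back to each network layer
            optimizer.step()   # adjust the preset initial parameters
            epoch_loss += loss.item()
        if epoch_loss / len(batches) <= threshold:
            break              # the updated model becomes the target model
    return model
```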
  • In an embodiment, step S2, in which the positive samples and negative samples are respectively imported into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector features corresponding to the negative samples, includes the following steps:
  • S21 Perform word segmentation processing on the positive sample and the negative sample to obtain the first word segmentation result corresponding to the positive sample, and the second word segmentation result corresponding to the negative sample.
  • word segmentation refers to the process of recombining continuous word sequences into word sequences according to certain specifications.
  • For example, the continuous character sequence "ABCD" is processed through word segmentation to obtain "AB" and "CD".
  • Specifically, the positive sample and the negative sample obtained in step S1 are segmented using the mechanical word segmentation method to obtain the first word segmentation result of the positive sample after the word segmentation process and the second word segmentation result of the negative sample after the word segmentation process.
  • mechanical word segmentation methods mainly include four methods: forward maximum matching, forward minimum matching, reverse maximum matching, and reverse minimum matching.
  • this proposal adopts the forward maximum matching algorithm.
  • Since the positive sample contains historical questions, historical answers, and the current question, when word segmentation is performed on the positive sample, the purpose is to segment each historical question, each historical answer, and the current question in the positive sample.
  • The obtained first word segmentation result therefore contains multiple parts, that is, the word segmentation result corresponding to each historical question, the word segmentation result corresponding to each historical answer, and the word segmentation result corresponding to the current question.
  • S22 Use the coding layer to perform vector feature conversion processing on the first word segmentation result and the second word segmentation result to obtain a positive vector feature and a negative vector feature.
  • As described in step S2, the coding layer contains a conversion database for performing vector feature conversion processing on the positive and negative samples, and the conversion database contains a preset processing library for performing vector feature conversion processing on the first word segmentation result and the second word segmentation result.
  • After the conversion processing, the positive vector feature corresponding to the first word segmentation result and the negative vector feature corresponding to the second word segmentation result are obtained.
  • the preset processing library specifically uses the word2vec model to perform vector feature conversion processing on the first word segmentation result and the second word segmentation result.
  • In this way, the positive and negative samples can be quickly and accurately converted into the first and second word segmentation results through word segmentation processing, and the word segmentation results can then be converted into positive and negative vector features, achieving accurate acquisition of the vector features and improving the accuracy of the subsequent semantic feature extraction that uses them.
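  • A minimal sketch of this word2vec conversion, assuming the gensim library, follows; the toy segmentation results and the choice of averaging token vectors into a sentence-level vector feature are illustrative assumptions, since the application only states that the word2vec model performs the conversion.

```python
from gensim.models import Word2Vec

# Toy word segmentation results: one token list per corpus.
first_word_segmentation = [["nanjing", "yangtze", "bridge"],
                           ["what", "is", "the", "annual", "fee"]]

# Train a small word2vec model on the segmented corpora.
w2v = Word2Vec(sentences=first_word_segmentation, vector_size=100, window=5, min_count=1)

def to_vector_feature(tokens, model):
    # Average the token embeddings into one sentence-level vector feature.
    return sum(model.wv[token] for token in tokens) / len(tokens)

positive_vector_feature = to_vector_feature(first_word_segmentation[0], w2v)
print(positive_vector_feature.shape)   # (100,)
```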
  • In this embodiment, each historical question, each historical answer, and the current question in the positive sample is used as a corpus, and the current answer in the negative sample is used as a corpus.
  • In an embodiment, step S21, in which the positive sample and the negative sample are subjected to word segmentation processing to obtain the first word segmentation result corresponding to the positive sample and the second word segmentation result corresponding to the negative sample, includes the following steps:
  • S211 Set the string index value and the maximum length value of word segmentation according to preset requirements.
  • the string index value refers to the position specifically used to locate the character to start scanning. If the character string index value is 0, it means that the first character is the position to start scanning the character.
  • the maximum length value is the maximum range specifically used to scan characters. If the maximum length value is 2, it means scanning at most 2 characters, and if the maximum length value is 3, it means scanning at most 3 characters.
  • Specifically, the string index value and the maximum length value of the word segmentation are set according to preset requirements. For example, the preset requirements may be to set the string index value to 0 and the maximum length value to 2; the specific settings can be chosen according to the actual needs of users, and there is no restriction here.
  • S212 For each corpus in the positive sample and the negative sample, extract the target character from the corpus according to the string index value and the maximum length value. Specifically, the corpus is scanned in a left-to-right mode, the characters from the starting scanning position up to the maximum length value are identified as the target character, and the target character is extracted.
  • the corpus is "Nanjing Yangtze River Bridge"
  • the maximum length value is 3
  • the initial value of the string index is 0.
  • the character "Nanjing City” with the maximum length value is identified as the target character, and the target character is extracted.
  • S213 Match the target character obtained in step S212 against the legal characters in the preset dictionary library.
  • the preset dictionary database refers to a database specially used for storing legal characters set by the user.
  • S214 If the matching is successful, determine the target character as a target word segmentation, and update the string index value to the current string index value plus the current maximum length value; based on the updated string index value and the maximum length value, extract target characters from the corpus for matching until the word segmentation operation on the corpus is completed.
  • Specifically, if the target character matches a legal character in the preset dictionary library, the matching is successful and the target character is determined as a target word segmentation. The string index value is then updated to the string index value used in the current step S212 plus the maximum length value used in the current step S212, and target characters are extracted from the corpus based on the updated string index value and maximum length value for further matching, until the word segmentation operation on the corpus is completed.
  • the target character "Nanjing City” matches the character in the preset dictionary library, the target character “Nanjing City” is confirmed as the target segmentation, and the string index value is Update to the current string index value 0 + the current maximum length value 3, that is, the string index value will be updated to 3, and based on the updated string index value 3 and the maximum length value 3, the target characters are extracted from the corpus for matching, That is, for the corpus "Nanjing Yangtze River Bridge", scan from the "long” character. Until the word segmentation operation on the corpus is completed.
  • S215 If the matching fails, decrement the maximum length value, and extract target characters from the corpus based on the updated maximum length value and the string index value for matching, until the word segmentation operation on the corpus is completed.
  • For example, if the target character does not match any legal character in the preset dictionary library, the matching fails and the maximum length value is updated to the current maximum length value 3 minus 1, that is, 2; based on the updated maximum length value 2 and the string index value 0, target characters are extracted from the corpus for matching until the word segmentation operation on the corpus is completed.
  • S216 If each corpus in the positive sample has completed the word segmentation operation, the word segmentation result corresponding to each corpus is used as the first word segmentation result corresponding to the positive sample; if the corpus in the negative sample has completed the word segmentation operation, the corresponding word segmentation result is used as the second word segmentation result corresponding to the negative sample.
  • In this way, each corpus in the positive sample and the negative sample is word segmented according to the string index value, the maximum length value, and the legal characters, yielding the first word segmentation result and the second word segmentation result, which improves the accuracy of the subsequent vector feature conversion processing of the word segmentation results.
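  • The forward maximum matching procedure described above can be sketched in Python as follows; the dictionary contents are illustrative, and single characters are kept as segments when no dictionary entry matches.

```python
def forward_max_match(corpus, dictionary, max_len=3):
    """Forward maximum matching segmentation following steps S211-S215:
    scan left to right, try the longest window first, shrink it on failure."""
    index = 0                    # string index value (S211)
    result = []
    while index < len(corpus):
        length = min(max_len, len(corpus) - index)
        while length > 1:
            candidate = corpus[index:index + length]   # target character (S212)
            if candidate in dictionary:                # dictionary match (S213)
                break
            length -= 1          # matching failed: decrement the length (S215)
        result.append(corpus[index:index + length])    # target word segmentation (S214)
        index += length          # update the string index value (S214)
    return result

# Illustrative dictionary for the "Nanjing Yangtze River Bridge" example.
dictionary = {"南京市", "长江", "大桥"}
print(forward_max_match("南京市长江大桥", dictionary, max_len=3))
# ['南京市', '长江', '大桥']
```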
  • In an embodiment, the long short-term memory network includes n+1 first long short-term memory network layers and 2 second long short-term memory network layers, and there are n positive vector features, where n is a positive integer greater than 1. Step S3, in which semantic features are extracted from the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic feature corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature, includes the following steps:
  • The first long short-term memory network layer refers to a network structure specifically used for semantic recognition of positive vector features and negative vector features. The n positive vector features and the negative vector feature are imported into the n+1 first long short-term memory network layers for semantic recognition; that is, each positive vector feature and the negative vector feature are each imported into one first long short-term memory network layer, and the n+1 first long short-term memory network layers output the n first recognition results and the second recognition result.
  • The first recognition results correspond to the positive vector features, and the second recognition result corresponds to the negative vector feature.
  • In this way, semantic recognition can be performed on the n positive vector features and the negative vector feature at the same time, which improves the efficiency of semantic recognition.
  • The second long short-term memory network layer refers to a network structure specifically used for semantic feature extraction from the first recognition results and the second recognition result; it is a two-way (bidirectional) LSTM.
  • A two-way LSTM consists of two LSTMs running in different directions: one LSTM reads the data from front to back in the order of the words in the sentence, and the other reads the data from back to front in reverse word order, so that the first LSTM obtains the preceding context information and the other obtains the following context information; together, the two LSTMs capture the context information of the entire sentence.
  • After two-way LSTM encoding, the hidden layer of the two-way LSTM only outputs the vectors marking the corresponding positions of the entities instead of all the encoding vectors of the entire sentence. The advantage of this is that the interference of redundant information on relationship classification is removed and only the most critical information is retained; after two-way LSTM extraction, the semantic features corresponding to the sentence are output.
  • The n first recognition results are all input into one second long short-term memory network layer for semantic feature extraction, which outputs the first semantic feature obtained from the n first recognition results; the second recognition result is input into the other second long short-term memory network layer for semantic feature extraction, which outputs the second semantic feature obtained from the second recognition result.
  • In summary, the positive vector features and the negative vector feature are semantically recognized through the first long short-term memory network layers to obtain the first recognition results and the second recognition result, and the second long short-term memory network layers perform semantic feature extraction on the first recognition results and the second recognition result to obtain the first semantic feature and the second semantic feature.
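  • A sketch of this two-stage structure, assuming PyTorch, is given below; the dimensions, the concatenation of the n first recognition results into one sequence, and the pooling of final hidden states are illustrative assumptions rather than details fixed by this application.

```python
import torch
import torch.nn as nn

class SemanticFeatureExtractor(nn.Module):
    """n+1 first LSTM layers (one per vector feature) feeding two
    bidirectional second LSTM layers, as described for step S3."""
    def __init__(self, n, input_dim=100, hidden_dim=128):
        super().__init__()
        # n layers for the n positive vector features, plus one for the negative feature.
        self.first_layers = nn.ModuleList(
            nn.LSTM(input_dim, hidden_dim, batch_first=True) for _ in range(n + 1))
        self.pos_bilstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.neg_bilstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, positive_feats, negative_feat):
        # positive_feats: list of n tensors of shape (batch, seq, input_dim).
        first_results = [layer(f)[0] for layer, f in zip(self.first_layers, positive_feats)]
        second_result = self.first_layers[-1](negative_feat)[0]
        # Stack the n first recognition results into one sequence for the BiLSTM.
        stacked = torch.cat(first_results, dim=1)
        _, (h_pos, _) = self.pos_bilstm(stacked)
        _, (h_neg, _) = self.neg_bilstm(second_result)
        first_semantic = torch.cat([h_pos[-2], h_pos[-1]], dim=-1)    # first semantic feature
        second_semantic = torch.cat([h_neg[-2], h_neg[-1]], dim=-1)   # second semantic feature
        return first_semantic, second_semantic

extractor = SemanticFeatureExtractor(n=3)
pos = [torch.randn(1, 10, 100) for _ in range(3)]
neg = torch.randn(1, 10, 100)
f1, f2 = extractor(pos, neg)
print(f1.shape, f2.shape)   # torch.Size([1, 256]) torch.Size([1, 256])
```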
  • In an embodiment, step S6, in which the loss calculation is performed according to the standard answer vector and the target vector to obtain the loss value, includes the following steps:
  • S61 Calculate the cosine similarity between the standard answer vector and the target vector; the cosine calculation result is calculated according to formula (1):
  • X = (A · B) / (|A| · |B|)  (1)
  • where X is the cosine calculation result, A is the standard answer vector, and B is the target vector.
  • S62 Perform a loss calculation according to the cosine calculation result and the cross-entropy loss function to obtain a loss value.
  • Specifically, the cosine calculation result indicates the probability, predicted by the initial multi-round question answering model, that the current question matches the current answer.
  • When the probability predicted by the initial multi-round question answering model reaches the preset target value, the current question and the current answer match; when the predicted probability does not reach the preset target value, the current question and the current answer do not match.
  • the preset target value may specifically be 0.8, or it may be set according to the actual needs of the user, and there is no limitation here.
  • The cross-entropy loss function used to calculate the loss value is shown in formula (2):
  • H(p, q) = -Σ_x p(x) · log q(x)  (2)
  • where H(p, q) is the loss value, x is 0 or 1, and p(x) is the actual state corresponding to x: if x is 0, the current question does not match the current answer and p(x) is 0; if x is 1, the current question matches the current answer and p(x) is 1; q(x) is the cosine calculation result.
  • In this way, formula (1) can quickly and accurately calculate the cosine similarity between the standard answer vector and the target vector, and formula (2) can quickly and accurately calculate the corresponding loss value from the cosine calculation result, which further ensures the accuracy of the target multi-round question answering model subsequently determined using the loss value.
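  • A small worked example of formulas (1) and (2), assuming PyTorch and dummy vectors, is shown below.

```python
import torch

A = torch.tensor([0.2, 0.8, 0.1])   # standard answer vector (dummy values)
B = torch.tensor([0.1, 0.9, 0.0])   # target vector (dummy values)

# Formula (1): cosine similarity X = (A . B) / (|A| |B|)
X = torch.dot(A, B) / (A.norm() * B.norm())

# Formula (2): cross-entropy H(p, q) = -sum_x p(x) log q(x), with p = 1 for a match
p = 1.0
loss = -(p * torch.log(X) + (1 - p) * torch.log(1 - X))
print(X.item(), loss.item())   # high similarity -> small loss
```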
  • In an embodiment, a multi-round question answering recognition device is provided, and the multi-round question answering recognition device corresponds one-to-one to the multi-round question answering recognition method in the above embodiment. As shown in FIG. 7, the multi-round question answering recognition device includes a first acquisition module 71, an import module 72, a conversion module 73, an extraction module 74, and an output module 75.
  • the detailed description of each functional module is as follows:
  • the first obtaining module 71 is used to obtain user historical questions, user historical answers, and user current questions from the user database;
  • the import module 72 is used to import user history questions, user history answers, and user current questions into the pre-trained target multi-round question answering model, where the target multi-round question answering model includes a coding unit, a long short-term memory unit, and a fully connected unit;
  • the conversion module 73 is used to perform vector feature conversion processing on the user history question, the user history answer, and the user current question through the coding unit, to obtain the first vector feature corresponding to the user history question, the second vector feature corresponding to the user history answer, and the third vector feature corresponding to the user's current question;
  • the extraction module 74 is configured to import the first vector feature, the second vector feature, and the third vector feature into the long and short-term memory unit for semantic feature extraction to obtain the target semantic feature;
  • the output module 75 is used to import the target semantic features into the fully connected unit for similarity calculation, and output the recognition result with the largest similarity.
  • the multi-round question answering recognition device further includes:
  • the second acquisition module is used to acquire historical questions, historical answers, and current questions as positive samples from the preset sample library, and acquire current answers as negative samples;
  • the vector feature conversion module is used to import the positive samples and negative samples into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector features corresponding to the negative samples.
  • the initial multi-round question answering model includes an encoding layer, a long and short-term memory network, and a convolutional network;
  • the semantic feature extraction module is used to perform semantic feature extraction on positive vector features and negative vector features through a long and short-term memory network, and obtain the first semantic feature corresponding to the positive vector feature and the second semantic feature corresponding to the negative vector feature;
  • the query module is used to query the standard question matching the first semantic feature from the preset standard library, and obtain the standard answer vector corresponding to the standard question;
  • the convolution module is used to import the second semantic feature into the convolutional network for convolution processing to obtain the target vector;
  • the loss calculation module is used to calculate the loss according to the standard answer vector and the target vector to obtain the loss value
  • the iterative update module is used to compare the loss value with a preset threshold: if the loss value is greater than the preset threshold, the initial multi-round question answering model is iteratively updated until the loss value is less than or equal to the preset threshold, and the updated initial multi-round question answering model is taken as the target multi-round question answering model.
  • the vector feature conversion module includes:
  • the word segmentation sub-module is used to perform word segmentation processing on the positive sample and the negative sample to obtain the first word segmentation result corresponding to the positive sample, and the second word segmentation result corresponding to the negative sample;
  • the initial conversion sub-module is used to perform vector feature conversion processing on the first word segmentation result and the second word segmentation result using the coding layer to obtain positive vector features and negative vector features.
  • word segmentation sub-module includes:
  • the setting unit is used to set the string index value and the maximum length value of the word segmentation according to the preset requirements
  • the character extraction unit is used to extract target characters from the corpus according to the string index value and the maximum length value for each corpus in the positive sample and the negative sample;
  • the matching unit is used to match the target character with the legal character in the preset dictionary library
  • the matching success unit is used to determine the target character as the target word segmentation if the match is successful, and update the string index value to the current string index value plus the current maximum length value, based on the updated string index value and maximum length Value, extract the target characters from the corpus for matching until the word segmentation operation on the corpus is completed;
  • the matching failure unit is used to decrement the maximum length value if the matching fails, and extract the target characters from the corpus based on the updated maximum length value and the string index value for matching until the word segmentation operation on the corpus is completed;
  • the word segmentation completion unit is used to obtain the first word segmentation result corresponding to the positive sample if each corpus in the positive sample completes the word segmentation operation, and obtain the second word segmentation result corresponding to the negative sample if the corpus in the negative sample completes the word segmentation operation .
  • semantic feature extraction module includes:
  • the semantic recognition sub-module is used to import the n positive vector features and the negative vector feature into the n+1 first long short-term memory network layers for semantic recognition, and obtain the n first recognition results corresponding to the n positive vector features and the second recognition result corresponding to the negative vector feature;
  • the feature extraction sub-module is used to import the n first recognition results and the second recognition results into two second long and short-term memory network layers respectively for semantic feature extraction to obtain the first semantic feature and the second semantic feature.
  • the loss calculation module includes:
  • the cosine calculation sub-module is used to calculate the cosine similarity between the standard answer vector and the target vector to obtain the cosine calculation result;
  • the loss value acquisition sub-module is used to calculate the loss according to the cosine calculation result and the cross entropy loss function to obtain the loss value.
  • FIG. 8 is a block diagram of the basic structure of the computer device 90 in an embodiment of the present application.
  • the computer device 90 includes a memory 91, a processor 92, and a network interface 93 that are communicatively connected to each other through a system bus. It should be pointed out that FIG. 8 only shows a computer device 90 with components 91-93, but it should be understood that it is not required to implement all the illustrated components, and more or fewer components may be implemented instead. Among them, those skilled in the art can understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions.
  • Its hardware includes, but is not limited to, a microprocessor, an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), embedded equipment, etc.
  • the computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
  • the memory 91 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
  • the memory 91 may be an internal storage unit of the computer device 90, such as a hard disk or memory of the computer device 90.
  • the memory 91 may also be an external storage device of the computer device 90, for example, a plug-in hard disk equipped on the computer device 90, a smart memory card (Smart Media Card, SMC), and a secure digital (Secure Digital, SD) card, flash card (Flash Card), etc.
  • the memory 91 may also include both an internal storage unit of the computer device 90 and an external storage device thereof.
  • the memory 91 is generally used to store an operating system and various application software installed in the computer device 90, such as computer-readable instructions of the multi-round question and answer recognition method.
  • the memory 91 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 92 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 92 is generally used to control the overall operation of the computer device 90.
  • the processor 92 is configured to run computer-readable instructions or processed data stored in the memory 91, such as computer-readable instructions for running the multi-round question and answer recognition method.
  • the network interface 93 may include a wireless network interface or a wired network interface, and the network interface 93 is generally used to establish a communication connection between the computer device 90 and other electronic devices.
  • This application also provides another implementation, namely a non-volatile computer-readable storage medium that stores a user current question information entry process, where the user current question information entry process can be executed by at least one processor, so that the at least one processor executes the steps of any one of the above multi-round question and answer recognition methods.
  • The technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to enable a computer device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the methods described in the various embodiments of this application.

Abstract

A multi-round question-and-answer identification method, a device, a computer apparatus, and a storage medium. The multi-round question-and-answer identification method comprises: importing an acquired historical user question, an acquired historical user answer, and an acquired current user question into a pre-trained target multi-round question-and-answer model; performing vector feature conversion processing by using an encoding unit in the target multi-round question-and-answer model, and obtaining a first vector feature, a second vector feature, and a third vector feature; importing the first vector feature, the second vector feature, and the third vector feature into a long short-term memory unit, performing semantic feature extraction, and obtaining a target semantic feature; and importing the target semantic feature into a fully-connected unit, performing similarity calculation, and outputting an identification result having a maximum similarity level. The method enhances accuracy and efficiency of acquiring information according to a target multi-round question-and-answer model for users.

Description

多轮问答识别方法、装置、计算机设备及存储介质Multi-round question and answer recognition method, device, computer equipment and storage medium
本申请以2019年9月24日提交的申请号为201910906819.7,名称为“多轮问答识别方法、装置、计算机设备及存储介质”的中国发明专利申请为基础,并要求其优先权。This application is based on the Chinese invention patent application filed on September 24, 2019 with the application number 201910906819.7, titled "Multi-round question and answer identification method, device, computer equipment and storage medium", and claims its priority.
技术领域Technical field
本申请涉及人工智能技术领域,尤其涉及一种多轮问答识别方法、装置、计算机设备及存储介质。This application relates to the field of artificial intelligence technology, and in particular to a multi-round question and answer recognition method, device, computer equipment and storage medium.
背景技术Background technique
传统的多轮问答模型主要是将前几轮的对话信息直接拼接,并视为一句话作为输入,发明人意识到,由于没有考虑句子与句子之间的关系,因此只能学习到词层面的语义信息而无法学习到语法层面或句子层面的语义信息,导致模型能表达的语义信息不完整,使到多轮问答模型识别的准确性不高,进而影响用户根据多轮问答模型获取信息的准确性及效率。The traditional multi-round question answering model mainly spliced the dialogue information of the previous rounds directly and regarded it as a sentence as input. The inventor realized that because the relationship between the sentence and the sentence was not considered, it could only learn the word level. Semantic information and unable to learn semantic information at the grammatical level or sentence level, resulting in incomplete semantic information that the model can express, making the recognition accuracy of the multi-round question answering model not high, which in turn affects the accuracy of the information obtained by the user according to the multi-round question answering model Sex and efficiency.
发明内容Summary of the invention
本申请实施例提供一种多轮问答识别方法、装置、计算机设备及存储介质,以解决传统多轮问答模型识别的准确性不高,影响用户根据多轮问答模型获取信息的准确性及效率的问题。The embodiments of the application provide a method, device, computer equipment, and storage medium for multi-round question and answer recognition, so as to solve the problem that the accuracy of traditional multi-round question answering model recognition is not high, which affects the accuracy and efficiency of information obtained by users according to the multi-round question answering model. problem.
一种多轮问答识别方法,包括:A method for multiple rounds of question and answer recognition, including:
从用户数据库中获取用户历史问题、用户历史答案和用户当前问题;Get user history questions, user history answers, and user current questions from the user database;
将所述用户历史问题、所述用户历史答案和所述用户当前问题导入到预先训练好的目标多轮问答模型中,其中,所述目标多轮问答模型包含编码单元、长短期记忆单元和全连接单元;The user history question, the user history answer, and the user current question are imported into a pre-trained target multi-round question answering model, where the target multi-round question answering model includes a coding unit, a long short-term memory unit, and a whole Connection unit
通过所述编码单元对所述用户历史问题、所述用户历史答案和所述用户当前问题进行向量特征转换处理,得到所述用户历史问题对应的第一向量特征,所述用户历史答案对应的第二向量特征,所述用户当前问题对应的第三向量特征;The encoding unit performs vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain the first vector feature corresponding to the user history question, and the first vector feature corresponding to the user history answer A two-vector feature, the third vector feature corresponding to the user's current question;
将所述第一向量特征、所述第二向量特征和所述第三向量特征导入到所述长短期记忆单元中进行语义特征提取,得到目标语义特征;Importing the first vector feature, the second vector feature, and the third vector feature into the long and short-term memory unit for semantic feature extraction to obtain the target semantic feature;
将所述目标语义特征导入到所述全连接单元中进行相似度计算,输出相似度最大的识别结果。The target semantic feature is imported into the fully connected unit for similarity calculation, and the recognition result with the largest similarity is output.
一种多轮问答识别装置,包括:A multi-round question and answer recognition device, including:
第一获取模块,用于从用户数据库中获取用户历史问题、用户历史答案和用户当前问题;The first obtaining module is used to obtain user historical questions, user historical answers, and user current questions from the user database;
导入模块,用于将所述用户历史问题、所述用户历史答案和所述用户当前问题导入到预先训练好的目标多轮问答模型中,其中,所述目标多轮问答模型包含编码单元、长短期记忆单元和全连接单元;The import module is used to import the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model includes a coding unit, a long Short-term memory unit and fully connected unit;
转换模块,用于通过所述编码单元对所述用户历史问题、所述用户历史答案和所述用户当前问题进行向量特征转换处理,得到所述用户历史问题对应的第一向量特征,所述用户历史答案对应的第二向量特征,所述用户当前问题对应的第三向量特征;The conversion module is configured to perform vector feature conversion processing on the user history question, the user history answer, and the user current question through the encoding unit to obtain the first vector feature corresponding to the user history question, and the user The second vector feature corresponding to the historical answer, and the third vector feature corresponding to the user's current question;
提取模块,用于将所述第一向量特征、所述第二向量特征和所述第三向量特征导入到所述长短期记忆单元中进行语义特征提取,得到目标语义特征;An extraction module, configured to import the first vector feature, the second vector feature, and the third vector feature into the long and short-term memory unit for semantic feature extraction to obtain a target semantic feature;
输出模块,用于将所述目标语义特征导入到所述全连接单元中进行相似度计算,输出相似度最大的识别结果。The output module is used to import the target semantic feature into the fully connected unit for similarity calculation, and output the recognition result with the largest similarity.
一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上 运行的计算机可读指令,所述处理器执行所述计算机可读指令时实现上述多轮问答识别方法的步骤。A computer device including a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor, and the processor implements the above-mentioned multiple rounds of question and answer recognition when the processor executes the computer-readable instructions Method steps.
一种非易失性的计算机可读存储介质,所述非易失性的计算机可读存储介质存储有计算机可读指令,所述计算机可读指令被一种处理器执行时实现上述任一种多轮问答识别方法的步骤。A non-volatile computer-readable storage medium, the non-volatile computer-readable storage medium stores computer-readable instructions, and the computer-readable instructions implement any of the foregoing when executed by a processor Steps of multiple rounds of question-and-answer recognition method.
本申请的一个或多个实施例的细节在下面的附图和描述中提出,本申请的其他特征和优点将从说明书、附图以及权利要求变得明显。The details of one or more embodiments of the present application are set forth in the following drawings and description, and other features and advantages of the present application will become apparent from the description, drawings, and claims.
附图说明Description of the drawings
为了更清楚地说明本申请实施例的技术方案,下面将对本申请实施例的描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。In order to explain the technical solutions of the embodiments of the present application more clearly, the following will briefly introduce the drawings that need to be used in the description of the embodiments of the present application. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative labor.
图1是本申请实施例提供的多轮问答识别方法的流程图;FIG. 1 is a flowchart of a method for identifying multiple rounds of question and answer provided by an embodiment of the present application;
图2是本申请实施例提供的多轮问答识别方法中对目标多轮问答模型进行训练的流程图;2 is a flowchart of training a target multi-round question answering model in the multi-round question answering recognition method provided by an embodiment of the present application;
图3是本申请实施例提供的多轮问答识别方法中步骤S2的流程图;FIG. 3 is a flowchart of step S2 in the multi-round question and answer recognition method provided by an embodiment of the present application;
图4是本申请实施例提供的多轮问答识别方法中步骤S21的流程图;4 is a flowchart of step S21 in the method for identifying multiple rounds of question and answer provided by an embodiment of the present application;
图5是本申请实施例提供的多轮问答识别方法中步骤S3的流程图;FIG. 5 is a flowchart of step S3 in the multi-round question and answer recognition method provided by the embodiment of the present application;
图6是本申请实施例提供的多轮问答识别方法中步骤S6的流程图;FIG. 6 is a flowchart of step S6 in the method for identifying multiple rounds of question and answer provided by an embodiment of the present application;
图7是本申请实施例提供的多轮问答识别装置的示意图;FIG. 7 is a schematic diagram of a multi-round question and answer recognition device provided by an embodiment of the present application;
图8是本申请实施例提供的计算机设备的基本机构框图。Fig. 8 is a basic structural block diagram of a computer device provided by an embodiment of the present application.
具体实施方式detailed description
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。The technical solutions in the embodiments of the present application will be described clearly and completely in conjunction with the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, rather than all of them. Based on the embodiments in this application, all other embodiments obtained by a person of ordinary skill in the art without creative work shall fall within the protection scope of this application.
The multi-round question-and-answer recognition method provided by the present application is applied to a server, which may be implemented as an independent server or as a server cluster composed of multiple servers. In one embodiment, as shown in FIG. 1, a multi-round question-and-answer recognition method is provided, comprising the following steps:
S101: Obtain the user's historical questions, historical answers, and current question from a user database.
In the embodiment of the present application, the user's historical questions, historical answers, and current question are obtained directly from the user database, where the user database is a database dedicated to storing the user's historical questions, historical answers, and current questions.
S102: Import the user's historical questions, historical answers, and current question into a pre-trained target multi-round question-answering model, where the target multi-round question-answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit.
In the embodiment of the present application, the pre-trained target multi-round question-answering model is a neural network model obtained by training a neural network on user-specified training data so that, in a multi-round question-and-answer scenario, it can quickly identify the current answer corresponding to the user's current question posed after multiple rounds of dialogue.
Specifically, the user's historical questions, historical answers, and current question obtained in step S101 are imported directly into the pre-trained target multi-round question-answering model.
S103: Perform vector feature conversion on the user's historical questions, historical answers, and current question through the coding unit to obtain a first vector feature corresponding to the historical questions, a second vector feature corresponding to the historical answers, and a third vector feature corresponding to the current question.
In the embodiment of the present application, the coding unit contains a vector conversion port for performing vector feature conversion on the user's historical questions, historical answers, and current question. By importing them directly into the vector conversion port of the coding unit for vector feature conversion, the first vector feature corresponding to the historical questions, the second vector feature corresponding to the historical answers, and the third vector feature corresponding to the current question are obtained.
S104: Import the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature.
Specifically, the long short-term memory unit contains a semantic feature port for extracting semantic features from the first, second, and third vector features. By importing the three vector features together into the semantic feature port of the long short-term memory unit for semantic feature extraction, the target semantic feature is obtained.
S105: Import the target semantic feature into the fully connected unit for similarity calculation, and output the recognition result with the highest similarity.
Specifically, the fully connected unit contains a preset classifier. The target semantic feature is imported into the fully connected unit; when the fully connected unit receives the target semantic feature, it uses the preset classifier to perform similarity calculation on it and outputs the recognition result with the highest similarity, i.e., this recognition result is the answer corresponding to the user's current question. The classifier is dedicated to similarity calculation.
In this embodiment, the obtained user historical questions, historical answers, and current question are imported into the pre-trained target multi-round question-answering model; the coding unit of the model performs vector feature conversion to obtain the first, second, and third vector features; the long short-term memory unit extracts semantic features from these vector features to obtain the target semantic feature; and the fully connected unit performs similarity calculation on the target semantic feature and outputs the recognition result with the highest similarity. By using the pre-trained target multi-round question-answering model, the recognition result corresponding to the user's current question can be determined quickly and accurately from the historical questions, historical answers, and current question. Moreover, since the model uses a long short-term memory unit for semantic feature extraction, the information interaction among historical questions, historical answers, and the current question is strengthened, making the model's recognition more accurate and thereby improving the accuracy and efficiency with which the user obtains information from it.
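For orientation only, the following Python sketch (using PyTorch) shows one plausible way to wire a coding unit, a long short-term memory unit, and a fully connected unit into a single model. The class name, dimensions, and the fixed candidate-answer setup are illustrative assumptions, not details taken from this application.

    import torch
    import torch.nn as nn

    class MultiRoundQAModel(nn.Module):
        # Hypothetical skeleton: an embedding as the coding unit, an LSTM as
        # the long short-term memory unit, and a linear layer as the fully
        # connected unit scoring a fixed set of candidate answers.
        def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_answers=1000):
            super().__init__()
            self.encoder = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.fc = nn.Linear(hidden_dim, num_answers)

        def forward(self, token_ids):
            # token_ids: historical questions, historical answers, and the
            # current question, tokenized and concatenated, shape (batch, seq_len)
            vectors = self.encoder(token_ids)        # vector feature conversion
            _, (h_n, _) = self.lstm(vectors)         # semantic feature extraction
            return self.fc(h_n[-1]).softmax(dim=-1)  # similarity over candidates

    model = MultiRoundQAModel(vocab_size=30000)
    token_ids = torch.randint(0, 30000, (1, 64))   # dummy input
    answer_id = model(token_ids).argmax(dim=-1)    # recognition result with the highest similarity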
In one embodiment, as shown in FIG. 2, before step S101, the multi-round question-and-answer recognition method further comprises the following steps:
S1: Obtain historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtain a current answer as a negative sample.
In the embodiment of the present application, the label information in the preset sample library is examined: when label one, label two, or label three is detected, the historical question corresponding to label one, the historical answer corresponding to label two, and the current question corresponding to label three are obtained, and the historical questions, historical answers, and current question are all determined as positive samples; when label four is detected, the current answer corresponding to label four is obtained and determined as a negative sample.
Here, the preset sample library is a database dedicated to storing different label information and the data information corresponding to each label. The label information comprises label one, label two, label three, and label four, and the data information comprises historical questions, historical answers, current questions, and current answers: the data corresponding to label one is a historical question, the data corresponding to label two is a historical answer, the data corresponding to label three is a current question, and the data corresponding to label four is a current answer.
It should be noted that there is a mapping relationship between historical questions and historical answers, i.e., each historical question has its corresponding historical answer, and there are at least five historical questions and historical answers.
S2: Import the positive samples and the negative sample respectively into the coding layer of an initial multi-round question-answering model for vector feature conversion to obtain positive vector features corresponding to the positive samples and a negative vector feature corresponding to the negative sample, where the initial multi-round question-answering model comprises a coding layer, a long short-term memory network, and a convolutional network.
In the embodiment of the present application, the initial multi-round question-answering model comprises a coding layer, a long short-term memory network, and a convolutional network. The coding layer contains a conversion database for performing vector feature conversion on positive and negative samples. The positive samples and the negative sample obtained in step S1 are respectively imported into the conversion database of the coding layer for vector feature conversion, yielding the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample.
S3: Perform semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain first semantic features corresponding to the positive vector features and a second semantic feature corresponding to the negative vector feature.
In the embodiment of the present application, the long short-term memory network contains a semantic feature library for extracting semantic features from positive and negative vector features. The positive vector features and the negative vector feature obtained in step S2 are respectively imported into the semantic feature library for semantic feature extraction, yielding the first semantic features corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature.
Here, a Long Short-Term Memory network (LSTM) is a recurrent neural network specifically designed to solve the long-term dependency problem of ordinary recurrent neural networks; all recurrent neural networks have the form of a chain of repeating neural network modules.
S4: Query a preset standard library for the standard question matching the first semantic feature, and obtain the standard answer vector corresponding to the standard question.
Specifically, based on the first semantic feature obtained in step S3, the preset standard library is queried for a legal semantic feature identical to the first semantic feature. When such a legal semantic feature is found, the legal question corresponding to it is taken as the standard question, and the standard answer vector corresponding to the target legal question identical to this standard question is extracted from a preset vector library.
Here, the preset standard library is a database dedicated to storing different legal semantic features and the legal questions corresponding to them, and it is preset to contain a legal semantic feature identical to the first semantic feature.
The preset vector library is a database dedicated to storing target legal questions identical to the legal questions in the preset standard library, together with the standard answer vectors corresponding to those target legal questions.
S5: Import the second semantic feature into the convolutional network for convolution processing to obtain a target vector.
In the embodiment of the present application, the convolutional network contains a preset convolution kernel. The second semantic feature obtained in step S3 is convolved with the preset convolution kernel of the convolutional network to obtain the target vector. Here, the preset convolution kernel is a kernel function, set according to the user's actual needs, for converting the second semantic feature into the target vector.
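As a rough illustration only (the application does not disclose the kernel or the feature dimensions), such a convolution step might look as follows in Python with PyTorch, where all shapes are assumed for the example:

    import torch
    import torch.nn as nn

    # Assumed shape: a second semantic feature of 32 time steps x 256 channels.
    second_semantic_feature = torch.randn(1, 256, 32)

    # A preset convolution kernel followed by pooling collapses the sequence
    # into a single fixed-length target vector.
    conv = nn.Conv1d(in_channels=256, out_channels=128, kernel_size=3, padding=1)
    pool = nn.AdaptiveMaxPool1d(1)

    target_vector = pool(conv(second_semantic_feature)).squeeze(-1)  # shape (1, 128)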
S6: Perform loss calculation based on the standard answer vector and the target vector to obtain a loss value.
Specifically, the standard answer vector and the target vector are imported into a preset loss calculation port for loss calculation, and the loss value after the calculation is output. Here, the preset loss calculation port is a processing port dedicated to loss calculation.
S7: Compare the loss value with a preset threshold; if the loss value is greater than the preset threshold, iteratively update the initial multi-round question-answering model until the loss value is less than or equal to the preset threshold, and take the updated initial multi-round question-answering model as the target multi-round question-answering model.
The loss value obtained in step S6 is compared with the preset threshold. If the loss value is greater than the preset threshold, the model is iteratively updated by adjusting the initial parameters of each network layer of the initial multi-round question-answering model using a preset loss function; if the loss value is less than or equal to the preset threshold, the iteration stops, and the initial multi-round question-answering model corresponding to that loss value is determined as the target multi-round question-answering model.
It should be noted that the initial parameters are merely preset to facilitate the computation of the initial multi-round question-answering model, so an error necessarily exists between the standard answer vector obtained from the positive and negative samples and the target vector. This error information needs to be propagated back, layer by layer, to each network layer of the initial multi-round question-answering model so that each layer adjusts its preset initial parameters; only then can a target multi-round question-answering model with a better recognition effect be obtained.
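A minimal sketch of this threshold-controlled update in Python, assuming a PyTorch model and a loss function computed as in step S6, might look as follows; the optimizer, learning rate, and iteration cap are illustrative choices, not details from this application:

    import torch

    def train_until_threshold(model, compute_loss, batches, preset_threshold=0.1, max_iters=10000):
        # Iteratively adjust the initial parameters of each network layer (step S7)
        # until the loss value is less than or equal to the preset threshold.
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(max_iters):
            positive, negative = next(batches)
            loss = compute_loss(model, positive, negative)  # standard answer vector vs. target vector
            if loss.item() <= preset_threshold:
                break                       # stop iterating; this model becomes the target model
            optimizer.zero_grad()
            loss.backward()                 # propagate the error layer by layer
            optimizer.step()
        return model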
In this embodiment, historical questions, historical answers, and a current question are obtained as positive samples, and a current answer is obtained as a negative sample; the coding layer of the initial multi-round question-answering model performs vector feature conversion on the positive and negative samples to obtain positive and negative vector features; the long short-term memory network extracts semantic features from them to obtain the first and second semantic features; the standard answer vector corresponding to the standard question matching the first semantic feature is obtained; the second semantic feature is imported into the convolutional network for convolution to obtain the target vector; a loss value is calculated from the standard answer vector and the target vector and compared with the preset threshold; and if the loss value is greater than the preset threshold, the initial multi-round question-answering model is iteratively updated until the loss value is less than or equal to the preset threshold, yielding the target multi-round question-answering model. Extracting the first and second semantic features with a long short-term memory network strengthens the information interaction between the contextual information in the positive and negative samples, effectively improving the accuracy and efficiency of model training; comparing the loss value with a preset threshold further improves training accuracy, the training efficiency and recognition accuracy of the target multi-round question-answering model, and thus the accuracy and efficiency with which the user obtains information from the model.
In one embodiment, as shown in FIG. 3, step S2, i.e., importing the positive samples and the negative sample respectively into the coding layer of the initial multi-round question-answering model for vector feature conversion to obtain the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample, comprises the following steps:
S21: Perform word segmentation on the positive samples and the negative sample to obtain a first word segmentation result corresponding to the positive samples and a second word segmentation result corresponding to the negative sample.
In the embodiment of the present application, word segmentation refers to the process of recombining a continuous character sequence into a word sequence according to certain specifications; for example, the continuous character sequence "ABCD" is segmented into "AB" and "CD".
Specifically, based on the positive samples and the negative sample obtained in step S1, a mechanical word segmentation method is used to segment both, obtaining the first word segmentation result of the positive samples and the second word segmentation result of the negative sample.
Mechanical word segmentation methods mainly include four approaches: forward maximum matching, forward minimum matching, reverse maximum matching, and reverse minimum matching. Preferably, the present application adopts the forward maximum matching algorithm.
It should be noted that, since the positive samples contain historical questions, historical answers, and the current question, the word segmentation of the positive samples is applied to each historical question, each historical answer, and the current question individually; the resulting first word segmentation result therefore contains multiple parts, namely the segmentation result for each historical question, each historical answer, and the current question.
S22: Use the coding layer to perform vector feature conversion on the first word segmentation result and the second word segmentation result to obtain the positive vector features and the negative vector feature.
In the embodiment of the present application, per step S2, the coding layer contains a conversion database for vector feature conversion of positive and negative samples, and this conversion database contains a preset processing library for vector feature conversion of the first and second word segmentation results.
Specifically, the first and second word segmentation results are directly imported into the preset processing library for vector feature conversion, yielding the positive vector features corresponding to the first word segmentation result and the negative vector feature corresponding to the second word segmentation result.
Here, the preset processing library specifically uses the word2vec model to perform vector feature conversion on the first and second word segmentation results.
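Since the preset processing library is stated to use the word2vec model, the following sketch shows one common way such a conversion could be done in Python with the gensim library; the corpora and hyperparameters below are placeholders, not values from this application:

    from gensim.models import Word2Vec

    # Placeholder corpora: each inner list is one word segmentation result.
    first_segmentation_results = [["南京市", "长江", "大桥"], ["历史", "问题"]]
    second_segmentation_results = [["当前", "答案"]]

    # Train word2vec on all segmentation results (vector_size is illustrative).
    w2v = Word2Vec(first_segmentation_results + second_segmentation_results,
                   vector_size=100, window=5, min_count=1)

    # Look up the vector feature of every word in each segmentation result.
    positive_vector_features = [w2v.wv[w] for w in first_segmentation_results[0]]
    negative_vector_features = [w2v.wv[w] for w in second_segmentation_results[0]]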
In this embodiment, word segmentation can quickly and accurately convert the positive and negative samples into the first and second word segmentation results, which are then converted into positive and negative vector features, thereby accurately obtaining the positive and negative vector features and improving the accuracy of the subsequent semantic feature extraction performed on them.
In one embodiment, each historical question, each historical answer, and the current question in the positive samples is treated as one corpus, and the current answer in the negative sample is treated as one corpus. As shown in FIG. 4, step S21, i.e., performing word segmentation on the positive samples and the negative sample to obtain the first word segmentation result corresponding to the positive samples and the second word segmentation result corresponding to the negative sample, comprises the following steps:
S211: Set a string index value and a maximum word length value according to preset requirements.
In the embodiment of the present application, the string index value is used to locate the position at which character scanning starts; if the string index value is 0, the first character is the starting position. The maximum length value is the maximum range of characters scanned: if the maximum length value is 2, at most 2 characters are scanned; if it is 3, at most 3 characters are scanned.
Specifically, the string index value and the maximum word length value are set according to preset requirements; for example, the string index value may be set to 0 and the maximum length value to 2. The specific settings can be configured according to the user's actual needs and are not limited here.
S212: For each corpus in the positive samples and the negative sample, extract target characters from the corpus according to the string index value and the maximum length value.
Specifically, for each corpus in the positive samples and the negative sample, the corpus is scanned from left to right according to the string index value and maximum length value obtained in step S211. When the character at the maximum length value is reached, the characters from the starting scan position up to that maximum length are identified as the target characters and extracted.
For example, if the corpus is "南京市长江大桥", the maximum length value is 3, and the initial string index value is 0, the corpus is scanned from left to right, so the characters scanned up to the maximum length are "南京市"; these characters are identified as the target characters and extracted.
S213: Match the target characters against the legal characters in a preset dictionary library.
Specifically, the target characters obtained in step S212 are matched against the legal characters in the preset dictionary library, where the preset dictionary library is a database dedicated to storing legal characters set by the user.
S214: If the match succeeds, determine the target characters as a target word, update the string index value to the current string index value plus the current maximum length value, and, based on the updated string index value and the maximum length value, extract target characters from the corpus for matching until the word segmentation of the corpus is complete.
Specifically, the target characters obtained in step S212 are matched against the legal characters in the preset dictionary library. When the target characters are identical to a legal character entry in the preset dictionary library, the match succeeds and the target characters are determined as a target word; at the same time, the string index value is updated to the string index value of the current step S212 plus the maximum length value of the current step S212. Based on the updated string index value and the maximum length value, target characters are extracted from the corpus for matching until the word segmentation of the corpus is complete.
For example, as in the example of step S212, if the target characters "南京市" match an entry in the preset dictionary library, "南京市" is confirmed as a target word, and the string index value is updated to the current string index value 0 plus the current maximum length value 3, i.e., the string index value becomes 3. Based on the updated string index value 3 and the maximum length value 3, target characters are extracted from the corpus for matching, i.e., for the corpus "南京市长江大桥", scanning continues from the character "长", until the word segmentation of the corpus is complete.
S215: If the match fails, decrement the maximum length value, and, based on the updated maximum length value and the string index value, extract target characters from the corpus for matching until the word segmentation of the corpus is complete.
Specifically, the target characters obtained in step S212 are matched against the legal characters in the preset dictionary library. When no identical legal character entry is found, the match fails, and the maximum length value is updated to the maximum length value of the current step S212 minus 1. Based on the updated maximum length value and the string index value, target characters are extracted from the corpus for matching until the word segmentation of the corpus is complete.
It should be noted that when no target characters with a maximum length value greater than 1 match any entry in the preset dictionary library, the single character is confirmed as a target word.
For example, as in the example of step S212, if the target characters "南京市" do not match any entry in the preset dictionary library, the maximum length value is updated to the current maximum length value 3 minus 1, i.e., 2, and, based on the updated maximum length value 2 and the string index value 0, target characters are extracted from the corpus for matching until the word segmentation of the corpus is complete.
S216: If every corpus in the positive samples has completed the word segmentation operation, the first word segmentation result corresponding to the positive samples is obtained; if the corpus in the negative sample has completed the word segmentation operation, the second word segmentation result corresponding to the negative sample is obtained.
Specifically, when each corpus in the positive samples has completed the word segmentation operation, the segmentation result of each corpus is taken as the first word segmentation result corresponding to the positive samples; when the corpus in the negative sample has completed the word segmentation operation, its segmentation result is taken as the second word segmentation result corresponding to the negative sample.
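The forward-maximum-matching procedure of steps S211 to S215 can be sketched in Python as follows; the function name and the dictionary contents are hypothetical, with the dictionary chosen so that the example corpus from the text segments cleanly:

    def forward_max_match(corpus, dictionary, max_len=3):
        # Steps S211-S215: scan left to right, try the longest candidate first,
        # and decrement the window length on every failed dictionary match.
        index = 0                                         # string index value
        words = []
        while index < len(corpus):
            length = min(max_len, len(corpus) - index)    # maximum word length value
            while length > 1:
                candidate = corpus[index:index + length]  # target characters (S212)
                if candidate in dictionary:               # match succeeds (S214)
                    break
                length -= 1                               # match fails: decrement (S215)
            words.append(corpus[index:index + length])    # single char if nothing matched
            index += length                               # advance the string index value
        return words

    dictionary = {"南京市", "长江", "大桥"}                  # hypothetical dictionary
    print(forward_max_match("南京市长江大桥", dictionary))   # ['南京市', '长江', '大桥']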
In this embodiment, each corpus in the positive and negative samples is segmented by setting the string index value and the maximum word length value and matching against legal characters according to them, yielding the first and second word segmentation results. This achieves accurate segmentation of each corpus in the positive and negative samples and improves the accuracy of the subsequent vector feature conversion performed on the first and second word segmentation results.
In one embodiment, the long short-term memory network comprises n+1 first long short-term memory network layers and 2 second long short-term memory network layers, and there are n positive vector features, where n is a positive integer greater than 1. As shown in FIG. 5, step S3, i.e., performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic features corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature, comprises the following steps:
S31: Import the n positive vector features and the negative vector feature respectively into the n+1 first long short-term memory network layers for semantic recognition to obtain n first recognition results corresponding to the n positive vector features and a second recognition result corresponding to the negative vector feature.
In the embodiment of the present application, a first LSTM network layer is a network structure dedicated to semantic recognition of positive and negative vector features. The n positive vector features and the negative vector feature are respectively imported into the n+1 first long short-term memory network layers for semantic recognition, i.e., each positive vector feature and the negative vector feature is imported into its own first long short-term memory network layer, and the n+1 layers output the n first recognition results and the second recognition result. Here, the first recognition results correspond to the positive vector features, and the second recognition result corresponds to the negative vector feature.
It should be noted that importing each positive vector feature and the negative vector feature into its own first long short-term memory network layer allows the n positive vector features and the negative vector feature to be semantically recognized at the same time, improving the efficiency of semantic recognition.
S32: Import the n first recognition results and the second recognition result respectively into the 2 second long short-term memory network layers for semantic feature extraction to obtain the first semantic features and the second semantic feature.
In the embodiment of the present application, a second LSTM network layer is a network structure dedicated to extracting semantic features from the first and second recognition results, and each second LSTM network layer is a bidirectional LSTM. A bidirectional LSTM consists of two LSTMs running in opposite directions: one reads the data from front to back in the order of the words in the sentence, and the other reads from back to front in the reverse order. The first LSTM thus captures the preceding context and the second captures the following context, and their joint output is the context information of the entire sentence.
After bidirectional LSTM encoding, the hidden layer of the bidirectional LSTM neurons outputs only the vectors at the positions of the marked entities rather than all the encoding vectors of the entire sentence. The advantage of this is that the interference of redundant information with relation classification is removed and only the most critical information is retained; after bidirectional LSTM extraction, the semantic features corresponding to the sentence are output.
Specifically, the n first recognition results are all input into one second long short-term memory network layer for semantic feature extraction, which outputs the first semantic features obtained from the n first recognition results; the second recognition result is input into the other second long short-term memory network layer for semantic feature extraction, which outputs the second semantic feature obtained from the second recognition result.
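A minimal sketch of this two-stage arrangement in Python with PyTorch, with all dimensions assumed for illustration, might be:

    import torch
    import torch.nn as nn

    # First LSTM layer: semantic recognition on one 128-dim vector feature sequence.
    first_lstm = nn.LSTM(input_size=128, hidden_size=64, batch_first=True)

    # Second LSTM layer: bidirectional, reading the sequence forward and backward.
    second_bilstm = nn.LSTM(input_size=64, hidden_size=64, batch_first=True,
                            bidirectional=True)

    positive_vector_feature = torch.randn(1, 10, 128)      # (batch, seq_len, dim)
    first_result, _ = first_lstm(positive_vector_feature)  # first recognition result
    semantic_feature, _ = second_bilstm(first_result)      # (1, 10, 128): forward and
                                                           # backward context concatenated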
It should be noted that, since the first recognition results are obtained from the historical questions, historical answers, and current question, combining the historical questions, historical answers, and current question during semantic feature extraction makes the extraction more precise.
In this embodiment, the first long short-term memory network layers perform semantic recognition on the positive and negative vector features to obtain the first and second recognition results, and the second long short-term memory network layers perform semantic feature extraction on the first and second recognition results to obtain the first and second semantic features. This achieves accurate extraction of the first and second semantic features, improves the accuracy of the calculations that use them, and further improves the accuracy of model training.
In one embodiment, as shown in FIG. 6, step S6, i.e., performing loss calculation according to the standard answer vector and the target vector to obtain the loss value, comprises the following steps:
S61: Obtain a cosine calculation result by performing cosine similarity calculation on the standard answer vector and the target vector.
Specifically, based on the standard answer vector and the target vector, the cosine calculation result is computed according to formula (1):
X = \frac{A \cdot B}{\|A\| \, \|B\|}    (1)
where X is the cosine calculation result, A is the standard answer vector, and B is the target vector.
S62: Perform loss calculation based on the cosine calculation result and a cross-entropy loss function to obtain the loss value.
In the embodiment of the present application, the cosine calculation result represents the probability, predicted by the initial multi-round question-answering model, that the current question and the current answer match. When the predicted probability reaches a preset target value, the current question and current answer match; when it does not, they do not match. The preset target value may specifically be 0.8, and may also be set according to the user's actual needs, which is not limited here.
Specifically, based on the cosine calculation result, the loss value is computed with the cross-entropy loss function of formula (2):
H(p, q) = -\sum_{x} p(x) \log q(x)    (2)
where H(p, q) is the loss value, x is 0 or 1, and p(x) is the actual state corresponding to x: if x is 0, the current question and the current answer do not match and p(x) is 0; if x is 1, the current question and the current answer match and p(x) is 1; q(x) is the cosine calculation result.
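A small numeric sketch of formulas (1) and (2) in Python with numpy, using made-up vectors purely for illustration:

    import numpy as np

    A = np.array([0.2, 0.8, 0.5])   # standard answer vector (illustrative values)
    B = np.array([0.1, 0.9, 0.4])   # target vector (illustrative values)

    # Formula (1): cosine similarity between the standard answer vector and the target vector.
    X = np.dot(A, B) / (np.linalg.norm(A) * np.linalg.norm(B))

    # Formula (2): cross-entropy over x in {0, 1}, with p(1) = 1 when the current
    # question and current answer actually match, and q(1) = X, q(0) = 1 - X.
    p = 1.0
    loss = -(p * np.log(X) + (1 - p) * np.log(1 - X))
    print(X, loss)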
In this embodiment, formula (1) quickly and accurately computes the cosine calculation result between the standard answer vector and the target vector, and formula (2) quickly and accurately computes the corresponding loss value from the cosine calculation result, further ensuring the accuracy of the subsequent use of the loss value to determine the target multi-round question-answering model.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
In one embodiment, a multi-round question-and-answer recognition device is provided, and the device corresponds one-to-one to the multi-round question-and-answer recognition method in the above embodiments. As shown in FIG. 7, the multi-round question-and-answer recognition device comprises
a first acquisition module 71, an import module 72, a conversion module 73, an extraction module 74, and an output module 75. The functional modules are described in detail as follows:
the first acquisition module 71 is configured to obtain the user's historical questions, historical answers, and current question from a user database;
the import module 72 is configured to import the user's historical questions, historical answers, and current question into a pre-trained target multi-round question-answering model, where the target multi-round question-answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit;
the conversion module 73 is configured to perform vector feature conversion on the user's historical questions, historical answers, and current question through the coding unit to obtain the first vector feature corresponding to the historical questions, the second vector feature corresponding to the historical answers, and the third vector feature corresponding to the current question;
the extraction module 74 is configured to import the first, second, and third vector features into the long short-term memory unit for semantic feature extraction to obtain the target semantic feature;
the output module 75 is configured to import the target semantic feature into the fully connected unit for similarity calculation and output the recognition result with the highest similarity.
Further, the multi-round question-and-answer recognition device further comprises:
a second acquisition module, configured to obtain historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtain a current answer as a negative sample;
a vector feature conversion module, configured to import the positive samples and the negative sample respectively into the coding layer of the initial multi-round question-answering model for vector feature conversion to obtain the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample, where the initial multi-round question-answering model comprises a coding layer, a long short-term memory network, and a convolutional network;
a semantic feature extraction module, configured to perform semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic features corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature;
a query module, configured to query the preset standard library for the standard question matching the first semantic feature and obtain the standard answer vector corresponding to the standard question;
a convolution module, configured to import the second semantic feature into the convolutional network for convolution processing to obtain the target vector;
a loss calculation module, configured to perform loss calculation according to the standard answer vector and the target vector to obtain the loss value;
an iterative update module, configured to compare the loss value with the preset threshold and, if the loss value is greater than the preset threshold, iteratively update the initial multi-round question-answering model until the loss value is less than or equal to the preset threshold, taking the updated initial multi-round question-answering model as the target multi-round question-answering model.
Further, the vector feature conversion module comprises:
a word segmentation sub-module, configured to perform word segmentation on the positive samples and the negative sample to obtain the first word segmentation result corresponding to the positive samples and the second word segmentation result corresponding to the negative sample;
an initial conversion sub-module, configured to use the coding layer to perform vector feature conversion on the first and second word segmentation results to obtain the positive vector features and the negative vector feature.
Further, the word segmentation sub-module comprises:
a setting unit, configured to set the string index value and the maximum word length value according to preset requirements;
a character extraction unit, configured to extract, for each corpus in the positive samples and the negative sample, target characters from the corpus according to the string index value and the maximum length value;
a matching unit, configured to match the target characters against the legal characters in the preset dictionary library;
a match success unit, configured to, if the match succeeds, determine the target characters as a target word, update the string index value to the current string index value plus the current maximum length value, and, based on the updated string index value and the maximum length value, extract target characters from the corpus for matching until the word segmentation of the corpus is complete;
a match failure unit, configured to, if the match fails, decrement the maximum length value and, based on the updated maximum length value and the string index value, extract target characters from the corpus for matching until the word segmentation of the corpus is complete;
a word segmentation completion unit, configured to obtain the first word segmentation result corresponding to the positive samples when each corpus in the positive samples has completed the word segmentation operation, and obtain the second word segmentation result corresponding to the negative sample when the corpus in the negative sample has completed the word segmentation operation.
Further, the semantic feature extraction module comprises:
a semantic recognition sub-module, configured to import the n positive vector features and the negative vector feature respectively into the n+1 first long short-term memory network layers for semantic recognition to obtain the n first recognition results corresponding to the n positive vector features and the second recognition result corresponding to the negative vector feature;
a feature extraction sub-module, configured to import the n first recognition results and the second recognition result respectively into the 2 second long short-term memory network layers for semantic feature extraction to obtain the first semantic features and the second semantic feature.
Further, the loss calculation module comprises:
a cosine calculation sub-module, configured to obtain the cosine calculation result by performing cosine similarity calculation on the standard answer vector and the target vector;
a loss value acquisition sub-module, configured to perform loss calculation according to the cosine calculation result and the cross-entropy loss function to obtain the loss value.
Some embodiments of the present application disclose a computer device. For details, refer to FIG. 8, a basic structural block diagram of the computer device 90 in one embodiment of the present application.
As shown in FIG. 8, the computer device 90 comprises a memory 91, a processor 92, and a network interface 93 that are communicatively connected to one another through a system bus. It should be pointed out that FIG. 8 shows only a computer device 90 with components 91-93, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), and embedded devices.
The computer device may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. The computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice-controlled device.
The memory 91 includes at least one type of readable storage medium, which includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical discs, and the like. In some embodiments, the memory 91 may be an internal storage unit of the computer device 90, such as the hard disk or internal memory of the computer device 90. In other embodiments, the memory 91 may also be an external storage device of the computer device 90, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card provided on the computer device 90. Of course, the memory 91 may also include both an internal storage unit of the computer device 90 and an external storage device thereof. In this embodiment, the memory 91 is generally used to store the operating system and various application software installed on the computer device 90, such as the computer-readable instructions of the multi-round question-and-answer recognition method. In addition, the memory 91 may also be used to temporarily store various data that have been output or are to be output.
In some embodiments, the processor 92 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 92 is generally used to control the overall operation of the computer device 90. In this embodiment, the processor 92 is configured to run the computer-readable instructions stored in the memory 91 or to process data, for example, to run the computer-readable instructions of the multi-round question-and-answer recognition method.
The network interface 93 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 90 and other electronic devices.
This application further provides another implementation, namely a non-volatile computer-readable storage medium storing a user current question information entry procedure, where the procedure is executable by at least one processor to cause the at least one processor to perform the steps of any one of the multi-round question-and-answer recognition methods described above.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for enabling a computer device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of this application.
Finally, it should be noted that the embodiments described above are only some of the embodiments of this application, rather than all of them; the drawings show preferred embodiments of this application but do not limit its patent scope. This application may be implemented in many different forms; rather, these embodiments are provided so that the disclosure of this application will be thorough and complete. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing specific embodiments, or make equivalent replacements of some of the technical features therein. Any equivalent structure made using the contents of the specification and drawings of this application, applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of this application.

Claims (20)

  1. A multi-round question-and-answer recognition method, wherein the multi-round question-and-answer recognition method comprises:
    obtaining a user history question, a user history answer, and a user current question from a user database;
    importing the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit;
    performing, by the coding unit, vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain a first vector feature corresponding to the user history question, a second vector feature corresponding to the user history answer, and a third vector feature corresponding to the user current question;
    importing the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature; and
    importing the target semantic feature into the fully connected unit for similarity calculation, and outputting the recognition result with the greatest similarity.
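
For illustration, the inference flow of claim 1 can be sketched in Python with PyTorch. This is a minimal sketch, not the claimed implementation: the class name MultiRoundQAModel, the embedding-based coding unit, and all dimensions (embed_dim, hidden_dim, num_answers) are assumptions, since the claim does not fix the encoder type, the number of LSTM layers, or how candidate answers are scored.

```python
import torch
import torch.nn as nn

class MultiRoundQAModel(nn.Module):
    """Hypothetical model: coding unit -> long short-term memory unit -> fully connected unit."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_answers=1000):
        super().__init__()
        self.encoder = nn.Embedding(vocab_size, embed_dim)             # coding unit
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)   # long short-term memory unit
        self.fc = nn.Linear(hidden_dim, num_answers)                   # fully connected unit

    def forward(self, history_q, history_a, current_q):
        # Vector feature conversion: first, second, and third vector features.
        feats = [self.encoder(x) for x in (history_q, history_a, current_q)]
        # Semantic feature extraction: final hidden state over the concatenated rounds.
        _, (h_n, _) = self.lstm(torch.cat(feats, dim=1))
        target_semantic = h_n[-1]
        # Similarity calculation over candidate answers; the arg-max is the
        # "recognition result with the greatest similarity".
        scores = self.fc(target_semantic)
        return scores.argmax(dim=-1)
```
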
  2. The multi-round question-and-answer recognition method according to claim 1, wherein before the step of obtaining a user history question, a user history answer, and a user current question from a user database, the multi-round question-and-answer recognition method further comprises:
    obtaining historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtaining a current answer as a negative sample;
    importing the positive samples and the negative sample respectively into a coding layer of an initial multi-round question answering model for vector feature conversion processing to obtain positive vector features corresponding to the positive samples and a negative vector feature corresponding to the negative sample, wherein the initial multi-round question answering model comprises the coding layer, a long short-term memory network, and a convolutional network;
    performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain a first semantic feature corresponding to the positive vector features and a second semantic feature corresponding to the negative vector feature;
    querying a preset standard library for a standard question matching the first semantic feature, and obtaining a standard answer vector corresponding to the standard question;
    importing the second semantic feature into the convolutional network for convolution processing to obtain a target vector;
    performing loss calculation according to the standard answer vector and the target vector to obtain a loss value; and
    comparing the loss value with a preset threshold; if the loss value is greater than the preset threshold, iteratively updating the initial multi-round question answering model until the loss value is less than or equal to the preset threshold, and taking the updated initial multi-round question answering model as the target multi-round question answering model.
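
The threshold-based stopping rule of claim 2 ("iteratively update until the loss value is less than or equal to the preset threshold") could look roughly like the loop below. This is a hypothetical sketch: get_batches and the threshold of 0.05 are placeholders, and qa_loss stands for a loss of the kind sketched after claim 6; the claim does not prescribe the optimizer or the batching scheme.

```python
import torch

def train_until_threshold(model, get_batches, qa_loss, threshold=0.05, lr=1e-3):
    """Hypothetical training loop implementing claim 2's threshold test."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss = torch.tensor(float("inf"))
    while loss.item() > threshold:                       # iterate while loss > preset threshold
        # get_batches is assumed to yield (standard answer vector, target vector) pairs,
        # i.e. the two quantities the claim compares in its loss calculation.
        for standard_answer_vec, target_vec in get_batches(model):
            loss = qa_loss(standard_answer_vec, target_vec)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                             # iterative update of the model
    return model                                         # updated model becomes the target model
```
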
  3. The multi-round question-and-answer recognition method according to claim 2, wherein the step of importing the positive samples and the negative sample respectively into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample comprises:
    performing word segmentation processing on the positive samples and the negative sample to obtain a first word segmentation result corresponding to the positive samples and a second word segmentation result corresponding to the negative sample; and
    performing vector feature conversion processing on the first word segmentation result and the second word segmentation result by using the coding layer to obtain the positive vector features and the negative vector feature.
  4. The multi-round question-and-answer recognition method according to claim 3, wherein each historical question, each historical answer, and the current question in the positive samples is taken as one corpus, the current answer in the negative sample is taken as one corpus, and the step of performing word segmentation processing on the positive samples and the negative sample to obtain the first word segmentation result corresponding to the positive samples and the second word segmentation result corresponding to the negative sample comprises:
    setting a character string index value and a maximum length value of word segmentation according to preset requirements;
    for each corpus in the positive samples and the negative sample, extracting target characters from the corpus according to the character string index value and the maximum length value;
    matching the target characters against legal characters in a preset dictionary library;
    if the matching succeeds, determining the target characters as a target word, updating the character string index value to the current character string index value plus the current maximum length value, and extracting target characters from the corpus for matching based on the updated character string index value and the maximum length value until the word segmentation operation on the corpus is completed;
    if the matching fails, decrementing the maximum length value, and extracting target characters from the corpus for matching based on the updated maximum length value and the character string index value until the word segmentation operation on the corpus is completed; and
    obtaining the first word segmentation result corresponding to the positive samples when every corpus in the positive samples has completed the word segmentation operation, and obtaining the second word segmentation result corresponding to the negative sample when the corpus in the negative sample has completed the word segmentation operation.
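
The procedure recited in claim 4 is, in effect, forward maximum matching. Below is a minimal sketch in plain Python, assuming the "preset dictionary library" is a Python set, that the maximum length value resets at each new position (the claim leaves this reset implicit), and that an unmatched single character is emitted as its own token (the claim does not say what happens when the length reaches one); these assumptions are illustrative, not part of the claim.

```python
def segment(corpus: str, dictionary: set, max_len: int = 4) -> list:
    """Hypothetical forward-maximum-matching segmentation per claim 4."""
    index = 0                       # character string index value
    tokens = []
    while index < len(corpus):
        length = max_len            # maximum length value, assumed to reset per position
        while length > 0:
            candidate = corpus[index:index + length]   # extract target characters
            if candidate in dictionary or length == 1:
                tokens.append(candidate)               # match succeeded (single-char fallback assumed)
                index += length                        # index += current maximum length value
                break
            length -= 1             # match failed: decrement the maximum length value
    return tokens

print(segment("长短期记忆网络", {"长短期", "记忆", "网络"}, max_len=3))
# -> ['长短期', '记忆', '网络']
```
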
  5. The multi-round question-and-answer recognition method according to claim 2, wherein the long short-term memory network comprises n+1 first long short-term memory network layers and 2 second long short-term memory network layers, there are n positive vector features, n being a positive integer greater than 1, and the step of performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic feature corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature comprises:
    importing the n positive vector features and the negative vector feature respectively into the n+1 first long short-term memory network layers for semantic recognition to obtain n first recognition results corresponding to the n positive vector features and a second recognition result corresponding to the negative vector feature; and
    importing the n first recognition results and the second recognition result respectively into the 2 second long short-term memory network layers for semantic feature extraction to obtain the first semantic feature and the second semantic feature.
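
One way to read claim 5's layer arrangement: each of the n positive vector features and the one negative vector feature feeds its own first LSTM layer, and the two second layers split along the positive and negative branches. The sketch below encodes that reading; the routing of recognition results to the two second layers is an interpretation of the claim, and all names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class TwoStageLSTM(nn.Module):
    """Hypothetical n+1 first layers and 2 second layers, per claim 5."""
    def __init__(self, n: int, dim: int):
        super().__init__()
        self.first = nn.ModuleList(nn.LSTM(dim, dim, batch_first=True) for _ in range(n + 1))
        self.second_pos = nn.LSTM(dim, dim, batch_first=True)   # pools the n first recognition results
        self.second_neg = nn.LSTM(dim, dim, batch_first=True)   # processes the second recognition result

    def forward(self, pos_feats, neg_feat):
        # pos_feats: list of n tensors (batch, seq, dim); neg_feat: (batch, seq, dim).
        pos_results = [lstm(x)[0][:, -1] for lstm, x in zip(self.first[:-1], pos_feats)]
        neg_result = self.first[-1](neg_feat)[0][:, -1]
        first_sem, _ = self.second_pos(torch.stack(pos_results, dim=1))  # first semantic feature
        second_sem, _ = self.second_neg(neg_result.unsqueeze(1))         # second semantic feature
        return first_sem[:, -1], second_sem[:, -1]
```
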
  6. The multi-round question-and-answer recognition method according to claim 2, wherein the step of performing loss calculation according to the standard answer vector and the target vector to obtain the loss value comprises:
    performing cosine similarity calculation on the standard answer vector and the target vector to obtain a cosine calculation result; and
    performing loss calculation according to the cosine calculation result and a cross-entropy loss function to obtain the loss value.
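
Claim 6 combines cosine similarity with a cross-entropy loss but does not specify how the similarity enters the loss. The sketch below rescales the cosine from [-1, 1] to [0, 1] and applies binary cross-entropy against a "match" label of 1; this is one common construction, assumed for illustration rather than taken from the claim.

```python
import torch
import torch.nn.functional as F

def qa_loss(standard_answer_vec: torch.Tensor, target_vec: torch.Tensor) -> torch.Tensor:
    """Hypothetical cosine + cross-entropy loss per claim 6."""
    cosine = F.cosine_similarity(standard_answer_vec, target_vec, dim=-1)  # cosine calculation result
    prob = (cosine + 1) / 2             # rescale to [0, 1] so cross-entropy is well-defined
    label = torch.ones_like(prob)       # the target vector should match the standard answer
    return F.binary_cross_entropy(prob, label)
```
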
  7. A multi-round question-and-answer recognition device, wherein the multi-round question-and-answer recognition device comprises:
    a first obtaining module, configured to obtain a user history question, a user history answer, and a user current question from a user database;
    an import module, configured to import the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit;
    a conversion module, configured to perform, by the coding unit, vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain a first vector feature corresponding to the user history question, a second vector feature corresponding to the user history answer, and a third vector feature corresponding to the user current question;
    an extraction module, configured to import the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature; and
    an output module, configured to import the target semantic feature into the fully connected unit for similarity calculation, and output the recognition result with the greatest similarity.
  8. The multi-round question-and-answer recognition device according to claim 7, wherein the multi-round question-and-answer recognition device further comprises:
    a second obtaining module, configured to obtain historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtain a current answer as a negative sample;
    a vector feature conversion module, configured to import the positive samples and the negative sample respectively into a coding layer of an initial multi-round question answering model for vector feature conversion processing to obtain positive vector features corresponding to the positive samples and a negative vector feature corresponding to the negative sample, wherein the initial multi-round question answering model comprises the coding layer, a long short-term memory network, and a convolutional network;
    a semantic feature extraction module, configured to perform semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain a first semantic feature corresponding to the positive vector features and a second semantic feature corresponding to the negative vector feature;
    a query module, configured to query a preset standard library for a standard question matching the first semantic feature, and obtain a standard answer vector corresponding to the standard question;
    a convolution module, configured to import the second semantic feature into the convolutional network for convolution processing to obtain a target vector;
    a loss calculation module, configured to perform loss calculation according to the standard answer vector and the target vector to obtain a loss value; and
    an iterative update module, configured to compare the loss value with a preset threshold, and if the loss value is greater than the preset threshold, iteratively update the initial multi-round question answering model until the loss value is less than or equal to the preset threshold, and take the updated initial multi-round question answering model as the target multi-round question answering model.
  9. The multi-round question-and-answer recognition device according to claim 8, wherein the vector feature conversion module comprises:
    a word segmentation sub-module, configured to perform word segmentation processing on the positive samples and the negative sample to obtain a first word segmentation result corresponding to the positive samples and a second word segmentation result corresponding to the negative sample; and
    an initial conversion sub-module, configured to perform vector feature conversion processing on the first word segmentation result and the second word segmentation result by using the coding layer to obtain the positive vector features and the negative vector feature.
  10. The multi-round question-and-answer recognition device according to claim 9, wherein the word segmentation sub-module comprises:
    a setting unit, configured to set a character string index value and a maximum length value of word segmentation according to preset requirements;
    a character extraction unit, configured to, for each corpus in the positive samples and the negative sample, extract target characters from the corpus according to the character string index value and the maximum length value;
    a matching unit, configured to match the target characters against legal characters in a preset dictionary library;
    a matching success unit, configured to, if the matching succeeds, determine the target characters as a target word, update the character string index value to the current character string index value plus the current maximum length value, and extract target characters from the corpus for matching based on the updated character string index value and the maximum length value until the word segmentation operation on the corpus is completed;
    a matching failure unit, configured to, if the matching fails, decrement the maximum length value, and extract target characters from the corpus for matching based on the updated maximum length value and the character string index value until the word segmentation operation on the corpus is completed; and
    a word segmentation completion unit, configured to obtain the first word segmentation result corresponding to the positive samples when every corpus in the positive samples has completed the word segmentation operation, and obtain the second word segmentation result corresponding to the negative sample when the corpus in the negative sample has completed the word segmentation operation.
  11. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    obtaining a user history question, a user history answer, and a user current question from a user database;
    importing the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit;
    performing, by the coding unit, vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain a first vector feature corresponding to the user history question, a second vector feature corresponding to the user history answer, and a third vector feature corresponding to the user current question;
    importing the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature; and
    importing the target semantic feature into the fully connected unit for similarity calculation, and outputting the recognition result with the greatest similarity.
  12. The computer device according to claim 11, wherein before the step of obtaining a user history question, a user history answer, and a user current question from a user database, the processor further implements the following steps when executing the computer-readable instructions:
    obtaining historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtaining a current answer as a negative sample;
    importing the positive samples and the negative sample respectively into a coding layer of an initial multi-round question answering model for vector feature conversion processing to obtain positive vector features corresponding to the positive samples and a negative vector feature corresponding to the negative sample, wherein the initial multi-round question answering model comprises the coding layer, a long short-term memory network, and a convolutional network;
    performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain a first semantic feature corresponding to the positive vector features and a second semantic feature corresponding to the negative vector feature;
    querying a preset standard library for a standard question matching the first semantic feature, and obtaining a standard answer vector corresponding to the standard question;
    importing the second semantic feature into the convolutional network for convolution processing to obtain a target vector;
    performing loss calculation according to the standard answer vector and the target vector to obtain a loss value; and
    comparing the loss value with a preset threshold; if the loss value is greater than the preset threshold, iteratively updating the initial multi-round question answering model until the loss value is less than or equal to the preset threshold, and taking the updated initial multi-round question answering model as the target multi-round question answering model.
  13. The computer device according to claim 12, wherein the step of importing the positive samples and the negative sample respectively into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample comprises:
    performing word segmentation processing on the positive samples and the negative sample to obtain a first word segmentation result corresponding to the positive samples and a second word segmentation result corresponding to the negative sample; and
    performing vector feature conversion processing on the first word segmentation result and the second word segmentation result by using the coding layer to obtain the positive vector features and the negative vector feature.
  14. The computer device according to claim 13, wherein each historical question, each historical answer, and the current question in the positive samples is taken as one corpus, the current answer in the negative sample is taken as one corpus, and the step of performing word segmentation processing on the positive samples and the negative sample to obtain the first word segmentation result corresponding to the positive samples and the second word segmentation result corresponding to the negative sample comprises:
    setting a character string index value and a maximum length value of word segmentation according to preset requirements;
    for each corpus in the positive samples and the negative sample, extracting target characters from the corpus according to the character string index value and the maximum length value;
    matching the target characters against legal characters in a preset dictionary library;
    if the matching succeeds, determining the target characters as a target word, updating the character string index value to the current character string index value plus the current maximum length value, and extracting target characters from the corpus for matching based on the updated character string index value and the maximum length value until the word segmentation operation on the corpus is completed;
    if the matching fails, decrementing the maximum length value, and extracting target characters from the corpus for matching based on the updated maximum length value and the character string index value until the word segmentation operation on the corpus is completed; and
    obtaining the first word segmentation result corresponding to the positive samples when every corpus in the positive samples has completed the word segmentation operation, and obtaining the second word segmentation result corresponding to the negative sample when the corpus in the negative sample has completed the word segmentation operation.
  15. The computer device according to claim 12, wherein the long short-term memory network comprises n+1 first long short-term memory network layers and 2 second long short-term memory network layers, there are n positive vector features, n being a positive integer greater than 1, and the step of performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic feature corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature comprises:
    importing the n positive vector features and the negative vector feature respectively into the n+1 first long short-term memory network layers for semantic recognition to obtain n first recognition results corresponding to the n positive vector features and a second recognition result corresponding to the negative vector feature; and
    importing the n first recognition results and the second recognition result respectively into the 2 second long short-term memory network layers for semantic feature extraction to obtain the first semantic feature and the second semantic feature.
  16. A non-volatile computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, cause the processor to perform the following steps:
    obtaining a user history question, a user history answer, and a user current question from a user database;
    importing the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit;
    performing, by the coding unit, vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain a first vector feature corresponding to the user history question, a second vector feature corresponding to the user history answer, and a third vector feature corresponding to the user current question;
    importing the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature; and
    importing the target semantic feature into the fully connected unit for similarity calculation, and outputting the recognition result with the greatest similarity.
  17. The non-volatile computer-readable storage medium according to claim 16, wherein before the step of obtaining a user history question, a user history answer, and a user current question from a user database, the computer-readable instructions, when executed by the processor, further cause the processor to perform the following steps:
    obtaining historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtaining a current answer as a negative sample;
    importing the positive samples and the negative sample respectively into a coding layer of an initial multi-round question answering model for vector feature conversion processing to obtain positive vector features corresponding to the positive samples and a negative vector feature corresponding to the negative sample, wherein the initial multi-round question answering model comprises the coding layer, a long short-term memory network, and a convolutional network;
    performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain a first semantic feature corresponding to the positive vector features and a second semantic feature corresponding to the negative vector feature;
    querying a preset standard library for a standard question matching the first semantic feature, and obtaining a standard answer vector corresponding to the standard question;
    importing the second semantic feature into the convolutional network for convolution processing to obtain a target vector;
    performing loss calculation according to the standard answer vector and the target vector to obtain a loss value; and
    comparing the loss value with a preset threshold; if the loss value is greater than the preset threshold, iteratively updating the initial multi-round question answering model until the loss value is less than or equal to the preset threshold, and taking the updated initial multi-round question answering model as the target multi-round question answering model.
  18. The non-volatile computer-readable storage medium according to claim 17, wherein the step of importing the positive samples and the negative sample respectively into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample comprises:
    performing word segmentation processing on the positive samples and the negative sample to obtain a first word segmentation result corresponding to the positive samples and a second word segmentation result corresponding to the negative sample; and
    performing vector feature conversion processing on the first word segmentation result and the second word segmentation result by using the coding layer to obtain the positive vector features and the negative vector feature.
  19. The non-volatile computer-readable storage medium according to claim 18, wherein each historical question, each historical answer, and the current question in the positive samples is taken as one corpus, the current answer in the negative sample is taken as one corpus, and the step of performing word segmentation processing on the positive samples and the negative sample to obtain the first word segmentation result corresponding to the positive samples and the second word segmentation result corresponding to the negative sample comprises:
    setting a character string index value and a maximum length value of word segmentation according to preset requirements;
    for each corpus in the positive samples and the negative sample, extracting target characters from the corpus according to the character string index value and the maximum length value;
    matching the target characters against legal characters in a preset dictionary library;
    if the matching succeeds, determining the target characters as a target word, updating the character string index value to the current character string index value plus the current maximum length value, and extracting target characters from the corpus for matching based on the updated character string index value and the maximum length value until the word segmentation operation on the corpus is completed;
    if the matching fails, decrementing the maximum length value, and extracting target characters from the corpus for matching based on the updated maximum length value and the character string index value until the word segmentation operation on the corpus is completed; and
    obtaining the first word segmentation result corresponding to the positive samples when every corpus in the positive samples has completed the word segmentation operation, and obtaining the second word segmentation result corresponding to the negative sample when the corpus in the negative sample has completed the word segmentation operation.
  20. The non-volatile computer-readable storage medium according to claim 17, wherein the long short-term memory network comprises n+1 first long short-term memory network layers and 2 second long short-term memory network layers, there are n positive vector features, n being a positive integer greater than 1, and the step of performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic feature corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature comprises:
    importing the n positive vector features and the negative vector feature respectively into the n+1 first long short-term memory network layers for semantic recognition to obtain n first recognition results corresponding to the n positive vector features and a second recognition result corresponding to the negative vector feature; and
    importing the n first recognition results and the second recognition result respectively into the 2 second long short-term memory network layers for semantic feature extraction to obtain the first semantic feature and the second semantic feature.
PCT/CN2019/116924 2019-09-24 2019-11-10 Multi-round question-and-answer identification method, device, computer apparatus, and storage medium WO2021056710A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910906819.7A CN110825857B (en) 2019-09-24 2019-09-24 Multi-round question and answer identification method and device, computer equipment and storage medium
CN201910906819.7 2019-09-24

Publications (1)

Publication Number Publication Date
WO2021056710A1 WO2021056710A1 (en)

Family

ID=69548255

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116924 WO2021056710A1 (en) 2019-09-24 2019-11-10 Multi-round question-and-answer identification method, device, computer apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN110825857B (en)
WO (1) WO2021056710A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259668B (en) * 2020-05-07 2020-08-18 腾讯科技(深圳)有限公司 Reading task processing method, model training device and computer equipment
CN112183105A (en) * 2020-08-28 2021-01-05 华为技术有限公司 Man-machine interaction method and device
CN113204633B (en) * 2021-06-01 2022-12-30 吉林大学 Semantic matching distillation method and device
CN113934824B (en) * 2021-12-15 2022-05-06 之江实验室 Similar medical record matching system and method based on multi-round intelligent question answering
CN114757208B (en) * 2022-06-10 2022-10-21 荣耀终端有限公司 Question and answer matching method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572734B (en) * 2013-10-23 2019-04-30 腾讯科技(深圳)有限公司 Method for recommending problem, apparatus and system
CN108345585A (en) * 2018-01-11 2018-07-31 浙江大学 A kind of automatic question-answering method based on deep learning
US11250038B2 (en) * 2018-01-21 2022-02-15 Microsoft Technology Licensing, Llc. Question and answer pair generation using machine learning
CN109344236B (en) * 2018-09-07 2020-09-04 暨南大学 Problem similarity calculation method based on multiple characteristics
CN109783617B (en) * 2018-12-11 2024-01-26 平安科技(深圳)有限公司 Model training method, device, equipment and storage medium for replying to questions
CN110222163B (en) * 2019-06-10 2022-10-04 福州大学 Intelligent question-answering method and system integrating CNN and bidirectional LSTM

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107766320A (en) * 2016-08-23 2018-03-06 中兴通讯股份有限公司 A kind of Chinese pronoun resolution method for establishing model and device
US20180300312A1 (en) * 2017-04-13 2018-10-18 Baidu Usa Llc Global normalized reader systems and methods
CN108733703A (en) * 2017-04-20 2018-11-02 北京京东尚科信息技术有限公司 The answer prediction technique and device of question answering system, electronic equipment, storage medium
CN108595629A (en) * 2018-04-24 2018-09-28 北京慧闻科技发展有限公司 Data processing method and the application of system are selected for answer
CN109376222A (en) * 2018-09-27 2019-02-22 国信优易数据有限公司 Question and answer matching degree calculation method, question and answer automatic matching method and device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113590783A (en) * 2021-07-28 2021-11-02 复旦大学 Traditional Chinese medicine health-preserving intelligent question-answering system based on NLP natural language processing
CN113590783B (en) * 2021-07-28 2023-10-03 复旦大学 NLP natural language processing-based traditional Chinese medicine health preserving intelligent question-answering system
CN117332823A (en) * 2023-11-28 2024-01-02 浪潮电子信息产业股份有限公司 Automatic target content generation method and device, electronic equipment and readable storage medium
CN117332823B (en) * 2023-11-28 2024-03-05 浪潮电子信息产业股份有限公司 Automatic target content generation method and device, electronic equipment and readable storage medium
CN117688163A (en) * 2024-01-29 2024-03-12 杭州有赞科技有限公司 Online intelligent question-answering method and device based on instruction fine tuning and retrieval enhancement generation
CN117688163B (en) * 2024-01-29 2024-04-23 杭州有赞科技有限公司 Online intelligent question-answering method and device based on instruction fine tuning and retrieval enhancement generation

Also Published As

Publication number Publication date
CN110825857A (en) 2020-02-21
CN110825857B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
WO2021056710A1 (en) Multi-round question-and-answer identification method, device, computer apparatus, and storage medium
WO2021135910A1 (en) Machine reading comprehension-based information extraction method and related device
WO2018153265A1 (en) Keyword extraction method, computer device, and storage medium
WO2017092380A1 (en) Method for human-computer dialogue, neural network system and user equipment
WO2021051517A1 (en) Information retrieval method based on convolutional neural network, and device related thereto
WO2021051513A1 (en) Chinese-english translation method based on neural network, and related devices thereof
CN113204611A (en) Method for establishing reading understanding model, reading understanding method and corresponding device
US20230297617A1 (en) Video retrieval method and apparatus, device, and storage medium
CN113177412A (en) Named entity identification method and system based on bert, electronic equipment and storage medium
CN116127953B (en) Chinese spelling error correction method, device and medium based on contrast learning
US20230114673A1 (en) Method for recognizing token, electronic device and storage medium
CN113593661A (en) Clinical term standardization method, device, electronic equipment and storage medium
US11615247B1 (en) Labeling method and apparatus for named entity recognition of legal instrument
CN111506726A (en) Short text clustering method and device based on part-of-speech coding and computer equipment
US11281714B2 (en) Image retrieval
CN116775918B (en) Cross-modal retrieval method, system, equipment and medium based on complementary entropy contrast learning
CN112613293A (en) Abstract generation method and device, electronic equipment and storage medium
US20230215203A1 (en) Character recognition model training method and apparatus, character recognition method and apparatus, device and storage medium
CN115858776A (en) Variant text classification recognition method, system, storage medium and electronic equipment
CN115169342A (en) Text similarity calculation method and device, electronic equipment and storage medium
WO2021082570A1 (en) Artificial intelligence-based semantic identification method, device, and semantic identification apparatus
CN115204142A (en) Open relationship extraction method, device and storage medium
CN115358227A (en) Open domain relation joint extraction method and system based on phrase enhancement
CN114220505A (en) Information extraction method of medical record data, terminal equipment and readable storage medium
WO2021042517A1 (en) Artificial intelligence-based article gist extraction method and device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19947088

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19947088

Country of ref document: EP

Kind code of ref document: A1