WO2021056710A1 - Multi-round question-and-answer identification method, device, computer apparatus, and storage medium - Google Patents

Multi-round question-and-answer identification method, device, computer apparatus, and storage medium

Info

Publication number
WO2021056710A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
feature
positive
negative
word segmentation
Prior art date
Application number
PCT/CN2019/116924
Other languages
French (fr)
Chinese (zh)
Inventor
邓悦
金戈
徐亮
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology (Shenzhen) Co., Ltd. (平安科技(深圳)有限公司)
Publication of WO2021056710A1 publication Critical patent/WO2021056710A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a multi-round question and answer recognition method, device, computer equipment and storage medium.
  • Traditional multi-round question answering models mainly concatenate the dialogue information of the previous rounds directly and treat it as a single sentence input.
  • The embodiments of this application provide a multi-round question and answer recognition method, device, computer equipment, and storage medium, so as to solve the problem that the recognition accuracy of traditional multi-round question answering models is low, which affects the accuracy and efficiency of the information users obtain from such models.
  • a method for multiple rounds of question and answer recognition including:
  • the user history question, the user history answer, and the user current question are imported into a pre-trained target multi-round question answering model, where the target multi-round question answering model includes a coding unit, a long short-term memory unit, and a fully connected unit;
  • the encoding unit performs vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain a first vector feature corresponding to the user history question, a second vector feature corresponding to the user history answer, and a third vector feature corresponding to the user's current question;
  • the first vector feature, the second vector feature, and the third vector feature are imported into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature;
  • the target semantic feature is imported into the fully connected unit for similarity calculation, and the recognition result with the largest similarity is output.
  • a multi-round question and answer recognition device including:
  • the first obtaining module is used to obtain user historical questions, user historical answers, and user current questions from the user database;
  • the import module is used to import the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model includes a coding unit, a long short-term memory unit, and a fully connected unit;
  • the conversion module is configured to perform vector feature conversion processing on the user history question, the user history answer, and the user current question through the encoding unit to obtain the first vector feature corresponding to the user history question, the second vector feature corresponding to the user history answer, and the third vector feature corresponding to the user's current question;
  • an extraction module, configured to import the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature;
  • the output module is used to import the target semantic feature into the fully connected unit for similarity calculation, and output the recognition result with the largest similarity.
  • a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor implements the steps of the above multi-round question and answer recognition method when executing the computer-readable instructions.
  • a non-volatile computer-readable storage medium storing computer-readable instructions that, when executed by a processor, implement the steps of any of the foregoing multi-round question and answer recognition methods.
  • FIG. 1 is a flowchart of a method for identifying multiple rounds of question and answer provided by an embodiment of the present application
  • FIG. 2 is a flowchart of training a target multi-round question answering model in the multi-round question answering recognition method provided by an embodiment of the present application;
  • FIG. 3 is a flowchart of step S2 in the multi-round question and answer recognition method provided by an embodiment of the present application;
  • FIG. 4 is a flowchart of step S21 in the multi-round question and answer recognition method provided by an embodiment of the present application;
  • FIG. 5 is a flowchart of step S3 in the multi-round question and answer recognition method provided by the embodiment of the present application.
  • FIG. 6 is a flowchart of step S6 in the method for identifying multiple rounds of question and answer provided by an embodiment of the present application
  • FIG. 7 is a schematic diagram of a multi-round question and answer recognition device provided by an embodiment of the present application.
  • Fig. 8 is a basic structural block diagram of a computer device provided by an embodiment of the present application.
  • the multi-round question and answer recognition method provided in this application is applied to the server, and the server can be implemented by an independent server or a server cluster composed of multiple servers.
  • a method for multi-round question and answer recognition is provided, which includes the following steps:
  • S101 Obtain user historical questions, user historical answers, and user current questions from a user database.
  • the user history questions, user history answers, and user current questions are directly obtained from the user database, where the user database refers to a database specifically used to store user history questions, user history answers, and user current questions.
  • S102 Import user history questions, user history answers, and user current questions into a pre-trained target multi-round question answering model, where the target multi-round question answering model includes a coding unit, a long short-term memory unit, and a fully connected unit.
  • In this embodiment, the pre-trained target multi-round question answering model refers to a neural network model obtained by training a neural network on a training data set specified by the user; in a multi-round question and answer scenario, given the user's current question after multiple rounds of question and answer, it can quickly identify the user's current answer corresponding to that question.
  • Specifically, the user history questions, user history answers, and user current question obtained in step S101 are imported directly into the pre-trained target multi-round question answering model.
  • S103 Perform vector feature conversion processing on the user history question, the user history answer, and the user current question through the coding unit to obtain the first vector feature corresponding to the user history question, the second vector feature corresponding to the user history answer, and the third vector feature corresponding to the user's current question.
  • The coding unit contains a vector conversion port for performing vector feature conversion processing on user history questions, user history answers, and user current questions. These are imported directly into the vector conversion port for vector feature conversion processing, yielding the first vector feature corresponding to the user's historical questions, the second vector feature corresponding to the user's historical answers, and the third vector feature corresponding to the user's current question.
  • S104 Import the first vector feature, the second vector feature, and the third vector feature into the long and short-term memory unit for semantic feature extraction to obtain the target semantic feature.
  • The long short-term memory unit contains a semantic feature port used to extract semantic features from the first vector feature, the second vector feature, and the third vector feature.
  • The vector features are imported into the semantic feature port in the long short-term memory unit for semantic feature extraction, and the target semantic feature is obtained.
  • the fully connected unit contains a preset classifier, and the target semantic feature is imported into the fully connected unit.
  • The preset classifier is used to calculate the similarity of the target semantic feature and output the recognition result with the greatest similarity; that is, the recognition result is the answer corresponding to the user's current question.
  • the classifier is specially used for similarity calculation.
  • In summary, the coding unit in the target multi-round question answering model is used for vector feature conversion processing to obtain the first vector feature, the second vector feature, and the third vector feature; the long short-term memory unit performs semantic feature extraction on these vector features to obtain the target semantic feature; and the fully connected unit calculates the similarity of the target semantic feature and outputs the recognition result with the greatest similarity.
  • By using the pre-trained target multi-round question answering model, the recognition result corresponding to the user's current question can be determined quickly and accurately based on the user's historical questions, historical answers, and current question.
  • Because the pre-trained target multi-round question answering model uses the long short-term memory unit for semantic feature extraction, the information interaction between the user's historical questions, historical answers, and current question is strengthened, so the recognition accuracy of the target multi-round question answering model is higher, thereby improving the accuracy and efficiency of the information users obtain from the target multi-round question answering model.
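  • To make the pipeline concrete, the following is a minimal sketch of how such a model could be composed, assuming PyTorch; the class name, layer sizes, and the use of an embedding layer as the coding unit are illustrative assumptions, not details from this application.

```python
import torch
import torch.nn as nn

class TargetQAModel(nn.Module):
    """Coding unit -> long short-term memory unit -> fully connected unit."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_answers=1000):
        super().__init__()
        self.encoder = nn.Embedding(vocab_size, embed_dim)             # coding unit
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)   # LSTM unit
        self.fc = nn.Linear(hidden_dim, num_answers)                   # fully connected unit

    def forward(self, history_q, history_a, current_q):
        # Concatenate the three token-id sequences along the time axis so the
        # LSTM can model the interaction between history and current question.
        tokens = torch.cat([history_q, history_a, current_q], dim=1)
        vectors = self.encoder(tokens)          # first/second/third vector features
        _, (h_n, _) = self.lstm(vectors)        # target semantic feature
        similarities = self.fc(h_n[-1])         # similarity per candidate answer
        return similarities.argmax(dim=-1)      # recognition result with largest similarity

model = TargetQAModel(vocab_size=30000)
hq = torch.randint(0, 30000, (1, 20))   # dummy token ids for a history question
ha = torch.randint(0, 30000, (1, 20))   # dummy token ids for a history answer
cq = torch.randint(0, 30000, (1, 10))   # dummy token ids for the current question
print(model(hq, ha, cq))                # index of the best-matching candidate answer
```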
  • the multi-round question and answer recognition method further includes the following steps:
  • S1 Obtain historical questions, historical answers, and current questions from the preset sample library as positive samples, and obtain current answers as negative samples. Specifically, the label information in the preset sample library is detected: when label one, label two, or label three is detected, the historical question corresponding to label one, the historical answer corresponding to label two, and the current question corresponding to label three are obtained, and the historical questions, historical answers, and current question are determined as a positive sample; when label four is detected, the current answer corresponding to label four is obtained and determined as a negative sample.
  • the preset sample library refers to a database dedicated to storing different tag information and data information corresponding to the tag information.
  • the label information includes label one, label two, label three, and label four.
  • the data information includes historical questions, historical answers, current questions, and current answers: the data information corresponding to label one is a historical question, the data information corresponding to label two is a historical answer, the data information corresponding to label three is a current question, and the data information corresponding to label four is a current answer.
  • each historical question has its corresponding historical answer, and there are at least five historical question-answer pairs.
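  • As an illustration, sample assembly from such a label scheme might look like the following Python sketch; the label constants and record format are assumptions for demonstration only.

```python
# Hypothetical label codes for records in the preset sample library.
LABEL_HISTORICAL_QUESTION = 1
LABEL_HISTORICAL_ANSWER = 2
LABEL_CURRENT_QUESTION = 3
LABEL_CURRENT_ANSWER = 4

def build_samples(records):
    """records: iterable of (label, text) pairs read from the sample library."""
    positive = {"historical_questions": [], "historical_answers": [], "current_question": None}
    negative = []
    for label, text in records:
        if label == LABEL_HISTORICAL_QUESTION:
            positive["historical_questions"].append(text)
        elif label == LABEL_HISTORICAL_ANSWER:
            positive["historical_answers"].append(text)
        elif label == LABEL_CURRENT_QUESTION:
            positive["current_question"] = text
        elif label == LABEL_CURRENT_ANSWER:
            negative.append(text)   # the current answer forms the negative sample
    return positive, negative

records = [(1, "Q1"), (2, "A1"), (1, "Q2"), (2, "A2"), (3, "Q_now"), (4, "A_now")]
positive_sample, negative_sample = build_samples(records)
```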
  • S2 Import the positive samples and the negative samples into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector features corresponding to the negative samples, where the initial multi-round question answering model includes an encoding layer, a long short-term memory network, and a convolutional network.
  • Specifically, the positive and negative samples are imported into the conversion database in the coding layer for vector feature conversion processing, and the positive vector features corresponding to the positive samples and the negative vector features corresponding to the negative samples are obtained after the vector feature conversion processing.
  • S3 Perform semantic feature extraction on the positive vector feature and the negative vector feature through the long and short-term memory network, and obtain the first semantic feature corresponding to the positive vector feature and the second semantic feature corresponding to the negative vector feature.
  • the positive vector features and negative vector features are respectively imported into the semantic feature database for semantic feature extraction, and the first semantic feature corresponding to the positive vector feature and the second semantic feature corresponding to the negative vector feature are obtained after the semantic feature extraction.
  • The long short-term memory (LSTM) network is a recurrent neural network specially designed to solve the long-term dependency problem of general recurrent neural networks; all recurrent neural networks have the form of a chain of repeated neural network modules.
  • S4 Query the standard question matching the first semantic feature from the preset standard library, and obtain the standard answer vector corresponding to the standard question.
  • Specifically, the legal semantic feature that is the same as the first semantic feature is queried from the preset standard library; when such a legal semantic feature is found, the legal question corresponding to it is taken as the standard question, and the standard answer vector corresponding to the target legal question that is the same as the standard question is extracted from the preset vector library.
  • the preset standard library refers to a database specifically used to store different legal semantic features and legal questions corresponding to the legal semantic features, and the preset standard library is preset to have the same legal semantic feature as the first semantic feature.
  • the preset vector library refers to a database specially used to store the target legal questions that are the same as the legal questions in the preset standard library and the standard answer vectors corresponding to the target legal questions.
  • S5 Import the second semantic feature into the convolutional network for convolution processing to obtain the target vector, where the convolutional network includes a preset convolution kernel.
  • Specifically, the second semantic feature obtained in step S3 is subjected to convolution processing using the preset convolution kernel in the convolutional network to obtain the target vector after convolution processing.
  • the preset convolution kernel refers to a kernel function that is set according to the actual needs of the user to convert the second semantic feature into a target vector.
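  • As a rough illustration of this convolution step, assuming PyTorch, a one-dimensional convolution with a preset kernel can reduce the second semantic feature to a fixed-size target vector; the channel sizes, kernel size, and average pooling are illustrative choices, not specified by this application.

```python
import torch
import torch.nn as nn

# Treat the second semantic feature as a (batch, channels, length) tensor;
# a 1-D convolution with a preset kernel followed by average pooling over the
# length axis yields a fixed-size target vector.
second_semantic_feature = torch.randn(1, 256, 30)   # dummy feature sequence

conv = nn.Conv1d(in_channels=256, out_channels=128, kernel_size=3, padding=1)
target_vector = conv(second_semantic_feature).mean(dim=2)   # pool over the length axis
print(target_vector.shape)   # torch.Size([1, 128])
```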
  • S6 Perform loss calculation according to the standard answer vector and the target vector to obtain the loss value.
  • the standard answer vector and the target vector are imported into the preset loss calculation port for loss calculation processing, and the loss value after the loss calculation processing is output.
  • the preset loss calculation port refers to a processing port specially used for loss calculation.
  • S7 Compare the loss value with the preset threshold; if the loss value is greater than the preset threshold, iteratively update the initial multi-round question answering model until the loss value is less than or equal to the preset threshold, and take the updated initial multi-round question answering model as the target multi-round question answering model.
  • Specifically, the loss value obtained in step S6 is compared with the preset threshold. If the loss value is greater than the preset threshold, the preset loss function is used to iteratively adjust the initial parameters of each network layer in the initial multi-round question answering model; if the loss value is less than or equal to the preset threshold, the iteration is stopped, and the initial multi-round question answering model corresponding to that loss value is determined as the target multi-round question answering model.
  • It should be noted that the initial parameters are only parameters preset to facilitate the calculation of the initial multi-round question answering model, so there is necessarily an error between the standard answer vector obtained from the positive and negative samples and the target vector. This error information needs to be passed back, layer by layer, to each network layer in the initial multi-round question answering model, and each network layer adjusts its preset initial parameters accordingly to obtain a target multi-round question answering model with a better recognition effect.
  • the initial multi-round question answering model is iteratively updated until the loss value is less than or equal to the preset threshold, and the target multi-round question answering model is obtained.
  • By using the long short-term memory network to extract the first semantic feature and the second semantic feature, the information interaction between the context information in the positive and negative samples is strengthened, which effectively improves the accuracy and efficiency of model training. Comparing the loss value against the preset threshold improves the accuracy of model training, further improves the training efficiency and recognition accuracy of the target multi-round question answering model, and ensures the accuracy and efficiency of the information users obtain from the target multi-round question answering model.
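  • A minimal training-loop sketch of this compare-and-iterate procedure, assuming PyTorch, is shown below; the loss helper mirrors the cosine-plus-cross-entropy calculation of formulas (1) and (2) described later, and the threshold, epoch cap, and label semantics are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def qa_loss(standard_answer_vec, target_vec, label):
    # Cosine similarity as the predicted match probability (cf. formula (1)),
    # scored with cross-entropy against the true match label (cf. formula (2)).
    q = F.cosine_similarity(standard_answer_vec, target_vec, dim=-1)
    q = q.clamp(1e-6, 1 - 1e-6)   # keep the probability in a valid range
    return F.binary_cross_entropy(q, label)

def train_until_converged(model, optimizer, batches, threshold=0.05, max_epochs=100):
    """Iteratively update the initial model until the loss reaches the threshold."""
    for _ in range(max_epochs):
        epoch_loss = 0.0
        for standard_answer_vec, target_vec, label in batches:
            loss = qa_loss(standard_answer_vec, target_vec, label)
            optimizer.zero_grad()
            loss.backward()    # pass the error back to each network layer
            optimizer.step()   # adjust the preset initial parameters
            epoch_loss += loss.item()
        if epoch_loss / len(batches) <= threshold:
            break              # the updated model becomes the target model
    return model
```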
  • In an embodiment, step S2, in which the positive samples and negative samples are respectively imported into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector features corresponding to the negative samples, includes the following steps:
  • S21 Perform word segmentation processing on the positive sample and the negative sample to obtain the first word segmentation result corresponding to the positive sample, and the second word segmentation result corresponding to the negative sample.
  • word segmentation refers to the process of recombining continuous word sequences into word sequences according to certain specifications.
  • For example, the continuous character sequence "ABCD" is processed through word segmentation to obtain "AB" and "CD".
  • Specifically, the positive sample and the negative sample obtained in step S1 are segmented using the mechanical word segmentation method to obtain the first word segmentation result of the positive sample after the word segmentation process and the second word segmentation result of the negative sample after the word segmentation process.
  • mechanical word segmentation methods mainly include four methods: forward maximum matching, forward minimum matching, reverse maximum matching, and reverse minimum matching.
  • this proposal adopts the forward maximum matching algorithm.
  • Since the positive sample contains historical questions, historical answers, and the current question, when word segmentation is performed on the positive sample, the purpose is to segment each historical question, each historical answer, and the current question in the positive sample.
  • The obtained first word segmentation result therefore contains multiple parts, that is, the word segmentation result corresponding to each historical question, the word segmentation result corresponding to each historical answer, and the word segmentation result corresponding to the current question.
  • S22 Use the coding layer to perform vector feature conversion processing on the first word segmentation result and the second word segmentation result to obtain a positive vector feature and a negative vector feature.
  • As described in step S2, the coding layer contains a conversion database for performing vector feature conversion processing on the positive and negative samples, and the conversion database contains a preset processing library for performing vector feature conversion processing on the first word segmentation result and the second word segmentation result.
  • After the conversion processing, the positive vector feature corresponding to the first word segmentation result and the negative vector feature corresponding to the second word segmentation result are obtained.
  • the preset processing library specifically uses the word2vec model to perform vector feature conversion processing on the first word segmentation result and the second word segmentation result.
  • In this way, the positive and negative samples can be quickly and accurately converted into the first and second word segmentation results through word segmentation processing, and the word segmentation results can then be converted into positive and negative vector features, achieving accurate acquisition of the vector features and improving the accuracy of the subsequent semantic feature extraction that uses them.
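  • A minimal sketch of this word2vec conversion, assuming the gensim library, follows; the toy segmentation results and the choice of averaging token vectors into a sentence-level vector feature are illustrative assumptions, since the application only states that the word2vec model performs the conversion.

```python
from gensim.models import Word2Vec

# Toy word segmentation results: one token list per corpus.
first_word_segmentation = [["nanjing", "yangtze", "bridge"],
                           ["what", "is", "the", "annual", "fee"]]

# Train a small word2vec model on the segmented corpora.
w2v = Word2Vec(sentences=first_word_segmentation, vector_size=100, window=5, min_count=1)

def to_vector_feature(tokens, model):
    # Average the token embeddings into one sentence-level vector feature.
    return sum(model.wv[token] for token in tokens) / len(tokens)

positive_vector_feature = to_vector_feature(first_word_segmentation[0], w2v)
print(positive_vector_feature.shape)   # (100,)
```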
  • In this embodiment, each historical question, each historical answer, and the current question in the positive sample is used as a corpus, and the current answer in the negative sample is used as a corpus.
  • In an embodiment, step S21, in which the positive sample and the negative sample are subjected to word segmentation processing to obtain the first word segmentation result corresponding to the positive sample and the second word segmentation result corresponding to the negative sample, includes the following steps:
  • S211 Set the string index value and the maximum length value of word segmentation according to preset requirements.
  • the string index value refers to the position specifically used to locate the character to start scanning. If the character string index value is 0, it means that the first character is the position to start scanning the character.
  • the maximum length value is the maximum range specifically used to scan characters. If the maximum length value is 2, it means scanning at most 2 characters, and if the maximum length value is 3, it means scanning at most 3 characters.
  • Specifically, the string index value and the maximum length value of the word segmentation are set according to preset requirements. For example, the preset requirements may be to set the string index value to 0 and the maximum length value to 2; the specific settings can be chosen according to the actual needs of users, and there is no restriction here.
  • S212 For each corpus in the positive sample and the negative sample, extract the target character from the corpus according to the string index value and the maximum length value. Specifically, the corpus is scanned in a left-to-right mode, the characters from the starting scanning position up to the maximum length value are identified as the target character, and the target character is extracted.
  • the corpus is "Nanjing Yangtze River Bridge"
  • the maximum length value is 3
  • the initial value of the string index is 0.
  • the character "Nanjing City” with the maximum length value is identified as the target character, and the target character is extracted.
  • S213 Match the target character obtained in step S212 against the legal characters in the preset dictionary library.
  • the preset dictionary database refers to a database specially used for storing legal characters set by the user.
  • S214 If the matching is successful, determine the target character as a target word segmentation, and update the string index value to the current string index value plus the current maximum length value; based on the updated string index value and the maximum length value, extract target characters from the corpus for matching until the word segmentation operation on the corpus is completed.
  • Specifically, if the target character matches a legal character in the preset dictionary library, the matching is successful and the target character is determined as a target word segmentation. The string index value is then updated to the string index value used in the current step S212 plus the maximum length value used in the current step S212, and target characters are extracted from the corpus based on the updated string index value and maximum length value for further matching, until the word segmentation operation on the corpus is completed.
  • the target character "Nanjing City” matches the character in the preset dictionary library, the target character “Nanjing City” is confirmed as the target segmentation, and the string index value is Update to the current string index value 0 + the current maximum length value 3, that is, the string index value will be updated to 3, and based on the updated string index value 3 and the maximum length value 3, the target characters are extracted from the corpus for matching, That is, for the corpus "Nanjing Yangtze River Bridge", scan from the "long” character. Until the word segmentation operation on the corpus is completed.
  • S215 If the matching fails, decrement the maximum length value, and extract target characters from the corpus based on the updated maximum length value and the string index value for matching, until the word segmentation operation on the corpus is completed.
  • For example, if the target character does not match any legal character in the preset dictionary library, the matching fails and the maximum length value is updated to the current maximum length value 3 minus 1, that is, 2; based on the updated maximum length value 2 and the string index value 0, target characters are extracted from the corpus for matching until the word segmentation operation on the corpus is completed.
  • S216 If each corpus in the positive sample has completed the word segmentation operation, the word segmentation result corresponding to each corpus is used as the first word segmentation result corresponding to the positive sample; if the corpus in the negative sample has completed the word segmentation operation, the corresponding word segmentation result is used as the second word segmentation result corresponding to the negative sample.
  • In this way, each corpus in the positive sample and the negative sample is word segmented according to the string index value, the maximum length value, and the legal characters, yielding the first word segmentation result and the second word segmentation result, which improves the accuracy of the subsequent vector feature conversion processing of the word segmentation results.
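  • The forward maximum matching procedure described above can be sketched in Python as follows; the dictionary contents are illustrative, and single characters are kept as segments when no dictionary entry matches.

```python
def forward_max_match(corpus, dictionary, max_len=3):
    """Forward maximum matching segmentation following steps S211-S215:
    scan left to right, try the longest window first, shrink it on failure."""
    index = 0                    # string index value (S211)
    result = []
    while index < len(corpus):
        length = min(max_len, len(corpus) - index)
        while length > 1:
            candidate = corpus[index:index + length]   # target character (S212)
            if candidate in dictionary:                # dictionary match (S213)
                break
            length -= 1          # matching failed: decrement the length (S215)
        result.append(corpus[index:index + length])    # target word segmentation (S214)
        index += length          # update the string index value (S214)
    return result

# Illustrative dictionary for the "Nanjing Yangtze River Bridge" example.
dictionary = {"南京市", "长江", "大桥"}
print(forward_max_match("南京市长江大桥", dictionary, max_len=3))
# ['南京市', '长江', '大桥']
```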
  • In an embodiment, the long short-term memory network includes n+1 first long short-term memory network layers and 2 second long short-term memory network layers, and there are n positive vector features, where n is a positive integer greater than 1. Step S3, in which semantic features are extracted from the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic feature corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature, includes the following steps:
  • The first long short-term memory network layer refers to a network structure specifically used for semantic recognition of positive vector features and negative vector features. The n positive vector features and the negative vector feature are imported into the n+1 first long short-term memory network layers for semantic recognition; that is, each positive vector feature and the negative vector feature are each imported into one first long short-term memory network layer, and the n+1 first long short-term memory network layers output the n first recognition results and the second recognition result.
  • The first recognition results correspond to the positive vector features, and the second recognition result corresponds to the negative vector feature.
  • In this way, semantic recognition can be performed on the n positive vector features and the negative vector feature at the same time, which improves the efficiency of semantic recognition.
  • The second long short-term memory network layer refers to a network structure specifically used for semantic feature extraction from the first recognition results and the second recognition result; it is a two-way (bidirectional) LSTM.
  • A two-way LSTM consists of two LSTMs running in different directions: one LSTM reads the data from front to back in the order of the words in the sentence, and the other reads the data from back to front in reverse word order, so that the first LSTM obtains the preceding context information and the other obtains the following context information; together, the two LSTMs capture the context information of the entire sentence.
  • After two-way LSTM encoding, the hidden layer of the two-way LSTM only outputs the vectors marking the corresponding positions of the entities instead of all the encoding vectors of the entire sentence. The advantage of this is that the interference of redundant information on relationship classification is removed and only the most critical information is retained; after two-way LSTM extraction, the semantic features corresponding to the sentence are output.
  • The n first recognition results are all input into one second long short-term memory network layer for semantic feature extraction, which outputs the first semantic feature obtained from the n first recognition results; the second recognition result is input into the other second long short-term memory network layer for semantic feature extraction, which outputs the second semantic feature obtained from the second recognition result.
  • In summary, the positive vector features and the negative vector feature are semantically recognized through the first long short-term memory network layers to obtain the first recognition results and the second recognition result, and the second long short-term memory network layers perform semantic feature extraction on the first recognition results and the second recognition result to obtain the first semantic feature and the second semantic feature.
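  • A sketch of this two-stage structure, assuming PyTorch, is given below; the dimensions, the concatenation of the n first recognition results into one sequence, and the pooling of final hidden states are illustrative assumptions rather than details fixed by this application.

```python
import torch
import torch.nn as nn

class SemanticFeatureExtractor(nn.Module):
    """n+1 first LSTM layers (one per vector feature) feeding two
    bidirectional second LSTM layers, as described for step S3."""
    def __init__(self, n, input_dim=100, hidden_dim=128):
        super().__init__()
        # n layers for the n positive vector features, plus one for the negative feature.
        self.first_layers = nn.ModuleList(
            nn.LSTM(input_dim, hidden_dim, batch_first=True) for _ in range(n + 1))
        self.pos_bilstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.neg_bilstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, positive_feats, negative_feat):
        # positive_feats: list of n tensors of shape (batch, seq, input_dim).
        first_results = [layer(f)[0] for layer, f in zip(self.first_layers, positive_feats)]
        second_result = self.first_layers[-1](negative_feat)[0]
        # Stack the n first recognition results into one sequence for the BiLSTM.
        stacked = torch.cat(first_results, dim=1)
        _, (h_pos, _) = self.pos_bilstm(stacked)
        _, (h_neg, _) = self.neg_bilstm(second_result)
        first_semantic = torch.cat([h_pos[-2], h_pos[-1]], dim=-1)    # first semantic feature
        second_semantic = torch.cat([h_neg[-2], h_neg[-1]], dim=-1)   # second semantic feature
        return first_semantic, second_semantic

extractor = SemanticFeatureExtractor(n=3)
pos = [torch.randn(1, 10, 100) for _ in range(3)]
neg = torch.randn(1, 10, 100)
f1, f2 = extractor(pos, neg)
print(f1.shape, f2.shape)   # torch.Size([1, 256]) torch.Size([1, 256])
```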
  • In an embodiment, step S6, in which the loss calculation is performed according to the standard answer vector and the target vector to obtain the loss value, includes the following steps:
  • S61 Calculate the cosine similarity between the standard answer vector and the target vector; the cosine calculation result is calculated according to formula (1):
  • X = (A · B) / (|A| · |B|)  (1)
  • where X is the cosine calculation result, A is the standard answer vector, and B is the target vector.
  • S62 Perform a loss calculation according to the cosine calculation result and the cross-entropy loss function to obtain a loss value.
  • Specifically, the cosine calculation result indicates the probability, predicted by the initial multi-round question answering model, that the current question matches the current answer.
  • When the probability predicted by the initial multi-round question answering model reaches the preset target value, the current question and the current answer match; when the predicted probability does not reach the preset target value, the current question and the current answer do not match.
  • the preset target value may specifically be 0.8, or it may be set according to the actual needs of the user, and there is no limitation here.
  • The cross-entropy loss function used to calculate the loss value is shown in formula (2):
  • H(p, q) = -Σ_x p(x) · log q(x)  (2)
  • where H(p, q) is the loss value, x is 0 or 1, and p(x) is the actual state corresponding to x: if x is 0, the current question does not match the current answer and p(x) is 0; if x is 1, the current question matches the current answer and p(x) is 1; q(x) is the cosine calculation result.
  • In this way, formula (1) can quickly and accurately calculate the cosine similarity between the standard answer vector and the target vector, and formula (2) can quickly and accurately calculate the corresponding loss value from the cosine calculation result, which further ensures the accuracy of the target multi-round question answering model subsequently determined using the loss value.
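  • A small worked example of formulas (1) and (2), assuming PyTorch and dummy vectors, is shown below.

```python
import torch

A = torch.tensor([0.2, 0.8, 0.1])   # standard answer vector (dummy values)
B = torch.tensor([0.1, 0.9, 0.0])   # target vector (dummy values)

# Formula (1): cosine similarity X = (A . B) / (|A| |B|)
X = torch.dot(A, B) / (A.norm() * B.norm())

# Formula (2): cross-entropy H(p, q) = -sum_x p(x) log q(x), with p = 1 for a match
p = 1.0
loss = -(p * torch.log(X) + (1 - p) * torch.log(1 - X))
print(X.item(), loss.item())   # high similarity -> small loss
```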
  • In an embodiment, a multi-round question answering recognition device is provided, and the multi-round question answering recognition device corresponds one-to-one to the multi-round question answering recognition method in the above embodiment. As shown in FIG. 7, the multi-round question answering recognition device includes a first acquisition module 71, an import module 72, a conversion module 73, an extraction module 74, and an output module 75.
  • the detailed description of each functional module is as follows:
  • the first obtaining module 71 is used to obtain user historical questions, user historical answers, and user current questions from the user database;
  • the import module 72 is used to import user history questions, user history answers, and user current questions into the pre-trained target multi-round question answering model, where the target multi-round question answering model includes a coding unit, a long short-term memory unit, and a fully connected unit;
  • the conversion module 73 is used to perform vector feature conversion processing on the user history question, the user history answer, and the user current question through the coding unit, to obtain the first vector feature corresponding to the user history question, the second vector feature corresponding to the user history answer, and the third vector feature corresponding to the user's current question;
  • the extraction module 74 is configured to import the first vector feature, the second vector feature, and the third vector feature into the long and short-term memory unit for semantic feature extraction to obtain the target semantic feature;
  • the output module 75 is used to import the target semantic features into the fully connected unit for similarity calculation, and output the recognition result with the largest similarity.
  • the multi-round question answering recognition device further includes:
  • the second acquisition module is used to acquire historical questions, historical answers, and current questions as positive samples from the preset sample library, and acquire current answers as negative samples;
  • the vector feature conversion module is used to import the positive samples and negative samples into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector features corresponding to the negative samples.
  • the initial multi-round question answering model includes an encoding layer, a long and short-term memory network, and a convolutional network;
  • the semantic feature extraction module is used to perform semantic feature extraction on positive vector features and negative vector features through a long and short-term memory network, and obtain the first semantic feature corresponding to the positive vector feature and the second semantic feature corresponding to the negative vector feature;
  • the query module is used to query the standard question matching the first semantic feature from the preset standard library, and obtain the standard answer vector corresponding to the standard question;
  • the convolution module is used to import the second semantic feature into the convolutional network for convolution processing to obtain the target vector;
  • the loss calculation module is used to calculate the loss according to the standard answer vector and the target vector to obtain the loss value
  • the iterative update module is used to compare the loss value with a preset threshold: if the loss value is greater than the preset threshold, the initial multi-round question answering model is iteratively updated until the loss value is less than or equal to the preset threshold, and the updated initial multi-round question answering model is taken as the target multi-round question answering model.
  • the vector feature conversion module includes:
  • the word segmentation sub-module is used to perform word segmentation processing on the positive sample and the negative sample to obtain the first word segmentation result corresponding to the positive sample, and the second word segmentation result corresponding to the negative sample;
  • the initial conversion sub-module is used to perform vector feature conversion processing on the first word segmentation result and the second word segmentation result using the coding layer to obtain positive vector features and negative vector features.
  • word segmentation sub-module includes:
  • the setting unit is used to set the string index value and the maximum length value of the word segmentation according to the preset requirements
  • the character extraction unit is used to extract target characters from the corpus according to the string index value and the maximum length value for each corpus in the positive sample and the negative sample;
  • the matching unit is used to match the target character with the legal character in the preset dictionary library
  • the matching success unit is used to determine the target character as the target word segmentation if the match is successful, and update the string index value to the current string index value plus the current maximum length value, based on the updated string index value and maximum length Value, extract the target characters from the corpus for matching until the word segmentation operation on the corpus is completed;
  • the matching failure unit is used to decrement the maximum length value if the matching fails, and extract the target characters from the corpus based on the updated maximum length value and the string index value for matching until the word segmentation operation on the corpus is completed;
  • the word segmentation completion unit is used to obtain the first word segmentation result corresponding to the positive sample if each corpus in the positive sample completes the word segmentation operation, and obtain the second word segmentation result corresponding to the negative sample if the corpus in the negative sample completes the word segmentation operation .
  • semantic feature extraction module includes:
  • the semantic recognition sub-module is used to import the n positive vector features and the negative vector feature into the n+1 first long short-term memory network layers for semantic recognition, and obtain the n first recognition results corresponding to the n positive vector features and the second recognition result corresponding to the negative vector feature;
  • the feature extraction sub-module is used to import the n first recognition results and the second recognition results into two second long and short-term memory network layers respectively for semantic feature extraction to obtain the first semantic feature and the second semantic feature.
  • the loss calculation module includes:
  • the cosine calculation sub-module is used to calculate the cosine similarity between the standard answer vector and the target vector to obtain the cosine calculation result;
  • the loss value acquisition sub-module is used to calculate the loss according to the cosine calculation result and the cross entropy loss function to obtain the loss value.
  • FIG. 8 is a block diagram of the basic structure of the computer device 90 in an embodiment of the present application.
  • the computer device 90 includes a memory 91, a processor 92, and a network interface 93 that are communicatively connected to each other through a system bus. It should be pointed out that FIG. 8 only shows a computer device 90 with components 91-93, but it should be understood that it is not required to implement all the illustrated components, and more or fewer components may be implemented instead. Among them, those skilled in the art can understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions.
  • Its hardware includes, but is not limited to, a microprocessor, an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), embedded equipment, etc.
  • the computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
  • the memory 91 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
  • the memory 91 may be an internal storage unit of the computer device 90, such as a hard disk or memory of the computer device 90.
  • the memory 91 may also be an external storage device of the computer device 90, for example, a plug-in hard disk equipped on the computer device 90, a smart memory card (Smart Media Card, SMC), and a secure digital (Secure Digital, SD) card, flash card (Flash Card), etc.
  • the memory 91 may also include both an internal storage unit of the computer device 90 and an external storage device thereof.
  • the memory 91 is generally used to store an operating system and various application software installed in the computer device 90, such as computer-readable instructions of the multi-round question and answer recognition method.
  • the memory 91 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 92 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 92 is generally used to control the overall operation of the computer device 90.
  • the processor 92 is configured to run computer-readable instructions or processed data stored in the memory 91, such as computer-readable instructions for running the multi-round question and answer recognition method.
  • the network interface 93 may include a wireless network interface or a wired network interface, and the network interface 93 is generally used to establish a communication connection between the computer device 90 and other electronic devices.
  • This application also provides another implementation, namely a non-volatile computer-readable storage medium that stores a user current question information entry process, where the user current question information entry process can be executed by at least one processor, so that the at least one processor executes the steps of any one of the above multi-round question and answer recognition methods.
  • The technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to enable a computer device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the methods described in the various embodiments of this application.

Abstract

A multi-round question-and-answer identification method, a device, a computer apparatus, and a storage medium. The multi-round question-and-answer identification method comprises: importing an acquired historical user question, an acquired historical user answer, and an acquired current user question into a pre-trained target multi-round question-and-answer model; performing vector feature conversion processing by using an encoding unit in the target multi-round question-and-answer model, and obtaining a first vector feature, a second vector feature, and a third vector feature; importing the first vector feature, the second vector feature, and the third vector feature into a long short-term memory unit, performing semantic feature extraction, and obtaining a target semantic feature; and importing the target semantic feature into a fully-connected unit, performing similarity calculation, and outputting an identification result having a maximum similarity level. The method enhances accuracy and efficiency of acquiring information according to a target multi-round question-and-answer model for users.

Description

多轮问答识别方法、装置、计算机设备及存储介质Multi-round question and answer recognition method, device, computer equipment and storage medium
本申请以2019年9月24日提交的申请号为201910906819.7,名称为“多轮问答识别方法、装置、计算机设备及存储介质”的中国发明专利申请为基础,并要求其优先权。This application is based on the Chinese invention patent application filed on September 24, 2019 with the application number 201910906819.7, titled "Multi-round question and answer identification method, device, computer equipment and storage medium", and claims its priority.
技术领域Technical field
本申请涉及人工智能技术领域,尤其涉及一种多轮问答识别方法、装置、计算机设备及存储介质。This application relates to the field of artificial intelligence technology, and in particular to a multi-round question and answer recognition method, device, computer equipment and storage medium.
背景技术Background technique
传统的多轮问答模型主要是将前几轮的对话信息直接拼接,并视为一句话作为输入,发明人意识到,由于没有考虑句子与句子之间的关系,因此只能学习到词层面的语义信息而无法学习到语法层面或句子层面的语义信息,导致模型能表达的语义信息不完整,使到多轮问答模型识别的准确性不高,进而影响用户根据多轮问答模型获取信息的准确性及效率。The traditional multi-round question answering model mainly spliced the dialogue information of the previous rounds directly and regarded it as a sentence as input. The inventor realized that because the relationship between the sentence and the sentence was not considered, it could only learn the word level. Semantic information and unable to learn semantic information at the grammatical level or sentence level, resulting in incomplete semantic information that the model can express, making the recognition accuracy of the multi-round question answering model not high, which in turn affects the accuracy of the information obtained by the user according to the multi-round question answering model Sex and efficiency.
发明内容Summary of the invention
本申请实施例提供一种多轮问答识别方法、装置、计算机设备及存储介质,以解决传统多轮问答模型识别的准确性不高,影响用户根据多轮问答模型获取信息的准确性及效率的问题。The embodiments of the application provide a method, device, computer equipment, and storage medium for multi-round question and answer recognition, so as to solve the problem that the accuracy of traditional multi-round question answering model recognition is not high, which affects the accuracy and efficiency of information obtained by users according to the multi-round question answering model. problem.
一种多轮问答识别方法,包括:A method for multiple rounds of question and answer recognition, including:
从用户数据库中获取用户历史问题、用户历史答案和用户当前问题;Get user history questions, user history answers, and user current questions from the user database;
将所述用户历史问题、所述用户历史答案和所述用户当前问题导入到预先训练好的目标多轮问答模型中,其中,所述目标多轮问答模型包含编码单元、长短期记忆单元和全连接单元;The user history question, the user history answer, and the user current question are imported into a pre-trained target multi-round question answering model, where the target multi-round question answering model includes a coding unit, a long short-term memory unit, and a whole Connection unit
通过所述编码单元对所述用户历史问题、所述用户历史答案和所述用户当前问题进行向量特征转换处理,得到所述用户历史问题对应的第一向量特征,所述用户历史答案对应的第二向量特征,所述用户当前问题对应的第三向量特征;The encoding unit performs vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain the first vector feature corresponding to the user history question, and the first vector feature corresponding to the user history answer A two-vector feature, the third vector feature corresponding to the user's current question;
将所述第一向量特征、所述第二向量特征和所述第三向量特征导入到所述长短期记忆单元中进行语义特征提取,得到目标语义特征;Importing the first vector feature, the second vector feature, and the third vector feature into the long and short-term memory unit for semantic feature extraction to obtain the target semantic feature;
将所述目标语义特征导入到所述全连接单元中进行相似度计算,输出相似度最大的识别结果。The target semantic feature is imported into the fully connected unit for similarity calculation, and the recognition result with the largest similarity is output.
一种多轮问答识别装置,包括:A multi-round question and answer recognition device, including:
第一获取模块,用于从用户数据库中获取用户历史问题、用户历史答案和用户当前问题;The first obtaining module is used to obtain user historical questions, user historical answers, and user current questions from the user database;
导入模块,用于将所述用户历史问题、所述用户历史答案和所述用户当前问题导入到预先训练好的目标多轮问答模型中,其中,所述目标多轮问答模型包含编码单元、长短期记忆单元和全连接单元;The import module is used to import the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model includes a coding unit, a long Short-term memory unit and fully connected unit;
转换模块,用于通过所述编码单元对所述用户历史问题、所述用户历史答案和所述用户当前问题进行向量特征转换处理,得到所述用户历史问题对应的第一向量特征,所述用户历史答案对应的第二向量特征,所述用户当前问题对应的第三向量特征;The conversion module is configured to perform vector feature conversion processing on the user history question, the user history answer, and the user current question through the encoding unit to obtain the first vector feature corresponding to the user history question, and the user The second vector feature corresponding to the historical answer, and the third vector feature corresponding to the user's current question;
提取模块,用于将所述第一向量特征、所述第二向量特征和所述第三向量特征导入到所述长短期记忆单元中进行语义特征提取,得到目标语义特征;An extraction module, configured to import the first vector feature, the second vector feature, and the third vector feature into the long and short-term memory unit for semantic feature extraction to obtain a target semantic feature;
输出模块,用于将所述目标语义特征导入到所述全连接单元中进行相似度计算,输出相似度最大的识别结果。The output module is used to import the target semantic feature into the fully connected unit for similarity calculation, and output the recognition result with the largest similarity.
一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上 运行的计算机可读指令,所述处理器执行所述计算机可读指令时实现上述多轮问答识别方法的步骤。A computer device including a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor, and the processor implements the above-mentioned multiple rounds of question and answer recognition when the processor executes the computer-readable instructions Method steps.
一种非易失性的计算机可读存储介质,所述非易失性的计算机可读存储介质存储有计算机可读指令,所述计算机可读指令被一种处理器执行时实现上述任一种多轮问答识别方法的步骤。A non-volatile computer-readable storage medium, the non-volatile computer-readable storage medium stores computer-readable instructions, and the computer-readable instructions implement any of the foregoing when executed by a processor Steps of multiple rounds of question-and-answer recognition method.
本申请的一个或多个实施例的细节在下面的附图和描述中提出,本申请的其他特征和优点将从说明书、附图以及权利要求变得明显。The details of one or more embodiments of the present application are set forth in the following drawings and description, and other features and advantages of the present application will become apparent from the description, drawings, and claims.
附图说明Description of the drawings
为了更清楚地说明本申请实施例的技术方案,下面将对本申请实施例的描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。In order to explain the technical solutions of the embodiments of the present application more clearly, the following will briefly introduce the drawings that need to be used in the description of the embodiments of the present application. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative labor.
图1是本申请实施例提供的多轮问答识别方法的流程图;FIG. 1 is a flowchart of a method for identifying multiple rounds of question and answer provided by an embodiment of the present application;
图2是本申请实施例提供的多轮问答识别方法中对目标多轮问答模型进行训练的流程图;2 is a flowchart of training a target multi-round question answering model in the multi-round question answering recognition method provided by an embodiment of the present application;
图3是本申请实施例提供的多轮问答识别方法中步骤S2的流程图;FIG. 3 is a flowchart of step S2 in the multi-round question and answer recognition method provided by an embodiment of the present application;
图4是本申请实施例提供的多轮问答识别方法中步骤S21的流程图;4 is a flowchart of step S21 in the method for identifying multiple rounds of question and answer provided by an embodiment of the present application;
图5是本申请实施例提供的多轮问答识别方法中步骤S3的流程图;FIG. 5 is a flowchart of step S3 in the multi-round question and answer recognition method provided by the embodiment of the present application;
图6是本申请实施例提供的多轮问答识别方法中步骤S6的流程图;FIG. 6 is a flowchart of step S6 in the method for identifying multiple rounds of question and answer provided by an embodiment of the present application;
图7是本申请实施例提供的多轮问答识别装置的示意图;FIG. 7 is a schematic diagram of a multi-round question and answer recognition device provided by an embodiment of the present application;
图8是本申请实施例提供的计算机设备的基本机构框图。Fig. 8 is a basic structural block diagram of a computer device provided by an embodiment of the present application.
具体实施方式detailed description
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。The technical solutions in the embodiments of the present application will be described clearly and completely in conjunction with the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, rather than all of them. Based on the embodiments in this application, all other embodiments obtained by a person of ordinary skill in the art without creative work shall fall within the protection scope of this application.
The multi-round question-and-answer recognition method provided by the present application is applied to a server, which may be implemented as an independent server or as a server cluster composed of multiple servers. In one embodiment, as shown in FIG. 1, a multi-round question-and-answer recognition method is provided, comprising the following steps:
S101: Obtain the user's historical questions, historical answers, and current question from a user database.
In the embodiment of the present application, the user's historical questions, historical answers, and current question are obtained directly from the user database, where the user database is a database dedicated to storing the user's historical questions, historical answers, and current questions.
S102: Import the user's historical questions, historical answers, and current question into a pre-trained target multi-round question-answering model, where the target multi-round question-answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit.
In the embodiment of the present application, the pre-trained target multi-round question-answering model is a neural network model obtained by training a neural network on user-specified training data so that, in a multi-round question-and-answer scenario, it can quickly identify the current answer corresponding to the user's current question posed after multiple rounds of dialogue.
Specifically, the user's historical questions, historical answers, and current question obtained in step S101 are imported directly into the pre-trained target multi-round question-answering model.
S103: Perform vector feature conversion on the user's historical questions, historical answers, and current question through the coding unit to obtain a first vector feature corresponding to the historical questions, a second vector feature corresponding to the historical answers, and a third vector feature corresponding to the current question.
In the embodiment of the present application, the coding unit contains a vector conversion port for performing vector feature conversion on the user's historical questions, historical answers, and current question. By importing them directly into the vector conversion port of the coding unit for vector feature conversion, the first vector feature corresponding to the historical questions, the second vector feature corresponding to the historical answers, and the third vector feature corresponding to the current question are obtained.
S104: Import the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature.
Specifically, the long short-term memory unit contains a semantic feature port for extracting semantic features from the first, second, and third vector features. By importing the three vector features together into the semantic feature port of the long short-term memory unit for semantic feature extraction, the target semantic feature is obtained.
S105: Import the target semantic feature into the fully connected unit for similarity calculation, and output the recognition result with the highest similarity.
Specifically, the fully connected unit contains a preset classifier. The target semantic feature is imported into the fully connected unit; when the fully connected unit receives the target semantic feature, it uses the preset classifier to perform similarity calculation on it and outputs the recognition result with the highest similarity, i.e., this recognition result is the answer corresponding to the user's current question. The classifier is dedicated to similarity calculation.
In this embodiment, the obtained user historical questions, historical answers, and current question are imported into the pre-trained target multi-round question-answering model; the coding unit of the model performs vector feature conversion to obtain the first, second, and third vector features; the long short-term memory unit extracts semantic features from these vector features to obtain the target semantic feature; and the fully connected unit performs similarity calculation on the target semantic feature and outputs the recognition result with the highest similarity. By using the pre-trained target multi-round question-answering model, the recognition result corresponding to the user's current question can be determined quickly and accurately from the historical questions, historical answers, and current question. Moreover, since the model uses a long short-term memory unit for semantic feature extraction, the information interaction among historical questions, historical answers, and the current question is strengthened, making the model's recognition more accurate and thereby improving the accuracy and efficiency with which the user obtains information from it.
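For orientation only, the following Python sketch (using PyTorch) shows one plausible way to wire a coding unit, a long short-term memory unit, and a fully connected unit into a single model. The class name, dimensions, and the fixed candidate-answer setup are illustrative assumptions, not details taken from this application.

    import torch
    import torch.nn as nn

    class MultiRoundQAModel(nn.Module):
        # Hypothetical skeleton: an embedding as the coding unit, an LSTM as
        # the long short-term memory unit, and a linear layer as the fully
        # connected unit scoring a fixed set of candidate answers.
        def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_answers=1000):
            super().__init__()
            self.encoder = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.fc = nn.Linear(hidden_dim, num_answers)

        def forward(self, token_ids):
            # token_ids: historical questions, historical answers, and the
            # current question, tokenized and concatenated, shape (batch, seq_len)
            vectors = self.encoder(token_ids)        # vector feature conversion
            _, (h_n, _) = self.lstm(vectors)         # semantic feature extraction
            return self.fc(h_n[-1]).softmax(dim=-1)  # similarity over candidates

    model = MultiRoundQAModel(vocab_size=30000)
    token_ids = torch.randint(0, 30000, (1, 64))   # dummy input
    answer_id = model(token_ids).argmax(dim=-1)    # recognition result with the highest similarity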
In one embodiment, as shown in FIG. 2, before step S101, the multi-round question-and-answer recognition method further comprises the following steps:
S1: Obtain historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtain a current answer as a negative sample.
In the embodiment of the present application, the label information in the preset sample library is examined: when label one, label two, or label three is detected, the historical question corresponding to label one, the historical answer corresponding to label two, and the current question corresponding to label three are obtained, and the historical questions, historical answers, and current question are all determined as positive samples; when label four is detected, the current answer corresponding to label four is obtained and determined as a negative sample.
Here, the preset sample library is a database dedicated to storing different label information and the data information corresponding to each label. The label information comprises label one, label two, label three, and label four, and the data information comprises historical questions, historical answers, current questions, and current answers: the data corresponding to label one is a historical question, the data corresponding to label two is a historical answer, the data corresponding to label three is a current question, and the data corresponding to label four is a current answer.
It should be noted that there is a mapping relationship between historical questions and historical answers, i.e., each historical question has its corresponding historical answer, and there are at least five historical questions and historical answers.
S2: Import the positive samples and the negative sample respectively into the coding layer of an initial multi-round question-answering model for vector feature conversion to obtain positive vector features corresponding to the positive samples and a negative vector feature corresponding to the negative sample, where the initial multi-round question-answering model comprises a coding layer, a long short-term memory network, and a convolutional network.
In the embodiment of the present application, the initial multi-round question-answering model comprises a coding layer, a long short-term memory network, and a convolutional network. The coding layer contains a conversion database for performing vector feature conversion on positive and negative samples. The positive samples and the negative sample obtained in step S1 are respectively imported into the conversion database of the coding layer for vector feature conversion, yielding the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample.
S3: Perform semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain first semantic features corresponding to the positive vector features and a second semantic feature corresponding to the negative vector feature.
In the embodiment of the present application, the long short-term memory network contains a semantic feature library for extracting semantic features from positive and negative vector features. The positive vector features and the negative vector feature obtained in step S2 are respectively imported into the semantic feature library for semantic feature extraction, yielding the first semantic features corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature.
Here, a Long Short-Term Memory network (LSTM) is a recurrent neural network specifically designed to solve the long-term dependency problem of ordinary recurrent neural networks; all recurrent neural networks have the form of a chain of repeating neural network modules.
S4: Query a preset standard library for the standard question matching the first semantic feature, and obtain the standard answer vector corresponding to the standard question.
Specifically, based on the first semantic feature obtained in step S3, the preset standard library is queried for a legal semantic feature identical to the first semantic feature. When such a legal semantic feature is found, the legal question corresponding to it is taken as the standard question, and the standard answer vector corresponding to the target legal question identical to this standard question is extracted from a preset vector library.
Here, the preset standard library is a database dedicated to storing different legal semantic features and the legal questions corresponding to them, and it is preset to contain a legal semantic feature identical to the first semantic feature.
The preset vector library is a database dedicated to storing target legal questions identical to the legal questions in the preset standard library, together with the standard answer vectors corresponding to those target legal questions.
S5: Import the second semantic feature into the convolutional network for convolution processing to obtain a target vector.
In the embodiment of the present application, the convolutional network contains a preset convolution kernel. The second semantic feature obtained in step S3 is convolved with the preset convolution kernel of the convolutional network to obtain the target vector. Here, the preset convolution kernel is a kernel function, set according to the user's actual needs, for converting the second semantic feature into the target vector.
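As a rough illustration only (the application does not disclose the kernel or the feature dimensions), such a convolution step might look as follows in Python with PyTorch, where all shapes are assumed for the example:

    import torch
    import torch.nn as nn

    # Assumed shape: a second semantic feature of 32 time steps x 256 channels.
    second_semantic_feature = torch.randn(1, 256, 32)

    # A preset convolution kernel followed by pooling collapses the sequence
    # into a single fixed-length target vector.
    conv = nn.Conv1d(in_channels=256, out_channels=128, kernel_size=3, padding=1)
    pool = nn.AdaptiveMaxPool1d(1)

    target_vector = pool(conv(second_semantic_feature)).squeeze(-1)  # shape (1, 128)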
S6: Perform loss calculation based on the standard answer vector and the target vector to obtain a loss value.
Specifically, the standard answer vector and the target vector are imported into a preset loss calculation port for loss calculation, and the loss value after the calculation is output. Here, the preset loss calculation port is a processing port dedicated to loss calculation.
S7: Compare the loss value with a preset threshold; if the loss value is greater than the preset threshold, iteratively update the initial multi-round question-answering model until the loss value is less than or equal to the preset threshold, and take the updated initial multi-round question-answering model as the target multi-round question-answering model.
The loss value obtained in step S6 is compared with the preset threshold. If the loss value is greater than the preset threshold, the model is iteratively updated by adjusting the initial parameters of each network layer of the initial multi-round question-answering model using a preset loss function; if the loss value is less than or equal to the preset threshold, the iteration stops, and the initial multi-round question-answering model corresponding to that loss value is determined as the target multi-round question-answering model.
It should be noted that the initial parameters are merely preset to facilitate the computation of the initial multi-round question-answering model, so an error necessarily exists between the standard answer vector obtained from the positive and negative samples and the target vector. This error information needs to be propagated back, layer by layer, to each network layer of the initial multi-round question-answering model so that each layer adjusts its preset initial parameters; only then can a target multi-round question-answering model with a better recognition effect be obtained.
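A minimal sketch of this threshold-controlled update in Python, assuming a PyTorch model and a loss function computed as in step S6, might look as follows; the optimizer, learning rate, and iteration cap are illustrative choices, not details from this application:

    import torch

    def train_until_threshold(model, compute_loss, batches, preset_threshold=0.1, max_iters=10000):
        # Iteratively adjust the initial parameters of each network layer (step S7)
        # until the loss value is less than or equal to the preset threshold.
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(max_iters):
            positive, negative = next(batches)
            loss = compute_loss(model, positive, negative)  # standard answer vector vs. target vector
            if loss.item() <= preset_threshold:
                break                       # stop iterating; this model becomes the target model
            optimizer.zero_grad()
            loss.backward()                 # propagate the error layer by layer
            optimizer.step()
        return model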
In this embodiment, historical questions, historical answers, and a current question are obtained as positive samples, and a current answer is obtained as a negative sample; the coding layer of the initial multi-round question-answering model performs vector feature conversion on the positive and negative samples to obtain positive and negative vector features; the long short-term memory network extracts semantic features from them to obtain the first and second semantic features; the standard answer vector corresponding to the standard question matching the first semantic feature is obtained; the second semantic feature is imported into the convolutional network for convolution to obtain the target vector; a loss value is calculated from the standard answer vector and the target vector and compared with the preset threshold; and if the loss value is greater than the preset threshold, the initial multi-round question-answering model is iteratively updated until the loss value is less than or equal to the preset threshold, yielding the target multi-round question-answering model. Extracting the first and second semantic features with a long short-term memory network strengthens the information interaction between the contextual information in the positive and negative samples, effectively improving the accuracy and efficiency of model training; comparing the loss value with a preset threshold further improves training accuracy, the training efficiency and recognition accuracy of the target multi-round question-answering model, and thus the accuracy and efficiency with which the user obtains information from the model.
In one embodiment, as shown in FIG. 3, step S2, i.e., importing the positive samples and the negative sample respectively into the coding layer of the initial multi-round question-answering model for vector feature conversion to obtain the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample, comprises the following steps:
S21: Perform word segmentation on the positive samples and the negative sample to obtain a first word segmentation result corresponding to the positive samples and a second word segmentation result corresponding to the negative sample.
In the embodiment of the present application, word segmentation refers to the process of recombining a continuous character sequence into a word sequence according to certain specifications; for example, the continuous character sequence "ABCD" is segmented into "AB" and "CD".
Specifically, based on the positive samples and the negative sample obtained in step S1, a mechanical word segmentation method is used to segment both, obtaining the first word segmentation result of the positive samples and the second word segmentation result of the negative sample.
Mechanical word segmentation methods mainly include four approaches: forward maximum matching, forward minimum matching, reverse maximum matching, and reverse minimum matching. Preferably, the present application adopts the forward maximum matching algorithm.
It should be noted that, since the positive samples contain historical questions, historical answers, and the current question, the word segmentation of the positive samples is applied to each historical question, each historical answer, and the current question individually; the resulting first word segmentation result therefore contains multiple parts, namely the segmentation result for each historical question, each historical answer, and the current question.
S22: Use the coding layer to perform vector feature conversion on the first word segmentation result and the second word segmentation result to obtain the positive vector features and the negative vector feature.
In the embodiment of the present application, per step S2, the coding layer contains a conversion database for vector feature conversion of positive and negative samples, and this conversion database contains a preset processing library for vector feature conversion of the first and second word segmentation results.
Specifically, the first and second word segmentation results are directly imported into the preset processing library for vector feature conversion, yielding the positive vector features corresponding to the first word segmentation result and the negative vector feature corresponding to the second word segmentation result.
Here, the preset processing library specifically uses the word2vec model to perform vector feature conversion on the first and second word segmentation results.
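Since the preset processing library is stated to use the word2vec model, the following sketch shows one common way such a conversion could be done in Python with the gensim library; the corpora and hyperparameters below are placeholders, not values from this application:

    from gensim.models import Word2Vec

    # Placeholder corpora: each inner list is one word segmentation result.
    first_segmentation_results = [["南京市", "长江", "大桥"], ["历史", "问题"]]
    second_segmentation_results = [["当前", "答案"]]

    # Train word2vec on all segmentation results (vector_size is illustrative).
    w2v = Word2Vec(first_segmentation_results + second_segmentation_results,
                   vector_size=100, window=5, min_count=1)

    # Look up the vector feature of every word in each segmentation result.
    positive_vector_features = [w2v.wv[w] for w in first_segmentation_results[0]]
    negative_vector_features = [w2v.wv[w] for w in second_segmentation_results[0]]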
In this embodiment, word segmentation can quickly and accurately convert the positive and negative samples into the first and second word segmentation results, which are then converted into positive and negative vector features, thereby accurately obtaining the positive and negative vector features and improving the accuracy of the subsequent semantic feature extraction performed on them.
In one embodiment, each historical question, each historical answer, and the current question in the positive samples is treated as one corpus, and the current answer in the negative sample is treated as one corpus. As shown in FIG. 4, step S21, i.e., performing word segmentation on the positive samples and the negative sample to obtain the first word segmentation result corresponding to the positive samples and the second word segmentation result corresponding to the negative sample, comprises the following steps:
S211: Set a string index value and a maximum word length value according to preset requirements.
In the embodiment of the present application, the string index value is used to locate the position at which character scanning starts; if the string index value is 0, the first character is the starting position. The maximum length value is the maximum range of characters scanned: if the maximum length value is 2, at most 2 characters are scanned; if it is 3, at most 3 characters are scanned.
Specifically, the string index value and the maximum word length value are set according to preset requirements; for example, the string index value may be set to 0 and the maximum length value to 2. The specific settings can be configured according to the user's actual needs and are not limited here.
S212: For each corpus in the positive samples and the negative sample, extract target characters from the corpus according to the string index value and the maximum length value.
Specifically, for each corpus in the positive samples and the negative sample, the corpus is scanned from left to right according to the string index value and maximum length value obtained in step S211. When the character at the maximum length value is reached, the characters from the starting scan position up to that maximum length are identified as the target characters and extracted.
For example, if the corpus is "南京市长江大桥", the maximum length value is 3, and the initial string index value is 0, the corpus is scanned from left to right, so the characters scanned up to the maximum length are "南京市"; these characters are identified as the target characters and extracted.
S213: Match the target characters against the legal characters in a preset dictionary library.
Specifically, the target characters obtained in step S212 are matched against the legal characters in the preset dictionary library, where the preset dictionary library is a database dedicated to storing legal characters set by the user.
S214: If the match succeeds, determine the target characters as a target word, update the string index value to the current string index value plus the current maximum length value, and, based on the updated string index value and the maximum length value, extract target characters from the corpus for matching until the word segmentation of the corpus is complete.
Specifically, the target characters obtained in step S212 are matched against the legal characters in the preset dictionary library. When the target characters are identical to a legal character entry in the preset dictionary library, the match succeeds and the target characters are determined as a target word; at the same time, the string index value is updated to the string index value of the current step S212 plus the maximum length value of the current step S212. Based on the updated string index value and the maximum length value, target characters are extracted from the corpus for matching until the word segmentation of the corpus is complete.
For example, as in the example of step S212, if the target characters "南京市" match an entry in the preset dictionary library, "南京市" is confirmed as a target word, and the string index value is updated to the current string index value 0 plus the current maximum length value 3, i.e., the string index value becomes 3. Based on the updated string index value 3 and the maximum length value 3, target characters are extracted from the corpus for matching, i.e., for the corpus "南京市长江大桥", scanning continues from the character "长", until the word segmentation of the corpus is complete.
S215: If the match fails, decrement the maximum length value, and, based on the updated maximum length value and the string index value, extract target characters from the corpus for matching until the word segmentation of the corpus is complete.
Specifically, the target characters obtained in step S212 are matched against the legal characters in the preset dictionary library. When no identical legal character entry is found, the match fails, and the maximum length value is updated to the maximum length value of the current step S212 minus 1. Based on the updated maximum length value and the string index value, target characters are extracted from the corpus for matching until the word segmentation of the corpus is complete.
It should be noted that when no target characters with a maximum length value greater than 1 match any entry in the preset dictionary library, the single character is confirmed as a target word.
For example, as in the example of step S212, if the target characters "南京市" do not match any entry in the preset dictionary library, the maximum length value is updated to the current maximum length value 3 minus 1, i.e., 2, and, based on the updated maximum length value 2 and the string index value 0, target characters are extracted from the corpus for matching until the word segmentation of the corpus is complete.
S216: If every corpus in the positive samples has completed the word segmentation operation, the first word segmentation result corresponding to the positive samples is obtained; if the corpus in the negative sample has completed the word segmentation operation, the second word segmentation result corresponding to the negative sample is obtained.
Specifically, when each corpus in the positive samples has completed the word segmentation operation, the segmentation result of each corpus is taken as the first word segmentation result corresponding to the positive samples; when the corpus in the negative sample has completed the word segmentation operation, its segmentation result is taken as the second word segmentation result corresponding to the negative sample.
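The forward-maximum-matching procedure of steps S211 to S215 can be sketched in Python as follows; the function name and the dictionary contents are hypothetical, with the dictionary chosen so that the example corpus from the text segments cleanly:

    def forward_max_match(corpus, dictionary, max_len=3):
        # Steps S211-S215: scan left to right, try the longest candidate first,
        # and decrement the window length on every failed dictionary match.
        index = 0                                         # string index value
        words = []
        while index < len(corpus):
            length = min(max_len, len(corpus) - index)    # maximum word length value
            while length > 1:
                candidate = corpus[index:index + length]  # target characters (S212)
                if candidate in dictionary:               # match succeeds (S214)
                    break
                length -= 1                               # match fails: decrement (S215)
            words.append(corpus[index:index + length])    # single char if nothing matched
            index += length                               # advance the string index value
        return words

    dictionary = {"南京市", "长江", "大桥"}                  # hypothetical dictionary
    print(forward_max_match("南京市长江大桥", dictionary))   # ['南京市', '长江', '大桥']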
In this embodiment, each corpus in the positive and negative samples is segmented by setting the string index value and the maximum word length value and matching against legal characters according to them, yielding the first and second word segmentation results. This achieves accurate segmentation of each corpus in the positive and negative samples and improves the accuracy of the subsequent vector feature conversion performed on the first and second word segmentation results.
In one embodiment, the long short-term memory network comprises n+1 first long short-term memory network layers and 2 second long short-term memory network layers, and there are n positive vector features, where n is a positive integer greater than 1. As shown in FIG. 5, step S3, i.e., performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic features corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature, comprises the following steps:
S31: Import the n positive vector features and the negative vector feature respectively into the n+1 first long short-term memory network layers for semantic recognition to obtain n first recognition results corresponding to the n positive vector features and a second recognition result corresponding to the negative vector feature.
In the embodiment of the present application, a first LSTM network layer is a network structure dedicated to semantic recognition of positive and negative vector features. The n positive vector features and the negative vector feature are respectively imported into the n+1 first long short-term memory network layers for semantic recognition, i.e., each positive vector feature and the negative vector feature is imported into its own first long short-term memory network layer, and the n+1 layers output the n first recognition results and the second recognition result. Here, the first recognition results correspond to the positive vector features, and the second recognition result corresponds to the negative vector feature.
It should be noted that importing each positive vector feature and the negative vector feature into its own first long short-term memory network layer allows the n positive vector features and the negative vector feature to be semantically recognized at the same time, improving the efficiency of semantic recognition.
S32: Import the n first recognition results and the second recognition result respectively into the 2 second long short-term memory network layers for semantic feature extraction to obtain the first semantic features and the second semantic feature.
In the embodiment of the present application, a second LSTM network layer is a network structure dedicated to extracting semantic features from the first and second recognition results, and each second LSTM network layer is a bidirectional LSTM. A bidirectional LSTM consists of two LSTMs running in opposite directions: one reads the data from front to back in the order of the words in the sentence, and the other reads from back to front in the reverse order. The first LSTM thus captures the preceding context and the second captures the following context, and their joint output is the context information of the entire sentence.
After bidirectional LSTM encoding, the hidden layer of the bidirectional LSTM neurons outputs only the vectors at the positions of the marked entities rather than all the encoding vectors of the entire sentence. The advantage of this is that the interference of redundant information with relation classification is removed and only the most critical information is retained; after bidirectional LSTM extraction, the semantic features corresponding to the sentence are output.
Specifically, the n first recognition results are all input into one second long short-term memory network layer for semantic feature extraction, which outputs the first semantic features obtained from the n first recognition results; the second recognition result is input into the other second long short-term memory network layer for semantic feature extraction, which outputs the second semantic feature obtained from the second recognition result.
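A minimal sketch of this two-stage arrangement in Python with PyTorch, with all dimensions assumed for illustration, might be:

    import torch
    import torch.nn as nn

    # First LSTM layer: semantic recognition on one 128-dim vector feature sequence.
    first_lstm = nn.LSTM(input_size=128, hidden_size=64, batch_first=True)

    # Second LSTM layer: bidirectional, reading the sequence forward and backward.
    second_bilstm = nn.LSTM(input_size=64, hidden_size=64, batch_first=True,
                            bidirectional=True)

    positive_vector_feature = torch.randn(1, 10, 128)      # (batch, seq_len, dim)
    first_result, _ = first_lstm(positive_vector_feature)  # first recognition result
    semantic_feature, _ = second_bilstm(first_result)      # (1, 10, 128): forward and
                                                           # backward context concatenated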
It should be noted that, since the first recognition results are obtained from the historical questions, historical answers, and current question, combining the historical questions, historical answers, and current question during semantic feature extraction makes the extraction more precise.
In this embodiment, the first long short-term memory network layers perform semantic recognition on the positive and negative vector features to obtain the first and second recognition results, and the second long short-term memory network layers perform semantic feature extraction on the first and second recognition results to obtain the first and second semantic features. This achieves accurate extraction of the first and second semantic features, improves the accuracy of the calculations that use them, and further improves the accuracy of model training.
In one embodiment, as shown in FIG. 6, step S6, i.e., performing loss calculation according to the standard answer vector and the target vector to obtain the loss value, comprises the following steps:
S61: Obtain a cosine calculation result by performing cosine similarity calculation on the standard answer vector and the target vector.
Specifically, based on the standard answer vector and the target vector, the cosine calculation result is computed according to formula (1):
X = \frac{A \cdot B}{\|A\| \, \|B\|}    (1)
where X is the cosine calculation result, A is the standard answer vector, and B is the target vector.
S62: Perform loss calculation based on the cosine calculation result and a cross-entropy loss function to obtain the loss value.
In the embodiment of the present application, the cosine calculation result represents the probability, predicted by the initial multi-round question-answering model, that the current question and the current answer match. When the predicted probability reaches a preset target value, the current question and current answer match; when it does not, they do not match. The preset target value may specifically be 0.8, and may also be set according to the user's actual needs, which is not limited here.
Specifically, based on the cosine calculation result, the loss value is computed with the cross-entropy loss function of formula (2):
H(p, q) = -\sum_{x} p(x) \log q(x)    (2)
where H(p, q) is the loss value, x is 0 or 1, and p(x) is the actual state corresponding to x: if x is 0, the current question and the current answer do not match and p(x) is 0; if x is 1, the current question and the current answer match and p(x) is 1; q(x) is the cosine calculation result.
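A small numeric sketch of formulas (1) and (2) in Python with numpy, using made-up vectors purely for illustration:

    import numpy as np

    A = np.array([0.2, 0.8, 0.5])   # standard answer vector (illustrative values)
    B = np.array([0.1, 0.9, 0.4])   # target vector (illustrative values)

    # Formula (1): cosine similarity between the standard answer vector and the target vector.
    X = np.dot(A, B) / (np.linalg.norm(A) * np.linalg.norm(B))

    # Formula (2): cross-entropy over x in {0, 1}, with p(1) = 1 when the current
    # question and current answer actually match, and q(1) = X, q(0) = 1 - X.
    p = 1.0
    loss = -(p * np.log(X) + (1 - p) * np.log(1 - X))
    print(X, loss)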
In this embodiment, formula (1) quickly and accurately computes the cosine calculation result between the standard answer vector and the target vector, and formula (2) quickly and accurately computes the corresponding loss value from the cosine calculation result, further ensuring the accuracy of the subsequent use of the loss value to determine the target multi-round question-answering model.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
In one embodiment, a multi-round question-and-answer recognition device is provided, and the device corresponds one-to-one to the multi-round question-and-answer recognition method in the above embodiments. As shown in FIG. 7, the multi-round question-and-answer recognition device comprises
a first acquisition module 71, an import module 72, a conversion module 73, an extraction module 74, and an output module 75. The functional modules are described in detail as follows:
the first acquisition module 71 is configured to obtain the user's historical questions, historical answers, and current question from a user database;
the import module 72 is configured to import the user's historical questions, historical answers, and current question into a pre-trained target multi-round question-answering model, where the target multi-round question-answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit;
the conversion module 73 is configured to perform vector feature conversion on the user's historical questions, historical answers, and current question through the coding unit to obtain the first vector feature corresponding to the historical questions, the second vector feature corresponding to the historical answers, and the third vector feature corresponding to the current question;
the extraction module 74 is configured to import the first, second, and third vector features into the long short-term memory unit for semantic feature extraction to obtain the target semantic feature;
the output module 75 is configured to import the target semantic feature into the fully connected unit for similarity calculation and output the recognition result with the highest similarity.
Further, the multi-round question-and-answer recognition device further comprises:
a second acquisition module, configured to obtain historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtain a current answer as a negative sample;
a vector feature conversion module, configured to import the positive samples and the negative sample respectively into the coding layer of the initial multi-round question-answering model for vector feature conversion to obtain the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample, where the initial multi-round question-answering model comprises a coding layer, a long short-term memory network, and a convolutional network;
a semantic feature extraction module, configured to perform semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic features corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature;
a query module, configured to query the preset standard library for the standard question matching the first semantic feature and obtain the standard answer vector corresponding to the standard question;
a convolution module, configured to import the second semantic feature into the convolutional network for convolution processing to obtain the target vector;
a loss calculation module, configured to perform loss calculation according to the standard answer vector and the target vector to obtain the loss value;
an iterative update module, configured to compare the loss value with the preset threshold and, if the loss value is greater than the preset threshold, iteratively update the initial multi-round question-answering model until the loss value is less than or equal to the preset threshold, taking the updated initial multi-round question-answering model as the target multi-round question-answering model.
Further, the vector feature conversion module comprises:
a word segmentation sub-module, configured to perform word segmentation on the positive samples and the negative sample to obtain the first word segmentation result corresponding to the positive samples and the second word segmentation result corresponding to the negative sample;
an initial conversion sub-module, configured to use the coding layer to perform vector feature conversion on the first and second word segmentation results to obtain the positive vector features and the negative vector feature.
Further, the word segmentation sub-module comprises:
a setting unit, configured to set the string index value and the maximum word length value according to preset requirements;
a character extraction unit, configured to extract, for each corpus in the positive samples and the negative sample, target characters from the corpus according to the string index value and the maximum length value;
a matching unit, configured to match the target characters against the legal characters in the preset dictionary library;
a match success unit, configured to, if the match succeeds, determine the target characters as a target word, update the string index value to the current string index value plus the current maximum length value, and, based on the updated string index value and the maximum length value, extract target characters from the corpus for matching until the word segmentation of the corpus is complete;
a match failure unit, configured to, if the match fails, decrement the maximum length value and, based on the updated maximum length value and the string index value, extract target characters from the corpus for matching until the word segmentation of the corpus is complete;
a word segmentation completion unit, configured to obtain the first word segmentation result corresponding to the positive samples when each corpus in the positive samples has completed the word segmentation operation, and obtain the second word segmentation result corresponding to the negative sample when the corpus in the negative sample has completed the word segmentation operation.
Further, the semantic feature extraction module comprises:
a semantic recognition sub-module, configured to import the n positive vector features and the negative vector feature respectively into the n+1 first long short-term memory network layers for semantic recognition to obtain the n first recognition results corresponding to the n positive vector features and the second recognition result corresponding to the negative vector feature;
a feature extraction sub-module, configured to import the n first recognition results and the second recognition result respectively into the 2 second long short-term memory network layers for semantic feature extraction to obtain the first semantic features and the second semantic feature.
Further, the loss calculation module comprises:
a cosine calculation sub-module, configured to obtain the cosine calculation result by performing cosine similarity calculation on the standard answer vector and the target vector;
a loss value acquisition sub-module, configured to perform loss calculation according to the cosine calculation result and the cross-entropy loss function to obtain the loss value.
Some embodiments of the present application disclose a computer device. For details, refer to FIG. 8, a basic structural block diagram of the computer device 90 in one embodiment of the present application.
As shown in FIG. 8, the computer device 90 comprises a memory 91, a processor 92, and a network interface 93 that are communicatively connected to one another through a system bus. It should be pointed out that FIG. 8 shows only a computer device 90 with components 91-93, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), and embedded devices.
The computer device may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. The computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice-controlled device.
The memory 91 includes at least one type of readable storage medium, which includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical discs, and the like. In some embodiments, the memory 91 may be an internal storage unit of the computer device 90, such as the hard disk or internal memory of the computer device 90. In other embodiments, the memory 91 may also be an external storage device of the computer device 90, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card provided on the computer device 90. Of course, the memory 91 may also include both an internal storage unit of the computer device 90 and an external storage device thereof. In this embodiment, the memory 91 is generally used to store the operating system and various application software installed on the computer device 90, such as the computer-readable instructions of the multi-round question-and-answer recognition method. In addition, the memory 91 may also be used to temporarily store various data that have been output or are to be output.
In some embodiments, the processor 92 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 92 is generally used to control the overall operation of the computer device 90. In this embodiment, the processor 92 is configured to run the computer-readable instructions stored in the memory 91 or to process data, for example, to run the computer-readable instructions of the multi-round question-and-answer recognition method.
The network interface 93 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 90 and other electronic devices.
This application further provides another implementation, namely a non-volatile computer-readable storage medium storing a user current question information entry procedure, where the procedure is executable by at least one processor to cause the at least one processor to perform the steps of any one of the multi-round question-and-answer recognition methods described above.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for enabling a computer device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of this application.
Finally, it should be noted that the embodiments described above are only some of the embodiments of this application, rather than all of them; the drawings show preferred embodiments of this application but do not limit its patent scope. This application may be implemented in many different forms; rather, these embodiments are provided so that the disclosure of this application will be thorough and complete. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing specific embodiments, or make equivalent replacements of some of the technical features therein. Any equivalent structure made using the contents of the specification and drawings of this application, applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of this application.

Claims (20)

  1. A multi-round question-and-answer recognition method, wherein the multi-round question-and-answer recognition method comprises:
    obtaining a user history question, a user history answer, and a user current question from a user database;
    importing the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit;
    performing, by the coding unit, vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain a first vector feature corresponding to the user history question, a second vector feature corresponding to the user history answer, and a third vector feature corresponding to the user current question;
    importing the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature; and
    importing the target semantic feature into the fully connected unit for similarity calculation, and outputting the recognition result with the greatest similarity.
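
For illustration, the inference flow of claim 1 can be sketched in Python with PyTorch. This is a minimal sketch, not the claimed implementation: the class name MultiRoundQAModel, the embedding-based coding unit, and all dimensions (embed_dim, hidden_dim, num_answers) are assumptions, since the claim does not fix the encoder type, the number of LSTM layers, or how candidate answers are scored.

```python
import torch
import torch.nn as nn

class MultiRoundQAModel(nn.Module):
    """Hypothetical model: coding unit -> long short-term memory unit -> fully connected unit."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_answers=1000):
        super().__init__()
        self.encoder = nn.Embedding(vocab_size, embed_dim)             # coding unit
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)   # long short-term memory unit
        self.fc = nn.Linear(hidden_dim, num_answers)                   # fully connected unit

    def forward(self, history_q, history_a, current_q):
        # Vector feature conversion: first, second, and third vector features.
        feats = [self.encoder(x) for x in (history_q, history_a, current_q)]
        # Semantic feature extraction: final hidden state over the concatenated rounds.
        _, (h_n, _) = self.lstm(torch.cat(feats, dim=1))
        target_semantic = h_n[-1]
        # Similarity calculation over candidate answers; the arg-max is the
        # "recognition result with the greatest similarity".
        scores = self.fc(target_semantic)
        return scores.argmax(dim=-1)
```
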
  2. The multi-round question-and-answer recognition method according to claim 1, wherein before the step of obtaining a user history question, a user history answer, and a user current question from a user database, the multi-round question-and-answer recognition method further comprises:
    obtaining historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtaining a current answer as a negative sample;
    importing the positive samples and the negative sample respectively into a coding layer of an initial multi-round question answering model for vector feature conversion processing to obtain positive vector features corresponding to the positive samples and a negative vector feature corresponding to the negative sample, wherein the initial multi-round question answering model comprises the coding layer, a long short-term memory network, and a convolutional network;
    performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain a first semantic feature corresponding to the positive vector features and a second semantic feature corresponding to the negative vector feature;
    querying a preset standard library for a standard question matching the first semantic feature, and obtaining a standard answer vector corresponding to the standard question;
    importing the second semantic feature into the convolutional network for convolution processing to obtain a target vector;
    performing loss calculation according to the standard answer vector and the target vector to obtain a loss value; and
    comparing the loss value with a preset threshold; if the loss value is greater than the preset threshold, iteratively updating the initial multi-round question answering model until the loss value is less than or equal to the preset threshold, and taking the updated initial multi-round question answering model as the target multi-round question answering model.
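
The threshold-based stopping rule of claim 2 ("iteratively update until the loss value is less than or equal to the preset threshold") could look roughly like the loop below. This is a hypothetical sketch: get_batches and the threshold of 0.05 are placeholders, and qa_loss stands for a loss of the kind sketched after claim 6; the claim does not prescribe the optimizer or the batching scheme.

```python
import torch

def train_until_threshold(model, get_batches, qa_loss, threshold=0.05, lr=1e-3):
    """Hypothetical training loop implementing claim 2's threshold test."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss = torch.tensor(float("inf"))
    while loss.item() > threshold:                       # iterate while loss > preset threshold
        # get_batches is assumed to yield (standard answer vector, target vector) pairs,
        # i.e. the two quantities the claim compares in its loss calculation.
        for standard_answer_vec, target_vec in get_batches(model):
            loss = qa_loss(standard_answer_vec, target_vec)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                             # iterative update of the model
    return model                                         # updated model becomes the target model
```
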
  3. The multi-round question-and-answer recognition method according to claim 2, wherein the step of importing the positive samples and the negative sample respectively into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample comprises:
    performing word segmentation processing on the positive samples and the negative sample to obtain a first word segmentation result corresponding to the positive samples and a second word segmentation result corresponding to the negative sample; and
    performing vector feature conversion processing on the first word segmentation result and the second word segmentation result by using the coding layer to obtain the positive vector features and the negative vector feature.
  4. The multi-round question-and-answer recognition method according to claim 3, wherein each historical question, each historical answer, and the current question in the positive samples is taken as one corpus, the current answer in the negative sample is taken as one corpus, and the step of performing word segmentation processing on the positive samples and the negative sample to obtain the first word segmentation result corresponding to the positive samples and the second word segmentation result corresponding to the negative sample comprises:
    setting a character string index value and a maximum length value of word segmentation according to preset requirements;
    for each corpus in the positive samples and the negative sample, extracting target characters from the corpus according to the character string index value and the maximum length value;
    matching the target characters against legal characters in a preset dictionary library;
    if the matching succeeds, determining the target characters as a target word, updating the character string index value to the current character string index value plus the current maximum length value, and extracting target characters from the corpus for matching based on the updated character string index value and the maximum length value until the word segmentation operation on the corpus is completed;
    if the matching fails, decrementing the maximum length value, and extracting target characters from the corpus for matching based on the updated maximum length value and the character string index value until the word segmentation operation on the corpus is completed; and
    obtaining the first word segmentation result corresponding to the positive samples when every corpus in the positive samples has completed the word segmentation operation, and obtaining the second word segmentation result corresponding to the negative sample when the corpus in the negative sample has completed the word segmentation operation.
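
The procedure recited in claim 4 is, in effect, forward maximum matching. Below is a minimal sketch in plain Python, assuming the "preset dictionary library" is a Python set, that the maximum length value resets at each new position (the claim leaves this reset implicit), and that an unmatched single character is emitted as its own token (the claim does not say what happens when the length reaches one); these assumptions are illustrative, not part of the claim.

```python
def segment(corpus: str, dictionary: set, max_len: int = 4) -> list:
    """Hypothetical forward-maximum-matching segmentation per claim 4."""
    index = 0                       # character string index value
    tokens = []
    while index < len(corpus):
        length = max_len            # maximum length value, assumed to reset per position
        while length > 0:
            candidate = corpus[index:index + length]   # extract target characters
            if candidate in dictionary or length == 1:
                tokens.append(candidate)               # match succeeded (single-char fallback assumed)
                index += length                        # index += current maximum length value
                break
            length -= 1             # match failed: decrement the maximum length value
    return tokens

print(segment("长短期记忆网络", {"长短期", "记忆", "网络"}, max_len=3))
# -> ['长短期', '记忆', '网络']
```
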
  5. The multi-round question-and-answer recognition method according to claim 2, wherein the long short-term memory network comprises n+1 first long short-term memory network layers and 2 second long short-term memory network layers, there are n positive vector features, n being a positive integer greater than 1, and the step of performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic feature corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature comprises:
    importing the n positive vector features and the negative vector feature respectively into the n+1 first long short-term memory network layers for semantic recognition to obtain n first recognition results corresponding to the n positive vector features and a second recognition result corresponding to the negative vector feature; and
    importing the n first recognition results and the second recognition result respectively into the 2 second long short-term memory network layers for semantic feature extraction to obtain the first semantic feature and the second semantic feature.
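
One way to read claim 5's layer arrangement: each of the n positive vector features and the one negative vector feature feeds its own first LSTM layer, and the two second layers split along the positive and negative branches. The sketch below encodes that reading; the routing of recognition results to the two second layers is an interpretation of the claim, and all names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class TwoStageLSTM(nn.Module):
    """Hypothetical n+1 first layers and 2 second layers, per claim 5."""
    def __init__(self, n: int, dim: int):
        super().__init__()
        self.first = nn.ModuleList(nn.LSTM(dim, dim, batch_first=True) for _ in range(n + 1))
        self.second_pos = nn.LSTM(dim, dim, batch_first=True)   # pools the n first recognition results
        self.second_neg = nn.LSTM(dim, dim, batch_first=True)   # processes the second recognition result

    def forward(self, pos_feats, neg_feat):
        # pos_feats: list of n tensors (batch, seq, dim); neg_feat: (batch, seq, dim).
        pos_results = [lstm(x)[0][:, -1] for lstm, x in zip(self.first[:-1], pos_feats)]
        neg_result = self.first[-1](neg_feat)[0][:, -1]
        first_sem, _ = self.second_pos(torch.stack(pos_results, dim=1))  # first semantic feature
        second_sem, _ = self.second_neg(neg_result.unsqueeze(1))         # second semantic feature
        return first_sem[:, -1], second_sem[:, -1]
```
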
  6. The multi-round question-and-answer recognition method according to claim 2, wherein the step of performing loss calculation according to the standard answer vector and the target vector to obtain the loss value comprises:
    performing cosine similarity calculation on the standard answer vector and the target vector to obtain a cosine calculation result; and
    performing loss calculation according to the cosine calculation result and a cross-entropy loss function to obtain the loss value.
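
Claim 6 combines cosine similarity with a cross-entropy loss but does not specify how the similarity enters the loss. The sketch below rescales the cosine from [-1, 1] to [0, 1] and applies binary cross-entropy against a "match" label of 1; this is one common construction, assumed for illustration rather than taken from the claim.

```python
import torch
import torch.nn.functional as F

def qa_loss(standard_answer_vec: torch.Tensor, target_vec: torch.Tensor) -> torch.Tensor:
    """Hypothetical cosine + cross-entropy loss per claim 6."""
    cosine = F.cosine_similarity(standard_answer_vec, target_vec, dim=-1)  # cosine calculation result
    prob = (cosine + 1) / 2             # rescale to [0, 1] so cross-entropy is well-defined
    label = torch.ones_like(prob)       # the target vector should match the standard answer
    return F.binary_cross_entropy(prob, label)
```
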
  7. A multi-round question-and-answer recognition device, wherein the multi-round question-and-answer recognition device comprises:
    a first obtaining module, configured to obtain a user history question, a user history answer, and a user current question from a user database;
    an import module, configured to import the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit;
    a conversion module, configured to perform, by the coding unit, vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain a first vector feature corresponding to the user history question, a second vector feature corresponding to the user history answer, and a third vector feature corresponding to the user current question;
    an extraction module, configured to import the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature; and
    an output module, configured to import the target semantic feature into the fully connected unit for similarity calculation, and output the recognition result with the greatest similarity.
  8. The multi-round question-and-answer recognition device according to claim 7, wherein the multi-round question-and-answer recognition device further comprises:
    a second obtaining module, configured to obtain historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtain a current answer as a negative sample;
    a vector feature conversion module, configured to import the positive samples and the negative sample respectively into a coding layer of an initial multi-round question answering model for vector feature conversion processing to obtain positive vector features corresponding to the positive samples and a negative vector feature corresponding to the negative sample, wherein the initial multi-round question answering model comprises the coding layer, a long short-term memory network, and a convolutional network;
    a semantic feature extraction module, configured to perform semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain a first semantic feature corresponding to the positive vector features and a second semantic feature corresponding to the negative vector feature;
    a query module, configured to query a preset standard library for a standard question matching the first semantic feature, and obtain a standard answer vector corresponding to the standard question;
    a convolution module, configured to import the second semantic feature into the convolutional network for convolution processing to obtain a target vector;
    a loss calculation module, configured to perform loss calculation according to the standard answer vector and the target vector to obtain a loss value; and
    an iterative update module, configured to compare the loss value with a preset threshold, and if the loss value is greater than the preset threshold, iteratively update the initial multi-round question answering model until the loss value is less than or equal to the preset threshold, and take the updated initial multi-round question answering model as the target multi-round question answering model.
  9. The multi-round question-and-answer recognition device according to claim 8, wherein the vector feature conversion module comprises:
    a word segmentation sub-module, configured to perform word segmentation processing on the positive samples and the negative sample to obtain a first word segmentation result corresponding to the positive samples and a second word segmentation result corresponding to the negative sample; and
    an initial conversion sub-module, configured to perform vector feature conversion processing on the first word segmentation result and the second word segmentation result by using the coding layer to obtain the positive vector features and the negative vector feature.
  10. The multi-round question-and-answer recognition device according to claim 9, wherein the word segmentation sub-module comprises:
    a setting unit, configured to set a character string index value and a maximum length value of word segmentation according to preset requirements;
    a character extraction unit, configured to, for each corpus in the positive samples and the negative sample, extract target characters from the corpus according to the character string index value and the maximum length value;
    a matching unit, configured to match the target characters against legal characters in a preset dictionary library;
    a matching success unit, configured to, if the matching succeeds, determine the target characters as a target word, update the character string index value to the current character string index value plus the current maximum length value, and extract target characters from the corpus for matching based on the updated character string index value and the maximum length value until the word segmentation operation on the corpus is completed;
    a matching failure unit, configured to, if the matching fails, decrement the maximum length value, and extract target characters from the corpus for matching based on the updated maximum length value and the character string index value until the word segmentation operation on the corpus is completed; and
    a word segmentation completion unit, configured to obtain the first word segmentation result corresponding to the positive samples when every corpus in the positive samples has completed the word segmentation operation, and obtain the second word segmentation result corresponding to the negative sample when the corpus in the negative sample has completed the word segmentation operation.
  11. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    obtaining a user history question, a user history answer, and a user current question from a user database;
    importing the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit;
    performing, by the coding unit, vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain a first vector feature corresponding to the user history question, a second vector feature corresponding to the user history answer, and a third vector feature corresponding to the user current question;
    importing the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature; and
    importing the target semantic feature into the fully connected unit for similarity calculation, and outputting the recognition result with the greatest similarity.
  12. The computer device according to claim 11, wherein before the step of obtaining a user history question, a user history answer, and a user current question from a user database, the processor further implements the following steps when executing the computer-readable instructions:
    obtaining historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtaining a current answer as a negative sample;
    importing the positive samples and the negative sample respectively into a coding layer of an initial multi-round question answering model for vector feature conversion processing to obtain positive vector features corresponding to the positive samples and a negative vector feature corresponding to the negative sample, wherein the initial multi-round question answering model comprises the coding layer, a long short-term memory network, and a convolutional network;
    performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain a first semantic feature corresponding to the positive vector features and a second semantic feature corresponding to the negative vector feature;
    querying a preset standard library for a standard question matching the first semantic feature, and obtaining a standard answer vector corresponding to the standard question;
    importing the second semantic feature into the convolutional network for convolution processing to obtain a target vector;
    performing loss calculation according to the standard answer vector and the target vector to obtain a loss value; and
    comparing the loss value with a preset threshold; if the loss value is greater than the preset threshold, iteratively updating the initial multi-round question answering model until the loss value is less than or equal to the preset threshold, and taking the updated initial multi-round question answering model as the target multi-round question answering model.
  13. The computer device according to claim 12, wherein the step of importing the positive samples and the negative sample respectively into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample comprises:
    performing word segmentation processing on the positive samples and the negative sample to obtain a first word segmentation result corresponding to the positive samples and a second word segmentation result corresponding to the negative sample; and
    performing vector feature conversion processing on the first word segmentation result and the second word segmentation result by using the coding layer to obtain the positive vector features and the negative vector feature.
  14. The computer device according to claim 13, wherein each historical question, each historical answer, and the current question in the positive samples is taken as one corpus, the current answer in the negative sample is taken as one corpus, and the step of performing word segmentation processing on the positive samples and the negative sample to obtain the first word segmentation result corresponding to the positive samples and the second word segmentation result corresponding to the negative sample comprises:
    setting a character string index value and a maximum length value of word segmentation according to preset requirements;
    for each corpus in the positive samples and the negative sample, extracting target characters from the corpus according to the character string index value and the maximum length value;
    matching the target characters against legal characters in a preset dictionary library;
    if the matching succeeds, determining the target characters as a target word, updating the character string index value to the current character string index value plus the current maximum length value, and extracting target characters from the corpus for matching based on the updated character string index value and the maximum length value until the word segmentation operation on the corpus is completed;
    if the matching fails, decrementing the maximum length value, and extracting target characters from the corpus for matching based on the updated maximum length value and the character string index value until the word segmentation operation on the corpus is completed; and
    obtaining the first word segmentation result corresponding to the positive samples when every corpus in the positive samples has completed the word segmentation operation, and obtaining the second word segmentation result corresponding to the negative sample when the corpus in the negative sample has completed the word segmentation operation.
  15. The computer device according to claim 12, wherein the long short-term memory network comprises n+1 first long short-term memory network layers and 2 second long short-term memory network layers, there are n positive vector features, n being a positive integer greater than 1, and the step of performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic feature corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature comprises:
    importing the n positive vector features and the negative vector feature respectively into the n+1 first long short-term memory network layers for semantic recognition to obtain n first recognition results corresponding to the n positive vector features and a second recognition result corresponding to the negative vector feature; and
    importing the n first recognition results and the second recognition result respectively into the 2 second long short-term memory network layers for semantic feature extraction to obtain the first semantic feature and the second semantic feature.
  16. A non-volatile computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, cause the processor to perform the following steps:
    obtaining a user history question, a user history answer, and a user current question from a user database;
    importing the user history question, the user history answer, and the user current question into a pre-trained target multi-round question answering model, wherein the target multi-round question answering model comprises a coding unit, a long short-term memory unit, and a fully connected unit;
    performing, by the coding unit, vector feature conversion processing on the user history question, the user history answer, and the user current question to obtain a first vector feature corresponding to the user history question, a second vector feature corresponding to the user history answer, and a third vector feature corresponding to the user current question;
    importing the first vector feature, the second vector feature, and the third vector feature into the long short-term memory unit for semantic feature extraction to obtain a target semantic feature; and
    importing the target semantic feature into the fully connected unit for similarity calculation, and outputting the recognition result with the greatest similarity.
  17. The non-volatile computer-readable storage medium according to claim 16, wherein before the step of obtaining a user history question, a user history answer, and a user current question from a user database, the computer-readable instructions, when executed by the processor, further cause the processor to perform the following steps:
    obtaining historical questions, historical answers, and a current question from a preset sample library as positive samples, and obtaining a current answer as a negative sample;
    importing the positive samples and the negative sample respectively into a coding layer of an initial multi-round question answering model for vector feature conversion processing to obtain positive vector features corresponding to the positive samples and a negative vector feature corresponding to the negative sample, wherein the initial multi-round question answering model comprises the coding layer, a long short-term memory network, and a convolutional network;
    performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain a first semantic feature corresponding to the positive vector features and a second semantic feature corresponding to the negative vector feature;
    querying a preset standard library for a standard question matching the first semantic feature, and obtaining a standard answer vector corresponding to the standard question;
    importing the second semantic feature into the convolutional network for convolution processing to obtain a target vector;
    performing loss calculation according to the standard answer vector and the target vector to obtain a loss value; and
    comparing the loss value with a preset threshold; if the loss value is greater than the preset threshold, iteratively updating the initial multi-round question answering model until the loss value is less than or equal to the preset threshold, and taking the updated initial multi-round question answering model as the target multi-round question answering model.
  18. The non-volatile computer-readable storage medium according to claim 17, wherein the step of importing the positive samples and the negative sample respectively into the coding layer of the initial multi-round question answering model for vector feature conversion processing to obtain the positive vector features corresponding to the positive samples and the negative vector feature corresponding to the negative sample comprises:
    performing word segmentation processing on the positive samples and the negative sample to obtain a first word segmentation result corresponding to the positive samples and a second word segmentation result corresponding to the negative sample; and
    performing vector feature conversion processing on the first word segmentation result and the second word segmentation result by using the coding layer to obtain the positive vector features and the negative vector feature.
  19. The non-volatile computer-readable storage medium according to claim 18, wherein each historical question, each historical answer, and the current question in the positive samples is taken as one corpus, the current answer in the negative sample is taken as one corpus, and the step of performing word segmentation processing on the positive samples and the negative sample to obtain the first word segmentation result corresponding to the positive samples and the second word segmentation result corresponding to the negative sample comprises:
    setting a character string index value and a maximum length value of word segmentation according to preset requirements;
    for each corpus in the positive samples and the negative sample, extracting target characters from the corpus according to the character string index value and the maximum length value;
    matching the target characters against legal characters in a preset dictionary library;
    if the matching succeeds, determining the target characters as a target word, updating the character string index value to the current character string index value plus the current maximum length value, and extracting target characters from the corpus for matching based on the updated character string index value and the maximum length value until the word segmentation operation on the corpus is completed;
    if the matching fails, decrementing the maximum length value, and extracting target characters from the corpus for matching based on the updated maximum length value and the character string index value until the word segmentation operation on the corpus is completed; and
    obtaining the first word segmentation result corresponding to the positive samples when every corpus in the positive samples has completed the word segmentation operation, and obtaining the second word segmentation result corresponding to the negative sample when the corpus in the negative sample has completed the word segmentation operation.
  20. The non-volatile computer-readable storage medium according to claim 17, wherein the long short-term memory network comprises n+1 first long short-term memory network layers and 2 second long short-term memory network layers, there are n positive vector features, n being a positive integer greater than 1, and the step of performing semantic feature extraction on the positive vector features and the negative vector feature through the long short-term memory network to obtain the first semantic feature corresponding to the positive vector features and the second semantic feature corresponding to the negative vector feature comprises:
    importing the n positive vector features and the negative vector feature respectively into the n+1 first long short-term memory network layers for semantic recognition to obtain n first recognition results corresponding to the n positive vector features and a second recognition result corresponding to the negative vector feature; and
    importing the n first recognition results and the second recognition result respectively into the 2 second long short-term memory network layers for semantic feature extraction to obtain the first semantic feature and the second semantic feature.
PCT/CN2019/116924 2019-09-24 2019-11-10 Multi-round question-and-answer identification method, device, computer apparatus, and storage medium WO2021056710A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910906819.7A CN110825857B (en) 2019-09-24 2019-09-24 Multi-round question and answer identification method and device, computer equipment and storage medium
CN201910906819.7 2019-09-24

Publications (1)

Publication Number Publication Date
WO2021056710A1 WO2021056710A1 (en)

Family

ID=69548255

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116924 WO2021056710A1 (en) 2019-09-24 2019-11-10 Multi-round question-and-answer identification method, device, computer apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN110825857B (en)
WO (1) WO2021056710A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259668B (en) * 2020-05-07 2020-08-18 腾讯科技(深圳)有限公司 Reading task processing method, model training device and computer equipment
CN112183105A (en) * 2020-08-28 2021-01-05 华为技术有限公司 Man-machine interaction method and device
CN113204633B (en) * 2021-06-01 2022-12-30 吉林大学 Semantic matching distillation method and device
CN113934824B (en) * 2021-12-15 2022-05-06 之江实验室 Similar medical record matching system and method based on multi-round intelligent question answering
CN114757208B (en) * 2022-06-10 2022-10-21 荣耀终端有限公司 Question and answer matching method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572734B (en) * 2013-10-23 2019-04-30 腾讯科技(深圳)有限公司 Method for recommending problem, apparatus and system
CN108345585A (en) * 2018-01-11 2018-07-31 浙江大学 A kind of automatic question-answering method based on deep learning
US11250038B2 (en) * 2018-01-21 2022-02-15 Microsoft Technology Licensing, Llc. Question and answer pair generation using machine learning
CN109344236B (en) * 2018-09-07 2020-09-04 暨南大学 Problem similarity calculation method based on multiple characteristics
CN109783617B (en) * 2018-12-11 2024-01-26 平安科技(深圳)有限公司 Model training method, device, equipment and storage medium for replying to questions
CN110222163B (en) * 2019-06-10 2022-10-04 福州大学 Intelligent question-answering method and system integrating CNN and bidirectional LSTM

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107766320A (en) * 2016-08-23 2018-03-06 中兴通讯股份有限公司 A kind of Chinese pronoun resolution method for establishing model and device
US20180300312A1 (en) * 2017-04-13 2018-10-18 Baidu Usa Llc Global normalized reader systems and methods
CN108733703A (en) * 2017-04-20 2018-11-02 北京京东尚科信息技术有限公司 The answer prediction technique and device of question answering system, electronic equipment, storage medium
CN108595629A (en) * 2018-04-24 2018-09-28 北京慧闻科技发展有限公司 Data processing method and the application of system are selected for answer
CN109376222A (en) * 2018-09-27 2019-02-22 国信优易数据有限公司 Question and answer matching degree calculation method, question and answer automatic matching method and device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113590783A (en) * 2021-07-28 2021-11-02 复旦大学 Traditional Chinese medicine health-preserving intelligent question-answering system based on NLP natural language processing
CN113590783B (en) * 2021-07-28 2023-10-03 复旦大学 NLP natural language processing-based traditional Chinese medicine health preserving intelligent question-answering system
CN117332823A (en) * 2023-11-28 2024-01-02 浪潮电子信息产业股份有限公司 Automatic target content generation method and device, electronic equipment and readable storage medium
CN117332823B (en) * 2023-11-28 2024-03-05 浪潮电子信息产业股份有限公司 Automatic target content generation method and device, electronic equipment and readable storage medium
CN117688163A (en) * 2024-01-29 2024-03-12 杭州有赞科技有限公司 Online intelligent question-answering method and device based on instruction fine tuning and retrieval enhancement generation
CN117688163B (en) * 2024-01-29 2024-04-23 杭州有赞科技有限公司 Online intelligent question-answering method and device based on instruction fine tuning and retrieval enhancement generation

Also Published As

Publication number Publication date
CN110825857A (en) 2020-02-21
CN110825857B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
WO2021056710A1 (en) Multi-round question-and-answer identification method, device, computer apparatus, and storage medium
WO2021135910A1 (en) Machine reading comprehension-based information extraction method and related device
WO2018153265A1 (en) Keyword extraction method, computer device, and storage medium
WO2017092380A1 (en) Method for human-computer dialogue, neural network system and user equipment
WO2021051517A1 (en) Information retrieval method based on convolutional neural network, and device related thereto
WO2021051513A1 (en) Chinese-english translation method based on neural network, and related devices thereof
CN113204611A (en) Method for establishing reading understanding model, reading understanding method and corresponding device
US20230297617A1 (en) Video retrieval method and apparatus, device, and storage medium
CN113177412A (en) Named entity identification method and system based on bert, electronic equipment and storage medium
CN116127953B (en) Chinese spelling error correction method, device and medium based on contrast learning
US20230114673A1 (en) Method for recognizing token, electronic device and storage medium
CN113593661A (en) Clinical term standardization method, device, electronic equipment and storage medium
US11615247B1 (en) Labeling method and apparatus for named entity recognition of legal instrument
CN111506726A (en) Short text clustering method and device based on part-of-speech coding and computer equipment
US11281714B2 (en) Image retrieval
CN116775918B (en) Cross-modal retrieval method, system, equipment and medium based on complementary entropy contrast learning
CN112613293A (en) Abstract generation method and device, electronic equipment and storage medium
US20230215203A1 (en) Character recognition model training method and apparatus, character recognition method and apparatus, device and storage medium
CN115858776A (en) Variant text classification recognition method, system, storage medium and electronic equipment
CN115169342A (en) Text similarity calculation method and device, electronic equipment and storage medium
WO2021082570A1 (en) Artificial intelligence-based semantic identification method, device, and semantic identification apparatus
CN115204142A (en) Open relationship extraction method, device and storage medium
CN115358227A (en) Open domain relation joint extraction method and system based on phrase enhancement
CN114220505A (en) Information extraction method of medical record data, terminal equipment and readable storage medium
WO2021042517A1 (en) Artificial intelligence-based article gist extraction method and device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19947088

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19947088

Country of ref document: EP

Kind code of ref document: A1