CN113127624B - Question-answer model training method and device - Google Patents

Question-answer model training method and device

Info

Publication number
CN113127624B
CN113127624B
Authority
CN
China
Prior art keywords
sample
question
text
training
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110665052.0A
Other languages
Chinese (zh)
Other versions
CN113127624A (en)
Inventor
冯晓阳 (Feng Xiaoyang)
李长亮 (Li Changliang)
姬子明 (Ji Ziming)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Digital Entertainment Co Ltd
Original Assignee
Beijing Kingsoft Digital Entertainment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Digital Entertainment Co Ltd filed Critical Beijing Kingsoft Digital Entertainment Co Ltd
Priority to CN202110665052.0A (this application, granted as CN113127624B)
Priority to CN202111256825.6A (published as CN113987147A)
Priority to CN202111258418.9A (published as CN113901191A)
Publication of CN113127624A
Application granted
Publication of CN113127624B
Legal status: Active

Classifications

    • G06F 16/3329: Natural language query formulation or dialogue systems
    • G06F 16/3344: Query execution using natural language analysis
    • G06F 16/35: Clustering; classification of unstructured textual data
    • G06F 18/22: Pattern recognition matching criteria, e.g. proximity measures
    • G06F 40/279: Recognition of textual entities
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/30: Semantic analysis
    • G06N 3/044: Recurrent networks, e.g. Hopfield networks
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Neural network learning methods

Abstract

The application provides a question-answering model training method and device. The training method comprises the following steps: constructing an initial text sense group corresponding to a sample corpus, and generating a scene-oriented vocabulary space corresponding to the sample corpus based on the initial text sense group; acquiring a training sample, and determining a sample phrase corresponding to the training sample; querying the scene-oriented vocabulary space based on the sample phrase, and determining a target text sense group corresponding to the training sample according to the query result; and training an initial question-answering model with the target text sense group and the training sample until a target question-answering model satisfying the training stop condition is obtained.

Description

Question-answer model training method and device
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a question-answering model training method and device.
Background
With the development of the artificial intelligence industry, question-answering models account for a steadily growing share of practical applications, and users place ever higher demands on their reply accuracy and reply efficiency. In the prior art, a question-answering model generally interprets the question posed by the user and generates a targeted answer to it. However, the accuracy of current question-answering models' replies is not high, and their reply speed still needs improvement. How to improve the reply accuracy and reply speed of question-answering models is therefore a problem that urgently needs to be solved by those skilled in the art.
Disclosure of Invention
In view of this, the embodiments of the present application provide a question-answering model training method to overcome the technical defects in the prior art. The embodiments of the present application also provide a question-answering model training device, a text processing method, a text processing apparatus, a computing device, and a computer-readable storage medium.
According to a first aspect of the embodiments of the present application, there is provided a question-answering model training method, comprising:
constructing an initial text sense group corresponding to a sample corpus, and generating a scene-oriented vocabulary space corresponding to the sample corpus based on the initial text sense group;
acquiring a training sample, and determining a sample phrase corresponding to the training sample;
querying the scene-oriented vocabulary space based on the sample phrase, and determining a target text sense group corresponding to the training sample according to the query result;
and training an initial question-answering model with the target text sense group and the training sample until a target question-answering model satisfying a training stop condition is obtained.
Optionally, generating the scene-oriented vocabulary space corresponding to the sample corpus based on the initial text sense group comprises:
adding a context label to the sample corpus, and extracting the initial phrase in the initial text sense group;
and establishing a correspondence between the context label and the initial text sense group, and constructing the scene-oriented vocabulary space corresponding to the sample corpus from the correspondence and the initial phrase.
Optionally, adding a context label to the sample corpus comprises:
extracting a plurality of initial features of the sample corpus, and preprocessing the plurality of initial features to obtain a plurality of target features;
and calculating the context similarity between each target feature and the sample corpus, selecting at least one target feature as the context label according to the context similarity calculation results, and adding it to the sample corpus.
Optionally, querying the scene-oriented vocabulary space based on the sample phrase and determining the target text sense group corresponding to the training sample according to the query result comprises:
mapping the sample phrase into the scene-oriented vocabulary space, and calculating the phrase similarity between the sample phrase and each context label;
and determining the target context label according to the phrase similarity calculation results, and taking the initial text sense group corresponding to the target context label as the target text sense group.
Optionally, determining the sample phrase corresponding to the training sample comprises:
parsing the training sample to obtain the sample question text in the training sample;
and extracting first word units and second word units from the sample question text, and constructing the sample phrase based on the first word units and the second word units.
Optionally, training the initial question-answering model with the target text sense group and the training sample until a target question-answering model satisfying the training stop condition is obtained comprises:
inputting the target text sense group and the sample question text in the training sample into the initial question-answering model for processing to obtain a predicted answer text;
and optimizing the initial question-answering model based on the predicted answer text and the sample answer text in the training sample until the target question-answering model satisfying the training stop condition is obtained.
Optionally, inputting the target text sense group and the sample question text in the training sample into the initial question-answering model for processing to obtain the predicted answer text comprises:
generating word unit vectors and a scene label vector based on the sample question text, and generating a sense group vector based on the target text sense group;
integrating the word unit vectors and the scene label vector to obtain a sample question vector corresponding to the sample question text;
and inputting the sample question vector and the sense group vector into the initial question-answering model for processing to obtain the predicted answer text.
Optionally, inputting the sample question vector and the sense group vector into the initial question-answering model for processing to obtain the predicted answer text comprises:
inputting the sample question vector and the sense group vector into the initial question-answering model, and processing them through a fusion module in the initial question-answering model to obtain a fusion vector;
inputting the fusion vector into a recognition module in the initial question-answering model for processing to obtain the associated entity core words and the context scene distribution;
and processing the associated entity core words and the context scene distribution through an output layer in the initial question-answering model to obtain the predicted answer text.
According to a second aspect of the embodiments of the present application, there is provided a question-answering model training device, comprising:
a construction module configured to construct an initial text sense group corresponding to a sample corpus, and generate a scene-oriented vocabulary space corresponding to the sample corpus based on the initial text sense group;
an acquisition module configured to acquire a training sample and determine a sample phrase corresponding to the training sample;
a determination module configured to query the scene-oriented vocabulary space based on the sample phrase and determine a target text sense group corresponding to the training sample according to the query result;
and a training module configured to train an initial question-answering model with the target text sense group and the training sample until a target question-answering model satisfying a training stop condition is obtained.
According to a third aspect of the embodiments of the present application, there is provided a text processing method, comprising:
acquiring a question text uploaded by a user;
inputting the question text into the target question-answering model obtained by the above question-answering model training method for processing, to obtain an answer text;
and updating a reply interface based on the answer text, and displaying the updated reply interface to the user.
According to a fourth aspect of the embodiments of the present application, there is provided a text processing apparatus, comprising:
a text acquisition module configured to acquire a question text uploaded by a user;
a text processing module configured to input the question text into the target question-answering model obtained by the above question-answering model training method for processing, to obtain an answer text;
and an interface display module configured to update a reply interface based on the answer text and display the updated reply interface to the user.
According to a fifth aspect of the embodiments of the present application, there is provided a computing device, comprising:
a memory and a processor;
wherein the memory is used to store computer-executable instructions, and the processor, when executing the computer-executable instructions, implements the steps of the question-answering model training method or of the text processing method.
According to a sixth aspect of the embodiments of the present application, there is provided a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the question-answering model training method or of the text processing method.
In the question-answering model training method provided by the application, after the initial text sense group corresponding to the sample corpus is constructed, a scene-oriented vocabulary space corresponding to the sample corpus is generated based on the initial text sense group, thereby preparing sufficient corpus material for model training. A training sample is then acquired and the sample phrase corresponding to it is determined; the scene-oriented vocabulary space is queried with the sample phrase, and the target text sense group corresponding to the training sample is determined from the query result; finally, the initial question-answering model is trained on the target text sense group and the training sample until the training stop condition is satisfied and the target question-answering model is obtained. Because both the question and the corpus are captured at the semantic level, the prediction accuracy of the trained question-answering model is effectively guaranteed; and because the scene-oriented vocabulary space is built from the rich sample corpora of the preparation stage, the processing capability of the question-answering model is effectively improved, so that question-answering tasks are completed accurately and efficiently.
Drawings
Fig. 1 is a flowchart of a question-answering model training method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of the question-answering model in a question-answering model training method according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a question-answering model training device according to an embodiment of the present application;
Fig. 4 is a flowchart of a text processing method according to an embodiment of the present application;
Fig. 5 is a flowchart of processing in a classical poetry question-answering scenario according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a text processing apparatus according to an embodiment of the present application;
Fig. 7 is a block diagram of a computing device according to an embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth to provide a thorough understanding of the present application. The application can, however, be implemented in many ways other than those described herein, and those skilled in the art can make similar extensions without departing from its spirit; the application is therefore not limited to the specific implementations disclosed below.
The terminology used in the one or more embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the present application. As used in one or more embodiments of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present application refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used herein in one or more embodiments of the present application to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first aspect may be termed a second aspect, and, similarly, a second aspect may be termed a first aspect, without departing from the scope of one or more embodiments of the present application.
First, the terms involved in one or more embodiments of the present application are explained.
ERNIE: learns real-world semantic knowledge by modeling words, entities and entity relationships in massive data. Whereas BERT learns semantic representations from local language co-occurrence, ERNIE models semantic knowledge directly, which enhances the model's semantic representation capability.
LSTM: the Long Short-Term Memory network, a type of recurrent neural network designed specifically to solve the long-term dependency problem of ordinary RNNs. All RNNs have the form of a chain of repeating neural network modules; in a standard RNN this repeating block has a very simple structure, e.g. a single tanh layer.
RNN: a Recurrent Neural Network takes sequence data as input, recurses along the direction in which the sequence evolves, and connects all of its nodes (recurrent units) in a chain.
BiLSTM: Bi-directional Long Short-Term Memory, composed of a forward LSTM and a backward LSTM; it is commonly applied to model context information in natural language processing tasks.
Phrase similarity: the similarity between two phrases, which can be calculated, for example, as the dot product of the two phrases' word vectors.
Corpus material: the basic elements used in translation or language research scenarios; the basic units that make up a corpus.
LDA: Latent Dirichlet Allocation, a document topic generation model, also called a three-layer Bayesian probability model, comprising a three-layer structure of words, topics and documents. As a generative model, it regards each word of a document as obtained by a process of 'choosing a topic with some probability, then choosing a word from that topic with some probability'; document-to-topic and topic-to-word both follow multinomial distributions.
Semantic dependency analysis (SDP): analyzes the semantic associations between the language units of a sentence and presents them as a dependency structure. When semantic dependencies are used to characterize sentence meaning, the vocabulary itself need not be abstracted; each word is instead described by the semantic frames it participates in, and the number of argument roles is always far smaller than the size of the vocabulary. Semantic dependency analysis aims to bypass the constraints of the surface syntactic structure of a sentence and directly obtain deep semantic information.
Sense group: each of the components into which a sentence is divided according to meaning and structure; the words within one sense group are closely related and cannot be split arbitrarily, otherwise misunderstanding may result.
Context: the environment in which language is used; the internal context is the relationship between a piece of discourse and its surrounding text, while the external context is the social environment existing outside the discourse.
The present application provides a question-answering model training method, and further relates to a question-answering model training device, a text processing method, a text processing apparatus, a computing device, and a computer-readable storage medium, each of which is described in detail in the following embodiments.
In practical applications, question-answering systems are deployed in many fields, and because each field has its own characteristics, the difficulty of building a question-answering system differs from field to field. In a classical poetry question-answering system, for example, classical poetry entities are usually extracted in the model preparation stage and used to train the question-answering model so that it can answer questions about classical poetry accurately. In the prior art, classical poetry entity extraction falls into two categories, rule-based extraction and model-based extraction. The rule-based method extracts entities by judging the important words and sentences of the original poem text according to grammar rules and a poetry knowledge base. The model-based method applies natural language processing algorithms and generates a more condensed, concise set of poetry entities through techniques such as self-attention, pre-training and sense group adaptation. Compared with rule-based extraction, the model-based method is closer to the way a user discovers and extracts entities, and with the rise of and research into deep neural networks, entity extraction algorithms based on the Transformer network have developed rapidly, achieved good results and exhibited strong generalization capability.
However, because classical poetry is complex, the processing granularity of existing extraction algorithms for poem text entities is relatively coarse: the output poetry corpus entity words lack context identification, many of the output entity words are duplicates, data redundancy is severe, and the quality of entity extraction is degraded to a large extent. An effective scheme to solve these problems is therefore urgently needed.
In view of the above, the present application provides a question-answering model training method in which an initial text sense group corresponding to a sample corpus is constructed and a scene-oriented vocabulary space corresponding to the sample corpus is then generated from the initial text sense group, preparing sufficient corpus material for model training. A training sample is then acquired and its sample phrase determined; the scene-oriented vocabulary space is queried with the sample phrase, and the target text sense group corresponding to the training sample is determined from the query result; finally, the initial question-answering model is trained on the target text sense group and the training sample until the training stop condition is satisfied and the target question-answering model is obtained. Because both the question and the corpus are captured at the semantic level, the prediction accuracy of the trained question-answering model is effectively guaranteed; and because the scene-oriented vocabulary space is built from the rich sample corpora of the preparation stage, the processing capability of the question-answering model is effectively improved, so that question-answering tasks are completed accurately and efficiently.
Fig. 1 shows a flowchart of a question-answering model training method according to an embodiment of the present application, which specifically includes the following steps:
Step S102, construct an initial text sense group corresponding to a sample corpus, and generate a scene-oriented vocabulary space corresponding to the sample corpus based on the initial text sense group.
Specifically, the sample corpus is the text corpus that provides the full amount of sample information when the question-answering model is trained, and different fields have different sample corpora. In the field of classical poetry question answering, for example, the sample corpus may be a text corpus consisting of poem texts, titles, authors, author information, poem explanations, poem appreciations and the like; in the field of sports knowledge question answering, it may consist of texts explaining sports events, sports stars, information about those stars, event venues and the like; in the field of question answering about personal relationships, it may consist of persons, personal information, family information, occupation information, biographies, life records and the like. In practical applications, the sample corpora used in different question-answering fields can be obtained and constructed according to actual requirements, and this embodiment imposes no limitation here.
Furthermore, to train a question-answering model that meets usage requirements, samples must be supplied continuously over multiple rounds of iterative training while the model is optimized against a loss function. A large amount of sample corpus material therefore needs to be prepared in the data preparation stage, so that the richness of the samples prevents the model from overfitting or being trained incompletely.
Furthermore, the initial text sense group is the text sense group constructed for each sample corpus entry: the sentences of each entry are divided into several components according to meaning and structure, and these components form the initial text sense group corresponding to that entry. It is used later to construct the scene-oriented vocabulary space and to help the question-answering model learn the fine-grained semantic associations between texts and sense groups, which guarantees the prediction capability of the model. Note that because the volume of sample corpus data is large, the sample corpora may be stored in a corpus database for the corresponding field, which makes them convenient to manage and use.
Correspondingly, the scene-oriented vocabulary space is an expression of the association relationships among entity words, constructed from the initial text sense groups corresponding to the individual sample corpus entries. Integrating the initial text sense groups of all entries yields a scene-oriented vocabulary space that contains the scene information and context information of every entry; it is used later to locate the target text sense group corresponding to a text, so that questions can be answered accurately.
Further, when creating the scene-oriented vocabulary space from the initial text sense groups, the components contained in the sense groups corresponding to the sample corpus are relatively complex: if the space were built directly from every initial text sense group in full, a large amount of redundant data would be generated and excessive storage resources occupied. To reduce the interference caused by redundant data, the initial phrases are therefore extracted first and used to complete the creation of the scene-oriented vocabulary space. In this embodiment, the specific implementation is as in steps S1022 to S1024:
step S1022, add a context label to the sample corpus, and extract an initial phrase in the initial text meaning group.
Specifically, the context label is a label added to the sample corpus according to the language environment of the sample corpus, and it needs to be explained that because the language environments corresponding to different sample corpora are relatively complex, a plurality of different context labels can be added to the sample corpus, such as ancient poetry 'quiet night thought', which expresses the thinking and country of an author and the meaning of beautiful scenery, so that when the context label is added to the ancient poetry 'quiet night thought', a country context label and a scenery context label can be added; or ancient poem benignal wanlun, which expresses the author's departure and friendship to wanlun, so when context labels are added for sample corpus ancient poem benignal wanlun, departure context labels and friendship context labels may be added.
Based on the above, the initial phrase specifically refers to a phrase formed by extracting a nominal primitive word and a verb phrase stem from the initial text meaning group, and is used for representing the core thought and the key content of a sample corpus corresponding to the initial text meaning group, and simultaneously, a foundation can be laid for subsequently generating a scene oriented word list space, so that an accurate positioning answer can be started from semantics in a question answering stage, and the answering correctness is ensured.
In practical application, in the process of adding the context label to the sample corpus, the LDA feature engineering model can be adopted to process each sample corpus in consideration of the fact that the context label corresponding to the sample corpus may not be unique, so that the context label can be accurately determined for each sample corpus, and the efficiency of adding the context label to the sample corpus is effectively improved.
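As an illustration of this labeling step, the following minimal sketch builds candidate context labels with an off-the-shelf LDA implementation; the toy corpus, the number of topics and the top-word count are illustrative assumptions, not details fixed by the application.

```python
# Minimal sketch of LDA-based context-label candidate extraction, assuming
# each corpus entry has already been segmented into space-separated words.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = [
    "bright moon frost homesickness gaze moon lower head think hometown",
    "peach blossom pool water thousand feet deep friendship farewell",
]

vectorizer = CountVectorizer()
bow = vectorizer.fit_transform(corpus)

lda = LatentDirichletAllocation(n_components=4, random_state=0)
doc_topic = lda.fit_transform(bow)   # per-document topic distribution

# Surface the top words of each document's dominant topic as candidate
# context labels, to be filtered later by context similarity.
terms = vectorizer.get_feature_names_out()
for i, dist in enumerate(doc_topic):
    topic = dist.argmax()
    top_words = [terms[j] for j in lda.components_[topic].argsort()[::-1][:3]]
    print(f"doc {i}: candidate labels {top_words}")
```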
Meanwhile, a semantic dependency analysis tool can be used to extract the initial phrases from the initial text sense groups: the semantic associations between the language units of the sentences in the sample corpus are analyzed and presented as a dependency structure, and the sentence semantics are described through these dependencies. Each word unit is thus described by the semantic frame it participates in, without having to abstract the word units themselves, which bypasses the constraints of the surface syntactic structure and directly obtains deep semantic information. For example, analyzing the sentence 'ate the apple first' with a semantic dependency analysis tool determines that the word unit 'first' has an mTime relation with the word unit 'ate' and that the word unit 'apple' has a Pat relation with the word unit 'ate', where the mTime relation denotes a temporal modifier and the Pat relation denotes the patient of the action. In the same way, after the initial phrases of each initial text sense group are extracted, the internal relations among the phrases can be stored for the subsequent creation of the scene-oriented vocabulary space.
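The extraction of headwords and stems can be pictured with the rough sketch below. The application uses a semantic dependency analysis tool for Chinese text; here spaCy's syntactic dependency parse merely stands in as an approximation, so the library, model name and filtering rule are all assumptions made for illustration.

```python
# Rough sketch of headword/stem extraction from a dependency parse.
# spaCy's *syntactic* dependencies stand in for the semantic dependency
# tool named in the application; the POS filter is an assumption.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this model is installed

def initial_phrase(sentence: str) -> list[tuple[str, str, str]]:
    """Return (head, relation, dependent) triples for noun/verb word units."""
    doc = nlp(sentence)
    return [(tok.head.text, tok.dep_, tok.text)
            for tok in doc if tok.pos_ in ("NOUN", "PROPN", "VERB")]

print(initial_phrase("ate the apple first"))
```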
Further, when adding a context label to the sample corpus, the label can be screened by calculating context similarity, to ensure that the added label fits the sample corpus closely. In this embodiment, the specific implementation is as follows:
extract a plurality of initial features of the sample corpus, and preprocess the plurality of initial features to obtain a plurality of target features;
calculate the context similarity between each target feature and the sample corpus, select at least one target feature as the context label according to the context similarity calculation results, and add it to the sample corpus.
Specifically, the initial features are the meanings that a sample corpus entry expresses along different dimensions, and the target features are the feature expressions obtained by preprocessing the initial features. Preprocessing here means cleaning the plurality of initial features, i.e. deleting repeated or redundant features, so that a number of target features are determined from the initial features. Correspondingly, the context similarity is the degree of similarity between each target feature and the corresponding sample corpus entry along the language-environment dimension.
On this basis, after the sample corpus is obtained, a plurality of initial features can be extracted from it in order to construct a richer scene-oriented vocabulary space. To avoid excessive computational pressure caused by data redundancy, these initial features are then preprocessed to obtain the target features, whose number is less than or equal to the number of initial features. Next, the context similarity between each target feature and the entry from which the initial features were extracted is calculated, at least one target feature is selected as a context label according to the calculation results, and the selected features are added to the entry.
In practice, when selecting at least one target feature as a context label from the context similarity results, two strategies are available. Because a sample corpus entry expresses different meanings along different dimensions, the context similarities can be compared against a preset context similarity threshold and every target feature at or above the threshold selected as a context label of the entry; alternatively, after the context similarities are calculated, the single target feature with the largest similarity can be selected as the context label, which guarantees that every entry has a unique label. In a specific implementation, the strategy for determining the context labels can be chosen according to the actual application scenario, and this embodiment imposes no limitation here.
In summary, screening the context labels from the features of the sample corpus both ensures that the labels fit the sample corpus closely and guarantees the richness of the scene-oriented vocabulary space built afterwards, thereby improving the prediction capability of the question-answering model.
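A compact sketch of the two selection strategies just described (threshold filtering versus picking the single most similar feature) might look as follows; the encoder that produces the vectors is left abstract, and the fallback behaviour when no feature passes the threshold is an assumption.

```python
# Sketch of context-label selection by context similarity. `corpus_vec` and
# `feature_vecs` come from some sentence/feature encoder (not specified here).
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_context_labels(corpus_vec, feature_vecs, features, threshold=None):
    sims = [cosine(corpus_vec, v) for v in feature_vecs]
    if threshold is not None:
        # strategy 1: keep every target feature at or above the threshold
        labels = [f for f, s in zip(features, sims) if s >= threshold]
        return labels or [features[int(np.argmax(sims))]]  # assumed fallback
    # strategy 2: unique label = the single most similar target feature
    return [features[int(np.argmax(sims))]]
```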
Step S1024, establish the correspondence between the context labels and the initial text sense groups, and construct the scene-oriented vocabulary space corresponding to the sample corpus from the correspondence and the initial phrases.
Specifically, once the context labels have been added to the sample corpus and the initial phrases extracted from the initial text sense groups, the correspondence between the context labels and the initial text sense groups can be established in order to ensure the richness of the space to be constructed. Since the labels were added to the corpus entries and the sense groups were built from the same entries, the correspondence can be determined entry by entry; the scene-oriented vocabulary space corresponding to the sample corpus is then constructed from this correspondence together with the initial phrases extracted from the sense groups, ready for the subsequent training of the question-answering model.
In this embodiment, the training of the question-answering model is described using the field of classical poetry as an example; the training of question-answering models in other fields can refer to the corresponding descriptions of this embodiment and is not repeated here.
For example, a classical poetry corpus database stores a large number of entries, each comprising the poem text, title, author, author information, poem explanation and poem appreciation. Suppose the database contains ten thousand classical poems together with their accompanying explanatory content. After these corpus entries are determined, a coarse-grained initial text sense group can be established for each poem through a task-heuristic topic classification algorithm, while context labels are added to each entry through an LDA feature engineering model.
During label addition, a plurality of initial features are first obtained for each entry, e.g. {homesickness; parting; frontier; friendship; scenery; sorrow; ...; heroic ambition}. The initial features of each entry are preprocessed to obtain its target features, the context similarity between each entry and its target features is calculated, and the target features whose similarity exceeds the preset threshold are selected as that entry's context labels and added to it. The context labels of a frontier poem may thus include {frontier; scenery; sorrow; ...}, while the context labels of 'Quiet Night Thoughts' may include {homesickness; longing; scenery}.
Furthermore, after the context labels of each classical poetry entry are obtained, the correspondence between each entry's initial text sense group and its context labels can be established. Meanwhile, the initial phrase of each sense group is formed by extracting its nominal headwords and verb-phrase stems with a semantic dependency analysis tool, and the scene-oriented vocabulary space for the classical poetry corpora is constructed by combining the correspondences with the initial phrases, to be used later in assisting the training of the classical poetry question-answering model and question-answering system.
In summary, by combining the context labels and initial phrases of the sample corpora when building the scene-oriented vocabulary space, the space is effectively guaranteed to contain the semantic information of every corpus entry, which facilitates the subsequent training of the question-answering model.
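One plausible in-memory layout for the scene-oriented vocabulary space is sketched below: each context label indexes the sense groups and initial phrases of every corpus entry that carries it. The field names and the example entry are illustrative assumptions rather than the application's concrete data structure.

```python
# Hypothetical layout of the scene-oriented vocabulary space: context label
# -> entries carrying that label, each with its sense groups and initial
# phrases (nominal headwords / verb-phrase stems).
from collections import defaultdict

scene_vocab_space: dict[str, list[dict]] = defaultdict(list)

def register(corpus_id: str, context_labels: list[str],
             sense_groups: list[str], initial_phrases: list[str]) -> None:
    entry = {"corpus_id": corpus_id,
             "sense_groups": sense_groups,
             "initial_phrases": initial_phrases}
    for label in context_labels:
        scene_vocab_space[label].append(entry)

register("quiet_night_thoughts", ["homesickness", "scenery"],
         ["moonlight before my bed", "lower my head, think of home"],
         ["moonlight", "frost", "bow head", "think of home"])
```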
Step S104, acquire a training sample, and determine the sample phrase corresponding to the training sample.
Specifically, once the scene-oriented vocabulary space has been built from the sample corpus, a question-answering model that meets the requirements of the corresponding field still has to be trained. After a training sample of that field is obtained, the sample phrase corresponding to it can be determined. This lays the foundation for determining the target text sense group later and establishes, for the model, the relations between sense groups and different types of questions, so that the model can start from the semantics when answering.
On this basis, the training sample is the sample used to train the question-answering model and contains a sample question and a sample answer. Correspondingly, the sample phrase is an entity phrase constructed from the training sample; it is mapped into the scene-oriented vocabulary space in order to find, within that space, the text sense group corresponding to the sample, which is then used to train the question-answering model.
Further, because the scene-oriented vocabulary space is constructed from the initial text sense groups, the sample phrase must be built with the same structure so that it can later be mapped into the space to determine the target text sense group. In this embodiment, the specific implementation is as follows:
parse the training sample to obtain the sample question text in the training sample;
extract the first word units and the second word units from the sample question text, and construct the sample phrase based on the first word units and the second word units.
Specifically, the sample question text is a question, prepared in advance, that relates to the field of the sample corpus. To ensure that a question-answering model can be trained successfully, the training sample must have a definite association with the sample corpus: the sample question text it contains is posed on the basis of the sample corpus, and the answer to that question can also be determined from the sample corpus. Correspondingly, the first word units are the nominal headwords extracted from the sample question text, and the second word units are the verb-phrase stems extracted from it.
On this basis, after the training sample is obtained it can be parsed to obtain its sample question text. Then, so that the target text sense group associated with the question can later be found in the scene-oriented vocabulary space and the training of the question-answering model completed successfully, the nominal headwords and verb-phrase stems are extracted from the question text and integrated into the sample phrase of the training sample. Because this phrase has the same structure as the entries of the scene-oriented vocabulary space, the target text sense group can be determined quickly, which speeds up the training of the question-answering model.
Following the example above, after the scene-oriented vocabulary space has been built from the ten thousand classical poems and their accompanying content, training samples containing sample question texts and sample answer texts can be obtained in order to train a classical poetry question-answering model. The sample question texts all concern classical poetry and may include {What is the central idea of 'Quiet Night Thoughts'?}, {Please provide an ancient poem that depicts frontier scenery} or {Who is the author of 'Yellow Crane Tower'?}. To determine the classical poetry corpus entries associated with each question, the nominal headwords and verb-phrase stems of each sample question text are extracted to generate its to-be-oriented associated entity phrase, which can conveniently be mapped into the scene-oriented vocabulary space later to determine the target text sense group of each sample question text.
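Because the sample phrase must mirror the structure of the vocabulary-space entries, its construction reduces to the same headword/stem extraction applied to the question text. A minimal sketch follows, with `parse` standing in for whichever dependency tool was used at corpus-construction time.

```python
# Sketch: build a sample phrase from a sample question text. `parse` is a
# placeholder returning (word, POS) pairs; the POS-prefix test is an
# illustrative assumption.
def build_sample_phrase(sample_question: str, parse) -> list[str]:
    tokens = parse(sample_question)                      # [(word, pos), ...]
    first_units = [w for w, pos in tokens if pos.startswith("N")]   # nominal
    second_units = [w for w, pos in tokens if pos.startswith("V")]  # verbal
    return first_units + second_units
```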
Step S106, query the scene-oriented vocabulary space based on the sample phrase, and determine the target text sense group corresponding to the training sample according to the query result.
Specifically, with the sample phrase of the training sample determined, the pre-built scene-oriented vocabulary space can be queried with it, so that the target text sense group of the training sample is determined accurately from the query result and used in the subsequent training. This ensures that the trained question-answering model can learn the semantic association between question texts and text sense groups.
On this basis, the target text sense group is the initial text sense group screened out of the scene-oriented vocabulary space as most strongly associated with the training sample; the answer to the sample question text in the training sample can be located in the sample corpus entry corresponding to this initial text sense group.
Further, because the initial text sense groups corresponding to the sample corpus that are contained in the scene-oriented vocabulary space are relatively complex, the target text sense group is determined via the context labels so that it can be identified accurately. In this embodiment, the specific implementation is as follows:
map the sample phrase into the scene-oriented vocabulary space, and calculate the phrase similarity between the sample phrase and each context label;
determine the target context label according to the phrase similarity calculation results, and take the initial text sense group corresponding to the target context label as the target text sense group.
Specifically, the phrase similarity is the similarity between the sample phrase and each context label contained in the scene-oriented vocabulary space, and the target context label is the context label with the highest phrase similarity to the sample phrase.
On this basis, after the sample phrase of the training sample is determined, it is mapped into the scene-oriented vocabulary space, the phrase similarity between it and each context label contained in the space is calculated, the label with the highest phrase similarity is selected as the target context label, and finally the initial text sense group corresponding to that label is determined from the space as the target text sense group for the subsequent training of the question-answering model. In a specific implementation, one or more target context labels may be determined from the phrase similarities, and correspondingly one or more target text sense groups may be determined from the scene-oriented vocabulary space.
Following the example above, after the to-be-oriented associated entity phrase of each sample question text is determined, each phrase is mapped into the scene-oriented vocabulary space, its phrase similarity to the context labels contained in the space is calculated, and the target context label associated with each sample question text is determined from the results: the label associated with {What is the central idea of 'Quiet Night Thoughts'?} is 'longing'; the label associated with {Please provide an ancient poem that depicts frontier scenery} is 'frontier'; ... the label associated with {Who is the author of 'Yellow Crane Tower'?} is 'scenery'. The target text sense group of each sample question text can then be determined by combining its sample phrase and context label: for {What is the central idea of 'Quiet Night Thoughts'?} it is the initial text sense group of the classical poem 'Quiet Night Thoughts'; for {Please provide an ancient poem that depicts frontier scenery} it is the initial text sense group of the corresponding frontier poem; ... for {Who is the author of 'Yellow Crane Tower'?} it is the initial text sense group of the classical poem 'Yellow Crane Tower'. These are used for the subsequent training of the classical poetry question-answering model.
In summary, determining the target text sense group of a sample question text from the scene-oriented vocabulary space by phrase similarity calculation effectively improves the accuracy of that determination, and at the same time lets the question-answering model learn the semantic associations accurately, so that a question-answering model meeting the requirements is trained.
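Put together with the vocabulary-space layout sketched earlier, the query step might read as below: score the sample phrase against every context label by dot-product phrase similarity (as defined in the terminology section) and return the sense groups filed under the winning label. The embedding inputs are assumed to exist.

```python
# Sketch of the query step: dot-product phrase similarity against each
# context label, then fetch the sense groups stored under the best label.
import numpy as np

def query_target_sense_groups(sample_phrase_vec: np.ndarray,
                              label_vecs: dict[str, np.ndarray],
                              scene_vocab_space: dict[str, list[dict]]):
    scores = {lbl: float(sample_phrase_vec @ vec)
              for lbl, vec in label_vecs.items()}
    target_label = max(scores, key=scores.get)           # highest similarity
    groups = [e["sense_groups"] for e in scene_vocab_space[target_label]]
    return target_label, groups
```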
Step S108, train an initial question-answering model with the target text sense group and the training sample until a target question-answering model satisfying the training stop condition is obtained.
Specifically, once the target text sense groups of the training samples are determined, the initial question-answering model can be trained on the sense groups and the samples together, so that it learns the fine-grained semantic associations between the sample question texts in the training samples and the target text sense groups; through continuous iteration and optimization, the target question-answering model satisfying the training stop condition is then obtained. The stop condition may be a number of training iterations or a comparison against a loss value; in practice it can be set as required, and this embodiment imposes no limitation here.
In practical applications, question-answering systems in different fields have different architectures, so modules with different functions can be combined in order to train a question-answering model with better prediction accuracy for the question-answering system of a given field. In the field of classical poetry, for example, the question-answering system can introduce a context-recognition attention module: context discrimination labels are set, a Chinese word segmentation module based on deep semantic units is adopted, BiLSTM is used to build word-level hidden-state distributed representations of both the label sentences and the poem text sentences, and an attention vector matrix fusing the label and text semantic information is calculated. This enables fast entity discovery and classified extraction; a memory unit is configured at the same time, which improves the accuracy of entity extraction, avoids repeatedly extracting entity words related to the poetry question-answering context, and effectively ensures that questions about classical poetry are answered accurately.
In a specific implementation, BiLSTM is used to analyze the task along three dimensions, namely question types, high-frequency entities and associated entities, and three corresponding learnable weight matrices are established. Through multiple rounds of iterative learning, a semantic mapping matrix between the poetry corpora and the different types of question sentences is built; loss functions are designed according to the learning effect, different weights are assigned to the three matrices, and finally the three matrices are spliced into one weight matrix for the real-time question-answering system.
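One way to read this weight-matrix construction is sketched below: one learnable matrix per task dimension, each scaled by a loss-derived weight and spliced into a single pointer weight matrix. The hidden size, the fixed example weights and the splicing along the last axis are all assumptions, since the application does not fix them.

```python
# Speculative sketch of the spliced pointer weight matrix (PyTorch).
import torch
import torch.nn as nn

class PointerWeightMatrix(nn.Module):
    def __init__(self, hidden: int = 256):
        super().__init__()
        self.q_type = nn.Parameter(torch.randn(hidden, hidden))        # question types
        self.hf_entity = nn.Parameter(torch.randn(hidden, hidden))     # high-frequency entities
        self.assoc_entity = nn.Parameter(torch.randn(hidden, hidden))  # associated entities
        # per-dimension weights, e.g. derived from each dimension's loss
        self.register_buffer("dim_w", torch.tensor([0.4, 0.3, 0.3]))

    def forward(self) -> torch.Tensor:
        mats = (self.q_type, self.hf_entity, self.assoc_entity)
        weighted = [w * m for w, m in zip(self.dim_w, mats)]
        return torch.cat(weighted, dim=-1)  # spliced [hidden, 3*hidden] matrix
```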
Further, because the initial question-answering model is a supervised model, its training on the target text sense groups and the training samples can only be completed through continuous optimization and parameter tuning. In this embodiment, the training process is described in steps S1082 to S1084:
Step S1082, input the target text sense group and the sample question text in the training sample into the initial question-answering model for processing, and obtain the predicted answer text.
Specifically, the predicted answer text is the text of the answer that the initial question-answering model, after performing prediction processing on the sample question text in the training sample, retrieves from the target text sense group.
Further, what the initial question-answering model actually learns during training is the semantic association between the target text sense group and the sample question text, and this association is continuously optimized by parameter tuning until the target question-answering model satisfying the training stop condition is obtained. In this embodiment, the specific implementation is as follows:
generate word unit vectors and a scene label vector based on the sample question text, and generate a sense group vector based on the target text sense group;
integrate the word unit vectors and the scene label vector to obtain the sample question vector corresponding to the sample question text;
input the sample question vector and the sense group vector into the initial question-answering model for processing to obtain the predicted answer text.
Specifically, the word unit vectors are the lexical-syntactic unit word vectors constructed from the sample question text, and the scene label vector is the label vector, likewise constructed from the sample question text, that expresses the scene of the question. The sense group vector is the vector expression constructed from the target text sense group; converting everything into vector form makes it convenient for the model to process as input.
On this basis, after the target text sense group and the sample question text of a training sample are determined, the word unit vectors and the scene label vector can be generated from the question text and the sense group vector generated from the sense group. The word unit vectors and the scene label vector are then integrated into the sample question vector corresponding to the question text, and finally the vectorized contents (the sample question vector and the sense group vector) are input into the initial question-answering model together. The model predicts the answer text corresponding to the sample question text, which makes it convenient to optimize the model against the sample answer text in the training sample.
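The input assembly just described is simple enough to state directly; in the sketch below the 'integration' of word unit vectors and scene label vector is taken to be a broadcast addition, which is an assumption, since the application does not specify the integration operator.

```python
# Sketch of assembling model inputs. Shapes assumed: word_unit_vecs
# [seq_len, dim], scene_label_vec [dim], sense_group_vec [dim].
import torch

def build_inputs(word_unit_vecs: torch.Tensor,
                 scene_label_vec: torch.Tensor,
                 sense_group_vec: torch.Tensor):
    # integration here = broadcast addition of the scene label onto every
    # word unit vector (one possible reading of "integrating")
    sample_question_vec = word_unit_vecs + scene_label_vec
    return sample_question_vec, sense_group_vec
```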
Further, the process by which the initial question-answering model predicts answers from the vector expressions is as follows:
inputting the sample question vector and the sense group vector into the initial question-answering model, and processing the sample question vector and the sense group vector through a fusion module in the initial question-answering model to obtain a fusion vector;
inputting the fusion vector into an identification module in the initial question-answering model for processing to obtain the distribution of the associated entity core word and the context scene;
and processing the associated entity core word and the context scene distribution through an output layer in the initial question-answering model to obtain the predicted answer text.
Specifically, the fusion module is the module inside the model that performs information fusion between the sample question text and the target text meaning group; correspondingly, the fusion vector is the vector expression obtained after the sample question vector and the sense group vector are fused. The identification module is the module inside the model that identifies the associated entity core words and the context scene distribution in the fusion vector; correspondingly, the associated entity core words and the context scene distribution are the information used to locate the predicted answer text along the semantic dimension and the scene dimension.
On this basis, after the sample question vector and the sense group vector are obtained, they can be input into the initial question-answering model. The fusion module in the initial question-answering model processes the sample question vector and the sense group vector to obtain a fusion vector; the fusion vector is then input into the identification module in the initial question-answering model for processing to obtain the associated entity core words and the context scene distribution; finally, the output layer in the initial question-answering model processes the associated entity core words and the context scene distribution to obtain the predicted answer text.
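The forward pass described above can be sketched as follows; the layer types, shapes, and the span-pointer output head are assumptions of this sketch rather than the patented architecture.

```python
import torch
import torch.nn as nn

class InitialQAModel(nn.Module):
    """Hypothetical fusion -> identification -> output pipeline."""

    def __init__(self, q_dim=256, sg_dim=128, hidden=256, max_len=64):
        super().__init__()
        self.fuse = nn.Linear(q_dim + sg_dim, hidden)   # fusion module
        self.identify = nn.Linear(hidden, hidden)       # identification module
        # output layer: start/end logits locating the answer in the meaning group
        self.out = nn.Linear(hidden, 2 * max_len)
        self.max_len = max_len

    def forward(self, q_vec, sg_vec):
        # fuse the sample question vector with the sense group vector
        fusion = torch.tanh(self.fuse(torch.cat([q_vec, sg_vec], dim=-1)))
        # associated entity core words and context scene distribution
        dist = torch.relu(self.identify(fusion))
        return self.out(dist).view(-1, 2, self.max_len)  # (B, 2, max_len)
```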
Step S1084, the initial question-answer model is optimized based on the predicted answer text and the sample answer text in the training sample until the target question-answer model meeting the training stopping condition is obtained.
Specifically, after the predicted answer text is obtained, a loss function can be determined by combining the sample answer text in the training sample with the predicted answer text; the initial question-answer model is then tuned and optimized based on the loss function, and this training process is repeated until the target question-answer model satisfying the training stop condition is obtained.
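A minimal training loop consistent with this description might look as follows, reusing the `InitialQAModel` sketch above; the cross-entropy span loss and the stop criterion are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

model = InitialQAModel()                                   # sketch from above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(q_vec, sg_vec, start_gold, end_gold):
    logits = model(q_vec, sg_vec)                          # (B, 2, max_len)
    # loss between predicted answer span and the sample answer span
    loss = (F.cross_entropy(logits[:, 0], start_gold)
            + F.cross_entropy(logits[:, 1], end_gold))
    optimizer.zero_grad()
    loss.backward()                                        # parameter tuning
    optimizer.step()
    return loss.item()

# repeat train_step over the training samples until the training stop
# condition (e.g. loss below a threshold or no validation gain) is met
```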
Referring to fig. 2, when the sample question text is { what does the author of [Quiet Night Thoughts] feel }, the lexical-syntactic unit word vector and the scene label vector corresponding to the sample question text can be extracted and fused to obtain the sample question vector. The sample question vector is then processed by the semantic dependency analysis tool in the classical poetry question-answering system, and the text self-attention calculation unit performs self-attention calculation on the processing result to obtain the to-be-oriented associated entity phrase. In this process, the question-answering model trains a task-oriented poetry-corpus pointer weight matrix through the BiLSTM weight-matrix training module, so that during question-answering the text meaning group and the matrix can be combined to locate fine-grained poetry entity phrases; that is, the predicted answer text can be mapped through the fine-grained poetry entity phrases. The model is optimized according to the predicted answer text until a classical poetry question-answering system satisfying the training stop condition is obtained, which can then respond accurately when performing classical poetry question-answering.
In summary, after the initial text meaning group corresponding to the sample corpus is constructed, the scene oriented word list space corresponding to the sample corpus is generated based on the initial text meaning group, preparing sufficient corpus for model training. A training sample is then obtained, the sample phrase corresponding to the training sample is determined, the scene oriented word list space is queried with the sample phrase, and the target text meaning group corresponding to the training sample is determined from the query result. Finally, the initial question-answering model is trained with the target text meaning group and the training sample until the training stop condition is met, yielding the target question-answering model. In this way, the question and the corpus are captured at the semantic level, which effectively guarantees the prediction accuracy of the trained question-answering model; and the scene oriented word list space built from abundant sample corpora in the preparation stage effectively improves the processing capability of the question-answering model, so that question-answering tasks are completed accurately and efficiently.
Corresponding to the above method embodiment, the present application further provides an embodiment of a training device for a question-and-answer model, and fig. 3 shows a schematic structural diagram of the training device for a question-and-answer model provided in an embodiment of the present application. As shown in fig. 3, the apparatus includes:
a building module 302 configured to build an initial text meaning group corresponding to a sample corpus, and generate a scene-oriented word list space corresponding to the sample corpus based on the initial text meaning group;
an obtaining module 304, configured to obtain a training sample, and determine a sample phrase corresponding to the training sample;
a determining module 306 configured to query the scene oriented word list space based on the sample word group, and determine a target text meaning group corresponding to the training sample according to a query result;
a training module 308 configured to train the initial question-answering model using the target text meaning groups and the training samples until a target question-answering model satisfying a training stop condition is obtained.
In an optional embodiment, the building module 302 is further configured to:
adding context labels to the sample corpus, and extracting initial phrases in the initial text meaning groups; and establishing a corresponding relation between the context label and the initial text meaning group, and establishing the scene oriented word list space corresponding to the sample corpus according to the corresponding relation and the initial phrase.
In an optional embodiment, the building module 302 is further configured to:
extracting a plurality of initial features of the sample corpus, and preprocessing the initial features to obtain a plurality of target features; calculating the context similarity of each target feature and the sample corpus, selecting at least one target feature as the context label according to the calculation result of the context similarity, and adding the target feature to the sample corpus.
In an optional embodiment, the determining module 306 is further configured to:
mapping the sample phrase to the scene directional word list space, and calculating the phrase similarity of the sample phrase and the context label; and determining a target context label according to the phrase similarity calculation result, and taking an initial text meaning group corresponding to the target context label as the target text meaning group.
In an optional embodiment, the obtaining module 304 is further configured to:
analyzing the training sample to obtain a sample question text in the training sample; and extracting a first word unit and a second word unit in the sample question text, and constructing the sample phrase based on the first word unit and the second word unit.
In an optional embodiment, the training module 308 is further configured to:
inputting the target text meaning groups and the sample question texts in the training samples into the initial question-answering model for processing to obtain predicted answer texts; and optimizing the initial question-answer model based on the predicted answer text and the sample answer text in the training sample until the target question-answer model meeting the training stopping condition is obtained.
In an optional embodiment, the training module 308 is further configured to:
generating word unit vectors and scene label vectors based on the sample question text, and generating sense group vectors based on the target text meaning groups; integrating the word unit vector and the scene label vector to obtain a sample question vector corresponding to the sample question text; and inputting the sample question vector and the sense group vector into the initial question-answering model for processing to obtain the predicted answer text.
In an optional embodiment, the training module 308 is further configured to:
inputting the sample question vector and the sense group vector into the initial question-answering model, and processing the sample question vector and the sense group vector through a fusion module in the initial question-answering model to obtain a fusion vector; inputting the fusion vector into an identification module in the initial question-answering model for processing to obtain the associated entity core words and the context scene distribution; and processing the associated entity core words and the context scene distribution through an output layer in the initial question-answering model to obtain the predicted answer text.
According to the question-answering model training device described above, after the initial text meaning group corresponding to the sample corpus is constructed, the scene oriented word list space corresponding to the sample corpus is generated based on the initial text meaning group, preparing sufficient corpus for model training. A training sample is then obtained, the sample phrase corresponding to the training sample is determined, the scene oriented word list space is queried with the sample phrase, and the target text meaning group corresponding to the training sample is determined from the query result. Finally, the initial question-answering model is trained with the target text meaning group and the training sample until the training stop condition is met, yielding the target question-answering model. In this way, the question and the corpus are captured at the semantic level, which effectively guarantees the prediction accuracy of the trained question-answering model; and the scene oriented word list space built from abundant sample corpora in the preparation stage effectively improves the processing capability of the question-answering model, so that question-answering tasks are completed accurately and efficiently.
The above is an illustrative scheme of a training apparatus for a question-answering model according to this embodiment. It should be noted that the technical solution of the training device of the question-answering model and the technical solution of the training method of the question-answering model belong to the same concept, and details that are not described in detail in the technical solution of the training device of the question-answering model can be referred to the description of the technical solution of the training method of the question-answering model.
Further, the components in the device embodiment should be understood as functional modules established as necessary to implement the steps of the program flow or of the method; the division into functional modules is logical rather than an actual physical separation. Device claims defined by such a set of functional modules should be understood as a functional-module architecture that implements the solution mainly through the computer program described in this specification, rather than as a physical device that implements the solution mainly through hardware.
The present embodiment further provides a text processing method, and fig. 4 shows a flowchart of a text processing method according to an embodiment of the present application, which specifically includes the following steps:
step S402, obtaining the question text uploaded by the user.
Step S404, inputting the question text into the target question-answer model obtained by the above question-answer model training method for processing to obtain an answer text.
Step S406, updating a reply interface based on the answer text, and displaying the updated reply interface to the user.
For example, when the user inputs the question text { please provide a frontier poem }, the question text can be input into the classical poetry question-answer model for processing, and the answer text "On a Mission to the Frontier" is obtained from the prediction result. At this point, in order to show this classical poem to the user, the reply interface displayed to the user can be updated based on the poem text "On a Mission to the Frontier", and the reply interface containing that poem text is displayed to the user according to the update result.
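Steps S402 to S406 can be wired together as in the short sketch below; `target_qa_model.predict` and the interface-rendering string are placeholders standing in for the trained model and the real reply interface, not an actual API.

```python
def handle_user_question(question_text: str, target_qa_model) -> str:
    # step S404: query the trained target question-answer model
    answer_text = target_qa_model.predict(question_text)
    # step S406: update the reply interface shown to the user
    return f"<reply>{answer_text}</reply>"
```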
In conclusion, the target question-answering model obtained by the training method is used for processing the question text, so that the reply accuracy can be effectively improved, the response speed is higher, and the use experience of the user is improved.
The method is further described below with reference to fig. 5, taking the application of the method provided in the present application to an ancient poetry question-answering scene as an example. Fig. 5 shows a processing flowchart of the method applied to an ancient poetry question-answering scene according to an embodiment of the present application, which specifically includes the following steps:
Step S502, constructing initial text meaning groups corresponding to the classical poetry corpora.
In the process of constructing the initial text meaning groups corresponding to the classical poetry corpora, each sentence of every classical poetry corpus is divided into a plurality of components according to its meaning and structure, and these components form the initial text meaning group corresponding to that sample corpus.
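Purely as an illustration of splitting a poem sentence into components, the toy function below segments on punctuation and then into two-character chunks; the actual meaning-and-structure segmentation rules are not specified here and this split is an assumption of the sketch.

```python
import re

def build_initial_sense_groups(poem: str) -> list[list[str]]:
    """Toy sketch: one component list (meaning group) per poem sentence."""
    groups = []
    for line in re.split(r"[，。！？\n]", poem):
        line = line.strip()
        if line:
            # illustrative component split: two-character chunks
            groups.append([line[i:i + 2] for i in range(0, len(line), 2)])
    return groups

print(build_initial_sense_groups("床前明月光，疑是地上霜。"))
# [['床前', '明月', '光'], ['疑是', '地上', '霜']]
```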
Step S504, generating a scene directional entity word list space based on the initial text meaning group.
After the initial text meaning groups corresponding to the classical poetry corpora are obtained, context labels are added to the classical poetry corpora and the initial phrases in the initial text meaning groups are extracted, where the added context labels include { hometown; parting; frontier; friendship; scenery; sadness; … } and the initial phrases of the initial text meaning groups consist of nominal base words and verb phrase stems.
Further, after the context labels and the initial phrases are determined, the correspondence between the context labels and the initial text meaning groups can be established, and the scene oriented entity word list space corresponding to the classical poetry corpora is then constructed from this correspondence and the initial phrases.
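A minimal data structure for the scene oriented entity word list space might map each context label to its meaning groups and initial phrases, as sketched below; the structure and the sample entries are assumptions for illustration.

```python
from collections import defaultdict

# context label -> initial text meaning groups and initial phrases
word_list_space = defaultdict(lambda: {"meaning_groups": [], "phrases": []})

def register(context_label: str, meaning_group: list[str], initial_phrase: str):
    entry = word_list_space[context_label]
    entry["meaning_groups"].append(meaning_group)  # correspondence: label -> group
    entry["phrases"].append(initial_phrase)        # nominal base word + verb stem

register("hometown", ["床前", "明月", "光"], "望 明月")
register("frontier", ["大漠", "孤烟", "直"], "孤烟 直")
```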
Step S506, a training sample is obtained, and the associated entity phrase to be oriented corresponding to the sample question text in the training sample is determined.
After the training sample is obtained, the sample question text can be extracted from the training sample, the nominal base words and verb phrase stems in the sample question text are extracted, and the two are then integrated to generate the to-be-oriented associated entity phrase used for subsequent training of the classical poetry question-answer model.
Step S508, mapping the associated entity phrase to be oriented to the scene oriented entity word list space, and calculating the phrase similarity between the associated entity phrase to be oriented and the context label contained in the space.
Step S510, determining a target context label according to the phrase similarity calculation result, and using an initial text meaning group corresponding to the target context label as a target text meaning group corresponding to the sample question text.
After the to-be-oriented associated entity phrase is mapped into the scene oriented entity word list space, the phrase similarity between the to-be-oriented associated entity phrase and the context labels contained in the space is calculated. The context label with the highest phrase similarity is selected as the target context label, the initial text meaning group corresponding to that label is determined from the scene oriented entity word list space, and this initial text meaning group is selected as the target text meaning group for subsequent model training.
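Steps S508 and S510 can be sketched as a cosine-similarity lookup over the space built above; the label embedding vectors are a placeholder assumption.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def select_target_meaning_group(phrase_vec: np.ndarray,
                                label_vecs: dict,
                                word_list_space: dict) -> list:
    # step S508: phrase similarity between the phrase and every context label
    best_label = max(label_vecs, key=lambda lb: cosine(phrase_vec, label_vecs[lb]))
    # step S510: the meaning groups filed under the best label become the target
    return word_list_space[best_label]["meaning_groups"]
```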
Step S512, training the classical poetry question-answer model by using the target text meaning group and the sample question text and sample answer text contained in the training sample until the target classical poetry question-answer model meeting the training stop condition is obtained.
At this point, the classical poetry question-answer model can be trained by combining the target text meaning group with the sample question text and sample answer text contained in the training sample, so that the model learns the fine-grained semantic associations between different question types and the target text meaning group; through continuous iteration and optimization, the target classical poetry question-answer model satisfying the training stop condition is then obtained.
Step S514, receiving the text of the question to be answered input by the user.
After the classical poetry question-answer model has been trained, it can be put into use; at this point, the to-be-answered question text input by the user is received: { please provide an ancient poem depicting homesickness }.
Step S516, inputting the question text to be answered into the target classical poetry question-answer model to obtain a target answer text.
Step S518, updating the reply interface based on the target answer text, and displaying the updated reply interface to the user.
The to-be-answered question text { please provide an ancient poem depicting homesickness } is processed by the target classical poetry question-answer model to obtain the answer text "Quiet Night Thoughts"; the reply interface is then updated based on the ancient poem "Quiet Night Thoughts" and its corresponding text, and the reply interface carrying the text of "Quiet Night Thoughts" is displayed to the user.
In conclusion, training the question-answering model in the above manner effectively guarantees the prediction accuracy of the trained question-answering model; the scene oriented word list space, constructed from abundant sample corpora in the preparation stage, effectively improves the processing capability of the question-answering model, so that question-answering tasks are completed accurately and efficiently.
Corresponding to the above method embodiment, the present application further provides a text processing apparatus embodiment, and fig. 6 shows a schematic structural diagram of a text processing apparatus provided in an embodiment of the present application. As shown in fig. 6, the apparatus includes:
an obtaining text module 602 configured to obtain a question text uploaded by a user;
a text processing module 604, configured to input the question text into a target question-answer model in the training method of the question-answer model for processing, so as to obtain an answer text;
an interface display module 606 configured to update a reply interface based on the answer text and display the updated reply interface to the user.
In conclusion, the target question-answering model obtained by the training method is used for processing the question text, so that the reply accuracy can be effectively improved, the response speed is higher, and the use experience of the user is improved.
The above is an illustrative scheme of the text processing apparatus of this embodiment. It should be noted that the technical solution of the text processing apparatus and the technical solution of the text processing method belong to the same concept; for details not described in the technical solution of the text processing apparatus, refer to the description of the technical solution of the text processing method.
Further, the components in the apparatus embodiment should be understood as functional modules established as necessary to implement the steps of the program flow or of the method; the division into functional modules is logical rather than an actual physical separation. Apparatus claims defined by such a set of functional modules should be understood as a functional-module architecture that implements the solution mainly through the computer program described in this specification, rather than as a physical device that implements the solution mainly through hardware.
Fig. 7 illustrates a block diagram of a computing device 700 provided according to an embodiment of the present application. The components of the computing device 700 include, but are not limited to, memory 710 and a processor 720. Processor 720 is coupled to memory 710 via bus 730, and database 750 is used to store data.
Computing device 700 also includes access device 740, access device 740 enabling computing device 700 to communicate via one or more networks 760. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. Access device 740 may include one or more of any type of network interface, e.g., a Network Interface Card (NIC), wired or wireless, such as an IEEE802.11 Wireless Local Area Network (WLAN) wireless interface, a worldwide interoperability for microwave access (Wi-MAX) interface, an ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the application, the above-described components of the computing device 700 and other components not shown in fig. 7 may also be connected to each other, for example, by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 7 is for purposes of example only and is not limiting as to the scope of the present application. Those skilled in the art may add or replace other components as desired.
Computing device 700 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 700 may also be a mobile or stationary server.
Wherein, the processor 720 is used for executing the computer-executable instructions of the training method or the text processing method of the question-answering model.
The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the above-mentioned question-and-answer model training method or text processing method belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the above-mentioned question-and-answer model training method or text processing method.
An embodiment of the present application further provides a computer-readable storage medium storing computer instructions, which when executed by a processor, are used for a method for training a question-and-answer model or a method for processing a text.
The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the above-mentioned question-and-answer model training method or text processing method belong to the same concept, and details of the technical solution of the storage medium, which are not described in detail, can be referred to the description of the technical solution of the above-mentioned question-and-answer model training method or text processing method.
The foregoing description of specific embodiments of the present application has been presented. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code which may be in the form of source code, object code, an executable file or some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
It should be noted that, for the sake of simplicity, the above-mentioned method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present application is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present application disclosed above are intended only to aid in the explanation of the application. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and its practical applications, to thereby enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.

Claims (12)

1. A method for training a question-answering model is characterized by comprising the following steps:
constructing an initial text meaning group corresponding to a sample corpus, adding context labels to the sample corpus, extracting initial phrases in the initial text meaning group, establishing a corresponding relation between the context labels and the initial text meaning group, and constructing a scene oriented word list space corresponding to the sample corpus according to the corresponding relation and the initial phrases;
acquiring a training sample, and determining a sample phrase corresponding to the training sample;
inquiring the scene oriented word list space based on the sample word group, and determining a target text meaning group corresponding to the training sample according to an inquiry result;
and training an initial question-answering model by using the target text meaning group and the training sample until a target question-answering model meeting the training stopping condition is obtained.
2. The method for training a question-answering model according to claim 1, wherein the adding context labels to the sample corpus comprises:
extracting a plurality of initial features of the sample corpus, and preprocessing the initial features to obtain a plurality of target features;
calculating the context similarity of each target feature and the sample corpus, selecting at least one target feature as the context label according to the calculation result of the context similarity, and adding the target feature to the sample corpus.
3. The method for training a question-answer model according to claim 2, wherein the step of querying the scene oriented word list space based on the sample word group and determining a target text meaning group corresponding to the training sample according to a query result comprises:
mapping the sample phrase to the scene directional word list space, and calculating the phrase similarity of the sample phrase and the context label;
and determining a target context label according to the phrase similarity calculation result, and taking an initial text meaning group corresponding to the target context label as the target text meaning group.
4. The method for training the question-answering model according to claim 1, wherein the determining the sample phrase corresponding to the training sample comprises:
analyzing the training sample to obtain a sample question text in the training sample;
and extracting a first word unit and a second word unit in the sample question text, and constructing the sample phrase based on the first word unit and the second word unit.
5. The method for training the question-answer model according to claim 4, wherein the training an initial question-answer model by using the target text meaning groups and the training samples until a target question-answer model satisfying a training stop condition is obtained comprises:
inputting the target text meaning groups and the sample question texts in the training samples into the initial question-answering model for processing to obtain predicted answer texts;
and optimizing the initial question-answer model based on the predicted answer text and the sample answer text in the training sample until the target question-answer model meeting the training stopping condition is obtained.
6. The method for training the question-answer model according to claim 5, wherein the step of inputting the target text meaning group and the sample question texts in the training samples into the initial question-answer model for processing to obtain predicted answer texts comprises:
generating word unit vectors and scene label vectors based on the sample question text, and generating sense group vectors based on the target text meaning groups;
integrating the word unit vector and the scene label vector to obtain a sample question vector corresponding to the sample question text;
and inputting the sample question vector and the sense group vector into the initial question-answering model for processing to obtain the predicted answer text.
7. The method for training question-answer models according to claim 6, wherein the inputting the sample question vectors and the sense group vectors into the initial question-answer model for processing to obtain the predicted answer text comprises:
inputting the sample question vector and the sense group vector into the initial question-answering model, and processing the sample question vector and the sense group vector through a fusion module in the initial question-answering model to obtain a fusion vector;
inputting the fusion vector into an identification module in the initial question-answering model for processing to obtain the distribution of the associated entity core word and the context scene;
and processing the associated entity core word and the context scene distribution through an output layer in the initial question-answering model to obtain the predicted answer text.
8. A device for training a question-answering model, comprising:
the construction module is configured to construct an initial text meaning group corresponding to a sample corpus, add context labels to the sample corpus, extract initial phrases in the initial text meaning group, establish a corresponding relation between the context labels and the initial text meaning group, and construct a scene oriented word list space corresponding to the sample corpus according to the corresponding relation and the initial phrases;
the acquisition module is configured to acquire a training sample and determine a sample phrase corresponding to the training sample;
the determining module is configured to query the scene oriented word list space based on the sample word group and determine a target text meaning group corresponding to the training sample according to a query result;
and the training module is configured to train the initial question-answering model by using the target text meaning group and the training samples until a target question-answering model meeting the training stopping condition is obtained.
9. A method of text processing, comprising:
acquiring a question text uploaded by a user;
inputting the question text into a target question-answer model in the method according to any one of claims 1 to 7 for processing to obtain an answer text;
and updating a reply interface based on the answer text, and displaying the updated reply interface to the user.
10. A text processing apparatus, comprising:
the text acquisition module is configured to acquire a question text uploaded by a user;
a text processing module configured to input the question text into the target question-answer model in the method according to any one of claims 1 to 7 for processing to obtain an answer text;
and the interface display module is configured to update a reply interface based on the answer text and display the updated reply interface to the user.
11. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions to implement the steps of the method of any one of claims 1 to 7 or 9.
12. A computer-readable storage medium storing computer instructions, which when executed by a processor, perform the steps of the method of any one of claims 1 to 7 or 9.
CN202110665052.0A 2021-06-16 2021-06-16 Question-answer model training method and device Active CN113127624B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202110665052.0A CN113127624B (en) 2021-06-16 2021-06-16 Question-answer model training method and device
CN202111256825.6A CN113987147A (en) 2021-06-16 2021-06-16 Sample processing method and device
CN202111258418.9A CN113901191A (en) 2021-06-16 2021-06-16 Question-answer model training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110665052.0A CN113127624B (en) 2021-06-16 2021-06-16 Question-answer model training method and device

Related Child Applications (2)

Application Number Title Priority Date Filing Date
CN202111256825.6A Division CN113987147A (en) 2021-06-16 2021-06-16 Sample processing method and device
CN202111258418.9A Division CN113901191A (en) 2021-06-16 2021-06-16 Question-answer model training method and device

Publications (2)

Publication Number Publication Date
CN113127624A (en) 2021-07-16
CN113127624B (en) 2021-11-16

Family

ID=76783260

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202111256825.6A Pending CN113987147A (en) 2021-06-16 2021-06-16 Sample processing method and device
CN202110665052.0A Active CN113127624B (en) 2021-06-16 2021-06-16 Question-answer model training method and device
CN202111258418.9A Pending CN113901191A (en) 2021-06-16 2021-06-16 Question-answer model training method and device

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202111256825.6A Pending CN113987147A (en) 2021-06-16 2021-06-16 Sample processing method and device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202111258418.9A Pending CN113901191A (en) 2021-06-16 2021-06-16 Question-answer model training method and device

Country Status (1)

Country Link
CN (3) CN113987147A (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113962315B (en) * 2021-10-28 2023-12-22 北京百度网讯科技有限公司 Model pre-training method, device, equipment, storage medium and program product
CN114611529B (en) * 2022-03-15 2024-02-02 平安科技(深圳)有限公司 Intention recognition method and device, electronic equipment and storage medium
CN116204726B (en) * 2023-04-28 2023-07-25 杭州海康威视数字技术股份有限公司 Data processing method, device and equipment based on multi-mode model
CN116450796B (en) * 2023-05-17 2023-10-17 中国兵器工业计算机应用技术研究所 Intelligent question-answering model construction method and device
CN117271751B (en) * 2023-11-16 2024-02-13 北京百悟科技有限公司 Interaction method, device, equipment and storage medium
CN117574286A (en) * 2024-01-11 2024-02-20 阿里健康科技(杭州)有限公司 Method, device, equipment and storage medium for determining tag value

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102200212B1 (en) * 2018-12-07 2021-01-08 서울대학교 산학협력단 Apparatus and Method for Generating Sampling Model for Uncertainty Prediction, Apparatus for Predicting Uncertainty

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677779A (en) * 2015-12-30 2016-06-15 山东大学 Feedback-type question type classifier system based on scoring mechanism and working method thereof
CN111712836A (en) * 2018-02-09 2020-09-25 易享信息技术有限公司 Multitask learning as question and answer
CN108959412A (en) * 2018-06-07 2018-12-07 出门问问信息科技有限公司 Generation method, device, equipment and the storage medium of labeled data
CN109446399A (en) * 2018-10-16 2019-03-08 北京信息科技大学 A kind of video display entity search method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding; Jacob Devlin et al.; Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics; 2019-05-24; whole document *
Research on Lexical Item Expansion Algorithms in Semantic Retrieval Models; Zhao Wenjuan et al.; 《情报科学》 (Information Science); May 2019; Vol. 37, No. 5; whole document *

Also Published As

Publication number Publication date
CN113987147A (en) 2022-01-28
CN113901191A (en) 2022-01-07
CN113127624A (en) 2021-07-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant