CN116306974A - Model training method and device of question-answering system, electronic equipment and storage medium - Google Patents

Model training method and device of question-answering system, electronic equipment and storage medium

Info

Publication number
CN116306974A
CN116306974A (application CN202310247092.2A)
Authority
CN
China
Prior art keywords
question
vector
intention
entity
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310247092.2A
Other languages
Chinese (zh)
Inventor
夏志超
马超
肖冰
夏粉
蒋宁
吴海英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Consumer Finance Co Ltd
Original Assignee
Mashang Consumer Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Consumer Finance Co Ltd filed Critical Mashang Consumer Finance Co Ltd
Priority to CN202310247092.2A priority Critical patent/CN116306974A/en
Publication of CN116306974A publication Critical patent/CN116306974A/en
Priority to PCT/CN2024/070737 priority patent/WO2024187925A1/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval of unstructured textual data
    • G06F 16/33: Querying
    • G06F 16/332: Query formulation
    • G06F 16/3329: Natural language query formulation or dialogue systems
    • G06F 16/35: Clustering; Classification
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/205: Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the disclosure provide a model training method and device of a question-answering system, an electronic device, and a storage medium. The model training method of the question-answering system includes the following steps: acquiring a question text sample; and inputting the question text sample into an initial problem analysis model for iterative training to obtain a problem analysis model. The initial problem analysis model includes a first coding layer and a conversion layer. The first coding layer performs coding processing on the question text sample to obtain a corresponding first sentence pattern vector. The conversion layer, upon receiving the first sentence pattern vector, generates a preset number of initial intention vectors, fills each initial intention vector according to the first sentence pattern vector to obtain a corresponding target intention vector, and converts the target intention vector into a corresponding text segment. The text segment is used for querying, in the question-answering system, the answer corresponding to the question text sample, so that the intention recognition capability of the question-answering system is improved.

Description

Model training method and device of question-answering system, electronic equipment and storage medium
Technical Field
The application relates to the technical field of knowledge graphs, in particular to a model training method and device of a question-answering system, electronic equipment and a storage medium.
Background
With the development of Internet technology, question-answering systems have become increasingly widely used. A question-answering system can automatically answer questions raised by users, which improves answering efficiency and saves human resources. In practical applications, however, the questions users ask tend to be colloquial, so their sentence patterns are rich and varied, which increases the difficulty of intention recognition for the question-answering system.
Disclosure of Invention
The embodiment of the application provides a model training method and device of a question-answering system, electronic equipment and a storage medium, so as to improve the intention recognition capability of the question-answering system.
In a first aspect, an embodiment of the present application provides a model training method of a question-answering system, where the question-answering system includes an initial problem analysis model; the method comprises the following steps:
acquiring a problem text sample;
inputting the problem text sample into the initial problem analysis model for iterative training to obtain a problem analysis model; the initial problem analysis model comprises a first coding layer and a conversion layer;
the first coding layer is used for carrying out coding processing according to the problem text sample to obtain a corresponding first sentence pattern vector;
the conversion layer is used for generating a preset number of initial intention vectors under the condition that the first sentence pattern vector is received, filling each initial intention vector according to the first sentence pattern vector to obtain a corresponding target intention vector, and converting the target intention vector into a corresponding text segment; and the text segment is used for inquiring answers corresponding to the question text samples in the question and answer system.
In a second aspect, an embodiment of the present application provides a response method, including:
acquiring a target problem to be responded;
inputting the target problem into a problem analysis model for analysis processing to obtain a corresponding target segment; the problem analysis model is obtained by training with the model training method of the question-answering system according to the first aspect;
and determining the answer of the target question according to the target segment.
In a third aspect, an embodiment of the present application provides a model training apparatus of a question-answering system, where the question-answering system includes an initial problem analysis model; the device comprises:
the first acquisition unit is used for acquiring a question text sample;
the training unit is used for inputting the problem text sample into the initial problem analysis model for iterative training to obtain a problem analysis model; the initial problem analysis model comprises a first coding layer and a conversion layer;
the first coding layer is used for carrying out coding processing according to the problem text sample to obtain a corresponding first sentence pattern vector;
the conversion layer is used for generating a preset number of initial intention vectors under the condition that the first sentence pattern vector is received, filling each initial intention vector according to the first sentence pattern vector to obtain a corresponding target intention vector, and converting the target intention vector into a corresponding text segment; and the text segment is used for inquiring answers corresponding to the question text samples in the question and answer system.
In a fourth aspect, an embodiment of the present application provides a response device, including:
the second acquisition unit is used for acquiring a target problem to be responded;
the analysis unit is used for inputting the target problem into a problem analysis model to carry out analysis processing to obtain a corresponding target segment; the problem analysis model is obtained by training the model training method of the question answering system according to the first aspect;
and the determining unit is used for determining the answer of the target question according to the target segment.
In a fifth aspect, embodiments of the present application provide an electronic device, including: a processor; and a memory configured to store computer-executable instructions that, when executed, cause the processor to perform the model training method of the question-answering system as described in the first aspect, or the answer method as described in the second aspect.
In a sixth aspect, embodiments of the present application provide a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement a model training method of a question-answering system according to the first aspect, or a response method according to the second aspect.
It can be seen that in the embodiments of the present application, a question text sample is first obtained; the question text sample is then input into an initial problem analysis model for iterative training to obtain a problem analysis model. The initial problem analysis model includes a first coding layer and a conversion layer. The first coding layer performs coding processing on the question text sample to obtain a corresponding first sentence pattern vector. The conversion layer, upon receiving the first sentence pattern vector, generates a preset number of initial intention vectors, fills each initial intention vector according to the first sentence pattern vector to obtain a corresponding target intention vector, and converts the target intention vector into a corresponding text segment. The text segment is used for querying, in the question-answering system, the answer corresponding to the question text sample. In this way, the initial problem analysis model is iteratively trained on the acquired question text samples. During training, a question text sample is encoded by the first coding layer into a corresponding first sentence pattern vector, a preset number of initial intention vectors are generated by the conversion layer, and each initial intention vector is then filled based on the first sentence pattern vector, so that the question text sample is parsed in a single pass into text segments corresponding to at least one target intention vector, and the question intention reflected by each text segment is determined by its corresponding initial intention vector. The problem analysis model can thus map a question text sample into different intention spaces and label entities and constraints under the intention vector of each intention space. This helps to improve slot recognition efficiency, reduces the number of models required in the slot recognition processing flow, and reduces latency; and because slot recognition is executed simultaneously in each intention space, concurrency can be increased, thereby improving the intention recognition capability and recognition efficiency of the question-answering system.
Drawings
For a clearer description of embodiments of the present application or of the solutions of the prior art, the drawings that are required to be used in the description of the embodiments or of the prior art will be briefly described, it being obvious that the drawings in the description below are only some of the embodiments described in the present specification, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art;
FIG. 1 is a process flow diagram of a model training method of a question-answering system according to an embodiment of the present application;
fig. 2 is a data flow chart of a problem analysis model in a model training method of a question answering system according to an embodiment of the present application;
FIG. 3 is a data flow chart of an entity link model in a model training method of a question-answering system according to an embodiment of the present application;
fig. 4 is a process flow diagram of a response method provided in an embodiment of the present application;
fig. 5 is a session management flow chart in a response method provided in an embodiment of the present application;
fig. 6 is a schematic diagram of a model training device of a question-answering system according to an embodiment of the present application;
fig. 7 is a schematic diagram of a response device according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to better understand the technical solutions in the embodiments of the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.
In the practical application of a question-answering system, on the one hand, the questions users ask may be colloquial, so their sentence patterns are rich and varied, which increases the difficulty of intention recognition for the question-answering system; on the other hand, the information a user provides when asking may be one-sided, and if the system simply answers a question that carries insufficient information, it is likely to fail to find a reply that satisfies the user. To solve the above problems, an embodiment of the present application provides a model training method of a question-answering system.
Fig. 1 is a process flow diagram of a model training method of a question-answering system according to an embodiment of the present application. The model training method of the question-answering system of fig. 1 can be performed by an electronic device, which may be a terminal device such as a mobile phone, a notebook computer, an intelligent interaction device, etc.; alternatively, the electronic device may be a server, such as a stand-alone physical server, a server cluster, or a cloud server capable of cloud computing. Referring to fig. 1, the model training method of the question-answering system provided in this embodiment specifically includes steps S102 to S104.
The question answering system may include an initial question resolution model, and may also include an initial question classification model and an initial entity link model.
The initial problem classification model may be an untrained OOD (Out-Of-Domain, i.e., outside the design scope) model.
In implementation, a problem domain may be preconfigured, and a problem located inside the problem domain is an intra-domain problem, and a problem located outside the problem domain is an extra-domain problem. The question answering system has the ability to answer questions within the domain and does not have the ability to answer questions outside the domain. The OOD model may be used to classify the received problem, where the classification result includes at least: intra-domain issues and extra-domain issues.
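The role of the OOD model can be illustrated with a toy stand-in. The keyword-overlap heuristic below is purely an assumption for illustration (the patent trains a classification model instead), and all names are hypothetical:

```python
def classify_question(question, domain_keywords):
    """Toy stand-in for the OOD model: a question is treated as
    intra-domain if it mentions any preconfigured domain keyword."""
    tokens = set(question.lower().split())
    return "intra-domain" if tokens & domain_keywords else "extra-domain"

# Hypothetical problem domain for a consumer-finance question-answering system.
domain = {"interest", "rate", "loan", "repayment"}
print(classify_question("What is the loan interest rate?", domain))  # intra-domain
print(classify_question("Will it rain tomorrow?", domain))           # extra-domain
```

The real model replaces this heuristic with a learned decision boundary; only questions classified as intra-domain proceed to the problem analysis model.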
The initial problem resolution model may be an untrained relationship extraction model. The relation extraction model can be used for carrying out slot identification processing to obtain slot identification results, wherein the slot identification results include but are not limited to: entities, attributes, relationships, constraints, etc.
An entity may be a person name, a place name, an institution name, a preset proper noun, or the like. A concept, also called a class, is an abstract description of a collection of objects sharing the same characteristics, and may be used to reflect a category. One concept may correspond to multiple entities.
For example, the concept "plant" may correspond to the following entities: "willow", "cactus", "sakura", etc.
The attributes may be used to reflect the characteristics of the entity. A concept may correspond to multiple attributes.
For example, the concept "plant" may correspond to the following attributes: "name", "category", "shape", "growing environment", "distribution range", and "breeding method", etc.
For each attribute, an attribute name, an attribute type, and an attribute description may be preconfigured.
The attribute types include, but are not limited to: text, number, picture, rich text, and JSON.
Relationships are used to describe the relations between concepts, and are divided into categorical relationships and non-categorical relationships. In practical applications, relationships such as a "causal relationship" can be customized according to the specific field and application.
A constraint is a restricting condition. For example, in the question "What services have an annual interest rate less than x%?", the span "annual interest rate less than x%" is recognized as a constraint in the slot identification result.
The initial entity linking model may be an untrained BERT classification model. The BERT classification model may be used to perform classification prediction, linking an input entity fragment to one entity node in a preconfigured knowledge graph.
The knowledge graph includes a plurality of entity nodes, each corresponding to one entity. A knowledge graph can be constructed from an ontology and a corpus: knowledge extraction produces triples, and the constructed graph encodes knowledge about the business data.
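At its simplest, such a knowledge graph can be viewed as a collection of (head entity, attribute, value) triples. A minimal sketch with hypothetical data drawn from the plant example used in this description:

```python
# Hypothetical triples; entity names, attributes, and values are illustrative only.
triples = [
    ("willow", "category", "plant"),
    ("willow", "growing environment", "riverbank"),
    ("cactus", "category", "plant"),
]

def lookup(head, attribute):
    """Return all values linked to a head entity via an attribute."""
    return [v for h, a, v in triples if h == head and a == attribute]

print(lookup("willow", "category"))  # ['plant']
```

A production graph store would index these triples rather than scan a list, but the query shape (head entity plus attribute yields values) is the same one the answer lookup relies on.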
Step S102, acquiring a question text sample.
The question text sample may be a pre-generated question resolution sample. Each question text sample may be a word, a word combination, an incomplete sentence or a complete sentence, etc.
The problem resolution sample may be generated as follows:
acquiring a preset synonym set, a similarity question set and a comparison word set; the synonym set comprises a plurality of standard words and at least one synonym corresponding to each standard word; the similarity question set comprises a plurality of first attributes and at least one similarity question corresponding to each first attribute; the comparison word set comprises comparison word information; and generating a problem analysis sample according to at least one of the synonym set, the similarity question set and the comparison word set.
In particular implementations, the set of synonyms may be obtained from a pre-configured synonym template.
In the synonym template, the following fields may be displayed: standard word, type, synonym, update time, operation, and so on. Specifically:
(a1) Standard word: automatically generated by the system; its value is derived from the entities and attributes defined in the corresponding graph.
(a2) Type: automatically generated by the system; its value is derived from the types of the entities and attributes defined in the corresponding graph.
(a3) Synonyms: entered manually via "edit" or in batches via "upload synonyms".
(a4) Update time: the latest modification time of the synonym.
The synonym set includes a plurality of standard words and at least one synonym corresponding to each standard word. For example, the synonym set includes:
a standard word x1, a synonym x2, a synonym x3 and a synonym x4 corresponding to the standard word x 1;
a standard word y1, and a synonym y2 corresponding to the standard word y 1;
a standard word z1, a synonym z2 corresponding to the standard word z1, a synonym z3, and the like.
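Such a synonym set can be represented as a simple mapping from each standard word to its synonyms. A minimal in-memory sketch using the placeholder words above (the helper name is an assumption):

```python
# Synonym set: standard word -> synonyms, following the example above.
synonym_set = {
    "x1": ["x2", "x3", "x4"],
    "y1": ["y2"],
    "z1": ["z2", "z3"],
}

def surface_forms(standard_word):
    """All text forms a standard word may take: itself plus its synonyms."""
    return [standard_word] + synonym_set.get(standard_word, [])

print(surface_forms("x1"))  # ['x1', 'x2', 'x3', 'x4']
```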
In implementation, the similarity question set may be obtained from a preconfigured similarity question template.
In the similarity question template, the entity type column corresponds to the concept, and the attribute column corresponds to the attribute for which the similarity question is constructed. When maintaining the similarity question template, a mask [e] may be configured to leave a place in which a synonym is later filled.
For example, the question "how should this parking space credit be handled?" may be maintained in the similarity question template as "how should this [e] be handled?".
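Filling the [e] mask with a concrete entity word or synonym is a plain substitution. A minimal sketch (the helper name is an assumption):

```python
def fill_mask(template, word):
    """Replace the [e] placeholder in a similarity-question template."""
    return template.replace("[e]", word)

print(fill_mask("how should this [e] be handled?", "parking space credit"))
# how should this parking space credit be handled?
```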
The similarity question template can be used for generating a question analysis data set which comprises a plurality of question analysis samples, and can be used for forming complete sentences together with synonyms and comparison words to simulate user problems in a real scene.
In the similarity question template, the following fields may be displayed: concept, attribute name, attribute type, similarity question, update time, operation, and so on. Specifically:
(b1) Concept: a concept already defined in the corresponding graph; it can be searched but cannot be added or edited here.
(b2) Attribute type: an attribute defined in the corresponding graph; it cannot be edited. Attribute types include, but are not limited to: text, number, picture, rich text, map, and so on.
(b3) Similarity question: the similarity questions of a specific attribute can be maintained one by one or imported in batches; there may be one or several of them.
(b4) Update time: the latest modification time of the similarity question.
(b5) Operation: edit and delete.
The similarity question set includes a plurality of first attributes and at least one similarity question corresponding to each first attribute. For example, the similarity question set includes:
a first attribute a, and similarity questions a1 and a2 corresponding to the first attribute a;
a first attribute b, and a similarity question b1 corresponding to the first attribute b; and so on.
In particular implementations, the set of comparison terms may be obtained from a pre-configured comparison term template.
The comparison word template can be used for generating the question analysis data set. The maintainable candidate words are mainly attribute words of concepts whose attribute type is "number"; the purpose of the template is to ensure that training data of the relevant comparison types are generated in the analysis data set.
The comparison word information may include comparison words and units.
The comparison words include, but are not limited to: minimum, smallest, fewest, shortest, cheapest, low, small, few, short, cheap, highest, largest, most, longest, most expensive, high, large, many, long, expensive, and so on.
The units include, but are not limited to: yuan, thousand, ten thousand, hundred million, day, month, year, %, percent, and so on.
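Comparison words and units combine with attributes to form comparison-type questions. A minimal sketch; the question template and function name are assumptions for illustration, not the patent's actual generation logic:

```python
# Hypothetical comparison word and unit lists (abbreviated).
comparison_words = ["minimum", "lowest", "fewest", "shortest", "cheapest",
                    "highest", "largest", "most", "longest"]
units = ["yuan", "thousand", "ten thousand", "day", "month", "year", "%"]

def make_comparison_question(attribute, word):
    """Assemble a comparison-type question from an attribute and a comparison word."""
    return f"Which service has the {word} {attribute}?"

print(make_comparison_question("annual interest rate", "lowest"))
# Which service has the lowest annual interest rate?
```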
Problem resolution samples include, but are not limited to: single slot samples, single attribute samples, single entity dual attribute samples, dual entity single attribute samples, dual entity dual attribute samples, composite attribute constraint samples, comparison type samples, and so forth. Illustratively, the generated problem resolution sample may be stored with reference to the following format:
[The storage format example is shown as two images in the original document.]
the above-described storage format is merely exemplary, and does not constitute a particular limitation on the present embodiment.
The graph attribute may be an attribute of the generated data in the graph, used to represent the intention under which the generated data is stored in the graph. Different types of problem analysis samples are generated in different ways; several of them are described below:
(c1) Single slot samples.
The single slot samples may be text samples corresponding to only one of a plurality of preset slot types, such as entities, attributes, relationships, constraints, and the like.
Each preset slot type corresponds to one of the foregoing slot identification results.
For example, the question "what is the price of the A product" is subjected to slot identification processing by the relation extraction model, and the obtained slot identification result includes: the slot identification result of "A product" is "entity", the slot identification result of "price" is "attribute", and so on.
The slot identification result "entity" corresponds to a preset slot type "entity". The single slot sample belonging to the preset slot type entity can be an A product, a B activity, a C commodity and a D commodity, and the like.
The slot identification result "attribute" corresponds to a preset slot type "attribute". The single slot samples belonging to the preset slot type "attribute" may be "price", "campaign expiration date", "interest rate", "discount", etc.
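The slot identification result for the example above can be pictured as span-to-type labels. The field names below are assumptions for illustration, not the patent's actual format:

```python
# Hypothetical representation of a slot identification result.
slot_result = {
    "question": "what is the price of the A product",
    "slots": [
        {"span": "A product", "type": "entity"},
        {"span": "price", "type": "attribute"},
    ],
}

def spans_of_type(result, slot_type):
    """Collect all spans labeled with a given preset slot type."""
    return [s["span"] for s in result["slots"] if s["type"] == slot_type]

print(spans_of_type(slot_result, "entity"))  # ['A product']
```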
Generating a problem resolution sample from at least one of the set of synonyms, the set of similarity questions, and the set of comparison questions, comprising: filtering the synonym set according to a first preset filtering condition to obtain a candidate word list; the candidate word list comprises candidate standard words and candidate synonyms corresponding to the candidate standard words; and performing first sampling processing on the candidate word list to obtain a single-slot sample.
The single-slot sample is generated to simulate a question in which the user's intention is unknown, and can be used in guided question-and-answer and continuous question-and-answer scenarios.
Single slot samples include, but are not limited to: single entity data, double entity data, and single constraint data.
It should be noted that a "dual entity sample" is also a single-slot sample: although it includes two entities, both of them correspond to one and the same preset slot type, namely "entity".
The first preset filtering condition may include a filtering condition "entity" and a filtering condition "json key". And filtering the synonym set based on the filtering condition 'entity', obtaining a candidate word list, and performing first sampling processing on the candidate word list to generate single-entity data and/or double-entity data. The sampling mode of the first sampling process may be random sampling or other sampling modes.
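The filter-then-sample step just described can be sketched as follows. The record layout, type tags, and function name are assumptions for illustration:

```python
import random

# Hypothetical synonym-set records: each candidate word carries a type tag
# matching the filtering conditions "entity" and "json".
synonym_records = [
    {"word": "A product", "type": "entity"},
    {"word": "B activity", "type": "entity"},
    {"word": "price", "type": "attribute"},
    {"word": "annual interest rate <= x%", "type": "json"},
]

def make_single_slot_samples(records, keep_type, k, seed=0):
    """Filter the synonym set by slot type, then randomly sample k candidates."""
    candidates = [r["word"] for r in records if r["type"] == keep_type]
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))

print(make_single_slot_samples(synonym_records, "entity", 1))
```

Sampling with condition "entity" yields single-entity or dual-entity data; sampling with condition "json" yields single-constraint data, as the paragraphs below describe.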
The single entity data may be, for example,
[The single entity data example is shown as an image in the original document.]
The "bill business" in the "text" field may be the generated data, that is, the text corresponding to the single entity data. "NULL" may be the graph attribute, indicating that the storage intention of the single entity data is empty. "0" may be the head entity start position, "4" the head entity end position, "business" the head entity category, and the "bill business" in the "value" field the head entity standard name.
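Reading the field descriptions together, the stored record might look like the following reconstruction. The key names are assumptions inferred from the description, since the actual figure is not available:

```python
# Hypothetical reconstruction of the stored single entity record.
# The start/end positions refer to the original Chinese text 票据业务
# (4 characters), which the translation renders as "bill business".
single_entity_data = {
    "text": "票据业务",         # generated text ("bill business")
    "attribute": "NULL",        # graph attribute: storage intention is empty
    "head_entity": {
        "start": 0,             # head entity start position
        "end": 4,               # head entity end position
        "type": "business",     # head entity category
        "value": "票据业务",     # head entity standard name
    },
}

# The span [start, end) covers the whole generated text in this sample.
assert len(single_entity_data["text"]) == single_entity_data["head_entity"]["end"]
```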
And filtering the synonym set based on a filtering condition json to obtain a candidate word list, and performing first sampling processing on the candidate word list to generate single constraint data.
The single constraint data may be, for example,
[The single constraint data example is shown as an image in the original document.]
The "X branch" in the "text" field may be the generated data, that is, the text corresponding to the single constraint data. "Transacting process" may be the graph attribute, indicating that the storage intention of the single constraint data is the transacting process. "0" may be the head entity start position, "4" the head entity end position, "constraint" the head entity category, and the "X branch" in the "value" field the head entity standard name.
(c2) Single attribute samples.
The single attribute sample may be a text sample that corresponds only to the preset slot type "attribute". It can be regarded as a special single-slot sample, and is described separately because it differs from the "single entity sample", "dual entity sample", and "single constraint sample" listed above.
Generating a problem resolution sample from at least one of the set of synonyms, the set of similarity questions, and the set of comparison questions, comprising: performing second sampling processing on a plurality of first attributes in the similarity set to obtain target first attributes; performing third sampling processing on at least one similar question corresponding to the first attribute of the target to obtain an initial similar question; if the initial similar question carries a mask, deleting the mask in the initial similar question to obtain a single attribute sample; if the initial similar question does not carry a mask, the initial similar question is determined to be a single attribute sample.
The single attribute sample is generated to simulate a question in which the subject the user asks about is unknown; it is used in guided question-and-answer and continuous question-and-answer scenarios.
The single attribute sample may be single attribute data.
A second sampling process is performed on the plurality of first attributes in the similarity question set to obtain a target first attribute, and a third sampling process is performed on the at least one similarity question corresponding to the target first attribute to obtain an initial similarity question. If the initial similarity question carries a mask, that is, it is a similarity question with [e], the [e] is deleted to obtain single attribute data; if the initial similarity question does not carry a mask, that is, it is a similarity question without [e], the similarity question itself is determined to be the single attribute data.
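The mask-deletion branch can be sketched with a small regular expression. The helper name is an assumption:

```python
import re

def to_single_attribute_sample(initial_question):
    """Drop the [e] mask (and any preceding extra space) if present."""
    return re.sub(r"\s*\[e\]", "", initial_question)

print(to_single_attribute_sample("how should this [e] be handled?"))
# how should this be handled?
print(to_single_attribute_sample("what is the transacting process?"))
# what is the transacting process?
```

A question without the mask passes through unchanged, matching the second branch of the rule above.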
The second sampling process and the third sampling process may be performed in a random sampling manner or in other sampling manners.
The single attribute data may be, for example,
[The single attribute data example is shown as an image in the original document.]
The "what is the transacting process" in the "text" field may be the generated data, that is, the text corresponding to the single attribute data. "Transacting process" may be the graph attribute, indicating that the storage intention of the single attribute data is the transacting process. The single attribute data has neither a head entity nor a tail entity.
(c3) Composite samples.
The composite sample may be a text sample corresponding to a plurality of preset slot types among the plurality of preset slot types of entities, attributes, relationships, constraints, and the like.
Generating a problem resolution sample from at least one of the set of synonyms, the set of similarity questions, and the set of comparison questions, comprising: filtering the synonym set according to a second preset filtering condition to obtain a candidate entity word list; the candidate entity word list comprises a plurality of candidate entity words; screening the similarity question set to obtain an intermediate similarity question set; the intermediate similarity question set comprises a plurality of first attributes, and at least one candidate similarity question with a mask corresponding to each first attribute; determining target candidate entity words and target candidate similar questions corresponding to the same attribute category according to the candidate word list and the intermediate similar question set; and replacing masks in the target candidate similar questions by using target candidate entity words to obtain a composite sample.
Composite samples include, but are not limited to: single entity single attribute samples, single entity dual attribute samples, dual entity single attribute samples, dual entity dual attribute samples, and so forth.
The single-entity single-attribute sample is generated to simulate a question in which a user queries one attribute of a certain thing, for use in the single-entity single-attribute scenario. The single-entity dual-attribute sample is generated to simulate a question in which a user queries a plurality of attributes of the same thing, for use in the single-entity multi-attribute scenario. The dual-entity single-attribute sample is generated to simulate a question in which a user simultaneously queries the same attribute of a plurality of things, for use in the multi-entity single-attribute scenario. The dual-entity dual-attribute sample is generated to simulate a question in which a user simultaneously queries a plurality of attributes of a plurality of things, for use in the multi-entity multi-attribute scenario.
The composite sample may be composite data. The composite data includes, but is not limited to: single entity single attribute data, single entity dual attribute data, dual entity single attribute data, and dual entity dual attribute data.
The second preset filtering condition may include the filtering condition "entity". The synonym set is filtered based on the filtering condition "entity" to obtain a candidate entity word list, which comprises a plurality of candidate entity words. The similarity question set is screened to obtain an intermediate similarity question set, which comprises a plurality of first attributes and at least one candidate similar question with a mask corresponding to each first attribute. According to the candidate word list and the intermediate similarity question set, the candidate entity words and candidate similar questions corresponding to each attribute category are determined; the candidate entity words and candidate similar questions corresponding to each attribute category are randomly sampled to obtain the target candidate entity word and target candidate similar question corresponding to the same attribute category; and the mask [ e ] in the target candidate similar question is replaced with the target candidate entity word to obtain the composite data.
Determining the target candidate entity word and target candidate similar question of the same attribute category, and replacing the mask [ e ] in the target candidate similar question with the target candidate entity word to obtain the composite data, may be implemented by randomly selecting one target candidate entity word and one target candidate similar question under the same attribute category and replacing the mask [ e ] in the target candidate similar question with the target candidate entity word, to obtain single-entity single-attribute data.
The single-entity single-attribute data may be, for example:

[table image in the original: single-entity single-attribute data example]
The "where can transaction service A be transacted" may be the generated data, i.e., the text corresponding to the single-entity single-attribute data. The "transacting channel" may be a map attribute, used to indicate that the storage intention of the single-entity single-attribute data is the transacting channel. "9" may be the head entity start position, "12" the head entity end position, "service" the head entity category, and "service A" the head entity standard name.
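A minimal sketch of this single-entity single-attribute generation follows. The entity words and template are invented for illustration, and the recorded character offsets use Python indexing rather than the patent's numbering convention:

```python
import random

def make_single_entity_single_attr(entity_words, masked_questions, rng=None):
    """Pair a randomly sampled candidate entity word with a randomly sampled
    masked similar question of the same attribute category, fill the [e]
    mask, and record the head entity span in the generated text."""
    rng = rng or random.Random(1)
    entity = rng.choice(entity_words)
    template = rng.choice(masked_questions)
    text = template.replace("[e]", entity)
    start = text.index(entity)             # head entity start position
    end = start + len(entity) - 1          # head entity end position
    return {"text": text, "head_entity": entity, "span": (start, end)}

sample = make_single_entity_single_attr(
    ["service A"], ["where can I transact [e]"])
```

With one candidate each, the result is deterministic: the mask slot is filled and the head entity span covers the inserted word.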
Determining the target candidate entity word and target candidate similar questions of the same attribute category, and replacing the mask [ e ] in the target candidate similar question with the target candidate entity word to obtain the composite data, may also be implemented by randomly selecting one target candidate entity word and two target candidate similar questions under the same attribute category, replacing the mask [ e ] in the first target candidate similar question with the target candidate entity word to obtain attribute question 1, deleting the [ e ] in the second target candidate similar question to obtain attribute question 2, and splicing attribute question 1 and attribute question 2 to obtain single-entity dual-attribute data.
The single-entity dual-attribute data may be, for example:

[table image in the original: single-entity dual-attribute data example]
the "please ask what data to submit for business X, and ask what requirements to ask for the application" may be the generated data, i.e. the text corresponding to the single-entity dual-attribute data. The "transacted material" and the "transacted condition" may be two different map attributes for indicating that the deposit intention of the single-entity dual-attribute data includes the transacted material and the transacted condition. "4" may be a head entity start position, "11" may be a head entity end position, "business" may be a head entity category, "exit letter" may be a first head entity standard name, "X" may be a second head entity standard name.
It should be noted that in the single-entity dual-attribute data, the first head entity and the second head entity are the same entity, so the first head entity start position is the same as the second head entity start position, and the first head entity end position is the same as the second head entity end position.
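The fill-then-delete-then-splice step might look like the following sketch. The templates and the connector string are illustrative assumptions:

```python
def make_single_entity_dual_attr(entity, template_a, template_b, connector=", and "):
    """Fill [e] in the first similar question, delete [e] from the second,
    then splice the two attribute questions into one dual-attribute text."""
    q1 = template_a.replace("[e]", entity)                          # attribute question 1
    q2 = template_b.replace("[e]", "").replace("  ", " ").strip()   # attribute question 2
    return q1 + connector + q2

text = make_single_entity_dual_attr(
    "business X",
    "what data should be submitted for [e]",
    "what requirements does [e] the application have")
```

Because both templates belong to the same attribute category set, the spliced text queries two attributes of the single head entity, matching the single-entity multi-attribute scenario.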
Determining the target candidate entity words and target candidate similar question of the same attribute category, and replacing the mask [ e ] in the target candidate similar question with the target candidate entity word to obtain the composite data, may also be implemented by randomly selecting two target candidate entity words and one target candidate similar question under the same attribute category, randomly selecting one of a plurality of preset splicing modes as the target splicing mode, splicing the two target candidate entity words in the target splicing mode to obtain a target spliced word, and replacing the mask [ e ] in the target candidate similar question with the target spliced word to obtain dual-entity single-attribute data.
The splicing modes are preset; for example, the two words are joined together by one of several preset connectors each meaning "and".
The dual-entity single-attribute data may be, for example:

[table image in the original: dual-entity single-attribute data example]
the "how much money the credit line of the A service and the B service has" can be the generated data, namely the text corresponding to the double-entity single-attribute data. The "credit" may be a map attribute, which is used to indicate that the deposit intention of the dual entity single attribute data is the credit. "0" may be a first head entity start position, "6" may be a first head entity end position, "7" may be a second head entity start position, "12" may be a second head entity end position, "traffic" may be a head entity category, "a traffic" may be a first head entity standard name, and "B traffic" may be a second head entity standard name.
It should be noted that in the dual-entity single-attribute data, the first head entity and the second head entity are two different entities, but the map attributes of the two entities are the same. The dual-entity dual-attribute sample may be dual-entity dual-attribute data.
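A sketch of splicing two entity words into the single mask slot; the connector list stands in for the preset splicing modes and is an assumption:

```python
import random

def make_dual_entity_single_attr(entity_a, entity_b, masked_question, rng=None):
    """Splice two target candidate entity words with a randomly chosen
    splicing mode to get the target spliced word, then replace the [e]
    mask so both entities share one attribute question."""
    rng = rng or random.Random(2)
    connector = rng.choice([" and ", " as well as "])  # preset splicing modes (assumed)
    spliced = entity_a + connector + entity_b          # target spliced word
    return masked_question.replace("[e]", spliced)

text = make_dual_entity_single_attr(
    "service A", "service B", "how much is the credit line of [e]")
```

Whatever connector is drawn, the generated text carries two different head entities under one map attribute.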
In a specific implementation, a plurality of pieces of data, for example single-entity single-attribute data 1 and single-entity single-attribute data 2, can be randomly extracted from the generated plurality of single-entity single-attribute data, and spliced through a randomly selected connector to form the dual-entity dual-attribute data.
The dual-entity dual-attribute data may be, for example:

[table image in the original: dual-entity dual-attribute data example]
the "limit usage mode of the Y product" specifically includes what kind of card can be used by the Z service in general ", and the" limit usage mode of the Y product "may be generated data, that is, text corresponding to the dual-entity dual-attribute data. The "credit usage mode" and the "category" may be two different map attributes, which are used to indicate that the storage intention of the dual-entity dual-attribute data includes the credit usage mode and the category. "0" may be a first head entity start position, "3" may be a first head entity end position, "16" may be a second head entity start position, "24" may be a second head entity end position, "business" may be a head entity category, "Y product" may be a first head entity standard name, and "Z business" may be a second head entity standard name. In addition, the composite attribute constraint sample may be composite attribute constraint data.
The composite attribute constraint sample is generated to simulate a question in which a user queries a certain attribute of a certain thing under a certain limiting condition, for use in the composite attribute constraint scenario and the continuous query scenario.
The synonym set is filtered based on the filtering condition "json key" to obtain json attribute synonyms, and filtered based on the filtering condition "entity" to obtain entity synonyms.

The similarity question set is screened to obtain the similar questions with a mask, i.e., the similar questions with [ e ].

According to the json attribute synonyms, the entity synonyms, and the similar questions with [ e ], the entity synonyms, json attribute synonyms, and similar questions corresponding to each attribute category can be determined.

The entity synonyms, json attribute synonyms, and similar questions corresponding to each attribute category are randomly sampled to obtain the target entity synonym, target json attribute synonym, and target similar question. The [ e ] in the target similar question is replaced with the target entity synonym to obtain an intermediate similar question; a splicing mode is randomly determined, and the target json attribute synonym and the intermediate similar question are spliced according to the determined splicing mode to obtain the composite attribute constraint data.
The composite attribute constraint data may be, for example:

[table image in the original: composite attribute constraint data example]
the "can say that the business is transacted with the second flow" can be the generated data, namely the text corresponding to the constraint data of the composite attribute. The "transacting process" may be a map attribute for indicating that the deposit intention of the composite attribute constraint data is a transacting process. "3" may be a head entity start position, "6" may be a head entity end position, "11" may be a tail entity start position, "15" may be a tail entity end position, "service" may be a head entity category, "constraint" may be a tail entity category, "service b" may be a head entity standard name, "cell phone bank" may be a tail entity standard name.
(c4) Comparison type samples.
Generating a problem resolution sample from at least one of the set of synonyms, the set of similarity questions, and the set of comparison questions, comprising: filtering the synonym set according to a third preset filtering condition to obtain attribute synonyms and entity synonyms; generating a random number; and performing splicing processing according to the comparison word information, the attribute synonyms, the entity synonyms, the random numbers and the preset descriptors to obtain a comparison type sample.
The comparison type sample is generated to simulate questions requiring numerical comparison, such as maximum, minimum, greater than, or less than, on a certain attribute of a certain thing queried by the user, for use in the comparison sentence scenario.
The comparison type sample may be comparison type data.
The synonym set is filtered according to the third preset filtering condition to obtain the attribute synonyms and entity synonyms.

A random number is generated.

Splicing processing is performed according to the comparison word information in the comparison word set, the attribute synonyms, the entity synonyms, the random number, and the preset descriptors to obtain the comparison type data.
The comparison type data may be, for example:

[table image in the original: comparison type data example]
wherein, the "how many years the M product can be longest" can be the generated data, namely the text corresponding to the comparison type data. The term may be a map attribute for indicating that the storage intention of the comparison type data is a term. "0" may be a head entity start position, "4" may be a head entity end position, "business" may be a head entity category, "asset business" may be a head entity standard name. The descriptors are preset, for example, what all have the yam, what are speaking, etc.
In a particular implementation, the comparison word information, e.g., attributes, comparison words, and units, may be synchronized from the comparison word templates. Attribute synonyms whose type is numeric, such as amount and term, are taken from the synonym template, and entities under non-leaf nodes of the same type are acquired from the entity synonyms. When generating maximum, minimum, greater-than, or less-than type data, the attribute synonym, the comparison word, the generated random number, and the unit are first spliced; then a position is randomly selected, and the spliced character string is spliced with the entity synonym and a preset descriptor.
Denoting the three parts above as A, B, and C, the splicing may take any order: ABC, ACB, BAC, BCA, CBA, CAB, and so on.
In a specific implementation, the numerical single-constraint data can also be composed of a comparison word, a random number, and a unit, for example, "more than 4.35 percent" or "less than 5 years".
In a specific implementation, entities that are leaf nodes under the same type in the knowledge graph and their synonyms can be selected, one splicing mode is randomly selected to splice the two entity synonyms, and the attribute word and the comparison word are then spliced through a connecting word to form an attribute question. For example: which of A and B has the higher interest rate? Which of A and C has the longer term?
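The splice-then-permute generation above can be sketched as follows; the concrete words, the number range, and joining with spaces are assumptions:

```python
import itertools
import random

def make_comparison_sample(attr_syn, comp_word, unit, entity_syn, descriptor, rng=None):
    """First splice attribute synonym + comparison word + random number +
    unit into one string (A), then combine it with the entity synonym (B)
    and a preset descriptor (C) in a randomly selected order
    (ABC, ACB, BAC, BCA, CBA, CAB, ...)."""
    rng = rng or random.Random(5)
    number = rng.randint(1, 30)                  # the generated random number
    a = f"{attr_syn} {comp_word} {number} {unit}"
    parts = rng.choice(list(itertools.permutations([a, entity_syn, descriptor])))
    return " ".join(parts)

text = make_comparison_sample("term", "longer than", "years",
                              "product M", "which ones are there")
```

Varying the permutation gives the model comparison questions with the numeric constraint appearing at different positions in the sentence.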
Step S104, inputting a problem text sample into an initial problem analysis model for iterative training to obtain a problem analysis model; the initial problem analysis model comprises a first coding layer and a conversion layer; the first coding layer is used for carrying out coding processing according to the problem text sample to obtain a corresponding first sentence pattern vector; the conversion layer is used for generating a preset number of initial intention vectors under the condition of receiving the first sentence pattern vector, filling each initial intention vector according to the first sentence pattern vector to obtain a corresponding target intention vector, and converting the target intention vector into a corresponding text segment; the text segment is used for inquiring answers corresponding to the text sample of the question in the question-answering system.
It should be noted that the structure of the initial problem analysis model is completely consistent with that of the problem analysis model obtained after iterative training; only the model parameters obtained through training differ. The problem analysis model obtained through iterative training therefore also has a first coding layer and a conversion layer.
The first encoding layer may be a Bert encoder or another component that can convert text into a sentence pattern vector. The conversion layer may be composed of a space conversion layer and a plurality of binary-classification fully connected layers.
The output of the first encoding layer may be an input of the conversion layer.
The initial intention vector may be a vector with a fixed number of elements, each initially unknown. Each initial intention vector may represent a corresponding intention space. After the initial intention vector is filled, some elements become known, and the remaining unknown elements can be completed with specified values, yielding the target intention vector corresponding to the initial intention vector.
A preset number of initial intention vectors are generated upon receiving the first sentence pattern vector, and the numbers of elements included in the preset number of initial intention vectors are the same.
The number of elements included in each initial intent vector may be a fixed value that is pre-configured. Specifically, under the condition that the first sentence pattern vector is received, generating a preset number of initial intention vectors; the number of elements included in the initial intent vector is a preset number.
The number of elements included in each initial intent vector may also be determined based on the number of characters corresponding to the first sentence pattern vector. Specifically, under the condition that a first sentence pattern vector is received, generating a preset number of initial intention vectors according to the first sentence pattern vector; the number of elements included in the initial intent vector is determined based on the number of characters corresponding to the first sentence pattern vector.
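Both sizing strategies can be sketched in a few lines of pure Python, with `None` marking an unknown element; the extra leading slot for the intention classification result is an assumed layout, not specified by the text:

```python
def make_initial_intent_vectors(num_intents, fixed_length=None, num_chars=None):
    """Generate a preset number of initial intention vectors whose element
    count is either a pre-configured fixed value or determined from the
    number of characters behind the first sentence pattern vector."""
    length = fixed_length if fixed_length is not None else num_chars + 1
    # +1: one leading slot reserved for the intention classification result
    # (assumed layout for this sketch)
    return [[None] * length for _ in range(num_intents)]

# Sizing from the character count of the sentence pattern vector:
vectors = make_initial_intent_vectors(4, num_chars=10)
```

Each of the four vectors represents one intention space, and every element stays unknown until the filling processing described below.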
In a specific implementation, the first sentence pattern vector includes a semantic feature sub-vector and a plurality of character sub-vectors; filling each initial intention vector according to the first sentence pattern vector to obtain a corresponding target intention vector, wherein the specific implementation manner comprises the following steps: performing intention classification processing on the semantic feature sub-vectors to obtain corresponding intention classification results; filling each initial intention vector according to the intention classification result to obtain a corresponding intermediate intention vector; and filling each intermediate intention vector according to each character sub-vector to obtain a corresponding target intention vector.
Illustratively, the semantic feature sub-vector may be represented by [ cls ]. The [ cls ] is not a vector representing a particular character in the text; it is a semantic feature vector representing the whole text, and can be used directly for classification once extracted.
The question text sample may include a plurality of characters, and after the question text sample is encoded by the first encoding layer to obtain a corresponding first sentence pattern vector, the first sentence pattern vector may include a character sub-vector corresponding to each character in the question text sample.
The intent classification result may be used to represent, for each initial intent vector, whether the first sentence pattern vector has an intent corresponding to the initial intent vector. For example, for the initial intent vector 1, if the intent classification result is the first classification result, it is explained that the first sentence pattern vector does not have the intent corresponding to the initial intent vector 1; if the intention classification result is the second classification result, the first sentence pattern vector is indicated to have the intention corresponding to the initial intention vector 1.
The intention classification processing is performed on the semantic feature sub-vector to obtain the corresponding intention classification results. In a specific implementation, the intention classification processing can be performed on [ cls ] through the binary-classification fully connected layer (linear) to obtain the intention classification results. That is, the input of the linear layer is [ cls ], and its output includes a plurality of intention classification results, each corresponding to one initial intention vector; each intention classification result may be one of the first classification result and the second classification result.
And filling each initial intention vector according to the intention classification result to obtain a corresponding intermediate intention vector, wherein the numerical value used for representing the intention classification result of the initial intention vector can be automatically filled in each initial intention vector to obtain the intermediate intention vector. For example, for the initial intention vector 1, if the intention classification result is the first classification result, a numerical value "0" for representing the first classification result is filled in the designated position of the initial intention vector 1; if the intention classification result is the second classification result, a numerical value "1" for representing the second classification result is filled in the designated position of the initial intention vector 1.
Illustratively, the value "0" for representing the first classification result is filled in the specified position of the initial intention vector 1, and an unknown element of the specified position may be replaced with "0".
It should be noted that if the intention classification result corresponding to an initial intention vector is the first classification result, the first sentence pattern vector does not have the corresponding intention; the character recognition results obtained by entity recognition and constraint recognition in the subsequent steps are therefore irrelevant to this initial intention vector. After such an initial intention vector is filled with "0" to obtain an intermediate intention vector, that intermediate intention vector can be directly determined as the corresponding target intention vector.
In a specific implementation manner, according to each character sub-vector, filling processing is performed on each intermediate intention vector, so as to obtain a corresponding target intention vector, which includes: according to each character sub-vector, entity recognition processing and constraint recognition processing are carried out to obtain a corresponding character recognition result; and filling each intermediate intention vector according to the character recognition result to obtain a corresponding target intention vector.
In a specific implementation, entity recognition processing and constraint recognition processing can be performed through the binary-classification fully connected layer (linear) according to each character sub-vector to obtain the corresponding character recognition result, which can be used to represent the entity type and the constraint type corresponding to the character sub-vector. The entity types comprise non-entity and a plurality of preset entity types, and the constraint types comprise non-constraint and a plurality of preset constraint types.
By generating the character recognition result, whether each character sub-vector belongs to an entity or constraint can be determined, and entity extraction or constraint extraction is further realized.
In a specific implementation manner, according to a character recognition result, filling processing is performed on each intermediate intention vector to obtain a corresponding target intention vector, and the specific implementation manner includes: if the character recognition result is used for representing that the character sub-vector belongs to the entity class or the constraint class, the corresponding intermediate intention vector is determined and filled according to the character recognition result.
If the character recognition result is used for representing that the character sub-vector belongs to the entity category, entity extraction processing can be performed according to the character recognition result, intention prediction processing is performed on the extracted entity, an intermediate intention vector corresponding to the entity is determined, and the entity is automatically filled in the corresponding intermediate intention vector.
The fill position of the entity in the intermediate intent vector may be determined based on the corresponding character position of the entity in the question text sample.
If the character recognition result is used for representing that the character sub-vector belongs to the constraint category, constraint extraction processing can be performed according to the character recognition result, intention prediction processing is performed on the extracted constraint, an intermediate intention vector corresponding to the constraint is determined, and the constraint is automatically filled in the corresponding intermediate intention vector.
The fill position of the constraint in the intermediate intent vector may be determined based on the corresponding character position of the constraint in the question text sample.
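Putting the two filling stages together, a toy sketch follows. The vector layout — the classification flag at position 0 and entity or constraint labels at slots matching their character offsets — is an assumption made for illustration:

```python
def fill_intent_vector(vector, has_intent, spans=(), pad=0):
    """Stage 1: write 1/0 at the designated position for the second/first
    intention classification result. Stage 2: for an active intention,
    write each extracted entity or constraint label at the positions
    matching its character offsets in the question text. Remaining unknown
    elements are completed with a specified value."""
    vector[0] = 1 if has_intent else 0          # intention classification result
    if has_intent:
        for start, end, label in spans:         # character positions in the text
            for i in range(start + 1, end + 2): # +1 offset past the flag slot
                vector[i] = label
    # complete remaining unknown elements with the specified value
    return [pad if v is None else v for v in vector]

# An active intention whose entity occupies characters 0..2 of the text:
vec = fill_intent_vector([None] * 8, True, spans=[(0, 2, "entity")])
```

An inactive intention would instead get the flag "0" and all remaining slots padded, which is exactly the case where the intermediate intention vector is already the target intention vector.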
In a specific implementation, the question text sample includes at least one of an entity element, an attribute element, a relationship element, a constraint element; the preset number of initial intention vectors comprises a first intention vector and a plurality of second intention vectors; the first intent vector corresponds to an unintentional diagram, and each of the second intent vectors corresponds to an attribute element or a relationship element.
The question text sample includes at least one of an entity element, an attribute element, a relationship element, and a constraint element. The slot identification result of the entity element may be "entity", the slot identification result of the attribute element may be "attribute", the slot identification result of the relationship element may be "relationship", the slot identification result of the constraint element may be "constraint", and so on.
The question text sample is illustrated by way of a few examples below.
The problem text sample may be converted to different intent spaces by the problem resolution model, each of which corresponds to an initial intent vector.
If the question text sample includes an attribute element, the attribute element may be used to determine a target initial intent vector corresponding to the attribute element from a preset number of initial intent vectors.
If the question text sample includes a relationship element, the relationship element may be used to determine a target initial intent vector corresponding to the relationship element from a preset number of initial intent vectors.
Filling each initial intention vector according to the first sentence pattern vector, namely labeling entities and constraints under the initial intention vector corresponding to each intention space. Labeling the entity and the constraint in the target initial intention vector corresponding to the attribute element can enable the generated target intention vector to reflect at least one of the corresponding relation between the attribute and the entity and the corresponding relation between the attribute and the constraint. Labeling the entity and the constraint in the target initial intention vector corresponding to the relation element can enable the generated target intention vector to reflect at least one of the corresponding relation between the relation and the entity and the corresponding relation between the relation and the constraint.
In the processing process of slot identification, a certain corresponding relationship may exist among a plurality of slot elements such as entity elements, attribute elements, relationship elements, constraint elements and the like in the problem text sample.
For example, the question text samples are: what is the price of the A product, what is the time to shelf the B product? Wherein "A product" and "B product" are entities, "price" and "time to shelf" are attributes, where "A product" corresponds to "price" and "B product" corresponds to "time to shelf". If the slot recognition processing is performed for each element, misunderstanding may be generated on the user intention corresponding to the question text sample, so that the prices of both the product a and the product B and the time to put on shelf of both the product a and the product B are fed back to the user.
In this embodiment, under the condition that each initial intent vector corresponds to one attribute element or one relationship element, by labeling entities and constraints under the initial intent vector corresponding to each intent space, slot recognition can be performed at the same time in different intent spaces, so that not only can each slot element in the question text sample be recognized, but also the corresponding relationship between the attribute and the entities, the corresponding relationship between the attribute and the constraints, the corresponding relationship between the relationship and the entities, and the corresponding relationship between the relationship and the constraints can be recognized and obtained. Therefore, model training for identifying the slot positions by adopting corresponding models for each slot position element is not needed, the number of models required by the slot position identification processing flow is reduced, the time delay in the slot position identification processing flow is reduced, and labeling of entities and constraints is simultaneously executed in a plurality of intention spaces, so that the slot position identification result of each slot position element can be obtained by analyzing a problem text sample at one time, and the slot position identification efficiency is improved.
The question text sample may be composed of entity elements and attribute elements, e.g., the question text sample is: how does the cost of product a calculate? Wherein the entity element is "product A", the attribute element is "how cost is calculated", and the attribute element corresponds to the attribute "price".
The question text sample may be composed of entity elements and constraint elements, e.g., the question text sample is: what are restaurant activities available to participate prior to the current day? Wherein, the entity element is "restaurant activity", and the constraint element is "before this sunday".
The question text sample may be made up of entity elements and relationship elements, e.g., what are the business channels of the C-campaign? Wherein, the entity element is "C activity", and the relation element is "transacting channel".
The question text sample may be made up of entity elements, e.g., D services. Wherein the entity element is "D service". In the case where the question text sample includes only solid elements, the question is typically incomplete and a back-question is required to guide the user of the question to supplement the question. In this case, the intention classification result of the first intention vector may be a second classification result, and the intention classification result of each second intention vector may be the first classification result, i.e., the first sentence pattern vector does not have any intention.
The structure of the problem analysis model and the data processing method inside the problem analysis model will be exemplarily described with reference to fig. 2. Fig. 2 is a data flow chart of a problem analysis model in a model training method of a question answering system according to an embodiment of the present application.
As shown in fig. 2, the question text sample 202 is, approximately, "how long does recovery after liposuction take? how is the cost of face slimming calculated?" (two spliced questions). [ sep ] is used to represent the separator between the two question text samples. [ cls ] denotes classification and can be understood as serving the downstream classification task: for a text classification task, the BERT model inserts a [ cls ] symbol in front of the text and uses the output vector corresponding to this symbol as the semantic representation of the entire text for classification.
The first encoding layer may be the Bert encoder 204, and the conversion layer may include the space conversion layer 208, the binary-classification fully connected layer 210, and the binary-classification fully connected layer 212.
The Bert encoder 204 is configured to convert the question text sample 202 into a corresponding sentence pattern vector 206 and send it to the spatial conversion layer 208. The spatial conversion layer 208 is configured to generate a preset number of initial intention vectors upon receiving the sentence pattern vector 206, where the initial intention vector 1 corresponds to the intention space Null 220, the initial intention vector 2 corresponds to the intention space "introduction" 218, the initial intention vector 3 corresponds to the intention space "price" 216, and the initial intention vector 4 corresponds to the intention space "recovery period" 214.
The binary classification fully-connected layer 210 may be configured to perform intention classification processing according to the semantic feature sub-vector H[cls] 2062, obtain the intention classification result corresponding to each intention space, and fill it in: the intention space "recovery period" 214 corresponds to the second classification result, so "1" is filled in the first position; the intention space "price" 216 corresponds to the second classification result, so "1" is filled in the first position; the intention space "introduction" 218 corresponds to the first classification result, so "0" is filled in the first position; the intention space Null 220 corresponds to the first classification result, so "0" is filled in the first position.
The binary classification fully-connected layer 212 may be configured to perform entity recognition processing and constraint recognition processing according to each character sub-vector 2064 to obtain corresponding character recognition results, and to perform entity extraction processing and constraint extraction processing based on the character recognition results to obtain the entities "liposuction" and "face slimming" corresponding to the first sentence pattern vector. Intention prediction processing is performed on the entity "liposuction", the intention space corresponding to it is determined to be the intention space "recovery period" 214 and is filled, obtaining the target intention vector corresponding to the intention space "recovery period". Intention prediction processing is performed on the entity "face slimming", the intention space corresponding to it is determined to be the intention space "price" 216 and is filled, obtaining the target intention vector corresponding to the intention space "price".
Corresponding text segments may then be generated from the target intention vector corresponding to the intention space "recovery period" and the target intention vector corresponding to the intention space "price". For the other intention spaces, whose intention classification result is the first classification result, the corresponding target intention vectors may be discarded.
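The filling flow of fig. 2 can be sketched in plain Python. This is a minimal illustration under assumptions: the function name, intention-space names, and character-tag scheme (an "ENT-" prefix tying a character to an intention space, "O" for other characters) are stand-ins for the trained classifiers, not the patent's actual implementation.

```python
def fill_intent_spaces(intent_results, char_tags):
    """intent_results: {intent_space: 0 or 1} from the [cls]-level binary
    classifier; char_tags: (character, tag) pairs from the character-level
    classifier, where a tag such as 'ENT-price' marks an entity/constraint
    character belonging to the 'price' intention space, and 'O' marks the rest.
    Returns a text fragment per surviving intention space."""
    fragments = {}
    for space, keep in intent_results.items():
        if keep != 1:  # first classification result: discard this vector
            continue
        fragments[space] = "".join(
            ch for ch, tag in char_tags if tag.split("-", 1)[-1] == space)
    return fragments

# Illustrative run mirroring fig. 2: two intention spaces survive the [cls]
# classification, and each collects its own tagged characters.
intent_results = {"recovery_period": 1, "price": 1, "introduction": 0, "null": 0}
char_tags = ([(c, "ENT-recovery_period") for c in "liposuction"] +
             [(c, "O") for c in " cost? "] +
             [(c, "ENT-price") for c in "face-slimming"])
print(fill_intent_spaces(intent_results, char_tags))
# → {'recovery_period': 'liposuction', 'price': 'face-slimming'}
```

Because the [cls] decision and the character tags are independent outputs, all surviving intention spaces are filled in a single pass over the characters, matching the one-shot parsing described above.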
In a specific implementation, the question-answering system further includes an initial entity link model; the model training method of the question-answering system further comprises the following steps: obtaining an entity link sample; inputting the entity link sample into an initial entity link model for iterative training to obtain an entity link model; the entity link model comprises a second coding layer and a prediction layer; the second coding layer is used for carrying out coding processing on the entity link samples to obtain corresponding second sentence pattern vectors; the prediction layer is used for performing prediction processing according to the second sentence pattern vector and determining a corresponding target entity.
The entity link sample may specifically be obtained in the following way: obtaining a standard entity library, where the standard entity library comprises a plurality of standard entities; and generating the entity link sample according to the problem analysis sample and the standard entity library.
The initial entity link model and the entity link model obtained after iterative training are completely consistent in structure; only the model parameters that participate in training differ. The entity link model obtained through iterative training therefore also has a second coding layer and a prediction layer.
The second encoding layer may be a Bert encoder, or another component that can convert text into sentence pattern vectors. The prediction layer may be a binary classification fully-connected (linear) layer, or another component that can map semantic feature sub-vectors to corresponding entities.
In a specific implementation, generating the entity link sample according to the problem analysis sample and the standard entity library includes: determining a target analysis sample carrying a non-standard entity according to the problem analysis sample; calculating the similarity between the non-standard entity and each standard entity in the standard entity library and sorting the results; determining a second preset number of target standard entities corresponding to the non-standard entity according to the sorting result; and constructing positive and negative samples corresponding to the non-standard entity according to the non-standard entity and the second preset number of target standard entities, and determining the positive and negative samples as the entity link sample.
Determining the second preset number of target standard entities corresponding to the non-standard entity according to the sorting result may specifically be: determining the second preset number of standard entities with the greatest similarity as the target standard entities.
For example, for the non-standard entity "unit notification deposit", the top-5 entities most similar to it in the standard entity library are recalled according to similarity; they are "unit notification deposit", "unit regular deposit", "unit live deposit", "unit agreement deposit", and "unit regular passbook", respectively.
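A top-k recall of this kind can be sketched as follows. The `difflib` string-similarity ratio is only a stand-in for whatever similarity measure (edit distance, embedding cosine, etc.) is actually used; the entity names mirror the example above.

```python
from difflib import SequenceMatcher

def recall_top_k(mention, standard_entities, k=5):
    """Rank standard entities by string similarity to the non-standard
    mention and return the k most similar ones."""
    return sorted(standard_entities,
                  key=lambda e: SequenceMatcher(None, mention, e).ratio(),
                  reverse=True)[:k]

library = ["unit notification deposit", "unit regular deposit",
           "unit live deposit", "unit agreement deposit",
           "unit regular passbook", "personal savings card"]
top5 = recall_top_k("unit notification deposit", library)
print(top5[0])  # → unit notification deposit (the exact match ranks first)
```

The recalled list, ordered by similarity, then feeds directly into the positive/negative sample construction described next.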
Constructing the positive and negative samples corresponding to the non-standard entity according to the non-standard entity and the second preset number of target standard entities may specifically be: constructing a positive sample from the non-standard entity and the target standard entity with the greatest similarity, and constructing negative samples from the non-standard entity and the target standard entities other than the one with the greatest similarity.
For example, the constructed positive sample is {"text": "What time is the transaction time of the unit notification deposit? [sep] unit notification deposit", "label": 1}.
The constructed negative samples include:
{"text": "What time is the transaction time of the unit notification deposit? [sep] unit regular deposit", "label": 0};
{"text": "What time is the transaction time of the unit notification deposit? [sep] unit live deposit", "label": 0};
{"text": "What time is the transaction time of the unit notification deposit? [sep] unit agreement deposit", "label": 0};
{"text": "What time is the transaction time of the unit notification deposit? [sep] unit regular passbook", "label": 0}.
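Assuming the recalled candidates arrive already ordered by similarity (highest first), this sample construction can be sketched as:

```python
def build_link_samples(question, recalled):
    """recalled: target standard entities ordered by similarity, highest
    first. The top-1 entity yields the positive sample; the remaining
    entities yield negative samples."""
    samples = [{"text": f"{question} [sep] {recalled[0]}", "label": 1}]
    samples += [{"text": f"{question} [sep] {e}", "label": 0}
                for e in recalled[1:]]
    return samples

samples = build_link_samples(
    "What time is the transaction time of the unit notification deposit?",
    ["unit notification deposit", "unit regular deposit", "unit live deposit",
     "unit agreement deposit", "unit regular passbook"])
```

Each sample splices the question and one candidate with the [sep] separator, which is exactly the sentence-pair input format the Bert-based entity link model is trained on.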
The prediction layer is configured to perform prediction processing according to the second sentence pattern vector and determine the corresponding target entity; specifically, the prediction layer may perform prediction processing according to the semantic feature sub-vector in the second sentence pattern vector to determine the corresponding target entity.
In the training stage, the Bert classification model may be trained using the generated entity link data set. In the prediction stage, after entity recognition is completed by the problem analysis model, an entity fragment is obtained, and entities with higher similarity are recalled from the knowledge graph; similarly to the data construction, the question containing the fragment is spliced with each recalled entity. Prediction with the entity link model then links the fragment to a unique entity in the knowledge graph.
The structure of the entity link model and the flow of data processing within it are exemplarily described below with reference to fig. 3. Fig. 3 is a data flow chart of the entity link model in a model training method of a question-answering system according to an embodiment of the present application.
As shown in fig. 3, the entity link sample 302 is "How much does $radio frequency face slimming$ cost [sep] radio frequency fat-dissolving face slimming", and the Bert encoder 304 encodes the entity link sample 302 to obtain the corresponding second sentence pattern vector, which includes the semantic feature sub-vector H[cls] 306. The semantic feature sub-vector H[cls] 306 is predicted by the binary classification fully-connected layer 308 to determine the corresponding entity prediction result 310.
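The prediction stage can be sketched minimally as follows: each recalled candidate is spliced to the recognized fragment, every pair is scored, and the highest-scoring candidate becomes the unique linked entity. The toy character-overlap scorer here is only a stand-in for the trained Bert encoder plus binary classification fully-connected layer; the candidate names are illustrative.

```python
def link_entity(fragment, candidates, score_pair):
    """Splice the fragment with each candidate, score every pair, and
    return the candidate of the highest-scoring pair."""
    pairs = [f"{fragment} [sep] {c}" for c in candidates]
    scores = [score_pair(p) for p in pairs]
    return candidates[scores.index(max(scores))]

def toy_score(pair):
    # Jaccard overlap of character sets; a stand-in for the model's score.
    q, _, cand = pair.partition(" [sep] ")
    return len(set(q) & set(cand)) / len(set(q) | set(cand))

best = link_entity("radio frequency face slimming",
                   ["radio frequency fat-dissolving face slimming",
                    "thread-lift face slimming"],
                   toy_score)
print(best)  # → radio frequency fat-dissolving face slimming
```

Passing the scorer in as a function keeps the linking logic independent of the concrete model, so the stub can later be swapped for the trained classifier's probability for label 1.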
In a specific implementation, the question answering system further includes an initial question classification model; the model training method of the question-answering system further comprises the following steps: acquiring a problem classification sample; and inputting the problem classification sample into an initial problem classification model for iterative training to obtain a problem classification model.
The problem classification sample may be obtained as follows:
acquiring a problem domain set; the problem domain set comprises an intra-domain problem text and an extra-domain problem text; and generating a problem classification sample according to the problem analysis sample and the problem domain set.
In particular implementations, the set of problem domains may be obtained from a pre-configured problem domain template.
The question domain template is used to generate the OOD model data set and to maintain questions that the question-answering system does not answer, chitchat, and the like. The backend may sample part of the question analysis data set to form binary positive and negative samples, ensuring reasonable reuse of the question-answer data. Since generated data has limitations, the OOD template provides editing and importing of the two kinds of questions, in-domain and out-of-domain, to ensure classification accuracy and data maintainability.
The set of problem domains includes an inside-domain problem text and an outside-domain problem text.
The intra-domain question may be a question located within the question domain and the text describing the intra-domain question may be intra-domain question text. The out-of-domain question may be a question that is outside the question domain and the text describing the out-of-domain question may be out-of-domain question text.
Intra-domain question text, such as "what day is the deadline of a activity? "; out-of-domain question text, e.g., "do i want to go out of the way today? "
The problem classification samples may constitute an OOD data set, i.e., the training data required to construct the OOD model. The problem classification samples may include positive and negative samples: a positive sample characterizes an out-of-domain question and is identified with the label "1", and a negative sample characterizes an in-domain question and is identified with the label "0".
By way of example:
the positive sample may be {"text": "Hello", "label": 1};
the negative sample may be {"text": "What time is the transaction time of the unit notification deposit?", "label": 0}, or {"text": "How long is the term of the parking space loan?", "label": 0}.
In specific implementation, generating a problem classification sample according to the problem analysis sample and the problem domain set, including: performing fourth sampling processing on the problem analysis sample to obtain a first intra-domain sample; generating a corresponding second intra-domain sample according to the intra-domain problem text; generating a corresponding outside-domain sample according to the outside-domain problem text; and generating a problem classification sample according to the first intra-domain sample, the second intra-domain sample and the outer-domain sample.
The fourth sampling process may be performed by random sampling or by other sampling methods.
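The steps above can be sketched as follows. The sampled-count parameter, the fixed seed, and the example texts are illustrative assumptions; random sampling models the fourth sampling process, and the labels follow the convention already stated (0 = in-domain, 1 = out-of-domain).

```python
import random

def build_ood_dataset(parse_samples, in_domain_texts, out_domain_texts,
                      sample_k=2, seed=0):
    """Assemble the problem classification (OOD) data set from the three
    sources named above: sampled parse samples and template in-domain texts
    become negatives; template out-of-domain texts become positives."""
    rng = random.Random(seed)
    first_in_domain = rng.sample(parse_samples, sample_k)  # fourth sampling
    data = [{"text": t, "label": 0}
            for t in first_in_domain + list(in_domain_texts)]
    data += [{"text": t, "label": 1} for t in out_domain_texts]
    return data

ds = build_ood_dataset(
    ["What time is the transaction time of the unit notification deposit?",
     "How long is the term of the parking space loan?",
     "What are the transaction channels of the C activity?"],
    ["What day is the deadline of the A activity?"],
    ["Hello", "I want to order takeaway today"])
print(len(ds), sum(d["label"] for d in ds))  # → 5 2
```

Keeping the three sources as separate arguments mirrors the decomposition into first intra-domain samples, second intra-domain samples, and outer-domain samples.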
In the embodiment shown in fig. 1, a question text sample is first obtained; the question text sample is then input into the initial problem analysis model for iterative training to obtain the problem analysis model. The initial problem analysis model comprises a first coding layer and a conversion layer: the first coding layer is used for performing coding processing according to the question text sample to obtain a corresponding first sentence pattern vector; the conversion layer is used for generating a preset number of initial intention vectors upon receiving the first sentence pattern vector, filling each initial intention vector according to the first sentence pattern vector to obtain corresponding target intention vectors, and converting the target intention vectors into corresponding text fragments; the text fragments are used to query answers corresponding to the question text sample in the question-answering system. In this way, the initial problem analysis model is iteratively trained on the acquired question text samples. During training, the question text sample is encoded by the first coding layer into the corresponding first sentence pattern vector, the conversion layer generates a preset number of initial intention vectors, and each initial intention vector is filled based on the first sentence pattern vector, so that the question text sample is parsed in a single pass into text fragments corresponding to at least one target intention vector. The question intention reflected by each text fragment is determined by its corresponding initial intention vector, and the question text sample is converted into different intention spaces by the problem analysis model, with entities and constraints marked under the intention vector corresponding to each intention space. This helps improve slot recognition efficiency, reduces the number of models required by the slot recognition processing flow, and thereby reduces time delay; slot recognition is performed simultaneously in each intention space, which increases concurrency and improves the intention recognition capability and recognition efficiency of the question-answering system.
Based on the same technical concept as the foregoing method embodiments, an embodiment of the present application further provides a response method. Fig. 4 is a process flow diagram of the response method provided in an embodiment of the present application. Referring to fig. 4, the processing flow of the response method specifically includes steps S402 to S406.
S402, acquiring a target problem to be responded.
S404, inputting the target problem into a problem analysis model for parsing processing to obtain a corresponding target segment; the problem analysis model is obtained by training with the model training method of a question-answering system.
The model training method of the question-answering system in this step may be the model training method of the question-answering system provided in the foregoing respective method embodiments.
In a specific implementation manner, inputting the target problem into the problem analysis model for parsing processing to obtain the corresponding target segment includes: inputting the target problem into a problem classification model for classification processing to obtain a classification result; and, in the case that the classification result represents that the target problem belongs to the first preset classification, inputting the target problem into the problem analysis model for parsing processing to obtain the corresponding target segment.
S406, determining the answer to the target problem according to the target segment.
In a specific implementation, determining an answer to the target question according to the target segment includes: inputting the target segment into an entity link model for prediction processing to obtain a corresponding target entity; according to the target entity, slot filling processing is carried out to obtain a slot filling result of the target problem; and inquiring corresponding answers in a pre-configured knowledge graph according to the slot filling result to obtain the answers of the target questions.
The slot filling processing according to the target entity may be: for the target segment obtained in step S404, first determining at least one slot template that may correspond to the target segment; then, for each slot template, performing slot filling processing on the slot template with the target segment to obtain a filled slot template; and determining the filled slot template as the slot filling result corresponding to the target problem.
Each slot template may include one or more slots to be filled.
For example, slot template 1 is:
What is the (attribute slot) of the (entity slot)?
Slot template 2 is:
(entity slot) (relationship slot) (entity slot)?
Slot template 3 is:
What (entity slots) are there (constraint slot)?
It should be emphasized that the target entity may be used not only for filling the "entity slot", but also for filling various slots such as the "attribute slot", "relationship slot", and "constraint slot".
Querying the corresponding answer in the pre-configured knowledge graph according to the slot filling result may specifically be: if the slot filling result cannot be used to query a unique corresponding answer in the knowledge graph, determining the slot missing information corresponding to the slot filling result; generating a question-back sentence according to the slot missing information; receiving a user input in response to the question-back sentence; performing slot filling processing according to the user input; and querying the corresponding answer in the pre-configured knowledge graph according to the slot filling result obtained after this slot filling processing, thereby obtaining the answer to the target problem.
For example, the slot filling result of the target problem includes the filled slot template 1: What is the (attribute slot) of (product A)? The filled slot template 1 cannot be used to query a unique corresponding answer in the knowledge graph, and the slot missing information corresponding to it may be determined to be "attribute missing". According to the slot missing information "attribute missing", the corresponding question-back sentence "What information about product A would you like to consult?" is generated. The user input "price" is received in response to the question-back. Slot filling processing is performed according to the user input, i.e., the attribute slot in "What is the (attribute slot) of (product A)?" is filled, obtaining the secondarily filled slot template 1: What is the (price) of (product A)? According to the slot filling result after this slot filling processing, the answer corresponding to "What is the price of product A?" is queried in the pre-configured knowledge graph to obtain the answer to the target problem.
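The missing-slot detection and question-back loop just described can be sketched as follows. The template representation (a list of required slot names) and the question-back wording are assumptions for illustration, not the patent's templates.

```python
def fill_template(required_slots, fragments):
    """Fill a slot template from the parsed fragments; return the filled
    slots and the names of any slots still missing."""
    filled = {slot: fragments.get(slot) for slot in required_slots}
    missing = [slot for slot, value in filled.items() if value is None]
    return filled, missing

def question_back(entity, missing_slot):
    # Illustrative question-back script for a missing slot.
    return f"What {missing_slot} of {entity} would you like to consult?"

filled, missing = fill_template(["entity", "attribute"],
                                {"entity": "product A"})
if missing:                        # the attribute slot is missing -> ask back
    prompt = question_back(filled["entity"], missing[0])
    filled["attribute"] = "price"  # stand-in for the user's reply
print(filled)  # → {'entity': 'product A', 'attribute': 'price'}
```

Once `missing` is empty, the fully filled template can be turned into a knowledge-graph query, as in the "What is the price of product A?" example above.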
The following exemplarily describes, with reference to fig. 5, how the corresponding answer is determined in the knowledge graph according to the slot filling result. Fig. 5 is a session management flow chart in the response method provided in an embodiment of the present application.
As shown in fig. 5, after a question input by a user enters a session stack, the question answering system may perform algorithm analysis on the question to obtain an analysis result. The parsing result includes at least one slot identification result.
The parsing result is then detected. If the parsing result represents OOD or no result, a response text is generated according to the pre-configured response script corresponding to the unknown/unanswerable flag bit; the response text is used to inform the user that the question-answering system cannot answer the question.
If the analysis result is used for representing a single result, the session stack detection is performed: for the first session, entering a slot management flow; for multiple rounds of conversations, a session management flow is entered.
If the analysis result is used for representing multiple results, entering a slot management flow, and carrying out slot filling processing through the slot management flow.
In the slot management flow, the question-back template or answer template to be generated may be determined by jointly considering whether a composite constraint slot, an entity slot, an intention slot, or a common constraint slot exists, whether the node is a leaf node, and whether the intention slot is a common intention or a multi-value intention.
In the session management flow, how to carry out logical inheritance in the multi-round session can be determined by means of composite constraint detection, common constraint detection, entity detection, intention detection, node judgment, entity inheritance, constraint inheritance, intention inheritance and the like.
Since the technical concept is the same, the description of this embodiment is relatively simple; for the relevant parts, reference may be made to the corresponding descriptions of the method embodiments provided above.
In the foregoing embodiments, a model training method of a question-answering system is provided, and correspondingly, based on the same technical concept, the embodiments of the application further provide a model training device of a question-answering system, which is described below with reference to the accompanying drawings.
Fig. 6 is a schematic diagram of a model training device of a question-answering system according to an embodiment of the present application.
The present embodiment provides a model training device 600 of a question-answering system, which includes an initial question analysis model; the device comprises:
a first obtaining unit 602, configured to obtain a question text sample;
a first training unit 604, configured to input the problem text sample into the initial problem analysis model for iterative training, so as to obtain a problem analysis model; the initial problem analysis model comprises a first coding layer and a conversion layer;
The first coding layer is used for carrying out coding processing according to the problem text sample to obtain a corresponding first sentence pattern vector;
the conversion layer is used for generating a preset number of initial intention vectors under the condition that the first sentence pattern vector is received, filling each initial intention vector according to the first sentence pattern vector to obtain a corresponding target intention vector, and converting the target intention vector into a corresponding text segment; and the text segment is used for inquiring answers corresponding to the question text samples in the question and answer system.
Optionally, the first sentence pattern vector includes a semantic feature sub-vector and a plurality of character sub-vectors; the filling processing is performed on each initial intention vector according to the first sentence pattern vector, and the specific implementation manner of obtaining the corresponding target intention vector is as follows:
performing intention classification processing on the semantic feature sub-vectors to obtain corresponding intention classification results;
filling each initial intention vector according to the intention classification result to obtain a corresponding intermediate intention vector;
and filling each intermediate intention vector according to each character sub-vector to obtain a corresponding target intention vector.
Optionally, the intent classification result includes a first classification result and a second classification result; the specific implementation method for obtaining the corresponding target intention vector by filling each intermediate intention vector according to each character sub-vector comprises the following steps:
performing entity recognition processing and constraint recognition processing according to each character sub-vector to obtain a corresponding character recognition result;
and filling each intermediate intention vector according to the character recognition result to obtain a corresponding target intention vector.
Optionally, the filling processing is performed on each intermediate intention vector according to the character recognition result, so as to obtain a specific implementation manner of the corresponding target intention vector, which comprises the following steps:
if the character recognition result is used for representing that the character sub-vector belongs to an entity class or a constraint class, determining a corresponding intermediate intention vector according to the character recognition result and filling.
Optionally, the question text sample includes at least one of an entity element, an attribute element, a relationship element, a constraint element; the preset number of initial intention vectors comprises a first intention vector and a plurality of second intention vectors; the first intent vector corresponds to an unintentional diagram, and each of the second intent vectors corresponds to one of the attribute elements or one of the relationship elements.
Optionally, the question-answering system further comprises an initial entity link model; the model training apparatus 600 of the question-answering system further includes:
the third acquisition unit is used for acquiring an entity link sample;
the second training unit is used for inputting the entity link sample into the initial entity link model for iterative training to obtain an entity link model; the entity link model comprises a second coding layer and a prediction layer;
the second coding layer is used for carrying out coding processing on the entity link samples to obtain corresponding second sentence vectors;
the prediction layer is used for performing prediction processing according to the second sentence pattern vector and determining a corresponding target entity.
Optionally, the question answering system further comprises an initial question classification model; the model training apparatus 600 of the question-answering system further includes:
a fourth obtaining unit for obtaining a problem classification sample;
and the third training unit is used for inputting the problem classification sample into the initial problem classification model for iterative training to obtain a problem classification model.
The model training device of the question-answering system provided by the embodiment of the application comprises a first acquisition unit and a training unit, wherein: the first acquisition unit is used for acquiring a question text sample; the training unit is used for inputting the problem text sample into the initial problem analysis model for iterative training to obtain the problem analysis model; the initial problem analysis model comprises a first coding layer and a conversion layer; the first coding layer is used for carrying out coding processing according to the problem text sample to obtain a corresponding first sentence pattern vector; the conversion layer is used for generating a preset number of initial intention vectors under the condition of receiving the first sentence pattern vector, filling each initial intention vector according to the first sentence pattern vector to obtain a corresponding target intention vector, and converting the target intention vector into a corresponding text segment; the text segment is used for inquiring answers corresponding to the text sample of the question in the question-answering system. 
In this way, the initial problem analysis model is iteratively trained on the acquired question text samples. During training, the question text sample is encoded by the first coding layer into the corresponding first sentence pattern vector, the conversion layer generates a preset number of initial intention vectors, and each initial intention vector is filled based on the first sentence pattern vector, so that the question text sample is parsed in a single pass into text fragments corresponding to at least one target intention vector. The question intention reflected by each text fragment is determined by its corresponding initial intention vector, and the question text sample is converted into different intention spaces by the problem analysis model, with entities and constraints marked under the intention vector corresponding to each intention space. This helps improve slot recognition efficiency, reduces the number of models required by the slot recognition processing flow, and thereby reduces time delay; slot recognition is performed simultaneously in each intention space, which increases concurrency and improves the intention recognition capability and recognition efficiency of the question-answering system.
In the foregoing embodiments, a response method is provided, and correspondingly, based on the same technical concept, the embodiments of the application further provide a response device, which is described below with reference to the accompanying drawings.
Fig. 7 is a schematic diagram of a response device according to an embodiment of the present application.
The present embodiment provides a response device 700, including:
a second obtaining unit 702, configured to obtain a target problem to be responded;
the parsing unit 704 is configured to input the target problem into a problem parsing model to perform parsing processing, so as to obtain a corresponding target segment; the problem analysis model is obtained by training a model training method of a question answering system;
a determining unit 706, configured to determine an answer to the target question according to the target segment.
Optionally, the parsing unit 704 is specifically configured to:
inputting the target problem into a problem classification model for classification treatment to obtain a classification result;
and under the condition that the classification result is used for representing that the target problem belongs to a first preset classification, inputting the target problem into the problem analysis model for analysis processing to obtain a corresponding target segment.
Optionally, the determining unit 706 is specifically configured to:
Inputting the target segment into an entity link model for prediction processing to obtain a corresponding target entity;
according to the target entity, slot filling processing is carried out to obtain a slot filling result of the target problem;
and inquiring a corresponding answer in a pre-configured knowledge graph according to the slot filling result to obtain the answer of the target question.
The answering apparatus provided by the embodiment of the application comprises: a second acquisition unit, configured to acquire a target question to be answered; a parsing unit, configured to input the target question into a question analysis model for analysis processing to obtain a corresponding target segment, the question analysis model being trained by the model training method of the question-answering system; and a determining unit, configured to determine the answer to the target question according to the target segment.
In this way, during training the question analysis model encodes the question text sample through the first coding layer to obtain the corresponding first sentence vector, generates a preset number of initial intention vectors through the conversion layer, and then fills each initial intention vector based on the first sentence vector, so that the question text sample is parsed in a single pass into text segments corresponding to at least one target intention vector. Because the question intention reflected by each text segment is determined by its initial intention vector, the question analysis model can map the question text sample into different intention spaces and label entities and constraints under the intention vector of each space. This improves slot recognition efficiency and reduces the number of models required in the slot recognition pipeline, thereby reducing latency; moreover, since slot recognition is executed in every intention space simultaneously, the concurrent intention recognition capability and recognition efficiency of the question-answering system are increased. The question analysis model then generates target segments, from which the answer to the target question is determined, improving the intention recognition accuracy and efficiency of the question-answering system.
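The filling step described above can be sketched as cross-attention from the initial intention vectors over the encoded question. This is a minimal numerical illustration, not the patented implementation; all dimensions and the attention form are invented for demonstration.

```python
import numpy as np

# Hypothetical sketch of the conversion layer: a preset number of initial
# intention vectors are "filled" from the first sentence vector by attending
# over the character positions of the encoded question text sample.
rng = np.random.default_rng(0)
seq_len, d = 6, 8             # characters in the question text, hidden size
num_intents = 4               # preset number of initial intention vectors

sentence_vec = rng.normal(size=(seq_len, d))      # first coding layer output
init_intents = rng.normal(size=(num_intents, d))  # learnable intent queries

# Each initial intention vector attends over the characters and is filled
# with the content it attends to, yielding the target intention vectors.
scores = init_intents @ sentence_vec.T                      # (4, 6)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
target_intents = weights @ sentence_vec                     # (4, 8)

assert target_intents.shape == (num_intents, d)
assert np.allclose(weights.sum(axis=1), 1.0)
```

Because each intention vector attends independently, all intention spaces are filled in one pass over the question, which is the source of the concurrency claimed above.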
Based on the same technical concept, an embodiment of the present application further provides an electronic device configured to execute the model training method of the question-answering system or the answering method provided above. Fig. 8 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
As shown in fig. 8, the electronic device may vary considerably in configuration or performance, and may include one or more processors 801 and a memory 802, where the memory 802 may store one or more application programs or data. The memory 802 may be transient or persistent storage. An application program stored in the memory 802 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the electronic device. Further, the processor 801 may be configured to communicate with the memory 802 and execute the series of computer-executable instructions in the memory 802 on the electronic device. The electronic device may also include one or more power supplies 803, one or more wired or wireless network interfaces 804, one or more input/output interfaces 805, one or more keyboards 806, and the like.
In a particular embodiment, an electronic device includes a memory and one or more programs, where the one or more programs are stored in the memory and may include one or more modules, each module may include a series of computer-executable instructions for the electronic device, and the one or more programs, configured to be executed by one or more processors, include computer-executable instructions for:
acquiring a question text sample;
inputting the question text sample into the initial question analysis model for iterative training to obtain a question analysis model; the initial question analysis model comprises a first coding layer and a conversion layer;
the first coding layer is used for performing coding processing on the question text sample to obtain a corresponding first sentence vector;
the conversion layer is used for generating a preset number of initial intention vectors upon receiving the first sentence vector, filling each initial intention vector according to the first sentence vector to obtain a corresponding target intention vector, and converting the target intention vector into a corresponding text segment; the text segment is used for querying, in the question-answering system, the answer corresponding to the question text sample.
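The iterative training of the two-layer structure can be illustrated with a toy example. The loss function, update rule, and all dimensions below are invented stand-ins; the patent does not fix any of them.

```python
import numpy as np

# Hypothetical miniature of the iterative training: the conversion layer's
# initial intention vectors are updated so that the filled target vectors
# match supervision. A numerical gradient is used purely for readability.
rng = np.random.default_rng(1)
d, num_intents = 4, 2
sentence_vec = rng.normal(size=(5, d))       # encoded question text sample
gold = rng.normal(size=(num_intents, d))     # supervision for target vectors
intents = rng.normal(size=(num_intents, d))  # initial intention vectors

def fill(q):
    """Cross-attention filling of intention vectors over the sentence vector."""
    s = q @ sentence_vec.T
    w = np.exp(s) / np.exp(s).sum(axis=1, keepdims=True)
    return w @ sentence_vec

def loss(q):
    return float(((fill(q) - gold) ** 2).mean())

losses = [loss(intents)]
for _ in range(100):                          # iterative training steps
    eps, grad = 1e-4, np.zeros_like(intents)
    for i in range(num_intents):              # finite-difference gradient
        for j in range(d):
            probe = intents.copy()
            probe[i, j] += eps
            grad[i, j] = (loss(probe) - losses[-1]) / eps
    intents -= 0.1 * grad
    losses.append(loss(intents))

assert losses[-1] < losses[0]                 # training reduces the loss
```

A real implementation would use backpropagation in a deep-learning framework; the point here is only the shape of the loop: encode, fill, score against supervision, update.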
In another particular embodiment, an electronic device includes a memory and one or more programs, where the one or more programs are stored in the memory and may include one or more modules, each module may include a series of computer-executable instructions for the electronic device, and the one or more programs, configured to be executed by one or more processors, include computer-executable instructions for:
acquiring a target question to be answered;
inputting the target question into a question analysis model for analysis processing to obtain a corresponding target segment; the question analysis model is trained by the model training method of the question-answering system;
and determining the answer to the target question according to the target segment.
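The three instructions above can be sketched as a toy end-to-end answering flow. Both the parser and the answer table are hypothetical stand-ins for the trained models and the knowledge graph.

```python
# Toy illustration of: acquire question -> parse into target segments ->
# determine the answer. Names and contents are invented for illustration.
def parse_question(question):
    """Stub question analysis model: map a question to (intent, entity) segments."""
    if "headquarters" in question:
        return [("headquarters", "Mashang Consumer Finance")]
    return []

answer_table = {("headquarters", "Mashang Consumer Finance"): "Chongqing"}

def answer(question):
    for segment in parse_question(question):   # target segments
        if segment in answer_table:            # determine the answer per segment
            return answer_table[segment]
    return None

print(answer("Where is the headquarters of Mashang Consumer Finance?"))  # -> Chongqing
```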
An embodiment of a computer-readable storage medium provided in the present specification is as follows:
Based on the same technical concept and corresponding to the above model training method of the question-answering system, an embodiment of the present application further provides a computer-readable storage medium.
The computer-readable storage medium provided in this embodiment is configured to store computer-executable instructions that, when executed by a processor, implement the following process:
acquiring a question text sample;
inputting the question text sample into the initial question analysis model for iterative training to obtain a question analysis model; the initial question analysis model comprises a first coding layer and a conversion layer;
the first coding layer is used for performing coding processing on the question text sample to obtain a corresponding first sentence vector;
the conversion layer is used for generating a preset number of initial intention vectors upon receiving the first sentence vector, filling each initial intention vector according to the first sentence vector to obtain a corresponding target intention vector, and converting the target intention vector into a corresponding text segment; the text segment is used for querying, in the question-answering system, the answer corresponding to the question text sample.
It should be noted that the embodiment of the computer-readable storage medium and the embodiment of the model training method of the question-answering system in the present specification are based on the same inventive concept, so the specific implementation of this embodiment may refer to the implementation of the foregoing corresponding method; repeated description is omitted.
Corresponding to the above-described answering method and based on the same technical concept, an embodiment of the present application further provides a computer-readable storage medium.
The computer-readable storage medium provided in this embodiment is configured to store computer-executable instructions that, when executed by a processor, implement the following process:
acquiring a target question to be answered;
inputting the target question into a question analysis model for analysis processing to obtain a corresponding target segment; the question analysis model is trained by the model training method of the question-answering system;
and determining the answer to the target question according to the target segment.
It should be noted that the embodiment of the computer-readable storage medium and the embodiment of the answering method in the present specification are based on the same inventive concept, so the specific implementation of this embodiment may refer to the implementation of the foregoing corresponding method; repeated description is omitted.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, magnetic disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
Embodiments of the application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. Embodiments of the application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including memory storage devices.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiments are described relatively simply because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding description of the method embodiments.
The foregoing description is by way of example only and is not intended to limit the present disclosure. Various modifications and changes may occur to those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. that fall within the spirit and principles of the present document are intended to be included within the scope of the claims of the present document.

Claims (13)

1. A model training method of a question-answering system, characterized in that the question-answering system comprises an initial question analysis model; the method comprises:
acquiring a question text sample;
inputting the question text sample into the initial question analysis model for iterative training to obtain a question analysis model; the initial question analysis model comprises a first coding layer and a conversion layer;
the first coding layer is used for performing coding processing on the question text sample to obtain a corresponding first sentence vector;
the conversion layer is used for generating a preset number of initial intention vectors upon receiving the first sentence vector, filling each initial intention vector according to the first sentence vector to obtain a corresponding target intention vector, and converting the target intention vector into a corresponding text segment; the text segment is used for querying, in the question-answering system, the answer corresponding to the question text sample.
2. The method of claim 1, wherein the first sentence vector comprises a semantic feature sub-vector and a plurality of character sub-vectors, and filling each initial intention vector according to the first sentence vector to obtain the corresponding target intention vector comprises:
performing intention classification processing on the semantic feature sub-vector to obtain a corresponding intention classification result;
filling each initial intention vector according to the intention classification result to obtain a corresponding intermediate intention vector;
and filling each intermediate intention vector according to each character sub-vector to obtain a corresponding target intention vector.
3. The method according to claim 2, wherein filling each intermediate intention vector according to each character sub-vector to obtain the corresponding target intention vector comprises:
performing entity recognition processing and constraint recognition processing according to each character sub-vector to obtain a corresponding character recognition result;
and filling each intermediate intention vector according to the character recognition result to obtain a corresponding target intention vector.
4. The method of claim 3, wherein filling each intermediate intention vector according to the character recognition result to obtain the corresponding target intention vector comprises:
if the character recognition result indicates that the character sub-vector belongs to an entity class or a constraint class, determining the corresponding intermediate intention vector according to the character recognition result and filling it.
5. The method of any of claims 1-4, wherein the question text sample comprises at least one of an entity element, an attribute element, a relationship element, and a constraint element; the preset number of initial intention vectors comprises a first intention vector and a plurality of second intention vectors; the first intention vector corresponds to no intention, and each second intention vector corresponds to one attribute element or one relationship element.
6. The method of claim 1, wherein the question-answering system further comprises an initial entity link model, and the method further comprises:
obtaining an entity link sample;
inputting the entity link sample into the initial entity link model for iterative training to obtain an entity link model; the entity link model comprises a second coding layer and a prediction layer;
the second coding layer is used for performing coding processing on the entity link sample to obtain a corresponding second sentence vector;
the prediction layer is used for performing prediction processing according to the second sentence vector to determine a corresponding target entity.
7. The method of claim 1, wherein the question-answering system further comprises an initial question classification model, and the method further comprises:
acquiring a question classification sample;
and inputting the question classification sample into the initial question classification model for iterative training to obtain a question classification model.
8. An answering method, comprising:
acquiring a target question to be answered;
inputting the target question into a question analysis model for analysis processing to obtain a corresponding target segment; the question analysis model is trained by the model training method of the question-answering system according to any one of claims 1 to 7;
and determining the answer to the target question according to the target segment.
9. The method of claim 8, wherein inputting the target question into the question analysis model for analysis processing to obtain the corresponding target segment comprises:
inputting the target question into a question classification model for classification processing to obtain a classification result;
and in a case that the classification result indicates that the target question belongs to a first preset classification, inputting the target question into the question analysis model for analysis processing to obtain the corresponding target segment.
10. The method of claim 8, wherein determining the answer to the target question according to the target segment comprises:
inputting the target segment into an entity link model for prediction processing to obtain a corresponding target entity;
performing slot filling processing according to the target entity to obtain a slot filling result of the target question;
and querying a corresponding answer in a preconfigured knowledge graph according to the slot filling result to obtain the answer to the target question.
11. An answering apparatus, comprising:
a second acquisition unit, configured to acquire a target question to be answered;
a parsing unit, configured to input the target question into a question analysis model for analysis processing to obtain a corresponding target segment; the question analysis model is trained by the model training method of the question-answering system according to any one of claims 1 to 7;
and a determining unit, configured to determine the answer to the target question according to the target segment.
12. An electronic device, the device comprising:
a processor; and a memory configured to store computer-executable instructions that, when executed, cause the processor to perform the model training method of the question-answering system according to any one of claims 1-7, or the answering method according to any one of claims 8-10.
13. A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the model training method of the question-answering system according to any one of claims 1-7, or the answering method according to any one of claims 8-10.
CN202310247092.2A 2023-03-14 2023-03-14 Model training method and device of question-answering system, electronic equipment and storage medium Pending CN116306974A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202310247092.2A CN116306974A (en) 2023-03-14 2023-03-14 Model training method and device of question-answering system, electronic equipment and storage medium
PCT/CN2024/070737 WO2024187925A1 (en) 2023-03-14 2024-01-05 Question answering system model training method, and sample generation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310247092.2A CN116306974A (en) 2023-03-14 2023-03-14 Model training method and device of question-answering system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116306974A true CN116306974A (en) 2023-06-23

Family

ID=86833809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310247092.2A Pending CN116306974A (en) 2023-03-14 2023-03-14 Model training method and device of question-answering system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116306974A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117744630A (en) * 2024-02-19 2024-03-22 卓世智星(天津)科技有限公司 Model access method and device and electronic equipment
CN117744630B (en) * 2024-02-19 2024-05-28 卓世智星(天津)科技有限公司 Model access method and device and electronic equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination