CN116932708A - Open domain natural language reasoning question-answering system and method driven by large language model - Google Patents
Open domain natural language reasoning question-answering system and method driven by large language model
- Publication number
- CN116932708A (application CN202310414399.7A)
- Authority
- CN
- China
- Prior art keywords
- question
- reasoning
- module
- language model
- answering
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/338—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/34—Browsing; Visualisation therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a large language model driven open domain natural language reasoning question-answering system and method. A question rewriting module rewrites the user question into a rewritten question; a central computing and management module manages the computing and knowledge resources of the large language model and, according to the type of the rewritten question, outputs the rewritten question together with the large language model computing and knowledge resources required by the question-answering core engine module to one or more sub-question-answering modules in the question-answering core engine module; the question-answering core engine module reasons over the rewritten question and the computing and knowledge resources of the large language model to obtain one or more candidate answers to the rewritten question and explanatory information for the candidate answers; the aggregation reasoning module aggregates and reasons over the one or more candidate answers and their explanatory information to obtain a final answer to the rewritten question and explanatory information for the final answer. Because the whole system is supported by a large language model, it covers comprehensive question types, is easy to extend, is interpretable, and has strong generality.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a large-language model driven open domain natural language reasoning question-answering system and method.
Background
With the development of the Internet and the explosive growth of digital information, improving the efficiency and accuracy of intelligent question answering has become a research hotspot. Intelligent question answering (Question Answering, QA) aims to automatically provide answers to natural language questions posed by a user and is one of the most representative tasks in the field of natural language processing (Natural Language Processing, NLP). Through technologies such as unified task learning and instruction learning, various NLP tasks can be unified into question-answering tasks, which has become an important trend in current NLP development.
Existing open-domain natural language reasoning question-answering systems generally comprise three stages: question analysis, knowledge retrieval, and answer generation, but they suffer from low coverage of supported reasoning types, poor scalability, weak interpretability, and similar problems. At present, open-domain dialogue large language models such as ChatGPT cannot exploit resources like large-scale structured knowledge graphs and external text (including local and Internet text) for knowledge reasoning and question answering, so they still suffer from weak timeliness and unreliable information, especially when handling professional tasks in specific domains.
Disclosure of Invention
The invention provides a large language model driven open domain natural language reasoning question-answering system and method, which are used to overcome the drawbacks of prior-art question-answering systems such as low coverage of supported reasoning types, poor scalability, and weak interpretability.
The invention provides a large language model driven open domain natural language reasoning question-answering system, which comprises a question rewriting module, a central computing and management module, a question-answering core engine module and an aggregation reasoning module; the question-answering core engine module comprises a plurality of sub-question-answering modules whose question reasoning logics differ from one another; the output of the question rewriting module is connected with the input of the central computing and management module and is used for rewriting the user question to obtain a rewritten question; the output of the central computing and management module is connected with the input of the question-answering core engine module and is used for managing the computing and knowledge resources of the large language model and, according to the type of the rewritten question, outputting the rewritten question together with the large language model computing and knowledge resources required by the question-answering core engine module to one or more sub-question-answering modules in the question-answering core engine module; the output of the question-answering core engine module is connected with the input of the aggregation reasoning module and is used for reasoning over the rewritten question and the computing and knowledge resources of the large language model to obtain one or more candidate answers to the rewritten question and explanatory information for the candidate answers; the aggregation reasoning module is used for aggregating and reasoning over the one or more candidate answers to the rewritten question and their explanatory information to obtain the final answer to the rewritten question and explanatory information for the final answer.
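To make the data flow concrete, a minimal Python sketch of this module pipeline is given below; all class, method and field names are illustrative assumptions made for clarity, not the patent's actual implementation.

```python
# Illustrative sketch of the module pipeline described above; names are assumptions.
from __future__ import annotations
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Candidate:
    answer: str
    explanation: str          # explanatory (interpretable) information
    score: float = 0.0
    source: str = ""          # which sub-question-answering module produced it


class SubQAModule(Protocol):
    name: str
    def answer(self, question: str, resources: dict) -> Candidate: ...


class QASystem:
    def __init__(self, rewriter, central_manager, sub_modules, aggregator):
        self.rewriter = rewriter                # question rewriting module
        self.central_manager = central_manager  # central computing and management module
        self.sub_modules = sub_modules          # question-answering core engine module
        self.aggregator = aggregator            # aggregation reasoning module

    def ask(self, user_question: str, history: list[str]) -> Candidate:
        # 1. Rewrite the user question using the dialogue history.
        question = self.rewriter.rewrite(user_question, history)
        # 2. The central manager selects sub-modules and allocates LLM/knowledge resources.
        selected, resources = self.central_manager.dispatch(question, self.sub_modules)
        # 3. Each selected sub-module infers a candidate answer with an explanation.
        candidates = [m.answer(question, resources) for m in selected]
        # 4. Aggregation reasoning produces the final answer and its explanation.
        return self.aggregator.aggregate(question, candidates)
```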
The large language model driven open domain natural language reasoning question-answering system provided by the invention further comprises: a reply generation module, whose input is connected with the output of the aggregation reasoning module and which is used for converting the final answer to the rewritten question and its explanatory information into a machine reply, obtaining a reply to the user question and explanatory information for the reply; and a dialogue interaction module, connected with the reply generation module and used for carrying out dialogue interaction on the user question according to the reply to the user question and its explanatory information.
According to the large language model driven open domain natural language reasoning question-answering system provided by the invention, the question rewriting module is specifically used for performing coreference resolution or ellipsis resolution on the user question according to historical question-answer data to obtain the rewritten question; no rewrite prompt is triggered when the rewritten question is identical to the user question, and a rewrite prompt is triggered for the user to select when the rewritten question differs from the user question.
According to the invention, the central computing and management module comprises: a knowledge graph base management module, used for providing knowledge query and reasoning execution over the knowledge graph base; a text resource library management module, used for providing text query support over the local text resource library; a large language model computing and management module, used for providing model computation and task-adaptation support for the large language model library and for the library of small delta fine-tuning parameters used for large language model task adaptation; and a management center module, used for scheduling the knowledge graph base, the local text resource library, the large language model library and the delta fine-tuning parameter library according to the requirements of the plurality of sub-question-answering modules, selecting one or more sub-question-answering modules from the plurality of sub-question-answering modules according to the type of the rewritten question, and reasoning over the rewritten question to obtain one or more candidate answers to the rewritten question and their explanatory information.
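As an illustration only, the management center module's routing of a rewritten question to sub-modules by question type might look like the following sketch; the type labels and the routing table are assumptions, not values disclosed in the patent.

```python
# Hypothetical routing table: question type -> sub-question-answering modules.
ROUTING = {
    "factoid":      ["KBQA", "BMQA"],
    "multi_hop":    ["KBQA", "TextQA"],
    "numerical":    ["KBQA", "TextQA"],
    "common_sense": ["BMQA"],
    "open_domain":  ["TextQA", "BMQA"],
}

def dispatch(question_type: str, registry: dict) -> list:
    """Return the sub-modules (each already bound to its LLM and knowledge
    resources) selected for a given question type."""
    names = ROUTING.get(question_type, ["TextQA", "BMQA"])  # open-domain fallback
    return [registry[name] for name in names if name in registry]
```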
According to the large language model driven open domain natural language reasoning question-answering system provided by the invention, the aggregation reasoning module is specifically used for aggregating and reasoning over the one or more candidate answers to the rewritten question and their explanatory information, based on a score fusion mechanism, a user-feedback optimization mechanism and an iterative comprehensive reasoning mechanism, to obtain the final answer to the rewritten question and explanatory information for the final answer.
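A minimal sketch of score fusion is shown below, assuming each candidate carries a confidence score and each sub-module a reliability weight; the weighting scheme and the feedback bonus are assumptions used only to illustrate the idea.

```python
def fuse(candidates, module_weights, feedback_bonus=None):
    """Score-fusion aggregation sketch: weight each candidate's confidence by
    the reliability of the module that produced it, optionally adjusted by
    accumulated user feedback, and return the best candidate."""
    feedback_bonus = feedback_bonus or {}

    def fused_score(c):
        return c.score * module_weights.get(c.source, 1.0) + feedback_bonus.get(c.source, 0.0)

    best = max(candidates, key=fused_score)
    return best.answer, best.explanation
```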
According to the invention, the plurality of sub-question-answering modules of the large language model driven open domain natural language reasoning question-answering system comprise: a knowledge base question-answering module, used for reasoning over a large-scale knowledge graph to obtain candidate answers to the rewritten question and their explanatory information; a reading comprehension question-answering module, used for reasoning over locally collected text and/or online Internet-retrieved text to obtain candidate answers to the rewritten question and their explanatory information; and a large language model question-answering module, used for reasoning over the linguistic knowledge, common-sense knowledge and factual knowledge encoded in the large language model to obtain candidate answers to the rewritten question and their explanatory information.
According to the large language model driven open domain natural language reasoning question-answering system provided by the invention, the knowledge base question-answering module comprises: a semantic parsing module, used for identifying entities in the rewritten question based on entity linking and entity substitution, substituting the entities, and mapping the semantic parse of the rewritten question into a logical expression executable by the logical-expression execution module supporting neural-symbolic hybrid reasoning; and the logical-expression execution module supporting neural-symbolic hybrid reasoning, used for querying information from a structured knowledge base or an unstructured text base according to the logical expression, so as to obtain by reasoning the candidate answers to the rewritten question and their explanatory information.
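To illustrate what an executable logical expression can look like, the sketch below runs a small KoPL-style function sequence over a toy in-memory triple store; the simplified function names (Find, Relate) and the toy facts are assumptions for illustration only, not the patent's executor.

```python
# Toy KoPL-style execution over an in-memory triple store.
TRIPLES = {
    ("The Old Man and the Sea", "author", "Ernest Hemingway"),
    ("Ernest Hemingway", "award", "Nobel Prize in Literature"),
}

def Find(entity):
    return {entity}

def Relate(entities, relation):
    return {o for (s, r, o) in TRIPLES if s in entities and r == relation}

def execute(program):
    """Execute a list of (function, argument) steps; each step consumes the
    previous step's result, mirroring a semantic parse turned into a logical expression."""
    result = None
    for func, arg in program:
        if func == "Find":
            result = Find(arg)
        elif func == "Relate":
            result = Relate(result, arg)
    return result

# "What prizes did the author of The Old Man and the Sea win?"
program = [("Find", "The Old Man and the Sea"), ("Relate", "author"), ("Relate", "award")]
print(execute(program))  # {'Nobel Prize in Literature'}
```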
According to the large language model driven open domain natural language reasoning question-answering system provided by the invention, the reading comprehension question-answering module comprises: an evidence retrieval module, used for retrieving local text data and Internet data to obtain evidence candidates; and a knowledge reasoning model, used for reasoning over the evidence candidates to obtain the candidate answers to the rewritten question and their explanatory information.
According to the invention, the large language model question-answering module comprises: an answer generation module based on knowledge probing, which takes a non-autoregressive generative pre-trained language model as the large language model and performs task adaptation for answer generation through a pluggable parameter-efficient learning technique that fine-tunes only a small number of parameters; and an explanation generation module, which, building on the answer-generation task adaptation, takes a generative large language model as the large language model, performs example construction and large-language-model explanation generation, and obtains by reasoning the candidate answers to the rewritten question and their explanatory information. The large language model question-answering module is further used for judging, according to a false-premise rebuttal mechanism, whether the rewritten question is a false-premise (preset-position) question, and for generating a large-language-model rebuttal explanation.
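The prompt-based answer and explanation generation can be sketched as follows; the prompt templates and the `generate` callable wrapping the large language model are assumptions used only to show the shape of the interaction.

```python
# Illustrative prompt construction for prompt-based answer generation and
# Demonstration-style explanation generation; the templates are assumptions.
ANSWER_PROMPT = "Question: {question}\nAnswer:"

EXPLANATION_PROMPT = (
    "Question: {question}\n"
    "Answer: {answer}\n"
    "Explain step by step why this answer is correct:"
)

def bmqa_answer(question: str, generate) -> tuple:
    """`generate` is any text-generation callable wrapping the large language model."""
    answer = generate(ANSWER_PROMPT.format(question=question))
    explanation = generate(EXPLANATION_PROMPT.format(question=question, answer=answer))
    return answer, explanation
```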
The invention also provides a large language model driven open domain natural language reasoning question-answering method, applicable to the above large language model driven open domain natural language reasoning question-answering system, comprising the following steps: the question rewriting module rewrites the user question to obtain a rewritten question; the central computing and management module outputs, according to the type of the rewritten question, the rewritten question together with the large language model computing and knowledge resources required by the question-answering core engine module to one or more sub-question-answering modules in the question-answering core engine module; the question-answering core engine module reasons over the rewritten question and the computing and knowledge resources of the large language model to obtain one or more candidate answers to the rewritten question and their explanatory information; and the aggregation reasoning module aggregates and reasons over the one or more candidate answers and their explanatory information to obtain the final answer to the rewritten question and explanatory information for the final answer.
In the open domain natural language reasoning question-answering system and method driven by a large language model described above, the question rewriting module rewrites the user question to obtain a rewritten question; the central computing and management module manages the computing and knowledge resources of the large language model and, according to the type of the rewritten question, outputs the rewritten question together with the required large language model computing and knowledge resources to one or more sub-question-answering modules in the question-answering core engine module; the question-answering core engine module reasons over the rewritten question and those resources to obtain one or more candidate answers and their explanatory information; and the aggregation reasoning module aggregates and reasons over the candidate answers and their explanatory information to obtain the final answer and its explanatory information. Because the whole system is supported by a large language model, it covers comprehensive question types, is easy to extend, is interpretable, and has strong generality.
Drawings
In order to more clearly illustrate the technical solutions of the invention or of the prior art, the drawings required in the description of the embodiments or of the prior art are briefly introduced below. It is apparent that the drawings described below show some embodiments of the invention, and that a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of the structure of a large language model driven open domain natural language reasoning question-answering system provided by the invention;
FIG. 2 is a schematic diagram of a technical framework of a large language model driven open domain natural language reasoning question-answering system provided by the invention;
FIG. 3 is an example of the question-rewriting effect of the system provided by the present invention;
FIG. 4 is a schematic diagram of the question rewriting module provided by the present invention;
FIG. 5 is a schematic diagram of a central computing and management module provided by the present invention;
FIG. 6 is a schematic diagram of an aggregate inference module provided by the present invention;
FIG. 7 is a schematic diagram of an iterative comprehensive reasoning flow of the aggregate reasoning module provided by the invention;
FIG. 8 is a schematic diagram of a knowledge base question-answering module based on semantic parsing and logic expression execution provided by the present invention;
FIG. 9 is a schematic diagram of a reading understanding question-answering module based on open domain knowledge reasoning provided by the invention;
FIG. 10 is a schematic diagram of an evidence retrieval module provided by the present invention;
FIG. 11 is a schematic diagram of a knowledge reasoning model provided by the present invention;
FIG. 12 is a schematic diagram of a large language model question-answering module based on knowledge detection and demonstration provided by the present invention;
FIG. 13 is a schematic diagram of an open domain natural language reasoning question-answering system interface driven by a large language model provided by the present invention;
FIG. 14 is a second schematic diagram of an open domain natural language reasoning question-answering system interface driven by a large language model provided by the present invention;
FIG. 15 is a third diagram of an interface of the large language model driven open domain natural language reasoning question-answering system provided by the present invention;
FIG. 16 is a flow chart of a large language model driven open domain natural language reasoning question-answering method provided by the invention.
Detailed Description
To make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by a person skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
Intelligent question answering aims to automatically provide answers to natural language questions posed by users and is one of the most representative tasks in the field of Natural Language Processing (NLP). In recent years, thanks to the massive growth of Internet data, the rapid improvement of hardware computing power, and major advances in NLP and deep learning, intelligent question-answering methods have developed rapidly and successful applications have emerged one after another, such as conversational robots, intelligent voice assistants, and search engines. Although question-answering technology has made great progress, the complexity of human language and the openness of question-answering scenarios place high demands on the intelligence of such systems: a system must understand user questions accurately and be able to answer questions requiring various kinds of complex reasoning (e.g., multi-hop reasoning, numerical reasoning, common-sense reasoning) over different knowledge sources (e.g., text data, knowledge graphs). Building an open-domain question-answering system that supports complex scenarios has always been a goal of the intelligent question-answering field, and the enormous demand for intelligent question answering keeps academia and industry exploring it with unremitting effort.
However, existing question-answering systems still exhibit a number of serious problems or limitations:
(1) Low coverage of supported reasoning types. User questions often involve various kinds of reasoning, such as multi-hop reasoning, numerical reasoning, common-sense reasoning, and logical reasoning, and current systems can only cover part of them.
(2) Poor scalability. The language models adopted by current question-answering systems are mainly small models confined to a specific domain and hard to extend quickly to other application domains; even where large language model technology is adopted, the lack of an efficient computation-management design makes domain adaptation difficult and computation costly.
(3) Weak interpretability. Such systems explain their answers poorly: they usually give only the answer, without presenting its sources, the execution process, or an interpretation. Moreover, current large-language-model question-answering technology such as ChatGPT suffers from a serious "hallucination" problem in reply generation, producing unsafe, unreliable, or speculative content.
(4) Lack of rebuttal ability. Existing reading-comprehension and large-language-model question answering lacks the ability to rebut false-premise (preset-position) questions. For example, when asked "How many eyes does a person's foot have?", question-answering systems based on TextQA or GPT-3 have difficulty answering correctly and give absurd replies such as "two eyes".
(5) Insufficient open-domain question-answering performance. The technology adopted by existing intelligent voice assistants is still based on traditional small models, and their performance is mediocre. In addition, the knowledge sources used to answer questions are limited, relying mainly on knowledge graphs or text retrieval, with little use of online Internet text, large-language-model knowledge, and the like.
(6) Poor contextual understanding. In a question-answering system, users usually ask questions continuously, and the questions frequently contain coreference, ellipsis, and similar phenomena; existing question-answering systems lack the ability to understand the historical dialogue context in such scenarios, so question understanding is insufficiently accurate.
To break through the problems of existing question-answering systems, this patent takes large language model technology as its foundation and provides a complex question-answering system that supports the open domain, is interpretable, and is easy to extend.
The following describes a large language model driven open domain natural language reasoning question-answering system and method in connection with fig. 1-16.
Referring to fig. 1, fig. 1 is a schematic structural diagram of an open domain natural language reasoning question-answering system driven by a large language model according to the present invention.
Referring to fig. 2, fig. 2 is a schematic diagram of a technical framework of a large language model driven open domain natural language reasoning question-answering system provided by the present invention.
The invention provides a large language model driven open domain natural language reasoning question-answering system, which comprises a question rewriting module 1, a central computing and management module 2, a question-answering core engine module 3 and an aggregation reasoning module 4; the question-answering core engine module 3 comprises a plurality of sub-question-answering modules whose question reasoning logics differ from one another;
the output of the question rewriting module 1 is connected with the input of the central computing and management module 2 and is used for rewriting the user question to obtain a rewritten question;
the output of the central computing and management module 2 is connected with the input of the question-answering core engine module 3 and is used for managing the computing and knowledge resources of the large language model and, according to the type of the rewritten question, outputting the rewritten question together with the large language model computing and knowledge resources required by the question-answering core engine module 3 to one or more sub-question-answering modules in the question-answering core engine module 3;
the output of the question-answering core engine module 3 is connected with the input of the aggregation reasoning module 4 and is used for reasoning over the rewritten question and the computing and knowledge resources of the large language model to obtain one or more candidate answers to the rewritten question and their explanatory information;
the aggregation reasoning module 4 is configured to aggregate and reason over the one or more candidate answers to the rewritten question and their explanatory information to obtain the final answer to the rewritten question and explanatory information for the final answer.
Specifically, the invention provides a large language model driven open domain natural language reasoning question-answering system, which adopts a modularized design and mainly comprises a question rewriting module 1, a central computing and managing module 2, a question-answering core engine module 3 and an aggregation reasoning module 4.
Question rewriting module 1 (Rewriter): rewrites the user question according to the question-answer or dialogue history. Since questions may contain coreference, ellipsis, and similar phenomena, fusing the history information realizes dialogue-aware interaction and helps the downstream question-answering engine understand the question semantics.
Central computing and management module 2 (Central Computation and Management, CCM): manages the large language models adopted by the system in a unified way and provides efficient computation support for related tasks through pluggable parameter-efficient learning (delta learning). This reduces the GPU memory footprint of the large language models and the difficulty of domain-task adaptation, and improves the scalability of the system. The module also provides knowledge integration, knowledge query, database management and similar functions for large-scale knowledge graphs (general-domain knowledge graphs such as Wikidata as well as domain knowledge graphs).
The question and answer core engine module 3 may include, for example, a knowledge base question and answer module KBQA, an open domain reading understanding question and answer module TextQA, and a large language model question and answer module BMQA.
Knowledge base question-answering module KBQA: provides the ability to obtain knowledge from a large-scale knowledge graph for factual question answering, for questions such as "Which is taller, Yao Ming or James?". The module mainly consists of two parts, the mapping of the question text to a logical expression and the execution of that logical expression, and it provides the system with symbolic reasoning and neural-symbolic hybrid reasoning. The KBQA method based on logical expressions searches for answers in a large-scale structured knowledge graph; it is a strict symbolic reasoning method with the advantages of accurate and interpretable answers. In addition, the KBQA module designs a knowledge reasoning channel that fuses related-entity subgraph retrieval (KBQuery), i.e., KBQuery generation + subgraph retrieval + neural-symbolic reasoning: the rewritten question is first fed into a program generation model to generate a subgraph query statement (KBQuery); the subgraph retrieval function of the knowledge graph base management module then obtains a knowledge subgraph from the knowledge graph; finally, the retrieved subgraph is converted into sequential text (e.g., via JSON or direct concatenation) and fed into the large language model for knowledge reasoning, realizing a fuzzy knowledge reasoning capability for knowledge base question answering.
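For the example question above, one possible knowledge-graph query, assuming "James" refers to LeBron James, is a SPARQL query against Wikidata such as the sketch below; the query shape is an illustration of the idea, not the KBQuery actually generated by the system (P2048 is Wikidata's height property).

```python
# Sketch of querying Wikidata for a height comparison; illustrative only.
from SPARQLWrapper import SPARQLWrapper, JSON

QUERY = """
SELECT ?personLabel ?height WHERE {
  VALUES ?name { "Yao Ming"@en "LeBron James"@en }
  ?person rdfs:label ?name ;
          wdt:P2048 ?height .          # P2048 = height
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

sparql = SPARQLWrapper("https://query.wikidata.org/sparql")
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["personLabel"]["value"], row["height"]["value"])
```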
Open-domain reading comprehension question-answering module TextQA: adopts a retrieval-plus-reading-comprehension framework and supports retrieving question-related text fragments from locally collected texts and from online Internet search for reading comprehension and answering, so that open-domain questions of many kinds are supported through open-domain text. For text retrieval, a large-language-model-driven neural retrieval technique of dense retrieval plus reranking is adopted, and to improve dense vector retrieval efficiency the Hierarchical Navigable Small World (HNSW) vector search algorithm is used. Meanwhile, a reading comprehension model integrating numerical reasoning and logical reasoning is constructed. Through large language model technology, the module greatly improves text retrieval accuracy and the ability to answer open-domain questions, and supports numerical reasoning, logical reasoning and similar functions.
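A minimal sketch of HNSW-based dense retrieval with the FAISS library is given below; the random vectors are placeholders standing in for the document and question embeddings a large-language-model encoder would produce, and the index parameters are assumptions.

```python
# Minimal HNSW dense-retrieval sketch with FAISS.
import faiss
import numpy as np

dim, num_docs = 768, 10_000
doc_vecs = np.random.rand(num_docs, dim).astype("float32")   # placeholder embeddings

index = faiss.IndexHNSWFlat(dim, 32)   # 32 = HNSW neighbor count (M)
index.hnsw.efSearch = 64               # search-time breadth/accuracy trade-off
index.add(doc_vecs)

query_vec = np.random.rand(1, dim).astype("float32")          # placeholder question embedding
distances, doc_ids = index.search(query_vec, 10)              # top-10 candidate passages
print(doc_ids[0])  # these passages would then be reranked and read by the LLM
```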
Large language model question-answering module BMQA: answers questions using the large language model itself as the knowledge source. The large language model, trained on massive general text, already encodes rich linguistic knowledge, common-sense knowledge, and even substantial factual knowledge; the module adopts a technical framework of prompt-based question answering and Demonstration-based answer explanation. This module is analogous to the human brain answering questions from information memorized via the hippocampus, and in practice it shows better common-sense reasoning ability than the other question-answering modules. To address the lack of rebuttal ability for false-premise questions in existing systems, a false-premise dataset is constructed, and the large language model's ability to answer false-premise questions is realized by combining parameter-efficient fine-tuning (Delta Tuning), few-shot prompting, Pareto multi-task learning, data-replay learning and the like, achieving a good rebuttal-explanation effect.
Aggregation reasoning module 4 (Aggregator): on the basis of the replies from the three question-answering modules, and taking into account the question types at which each module excels, performs aggregation reasoning to obtain the reply the system considers optimal, exploiting the mutual complementation and verification of the different question-answering modules. In the module design, an online user-feedback optimization mechanism is constructed so that the quality of the system's replies is continuously improved through user feedback during operation and use.
Deep-learning large language models, represented by pre-trained language models (Pre-trained Language Model, PLM), have greatly raised the performance of various NLP tasks and become a breakthrough technology in the NLP and artificial intelligence fields. A PLM follows a pre-train then fine-tune paradigm: a pre-trained model is obtained by self-supervised learning on massive collected Internet text corpora and is then fine-tuned on a small amount of labeled data of a downstream task to adapt to that task. Compared with traditional artificial intelligence models, large language models have three advantages. First, data cost is low: large-scale training mainly uses unlabeled public data, and downstream task learning can be completed by slightly tuning the language model with a small amount of domain-specific data, accelerating deployment in specific domains. Second, generality is strong: a large language model can be applied to various artificial intelligence tasks without developing a dedicated model for each specific task, improving research and development efficiency. Third, overall performance is good: the performance of a general large language model on each specific task can exceed that of a traditional, task-specific model. Adopting large language model technology therefore makes it possible to build a high-level, general intelligent question-answering system. In this patent, unless otherwise specified, "large language model" refers to a large-scale pre-trained language model.
For problem (2), poor scalability, and problem (5), insufficient open-domain question-answering performance, the solution is: large language model support with central computing and management. Current pre-trained large language models are adopted as the base models of the question-answering system, and the general intelligence potential and powerful language understanding of large language models enable the system to support questions from different domains and of different types in open-domain scenarios. Considering the high computation cost, the difficulty of fine-tuning on domain task data, and the slow inference caused by the large-scale parameters of large language models, the central computing and management module 2 is designed to share and uniformly schedule the computation of the multiple large language models involved in the system; each subtask application is adapted in a pluggable, parameter-efficient fine-tuning (Delta-Tuning) manner, i.e., the large language model parameters are fixed and only a small number of parameters are fine-tuned (e.g., prompt tuning, adapter tuning, bias tuning), which reduces the system's demands on the memory of the computing platform and the graphics processing unit (GPU). Question answering over data sources such as open-domain text, large language models, and knowledge graphs is accomplished by embedding TextQA, BMQA, KBQA, and so on.
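As a simplified, concrete illustration of fixing the backbone parameters and tuning only a small subset, the PyTorch sketch below freezes everything except bias terms (BitFit-style bias tuning); the tiny Transformer is a stand-in, not the system's actual large language model.

```python
# Bias-only (BitFit-style) fine-tuning sketch in PyTorch: freeze the backbone,
# train only bias parameters. The tiny Transformer here is a stand-in model.
import torch
import torch.nn as nn

encoder_layer = nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True)
model = nn.TransformerEncoder(encoder_layer, num_layers=2)

for name, param in model.named_parameters():
    param.requires_grad = name.endswith("bias")   # keep only bias terms trainable

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)

total = sum(p.numel() for p in model.parameters())
tuned = sum(p.numel() for p in trainable)
print(f"tuning {tuned}/{total} parameters ({100 * tuned / total:.2f}%)")
```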
A question-answering system supported by a large language model is thus constructed. For example, in KBQA the user question is converted into an executable program language that is retrieved and reasoned over the knowledge graph, enabling accurate answers to various complex reasoning questions; a powerful generative PLM (such as a T5 model) can be used to generate the program language. In open-domain TextQA, relevant documents are first accurately retrieved from collected or Internet-scale text data, and answers are then generated from those documents by a reading comprehension model; a large-language-model-driven neural retrieval method (dense retrieval plus reranking) is constructed, where dense retrieval means encoding documents into semantic representation vectors with the large language model for retrieval, and experiments show that this approach can significantly outperform traditional statistical models (such as BM25). The large language model can then generate answers from the retrieved relevant documents conditioned on the question.
For problem (1), low coverage of supported reasoning types, the solution is: fusing multiple types of reasoning and multiple information sources. Drawing on the human brain's mechanisms of semantic understanding and symbolic reasoning, a system architecture with several reasoning mechanisms, including symbolic reasoning, neural reasoning, and symbolic-neural hybrid reasoning, is constructed. Specifically, three representative functional structures of human-brain language processing are considered: the language loop, which excels at symbolic reasoning; the hippocampus, with its memory read-write mechanism; and the linkage of reading perception with the hippocampus and cerebral cortex. Correspondingly, the system embeds the knowledge base question-answering module KBQA supporting symbolic reasoning, the large language model question-answering module BMQA supporting neural reasoning, and the reading comprehension question-answering module TextQA with numerical reasoning. KBQA adopts a program-based method that converts questions into executable programs (such as KoPL (Knowledge oriented Programming Language), a programming language designed for complex-question reasoning, or SPARQL (SPARQL Protocol and RDF Query Language), a query language used over knowledge graph resource description frameworks), turning the reasoning process into function operations and providing multi-hop reasoning, numerical reasoning, logical reasoning, factual reasoning and similar capabilities; TextQA uses a newly designed reading comprehension architecture that integrates multi-hop reasoning, numerical reasoning and logical reasoning; BMQA uses the large language model as the knowledge source and realizes rebuttal of false-premise questions, common-sense reasoning, factual reasoning and similar functions through mechanisms such as activation of the false-premise rebuttal capability and knowledge probing.
For problem (3), weak interpretability, the solution is: a modular design that exposes reasoning paths, answer analysis, and similar functions. The system adopts a modular design divided into question rewriting, information retrieval, reading comprehension inference, large language model question answering, large language model explanation generation, aggregation reasoning, and other modules, and the result of each module can be output, so the intermediate steps of the question-answering process can be inspected, which helps in controlling the system and understanding how it operates. In addition, in KBQA the execution process and intermediate results of an answer are shown visually through the program; in TextQA the evidence sources of the answer are listed; and in BMQA the answer is explained via the large-language-model Demonstration method. Through these designs the system provides interpretability spanning question analysis, answer execution, evidence, and answer explanation.
For problem (4), weak rebuttal ability, the solution is: an activation mechanism based on false-premise question-answer data. By constructing a false-premise question-answer dataset and combining Pareto multi-task learning, parameter-efficient fine-tuning (Delta Tuning) plus data replay, and few-shot prompting, the large language model's ability to rebut false-premise questions is activated and BMQA's ability to answer common-sense questions is improved.
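To illustrate the data-replay idea of mixing newly constructed false-premise examples with replayed general question-answer data during fine-tuning, a minimal batch-mixing sketch follows; the replay ratio, batch size and toy items are assumptions.

```python
# Minimal data-replay mixing sketch: each training batch combines new
# false-premise QA examples with replayed general QA data, so the model gains
# rebuttal ability without forgetting ordinary question answering.
import random

def replay_batches(false_premise_data, general_data, batch_size=16, replay_ratio=0.5):
    random.shuffle(false_premise_data)
    n_new = max(1, int(batch_size * (1 - replay_ratio)))
    for start in range(0, len(false_premise_data), n_new):
        new_part = false_premise_data[start:start + n_new]
        replay_part = random.sample(general_data, batch_size - len(new_part))
        batch = new_part + replay_part
        random.shuffle(batch)
        yield batch

# Example usage with toy items:
fp = [{"q": "How many eyes does a person's foot have?",
       "a": "A foot has no eyes; the question rests on a false premise."}] * 40
gen = [{"q": "Who wrote The Old Man and the Sea?", "a": "Ernest Hemingway"}] * 200
print(len(next(replay_batches(fp, gen))))  # 16
```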
For problem (6), lack of contextual understanding, the solution is: context-aware question rewriting. A question rewriting model for coreference disambiguation is built using the dialogue history, and questions are rewritten (coreference disambiguation, ellipsis completion, and so on) before being answered, improving the question understanding of the downstream question-answering modules.
The large language model driven open domain natural language reasoning question-answering system of the invention is an open question-answering framework that supports complex reasoning. By drawing on the human brain's mechanisms of semantic understanding and symbolic reasoning, the system supports multiple kinds of complex reasoning, such as multi-hop reasoning, logical reasoning, knowledge reasoning, and numerical reasoning, and is interpretable and extensible. Its core technologies are large language models, knowledge base question answering, information retrieval, and neural/symbolic reasoning; its knowledge sources for answering questions are large-scale structured knowledge graphs, unstructured text data, and large language models, and mutual verification and complementation among these different knowledge sources support answering multiple types of complex questions, such as common-sense questions, factual questions, logical reasoning questions, and numerical reasoning questions. The system achieves interpretability of questions and answers through question parsing, answer evidence, answer description and so on, and designs the central computing and management module 2 to build an easily extensible, high-performance question-answering system. The system technology can be migrated to various language-processing intelligent systems to provide information services and decision support.
In summary, the large language model driven open domain natural language reasoning question-answering system is supported by large language models, combined with a pluggable parameter-efficient learning technique, to realize efficient computation and extensible management of large language models within the system; it can obtain knowledge from structured knowledge graphs, unstructured text and large language models to answer questions, so the system covers comprehensive question types, is easy to extend, is interpretable, and has strong generality.
Based on the above embodiments:
As a preferred embodiment, the system further comprises: a reply generation module, whose input is connected with the output of the aggregation reasoning module 4 and which is used for converting the final answer to the rewritten question and its explanatory information into a machine reply, obtaining a reply to the user question and explanatory information for the reply; and a dialogue interaction module, connected with the reply generation module and used for carrying out dialogue interaction on the user question according to the reply and its explanatory information.
In this embodiment, the large language model driven open domain natural language reasoning question-answering system therefore further includes a reply generation module and a dialogue interaction module. The reply generation module provides friendlier machine replies on the basis of the aggregation reasoning result through style conversion, controllable generation, visual presentation and similar means. The dialogue interaction module integrates human-in-the-loop technology into the system, building a question-answering system with dialogue interaction and supporting more accurate or continuous question answering through such interaction.
The above modules can be broadly divided into three categories: question understanding, information retrieval, and reasoning and answering. Each category can be implemented in various ways, and their combination forms three question-answering technical routes: KBQA based on a structured knowledge base and symbolic reasoning, TextQA based on open-domain text knowledge and neural reasoning, and BMQA based on the internal knowledge of large language models and knowledge probing. The three routes answer questions cooperatively.
Referring to fig. 3, fig. 3 is an example of the question-rewriting effect of the system provided by the present invention.
Referring to fig. 4, fig. 4 is a schematic diagram of the question rewriting module provided by the present invention.
As a preferred embodiment, the question rewriting module 1 is specifically configured to perform coreference resolution or ellipsis resolution on the user question according to the question-answer history to obtain the rewritten question; no rewrite prompt is triggered when the rewritten question is identical to the user question, and a rewrite prompt is triggered for the user to select when they differ.
Specifically, the question rewriting module 1 is mainly responsible for rewriting the input question. After the user inputs a question, the question-answering system first translates the Chinese question into English, then uses the question rewriting module 1 to process it and rewrite it into a form convenient for the question-answering modules, which serves as the input of the central computing and management module and of each question-answering module. A question-answering task means that the machine searches a certain information source for an answer to a question posed by the user. Because a single answer does not necessarily give the user all the desired information, the user often asks further questions based on the existing questions and answers to learn more. Since follow-up questions are posed on the basis of information already given, users tend to phrase them briefly, freely using pronouns and ellipsis, which makes the questions more concise. The effect is shown in Table 1:
Table 1 Example of the effect of question rewriting in multi-turn question answering
Turn 1: "Who wrote The Old Man and the Sea?" (no rewrite needed); answer: "Ernest Hemingway"
Turn 2: "What prizes did he win?" rewritten as "What prizes did Ernest Hemingway win?"
Turn 3: "Who is the protagonist of the book?" rewritten as "Who is the protagonist of The Old Man and the Sea?"; answer: "the old fisherman Santiago"
Turn 4: "What fish did he catch?" rewritten as "What fish did the old fisherman Santiago catch in The Old Man and the Sea?"
In this example, the user first asks "Who wrote The Old Man and the Sea?", and the system answers "Ernest Hemingway". When the user wants to ask more about this topic, he asks "What prizes did he win?", where the pronoun "he" must be understood from the previous question-answer history. Through coreference resolution, the question rewriting module 1 recognizes that "he" refers to "Ernest Hemingway", the answer to the previous question, and therefore rewrites the question as "What prizes did Ernest Hemingway win?", enabling the system to understand the user's intent accurately. The user then asks "Who is the protagonist of the book?"; the system rewrites the question as "Who is the protagonist of The Old Man and the Sea?" and answers "the old fisherman called Santiago". Finally the user asks "What fish did he catch?", where besides the pronoun "he" the book title is omitted, so the question rewriting module 1 completes the question as "What fish did the old fisherman Santiago catch in The Old Man and the Sea?", so that the system can answer.
In the interface of the question-answering system, when the user inputs a question the system rewrites it in real time and asks the user, via a prompt under the input box, whether the rewrite is correct. If it is, the user can click the rewritten question, which is then automatically filled into the input box.
With the question rewriting module 1 integrated, the user can keep asking the system about the same topic and freely use pronouns and ellipsis, making the questioning process simpler and more efficient and the interaction between the user and the system more coherent.
Question rewriting flow: the input of the question rewriting module 1 consists of two parts, the history of questions and answers between the user and the system before the current question, and the question currently input by the user; the output is the rewritten question, which should contain no ambiguous information such as coreference or ellipsis, so that the question-answering modules can accurately understand the user's intent and give an answer. The core of the module is a large-scale language model based on the self-attention Transformer architecture (such as a T5-series model or a GPT (Generative Pre-trained Transformer) series model), which encodes the input text into a vector representation and then uses a decoder to output text meeting the task requirements. For the question-rewriting task, the question rewriting module 1 first concatenates the question-answer history with the current question as the input to the language model, and the trained model outputs the rewritten question.
Prompt judgment mechanism: the module determines whether the rewritten question is identical to the original question. If the question text is unchanged, the question did not need rewriting and the rewrite prompt in the interface is not triggered; if the question text has changed, the question contained ambiguous information that had to be resolved with the help of the question-answer history, and the rewrite prompt in the system interface is triggered for the user to confirm.
In summary, the question rewriting module 1 implements a question rewriting flow suited to the intelligent question-answering system. A prompt judgment mechanism is designed within it: by checking whether the rewritten question is identical to the original question, the module decides whether to prompt the user with the rewrite, making the interaction between the user and the system more coherent.
Referring to fig. 5, fig. 5 is a schematic diagram of a central computing and management module according to the present invention.
As a preferred embodiment, the central computing and management module 2 comprises: a knowledge graph base management module, used to provide knowledge query and reasoning execution over the knowledge graph base; a text resource base management module, used to provide text query support over the local text resource base; a large language model computing and management module, used to provide model computation for the large language model library and task-adaptation support through a library of small fine-tuning parameter sets for large language model task adaptation; and a management center module, used to schedule the knowledge graph base, the local text resource base, the large language model library, and the library of small fine-tuning parameter sets according to the requirements of the several sub question-answering modules, to select one or more sub question-answering modules according to the type of the rewritten question, and to reason over the rewritten question so as to obtain one or more candidate answers to the rewritten question together with the explanatory information of those candidate answers.
Specifically, the central computing and management module 2 aims to manage in a unified way the large language model library, the library of small fine-tuning parameter sets (Delta), the knowledge graph base, the text resource base, and so on used by the question-answering system, providing related tasks with knowledge from different sources (such as large language model knowledge, factual knowledge, and textual knowledge), while also providing efficient adaptation support for the large language model, such as support for small fine-tuning parameter sets (Delta), model compression, and inference acceleration. Through unified management and scheduling of large language model computation and knowledge resources, the module reduces the system's GPU memory footprint and the difficulty of adapting to domain tasks, and improves the extensibility of the system.
The central computing and management module 2 is mainly divided into the following sub-modules: the knowledge graph base management module, the text resource base management module, the large language model computing and management module, and the management center module.
The knowledge graph base management module provides factual knowledge support for the question-answering system mainly through large-scale knowledge graph queries, and comprises two parts, the knowledge graph base itself and efficient knowledge base management. The knowledge graph base refers to large-scale knowledge graphs such as Wikidata and DBpedia. Efficient knowledge base management provides knowledge query and reasoning execution, involving: 1) a program (Program) logic layer, i.e., knowledge graph query languages such as KoPL and SPARQL; 2) entity linking, which links an entity mention in a given question text to an entity in the knowledge graph and thereby disambiguates the entity semantically; 3) sub-graph retrieval, which obtains the sub-graph information of entities mentioned in the question text and can be used for entity visualization and for augmenting the background knowledge of the question.
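As an illustration of the query layer, the sketch below issues a SPARQL query against the public Wikidata endpoint using the SPARQLWrapper package; the wrapping function, its name, and the placeholder entity identifier are illustrative assumptions rather than part of the patented system, and the property ID P50 ("author") is believed to be the standard Wikidata identifier.

```python
# Sketch of a knowledge graph query through SPARQL (Wikidata endpoint).
# The helper function and the placeholder QID are illustrative assumptions.
from SPARQLWrapper import SPARQLWrapper, JSON

def query_author(work_qid):
    """Return the labels of the authors (P50) of a Wikidata work entity."""
    sparql = SPARQLWrapper("https://query.wikidata.org/sparql")
    sparql.setQuery(f"""
        SELECT ?authorLabel WHERE {{
          wd:{work_qid} wdt:P50 ?author .
          SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
        }}
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    return [b["authorLabel"]["value"] for b in results["results"]["bindings"]]

work_qid = "Q12345"  # placeholder QID; substitute the identifier returned by entity linking
print(query_author(work_qid))
```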
The text resource base management module mainly provides efficient text query support over local text resources; these resources may be collected encyclopedia text or accumulated domain text. The module comprises three parts: sparse retrieval (such as BM25), text vectorization, and dense vector retrieval (such as HNSW).
The large language model computing and management module aims to provide a unified large language model and efficient computation support to every module of the system, and comprises the large language model library, the Delta library, and sub-modules for efficient large language model computation and management. The large language model library contains large language models such as T5, CPM, and GPT-3. The Delta library is the library formed by the small sets of adaptation parameters (called Delta) obtained by fine-tuning large language models with the Delta Tuning technique, for example Δ1 generated when adapting the model to program (Program) generation, Δ2 for adapting to question rewriting, and Δ3 for adapting the dense retrieval model; Δ1, Δ2, and Δ3 together form the Delta library. Delta Tuning is a parameter-efficient learning technique for large language models: when fine-tuning the large language model on a downstream task j, the large language model parameters Θ are kept fixed, a small set of incremental parameters Δj is added, and only these incremental parameters are tuned, yielding the task-j model with parameters Θ + Δj. This matches the prediction quality of full-parameter fine-tuning of the large language model while touching only a tiny fraction of the parameters, moving a great weight with little force. Fine-tuning only the small incremental parameter set Delta avoids storing a separate large language model for each task: the system keeps a single base model and calls the corresponding Delta from the Delta library for each task, which reduces GPU memory usage and makes the system easy to extend. In addition, the efficient large language model computation and management sub-module includes Delta fine-tuning, efficient large language model compression, efficient large language model inference, and similar components, so that the large language model can be adapted efficiently to each module at reduced computational cost.
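The following PyTorch sketch shows the general idea of Delta Tuning as described above, a frozen base model plus a small trainable increment, using a toy low-rank adapter; the adapter shape, layer choice, and class names are illustrative assumptions and not the specific Delta Tuning implementation used by the system.

```python
# Conceptual sketch of Delta Tuning: freeze the base parameters (Theta) and
# train only a small incremental parameter set (Delta_j) per downstream task.
# The low-rank adapter design and the names here are illustrative assumptions.
import torch
import torch.nn as nn

class LowRankDelta(nn.Module):
    """A small trainable increment added on top of a frozen linear layer."""
    def __init__(self, in_dim, out_dim, rank=8):
        super().__init__()
        self.down = nn.Linear(in_dim, rank, bias=False)
        self.up = nn.Linear(rank, out_dim, bias=False)
        nn.init.zeros_(self.up.weight)  # start as a no-op increment

    def forward(self, x):
        return self.up(self.down(x))

class DeltaTunedLayer(nn.Module):
    """Frozen base layer (Theta) plus task-specific increment (Delta_j)."""
    def __init__(self, base_layer, delta):
        super().__init__()
        self.base = base_layer
        for p in self.base.parameters():
            p.requires_grad = False  # Theta stays fixed
        self.delta = delta           # only Delta_j is trained

    def forward(self, x):
        return self.base(x) + self.delta(x)

base = nn.Linear(768, 768)
layer = DeltaTunedLayer(base, LowRankDelta(768, 768))
trainable = [p for p in layer.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)  # optimizes Delta_j only
```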
The management center module mainly schedules the knowledge graph base, local text resources, large language models, the Delta library, and so on according to the requirements of the individual modules, ensuring that the system runs efficiently. It also selects an appropriate module from KBQA, TextQA, and BMQA to answer the question according to the question type.
In summary, the central computing and management module 2 selects the appropriate modules for answering according to the question type, and by managing the large language model library and the Delta library in a unified way it reduces the computational cost of the system and improves its extensibility.
Referring to fig. 6, fig. 6 is a schematic diagram of an aggregation inference module provided by the present invention.
Referring to fig. 7, fig. 7 is a schematic diagram of an iterative comprehensive reasoning process of the aggregate reasoning module provided by the present invention.
As a preferred embodiment, the aggregation reasoning module 4 is specifically configured to aggregate, from the candidate answers to the rewritten question and their explanatory information produced by one or more sub-modules, the final answer to the rewritten question and its explanatory information, based on a score fusion mechanism, an optimization mechanism driven by user feedback, and an iterative comprehensive reasoning mechanism.
In particular, a more intelligent question-answering system usually needs several different question-answering capabilities. The technical framework of this patent therefore adopts a modular design in which different expert question-answering modules (namely BMQA, TextQA, and KBQA) provide different capabilities. The modules differ in the domains they are good at, the skills they possess, and the types of questions they can handle: some are strong on encyclopedic topics, others on biology or literature; some have commonsense understanding and multi-hop reasoning ability, while others have numerical and logical reasoning ability. By combining the capabilities of the different modules with the type of the question, the final answer is selected from the candidate answers of the expert question-answering modules and returned to the user.
The aggregation reasoning module 4 receives the answer and explanation output by each sub question-answering module together with the corresponding question, and passes its result on to the reply generation module. Take the question "Which is longer, the Yangtze River or the Nile?": different expert modules give different candidate answers, module 1 answering "the Nile", module 2 answering "the Yangtze", and module 3 answering "panda". The answer of module 1 is correct and the evidence it gives is also correct; the evidence of module 2 is factually true, but its answer is wrong, since "the Yangtze is the longest river in China" does not imply that "the Yangtze is longer than the Nile"; the answer given by module 3 is simply irrelevant. The aggregation reasoning module 4 takes the question and the candidate answers of the different question-answering modules as input and finally selects one of the candidates as the final answer, here "the Nile".
Answer selection based on score fusion: the aggregation reasoning module 4 makes its decision from two partial scores. It obtains the final score of each candidate answer, and selects the final answer, with a score fusion mechanism that combines the candidate answer score from the large language model with the confidence score given by the question-answering module. The first partial score is predicted by a large language model, which extracts features of the question and the candidate answers. So that the large language model can attend to all candidate answers given by the question-answering modules at the same time, the question and the candidate answers are concatenated into one character sequence and fed into the model. While representing the question and the candidate answers, the large language model can attend to the whole input sequence and reason by combining the knowledge stored in its parameters with the answers of the different question-answering modules. After the features of the question and the candidate answers have been extracted, a linear layer is used as an answer score regressor to obtain the score of each answer.
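As a rough sketch of this first-part scorer, the code below concatenates the question with all candidate answers into one sequence, encodes it with a pre-trained encoder, and applies a linear regression head at per-candidate marker positions; the model name, the "[CAND]" marker, and the pooling choice are illustrative assumptions rather than the patented implementation.

```python
# Sketch of the LLM-based candidate answer scorer (first partial score).
# Model name, the "[CAND]" marker, and pooling are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class AnswerScorer(nn.Module):
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.tokenizer.add_special_tokens({"additional_special_tokens": ["[CAND]"]})
        self.encoder = AutoModel.from_pretrained(model_name)
        self.encoder.resize_token_embeddings(len(self.tokenizer))
        self.regressor = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, question, candidates):
        # One joint sequence so the model attends to the question and all candidates at once.
        text = question + "".join(f" [CAND] {c}" for c in candidates)
        enc = self.tokenizer(text, return_tensors="pt", truncation=True)
        hidden = self.encoder(**enc).last_hidden_state[0]            # (seq_len, d)
        cand_id = self.tokenizer.convert_tokens_to_ids("[CAND]")
        marker_pos = (enc["input_ids"][0] == cand_id).nonzero(as_tuple=True)[0]
        return self.regressor(hidden[marker_pos]).squeeze(-1)        # one score a_i per candidate

scorer = AnswerScorer()
a = scorer("Which is longer, the Yangtze River or the Nile?",
           ["the Nile", "the Yangtze", "panda"])
```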
The second part of the score of the aggregation reasoning module 4 consists of the confidence scores given by the question-answering modules. This score, supplied by each question-answering module, represents that module's own estimate of the confidence of its candidate answer. Introducing it helps prevent the aggregation reasoning module 4 from relying too heavily on the judgment of a single large language model. Because the first-part candidate answer score is predicted by a single model, that model may overfit during training, which reduces its generalization; a single model also finds it hard to possess the reasoning abilities of several question-answering modules at once, and in actual use it is difficult to guarantee that questions outside the training set will not be misjudged. The confidence score given by each question-answering module embodies that module's own judgment of its candidate answer. On the one hand, since a single model easily overfits, introducing the modules' confidence scores amounts to a joint judgment by different models, which mitigates overfitting; on the other hand, since a single model can hardly possess the reasoning abilities of several question-answering modules at the same time, the confidence scores are based on the reasoning abilities of the different modules themselves, and using them increases, to some extent, the reasoning ability of the aggregation reasoning module 4.
Based on the candidate answer scores from the large language model and the confidence scores given by the question-answering modules, a score fusion mechanism produces the final score, and the candidate answer with the highest score is output as the final answer. Since the scores given by different question-answering modules may differ in scale and distribution, the confidence scores are first linearly normalized so that they lie between 0 and 1. Suppose there are n question-answering modules; for the i-th candidate answer (i = 1, 2, ..., n), let the score given by the large language model be a_i and the normalized confidence score of the question-answering module be b_i. The final score of the i-th candidate answer is obtained by multiplying the two scores, s_i = a_i × b_i, and the candidate with the highest final score is selected as the final answer, i_answer = argmax_i {s_1, s_2, ..., s_n}.
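A minimal numerical sketch of this fusion step is given below; the min-max form of the linear normalization is an assumption, since the text only states that the confidence scores are linearly normalized into [0, 1].

```python
# Sketch of score fusion: s_i = a_i * b_i, answer = argmax_i s_i.
# The min-max normalization is an assumed form of the "linear normalization".
import numpy as np

def fuse_scores(llm_scores, confidences):
    a = np.asarray(llm_scores, dtype=float)
    c = np.asarray(confidences, dtype=float)
    span = c.max() - c.min()
    b = (c - c.min()) / span if span > 0 else np.ones_like(c)  # normalize to [0, 1]
    s = a * b
    return int(np.argmax(s)), s

best, s = fuse_scores([0.9, 0.6, 0.1], [5.0, 2.0, 1.0])
# best == 0 selects the first candidate ("the Nile" in the example above).
```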
Optimization mechanism based on user feedback: a user feedback channel is built into the interaction interface of the question-answering system, so that while using the system the user can give feedback on the current behaviour of the aggregation reasoning module 4, including whether the question-answering modules produced a correct answer and whether the aggregation reasoning module 4 selected the correct one. The system collects these data in the background, and once enough have accumulated, the aggregation reasoning module 4 is further trained and updated, realizing online optimization based on user feedback.
Iterative comprehensive reasoning mechanism: a mechanism for iterative comprehensive reasoning between the aggregation reasoning module 4 and the question-answering modules is constructed. The aggregation reasoning module 4 first receives the answers, explanations, and confidences given by the question-answering modules and selects an answer with the score-fusion-based answer selection mechanism. It then feeds the selected answer, together with its explanation and confidence, back to the remaining question-answering modules to help them reason further, and each module may update its own answer in light of the new information. The aggregation reasoning module 4 then applies the score-fusion-based selection again to the updated answers. If the selected answer no longer changes, it is output as the final answer; otherwise the iteration continues in the same way. Through this iterative comprehensive reasoning mechanism, the question-answering modules exchange information and reasoning via the aggregation reasoning module 4, further improving the accuracy of the question-answering system.
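The loop below sketches this iteration in plain Python; the module interface (an answer(question, feedback) method returning answer, explanation, and confidence) and the iteration cap are assumptions introduced only for illustration.

```python
# Sketch of the iterative comprehensive reasoning loop. The sub-module
# interface and the max_rounds safeguard are illustrative assumptions.
def iterative_reasoning(question, modules, select, max_rounds=5):
    """modules: objects with .answer(question, feedback) -> (answer, explanation, confidence)
    select: the score-fusion answer selection function described above."""
    feedback = None
    chosen = None
    for _ in range(max_rounds):
        results = [m.answer(question, feedback) for m in modules]
        new_choice = select(question, results)   # score fusion over the candidates
        if new_choice == chosen:                 # answer is stable: stop iterating
            break
        chosen = new_choice
        feedback = chosen                        # broadcast answer + explanation + confidence
    return chosen
```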
In conclusion, the aggregation reasoning module 4 builds a score fusion mechanism that combines the large language model prediction score with the question-answering modules' confidence scores, improving its robustness and accuracy; it builds an optimization mechanism based on user feedback, using that feedback to improve the model while the system is running; and it builds an iterative comprehensive reasoning mechanism that realizes multi-module information and reasoning interaction, fully exploiting the reasoning ability of each module and providing multi-step reasoning capability.
As a preferred embodiment, the plurality of sub question-answering modules include: a knowledge base question-answering module, used to obtain candidate answers to the rewritten question and their explanatory information by reasoning over a large-scale knowledge graph; a reading comprehension question-answering module, used to obtain candidate answers to the rewritten question and their explanatory information by reasoning over locally collected text and online internet-retrieved text; and a large language model question-answering module, used to obtain candidate answers to the rewritten question and their explanatory information by reasoning over the linguistic, commonsense, and factual knowledge contained in the large language model.
Referring to fig. 8, fig. 8 is a schematic diagram of a knowledge base question-answering module based on semantic parsing and logic expression execution according to the present invention.
As a preferred embodiment, the knowledge base question-answering module includes: a semantic parsing module, used to identify the entities in the rewritten question based on entity linking and entity substitution, to substitute those entities, and to map the semantic parse of the rewritten question into a logical expression executable by the logical expression execution module supporting neural-symbolic hybrid reasoning; and the logical expression execution module supporting neural-symbolic hybrid reasoning, used to query information from a structured knowledge base or an unstructured text base according to the logical expression, so as to obtain candidate answers to the rewritten question and their explanatory information.
In particular, the knowledge base question-answering module (KBQA), built on a structured knowledge base, is mainly used to answer factual questions, including but not limited to querying knowledge about an object, comparing attributes of objects, and filtering objects that satisfy a condition (examples are shown in Table 2 below). The module obtains answers by reasoning over logical expressions and can display the intermediate reasoning operations and results, giving it good interpretability. It can perform exact symbolic reasoning or neural-symbolic hybrid reasoning over the logical expression, so the accuracy of answer prediction is high.
Table 2: KBQA examples
The knowledge base question-answering module takes the rewritten question produced by the rewriting module (Rewriter) as input, builds a semantic parser based on entity linking and a large language model to generate a logical expression, and a subsequent neural-symbolic reasoning module executes this expression, retrieving related information from the knowledge base and reasoning to obtain the answer. Finally, the answer is passed to the aggregation reasoning module 4. The KBQA module includes two sub-modules:
A semantic parsing module (Entity-aware Semantic Parser) based on entity linking and entity substitution, which aims to map the question into an executable logical expression that guides the reasoning direction of the subsequent reasoning module. A non-autoregressive, encoder-decoder generative large language model (such as T5) is used as the base model, and the semantic parser is obtained by supervised training on paired question-logical expression data. In the training stage, training data may be scarce; the data are therefore augmented by rule-based generation, model-based generation, word-order shuffling, and similar methods to form a large-scale training set that improves the performance of the semantic parser.
In the application stage, the input question may contain entities that never appear in the training set. To improve the semantic parser's generalization to new entities, a semantic parsing method based on entity linking and entity substitution is used. The entities in the question are first identified by entity linking and replaced with placeholders (for example, "When was the author of The Old Man and the Sea born?" becomes "When was the author of <placeholder-1> born?"); the substituted question is then mapped by the semantic parser to the logical expression "Find(<placeholder-1>) Relate(author) QueryAttr(date of birth)"; finally, the placeholder in the logical expression is restored to the target entity, yielding the executable logical expression "Find(The Old Man and the Sea) Relate(author) QueryAttr(date of birth)".
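The sketch below walks through this placeholder substitution and restoration around a semantic parser; the entity linker and the parser call are stubbed out with placeholder functions, so everything except the substitute/restore bookkeeping is an illustrative assumption.

```python
# Sketch of entity substitution around the semantic parser. The entity linker
# and the parser itself are stubs; only the placeholder bookkeeping is shown.
def link_entities(question):
    """Stub entity linker: returns a list of surface mentions found in the question."""
    return ["The Old Man and the Sea"]  # assumed output for the running example

def parse_to_logic(question_with_placeholders):
    """Stub for the T5-based semantic parser."""
    return "Find(<placeholder-1>) Relate(author) QueryAttr(date of birth)"

def parse_question(question):
    mentions = link_entities(question)
    placeholder_map = {}
    for i, mention in enumerate(mentions, start=1):
        ph = f"<placeholder-{i}>"
        placeholder_map[ph] = mention
        question = question.replace(mention, ph)   # substitute entity with placeholder
    logic = parse_to_logic(question)               # parse the entity-agnostic question
    for ph, mention in placeholder_map.items():
        logic = logic.replace(ph, mention)         # restore the target entity
    return logic

print(parse_question("When was the author of The Old Man and the Sea born?"))
# -> Find(The Old Man and the Sea) Relate(author) QueryAttr(date of birth)
```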
A logical expression execution module (Neural-symbolic Executor) supporting neural-symbolic hybrid reasoning, which, following the logical expression, queries information from a structured knowledge base (such as Wikidata) or an unstructured text base (such as Wikipedia) and executes the expression step by step; for example, the Find(The Old Man and the Sea) operation requires the executor to locate the entity The Old Man and the Sea in the knowledge base, and Relate(author) then queries the author of that entity. Symbolic reasoning retrieves related information directly from the structured knowledge base, which gives higher precision, but because the structured knowledge base covers only part of the relevant knowledge, the recall of related information is lower; to improve recall and the generalization ability of the executor, neural reasoning tries to infer the related information from an unstructured text base, which covers a large amount of knowledge and therefore yields higher recall.
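To make the step-by-step execution concrete, the sketch below runs Find/Relate/QueryAttr operators over a toy in-memory knowledge base; the dictionary layout and operator signatures are illustrative assumptions and not the KoPL executor itself.

```python
# Toy symbolic executor for Find / Relate / QueryAttr over an in-memory KB.
# The KB layout and operator signatures are illustrative assumptions.
TOY_KB = {
    "The Old Man and the Sea": {"author": ["Ernest Hemingway"]},
    "Ernest Hemingway": {"date of birth": ["1899-07-21"]},
}

def find(entity):
    return [entity] if entity in TOY_KB else []

def relate(entities, relation):
    return [obj for e in entities for obj in TOY_KB.get(e, {}).get(relation, [])]

def query_attr(entities, attribute):
    return [val for e in entities for val in TOY_KB.get(e, {}).get(attribute, [])]

# Executing "Find(The Old Man and the Sea) Relate(author) QueryAttr(date of birth)":
step1 = find("The Old Man and the Sea")
step2 = relate(step1, "author")
answer = query_attr(step2, "date of birth")
print(answer)  # ['1899-07-21'] -- each intermediate step can be shown as an explanation
```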
In addition, the KBQA module designs a knowledge reasoning channel that fuses sub-graph retrieval of the relevant entities (KBQuery), namely KBQuery generation + sub-graph retrieval + neural-symbolic reasoning. The rewritten question is first fed into the Program generation model to generate a sub-graph query statement (KBQuery); the sub-graph retrieval function of the knowledge graph base management module then retrieves a knowledge sub-graph from the knowledge graph; finally, the retrieved sub-graph is serialized into text, via JSON or direct concatenation, and fed into a large language model for knowledge reasoning, giving the knowledge base question answering a fuzzy knowledge reasoning capability.
In summary, the knowledge base question-answering module offers an efficient data augmentation method: when training samples are scarce, a large amount of diverse data can be constructed quickly for training, improving the model's question-answering and generalization abilities. It offers entity linking and entity substitution: the entities in the question are located and replaced with placeholders, which are restored to the target entities after the question has been converted into a logical expression, strengthening the model's generalization to new entities. And it offers neural-symbolic reasoning: the module combines the advantages of neural and symbolic reasoning, with symbolic reasoning contributing accuracy and interpretability and neural reasoning contributing good generalization.
Referring to fig. 9, fig. 9 is a schematic diagram of a reading understanding module based on open domain knowledge reasoning according to the present invention.
Referring to fig. 10, fig. 10 is a schematic diagram of an evidence retrieval module according to the present invention.
Referring to fig. 11, fig. 11 is a schematic diagram of a knowledge reasoning model provided by the present invention.
As a preferred embodiment, the reading comprehension question-answering module includes: an evidence retrieval module, used to retrieve local text data and internet data to obtain evidence candidates; and a knowledge reasoning model, used to obtain candidate answers to the rewritten question and their explanatory information by reasoning over the evidence candidates.
Specifically, the reading comprehension question-answering module mainly focuses on answering open-domain factual questions and has some logical and numerical reasoning ability. Based on the user's question, it acquires related unstructured text knowledge from the internet and from a domain-specific retrieval corpus, and obtains the answer through knowledge integration and reasoning. Its characteristics lie in two aspects: 1) it can exploit massive unstructured text knowledge, which is easy to acquire, wide in coverage, quickly updated, and highly current, supporting a wide range of open-domain questions; 2) its logical reasoning and numerical reasoning modules integrate textual information more fully, further improving knowledge utilization and supporting complex questions.
The reading comprehension question-answering module takes the rewritten question produced by the rewriting module (Rewriter) as input, retrieves unstructured text knowledge related to the question, integrates the information with a reasoning model (Reasoner), and generates the answer to the question. The TextQA module includes two sub-modules:
The evidence retrieval (Evidence Retrieval) module supports retrieving relevant text passages from locally collected, domain-specific retrieval corpora and from the online internet, for subsequent information integration and reasoning. It mainly consists of two parts: local text data retrieval and internet data retrieval.
Local text data retrieval uses a hybrid of a sparse retrieval algorithm and dense retrieval. After candidate document sets are obtained by sparse retrieval (BM25) and by dense vector retrieval, the two candidate sets are merged, and the sparse BM25 relevance score is summed with the dense vector relevance score to obtain the final relevance score. The sparse retrieval algorithm performs stably across domains, while the dense retrieval algorithm captures semantics better, so mixing the two serves the knowledge retrieval needs of an open-domain question-answering system better.
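The sketch below illustrates such a hybrid score, summing BM25 scores from the rank_bm25 package with dense similarity scores from a placeholder encoder; the encoder stub, the whitespace tokenization, and the equal-weight sum are illustrative assumptions.

```python
# Sketch of hybrid retrieval: final score = BM25 score + dense similarity.
# The dense encoder is a stub; tokenization and weighting are assumptions.
import numpy as np
from rank_bm25 import BM25Okapi

corpus = ["The Nile is about 6650 km long.",
          "The Yangtze is the longest river in China.",
          "Pandas live in the mountains of central China."]

def encode(texts):
    """Stub dense encoder; a real system would use a Delta-tuned retriever."""
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 64))

bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
doc_vecs = encode(corpus)

def hybrid_search(query, top_k=2):
    sparse = bm25.get_scores(query.lower().split())
    q_vec = encode([query])[0]
    dense = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    final = sparse + dense                        # sum of the two relevance scores
    order = np.argsort(final)[::-1][:top_k]
    return [(corpus[i], float(final[i])) for i in order]

print(hybrid_search("Which is longer, the Yangtze or the Nile?"))
```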
For dense vector retrieval, a technique of dense vector retrieval plus re-ranking based on a large language model is adopted. First, a large language model is fine-tuned on training data with a supervised or unsupervised dense retrieval algorithm (such as the Dense Passage Retriever (DPR) or the unsupervised dense retriever Contriever), where the fine-tuning uses the parameter-efficient Delta Tuning method. Dense vector retrieval performs coarse-grained evidence screening based on the semantic similarity between the user's question and the candidate texts, and a re-ranking model performs fine-grained evidence screening on top of this by deeply matching the user's question against the retrieved texts. This coarse-to-fine, two-stage evidence retrieval acquires relevant text knowledge efficiently and accurately. The re-ranking model here is a large language model fine-tuned on supervised data, also implemented with Delta Tuning.
The core of dense vector retrieval is semantic vector matching. When the text collection is large, dense vectors must be retrieved at scale, and storing the semantic vectors and computing dot products becomes a serious obstacle to deployment. To address this, the Hierarchical Navigable Small World (HNSW) vector search algorithm is further employed. HNSW is a graph-based vector retrieval algorithm that organizes the vectors into a layered graph structure; the search starts from the sparse upper layers, performs nearest-neighbour search there, and descends to the denser bottom layer, quickly locating the nodes closest to the query vector.
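The following sketch uses the hnswlib package to build and query such an index; the dimensionality, index parameters, and random stand-in vectors are illustrative assumptions.

```python
# Sketch of approximate nearest-neighbour search with an HNSW index (hnswlib).
# Dimensionality, ef/M parameters, and the random vectors are assumptions.
import numpy as np
import hnswlib

dim, num_docs = 64, 10_000
doc_vectors = np.random.rand(num_docs, dim).astype("float32")  # stand-in for encoded passages

index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=num_docs, ef_construction=200, M=16)  # layered graph construction
index.add_items(doc_vectors, np.arange(num_docs))
index.set_ef(64)  # search-time breadth/accuracy trade-off

query = np.random.rand(dim).astype("float32")
labels, distances = index.knn_query(query, k=5)  # ids and distances of the 5 nearest passages
```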
Retrieval result combination: the internet retrieval texts obtained through a search engine API are combined with the local text retrieval results to obtain the final query result, which is fed directly to the downstream question-answering model as evidence candidates (Evidence Candidates) for reasoning and answer generation.
The knowledge reasoning model (Knowledge Reasoner) uses an encoder-decoder framework in which the encoder consists of a text representation module (Encoder), a logical reasoning module (Logic Reasoner), and a numerical reasoning module (Numerical Reasoner). The text representation module encodes the user's question and the retrieved evidence into an encoded representation matrix; the logical reasoning module and the numerical reasoning module then reason over this matrix. The logical reasoning module uses a Transformer structure and realizes implicit multi-hop logical reasoning in a continuous semantic space through multiple iterations, with the reasoning result represented by a fixed-size matrix. The numerical reasoning module also uses a Transformer structure and outputs a directed acyclic computation graph whose nodes represent mathematical operations and whose edges represent dependencies between operations; the numerical results computed by a calculator are then used to refine the node representations of the graph. Finally, the encoded representation matrix of the text representation module, the reasoning matrix of the logical reasoning module, and the graph representation matrix of the numerical reasoning module are fed to the decoder to generate the final answer.
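A compressed PyTorch sketch of the encoder side is given below, with an encoder producing the representation matrix and an iterated Transformer layer acting as the logical reasoner over a fixed number of reasoning slots; the slot mechanism, layer sizes, and the omission of the numerical reasoner and the decoder are all simplifying assumptions.

```python
# Compressed sketch of the Knowledge Reasoner encoder side: a text encoder plus
# an iterated Transformer layer as the logical reasoner over fixed-size slots.
# Slot count, layer sizes, and the omitted numerical reasoner/decoder are assumptions.
import torch
import torch.nn as nn

class LogicReasoner(nn.Module):
    def __init__(self, d_model=256, num_slots=8, iterations=3):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(num_slots, d_model))   # fixed-size reasoning state
        self.layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.iterations = iterations

    def forward(self, memory):
        # memory: (batch, seq_len, d_model) encoded question + evidence
        state = self.slots.unsqueeze(0).expand(memory.size(0), -1, -1)
        for _ in range(self.iterations):                              # implicit multi-hop reasoning
            state = self.layer(state, memory)
        return state                                                   # (batch, num_slots, d_model)

class ReasonerEncoder(nn.Module):
    def __init__(self, vocab_size=32000, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.text_encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.logic = LogicReasoner(d_model)

    def forward(self, token_ids):
        rep = self.text_encoder(self.embed(token_ids))   # encoded representation matrix
        logic_rep = self.logic(rep)                      # fixed-size reasoning matrix
        return torch.cat([rep, logic_rep], dim=1)        # concatenated input for the decoder

tokens = torch.randint(0, 32000, (2, 48))                # toy question + evidence token ids
decoder_memory = ReasonerEncoder()(tokens)
```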
In conclusion, the reading comprehension question-answering module builds a logical reasoning module and a numerical reasoning module, realizing multi-step continuous semantic reasoning and discrete numerical reasoning, so that the knowledge reasoning model makes fuller use of the retrieved unstructured text evidence and supports complex user questions.
Referring to fig. 12, fig. 12 is a schematic diagram of a big language model question-answering module based on knowledge detection and demonstration provided by the present invention.
As a preferred embodiment, the large language model question-answering module includes: an answer generation module based on knowledge probing, which uses a non-autoregressive generative pre-trained language model as the large language model and adapts it to the answer generation task through a pluggable, parameter-efficient learning technique that tunes only a small number of parameters; and an explanation generation module based on demonstration, which, building on the answer-generation task adaptation, uses an autoregressive generative large language model to construct examples and generate explanations, so as to obtain candidate answers to the rewritten question and their explanatory information. The large language model question-answering module is also used to judge, via a preset-standpoint question refutation mechanism, whether the rewritten question is a preset-standpoint question and to generate a refuting explanation with the large language model.
Specifically, the large language model question-answering module mainly supports questions requiring fuzzy reasoning and commonsense-aware reasoning. It compensates for and supplements the knowledge-graph-based and reading-comprehension-based question-answering modules on questions they cannot handle well, such as commonsense questions (for example, preset-standpoint questions), questions requiring creativity (such as writing poems or essays), and questions about opinions and attitudes. By carefully constructing an activation corpus, the module effectively activates the question-answering and knowledge capabilities stored in the large language model. In addition, it is equipped with an internal explanation module through which the large language model itself gives a free-text explanation of its own answer.
The large language model question-answering module takes the rewritten question produced by the rewriting module (Rewriter) as input, uses the large language model as the knowledge source to generate the answer and its explanation, assigns an answer confidence, and then passes the result to the aggregation reasoning module 4. The BMQA module includes two sub-modules:
1) Answer generation based on knowledge probing (Knowledge Probing): a non-autoregressive generative PLM is used as the large language model (such as T5 or UnifiedQA), and answer generation is adapted to the task through Delta Tuning (for example, tasks such as judging whether a question has a preset standpoint, or answer generation). The central computing and management module 2 provides the large language model and efficient Delta Tuning computation support here. This sub-module consists of knowledge probe construction and the PLM. Knowledge probe construction turns answer generation into a cloze question by defining a prompt template (Prompt Template): for the question Question = "How many eyes does the sun have?", the template can be constructed as "Question: {Question} The answer to this question is ____", where {Question} is filled with the user's question and "____" is the position the model fills in.
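A minimal sketch of this probing step with an encoder-decoder model is shown below; the model name, the template wording, and the decoding settings are illustrative assumptions.

```python
# Sketch of knowledge probing: wrap the question in a cloze-style prompt and
# let a seq2seq PLM fill in the blank. Model name and template are assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "allenai/unifiedqa-t5-base"  # assumed choice of a QA-tuned T5 variant
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def probe_answer(question):
    prompt = f"Question: {question} The answer to this question is"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=32)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(probe_answer("How many eyes does the sun have?"))
```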
2) Explanation generation based on demonstration (Interpretation): an autoregressive generative PLM is used as the large language model (such as a GPT-series model), and the sub-module comprises two parts, example construction and explanation generation by the large language model. Example construction builds the input mainly from a Few-shot/Zero-shot example template of the form "Example 1. ... Example k. Question: {Question} Answer: {Answer} Explanation: ____".
Here, k examples are given, each consisting of three parts, a question, an answer, and an explanation, with {Question} and {Answer} standing for the filled-in question and answer. For example, when k = 2 the constructed example is as follows (a prompt-assembly sketch follows the example):
problem 1: what is the case with a computer for san jose? Answer 1: not, explain 1: the computer has not been invented at that time during the resurrection period of the literature.
Problem 2: yao Ming and james who is higher? Answer 2: yao Ming, interpretation 2: yao Ming height was 229cm and the james lift was 203cm.
Problem 3: is the sun with several eyes? Answer 3: no eyes, interpretation 3: ____
It should be noted that further adapting the explanation generation task with Delta Tuning on top of the demonstrations helps to further improve the explanation generation quality of the large language model, for example its ability to refute preset-standpoint questions. Here, too, the central computing and management module 2 provides the large language model and Delta Tuning support.
Other functions: refutation of preset standpoints by BMQA. A preset-standpoint question is one whose premise contains a commonsense or factual error; for example, the question "How many eyes does the sun have?" presupposes that "the sun has eyes", and the question "Which computer did Shakespeare use to write his plays?" presupposes that "the computer had already been invented in Shakespeare's time". On such questions, even models like GPT-3 with 175B parameters give wrong replies such as "two eyes" or "an Apple computer" and lack the ability to explain their disagreement. To activate BMQA's ability to refute preset standpoints, two steps are taken: 1) the answer generation module additionally judges whether the question is a preset-standpoint question; 2) a refuting explanation is generated, with the explanation generation module activating the large language model's refutation ability through a preset-standpoint question-answer dataset. Both steps combine Pareto multi-objective optimization, Delta Tuning, data replay, and few-shot learning to give BMQA the ability to answer and refute questions containing a preset standpoint. For this purpose, a preset-standpoint question-answer dataset can be constructed for model training, in the form of Table 3:
Table 3: Preset-standpoint question-answer dataset
In conclusion, the large language model question-answering module builds a BMQA technical framework based on knowledge probing and demonstration, in which the large language model and the Delta fine-tuning parameters are managed and plugged in uniformly through the central computing and management module 2, giving the module strong extensibility and flexible task-data adaptation. It supports preset-standpoint questions: by combining Pareto multi-objective optimization and data replay, the module realizes refuting explanations for preset-standpoint questions and possesses commonsense reasoning ability. At present, large language models such as OPT, GPT-3, and T5-XXL cannot handle such questions well.
Referring to fig. 13, fig. 13 is a first schematic diagram of an interface of the large language model driven open domain natural language reasoning question-answering system provided by the present invention.
Referring to fig. 14, fig. 14 is a second schematic diagram of the interface of the large language model driven open domain natural language reasoning question-answering system provided by the present invention.
Referring to fig. 15, fig. 15 is a third schematic diagram of the interface of the large language model driven open domain natural language reasoning question-answering system provided by the present invention.
The invention has the following beneficial effects:
(1) Comprehensive coverage of question types. The system can simultaneously support factual knowledge questions, commonsense knowledge questions, and text knowledge questions, and has capabilities such as multi-hop reasoning, numerical reasoning, and commonsense reasoning. The framework of the open-domain, complex-scenario natural language reasoning intelligent question-answering system draws on the human brain's language understanding mechanism, adopts a hybrid of symbolic reasoning and neural reasoning, and can answer questions using a large-scale knowledge graph, large-scale unstructured text, and a large language model as knowledge sources. The system adopts existing frameworks for knowledge base question answering and reading comprehension and innovates at the module level, specifically: 1) a reasoner with logical and numerical reasoning for TextQA; 2) a preset-standpoint refutation mechanism for BMQA; 3) feedback-driven aggregation reasoning; 4) a Program generalization mechanism in KBQA (entity substitution, data augmentation, and neural-symbolic reasoning reinforcement learning).
(2) Strong generality. The large language model serves as the system's base model and runs through the whole system, offering good prediction performance and strong adaptability to complex scenarios. Large language models are used in the program (Program) generation of KBQA, in the dense retrieval, re-ranking, and reading comprehension reasoner of TextQA, in the large language model question answering, and in the explanation generation of the large language model.
(3) Strong extensibility. To address the large hardware memory and GPU memory requirements and the difficulty of domain adaptation brought by large language models and large-scale knowledge graphs, the system designs a central computing and management platform that provides large language model scheduling and adaptation based on pluggable, parameter-efficient fine-tuning of a small number of parameters, supporting dynamic adaptation and flexible invocation of the model.
(4) Strong interpretability. The system adopts a modular design and realizes explanation functions such as question analysis and answer analysis through program (Program) execution visualization, evidence display, and answer explanation. A confidence score is given for the answer output of each module, helping the user select reliable answers. KoPL is used to present the reasoning process, evidence recall presents the sources of the answer, and the large language model's own explanation of its answer is displayed.
(5) Rejection and refutation capability. In the large language model question answering, a method of activating the large language model's refutation ability with preset-standpoint data is provided, namely an activation strategy for refuting preset-standpoint questions that combines Pareto multi-objective optimization, Delta Tuning, data replay, and few-shot learning.
Referring to fig. 16, fig. 16 is a flow chart of a large language model driven open domain natural language reasoning question-answering method provided by the present invention.
The invention also provides a large language model driven open domain natural language reasoning question-answering method, which is suitable for the large language model driven open domain natural language reasoning question-answering system, and comprises the following steps:
1601: the question rewriting module 1 rewrites the user question to obtain a rewritten question;
1602: the central computing and management module 2 outputs the rewritten question and the large language model computation and knowledge resources required by the question-answering core engine module to one or more sub question-answering modules in the question-answering core engine module 3 according to the type of the rewritten question;
1603: the question-answering core engine module 3 obtains one or more candidate answers to the rewritten question and their explanatory information by reasoning over the rewritten question and the computation and knowledge resources of the large language model;
1604: the aggregation reasoning module 4 aggregates and reasons, from the one or more candidate answers to the rewritten question and their explanatory information, to obtain the final answer to the rewritten question and the explanatory information of the final answer.
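Read end to end, the four steps can be sketched as a simple orchestration function; the module interfaces and names below are illustrative assumptions introduced only to show the data flow of the method.

```python
# Sketch of the overall question-answering flow (steps 1601-1604).
# Module interfaces and names are illustrative assumptions, not the patented API.
def answer_user_question(user_question, history, rewriter, center, engine, aggregator):
    # 1601: rewrite the user question using the dialogue history
    rewritten = rewriter.rewrite(history, user_question)

    # 1602: schedule resources and pick sub question-answering modules by question type
    resources, sub_modules = center.dispatch(rewritten)

    # 1603: each selected sub-module reasons to candidate answers plus explanations
    candidates = [m.answer(rewritten, resources) for m in sub_modules]

    # 1604: aggregate reasoning selects the final answer and its explanation
    final_answer, explanation = aggregator.aggregate(rewritten, candidates)
    return final_answer, explanation
```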
For the description of the large language model driven open domain natural language reasoning question-answering method provided by the present invention, please refer to the system embodiment above; further description is omitted here.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. The large language model driven open domain natural language reasoning question-answering system is characterized by comprising a question rewriting module, a central computing and managing module, a question-answering core engine module and an aggregation reasoning module; the question-answering core engine module comprises a plurality of sub-question-answering modules, and the question reasoning logics of the plurality of sub-question-answering modules are different;
The output end of the problem rewriting module is connected with the input end of the central computing and management module and is used for rewriting the user problem to obtain a rewriting problem;
the output end of the central computing and managing module is connected with the input end of the question-answering core engine module and is used for managing computing and knowledge resources of the large language model and outputting the computing and knowledge resources of the large language model required by the rewritten questions and the question core engine module to one or more sub-question-answering modules in the question-answering core engine module according to the type of the rewritten questions;
the output end of the question-answer core engine module is connected with the input end of the aggregation reasoning module and is used for obtaining one or more candidate answers of the rewritten questions and the interpretable explanatory information of the candidate answers according to calculation and knowledge resource reasoning of the rewritten questions and the large language model;
the aggregation reasoning module is used for aggregating and reasoning to obtain a final answer of the rewritten question and the explanatory information of the final answer according to one or more candidate answers of the rewritten question and the explanatory information of the candidate answers.
2. The large language model driven open domain natural language reasoning question-answering system of claim 1, further comprising:
the input end of the answer generation module is connected with the output end of the aggregation reasoning module, and the answer generation module is used for carrying out machine answer conversion on the final answer of the rewritten question and the explanatory information of the final answer to obtain a answer of the user question and the explanatory information of the answer;
and the dialogue interaction module is connected with the reply generation module and is used for performing dialogue interaction with the user question according to the reply answer of the user question and the explanatory description information of the reply answer.
3. The large language model driven open domain natural language reasoning question-answering system according to claim 1, wherein the question rewriting module is specifically configured to perform coreference resolution or ellipsis resolution on the user question according to historical question-answer data to obtain the rewritten question, not to trigger a rewrite prompt when the rewritten question is identical to the user question, and to trigger a rewrite prompt for the user to select when the rewritten question differs from the user question.
4. The large language model driven open domain natural language reasoning question-answering system of claim 1, wherein the central computing and management module comprises:
the knowledge graph base management module is used for providing knowledge query and reasoning execution of the knowledge graph base;
the text resource library management module is used for providing text query support of the local text resource library;
the large language model calculation and management module is used for providing model calculation and task adaptation support of a large language model library and a small amount of fine tuning parameter library for large language model task adaptation;
and the management center module is used for scheduling the knowledge graph library, the local text resource library, the large language model library and a small amount of fine tuning parameter library adapted to the large language model task according to the plurality of sub question-answer modules, selecting one or more sub question-answer modules from the plurality of sub question-answer modules according to the type of the rewritten problem, and reasoning the rewritten problem so as to obtain one or more candidate answers of the rewritten problem and explanatory information of the candidate answers.
5. The large language model driven open domain natural language reasoning question-answering system according to claim 1, wherein the aggregation reasoning module is specifically configured to aggregate and reason, from the one or more candidate answers to the rewritten question and the interpretable explanatory information of the candidate answers, the final answer to the rewritten question and the interpretable explanatory information of the final answer, based on a score fusion mechanism, an optimization mechanism based on user feedback, and an iterative comprehensive reasoning mechanism.
6. The large language model driven open domain natural language reasoning questioning and answering system according to any one of claims 1 to 5, wherein the plurality of sub-questioning and answering modules include:
the knowledge base question-answering module is used for reasoning and obtaining candidate answers of the rewritten questions and explanatory description information of the candidate answers according to the large-scale knowledge graph;
the reading and understanding question-answering module is used for obtaining candidate answers of the rewritten questions and explanatory description information of the candidate answers by reasoning according to local collection text and/or online internet retrieval text;
and the large language model question-answering module is used for reasoning and obtaining the candidate answers of the rewritten questions and the explanatory description information of the candidate answers according to the linguistic knowledge, the common sense knowledge and the fact knowledge in the large language model.
7. The large language model driven open domain natural language reasoning question-answering system of claim 5, wherein the knowledge base question-answering module comprises:
the semantic analysis module is used for identifying the entity in the rewrite problem based on entity link and entity replacement, replacing the entity, and mapping the semantic analysis of the rewrite problem into a logic expression executable by the logic expression execution module supporting neural-symbol hybrid reasoning;
The logic expression execution module supporting the neural-symbol mixed reasoning is used for inquiring information from a structured knowledge base or an unstructured text base based on the logic expression so as to reasoning and obtain candidate answers of the rewritten questions and explanatory information of the candidate answers.
8. The large language model driven open domain natural language reasoning question-answering system of claim 5, wherein the reading comprehension question-answering module comprises:
the evidence retrieval module is used for retrieving the local text data and the Internet data to obtain evidence candidates;
and the knowledge reasoning model is used for reasoning and obtaining the candidate answers of the rewritten questions and the explanatory information of the candidate answers according to the evidence candidates.
9. The large language model driven open domain natural language reasoning question-answering system of claim 5, wherein the large language model question-answering module comprises:
an answer generation module based on knowledge probing, configured to use a non-autoregressive generative pre-trained language model as the large language model and to adapt answer generation to the task through a pluggable, parameter-efficient learning technique that tunes only a small number of parameters;
an explanation generation module based on demonstration, configured to, on the basis of the answer generation task adaptation, perform example construction and large language model explanation generation using an autoregressive generative large language model, so as to obtain, by reasoning, candidate answers to the rewritten question and explanatory information of the candidate answers;
the large language model question-answering module is further configured to judge, according to a preset-standpoint refutation mechanism, whether the rewritten question is a preset-standpoint question and to generate a refuting explanation with the large language model.
10. A large language model driven open domain natural language reasoning question-answering method, characterized in that the method is applicable to the large language model driven open domain natural language reasoning question-answering system according to any one of claims 1 to 9, comprising:
the problem rewriting module rewrites the user problem to obtain a rewritten problem;
the central computing and managing module outputs computing and knowledge resources of the large language model required by the rewrite problem and the problem core engine module to one or more sub-question-answering modules in the question-answering core engine module according to the type of the rewrite problem;
the question-answering core engine module obtains one or more candidate answers to the rewritten questions and explanatory information of the candidate answers according to calculation and knowledge resource reasoning of the rewritten questions and the large language model;
and the aggregation reasoning module aggregates and reasoning to obtain a final answer of the rewritten question and the explanatory information of the final answer according to one or more candidate answers of the rewritten question and the explanatory information of the candidate answers.
Priority application: CN202310414399.7A, filed 2023-04-18 (priority date 2023-04-18) - Open domain natural language reasoning question-answering system and method driven by large language model.
Publication: CN116932708A, published 2023-10-24.
Family ID: 88379524.
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117874179B (en) * | 2023-11-02 | 2024-06-04 | 电投云碳(北京)科技有限公司 | CCER intelligent question answering method and device, electronic equipment and storage medium |
CN117874179A (en) * | 2023-11-02 | 2024-04-12 | 电投云碳(北京)科技有限公司 | CCER intelligent question-answering method and device, electronic equipment and storage medium |
CN117688186A (en) * | 2023-11-14 | 2024-03-12 | 中国科学院软件研究所 | Automatic correction method and device for large language model illusion problem based on knowledge graph |
CN117271700B (en) * | 2023-11-23 | 2024-02-06 | 武汉蓝海科创技术有限公司 | Construction system of equipment use and maintenance knowledge base integrating intelligent learning function |
CN117271700A (en) * | 2023-11-23 | 2023-12-22 | 武汉蓝海科创技术有限公司 | Device use and maintenance knowledge base integrating intelligent learning function |
CN117312535A (en) * | 2023-11-28 | 2023-12-29 | 中国平安财产保险股份有限公司 | Method, device, equipment and medium for processing problem data based on artificial intelligence |
CN117634468B (en) * | 2023-11-30 | 2024-05-28 | 北京智谱华章科技有限公司 | Universal text quality evaluation method based on large language model |
CN117634468A (en) * | 2023-11-30 | 2024-03-01 | 北京智谱华章科技有限公司 | Universal text quality evaluation method based on large language model |
CN117688223A (en) * | 2023-12-12 | 2024-03-12 | 山东浪潮科学研究院有限公司 | Specific domain intelligent question-answering system and method based on large language model |
CN117708347A (en) * | 2023-12-14 | 2024-03-15 | 北京英视睿达科技股份有限公司 | Method and system for outputting multi-mode result by large model based on API (application program interface) endpoint |
CN117453895A (en) * | 2023-12-20 | 2024-01-26 | 苏州元脑智能科技有限公司 | Intelligent customer service response method, device, equipment and readable storage medium |
CN117453895B (en) * | 2023-12-20 | 2024-03-01 | 苏州元脑智能科技有限公司 | Intelligent customer service response method, device, equipment and readable storage medium |
CN117828050A (en) * | 2023-12-29 | 2024-04-05 | 北京智谱华章科技有限公司 | Traditional Chinese medicine question-answering method, equipment and medium based on long-document retrieval enhancement generation |
CN117910581A (en) * | 2024-01-22 | 2024-04-19 | 上海算法创新研究院 | Quotation writing method oriented to text automatic generation |
CN117688164B (en) * | 2024-02-03 | 2024-05-17 | 北京澜舟科技有限公司 | Illusion detection method, system and storage medium based on large language model |
CN117688164A (en) * | 2024-02-03 | 2024-03-12 | 北京澜舟科技有限公司 | Illusion detection method, system and storage medium based on large language model |
CN117892818B (en) * | 2024-03-18 | 2024-05-28 | 浙江大学 | Large language model rational content generation method based on implicit thinking chain |
CN117892818A (en) * | 2024-03-18 | 2024-04-16 | 浙江大学 | Large language model rational content generation method based on implicit thinking chain |
CN117952022A (en) * | 2024-03-26 | 2024-04-30 | 杭州广立微电子股份有限公司 | Yield multi-dimensional interactive system, method, computer equipment and storage medium |
CN118135592A (en) * | 2024-05-09 | 2024-06-04 | 支付宝(杭州)信息技术有限公司 | User service method and device based on medical LLM model |
CN118194996A (en) * | 2024-05-14 | 2024-06-14 | 智慧眼科技股份有限公司 | Knowledge graph-based large-model reliable medical knowledge injection method and device |
CN118260406A (en) * | 2024-05-29 | 2024-06-28 | 山东浪潮科学研究院有限公司 | Text retrieval enhancement generation method and system with enhanced diversity |
Similar Documents
Publication | Title
---|---
CN116932708A (en) | Open domain natural language reasoning question-answering system and method driven by large language model
CN112487182B (en) | Training method of text processing model, text processing method and device
Uc-Cetina et al. | Survey on reinforcement learning for language processing
Wang et al. | Interactive natural language processing
CN112364660B (en) | Corpus text processing method, corpus text processing device, computer equipment and storage medium
US20230305822A1 (en) | Machine-Learning Assisted Natural Language Programming System
CN116820429A (en) | Training method and device of code processing model, electronic equipment and storage medium
Wang et al. | Towards information-rich, logical dialogue systems with knowledge-enhanced neural models
CN116882450B (en) | Question-answering model editing method and device, electronic equipment and storage medium
CN116594768A (en) | Large-model-oriented universal tool collaboration and refinement learning system and method
CN117216544A (en) | Model training method, natural language processing method, device and storage medium
US20230316001A1 (en) | System and method with entity type clarification for fine-grained factual knowledge retrieval
Wang et al. | A survey of the evolution of language model-based dialogue systems
CN118261163A (en) | Intelligent evaluation report generation method and system based on transformer structure
Park et al. | Visual language integration: A survey and open challenges
US11847575B2 (en) | Knowledge representation and reasoning system and method using dynamic rule generator
CN117932019A (en) | Training method and device for large language model, medium and electronic equipment
CN117453925A (en) | Knowledge migration method, apparatus, device, readable storage medium and program product
CN117193582A (en) | Interactive control method and system and electronic equipment
Chaurasia et al. | Conversational AI Unleashed: A Comprehensive Review of NLP-Powered Chatbot Platforms
Römer et al. | Behavioral control of cognitive agents using database semantics and minimalist grammars
Shi et al. | The design and implementation of intelligent english learning chabot based on transfer learning technology
Acharya et al. | A Survey on Symbolic Knowledge Distillation of Large Language Models
Kulkarni et al. | Deep Reinforcement-Based Conversational AI Agent in Healthcare System
US11797610B1 | Knowledge acquisition tool
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination