CN117931983A - Method and system for generating accurate answer by using large model - Google Patents
- Publication number
- CN117931983A (application number CN202410101190.XA)
- Authority
- CN
- China
- Prior art keywords
- text
- index
- tool
- search
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and a system for generating an accurate answer by using a large model, relating to the technical field of natural language processing. The method for generating the accurate answer by using the large model comprises the following steps: preprocessing texts from different source fields; importing the preprocessed text into Elasticsearch through the ES index mapping structure defined when the ES index is created; constructing a LangChain-based custom retrieval tool; returning related documents; embedding the processed text into a pre-designed template frame; and having the large model generate the final answer to the user question according to the external information in the prompt. By providing information from an external text library, the large model can acquire more knowledge, especially the latest information, so that the hallucination problem of the large model is alleviated and its output quality improved; combined with a domain-specific external text library, the large model also performs better on domain questions.
Description
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a method and a system for generating accurate answers by using a large model.
Background
With the application of large language models (e.g., ChatGPT, LLaMA, etc.), an effective strategy for improving the reliability of their output is to use an external text library for in-context retrieval augmentation (In-Context Retrieval-Augmented generation). The strategy is implemented in two steps: first, documents relevant to the query are retrieved from the external text library; second, the retrieved documents are provided to the large language model as reference text.
Currently, the main scheme for retrieving related documents from an external text library follows the implementation provided by LangChain, whose main technical route is as follows. The documents are first loaded from multiple different sources using a text loader. A document splitter then splits (or chunks) large documents into smaller chunks, so that only the document portions relevant to the query are retrieved. Another key part of retrieval is creating embeddings for the split documents: the embeddings capture the semantic meaning of the text, so that similar text fragments can be found quickly and effectively. Finally, the embeddings need to be persisted, and the vector databases supported by LangChain provide efficient storage and search of these embeddings. Once the data is in the database and the large language model needs the reference text, related text must be retrieved from the vector database; LangChain supports simple semantic search plus a series of algorithms on top of it to improve performance. In the present invention, Elasticsearch is used directly as the source of text library data.
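The load, split, embed, store, and search pipeline described above can be sketched as follows. This is a minimal, self-contained illustration using a toy bag-of-words "embedding" and an invented two-sentence corpus; a real deployment would use LangChain's loaders, splitters, a trained embedding model, and a vector store instead.

```python
# Illustrative sketch of the split -> embed -> similarity-search pipeline.
# The embedding and corpus are hypothetical stand-ins, not the patented system.
import math
from collections import Counter

def split_document(text: str, chunk_size: int = 80) -> list[str]:
    """Split a long document into smaller chunks on sentence boundaries."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    chunks, current = [], ""
    for s in sentences:
        if len(current) + len(s) > chunk_size and current:
            chunks.append(current)
            current = ""
        current = (current + " " + s).strip()
    if current:
        chunks.append(current)
    return chunks

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a trained model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

doc = ("Elasticsearch is a distributed search engine. It stores documents in indices. "
       "Llamas are domesticated animals. They live in South America.")
chunks = split_document(doc)
query_vec = embed("distributed search engine")
best = max(chunks, key=lambda c: cosine(embed(c), query_vec))
```

The same word-overlap scoring would be replaced by dense-vector distance in practice; the point is only that retrieval returns the chunk most semantically similar to the query.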
At present, the methods for producing large language model output with the aid of retrieved documents include: directly splicing the retrieved documents into the input prompt; combining nearest neighbor search (KNN) with the large language model; and retrieving during autoregressive decoding. Directly splicing the retrieved documents into the input prompt is simple and direct: the information relevant to the query is placed in the model's input prompt as background knowledge, and the large language model then generates output based on this information; although concise, the method is coarse-grained and difficult to control precisely. The KNN strategy combines nearest neighbor search with the large language model: a portion of the tokens is first generated by the large language model, documents similar to these tokens are then retrieved from the external text library and spliced into the input prompt, and the next token is predicted. The retrieve-during-autoregressive-decoding strategy likewise uses the large language model to decode a portion of the tokens, then retrieves the document (or documents) similar to those tokens, splices the retrieved documents into the input prompt, and continues predicting the next token. Creating a text index library by segmenting and then vectorizing the text is unsuitable for a large-scale text library, and simple semantic search does not support quickly and accurately mining, from a large-scale text library, the reference documents most relevant to the user query.
Compared with directly splicing into the input prompt, both the method combining KNN with a large language model and the method of retrieving during autoregressive decoding incur high computation cost and tend to produce unstable output.
In order to provide more useful external information to the large language model, thereby reducing its hallucinations and improving question-answering accuracy in the domain, Elasticsearch is used as the data source of the large language model's external text library. Elasticsearch is a distributed search and analysis engine that supports efficient full-text retrieval and near-real-time data analysis. Compared with other text library schemes, Elasticsearch has the following advantages. It supports larger data volumes: Elasticsearch can distribute a large number of documents across multiple nodes, achieving horizontal scaling and improving data availability and fault tolerance. It supports faster retrieval: Elasticsearch uses an inverted-index structure that maps each word in the documents to the list of documents containing that word, thereby speeding up search. It supports more flexible search logic: Elasticsearch provides a rich query language covering various types of search, such as exact match, fuzzy match, range match, and Boolean match, and also supports sorting, filtering, aggregation, and other operations on search results. We use Elasticsearch to retrieve the documents related to the input prompt and splice them with the input prompt according to document score, as input for the large language model. In this way, the large language model can use the retrieved documents as reference text to generate more reliable output.
For the problems in the related art, no effective solution has been proposed at present.
Disclosure of Invention
Aiming at the problems in the related art, the invention provides a method and a system for generating accurate answers by using a large model, so as to overcome the technical problems in the prior related art.
For this purpose, the invention adopts the following specific technical scheme:
according to one aspect of the present invention, there is provided a method of generating an accurate answer using a large model, the method of generating an accurate answer using a large model comprising the steps of:
S1, preprocessing texts from different source fields and storing the preprocessed texts into the distributed database Elasticsearch;
S2, creating an ES index, and importing the preprocessed text into Elasticsearch through the ES index mapping structure defined when the ES index is created;
S3, constructing a LangChain-based custom retrieval tool by utilizing the ES REST API;
S4, acquiring a user query question, automatically identifying and calling the custom retrieval tool, and returning related documents;
S5, performing subsequent processing on the returned related documents, and embedding the processed text into a pre-designed template frame;
S6, the large model generating the final answer to the user question according to the external information in the prompt.
Further, preprocessing the texts from different source fields and storing them into the distributed database Elasticsearch comprises the following steps:
S11, cutting the texts from the different source fields into paragraphs or sentences through text cutting, so as to make the best use of the search engine's full-text retrieval function;
S12, converting the text into a vector representation by adopting the m3e embedding model, for similarity calculation.
Further, creating an ES index and importing the preprocessed text into Elasticsearch through the structure of the ES index mapping defined when the ES index is created comprises the following steps:
S21, in Elasticsearch, defining the data type and attributes of each field through the ES index mapping;
S22, using key information extracted from the preprocessed text to determine the mapping relation of each field;
S23, sending a request for creating the ES index to Elasticsearch by using the Elasticsearch API, and specifying the index information to create the ES index;
S24, after the ES index is created, importing the preprocessed text into Elasticsearch through the structure of the ES index mapping defined when the ES index was created.
Further, the ES index includes the specification of settings and mappings:
the settings specify the number of shards and the number of replicas;
the mapping specifies the field types and keeps them consistent with the field types employed during preprocessing.
Further, acquiring a user query question, automatically identifying and calling the custom retrieval tool, and returning related documents comprises the following steps:
S31, acquiring the user query question, and automatically capturing key information through LangChain;
S32, selecting and calling the custom text retrieval tool, printing the key information, and returning related documents.
Further, acquiring the user query question and automatically capturing key information through LangChain comprises the following steps:
S311, receiving the user query question, and analyzing the query question with LangChain's Plan-and-Execute agents;
S312, LangChain comparing the similarity between the query question and the descriptions of all callable tools in the toolbox;
S313, selecting the best tool according to the similarity, executing the selected tool, and observing the tool's output or returning the result to the user.
Further, performing subsequent processing on the returned related documents and embedding the processed text into a pre-designed template frame comprises the following steps:
S41, matching related documents, and adding the values of the time field and the text source field at the starting position of the reference text;
S42, adjusting or customizing the prompt text according to the business requirements;
S43, embedding the reference text, with the values of the time field and the text source field added at its starting position, into the adjusted prompt framework.
Further, performing subsequent processing on the returned related documents and embedding the processed text into a pre-designed template frame comprises the following steps:
S51, parsing the multi-field JSON data, and automatically extracting the relevant information in the multiple fields;
S52, combining the extracted relevant information into fluent text, and outputting it as a string;
S53, embedding the combined text into the pre-designed template frame.
Further, the large model generating the final answer to the user question according to the external information in the prompt comprises:
directly introducing, in the prompt, the text information in the Elasticsearch text database that is related to the user query question, so as to guide the large model to generate an answer with accurate content and to output the reference source information.
According to another aspect of the present invention, there is also provided a system for generating an accurate answer using a large model, the system comprising: a database management module, an index management and data importing module, an API call and tool construction module, a user query processing and tool calling module, a result processing and text embedding module, and an answer generation module;
the database management module is used for preprocessing texts from different source fields and storing them into the distributed database Elasticsearch;
the index management and data importing module is used for creating an ES index and importing the preprocessed text into Elasticsearch through the ES index mapping structure defined when the ES index is created;
the API call and tool construction module is used for constructing a LangChain-based custom retrieval tool by utilizing the ES REST API;
the user query processing and tool calling module is used for acquiring a user query question, automatically identifying and calling the custom retrieval tool, and returning related documents;
the result processing and text embedding module is used for performing subsequent processing on the returned related documents and embedding the processed text into a pre-designed template frame;
and the answer generation module is used for the large model to generate the final answer to the user question according to the external information in the prompt.
The beneficial effects of the invention are as follows:
1. By providing information from an external text library, the large model can acquire more knowledge, especially the latest information, so that the hallucination problem of the large model (i.e., the phenomenon that the model generates content inconsistent with the facts or lacking logic) is alleviated and its output quality improved; combined with a domain-specific external text library, the large model performs better on domain questions. Higher text data storage capacity, faster retrieval efficiency, and more flexible retrieval logic are also supported, further reducing hallucinations and improving question-answering accuracy in the domain.
2. The invention adopts the Elasticsearch text database and a LangChain-based custom retrieval tool, and can fully and efficiently utilize the information in the external text database. Elasticsearch is a distributed search and analysis engine that can easily scale out to multiple nodes to handle large-scale data and highly concurrent queries, making it well suited to large text datasets and heavily loaded search scenarios. Elasticsearch provides rich search and analysis functions, including full-text search, aggregation, filtering, sorting, and paging. It supports a complex query grammar and flexible query construction, so various complex search requirements can be satisfied, and it supports various search strategies, including exact search, approximate search, and sparse vector search.
3. The invention uses Elasticsearch as the text library carrier and indexes all the information in the text. The index mapping provides word-vector fields whose values are produced by a word embedding model: a dense numerical vector represents a word, and the purpose of the vector is to capture the word's semantic attributes, so that words whose vectors are similar are semantically similar. Elasticsearch can therefore store large-scale text data while also adopting an advanced representation of semantic information; its retrieval is very flexible, and the text most relevant to the user's query requirements can be retrieved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for generating accurate answers using a large model according to an embodiment of the invention;
FIG. 2 is a functional block diagram of a system for generating accurate answers using a large model according to an embodiment of the invention.
In the figure:
1. a database management module; 2. an index management and data importing module; 3. an API call and tool construction module; 4. a user query processing and tool calling module; 5. a result processing and text embedding module; 6. an answer generation module.
Detailed Description
For the purpose of further illustrating the various embodiments, the present invention provides the accompanying drawings, which are a part of the disclosure of the present invention, and which are mainly used to illustrate the embodiments and, together with the description, serve to explain the principles of the embodiments, and with reference to these descriptions, one skilled in the art will recognize other possible implementations and advantages of the present invention, wherein elements are not drawn to scale, and like reference numerals are generally used to designate like elements.
According to an embodiment of the invention, a method and a system for generating accurate answers by using a large model are provided.
The invention will now be further described with reference to the accompanying drawings and the detailed description. As shown in FIG. 1, a method for generating accurate answers using a large model according to an embodiment of the invention comprises the following steps:
S1, preprocessing texts from different source fields and storing the preprocessed texts into the distributed database Elasticsearch;
Specifically, preprocessing the texts from different source fields mainly includes:
Text cutting: segmenting long text into paragraphs or sentences, in order to better utilize the full-text retrieval functionality of the search engine.
Embedding representation: converting the text into a vector representation using the m3e embedding model, for similarity calculation.
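The two preprocessing steps above can be sketched as follows. The function `fake_m3e_embed` and the field names (`text`, `text_vector`, `source`, `created_at`) are hypothetical placeholders: the real system uses the m3e embedding model (which produces a 768-dimensional dense vector), and the fields must match whatever the ES index mapping actually defines.

```python
# Sketch of preprocessing: each cut paragraph is paired with its embedding
# vector plus source/time metadata, producing records ready for bulk import
# into Elasticsearch. fake_m3e_embed is a deterministic stand-in for m3e.
import hashlib
from datetime import date

def fake_m3e_embed(text: str, dim: int = 8) -> list[float]:
    """Placeholder embedding: hash bytes scaled to [0, 1]. Not semantic."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]

def preprocess(paragraphs: list[str], source: str) -> list[dict]:
    records = []
    for text in paragraphs:
        records.append({
            "text": text,
            "text_vector": fake_m3e_embed(text),  # used later for similarity
            "source": source,
            "created_at": date.today().isoformat(),
        })
    return records

records = preprocess(["First paragraph.", "Second paragraph."], source="domain_corpus")
```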
Specifically, Elasticsearch is used as the carrier of the external text library to provide a powerful text retrieval strategy; the custom retrieval tool serves as the interface between the external text library and the large language model, and the retrieved text is processed and then used by the large language model, realizing the integration of the external text library with the large language model; and the large language model's input prompt is designed to make full use of the retrieved related text, improving the answering performance of the large language model.
Specifically, in the present invention, the primary purpose of the Elasticsearch text library is to store large-scale text data and provide an efficient retrieval strategy. Elasticsearch is adopted as the data hosting platform: texts from different source fields are input into the Elasticsearch database, and the output is an index, i.e., an optimized document set. Elasticsearch is a distributed document storage and search engine that indexes documents as they are stored and makes them searchable in near real time, within about 1 second. Elasticsearch uses a structure called an inverted index that supports very fast full-text search. The inverted index lists every unique word that appears in any document and identifies the documents in which each word appears. An index can be thought of as an optimized collection of documents, and each document is a collection of fields containing the required data. By default, Elasticsearch indexes all the data in every field and creates a dedicated, optimized data structure for each field. For example, text fields are stored in inverted indexes, while numeric and geographic fields are stored in BKD trees. The ability to combine the per-field data structures when assembling and returning search results is why Elasticsearch is so fast.
S2, creating an ES index, and importing the preprocessed text into Elasticsearch through the ES index mapping structure defined when the ES index is created;
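A possible shape of the index settings and mapping described in S2 and in the claims is sketched below. The index name, field names, shard and replica counts, and the 768-dimensional vector size are illustrative assumptions; the only requirement stated by the invention is that the field types match those produced by preprocessing.

```python
# Hedged example of an ES index body: shard/replica settings plus field types
# consistent with the preprocessing output. Names and sizes are assumptions.
import json

index_body = {
    "settings": {
        "number_of_shards": 3,    # shard count ("fragments" in the claims)
        "number_of_replicas": 1,  # replica count ("copies" in the claims)
    },
    "mappings": {
        "properties": {
            "text":        {"type": "text"},     # full-text searchable
            "source":      {"type": "keyword"},  # exact-match text source field
            "created_at":  {"type": "date"},     # time field
            # dense_vector enables similarity search over m3e embeddings
            "text_vector": {"type": "dense_vector", "dims": 768},
        }
    },
}

# The index would then be created via the Elasticsearch REST API, e.g.
#   PUT /domain_text_index   with index_body as the request body.
payload = json.dumps(index_body)
```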
S3, constructing a LangChain-based custom retrieval tool by utilizing the ES REST API;
Specifically, in the invention, the goal of the LangChain-based custom retrieval tool is to realize the interface between the external text library and the large language model; the retrieved text is processed and then used by the large language model. The custom retrieval tool is developed on the LangChain platform: its input is the user query string, and its output is a list of documents (Documents), each Document being an object whose field types are consistent with the field types specified in the mapping when the text data was stored. LangChain is an extensible and efficient language-model enhancement platform that provides a series of tool components (Tool) that can be used to assist the output of large language models. These tools may be utilities (e.g., search), other chains, or even other agents. Models are often required to handle complex tasks, such as natural language processing or image processing; with Tool, these can be broken into smaller, reusable components, making the development and management of large language models simpler and more controllable. Assigning different functions and tasks to different tools also enables parallel processing and distributed computing, improving the scalability and performance of the system. Tool provides flexible interfaces and configuration options that can be customized and extended according to specific requirements; a developer can select an appropriate Tool and customize and optimize it to meet the specific requirements of the large language model.
Specifically, the custom retrieval tool, i.e., the Elasticsearch query module, interacts with the distributed database using the ELASTICSEARCH REST API. It provides an interface for querying text information, supporting the retrieval of relevant text data from the database according to the query conditions. LangChain provides the Tool data class to implement custom tools; the Tool data class contains attributes such as the tool name, function, description, and metadata. The present invention uses the Tool data class to create a retrieval tool, and the description and metadata attributes in the Tool class provide a detailed description of the tool's function and usage instructions. The description of the retrieval tool defined by the invention reads: query text information of a certain field.
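The structure of such a tool can be sketched as follows. The `Tool` dataclass here is a simplified stand-in for LangChain's Tool class, and `fake_es_search`, with its two-document corpus, is a hypothetical placeholder for the real Elasticsearch REST query; neither is the patented implementation.

```python
# Sketch of a LangChain-style custom retrieval tool: a named, described
# callable that an agent can select and invoke.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str  # the agent compares user queries against this text
    func: Callable[[str], list[dict]]

def fake_es_search(query: str) -> list[dict]:
    """Hypothetical stand-in for an Elasticsearch match query over the index."""
    corpus = [
        {"text": "Domain regulation issued in 2023.", "source": "gov", "date": "2023-01-01"},
        {"text": "Unrelated cooking recipe.", "source": "blog", "date": "2022-05-04"},
    ]
    return [doc for doc in corpus
            if any(w in doc["text"].lower() for w in query.lower().split())]

retrieval_tool = Tool(
    name="domain_text_search",
    description="Query text information of a certain field (domain).",
    func=fake_es_search,
)
docs = retrieval_tool.func("regulation")
```

Each returned document carries the same fields (`text`, `source`, `date`) defined in the index mapping, which is the consistency requirement the description above states.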
S4, acquiring a user query question, automatically identifying and calling the custom retrieval tool, and returning related documents;
Specifically, the related documents returned by the retrieval tool are multi-field JSON data, and the relevant information in the multiple fields is extracted automatically, including but not limited to text, date, and keywords. According to the extracted information, the information in these fields is combined into fluent text by means of string formatting or string concatenation.
Specifically, the related documents are the K pieces of text data in the Elasticsearch text database with the highest similarity to the user query question, where K is a configurable parameter.
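A minimal sketch of this post-processing, under assumed field names (`text`, `date`, `source`, `score`) and invented sample hits: the multi-field JSON is parsed, the top-K hits are kept by score, and each snippet is prefixed with its time and source values before being joined into one reference string, as S41/S43 describe.

```python
# Sketch of post-processing the tool's JSON output into a reference string.
import json

raw = json.dumps([
    {"text": "Policy A took effect.", "date": "2023-06-01", "source": "gazette", "score": 2.1},
    {"text": "Policy B was drafted.", "date": "2022-03-15", "source": "news",    "score": 1.4},
    {"text": "Irrelevant note.",      "date": "2020-01-01", "source": "blog",    "score": 0.2},
])

def top_k_reference(raw_json: str, k: int = 2) -> str:
    """Keep the K highest-scoring hits and format them as reference text."""
    hits = sorted(json.loads(raw_json), key=lambda h: h["score"], reverse=True)[:k]
    # prepend the time field and text source field to each snippet
    parts = [f"[{h['date']} | {h['source']}] {h['text']}" for h in hits]
    return "\n".join(parts)

reference_text = top_k_reference(raw, k=2)
```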
S5, performing subsequent processing on the returned related documents, and embedding the processed text into a pre-designed template frame;
Specifically, the embedding technique provides a powerful method for capturing the semantic content of a piece of text. By indexing the embeddings and scoring based on vector distance, documents can be ranked by conceptual similarity rather than being limited to word-level overlap.
Specifically, training a language model relies on large amounts of text data, through which the model learns the rules, grammar, and semantics of human language. This process typically uses an unsupervised learning approach: during the pre-training phase the model learns a generic language model and representations, and during the fine-tuning phase it is further trained on a finer, smaller-scale dataset associated with a specific task or domain. This helps the model refine its understanding and adapt to the specific needs of the task. However, large models may still suffer from problems such as factual errors when generating answers. To improve on this, we design a higher-quality prompt that directs the model to generate a specific type of output and provides it with context and direction. On top of a large-scale pre-trained model, a high-quality prompt is a key factor in generating high-quality content: a good prompt can clearly guide the model to produce accurate and targeted output, while a low-quality prompt can lead to chaotic, irrelevant, or low-quality results.
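One possible pre-designed template frame is sketched below. The wording of the template is an illustrative assumption, not the patented prompt itself; it only shows how the processed reference text (already prefixed with time and source) slots into a frame that constrains the model's answer.

```python
# Sketch of a prompt template frame into which the reference text is embedded.
PROMPT_TEMPLATE = """You are a domain question-answering assistant.
Answer the question using ONLY the reference text below.
If the reference text does not contain the answer, say you do not know.
Cite the time and source given at the start of each reference snippet.

Reference text:
{reference}

Question: {question}
Answer:"""

def build_prompt(reference: str, question: str) -> str:
    return PROMPT_TEMPLATE.format(reference=reference, question=question)

prompt = build_prompt("[2023-06-01 | gazette] Policy A took effect.",
                      "When did Policy A take effect?")
```

The filled-in `prompt` string is what S6 passes to the large model as its input.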
S6, the large model generates the final answer to the user question according to the external information in the prompt.
Specifically, the method of the invention for generating accurate answers by using a large model directly provides the retrieved text related to the user question to the large model through the prompt, so as to improve the accuracy of the answer.
Preferably, preprocessing the texts from different source fields and storing them into the distributed database Elasticsearch comprises the following steps:
S11, cutting the texts from the different source fields into paragraphs or sentences through text cutting, so as to make the best use of the search engine's full-text retrieval function;
S12, converting the text into a vector representation by adopting the m3e embedding model, for similarity calculation.
Preferably, creating an ES index and importing the preprocessed text into Elasticsearch through the structure of the ES index mapping defined when the ES index is created comprises the following steps:
S21, in Elasticsearch, defining the data type and attributes of each field through the ES index mapping;
S22, using key information extracted from the preprocessed text to determine the mapping relation of each field (such as a text field, a keyword field, an embedded vector field, and the like);
S23, sending a request for creating the ES index to Elasticsearch by using the Elasticsearch API, and specifying the index information (the name, mapping relations, and other information) to create the ES index;
S24, after the ES index is created, importing the preprocessed text into Elasticsearch through the structure of the ES index mapping defined when the ES index was created.
Preferably, the ES index includes the specification of settings and mappings:
the settings specify the number of shards and the number of replicas;
the mapping specifies the field types and keeps them consistent with the field types employed during preprocessing.
Preferably, acquiring a user query question, automatically identifying and calling the custom retrieval tool, and returning related documents comprises the following steps:
S31, acquiring the user query question, and automatically capturing key information through LangChain;
S32, selecting and calling the custom text retrieval tool, printing the key information, and returning related documents.
Preferably, acquiring the user query question and automatically capturing key information through LangChain comprises the following steps:
S311, receiving the user query question, and analyzing the query question with LangChain's Plan-and-Execute agents;
S312, LangChain comparing the similarity between the query question and the descriptions of all callable tools in the toolbox;
S313, selecting the best tool according to the similarity, executing the selected tool, and observing the tool's output or returning the result to the user.
Specifically, the Agents module of LangChain uses LLMs to determine which actions to take and in what order. Using the Plan-and-Execute agent model provided by LangChain, all steps are planned first and then performed sequentially. An action controlled by the agent may be using a tool and observing its output, or returning a result to the user. After obtaining the user query question, LangChain's agent selects which tool to invoke by comparing the similarity between the question and the descriptions of all callable tools in the toolbox.
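The selection step can be sketched as follows. In LangChain proper the LLM itself reasons over the tool descriptions; the word-overlap score and the two-tool toolbox below are simplified, hypothetical stand-ins used only to make the description-matching idea concrete.

```python
# Sketch of agent tool selection: pick the tool whose description is most
# similar to the user query (Jaccard word overlap as a toy similarity).
def overlap_score(query: str, description: str) -> float:
    q, d = set(query.lower().split()), set(description.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0

toolbox = {
    "domain_text_search": "query text information of a certain field",
    "calculator": "perform arithmetic calculation on numbers",
}

def select_tool(query: str) -> str:
    return max(toolbox, key=lambda name: overlap_score(query, toolbox[name]))

chosen = select_tool("query the text information about regulations")
```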
Preferably, performing subsequent processing on the returned related documents and embedding the processed text into the pre-designed template framework comprises the steps of:
S41, matching the related documents and adding the values of the time field and the text source field at the start of the reference text;
S42, adjusting or customizing the Prompt text according to the business requirements;
In particular, promt is part of the model input, directly affecting the output of the model. It is ensured that the Prompt contains enough information to guide the model to generate results that meet the business needs. If business requirements relate to a particular domain of expertise, an attempt may be made to use the relevant specialized vocabulary in promt to enhance the understanding and generation of the model for domain-specific problems.
S43, adding the values of the time field and the text source field at the starting position of the reference text and embedding the values into the adjusted campt framework.
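Steps S41–S43 amount to a template fill: prepend the time and source values to the reference text, then embed it in the adjusted Prompt. The template wording and field names below are illustrative assumptions.

```python
# Sketch of S41-S43. The template text, field names, and bracket format
# are assumptions for illustration; the actual Prompt is adjusted to
# the business requirements as described above.
PROMPT_TEMPLATE = (
    "Answer the question using only the reference text below, "
    "and cite its source.\n\nReference:\n{reference}\n\nQuestion: {question}\n"
)

def build_prompt(question: str, doc: dict) -> str:
    # S41/S43: add the time and source values at the start of the reference text
    reference = f"[time: {doc['publish_time']}] [source: {doc['source']}] {doc['text']}"
    return PROMPT_TEMPLATE.format(reference=reference, question=question)

prompt = build_prompt(
    "When was the system released?",
    {"publish_time": "2024-01-25", "source": "news", "text": "The system went live."},
)
```
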
Preferably, performing subsequent processing on the returned related documents and embedding the processed text into the pre-designed template framework comprises the steps of:
S51, parsing the multi-field JSON data and automatically extracting relevant information (including but not limited to text, date, and keywords) from multiple fields;
S52, combining the extracted information into a fluent text and outputting it as a string;
S53, embedding the combined text into the pre-designed template framework.
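Steps S51–S53 can be sketched as below. The field names (text, date, keywords) and the joining format are illustrative assumptions; the actual fields mirror the ES index mapping.

```python
import json

# Sketch of S51/S52: parse the multi-field JSON and combine the
# extracted fields into one fluent string. Field names are assumptions.
def fields_to_text(raw_json: str) -> str:
    record = json.loads(raw_json)
    pieces = []
    if record.get("date"):
        pieces.append(f"Date: {record['date']}.")
    if record.get("keywords"):
        pieces.append(f"Keywords: {', '.join(record['keywords'])}.")
    if record.get("text"):
        pieces.append(record["text"])
    return " ".join(pieces)

# S53: embed the combined string into the pre-designed template
TEMPLATE = "Reference material:\n{content}"
raw = '{"text": "The corpus is stored in Elasticsearch.", "date": "2024-01-25", "keywords": ["retrieval", "LLM"]}'
embedded = TEMPLATE.format(content=fields_to_text(raw))
```
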
Preferably, the large model generating the final answer to the user question based on the external information in the prompt (i.e., text information in the Elasticsearch text database related to the user query question) comprises:
introducing the text information from the Elasticsearch text database related to the user query question directly into the prompt to guide the large model to generate a content-accurate answer, and outputting the reference source information.
There is also provided, in accordance with another embodiment of the present invention, as shown in fig. 2, a system for generating an accurate answer using a large model, comprising: a database management module 1, an index management and data import module 2, an API call and tool construction module 3, a user query processing and tool call module 4, a result processing and text embedding module 5, and an answer generation module 6;
The database management module 1 is used for preprocessing texts in different source fields and storing them in the distributed database Elasticsearch;
the index management and data import module 2 is used for creating the ES index and importing the preprocessed text into Elasticsearch through the structure of the ES index mapping defined when the ES index was created;
the API call and tool construction module 3 is used for constructing a LangChain-based custom retrieval tool using the ES REST API;
the user query processing and tool call module 4 is used for acquiring the user query question, automatically identifying and calling the custom retrieval tool, and returning related documents;
the result processing and text embedding module 5 is used for performing subsequent processing on the returned related documents and embedding the processed text into the pre-designed template framework;
and the answer generation module 6 is used for the large model to generate the final answer to the user question based on the external information in the prompt.
In summary, by means of the above technical solution, the present invention adopts the Elasticsearch text database and a LangChain-based custom retrieval tool, so that external text library information can be utilized fully and efficiently. Elasticsearch is a distributed search and analysis engine that can be easily extended to multiple nodes to handle large-scale data and highly concurrent queries, making it well suited for large text datasets and heavily loaded search scenarios. Elasticsearch provides rich search and analysis functions, including full-text search, aggregation, filtering, sorting, and paging. It supports complex query syntax and flexible query construction, so various complex search requirements can be satisfied, and it supports multiple retrieval strategies, including exact search, approximate search, and sparse vector search. The invention uses Elasticsearch as the text library carrier and indexes all information in the texts. The index mapping contains word vector fields whose values are produced by a word embedding model: each word is represented by a dense numerical vector intended to capture its semantic attributes, so that words with similar vectors are semantically similar. Elasticsearch can therefore store large-scale text data while also employing an advanced semantic representation, and its flexible retrieval modes allow the related texts to be retrieved according to the user's query requirements to the greatest extent.
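The dense word-vector retrieval summarized above can be sketched as an ES search body. The top-level `knn` clause shape follows Elasticsearch 8.x; the field names, vector length, and candidate-pool factor here are illustrative assumptions, and the query vector would come from the same embedding model used at indexing time.

```python
# Sketch of a semantic retrieval query over the dense word-vector field
# described in the summary. Field names are assumptions matching the
# earlier mapping sketch; the `knn` clause is the ES 8.x search option.
def build_semantic_query(query_vector, k: int = 5) -> dict:
    return {
        "knn": {
            "field": "text_vector",       # dense_vector field in the index mapping
            "query_vector": query_vector, # embedding of the user query
            "k": k,                       # number of nearest neighbours to return
            "num_candidates": 10 * k,     # candidate pool considered per shard
        },
        "_source": ["text", "publish_time", "source"],
    }

query = build_semantic_query([0.1, 0.2, 0.3], k=3)
# The search itself would then be, e.g.:
#   es.search(index="doc_index", body=query)
```
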
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.
Claims (10)
1. A method for generating accurate answers using a large model, the method comprising the steps of:
S1, preprocessing texts in different source fields and storing them in the distributed database Elasticsearch;
S2, creating an ES index, and importing the preprocessed text into Elasticsearch through the structure of the ES index mapping defined when the ES index is created;
S3, constructing a LangChain-based custom retrieval tool using the ES REST API;
S4, acquiring a user query question, automatically identifying and calling the custom retrieval tool, and returning related documents;
S5, performing subsequent processing on the returned related documents, and embedding the processed text into a pre-designed template framework;
and S6, the large model generating a final answer to the user question based on the external information in the prompt.
2. The method for generating accurate answers using a large model according to claim 1, wherein preprocessing the texts in different source fields and storing them in the distributed database Elasticsearch comprises the steps of:
S11, splitting the texts from different source fields into paragraphs or sentences through text segmentation, to optimize the full-text retrieval function of the search engine;
S12, converting the text into vector representations using the m3e embedding model, for similarity calculation.
3. The method for generating accurate answers using a large model according to claim 1, wherein creating the ES index and importing the preprocessed text into Elasticsearch through the structure of the ES index mapping defined when the ES index is created comprises the steps of:
S21, in Elasticsearch, defining the data types and attributes of fields through the ES index mapping;
S22, using key information extracted from the preprocessed text to determine the mapping relation of each field;
S23, sending a request to create the ES index to Elasticsearch using the Elasticsearch API, specifying the index information to create the ES index;
S24, after the ES index is created, importing the preprocessed text into Elasticsearch through the structure of the ES index mapping defined when the ES index was created.
4. The method for generating accurate answers using a large model according to claim 3, wherein creating the ES index comprises specifying settings and mappings:
the settings specify the number of shards and the number of replicas;
the mappings specify each field's type, kept consistent with the field types used during preprocessing.
5. The method for generating accurate answers using a large model according to claim 1, wherein obtaining the user query question, automatically identifying and invoking the custom retrieval tool, and returning relevant documents comprises the steps of:
S31, acquiring the user query question and automatically capturing its key information through LangChain;
S32, selecting and calling the custom text retrieval tool, inputting the key information, and returning the related documents.
6. The method for generating accurate answers using a large model according to claim 5, wherein acquiring the user query question and automatically capturing key information through LangChain comprises the steps of:
S311, receiving the user query question and analyzing it using LangChain's Plan-and-execute agent;
S312, LangChain comparing the similarity between the query question and the descriptions of all callable tools in the toolbox;
S313, selecting the best tool according to the similarity, executing it, and observing its output or returning the result to the user.
7. The method for generating accurate answers using a large model according to claim 1, wherein performing subsequent processing on the returned related documents and embedding the processed text into a pre-designed template framework comprises the steps of:
S41, matching the related documents and adding the values of the time field and the text source field at the start of the reference text;
S42, adjusting or customizing the Prompt text according to the business requirements;
S43, adding the values of the time field and the text source field at the start of the reference text and embedding the result into the adjusted Prompt framework.
8. The method for generating accurate answers using a large model according to claim 1, wherein performing subsequent processing on the returned related documents and embedding the processed text into a pre-designed template framework comprises the steps of:
S51, parsing the multi-field JSON data and automatically extracting relevant information from multiple fields;
S52, combining the extracted information into a fluent text and outputting it as a string;
and S53, embedding the combined text into the pre-designed template framework.
9. The method for generating accurate answers using a large model according to claim 1, wherein the large model generating the final answer to the user question based on the external information in the prompt comprises:
introducing the text information from the Elasticsearch text database related to the user query question directly into the prompt to guide the large model to generate a content-accurate answer, and outputting the reference source information.
10. A system for generating accurate answers using a large model, for implementing the method for generating accurate answers using a large model according to any one of claims 1 to 9, comprising: a database management module, an index management and data import module, an API call and tool construction module, a user query processing and tool call module, a result processing and text embedding module, and an answer generation module;
the database management module is used for preprocessing texts in different source fields and storing them in the distributed database Elasticsearch;
the index management and data import module is used for creating the ES index and importing the preprocessed text into Elasticsearch through the structure of the ES index mapping defined when the ES index is created;
the API call and tool construction module is used for constructing a LangChain-based custom retrieval tool using the ES REST API;
the user query processing and tool call module is used for acquiring the user query question, automatically identifying and calling the custom retrieval tool, and returning related documents;
the result processing and text embedding module is used for performing subsequent processing on the returned related documents and embedding the processed text into the pre-designed template framework;
and the answer generation module is used for the large model to generate the final answer to the user question based on the external information in the prompt.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410101190.XA CN117931983A (en) | 2024-01-25 | 2024-01-25 | Method and system for generating accurate answer by using large model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117931983A true CN117931983A (en) | 2024-04-26 |
Family
ID=90757053
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410101190.XA Pending CN117931983A (en) | 2024-01-25 | 2024-01-25 | Method and system for generating accurate answer by using large model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117931983A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination |