CN117668179A

CN117668179A - Financial index accurate question-answering method based on large model

Info

Publication number: CN117668179A
Application number: CN202311529969.3A
Authority: CN
Inventors: 李翔; 汪凡
Original assignee: Zhuhai Zhuohuan Technology Co ltd
Current assignee: Zhuhai Zhuohuan Technology Co ltd
Priority date: 2023-11-16
Filing date: 2023-11-16
Publication date: 2024-03-08

Abstract

The invention discloses a financial index accurate question-answering method based on a large model. The method belongs to the technical field of large models and comprises the steps of index extraction, data extraction, index query, and index result fusion.

Description

Financial index accurate question-answering method based on large model

Technical Field

The application relates to the technical field of large models, in particular to a financial index accurate question-answering method based on a large model.

Background

Large model: a large-scale language model, such as chatGPT or chatGLM;

Large model hallucination: when the text generated by the model is not faithful to, or is inconsistent with, the source text, the model is considered to have a hallucination problem, colloquially described as "talking nonsense with a straight face";

Prompt: an instruction given to the large model so that it extracts or generates text according to the requirements stated in the prompt;

chatGLM2: an open-source Chinese large model, regarded in this application as the best-performing one at present;

Turbo large model: a model designation among the GPT large models made available by OpenAI;

LangChain: as used here, a knowledge-base retrieval approach that combines a large model with vector retrieval, generally used to mitigate the hallucination problem of large models;

SQL: a standard computer language for accessing and manipulating databases.

For companies in the financial field, annual report analysis involves many finance-related indexes, such as the provision coverage ratio, the non-performing loan ratio, the profit growth rate, non-performing loan narrowing, and net profit attributable to the parent company. Professionals are usually needed to read a report and work out how the enterprise performs on each index, whereas most non-professionals cannot obtain the information they want, so a large information gap exists. Since large-model technology appeared, many financial companies have tried to solve this problem, including by training large financial models for the vertical domain to improve results, but the accuracy of the index data obtained cannot be guaranteed. Although vector retrieval based on LangChain combined with a vertical-domain financial large model can effectively improve the situation, it still cannot guarantee 100% accuracy, so the existing methods cannot be applied to scenarios in the financial industry that place extremely high requirements on accuracy.

Disclosure of Invention

The application aims to provide a financial index accurate question-answering method based on a large model so as to solve the problems in the background technology.

In order to achieve the above purpose, the present application provides the following technical solutions: a financial index accurate question-answering method based on a large model comprises the following steps:

step 1, index extraction: based on a large volume of questions previously asked by users, a prompt "please help me extract professional financial industry indexes from the question" is written, so that a large model extracts, from each single question, the financial industry indexes involved in annual reports;

step 2, data extraction: based on the financial industry indexes extracted in step 1, a large model performs word segmentation on the annual report data; the corresponding indexes and index values are extracted and stored in an index database, and a field association table records the association between each index and its field in the database; in addition, the formulas for calculating specific indexes are also stored in the database;

step 3, index query: when a user poses a question to be analyzed, the large model performs word segmentation on the question to find the financial indexes the user wants to query; the association between those indexes and their index fields is then looked up in the field association table of step 2; this content is passed together to the large model, which generates the corresponding database query statement, and the indexes and formulas related to the question are retrieved;

step 4, index result fusion: the indexes and formulas returned by the database query in step 3 are passed to a chatGLM2 large model for calculation and fusion, and the user's question is answered comprehensively on the basis of the large model's logical reasoning capability.
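Step 1 can be illustrated with a minimal Python sketch. It only shows how the extraction prompt might be assembled and how the model's reply might be parsed; the reply format (a comma-separated list), the helper names, and the `fake_llm` stand-in are assumptions made for illustration and are not specified in the application. In practice `llm` would wrap a call to the chosen base model.

```python
import re
from typing import Callable, Iterable, List

# Prompt wording follows step 1; the "Reply with ..." line is an assumption.
PROMPT_TEMPLATE = (
    "Please help me extract professional financial industry indexes "
    "from the question.\n"
    "Question: {question}\n"
    "Reply with a comma-separated list of index names only."
)


def build_extraction_prompt(question: str) -> str:
    """Wrap one user question in the index-extraction prompt."""
    return PROMPT_TEMPLATE.format(question=question)


def parse_index_list(model_output: str) -> List[str]:
    """Split a reply on ASCII or full-width commas/semicolons and newlines."""
    parts = re.split(r"[,，;；\n]+", model_output)
    return [p.strip() for p in parts if p.strip()]


def extract_indexes(questions: Iterable[str], llm: Callable[[str], str]) -> List[str]:
    """Collect the distinct index names found across historical questions."""
    seen = {}
    for question in questions:
        reply = llm(build_extraction_prompt(question))
        for name in parse_index_list(reply):
            seen.setdefault(name.lower(), name)  # de-duplicate, keep first spelling
    return list(seen.values())


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the large model: simple keyword spotting."""
    known = ["provision coverage ratio", "net profit"]
    question = prompt.split("Question:", 1)[1].splitlines()[0].lower()
    return ", ".join(name for name in known if name in question)
```

Running `extract_indexes` over a batch of historical questions yields the list of index names that step 2 would then look for in the annual report data.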

Preferably, the data source in step 1 includes multiple languages.

Preferably, the large model in step 1 adopts chatGLM2 as the base large model.

Preferably, step 3 adopts Turbo as the base large model for generating SQL.

Preferably, in step 4, the chatGLM2 large model first outputs the specific value of each index, then outputs the index values calculated from those indexes and the formulas, and finally passes the index values and the calculation process to the large model.
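The output order described for step 4 (retrieved values first, calculated values second, then the final answer) can be sketched in Python as follows. Note one deliberate substitution: the application has the chatGLM2 model perform the calculation, whereas this sketch evaluates the stored formula in plain code and only assembles the text that would be handed to the model. The function names, the formula syntax (field names combined with `+ - * /`), and the numbers in the usage note are all assumptions for illustration.

```python
import ast
import operator
from typing import Dict, Tuple

# Arithmetic operators allowed in a stored formula (an assumed formula syntax).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_formula(formula: str, values: Dict[str, float]) -> float:
    """Evaluate a formula over database field values without using eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if isinstance(node, ast.Name):
            return values[node.id]  # KeyError if the field was not retrieved
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported formula element: {ast.dump(node)}")

    return walk(ast.parse(formula, mode="eval"))


def build_fusion_prompt(
    question: str,
    values: Dict[str, float],
    derived: Dict[str, Tuple[str, float]],
) -> str:
    """List retrieved values, then calculated values, then ask the question."""
    lines = ["Index values retrieved from the database:"]
    lines += [f"- {name} = {value}" for name, value in values.items()]
    lines.append("Index values calculated from stored formulas:")
    for name, (formula, result) in derived.items():
        lines.append(f"- {name} = {formula} = {result}")
    lines.append("Using only the numbers above, answer the question: " + question)
    return "\n".join(lines)
```

For example, with made-up values `{"loan_loss_provisions": 300.0, "non_performing_loans": 150.0}` and the formula `loan_loss_provisions / non_performing_loans * 100`, `evaluate_formula` gives the ratio as a percentage, and `build_fusion_prompt` places that figure next to the raw values so that the model is asked to reason over numbers it did not generate itself.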

Compared with the prior art, the beneficial effects of this application are:

The invention indirectly converts the question-answering scenario into a database retrieval task: the index data in the annual report data are processed and extracted in advance by the large model, and when a user question is answered, the real index values stored in the database are returned, so that the large model answers with the real index values from the database. This avoids the large model giving false index data because of numerical hallucination, and thereby ensures the accuracy of the answers.
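The storage side of this idea can be sketched with Python's built-in `sqlite3` module. The application does not specify a schema, so the table names, column names, and the example rows below are hypothetical; the point of the sketch is only that an answer value is read from the database through the field association table rather than generated by the model.

```python
import sqlite3
from typing import Optional

# Hypothetical schema: one row of basic index values per company and year,
# a field association table (index name -> database field), and a formula table.
SCHEMA = """
CREATE TABLE annual_report_indexes (
    company TEXT NOT NULL,
    year INTEGER NOT NULL,
    net_profit REAL,
    loan_loss_provisions REAL,
    non_performing_loans REAL,
    PRIMARY KEY (company, year)
);
CREATE TABLE field_association (
    index_name TEXT PRIMARY KEY,
    field_name TEXT NOT NULL
);
CREATE TABLE index_formulas (
    index_name TEXT PRIMARY KEY,
    formula TEXT NOT NULL
);
"""


def create_index_database() -> sqlite3.Connection:
    """Create an in-memory index database with example associations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO field_association VALUES (?, ?)",
        [
            ("net profit", "net_profit"),
            ("loan loss provisions", "loan_loss_provisions"),
            ("non-performing loans", "non_performing_loans"),
        ],
    )
    conn.execute(
        "INSERT INTO index_formulas VALUES (?, ?)",
        ("provision coverage ratio",
         "loan_loss_provisions / non_performing_loans * 100"),
    )
    return conn


def lookup_index_value(
    conn: sqlite3.Connection, company: str, year: int, index_name: str
) -> Optional[float]:
    """Resolve an index name to its field, then read the stored value."""
    row = conn.execute(
        "SELECT field_name FROM field_association WHERE index_name = ?",
        (index_name,),
    ).fetchone()
    if row is None or not row[0].isidentifier():
        return None  # unknown index, or a field name unsafe to interpolate
    value_row = conn.execute(
        f"SELECT {row[0]} FROM annual_report_indexes WHERE company = ? AND year = ?",
        (company, year),
    ).fetchone()
    return None if value_row is None else value_row[0]
```

Because column names cannot be bound as SQL parameters, the sketch interpolates the field name only after checking that it came from the association table and is a plain identifier.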

Drawings

Fig. 1 is a schematic flow chart of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.

Examples:

Referring to Fig. 1, the present application provides the following technical solution: a financial index accurate question-answering method based on a large model, comprising the following steps:

step 1, index extraction: based on a large volume of questions previously asked by users, where the data source covers multiple languages, a prompt "please help me extract professional financial industry indexes from the question" is written, so that a large model extracts, from each single question, the financial industry indexes involved in annual reports (the question-answering scenario is thereby indirectly converted into a database retrieval task, with the index data in the annual report data processed and extracted in advance by the large model); through benchmarking analysis, chatGLM2, the best-performing model in the Chinese-language field at present, is selected as the base large model for the index extraction stage;

step 2, data extraction: based on the financial industry indexes extracted in step 1, a large model performs word segmentation on the annual report data; the corresponding indexes and index values are extracted and stored in an index database, and a field association table records the association between each index and its field in the database (because only the word segmentation capability of the large model is used, the index values stored in the database are true and reliable); given the characteristics of the financial industry, the formulas for calculating specific indexes are also stored in the database (thus, together with step 1, the database holds the basic index data, and other indexes can be calculated from the basic index data using the stored formulas);

step 3, index query: when a user poses a question to be analyzed, the large model performs word segmentation on the question to find the financial indexes the user wants to query; the association between those indexes and their index fields is then looked up in the field association table of step 2; this content is passed together to the large model, which generates the corresponding database query statement, and the indexes and formulas related to the question are retrieved (when the user's question is answered, the real index values in the database are returned and the large model answers with those real values, which ensures the accuracy of the answer); generating the corresponding SQL statement from the user's text input requires a large model with strong programming ability, and through side-by-side comparison Turbo is selected as the base large model for generating SQL in this step;

step 4, index result fusion: the indexes and formulas returned by the database query in step 3 are passed to a chatGLM2 large model for calculation and fusion; first, the specific value of each index is output, this value being the real value from the database; second, the index values calculated from the indexes and the formulas are output; finally, the index values and the calculation process are passed to the large model, which comprehensively answers the user's question on the basis of its logical reasoning capability.
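The query stage (step 3) can be sketched in Python as follows. Several simplifications are assumptions made for illustration: the indexes mentioned in the question are found by plain substring matching rather than by the large model's word segmentation, `fake_sql_llm` is a hypothetical stand-in for the SQL-generating model, and the table name `annual_report_indexes` and the SELECT-only guard are not taken from the application.

```python
import sqlite3
from typing import Callable, Dict, List, Tuple


def build_sql_prompt(question: str, associations: Dict[str, str], table: str) -> str:
    """Give the model the question plus the relevant index-to-field mapping."""
    mapping = "\n".join(
        f"- {name} -> column {field}" for name, field in associations.items()
    )
    return (
        f"Table: {table}\n"
        f"Index fields:\n{mapping}\n"
        f"Question: {question}\n"
        "Write one SQL SELECT statement that retrieves the needed values."
    )


def run_generated_query(conn: sqlite3.Connection, sql: str) -> List[Tuple]:
    """Execute model-generated SQL only if it is a single SELECT statement."""
    statement = sql.strip().rstrip(";").strip()
    if not statement.lower().startswith("select") or ";" in statement:
        raise ValueError("only a single SELECT statement is allowed")
    return conn.execute(statement).fetchall()


def query_indexes(
    conn: sqlite3.Connection,
    question: str,
    associations: Dict[str, str],
    sql_llm: Callable[[str], str],
    table: str = "annual_report_indexes",
) -> List[Tuple]:
    """Find the indexes named in the question, ask for SQL, and run it."""
    wanted = {
        name: field
        for name, field in associations.items()
        if name.lower() in question.lower()  # stand-in for word segmentation
    }
    if not wanted:
        return []
    sql = sql_llm(build_sql_prompt(question, wanted, table))
    return run_generated_query(conn, sql)


def fake_sql_llm(prompt: str) -> str:
    """Hypothetical stand-in for the SQL-generating model: a fixed reply."""
    return (
        "SELECT year, net_profit FROM annual_report_indexes "
        "WHERE company = 'ExampleBank' ORDER BY year;"
    )
```

The rows returned by `query_indexes` are the database's own values, which is what step 4 then presents and reasons over. Restricting execution to a single SELECT statement is one cautious way to handle model-generated SQL; a deployment might also use a read-only database connection.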

While the fundamental principles and main features of the present application and advantages thereof have been shown and described, it will be apparent to those skilled in the art that the present application is not limited to the details of the above-described exemplary embodiments, but may be embodied in other specific forms without departing from the spirit or essential characteristics thereof; the present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Although embodiments of the present application have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the application, the scope of which is defined in the appended claims and their equivalents.

Claims

1. The financial index accurate question-answering method based on the large model is characterized by comprising the following steps of:

2. The large-model-based financial index accurate question-answering method according to claim 1, wherein the data source in step 1 comprises a plurality of languages.

3. The large-model-based financial index accurate question-answering method according to claim 1, wherein the large model in step 1 adopts chatGLM2 as the base large model.

4. The large-model-based financial index accurate question-answering method according to claim 1, wherein step 3 adopts Turbo as the base large model for generating SQL.

5. The large-model-based financial index accurate question-answering method according to claim 1, wherein in step 4 the chatGLM2 large model, during calculation, first outputs the specific value of each index, then outputs the index values calculated from the indexes and the formulas, and finally passes the index values and the calculation process to the large model.