CN117668179A - Financial index accurate question-answering method based on large model - Google Patents

Info

Publication number
CN117668179A
Authority
CN
China
Prior art keywords
index
model
indexes
large model
financial
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311529969.3A
Other languages
Chinese (zh)
Inventor
李翔
汪凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Zhuohuan Technology Co ltd
Original Assignee
Zhuhai Zhuohuan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2023-11-16
Filing date
2023-11-16
Publication date
2024-03-08
Application filed by Zhuhai Zhuohuan Technology Co ltd

Abstract

The invention discloses a large-model-based method for accurate question answering on financial indexes, belonging to the technical field of large models. The method comprises four steps: index extraction, data extraction, index query, and index result fusion.

Description

Financial index accurate question-answering method based on large model
Technical Field
The application relates to the technical field of large models, in particular to a financial index accurate question-answering method based on a large model.
Background
Large model: a large-scale language model such as chatGPT or chatGLM;
large model hallucination: when the text generated by a model does not follow, or is inconsistent with, the source text, the model is considered to have a hallucination problem, commonly described as 'talking nonsense with a straight face';
prompt: the prompt text that directs the large model to extract and generate text according to its requirements;
chatGLM2: currently the best-performing open-source Chinese large model;
Turbo large model: a model designation in the GPT large model series of OpenAI;
langchain: a knowledge-base retrieval approach based on a large model and vector retrieval, generally used to mitigate the hallucination problem of large models;
SQL: a standard computer language for accessing and processing databases;
for companies in the financial field, annual report analysis involves many finance-related indexes, such as the provision coverage ratio, non-performing loan ratio, profit growth rate, interest spread narrowing, and net profit attributable to the parent company. Professionals are usually needed to read a report and understand how the enterprise performs on each index, whereas most non-professionals cannot obtain the information they want, which creates a large information gap. Since large model technology emerged, many financial companies have tried to solve this problem, including by training large financial models for the vertical domain to improve results, but the accuracy of the index data obtained cannot be guaranteed. Although Langchain-based vector search combined with a vertical financial large model can effectively improve the situation, 100% accuracy still cannot be guaranteed, so the existing methods cannot be applied to scenarios in the financial industry that demand extremely high accuracy.
Disclosure of Invention
The application aims to provide a large-model-based method for accurate question answering on financial indexes, so as to solve the problems described in the Background.
To achieve this purpose, the application provides the following technical solution: a financial index accurate question-answering method based on a large model, comprising the following steps:
step 1, index extraction: based on a large amount of question data raised by historical users, write the prompt 'please help me extract professional financial industry indexes from the question', so that the large model extracts, from each single question, the financial industry indexes involved in annual reports;
step 2, data extraction: based on the financial industry indexes extracted in step 1, use a large model to perform word segmentation on the annual report data, extract the corresponding indexes and index values, and store them in an index database; a field association table records the association between each index and the corresponding index field in the database; the formulas for calculating specific indexes are also stored in the database;
step 3, index query: when a user raises a question to be analyzed, the large model performs word segmentation on the question to find the financial indexes the user wants to query; the association between those indexes and the index fields is then looked up in the field association table of step 2; the related contents are passed together to the large model, which generates the corresponding database query statement; the indexes and formulas related to the question are thereby retrieved;
step 4, index result fusion: the indexes and formulas returned by the database query in step 3 are passed to the chatGLM2 large model for calculation and fusion, and the user's question is answered comprehensively based on the logical reasoning capability of the large model.
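As an illustration of steps 1 and 2, the following is a minimal sketch of how the step 1 prompt might be assembled and how the index database, the field association table, and the stored formulas might be laid out. The application does not name a database engine or a schema, so SQLite, the table names `index_value`, `field_association` and `index_formula`, the company name `ExampleBank`, and all numbers are assumptions made purely for illustration. The large-model call itself is omitted.

```python
import sqlite3

# Prompt wording from step 1; appending the question on a new line is an assumption.
STEP1_PROMPT = (
    "Please help me extract professional financial industry indexes "
    "from the question."
)


def build_extraction_prompt(question):
    """Combine the step 1 prompt with a single historical user question."""
    return f"{STEP1_PROMPT}\nQuestion: {question}"


def create_index_database():
    """Create a hypothetical layout for the index database of step 2."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE index_value (
            company TEXT, year INTEGER, field_name TEXT, value REAL
        );
        CREATE TABLE field_association (
            index_name TEXT PRIMARY KEY, field_name TEXT NOT NULL
        );
        CREATE TABLE index_formula (
            index_name TEXT PRIMARY KEY, formula TEXT NOT NULL
        );
        """
    )
    return conn


def store_index(conn, company, year, index_name, field_name, value):
    """Store one extracted index value and record its index-to-field association."""
    conn.execute(
        "INSERT OR IGNORE INTO field_association VALUES (?, ?)",
        (index_name, field_name),
    )
    conn.execute(
        "INSERT INTO index_value VALUES (?, ?, ?, ?)",
        (company, year, field_name, value),
    )


def store_formula(conn, index_name, formula):
    """Store the formula used to calculate a derived index."""
    conn.execute(
        "INSERT OR REPLACE INTO index_formula VALUES (?, ?)",
        (index_name, formula),
    )


# Hypothetical values standing in for what the large model extracts from an annual report.
conn = create_index_database()
store_index(conn, "ExampleBank", 2022, "loan loss provisions", "loan_loss_provision", 300.0)
store_index(conn, "ExampleBank", 2022, "non-performing loan balance", "npl_balance", 120.0)
store_formula(conn, "provision coverage ratio", "loan_loss_provision / npl_balance")
print(build_extraction_prompt("How did the provision coverage ratio change?"))
```

Keeping the formula as text over field names is one possible design; it lets a derived index be computed later from the basic values already stored, as step 2 describes.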
Preferably, the data source in step 1 includes multiple languages.
Preferably, the large model in step 1 uses chatGLM2 as the base large model.
Preferably, step 3 uses Turbo as the base large model for generating SQL.
Preferably, in step 4, the chatGLM2 large model first outputs the specific value of each index, then outputs the index values calculated from the indexes and formulas, and finally passes the index values and the calculation process to the large model.
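The three-stage ordering of step 4 (database values first, calculated values second, hand-over to the large model last) can be sketched as follows. In the application the chatGLM2 large model performs the calculation; this sketch instead evaluates the stored formula in plain Python, as a swapped-in stand-in, only to make the ordering concrete. The function names, field names, and numbers are hypothetical.

```python
import ast
import operator

# Only the four basic arithmetic operators are accepted in a stored formula.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_formula(formula, values):
    """Evaluate an arithmetic formula whose variables are database field names."""

    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Name):
            return values[node.id]  # value returned by the database query
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported element in formula")

    return walk(ast.parse(formula, mode="eval").body)


def build_fusion_context(question, values, formulas):
    """Assemble the text handed to the large model for the final answer."""
    lines = ["Index values from the database:"]
    for field, value in values.items():
        lines.append(f"- {field} = {value}")
    lines.append("Calculated indexes:")
    for name, formula in formulas.items():
        result = evaluate_formula(formula, values)
        lines.append(f"- {name} = {formula} = {result:.4f}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Hypothetical query results and formula, for illustration only.
values = {"loan_loss_provision": 300.0, "npl_balance": 120.0}
formulas = {"provision coverage ratio": "loan_loss_provision / npl_balance"}
context = build_fusion_context(
    "What is ExampleBank's provision coverage ratio?", values, formulas
)
print(context)
```

Because the numbers in the assembled context come from the database and from the stored formula, the large model only has to phrase the answer around them.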
Compared with the prior art, the beneficial effects of the application are:
the invention indirectly converts the question-answering scenario into database retrieval. Index data in the annual report data are processed and extracted in advance by the large model, and when a user question is answered, the real index values in the database are returned, so that the large model answers with the real index values from the database. This avoids the large model answering with false index data because of numerical 'hallucination', and ensures the accuracy of the answer.
Drawings
Fig. 1 is a schematic flow chart of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
Examples:
referring to Fig. 1, the application provides a technical solution: a financial index accurate question-answering method based on a large model, comprising the following steps:
step 1, index extraction: based on a large amount of question data raised by historical users, where the data source includes multiple languages, write the prompt 'please help me extract professional financial industry indexes from the question', so that the large model extracts, from each single question, the financial industry indexes involved in annual reports (the question-answering scenario is indirectly converted into database retrieval, and index data in the annual report data are processed and extracted in advance by the large model); through benchmarking analysis, chatGLM2, currently the best-performing model in the Chinese domain, is selected as the base large model for the index extraction stage;
step 2, data extraction: based on the financial industry indexes extracted in step 1, use a large model to perform word segmentation on the annual report data, extract the corresponding indexes and index values, and store them in an index database; a field association table records the association between each index and the corresponding index field in the database (because only the word segmentation capability of the large model is used, the index values stored in the database are true and reliable); based on the characteristics of the financial industry, the formulas for calculating specific indexes are also stored in the database (so that, together with step 1, the database holds both the basic index data and the calculation formulas from which other indexes can be computed from those basic data);
step 3, index query: when a user raises a question to be analyzed, the large model performs word segmentation on the question to find the financial indexes the user wants to query; the association between those indexes and the index fields is then looked up in the field association table of step 2; the related contents are passed together to the large model, which generates the corresponding database query statement; the indexes and formulas related to the question are thereby retrieved (when a user question is answered, the real index values in the database are returned, so the large model answers with the real index values from the database, which ensures the accuracy of the answer); generating the corresponding SQL statement from the user's text input requires a large model with strong programming capability, and Turbo is selected, through side-by-side comparison, as the base large model for generating SQL;
step 4, index result fusion: the indexes and formulas returned by the database query in step 3 are passed to the chatGLM2 large model for calculation and fusion; first, the specific value of each index is output, which is the real value from the database; second, the index values calculated from the indexes and formulas are output; finally, the index values and the calculation process are passed to the large model, which answers the user's question comprehensively based on its logical reasoning capability.
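The index query of step 3 can be sketched as below: find the indexes named in the question, look up their fields in the field association table, pass both to an SQL-generating model, and run the returned statement. This is a minimal sketch under several assumptions. Plain substring matching stands in for the large model's word segmentation, `stub_sql_model` is a hypothetical placeholder that returns a fixed statement instead of calling Turbo, and the schema, company name, and numbers are invented for illustration. The SELECT-only check is likewise an added precaution, not something the application describes.

```python
import sqlite3

# Hypothetical index database with two basic indexes already stored.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE index_value (company TEXT, year INTEGER, field_name TEXT, value REAL);
    CREATE TABLE field_association (index_name TEXT PRIMARY KEY, field_name TEXT);
    INSERT INTO index_value VALUES ('ExampleBank', 2022, 'npl_balance', 120.0);
    INSERT INTO index_value VALUES ('ExampleBank', 2022, 'total_loans', 10000.0);
    INSERT INTO field_association VALUES ('non-performing loan balance', 'npl_balance');
    INSERT INTO field_association VALUES ('total loans', 'total_loans');
    """
)


def find_indexes(question, known_indexes):
    """Stand-in for large-model word segmentation: plain substring matching."""
    lowered = question.lower()
    return [name for name in known_indexes if name.lower() in lowered]


def lookup_fields(conn, index_names):
    """Map each index name to its database field via the field association table."""
    fields = {}
    for name in index_names:
        row = conn.execute(
            "SELECT field_name FROM field_association WHERE index_name = ?", (name,)
        ).fetchone()
        if row:
            fields[name] = row[0]
    return fields


def build_sql_prompt(question, associations):
    """Pass the question and the index-to-field associations to the SQL model."""
    lines = [
        "Write one SQLite SELECT statement that answers the question.",
        "Table: index_value(company, year, field_name, value)",
        "Index name -> field_name:",
    ]
    lines += [f"- {index} -> {field}" for index, field in associations.items()]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


def stub_sql_model(prompt):
    """Hypothetical placeholder for the SQL-generating model; returns a fixed query."""
    return (
        "SELECT field_name, value FROM index_value "
        "WHERE company = 'ExampleBank' AND year = 2022 "
        "AND field_name IN ('npl_balance', 'total_loans')"
    )


def run_query(conn, sql):
    """Execute generated SQL, refusing anything that is not a SELECT (assumption)."""
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT statements are executed")
    return conn.execute(sql).fetchall()


question = "What were ExampleBank's non-performing loan balance and total loans in 2022?"
known = [row[0] for row in conn.execute("SELECT index_name FROM field_association")]
associations = lookup_fields(conn, find_indexes(question, known))
prompt = build_sql_prompt(question, associations)
rows = run_query(conn, stub_sql_model(prompt))
print(rows)
```

The values in `rows` are read from the database rather than generated by a model, which is the property the method relies on for accuracy.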
While the fundamental principles and main features of the present application and advantages thereof have been shown and described, it will be apparent to those skilled in the art that the present application is not limited to the details of the above-described exemplary embodiments, but may be embodied in other specific forms without departing from the spirit or essential characteristics thereof; the present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Although embodiments of the present application have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the application, the scope of which is defined in the appended claims and their equivalents.

Claims (5)

1. A financial index accurate question-answering method based on a large model, characterized by comprising the following steps:
step 1, index extraction: based on a large amount of question data raised by historical users, write the prompt 'please help me extract professional financial industry indexes from the question', so that the large model extracts, from each single question, the financial industry indexes involved in annual reports;
step 2, data extraction: based on the financial industry indexes extracted in step 1, use a large model to perform word segmentation on the annual report data, extract the corresponding indexes and index values, and store them in an index database; a field association table records the association between each index and the corresponding index field in the database; the formulas for calculating specific indexes are also stored in the database;
step 3, index query: when a user raises a question to be analyzed, the large model performs word segmentation on the question to find the financial indexes the user wants to query; the association between those indexes and the index fields is then looked up in the field association table of step 2; the related contents are passed together to the large model, which generates the corresponding database query statement; the indexes and formulas related to the question are thereby retrieved;
step 4, index result fusion: the indexes and formulas returned by the database query in step 3 are passed to the chatGLM2 large model for calculation and fusion, and the user's question is answered comprehensively based on the logical reasoning capability of the large model.
2. The financial index accurate question-answering method based on a large model according to claim 1, characterized in that: the data source in step 1 includes multiple languages.
3. The financial index accurate question-answering method based on a large model according to claim 1, characterized in that: the large model in step 1 uses chatGLM2 as the base large model.
4. The financial index accurate question-answering method based on a large model according to claim 1, characterized in that: step 3 uses Turbo as the base large model for generating SQL.
5. The financial index accurate question-answering method based on a large model according to claim 1, characterized in that: in step 4, during calculation the chatGLM2 large model first outputs the specific value of each index, then outputs the index values calculated from the indexes and formulas, and finally passes the index values and the calculation process to the large model.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311529969.3A CN117668179A (en) 2023-11-16 2023-11-16 Financial index accurate question-answering method based on large model

Publications (1)

Publication Number Publication Date
CN117668179A (en) 2024-03-08

Family

ID=90085447

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination