CN117668179A - Financial index accurate question-answering method based on large model - Google Patents
- Publication number
- CN117668179A (application number CN202311529969.3A)
- Authority
- CN
- China
- Prior art keywords
- index
- model
- indexes
- large model
- financial
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a financial index accurate question-answering method based on a large model. The method belongs to the technical field of large models and comprises the steps of index extraction, data extraction, index query, and index result fusion.
Description
Technical Field
The application relates to the technical field of large models, in particular to a financial index accurate question-answering method based on a large model.
Background
Large model: a large-scale language model, such as chatGPT or chatGLM;
Large model hallucination: when the text generated by a model does not follow or agree with the source text, the model is considered to have a hallucination problem, colloquially described as "talking nonsense with a straight face";
Prompt: the prompt text that directs a large model to extract or generate text according to the stated requirements;
chatGLM2: an open-source Chinese large model, described here as the best-performing one available at the time of filing;
Turbo large model: a GPT large-model version made available by OpenAI;
Langchain: a knowledge-base retrieval approach based on a large model and vector retrieval, generally used to alleviate the hallucination problem of large models;
SQL: a standard computer language for accessing and processing databases;
For companies in the financial field, annual report analysis involves many finance-related indexes, such as the provision coverage ratio, non-performing loan ratio, profit growth rate, narrowing of the interest margin, and net profit attributable to the parent company. Professionals are usually needed to read a report and understand how the enterprise performs on each index, whereas most non-professionals cannot obtain the information they want, which creates a large information gap. Since large-model technology appeared, many financial companies have tried to solve this problem, including by training large financial models for the vertical domain to improve results, but the accuracy of the index data obtained cannot be guaranteed. Although Langchain-based vector retrieval combined with a vertical financial large model can effectively improve the situation, 100% accuracy still cannot be guaranteed, so the existing methods cannot be applied to scenarios in the financial industry that demand extremely high accuracy.
Disclosure of Invention
The application aims to provide a financial index accurate question-answering method based on a large model so as to solve the problems in the background technology.
In order to achieve the above purpose, the present application provides the following technical solutions: a financial index accurate question-answering method based on a large model comprises the following steps:
step 1, index extraction: based on a large amount of question data raised by historical users, a prompt is written, "please help me extract professional financial industry indexes from the question", so that the large model extracts the financial industry indexes involved in annual reports from each single question;
step 2, data extraction: based on the financial industry indexes extracted in step 1, a large model performs word segmentation on the annual report data, and the corresponding indexes and index values are extracted and stored in an index database; a field association table records the association between each index and its field in the database; the formulas for calculating specific indexes are also stored in the database;
step 3, index query: when a user raises a question to be analyzed, the large model performs word segmentation on the question and finds the financial indexes the user wants to query; the associations between those indexes and their index fields are then looked up in the field association table of step 2, and the related content is passed to the large model together, which generates the corresponding database query statement to retrieve the indexes and formulas related to the question;
step 4, index result fusion: the indexes and formulas returned by the database query in step 3 are passed to the chatGLM2 large model for calculation and fusion, and the user's question is answered comprehensively based on the logical reasoning capability of the large model.
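The storage described in step 2 can be pictured with a minimal sketch. It assumes an SQLite database and hypothetical table names (`index_values`, `field_association`, `index_formulas`), field names, company name, and figures, none of which come from the application; the application does not specify a database engine or schema.

```python
import sqlite3

def build_index_db(conn: sqlite3.Connection) -> None:
    """Create the three hypothetical tables: values, index-to-field links, formulas."""
    conn.executescript("""
        CREATE TABLE index_values (
            company TEXT, year INTEGER, field TEXT, value REAL,
            PRIMARY KEY (company, year, field)
        );
        CREATE TABLE field_association (
            index_name TEXT PRIMARY KEY, field TEXT NOT NULL
        );
        CREATE TABLE index_formulas (
            index_name TEXT PRIMARY KEY, formula TEXT NOT NULL
        );
    """)

def store_index(conn, company, year, index_name, field, value):
    """Record the index-to-field association and the extracted value."""
    conn.execute("INSERT OR IGNORE INTO field_association VALUES (?, ?)",
                 (index_name, field))
    conn.execute("INSERT OR REPLACE INTO index_values VALUES (?, ?, ?, ?)",
                 (company, year, field, value))

def lookup_value(conn, company, year, index_name):
    """Resolve an index name to its field, then return the stored value or None."""
    row = conn.execute(
        "SELECT v.value FROM field_association a "
        "JOIN index_values v ON v.field = a.field "
        "WHERE a.index_name = ? AND v.company = ? AND v.year = ?",
        (index_name, company, year),
    ).fetchone()
    return None if row is None else row[0]

# Made-up figures for a made-up company, for illustration only.
conn = sqlite3.connect(":memory:")
build_index_db(conn)
store_index(conn, "ExampleBank", 2022, "non-performing loan balance", "npl_balance", 120.0)
store_index(conn, "ExampleBank", 2022, "loan loss provisions", "loan_loss_provision", 300.0)
conn.execute("INSERT INTO index_formulas VALUES (?, ?)",
             ("provision coverage ratio", "loan_loss_provision / npl_balance"))
```

Keeping the index name separate from the database field is what lets step 3 translate a user's wording into a column that actually exists before any query statement is generated.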
Preferably, the data source in step 1 includes multiple languages.
Preferably, the large model in the step 1 adopts chatGLM2 as a base large model.
Preferably, step 3 adopts Turbo as the base large model for generating SQL.
Preferably, in step 4, the chatGLM2 large model first outputs the specific value of each index, then outputs the index values calculated from the indexes and formulas, and finally the index values and the calculation process are passed to the large model.
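The output order preferred for step 4 (base values first, calculated values second, then everything handed to the model) might be sketched as follows. Note the substitution: the application has the chatGLM2 large model perform the calculation, whereas this sketch does the arithmetic in plain Python purely to make the ordering concrete. The function names, field names, and figures are hypothetical.

```python
import ast
import operator

# Only the four basic arithmetic operators are supported in this sketch.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate_formula(formula: str, values: dict) -> float:
    """Evaluate an arithmetic formula whose names are database field names."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.Name):
            return values[node.id]
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported formula element: {ast.dump(node)}")
    return _eval(ast.parse(formula, mode="eval"))

def fuse_results(base_values: dict, formulas: dict):
    """Return the calculated indexes and the text that would be passed to the model."""
    lines = []
    # First: the base values exactly as returned by the database.
    for field, value in base_values.items():
        lines.append(f"{field} = {value}")
    # Second: the index values calculated from the base values and formulas.
    derived = {}
    for name, formula in formulas.items():
        derived[name] = evaluate_formula(formula, base_values)
        lines.append(f"{name} = {formula} = {derived[name]}")
    # Finally: values plus calculation process, ready to hand to the large model.
    return derived, "\n".join(lines)

derived, context = fuse_results(
    {"loan_loss_provision": 300.0, "npl_balance": 120.0},  # made-up figures
    {"provision_coverage_ratio": "loan_loss_provision / npl_balance"},
)
```

Because the numbers in `context` come from the database and from explicit formulas, the model's remaining task is to phrase the answer rather than to recall any figure.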
Compared with the prior art, the beneficial effects of this application are:
The invention indirectly converts the question-answering scenario into a database retrieval task. Index data in the annual report data is processed and extracted in advance by the large model, and when a user question is answered, the real index values in the database are returned, so that the large model answers with the real index values from the database. This avoids the problem of the large model answering with false index data because of numerical "hallucination", and ensures the accuracy of the answer.
Drawings
Fig. 1 is a schematic flow chart of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
Examples:
referring to fig. 1, the present application provides a technical solution: a financial index accurate question-answering method based on a large model comprises the following steps:
step 1, index extraction: based on a large amount of question data raised by historical users, with a data source that includes multiple languages, a prompt is written, "please help me extract professional financial industry indexes from the question", so that the large model extracts the financial industry indexes involved in annual reports from each single question (the question-answering scenario is thereby indirectly converted into a database retrieval task, and the index data in the annual report data is processed and extracted in advance by the large model); through benchmarking analysis, chatGLM2, the best-performing model in the Chinese-language field at the time, is selected as the base large model for the index extraction stage;
step 2, data extraction: based on the financial industry indexes extracted in step 1, a large model performs word segmentation on the annual report data, and the corresponding indexes and index values are extracted and stored in an index database; a field association table records the association between each index and its field in the database (because only the word segmentation capability of the large model is used, the index values stored in the database are true and reliable); given the characteristics of the financial industry, the formulas for calculating specific indexes are also stored in the database (thus, together with step 1, the database stores the basic index data as well as the formulas by which other indexes can be calculated from that basic data);
step 3, index query: when a user raises a question to be analyzed, the large model performs word segmentation on the question and finds the financial indexes the user wants to query; the associations between those indexes and their index fields are then looked up in the field association table of step 2, and the related content is passed to the large model together, which generates the corresponding database query statement to retrieve the indexes and formulas related to the question (when the user's question is answered, the real index values in the database are returned, so that the large model answers with the real index values from the database, which ensures the accuracy of the answer); generating the corresponding SQL statement from the user's text input requires a large model with strong programming capability, and Turbo is selected through side-by-side comparison as the base large model for generating SQL in step 3;
step 4, index result fusion: the indexes and formulas returned by the database query in step 3 are passed to the chatGLM2 large model for calculation and fusion; first, the specific value of each index is output, and this value is the real value from the database; second, the index values calculated from the indexes and formulas are output; finally, the index values and the calculation process are passed to the large model, which answers the user's question comprehensively based on its logical reasoning capability.
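The query path of steps 1 and 3 can be illustrated with a minimal sketch in which the two large models are replaced by stand-in functions. Only the prompt wording of step 1 comes from the application. The table layout, the comma-separated reply format, the SELECT-only guard, the function names, and the data are assumptions made for illustration; real calls to chatGLM2 or Turbo are deliberately not shown.

```python
import sqlite3
from typing import Callable, List, Tuple

# Prompt wording from step 1; everything else here is a hypothetical illustration.
EXTRACT_PROMPT = "Please help me extract professional financial industry indexes from the question"

def extract_indexes(question: str, model: Callable[[str], str]) -> List[str]:
    """Ask the extraction model for index names (assumed to be comma-separated)."""
    reply = model(f"{EXTRACT_PROMPT}: {question}")
    return [name.strip() for name in reply.split(",") if name.strip()]

def build_sql_prompt(question: str, associations: dict) -> str:
    """Hand the question and the index-to-field associations to the SQL model together."""
    mapping = "\n".join(f"{name} -> field {field}" for name, field in associations.items())
    return (
        "Table index_values(company, year, field, value).\n"
        f"Index-to-field associations:\n{mapping}\n"
        f"Write one SQL SELECT statement that answers: {question}"
    )

def query_indexes(question, conn, extract_model, sql_model) -> List[Tuple]:
    """Find the indexes asked about, map them to fields, then run the generated SQL."""
    indexes = extract_indexes(question, extract_model)
    if not indexes:
        return []
    marks = ",".join("?" for _ in indexes)
    associations = dict(conn.execute(
        f"SELECT index_name, field FROM field_association WHERE index_name IN ({marks})",
        indexes,
    ).fetchall())
    sql = sql_model(build_sql_prompt(question, associations))
    # Assumed safeguard, not part of the application: run read-only statements only.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("refusing to run a non-SELECT statement")
    return conn.execute(sql).fetchall()

# Stand-ins for the index extraction model and the SQL generation model.
def fake_extract_model(prompt: str) -> str:
    return "provision coverage ratio"

def fake_sql_model(prompt: str) -> str:
    return ("SELECT field, value FROM index_values "
            "WHERE company = 'ExampleBank' AND year = 2022 "
            "AND field = 'provision_coverage_ratio'")

# Made-up data for a made-up company.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE index_values (company TEXT, year INTEGER, field TEXT, value REAL)")
conn.execute("CREATE TABLE field_association (index_name TEXT PRIMARY KEY, field TEXT)")
conn.execute("INSERT INTO index_values VALUES ('ExampleBank', 2022, 'provision_coverage_ratio', 2.5)")
conn.execute("INSERT INTO field_association VALUES ('provision coverage ratio', 'provision_coverage_ratio')")

rows = query_indexes("What was ExampleBank's provision coverage ratio in 2022?",
                     conn, fake_extract_model, fake_sql_model)
```

The value that reaches the answer is read from the database row, so a wrong SQL statement can return the wrong row or nothing at all, but it cannot invent a number.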
While the fundamental principles and main features of the present application and advantages thereof have been shown and described, it will be apparent to those skilled in the art that the present application is not limited to the details of the above-described exemplary embodiments, but may be embodied in other specific forms without departing from the spirit or essential characteristics thereof; the present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Although embodiments of the present application have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the application, the scope of which is defined in the appended claims and their equivalents.
Claims (5)
1. A financial index accurate question-answering method based on a large model, characterized by comprising the following steps:
step 1, index extraction: based on a large amount of question data raised by historical users, a prompt is written, "please help me extract professional financial industry indexes from the question", so that the large model extracts the financial industry indexes involved in annual reports from each single question;
step 2, data extraction: based on the financial industry indexes extracted in step 1, a large model performs word segmentation on the annual report data, and the corresponding indexes and index values are extracted and stored in an index database; a field association table records the association between each index and its field in the database; the formulas for calculating specific indexes are also stored in the database;
step 3, index query: when a user raises a question to be analyzed, the large model performs word segmentation on the question and finds the financial indexes the user wants to query; the associations between those indexes and their index fields are then looked up in the field association table of step 2, and the related content is passed to the large model together, which generates the corresponding database query statement to retrieve the indexes and formulas related to the question;
step 4, index result fusion: the indexes and formulas returned by the database query in step 3 are passed to the chatGLM2 large model for calculation and fusion, and the user's question is answered comprehensively based on the logical reasoning capability of the large model.
2. The financial index accurate question-answering method based on a large model according to claim 1, wherein the data source in step 1 comprises a plurality of languages.
3. The financial index accurate question-answering method based on a large model according to claim 1, wherein the large model in step 1 adopts chatGLM2 as the base large model.
4. The financial index accurate question-answering method based on a large model according to claim 1, wherein step 3 adopts Turbo as the base large model for generating SQL.
5. The financial index accurate question-answering method based on a large model according to claim 1, wherein in step 4, during calculation, the chatGLM2 large model first outputs the specific value of each index, then outputs the index values calculated from the indexes and formulas, and finally the index values and the calculation process are passed to the large model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311529969.3A CN117668179A (en) | 2023-11-16 | 2023-11-16 | Financial index accurate question-answering method based on large model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311529969.3A CN117668179A (en) | 2023-11-16 | 2023-11-16 | Financial index accurate question-answering method based on large model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117668179A (en) | 2024-03-08 |
Family
ID=90085447
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311529969.3A (Pending), published as CN117668179A (en) | Financial index accurate question-answering method based on large model | 2023-11-16 | 2023-11-16 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117668179A (en) |
- 2023-11-16: CN application CN202311529969.3A filed; published as CN117668179A (en); status: active, pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||