CN118093635A - Data query method, device, equipment and computer readable storage medium - Google Patents

Publication number
CN118093635A
Authority
CN
China
Prior art keywords
path
data
data query
description
filtering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410487627.8A
Other languages
Chinese (zh)
Inventor
梁相
章华奥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Tonghuashun Data Development Co ltd
Original Assignee
Hangzhou Tonghuashun Data Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Tonghuashun Data Development Co ltd
Priority to CN202410487627.8A
Publication of CN118093635A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data query method, device, equipment and computer readable storage medium, applied to the field of data processing. The method comprises: describing access paths by using a large language description model according to a prompt to obtain path descriptions corresponding to the index details under each path; selecting, from the access paths, a preset number of initially selected access paths with the highest relevance according to a data query request and the path descriptions; filtering the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions to obtain filtered access paths; and selecting among the specific indexes under the filtered access paths by using a large language selection model according to the data query request to obtain target result data. The application processes the data query request with several large language models in sequence and produces accurate path descriptions, so that screening combines the data query request with the path descriptions, which improves the accuracy of the data query.

Description

Data query method, device, equipment and computer readable storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data query method, apparatus, device, and computer readable storage medium.
Background
Intelligent query refers to automatically identifying the intent or requirement of an input query by means of an algorithm or technique, and returning from a database the financial data results that meet the requirement. The prior art mainly relies on keyword matching queries. The experimental scenario is mainly a financial database organized as a multi-level system of a root directory and subdirectories.
A keyword matching query is implemented by defining a mapping between a set of specific keywords and financial data fields, and it depends mainly on the user entering accurate keywords. When the keywords entered by the user deviate from the predefined mapping set, query accuracy is insufficient and the data obtained are not the data the user intended.
Therefore, current data queries suffer from the technical problem of inaccurate results.
Disclosure of Invention
Accordingly, the present invention is directed to a data query method, apparatus, device, and computer readable storage medium, which solve the technical problem of inaccurate data query in the prior art.
In order to solve the technical problems, the invention provides a data query method, which comprises the following steps:
describing the access paths by using a large language description model according to a prompt to obtain path descriptions corresponding to the index details under each path; wherein the access path is a directory hierarchy, and the path description is a generalized description of the indexes;
selecting, from the access paths, a preset number of initially selected access paths with the highest relevance according to a data query request and the path descriptions;
filtering the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions to obtain filtered access paths;
and selecting among the specific indexes under the filtered access paths by using a large language selection model according to the data query request to obtain target result data.
Optionally, the selecting, from the access paths, a preset number of initially selected access paths with the highest relevance according to the data query request and the path descriptions includes:
performing vector-based semantic matching according to the data query request and the path descriptions, and selecting the initially selected access paths from the access paths according to the degree of semantic match.
Optionally, the filtering the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions to obtain filtered access paths includes:
performing reasoning-based filtering on the initially selected access paths by using a first large language reasoning model according to the data query request, to obtain an access path filtering analysis process and first filtered access paths.
Optionally, after the performing reasoning-based filtering on the initially selected access paths by using the first large language reasoning model according to the data query request to obtain the access path filtering analysis process and the first filtered access paths, the method further includes:
filtering the first filtered access paths by using a second large language reasoning model according to the data query request and the path descriptions to obtain second filtered access paths.
Optionally, the training process of the large language reasoning model includes:
training a large language model with training data whose input is data query request training data and path name training data and whose output is first filtering result training data, to obtain the large language reasoning model;
and/or training the large language model with training data whose input is data query request training data, path description training data and path name training data and whose output is second filtering result training data, to obtain the large language reasoning model.
Optionally, the describing the access paths by using a large language description model according to the prompt to obtain path descriptions corresponding to the index details under each path includes:
describing the root directories of the financial data by using the large language description model according to the prompt to obtain path descriptions corresponding to the index details under each root directory.
Optionally, the content of the prompt includes the requirements for the generalized description and a path description reference sample.
The application also provides a data query device, which comprises:
a path description module, used for describing the access paths by using a large language description model according to a prompt to obtain path descriptions corresponding to the index details under each path; wherein the access path is a directory hierarchy, and the path description is a generalized description of the indexes;
an initial selection module, used for selecting, from the access paths, a preset number of initially selected access paths with the highest relevance according to the data query request and the path descriptions;
a filtering module, used for filtering the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions to obtain filtered access paths;
and a target result data determining module, used for selecting among the specific indexes under the filtered access paths by using a large language selection model according to the data query request to obtain target result data.
The application also provides data query equipment, comprising:
a memory for storing a computer program;
and a processor for implementing the steps of the data query method described above when executing the computer program.
The present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data query method described above.
Therefore, the application describes the access paths by using a large language description model according to a prompt to obtain the path descriptions corresponding to the index details under each path, wherein the access path is a directory hierarchy and the path description is a generalized description of the indexes; selects, from the access paths, a preset number of initially selected access paths with the highest relevance according to the data query request and the path descriptions; filters the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions to obtain filtered access paths; and selects among the specific indexes under the filtered access paths by using a large language selection model according to the data query request to obtain target result data. Keyword-based queries are limited by the keywords themselves: when the keywords are not accurate enough, the query result is not accurate enough either. By contrast, the application first makes an initial selection according to the data query request and the path descriptions, then filters the selected paths with a large language reasoning model, and finally uses a large language selection model to choose among the specific indexes under the filtered access paths, thereby obtaining the target result data with improved accuracy.
In addition, the invention also provides a data query device, equipment and a computer readable storage medium, which also have the beneficial effects.
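The four steps summarized above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the implementation of the application: `run_data_query` and its four callable parameters (`describe`, `preselect`, `filter_paths`, `select_indexes`) are hypothetical names standing in for the large language description model, the vector-based initial selection, the large language reasoning model and the large language selection model.

```python
from typing import Callable, Dict, List

# Hypothetical type aliases; none of these names come from the application.
Describe = Callable[[List[str]], str]
Preselect = Callable[[str, Dict[str, str], int], List[str]]
FilterPaths = Callable[[str, List[str], Dict[str, str]], List[str]]
SelectIndexes = Callable[[str, List[str]], List[str]]


def run_data_query(
    query: str,
    index_details: Dict[str, List[str]],  # access path -> specific index names
    describe: Describe,
    preselect: Preselect,
    filter_paths: FilterPaths,
    select_indexes: SelectIndexes,
    preset_number: int = 15,
) -> Dict[str, List[str]]:
    """Run the four stages in order and return the chosen indexes per path."""
    # Step 1: one generalized description per access path.
    descriptions = {path: describe(names) for path, names in index_details.items()}
    # Step 2: keep the preset number of most relevant paths.
    initial = preselect(query, descriptions, preset_number)
    # Step 3: reasoning-based filtering of the initially selected paths.
    kept = filter_paths(query, initial, descriptions)
    # Step 4: choose specific indexes under each remaining path.
    return {path: select_indexes(query, index_details[path]) for path in kept}
```

Passing the models in as callables keeps the sketch independent of any particular model or retrieval library; each stage can be replaced by a stub when testing the control flow.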
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a data query method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a method for querying financial data according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an overall framework for querying financial data according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data query device according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of data query equipment according to an embodiment of the present invention;
In FIGS. 4-5, the reference numerals are as follows:
100-path description module;
200-initial selection module;
300-filtering module;
10-memory;
20-processor;
30-communication interface;
40-communication bus.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Some of the terms appearing in the description of the embodiments of the present application are explained as follows:
Large language model: commonly written as "Large Language Model" (LLM). A large language model is a machine learning model with a very large number of parameters and a complex computational structure, used in particular in the field of Natural Language Processing (NLP). These models are typically built from deep neural networks with billions of parameters or more. The design purpose of a large language model is to improve the expressive power and predictive performance of the model so that it can handle more complex natural language processing tasks such as text generation, semantic understanding, text classification and dialog systems. By pre-training on large amounts of text data, these models learn complex patterns and structures of language and thereby exhibit strong generalization ability and performance in a variety of downstream tasks.
Referring to fig. 1, fig. 1 is a flowchart of a data query method according to an embodiment of the present invention. The method may include:
S101, describing the access paths by using a large language description model according to a prompt to obtain path descriptions corresponding to the index details under each path; wherein the access path is a directory hierarchy and the path description is a generalized description of the indexes.
The execution subject of this embodiment is an electronic device, for example a computer or a tablet. The large language description model in this embodiment is a large language model whose input is the index details and whose output is the path description. An example of index details: the first-level label is industry economic data, the second-level label is automobile, the third-level label is product yield and sales, the fourth-level label is passenger car sales: by vehicle model, the fifth-level label is passenger car sales: A-brand car (monthly), and the specific indexes under the fifth-level label cover A-brand car versions 1 through 5, for example passenger car sales: A-brand car version 1: current-month value and passenger car sales: A-brand car version 1: current-month year-on-year. The prompt in this embodiment is a set of description requirements given to the large language description model. It may be a passage of text generated by a machine model and adjusted iteratively using the model.
The content of the prompt includes the requirements for the generalized description and a path description reference sample, which together help the model describe the access path fully. The embodiment does not limit the specific type of prompt. For example, the prompt may ask the model to generate a <path description> based on the <index details> under a financial data path, under an economic data path, or under an environmental data path; in each case the <path description> is a generalized description of the underlying specific indexes. For ease of understanding, the prompt in this embodiment may be, for example:
Please play the role of a securities analyst and generate a <path description> based on the <index details> under the financial data path. The <path description> is a generalized description of the underlying specific indexes. Note that the separator between the names of the underlying indexes is the enumeration comma (、), which can help you distinguish between different index names. The generated generalized description must meet the following requirements:
(1) The overall profile must be at least 70 words and must not be too general;
(2) Give 1 to 3 examples of the region information involved in the underlying specific indexes. Note that the region information must come from the underlying index names and must not go beyond them. If the region information includes China, list it first. If there is no region information, write "none";
(3) List and explain, one by one, all the index unit information involved in the underlying indexes. Note three points. First, take particular care not to omit year-on-year or period-on-period figures. Second, for indexes with the same unit, list only one of them; for example, period-on-period and year-on-year figures have the same unit, so only one of the two needs to be listed. Third, the index unit information must not go beyond the underlying index names and must not be fabricated;
(4) Finally, attach 3 index names, choosing 3 indexes that you consider representative. Note that the names of the representative indexes must come from the underlying indexes and must not go beyond the underlying index names.
The following is a reference sample.
<index details>:
The first-level label is industry economic data, the second-level label is automobile, the third-level label is product yield and sales, the fourth-level label is passenger car sales: by vehicle model, the fifth-level label is passenger car sales: A-brand car (monthly), and the specific indexes under the fifth-level label are as follows: passenger car sales: A-brand car version 1: current-month value, passenger car sales: A-brand car version 1: current-month year-on-year, passenger car sales: A-brand car version 1: cumulative value, and the corresponding current-month value, current-month year-on-year and cumulative value indexes listed for A-brand car versions 2 through 5.
<path description>:
(1) Overall profile: the underlying indexes under this path mainly cover sales data for different versions of A-brand passenger cars, including monthly sales, the year-on-year growth rate of monthly sales, cumulative sales since the start of each year, and so on;
(2) Region information involved in the underlying indexes under this path: none;
(3) Index unit information involved in the underlying indexes under this path: current-month value, current-month year-on-year, and cumulative value;
(4) Representative indexes: passenger car sales: A-brand car version 1: current-month year-on-year; passenger car sales: A-brand car version 1: cumulative value; passenger car sales: A-brand car version 2: current-month value.
Please follow the preceding requirements strictly and, using the writing format of the reference sample, generate a path description based on the following index details.
<index details>: <input>;
<path description>: <output>.
As the prompt above shows, the input of the large language description model is the index details and the output is the path description, that is, the content of the <path description> section. It will be appreciated that this embodiment may describe all the index details, that is, all the paths; alternatively, it may determine a subset of the index details according to the data query request and obtain path descriptions only for that subset. The embodiment does not limit the specific access path. For example, the access path may be a root directory or a second-level directory. With the path description, the content under a directory can be determined more precisely.
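How the index details might be rendered into such a prompt can be sketched as follows. This is only an illustration: `IndexDetails`, `render_index_details` and `build_description_prompt` are hypothetical names, the wording of the rendered line is an assumption, and joining index names with the enumeration comma "、" is one reading of the separator mentioned in the prompt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexDetails:
    """Label hierarchy of one access path and the specific indexes under it."""
    labels: List[str]   # first-level label first, deepest label last
    indexes: List[str]  # names of the specific (bottom-level) indexes


# Assumed separator between index names (enumeration comma).
INDEX_SEPARATOR = "、"


def render_index_details(details: IndexDetails) -> str:
    """Flatten the hierarchy into the single line that fills <index details>."""
    levels = ", ".join(
        f"level-{depth} label: {name}"
        for depth, name in enumerate(details.labels, start=1)
    )
    return f"{levels}, specific indexes: {INDEX_SEPARATOR.join(details.indexes)}"


def build_description_prompt(
    details: IndexDetails, requirements: str, reference_sample: str
) -> str:
    """Assemble requirements, reference sample and the new input into one prompt."""
    return "\n".join([
        requirements,
        "The following is a reference sample.",
        reference_sample,
        f"<index details>: {render_index_details(details)}",
        "<path description>:",  # the model is expected to continue from here
    ])
```

The requirements text and the reference sample are passed in as plain strings, so the same builder can serve financial, economic or environmental data paths by swapping those two arguments.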
It should be further noted that, in order to improve the efficiency of the data query, describing the access paths by using the large language description model according to the prompt to obtain the path descriptions corresponding to the index details under each path may include: describing the root directories of the financial data by using the large language description model according to the prompt to obtain path descriptions corresponding to the index details under each root directory. In this case the access path is a root directory. A database holds a large amount of data, which is grouped under root directories, and each root directory can be further divided into lower-level directories. Describing the root directory directly summarizes the large amount of data beneath it as a whole, so that a data query request can be located directly to the appropriate root directory. Because fewer path descriptions are needed, the overall efficiency of the data query improves.
S102, selecting, from the access paths, a preset number of initially selected access paths with the highest relevance according to the data query request and the path descriptions.
In this embodiment, the selection from the access paths relies not only on the data query request but also on the more detailed path descriptions, which improves the accuracy of the selection. The number of underlying indexes in each database is huge, usually on the order of millions, so searching the underlying indexes directly or having a large model reason over them is costly and inaccurate. Therefore, the upper-level access paths are screened first. Before the large model performs reasoning-based screening, an initial selection is made over the access paths by vector retrieval, so that paths with similar semantics but different keywords can be matched quickly and the load on the subsequent large model reasoning is reduced. The embodiment does not limit the specific preset number; for example, it may be 15 or 20. This step is mainly based on vector retrieval and uses the query sentence entered in natural language together with the access path descriptions produced by the path description module to make a first screening of the data access paths. For example, in the field of financial data, experiments were carried out with 5, 10, 15 and 20 initially selected paths, and the results show that with 15 initially selected paths the hit rate stably reaches 100%, that is, at least one of the 15 paths can satisfy the financial data retrieval requirement.
It should be further noted that, in order to improve the accuracy of determining the initially selected access paths, selecting the preset number of initially selected access paths with the highest relevance from the access paths according to the data query request and the path descriptions may include: performing vector-based semantic matching according to the data query request and the path descriptions, and selecting the initially selected access paths from the access paths according to the degree of semantic match. That is, the data query request is compared with the path description of each access path, and the preset number of access paths whose descriptions are most similar to the request are selected. Matching on vectors rather than on keywords improves the accuracy of determining the initially selected access paths.
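The vector-based initial selection can be illustrated with a short sketch. The application does not name an embedding model or a similarity measure, so both are assumptions here: `embed` is a toy bag-of-words vectorizer standing in for a real sentence-embedding model, cosine similarity is used as the matching degree, and `preselect_paths` is a hypothetical function name.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def preselect_paths(
    query: str, path_descriptions: Dict[str, str], preset_number: int = 15
) -> List[Tuple[str, float]]:
    """Return the preset number of paths whose descriptions best match the query."""
    query_vector = embed(query)
    scored = [
        (path, cosine(query_vector, embed(description)))
        for path, description in path_descriptions.items()
    ]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:preset_number]
```

With a real embedding model, only `embed` and `cosine` would change; the ranking and truncation to the preset number stay the same.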
S103, filtering the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions to obtain filtered access paths.
The embodiment does not limit the specific large language reasoning model (so named because it is used for reasoning). For example, the inputs of the large language reasoning model may be the <financial data query requirement> and the <alternative access paths> (the initially selected access paths), and the outputs may be the <analysis process of access path filtering> and the <filtering result>. Alternatively, the inputs may be the <financial data query requirement>, the <alternative access paths> (the initially selected access paths) and the path descriptions, with the same two outputs. The training process of the large language reasoning model may include: training a large language model with training data whose input is data query request training data and path name training data and whose output is first filtering result training data, to obtain the large language reasoning model; and/or training the large language model with training data whose input is data query request training data, path description training data and path name training data and whose output is second filtering result training data, to obtain the large language reasoning model. It can be appreciated that, to improve the accuracy of filtering, coarse filtering may first be performed according to the data query request, and fine filtering may then be performed according to the path descriptions.
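The two kinds of training data described above can be pictured as input/output records. The record layout below is purely an assumption for illustration: the application specifies what goes into the input and the output but not a file format, and the field names and the functions `first_stage_sample`, `second_stage_sample` and `to_jsonl` are hypothetical.

```python
import json
from typing import Dict, List


def first_stage_sample(
    query: str, path_names: List[str], filter_result: List[str]
) -> dict:
    """Coarse filter: input is the query and path names, output the first result."""
    return {
        "input": {"query": query, "path_names": path_names},
        "output": {"filter_result": filter_result},
    }


def second_stage_sample(
    query: str, path_descriptions: Dict[str, str], filter_result: List[str]
) -> dict:
    """Fine filter: the input additionally carries one description per path name."""
    return {
        "input": {
            "query": query,
            "path_names": list(path_descriptions),
            "path_descriptions": path_descriptions,
        },
        "output": {"filter_result": filter_result},
    }


def to_jsonl(samples: List[dict]) -> str:
    """Serialize samples one JSON object per line, keeping non-ASCII text readable."""
    return "\n".join(json.dumps(sample, ensure_ascii=False) for sample in samples)
```

The only structural difference between the two record types is the extra `path_descriptions` field, which mirrors the coarse-then-fine filtering order described above.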
It should be further noted that, in order to improve the accuracy of filtering, filtering the initially selected access paths by using the large language reasoning model according to the data query request and the path descriptions to obtain filtered access paths may include: performing reasoning-based filtering on the initially selected access paths by using a first large language reasoning model according to the data query request, to obtain an access path filtering analysis process and first filtered access paths. Because the first large language reasoning model produces the filtering analysis process together with the first filtered access paths, the reasoning behind each removal is explicit, which helps ensure the accuracy of filtering. For ease of understanding, an example of determining the first filtered access paths is given below. This module uses a large model to logically reason over and filter the preset number (15 for financial data) of access paths obtained by the retrieval-based initial selection. The prompt is as follows:
Please play the role of a securities analyst and, based on the <financial data query requirement> and the <alternative access paths>, generate the <analysis process of access path filtering> and the <filtering result>. Note that the following requirements must be met: (1) The <alternative access paths> must equal the paths culled in the <analysis process of access path filtering> plus the paths in the <filtering result>. (2) The path names in the <filtering result> must come from the <alternative access paths> and must not go beyond that range.
The following is a reference sample.
<financial data query requirement>: regional feature analysis of passenger car sales in the automotive industry;
<alternative access paths>:
Sequence number | Access path
--- | ---
Path 1 | industry economic data - automobile - product yield and sales - automobile industry-wide statistics: co-car sales (monthly)
Path 2 | industry economic data - automobile - product yield and sales - automobile industry-wide statistics: passenger meeting-passenger car sales (weekly)
Path 3 | industry economic data - automobile - product yield and sales - automobile industry-wide statistics: co-car yield (monthly)
Path 4 | industry economic data - automobile - product yield and sales - passenger car sales: by model - passenger car sales: A-brand car (monthly)
Path 5 | industry economic data - automobile - product yield and sales - passenger car sales: by model - passenger car sales: B-brand car (monthly)
Path 6 | industry economic data - automobile - product yield and sales - passenger car sales: by model - passenger car sales: C-brand car (monthly)
Path 7 | industry economic data - automobile - product yield and sales - passenger car sales: by model - passenger car sales: D-brand car (monthly)
Path 8 | industry economic data - automobile - product yield and sales - passenger car sales: by brand - passenger car sales: A-brand car (monthly)
Path 9 | industry economic data - automobile - product yield and sales - passenger car sales: by brand - passenger car sales: B-brand car (monthly)
Path 10 | industry economic data - automobile - product yield and sales - passenger car sales: by brand - passenger car sales: C-brand car (monthly)
Path 11 | industry economic data - automobile - product yield and sales - passenger car sales: by brand - passenger car sales: D-brand car (monthly)
Path 12 | industry economic data - automobile - product yield and sales - passenger cars: terminal sales: by region (monthly)
Path 13 | industry economic data - automobile - product yield and sales - passenger cars: terminal sales: by brand (monthly)
Path 14 | industry economic data - automobile - product yield and sales - passenger car sales: whole vehicle (monthly)
Path 15 | industry economic data - automobile - product yield and sales - car sales data - car yield (monthly)
< analysis procedure of fetch path filtration >:
(1) Since the regional characteristics of the sales of the passenger car are analyzed, the index data of the yield class is not needed. Thus, path 3, path 15 are culled;
(2) Because the regional characteristics of the whole sales volume of the passenger car are analyzed, sales volume data of a single vehicle type or a single brand is not representative, and cannot represent the whole sales volume data of the passenger car. Thus, culling path 4, path 5, path 6, path 7, path 8, path 9, path 10, path 11;
(3) Since the analysis targets the regional characteristics of passenger car sales, the whole-vehicle sales data under path 14 are not needed. Thus, path 14 is culled;
< filtration results >:
By filtering the fetch path, the remaining alternative paths are: path 1, path 2, path 12, path 13.
Strictly following the foregoing requirements and the format of the reference sample, generate the < analysis procedure of fetch path filtration > and the < filtration results > based on the following < financial data query requirement > and < alternative fetch path >.
< Financial data query requirement >: < input1>;
< alternative fetch path > (initial select path): < input2>;
< analysis procedure of fetch path filtration >: < output1>;
< filtration results >: < output2>.
This embodiment uses the first large language reasoning model to filter the initial selection access paths according to the prompt. The inputs of the first large language reasoning model are the data query request and the initial selection access paths, and the outputs are the access path filtering analysis process and the filtering result. Because the model performs reasoning analysis while filtering the initial selection access paths, both the filtering analysis process and the first filtered access paths are obtained, which helps ensure the accuracy of the filtering.
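Requirement (1) of the prompt above amounts to a partition check on the model output. The following is a minimal sketch of such a check, assuming the model's reply has already been parsed into the culled and retained path names; the function name `check_filter_output` and the parsing step are hypothetical and are not part of the application.

```python
def check_filter_output(alternatives, culled, kept):
    """Return a list of problems; an empty list means the output is consistent."""
    alt, cul, kep = set(alternatives), set(culled), set(kept)
    problems = []
    outside = (cul | kep) - alt      # names the model invented
    both = cul & kep                 # names reported as culled and kept at once
    missing = alt - (cul | kep)      # alternatives the model never mentioned
    if outside:
        problems.append(f"names not among the alternatives: {sorted(outside)}")
    if both:
        problems.append(f"paths both culled and kept: {sorted(both)}")
    if missing:
        problems.append(f"paths unaccounted for: {sorted(missing)}")
    return problems


# Reference sample from the prompt: 15 alternatives, 11 culled, 4 kept.
alternatives = [f"path {i}" for i in range(1, 16)]
culled = [f"path {i}" for i in (3, 15, 4, 5, 6, 7, 8, 9, 10, 11, 14)]
kept = [f"path {i}" for i in (1, 2, 12, 13)]
print(check_filter_output(alternatives, culled, kept))  # prints []
```

A non-empty result could be used to retry the model call, although the application itself does not describe a retry step.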
It should be further noted that, to further improve filtering accuracy, after performing reasoning-based filtering on the initial selection access paths by using the first large language reasoning model according to the data query request to obtain the access path filtering analysis process and the first filtered access paths, the method may further include: filtering the first filtered access paths by using a second large language reasoning model according to the data query request and the path descriptions, to obtain second filtered access paths. The second large language reasoning model thus filters further on top of the result of the first large language reasoning model. For ease of understanding, this embodiment presents the filtering process of the second large language reasoning model, which mainly relies on a large model for further logical reasoning and screening. The access paths that remain after filtering by the first large language reasoning model are paths that cannot be judged directly from their path names, so more specific reasoning and screening must be carried out in combination with the path descriptions. The prompt is as follows:
Please play the role of a securities analyst and, based on the < financial data query requirement >, < alternative fetch path >, and < path description >, generate the < analysis procedure for fetch path screening > and the < screening results >. Note that the following requirement must be met:
(1) The path names in the < screening results > must come from the < alternative fetch path > and must not fall outside this range.
The following is a reference sample.
< Financial data query requirement >: regional feature analysis of passenger car sales in the automotive industry;
< alternative fetch path >:
Path 1|industry economic data-automobile-product yield and sales-automobile full industry statistics: co-car sales (months);
Path 2|industry economic data-car-product yield and sales-car-all-industry statistics: passenger meeting-passenger car sales (weeks);
Path 12|industry economic data-car-product yield and sales-passenger cars: terminal sales: regional (month);
Path 13|industry economic data-car-product yield and sales-passenger cars: terminal sales: branding (month);
< path description >:
Path 1 describes:
(1) Overall profile: the bottom-level indexes under this path are monthly retail sales data for various types of passenger cars. The passenger car categories specifically include broad-sense passenger cars, narrow-sense passenger cars, sedans, SUVs (sport utility vehicles), MPVs (multi-purpose vehicles), and minibuses;
(2) Regional information related to the bottom-level indexes under this path: none;
(3) Index unit information related to the bottom-level indexes under this path: current month value, current month month-over-month change, current month year-over-year change, and cumulative value;
(4) Representative index: retail sales volume of passenger cars, broad sense passenger cars, retail sales volume of passenger cars, narrow sense passenger cars, retail sales volume of passenger cars, and passenger cars.
Path 2 describes:
(1) Overall profile: the bottom-level indexes under this path are the current-week manufacturers' average daily retail and wholesale data for passenger cars;
(2) Regional information: none;
(3) Index unit information related to the bottom-level indexes under this path: retail volume, wholesale volume, and year-over-year change;
(4) Representative index: the current-week manufacturers' average daily retail volume, the current-week manufacturers' average daily wholesale volume, and the year-over-year change in the current-week manufacturers' average daily retail volume.
Path 12 describes:
(1) Overall profile: the bottom-level indexes under this path are monthly terminal sales data of passenger cars by region;
(2) Regional information: Beijing, Tianjin, Shanghai, Henan, Hubei, etc.;
(3) Index unit information related to the bottom-level indexes under this path: current month value and current month month-over-month change;
(4) Representative index: passenger car: terminal sales: Beijing: current month value; passenger car: terminal sales: Hubei: current month value.
Path 13 describes:
(1) Overall profile: the bottom-level indexes under this path are monthly terminal sales data of passenger cars by brand;
(2) Regional information: none;
(3) Index unit information: current month value, current month month-over-month change, cumulative value, and cumulative year-over-year change;
(4) Representative index: passenger car: terminal sales: current month value; passenger car: terminal sales: F brand: current month value; passenger car: terminal sales: G brand: current month month-over-month change; passenger car: terminal sales: H brand: current month month-over-month change.
< analysis procedure for fetch path screening >:
(1) Evaluation of path 1: the bottom-level indexes under this path are mainly monthly retail sales data for various types of passenger cars, such as the current month retail sales of broad-sense passenger cars and sedans. However, the path contains only nationwide aggregate data and lacks per-region data, so the regional characteristics of passenger car sales cannot be analyzed. This path is therefore not available;
(2) Evaluation of path 2: the bottom-level indexes under this path are mainly the current-week manufacturers' average daily retail and wholesale data for passenger cars. However, the path contains only nationwide aggregate data and lacks per-region data, so the regional characteristics of passenger car sales cannot be analyzed. This path is therefore not available;
(3) Evaluation of path 12: the bottom-level indexes under this path are mainly monthly terminal sales data of passenger cars by region, such as the current month value of passenger car terminal sales in the Beijing area and in Hubei province, which can be used to analyze the regional characteristics of passenger car terminal sales. This path is therefore available;
(4) Evaluation of path 13: the bottom-level indexes under this path are mainly monthly terminal sales data of passenger cars by brand, such as the current month terminal sales of E-brand passenger cars and of H-brand passenger cars. However, the path contains only nationwide aggregate data for each brand and lacks per-region data, so the regional characteristics of passenger car sales cannot be analyzed. This path is therefore not available;
< screening results >:
Path 12|industry economic data-car-product yield and sales-passenger cars: terminal sales: regional (month);
Strictly following the foregoing requirements and the format of the reference sample, generate the < analysis procedure for fetch path screening > and the < screening results > based on the following < financial data query requirement >, < alternative fetch path >, and < path description >.
< Financial data query requirement >: < input1>;
< alternative fetch path > (first filtered fetch path): < input2>;
< path description >: < input3>;
< analysis procedure for fetch path screening >: < output1>;
< screening results >: < output2>.
In this embodiment, the inputs of the second large language reasoning model are the data query request, the first filtered access paths, and the path descriptions, and the outputs are the analysis process of access path screening and the further screening result. Because the path descriptions are taken into account during filtering, filtering accuracy is further improved.
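The three inputs of the second-stage prompt can be assembled mechanically. The sketch below shows one way to fill the < input1 > to < input3 > slots; the function name `build_screening_input` and the dictionary layout are illustrative assumptions, and the call to the model itself is deliberately left out.

```python
def build_screening_input(query, paths, descriptions):
    """Fill the <input1>..<input3> slots of the second-stage prompt.

    paths and descriptions are dicts keyed by path label, e.g. "Path 12".
    """
    missing = [label for label in paths if label not in descriptions]
    if missing:
        raise ValueError(f"no description for: {missing}")
    path_block = "\n".join(f"{label}|{path};" for label, path in paths.items())
    desc_block = "\n".join(
        f"{label} describes:\n{descriptions[label]}" for label in paths
    )
    return (
        f"< Financial data query requirement >: {query};\n"
        f"< alternative fetch path >:\n{path_block}\n"
        f"< path description >:\n{desc_block}\n"
    )


prompt_input = build_screening_input(
    "regional feature analysis of passenger car sales in the automotive industry",
    {"Path 12": "industry economic data-car-product yield and sales-passenger cars: terminal sales: regional (month)"},
    {"Path 12": "(1) Overall profile: monthly terminal sales of passenger cars by region"},
)
print(prompt_input)
```

Raising an error when a path has no description keeps the second stage from silently falling back to judging by path name alone, which is what the first stage already does.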
S104, selecting the specific indexes by using a large-scale language selection model according to the data query request and the specific indexes under the filtering access path to obtain target result data.
The large language model in this embodiment is used for selection and is therefore called a large language selection model to distinguish it from the large language models above. The large language selection model can select among the specific indexes under the filtered access paths according to the data query request, the access paths, and the access path filtering analysis process, to obtain the target result data. This module is mainly based on the large language selection model and further screens the bottom-level specific indexes under each filtered access path. Specific indexes in this embodiment are the lowest-level indexes under a filtered access path. For example, "industry economic data-car-product yield and sales-passenger cars: terminal sales: regional (month)" is a path, and the specific indexes under this path include: passenger car: terminal sales: Beijing: current month value; passenger car: terminal sales: Tianjin: current month value; passenger car: terminal sales: Shanghai: current month value; passenger car: terminal sales: Hubei: current month value; passenger car: terminal sales: Hubei: Wuhan: current month value; and so on.
The large model analyzes and screens the indexes under a single access path at a time, so resources can be allocated according to the number of screened paths and multiple tasks can run concurrently to improve operating efficiency. The prompt is as follows:
Please play the role of a securities analyst and, based on the < financial data query requirement >, < get path >, and < specific index under path >, return the < index screening result > in Markdown format with these 4 fields: index name, index meaning, inference logic, and whether the index is available. Note that the following requirement must be met:
(1) The index names in < index screening result > must come from < specific index under path >, and do not go beyond this range.
The following is a reference sample.
< Financial data query requirement >: regional feature analysis of passenger car sales in the automotive industry;
< get path >: industry economic data-car-product yield and sales-passenger cars: terminal sales: regional (month);
< specific index under path >:
Passenger car: terminal sales: Beijing: current month value; passenger car: terminal sales: Tianjin: current month value; passenger car: terminal sales: Shanghai: current month value; passenger car: terminal sales: Chongqing: current month value; passenger car: terminal sales: Hubei: current month value; passenger car: terminal sales: Hainan: current month value; passenger car: terminal sales: Henan: current month value; passenger car: terminal sales: Hubei: Wuhan: current month value; passenger car: terminal sales: Hubei: Yichang: current month value; passenger car: terminal sales: Henan: Zhengzhou: current month value; passenger car: terminal sales: Henan: Luoyang: current month value; etc.
< Index screening results >:
Index name|Index meaning|Inference logic|Available;
---|---|---|---;
Passenger car: terminal sales: Beijing: current month value|Current month value of passenger car terminal sales in the Beijing area|Beijing is a municipality directly under the central government; the passenger car sales data for the Beijing area are sufficient and can be used to analyze the regional characteristics of passenger car sales, so the index is available.|Available;
Passenger car: terminal sales: Tianjin: current month value|Current month value of passenger car terminal sales in the Tianjin area|Tianjin is a municipality directly under the central government; the passenger car sales data for the Tianjin area are sufficient and can be used to analyze the regional characteristics of passenger car sales, so the index is available.|Available;
Passenger car: terminal sales: Shanghai: current month value|Current month value of passenger car terminal sales in the Shanghai area|Shanghai is a municipality directly under the central government; the passenger car sales data for the Shanghai area are sufficient and can be used to analyze the regional characteristics of passenger car sales, so the index is available.|Available;
Passenger car: terminal sales: Chongqing: current month value|Current month value of passenger car terminal sales in the Chongqing area|Chongqing is a municipality directly under the central government; the passenger car sales data for the Chongqing area are sufficient and can be used to analyze the regional characteristics of passenger car sales, so the index is available.|Available;
Passenger car: terminal sales: Hubei: current month value|Current month value of passenger car terminal sales in Hubei province|Hubei is an important province; the passenger car sales data for Hubei province are sufficient and can be used to analyze the regional characteristics of passenger car sales, so the index is available.|Available;
Passenger car: terminal sales: Hainan: current month value|Current month value of passenger car terminal sales in Hainan province|The passenger car sales data for Hainan province are sufficient and can be used to analyze the regional characteristics of passenger car sales, so the index is available.|Available;
Passenger car: terminal sales: Henan: current month value|Current month value of passenger car terminal sales in Henan province|The passenger car sales data for Henan province are sufficient and can be used to analyze the regional characteristics of passenger car sales, so the index is available.|Available;
Passenger car: terminal sales: Hubei: Wuhan: current month value|Current month value of passenger car terminal sales in Wuhan, Hubei province|The regional characteristics of passenger car sales in China's automotive industry should be analyzed with province-level or municipality-level sales data, which are more representative. To keep the statistical caliber consistent, the passenger car sales data for Wuhan should not be used together with province-level or municipality-level data, so the index is not available.|Not available;
Passenger car: terminal sales: Hubei: Yichang: current month value|Current month value of passenger car terminal sales in Yichang, Hubei province|The regional characteristics of passenger car sales in China's automotive industry should be analyzed with province-level or municipality-level sales data, which are more representative. To keep the statistical caliber consistent, the passenger car sales data for Yichang should not be used together with province-level or municipality-level data, so the index is not available.|Not available;
Passenger car: terminal sales: Hainan: Haikou: current month value|Current month value of passenger car terminal sales in Haikou, Hainan province|The regional characteristics of passenger car sales in China's automotive industry should be analyzed with province-level or municipality-level sales data, which are more representative. To keep the statistical caliber consistent, the passenger car sales data for Haikou should not be used together with province-level or municipality-level data, so the index is not available.|Not available;
Passenger car: terminal sales: Hainan: Sanya: current month value|Current month value of passenger car terminal sales in Sanya, Hainan province|The regional characteristics of passenger car sales in China's automotive industry should be analyzed with province-level or municipality-level sales data, which are more representative. To keep the statistical caliber consistent, the passenger car sales data for Sanya should not be used together with province-level or municipality-level data, so the index is not available.|Not available;
Passenger car: terminal sales: Henan: Zhengzhou: current month value|Current month value of passenger car terminal sales in Zhengzhou, Henan province|The regional characteristics of passenger car sales in China's automotive industry should be analyzed with province-level or municipality-level sales data, which are more representative. To keep the statistical caliber consistent, the passenger car sales data for Zhengzhou should not be used together with province-level or municipality-level data, so the index is not available.|Not available;
Passenger car: terminal sales: Henan: Luoyang: current month value|Current month value of passenger car terminal sales in Luoyang, Henan province|The regional characteristics of passenger car sales in China's automotive industry should be analyzed with province-level or municipality-level sales data, which are more representative. To keep the statistical caliber consistent, the passenger car sales data for Luoyang should not be used together with province-level or municipality-level data, so the index is not available.|Not available;
Passenger car: terminal sales: Henan: complex city: current month value|Current month value of passenger car terminal sales in complex city, Henan province|The regional characteristics of passenger car sales in China's automotive industry should be analyzed with province-level or municipality-level sales data, which are more representative. To keep the statistical caliber consistent, the passenger car sales data for this city should not be used together with province-level or municipality-level data, so the index is not available.|Not available;
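Because the prompt asks for a four-field Markdown table, the model's reply can be post-processed with a small parser that also enforces requirement (1). The sketch below is illustrative only: the function name `parse_index_table`, the exact column order, and the literal flag values "available" and "not available" are assumptions about how the reply is formatted, and a cell that itself contains a pipe character would break this simple split.

```python
def parse_index_table(markdown, candidates):
    """Return the index names marked available in a 4-column Markdown table."""
    available = []
    for line in markdown.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue  # not a table row
        name, _meaning, _logic, flag = cells
        if set(name) <= set("-: ") or name.lower() == "index name":
            continue  # skip the separator row and the header row
        if name not in candidates:
            # Requirement (1): names must come from the indexes under the path.
            raise ValueError(f"unknown index name: {name}")
        if flag.rstrip(";").strip().lower() == "available":
            available.append(name)
    return available


beijing = "passenger car: terminal sales: Beijing: current month value"
wuhan = "passenger car: terminal sales: Hubei: Wuhan: current month value"
reply = f"""
Index name | Index meaning | Inference logic | Available
---|---|---|---
{beijing} | monthly terminal sales in Beijing | municipality-level data | available
{wuhan} | monthly terminal sales in Wuhan | city-level data | not available
"""
print(parse_index_table(reply, {beijing, wuhan}))
```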
The data query method provided by the embodiment of the application may include: S101, describing the access paths by using a large language description model according to the prompt, to obtain path descriptions corresponding to the index details under each path, wherein an access path is a directory hierarchy and a path description is a generalized description of the indexes; S102, selecting, from the access paths, a preset number of initial selection access paths with the highest relevance according to the data query request and the path descriptions; S103, filtering the initial selection access paths by using a large language reasoning model according to the data query request and the path descriptions, to obtain filtered access paths; S104, selecting among the specific indexes by using a large language selection model according to the data query request and the specific indexes under the filtered access paths, to obtain target result data. In existing keyword-based querying, query accuracy drops once the keywords entered by the user deviate from the predefined mapping set. By contrast, the application performs initial selection and filtering according to the data query request and the path descriptions with the help of large models, and then selects among the specific indexes under the filtered access paths by using the large language selection model, thereby obtaining the target result data and improving the accuracy of the data query.
In addition, this embodiment describes the root directory directly, so the large amount of data under the root directory can be summarized as a whole and the data query request can be located directly to a more accurate root directory; because the number of path descriptions is reduced, the efficiency of the data query is improved. The embodiment can also perform semantic matching based on vectors, which improves the accuracy of determining the initial selection access paths. Furthermore, the inputs of the first large language reasoning model are the data query request and the initial selection access paths, and the outputs are the access path filtering analysis process and the filtering result; because the model performs reasoning analysis while filtering, both the filtering analysis process and the first filtered access paths are obtained, which helps ensure filtering accuracy. Finally, the inputs of the second large language reasoning model are the data query request, the first filtered access paths, and the path descriptions, and the outputs are the analysis process of access path screening and the further screening result; combining the path descriptions during filtering further improves filtering accuracy.
In order to facilitate understanding of the application, the application provides a financial database intelligent query method based on a large model, wherein the financial database intelligent query refers to automatically identifying the input query intention or demand through an algorithm or technology, and returning a financial data result meeting the demand from the financial database. The prior art is primarily concerned with keyword matching queries for which the accuracy of the query can be very low once the user entered keywords deviate from a predefined set of mappings. Referring to fig. 2 specifically, fig. 2 is a flowchart illustrating a financial data query method according to an embodiment of the present application, which may specifically include:
S201, determining a target fetch path according to the data query request.
The embodiment will determine the closest fetch path as the target fetch path based on the data query request.
S202, describing a target access path of financial data by using a large language description model according to the prompt word, and obtaining path descriptions corresponding to index details under each path; wherein the fetch path includes a root directory, and the path description is a generalized description of the metrics.
When the target access path is obtained, the embodiment only uses a large language description model to describe the target access path of the financial data according to the prompt words, and the path description corresponding to the index details under each path is obtained. The large language description model in this embodiment may describe the target fetch path, such as the path description described above.
S203, filtering the target access path by using vector retrieval according to the data query request and the path description to obtain a rough filtering access path.
The embodiment uses vector retrieval to determine a preset number of paths with the path description closest to the data query request from the target fetch paths according to the data query request, and the paths are used as rough filtering fetch paths. The preset number in this embodiment may be 15, which corresponds to the above process of performing preliminary screening on the access path of the financial data by using the input query sentence in the form of natural language based on vector search.
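The coarse filtering step can be illustrated with a small top-k retrieval routine. This is a minimal sketch under stated assumptions: the application does not name an embedding model or a similarity measure, so cosine similarity and the toy two-dimensional vectors below are illustrative choices, and `top_k_paths` is a hypothetical helper name.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_paths(query_vec, description_vecs, k=15):
    """Return the k paths whose description embeddings are closest to the query.

    description_vecs maps each access path to the embedding of its description.
    """
    ranked = sorted(
        description_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [path for path, _ in ranked[:k]]


# Toy embeddings standing in for real description vectors.
vectors = {"path A": [1.0, 0.0], "path B": [0.0, 1.0], "path C": [1.0, 1.0]}
print(top_k_paths([1.0, 0.0], vectors, k=2))  # prints ['path A', 'path C']
```

With millions of bottom-level indexes, a production system would more likely use an approximate nearest-neighbour index than a full sort, but the selection rule is the same.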
S204, performing reasoning and filtering on the coarse filtering access path by using a first large language reasoning model according to the data query request to obtain an access path filtering and analyzing process and a fine filtering access path.
This embodiment uses the first large language reasoning model to perform reasoning-based filtering on the coarse filtering access paths according to the data query request, obtaining an access path filtering analysis process and the fine filtering access paths. The filtering analysis process records how the filtered access paths were determined from the data query request. This step corresponds to the process above of generating the < analysis procedure of fetch path filtration > and the < filtration results > based on the < financial data query requirement > and the < alternative fetch path >.
And S205, filtering the fine filtering access path by using a second large language reasoning model according to the data query request and the path description to obtain a complete filtering access path.
On the basis of the preceding filtering, this embodiment uses the second large language reasoning model to further filter the fine filtering access paths according to the data query request and the path descriptions, obtaining the complete filtering access paths. This step corresponds to the process above of generating the < analysis procedure for fetch path screening > and the < screening results > based on the < financial data query requirement >, < alternative fetch path >, and < path description >.
S206, selecting the specific indexes by using a large-scale language selection model according to the data query request and the specific indexes under the complete filtering access path to obtain target result data.
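Steps S201 to S206 form a linear pipeline in which each stage narrows the output of the previous one. The sketch below shows only that data flow; every stage is passed in as a callable, and the stage names and the lambda stubs are hypothetical stand-ins for the retrieval component and the large language models, not the application's implementation.

```python
def run_query(request, stages):
    """Chain the six steps; stages maps a step name to a callable."""
    target_paths = stages["locate"](request)                    # S201
    descriptions = stages["describe"](target_paths)             # S202
    coarse = stages["retrieve"](request, descriptions)          # S203
    _analysis, fine = stages["filter"](request, coarse)         # S204
    complete = stages["screen"](request, fine, descriptions)    # S205
    # S206: select specific indexes under each remaining path.
    return {path: stages["select"](request, path) for path in complete}


# Toy stubs that only demonstrate the shape of each stage's input and output.
stubs = {
    "locate": lambda req: ["p1", "p2", "p3"],
    "describe": lambda paths: {p: f"description of {p}" for p in paths},
    "retrieve": lambda req, desc: list(desc)[:2],
    "filter": lambda req, paths: ("analysis text", paths[:1]),
    "screen": lambda req, paths, desc: [p for p in paths if p in desc],
    "select": lambda req, path: [f"{path}/index-a"],
}
result = run_query("regional passenger car sales", stubs)
print(result)  # prints {'p1': ['p1/index-a']}
```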
The invention mainly solves the defects of the prior art scheme in two aspects when the intelligent inquiry of the financial database is realized: first, the tedious process of manually constructing and maintaining query sentences, word banks and rules is avoided. Second, the accuracy, flexibility and efficiency of the query are improved by utilizing multi-level filtering.
In order to facilitate understanding of the present invention, referring to fig. 3 in detail, fig. 3 is a schematic diagram of an overall framework of financial data query according to an embodiment of the present invention, which may specifically include:
The invention provides a large-model-based intelligent query method for financial databases that mirrors how a person queries a financial database product. On the one hand, a large-scale pre-trained language model (HITHINKGPT) is used as the basic framework; the model is pre-trained on a large amount of text data and thereby acquires rich semantic information and financial domain knowledge. On the other hand, a transfer learning strategy is adopted: the pre-trained model is fine-tuned on a specific data set in the financial field so that it adapts to the specific contexts and professional vocabulary of financial data queries. The specific modules of the scheme are as follows:
Module (1): a description module of financial data paths based on a large model.
This module is mainly based on a large model and generates a text description of each financial data access path by using tuned prompts. The resulting descriptions of the financial data access paths serve as one of the inputs to modules 2 and 4.
Module (2): a preliminary selection module of the financial index path based on vector retrieval.
This module takes into account that the number of bottom-level indexes in the financial index library is huge, usually in the millions; searching the bottom-level indexes directly, or having the large model reason over them directly, would impose a heavy load and yield low accuracy. Therefore, the upper-level access paths are screened first. Before large-model reasoning and screening, the access paths are initially selected through vector retrieval, so that paths with close semantics but different keywords can be matched quickly and the subsequent large-model reasoning load is reduced. The module is mainly based on a vector retrieval method and performs preliminary screening on the financial data access paths by using the input data query request and the descriptions of the access paths. Experimental results show that when the number of initially selected access paths is 15, a hit rate of 100% can be reached stably, that is, at least one of the 15 access paths can meet the financial data retrieval requirement.
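The hit rate mentioned above counts a query as a hit when at least one of its top-k retrieved paths satisfies the request. A minimal sketch of that metric follows; the function name `hit_rate_at_k` and the toy rankings are assumptions for illustration and do not reproduce the application's experimental data.

```python
def hit_rate_at_k(ranked_per_query, relevant_per_query, k=15):
    """Fraction of queries with at least one relevant path in the top k."""
    if not ranked_per_query:
        raise ValueError("at least one query is required")
    hits = sum(
        1
        for ranked, relevant in zip(ranked_per_query, relevant_per_query)
        if set(ranked[:k]) & set(relevant)
    )
    return hits / len(ranked_per_query)


# Two toy queries: the first has a relevant path at rank 3, the second has none.
ranked = [["a", "b", "c"], ["d", "e", "f"]]
relevant = [["c"], ["x"]]
print(hit_rate_at_k(ranked, relevant, k=3))  # prints 0.5
print(hit_rate_at_k(ranked, relevant, k=2))  # prints 0.0
```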
Module (3): an inference filtering module based on financial data paths of large models.
The module is mainly based on a large model and carries out logical reasoning and filtering on 15 financial data fetch paths which are searched and initially selected.
Module (4): and an inference selection module based on the financial data path of the large model.
The module is mainly based on a large model, and the access path after the filtering of the module 3 is further logically inferred and screened. The filtered fetch path of the module 3 is a path which cannot be directly judged by the path name, and more specific reasoning and screening are needed by further combining the path description of the module 1.
Module (5): a selection module for financial data indexes.
This module uses a large model to screen the specific bottom-level indexes under the access paths selected by module 4. Each large-model call analyzes and screens the indexes under a single access path, so resources can be allocated according to the number and size of the paths selected by module 4 and the calls can be run as concurrent tasks, which improves running efficiency.
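By way of illustration only, the per-path concurrency described for module (5) might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: `select_indexes_for_path` is a hypothetical stand-in for the large-model call (here a plain substring match), and the path names and index names are invented sample data.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: each filtered access path maps to its bottom-level indexes.
INDEXES_BY_PATH = {
    "stock/financial statements": ["net profit", "operating revenue", "total assets"],
    "stock/market data": ["closing price", "trading volume"],
}


def select_indexes_for_path(query, path, indexes):
    """Stand-in for one large-model call that screens the indexes under one path.

    Assumption: a real system would prompt a model here; this stub keeps an
    index only if its name appears verbatim in the query text.
    """
    return path, [name for name in indexes if name in query]


def select_indexes_concurrently(query, indexes_by_path, max_workers=4):
    """Run one selection task per access path in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(select_indexes_for_path, query, path, indexes)
            for path, indexes in indexes_by_path.items()
        ]
        results = dict(future.result() for future in futures)
    # Drop paths under which nothing was selected.
    return {path: hits for path, hits in results.items() if hits}
```

Because each task handles exactly one access path, `max_workers` can be tuned to the number and size of the paths that survive module 4; a thread pool is one possible choice when each task mostly waits on a remote model call.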
In summary, compared with the prior art, the intelligent query method of the financial database based on the large model can respond to complex and changeable user query demands more accurately and flexibly, has stronger generalization capability and user adaptability, and is expected to provide more intelligent and better-quality data query service in the financial field.
The following describes a data query device provided in the embodiments of the present invention, where the data query device described below and the data query method described above may be referred to correspondingly.
Referring to Fig. 4, which is a schematic structural diagram of a data query device according to an embodiment of the present invention, the device may include:
a path description module 100, configured to describe access paths by using a large language description model according to prompt words, so as to obtain a path description corresponding to the index details under each path, wherein each access path is a directory hierarchy and each path description is a generalized description of the indexes;
an initial selection module 200, configured to select, according to a data query request and the path descriptions, a preset number of initially selected access paths with the highest relevance from the access paths;
a filtering module 300, configured to filter the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions, so as to obtain a filtered access path; and
a target result data determining module 400, configured to select among the specific indexes under the filtered access path by using a large language selection model according to the data query request, so as to obtain target result data.
Further, based on the above embodiment, the initial selection module 200 may include:
a semantic matching unit, configured to perform vector-based semantic matching according to the data query request and the path descriptions, and to select the initially selected access paths from the access paths according to the degree of semantic matching.
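As a rough illustration of vector-based matching, the following sketch ranks path descriptions by cosine similarity to the query and keeps the top `k`. Everything here is an assumption made for the example: `embed` is a toy word-count vector standing in for a learned text-embedding model (so, unlike a real embedding, it only rewards shared words), and the function names and default `k=15` are chosen to echo the text, not taken from an actual implementation.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a word-count vector (assumption, replaces a learned model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def preselect_paths(query, path_descriptions, k=15):
    """Return the k access paths whose descriptions are most similar to the query."""
    query_vector = embed(query)
    scored = [
        (cosine(query_vector, embed(description)), path)
        for path, description in path_descriptions.items()
    ]
    # Stable sort: paths with equal scores keep their original order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [path for _, path in scored[:k]]
```

Swapping `embed` for a semantic embedding model is what would allow paths with close meaning but different keywords to be matched; the ranking and top-`k` cut would stay the same.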
Further, based on any of the above embodiments, the filtering module 300 may include:
a first filtering unit, configured to perform inference filtering on the initially selected access paths by using a first large language reasoning model according to the data query request, so as to obtain an access path filtering analysis process and a first filtered access path.
Further, based on the above embodiment, the filtering module 300 may further include:
a second filtering unit, configured to filter the first filtered access path by using a second large language reasoning model according to the data query request and the path descriptions, so as to obtain a second filtered access path.
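The two filtering units can be pictured as a two-stage function in which the first stage sees only path names and the second stage additionally sees the path descriptions. The sketch below is illustrative only: the two models are passed in as plain callables, and `toy_first_model` and `toy_second_model` are hypothetical rule-based stand-ins, not large language reasoning models.

```python
def two_stage_filter(query, candidate_paths, descriptions, first_model, second_model):
    """Filter by path name first, then by path description."""
    # Stage 1: the first model sees the query and path names only, and returns
    # its analysis text together with the paths it keeps.
    analysis, first_filtered = first_model(query, candidate_paths)
    # Stage 2: the second model also receives the descriptions of the kept paths.
    described = {path: descriptions[path] for path in first_filtered}
    second_filtered = second_model(query, described)
    return analysis, first_filtered, second_filtered


def toy_first_model(query, paths):
    """Hypothetical stand-in: keep paths whose top-level directory is named in the query."""
    query_words = query.lower().split()
    kept = [path for path in paths if path.split("/")[0] in query_words]
    return f"kept {len(kept)} of {len(paths)} paths by top-level name", kept


def toy_second_model(query, described_paths):
    """Hypothetical stand-in: keep paths whose description shares a word with the query."""
    query_words = set(query.lower().split())
    return [
        path
        for path, description in described_paths.items()
        if query_words & set(description.lower().split())
    ]
```

Returning the stage-one analysis alongside the kept paths mirrors the text, in which the first model outputs both a filtering analysis process and a filtering result.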
Further, based on any of the above embodiments, the data query device may further include:
a first training module, configured to train a large language model with training data whose input is data query request training data and path name training data and whose output is first filtering result training data, so as to obtain the large language reasoning model; and
a second training module, configured to train the large language model with training data whose input is data query request training data, path description training data and path name training data and whose output is second filtering result training data, so as to obtain the large language reasoning model.
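One way to picture the two kinds of training data is as input/output records serialized one per line. The record layout below is purely an assumption for illustration: the field names (`query`, `path_names`, `path_descriptions`, `filtering_result`) and the JSON Lines format are hypothetical and are not specified by the text.

```python
import json


def first_stage_sample(query, path_names, kept_paths):
    """Record for the first model: query + path names in, first filtering result out."""
    return {
        "input": {"query": query, "path_names": path_names},
        "output": {"filtering_result": kept_paths},
    }


def second_stage_sample(query, path_names, path_descriptions, kept_paths):
    """Record for the second model: query + path names + descriptions in, second result out."""
    return {
        "input": {
            "query": query,
            "path_names": path_names,
            "path_descriptions": path_descriptions,
        },
        "output": {"filtering_result": kept_paths},
    }


def to_jsonl(samples):
    """Serialize samples as JSON Lines, keeping non-ASCII text readable."""
    return "\n".join(json.dumps(sample, ensure_ascii=False) for sample in samples)
```

The only structural difference between the two records is the extra `path_descriptions` field, which matches the difference between the inputs of the first and second training modules.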
Further, based on any of the above embodiments, the path description module 100 may be specifically configured to:
describe the root directories of the financial data by using the large language description model according to the prompt words, so as to obtain a path description corresponding to the index details under each root directory.
Further, based on the above embodiment, the content of the prompt words may include requirements for the generalized description and reference examples of path descriptions.
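A prompt with these two ingredients (description requirements and reference examples) could be assembled along the following lines. This is a sketch only: the function name, the wording of the prompt and the layout are all hypothetical, and the actual prompt words used with the large language description model are not given in the text.

```python
def build_description_prompt(path, index_names, requirements, reference_examples):
    """Assemble a prompt asking a model to summarize the indexes under one path.

    requirements: list of strings stating what the generalized description must satisfy.
    reference_examples: list of (example_path, example_description) pairs.
    """
    lines = [
        "Task: write a generalized description of the indexes under the access path below.",
        "",
        "Requirements:",
    ]
    lines += [f"- {requirement}" for requirement in requirements]
    lines += ["", "Reference examples:"]
    for example_path, example_description in reference_examples:
        lines += [f"Path: {example_path}", f"Description: {example_description}"]
    # The final block is left open so that the model completes the description.
    lines += ["", f"Path: {path}", "Indexes: " + ", ".join(index_names), "Description:"]
    return "\n".join(lines)
```

Called once per root directory, such a builder would yield one prompt per path description to be generated.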
It should be noted that the order of the modules and units in the data query device may be changed, provided that the logic is not affected.
The data query device provided by the embodiment of the application may include: a path description module 100, configured to describe access paths by using a large language description model according to prompt words, so as to obtain a path description corresponding to the index details under each path, wherein each access path is a directory hierarchy and each path description is a generalized description of the indexes; an initial selection module 200, configured to select, according to a data query request and the path descriptions, a preset number of initially selected access paths with the highest relevance from the access paths; a filtering module 300, configured to filter the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions, so as to obtain a filtered access path; and a target result data determining module 400, configured to select among the specific indexes under the filtered access path by using a large language selection model according to the data query request, so as to obtain target result data. Current keyword-based queries become inaccurate as soon as the keyword entered by a user deviates from the predefined mapping set. By contrast, the method and device of the application initially select access paths according to the data query request and the path descriptions, filter them with a large language reasoning model, and then use a large language selection model to select among the specific indexes under the filtered access path according to the data query request, so as to obtain the target result data, which improves the accuracy of the data query.
In addition, this embodiment describes the root directories directly, so the large amount of data under each root directory is summarized as a whole and the data query request can be located directly to a more accurate root directory; because this reduces the number of path descriptions, it also improves the efficiency of the data query. This embodiment also performs vector-based semantic matching, which improves the accuracy with which the initially selected access paths are determined. The input of the first large language reasoning model in this embodiment is the data query request and the initially selected access paths, and its output is an access path filtering analysis process and a filtering result. Because the model performs inference analysis while filtering the initially selected access paths and outputs the access path filtering analysis process together with the first filtered access path, the accuracy of the filtering is ensured. The input of the second large language reasoning model in this embodiment is the data query request, the first filtered access path and the path descriptions, and its output is an analysis process of the access path screening and a further screening result. Because the path descriptions are used during this filtering, the filtering accuracy is further improved.
The following describes data query equipment provided in the embodiments of the present invention; the data query equipment described below and the data query method described above may be referred to in correspondence with each other.
Referring to Fig. 5, which is a schematic structural diagram of data query equipment according to an embodiment of the present invention, the equipment may include:
a memory 10 for storing a computer program; and
a processor 20 for executing the computer program to implement the steps of the data query method described above.
The memory 10, the processor 20 and a communication interface 30 communicate with one another via a communication bus 40.
In the embodiment of the present invention, the memory 10 is used for storing one or more programs; a program may include program code, and the program code includes computer operation instructions. The memory 10 may store programs for implementing the following functions:
describing access paths by using a large language description model according to prompt words to obtain a path description corresponding to the index details under each path, wherein each access path is a directory hierarchy and each path description is a generalized description of the indexes;
selecting, according to a data query request and the path descriptions, a preset number of initially selected access paths with the highest relevance from the access paths;
filtering the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions to obtain a filtered access path; and
selecting among the specific indexes under the filtered access path by using a large language selection model according to the data query request to obtain target result data.
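The four functions listed above chain naturally into a single query routine. The sketch below shows only that wiring, under explicit assumptions: each stage is injected as a callable, and the `toy_*` functions and `TOY_INDEXES` data are hypothetical rule-based stand-ins that make the example runnable; none of them represents an actual large language model.

```python
from typing import Callable, Dict, List


def run_query(
    query: str,
    paths: List[str],
    describe: Callable[[str], str],
    preselect: Callable[[str, Dict[str, str]], List[str]],
    filter_paths: Callable[[str, List[str], Dict[str, str]], List[str]],
    select_indexes: Callable[[str, List[str]], List[str]],
) -> List[str]:
    """Chain the four stages: describe, initially select, filter, select indexes."""
    descriptions = {path: describe(path) for path in paths}
    candidates = preselect(query, descriptions)
    filtered = filter_paths(query, candidates, descriptions)
    return select_indexes(query, filtered)


# Hypothetical sample data and stand-in stages, for demonstration only.
TOY_INDEXES = {
    "stock/financials": ["net profit", "revenue"],
    "stock/quotes": ["closing price"],
    "fund/holdings": ["top holdings"],
}


def _words(text: str) -> set:
    return set(text.lower().replace(",", " ").split())


def toy_describe(path: str) -> str:
    return "covers " + ", ".join(TOY_INDEXES[path])


def toy_preselect(query: str, descriptions: Dict[str, str], k: int = 2) -> List[str]:
    overlap = lambda path: len(_words(query) & _words(descriptions[path]))
    return sorted(descriptions, key=overlap, reverse=True)[:k]


def toy_filter(query: str, candidates: List[str], descriptions: Dict[str, str]) -> List[str]:
    return [path for path in candidates if _words(query) & _words(descriptions[path])]


def toy_select(query: str, filtered: List[str]) -> List[str]:
    return [
        name
        for path in filtered
        for name in TOY_INDEXES[path]
        if _words(name) <= _words(query)
    ]
```

In practice the path descriptions would be generated once and cached, since they depend only on the directory structure and not on the individual query.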
In one possible implementation, the memory 10 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, and at least one application program required for functions, etc.; the storage data area may store data created during use.
In addition, memory 10 may include read only memory and random access memory and provide instructions and data to the processor. A portion of the memory may also include NVRAM. The memory stores an operating system and operating instructions, executable modules or data structures, or a subset thereof, or an extended set thereof, where the operating instructions may include various operating instructions for performing various operations. The operating system may include various system programs for implementing various basic tasks as well as handling hardware-based tasks.
The processor 20 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field-programmable gate array (FPGA) or another programmable logic device; the processor 20 may also be a microprocessor or any conventional processor. The processor 20 may call the programs stored in the memory 10.
The communication interface 30 may be an interface of a communication module for connecting with other devices or systems.
Of course, it should be noted that the structure shown in Fig. 5 does not limit the data query equipment in the embodiment of the present invention; in practical applications, the data query equipment may include more or fewer components than those shown in Fig. 5, or may combine some of the components.
The following describes a computer readable storage medium provided in an embodiment of the present invention, where the computer readable storage medium described below and the data query method described above may be referred to correspondingly.
The invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the steps of the data query method when being executed by a processor.
The computer readable storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the device disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief, and the relevant details can be found in the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both; to clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
Finally, it is further noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The foregoing has described in detail the data query method, device, equipment and computer readable storage medium of the present invention. Specific examples have been used herein to illustrate the principles and embodiments of the present invention, and the description of the above embodiments is intended only to assist in understanding the method and its core ideas. Meanwhile, those skilled in the art may, in accordance with the ideas of the present invention, make changes to the specific embodiments and the scope of application. In view of the above, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A data query method, comprising:
describing access paths by using a large language description model according to prompt words to obtain a path description corresponding to the index details under each path, wherein each access path is a directory hierarchy and each path description is a generalized description of the indexes;
selecting, according to a data query request and the path descriptions, a preset number of initially selected access paths with the highest relevance from the access paths;
filtering the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions to obtain a filtered access path; and
selecting among the specific indexes under the filtered access path by using a large language selection model according to the data query request to obtain target result data.
2. The data query method of claim 1, wherein the selecting, according to a data query request and the path descriptions, a preset number of initially selected access paths with the highest relevance from the access paths comprises:
performing vector-based semantic matching according to the data query request and the path descriptions, and selecting the initially selected access paths from the access paths according to the degree of semantic matching.
3. The data query method of claim 1, wherein the filtering the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions to obtain a filtered access path comprises:
performing inference filtering on the initially selected access paths by using a first large language reasoning model according to the data query request to obtain an access path filtering analysis process and a first filtered access path.
4. The data query method of claim 3, further comprising, after the performing inference filtering on the initially selected access paths by using a first large language reasoning model according to the data query request to obtain an access path filtering analysis process and a first filtered access path:
filtering the first filtered access path by using a second large language reasoning model according to the data query request and the path descriptions to obtain a second filtered access path.
5. The data query method of any one of claims 1 to 4, wherein the training process of the large language reasoning model comprises:
training a large language model with training data whose input is data query request training data and path name training data and whose output is first filtering result training data, to obtain the large language reasoning model;
and/or training the large language model with training data whose input is data query request training data, path description training data and path name training data and whose output is second filtering result training data, to obtain the large language reasoning model.
6. The data query method of claim 1, wherein the describing access paths by using a large language description model according to prompt words to obtain a path description corresponding to the index details under each path comprises:
describing the root directories of financial data by using the large language description model according to the prompt words to obtain a path description corresponding to the index details under each root directory.
7. The data query method of claim 1, wherein the content of the prompt words includes requirements for the generalized description and reference examples of path descriptions.
8. A data query device, comprising:
a path description module, configured to describe access paths by using a large language description model according to prompt words to obtain a path description corresponding to the index details under each path, wherein each access path is a directory hierarchy and each path description is a generalized description of the indexes;
an initial selection module, configured to select, according to a data query request and the path descriptions, a preset number of initially selected access paths with the highest relevance from the access paths;
a filtering module, configured to filter the initially selected access paths by using a large language reasoning model according to the data query request and the path descriptions to obtain a filtered access path; and
a target result data determining module, configured to select among the specific indexes under the filtered access path by using a large language selection model according to the data query request to obtain target result data.
9. Data query equipment, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the data query method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the data query method of any of claims 1 to 7.
CN202410487627.8A 2024-04-23 2024-04-23 Data query method, device, equipment and computer readable storage medium Pending CN118093635A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410487627.8A CN118093635A (en) 2024-04-23 2024-04-23 Data query method, device, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN118093635A true CN118093635A (en) 2024-05-28

Family

ID=91150304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410487627.8A Pending CN118093635A (en) 2024-04-23 2024-04-23 Data query method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN118093635A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210382923A1 (en) * 2020-06-04 2021-12-09 Louis Rudolph Gragnani Systems and methods of question answering against system of record utilizing natural language interpretation
CN117251538A (en) * 2023-08-08 2023-12-19 杭州阿里云飞天信息技术有限公司 Document processing method, computer terminal and computer readable storage medium
CN117539990A (en) * 2023-11-03 2024-02-09 重庆数智逻辑科技有限公司 Problem processing method and device, electronic equipment and storage medium
CN117743543A (en) * 2023-12-20 2024-03-22 北京百度网讯科技有限公司 Sentence generation method and device based on large language model and electronic equipment
CN117744753A (en) * 2024-02-19 2024-03-22 浙江同花顺智能科技有限公司 Method, device, equipment and medium for determining prompt word of large language model
CN117744804A (en) * 2024-02-19 2024-03-22 粤港澳大湾区数字经济研究院(福田) Reasoning method, terminal and medium of financial analysis task based on large language model
CN117788109A (en) * 2023-12-26 2024-03-29 拉扎斯网络科技(上海)有限公司 Method for generating commodity label based on large language model and electronic equipment
US20240112074A1 (en) * 2022-09-30 2024-04-04 International Business Machines Corporation Natural language query processing based on machine learning to perform a task

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
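张秀红; 刘纪平; 王勇; 罗安: "面向自然语言空间方向关系查询的语义扩展框架" [A semantic expansion framework for natural-language queries on spatial direction relations], 地理与地理信息科学 [Geography and Geo-Information Science], no. 06, 15 November 2018 (2018-11-15) *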

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination