CN103927346A - Query connection method on basis of data volumes - Google Patents
- Publication number
- CN103927346A (application CN201410124531.1A)
- Authority
- CN
- China
- Prior art keywords
- statistical information
- data
- query
- data volume
- row
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
- G06F16/24558—Binary matching operations
- G06F16/2456—Join operations
Abstract
The invention discloses a join-query method based on data volume. For real-time queries over big data, the method takes into account characteristics such as how columnar-format files are read, so that costs are estimated more accurately and a better join order can be generated. The method mainly comprises: building a metadata server; collecting statistical information; querying the metadata server to obtain the statistics of each table participating in the join; estimating selectivity and related parameters such as data volume from those statistics; and computing the cost of each candidate execution plan to find the optimal join order. By improving the accuracy of cost estimation, the method helps the optimizer find the optimal execution plan and improves overall query efficiency.
Description
Technical field
The present invention relates to the field of real-time query optimization for big data, and in particular to a join-query method based on data volume.
Background technology
Real-time big data querying is an important big data technology; existing systems include Google Dremel, Cloudera Impala, Berkeley Shark, and Apache Drill. Such systems generally adopt a distributed computing architecture and, because they weaken support for features such as transactions, scale better than relational database clusters. Because they meet users' need for real-time queries well, they have broad application prospects in fields such as the Internet and smart cities.
Join-order optimization for multi-join queries is an important component of a database management system (DBMS) and is equally indispensable in real-time big data querying. Using a chosen optimization method, the optimizer traverses the search space of execution plans to find the best join order and generate the best execution plan, thereby improving the performance of the query system and meeting users' real-time requirements.
Cost estimation is a central part of join-order optimization, so an effective method for estimating result sizes is key to effective query optimization. The traditional cost estimation method is based on table cardinality; it handles the traditional cost estimation problem well and thereby helps find the optimal execution plan under the cost model. However, distributed database systems and data warehouses contain tables stored in columnar file formats, which are designed to improve low-level I/O performance and reduce the amount of data transferred. RCFile, for example, first partitions a table horizontally by rows and then partitions each piece vertically by columns, so that only the required columns are read and transferred. When tables stored in a columnar format participate in a join, cardinality-based cost estimates can deviate seriously. The execution plan that the join-order optimization algorithm selects under the cost model is then not actually the best one, the join order found is not optimal, and overall query latency rises.
Summary of the invention
The technical problem to be solved by the present invention is how to improve the accuracy of cost estimation when a real-time big data query system performs join-order optimization, and thereby improve overall query efficiency. To address the problems of traditional cardinality-based cost estimation described above, the invention proposes a data-volume-based cost estimation method for multi-join queries. Because some of the relations joined in a user's query may be stored as columnar files, the method takes the read characteristics of columnar files into account, adds finer-grained statistics, and uses the average length of each field to estimate the size of intermediate join results, which improves the accuracy of cost estimation.
A join-query method based on data volume comprises:
Step 1: submit a query request to the metadata server and obtain the statistics for each table participating in the join;
Step 2: estimate, from the statistics obtained, the data volume of all tables in the current execution plan;
Step 3: repeat steps 1 and 2 until the search space of execution plans has been traversed and the execution plan whose data volume minimizes the query cost has been found, then join the tables in the join order given by that execution plan.
Here the search space of execution plans is the set of table join orders over all execution plans.
The invention uses data volume as the query cost when determining the join order of a multi-join query, which improves the accuracy of cost estimation during join-order optimization in a real-time big data query system and thereby improves overall query efficiency.
The metadata server is built as follows: choose a relational database, design a column-level table schema, and create the metadata database and its tables in that relational database according to the schema.
So that the query system can be given statistics at three granularities (table level, partition level, and column level), the table schema should satisfy an appropriate normal form and, provided cost estimation can still be completed, avoid unnecessary storage overhead.
The statistics in the metadata server are kept per table and are obtained by analyzing each table according to the designed schema.
The granularity of the statistics follows the granularity of the schema: because the schema is column-level, the statistics include column-level statistics.
The relational database is a MySQL database, a Derby database, or an Oracle database.
A suitable relational database is chosen as the metadata server of the real-time big data query system according to the actual needs of the enterprise users and the system.
The statistics comprise: the column name, the lower bound of the values in the column, the upper bound of the values in the column, the number of null values in the column, the number of distinct values in the column, the average length of the field data in the column, the maximum length of the field data in the column, and the total number of rows of the table or view.
Both the construction of the metadata server and the storage of the statistics in it are completed offline.
Because the metadata server is built and the statistics are collected offline, returning statistics during an actual query incurs little run-time overhead, which substantially reduces the latency of cost estimation.
In step 2, the data volume of each table is calculated from the table's selectivity, the average data size of its fields, and the table's total row count.
Selectivity, generally written selectivity, is derived from statistics such as the lower and upper bounds of the column values, together with the relevant conditions of the join query.
Selectivity is estimated by evaluating the query conditions against the statistics to obtain the proportion of rows in the target set that satisfy the query conditions.
The target set is a table, a view, or an intermediate result.
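As an illustration of how selectivity could be derived from column bounds, the sketch below assumes that values are uniformly distributed between a column's lower and upper bound and that every distinct value is equally frequent. These assumptions, and the names `ColumnStats`, `range_selectivity`, and `equality_selectivity`, do not come from the patent; they are a common textbook simplification.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Hypothetical container for a few column-level statistics."""
    low: float          # lower bound of values in the column
    high: float         # upper bound of values in the column
    num_distinct: int   # number of distinct values in the column


def range_selectivity(stats: ColumnStats, upper: float) -> float:
    """Estimate the fraction of rows with value < upper (uniformity assumed)."""
    if stats.high <= stats.low:
        # Degenerate column with a single value: all rows or none.
        return 1.0 if upper > stats.low else 0.0
    fraction = (upper - stats.low) / (stats.high - stats.low)
    return min(1.0, max(0.0, fraction))  # clamp to [0, 1]


def equality_selectivity(stats: ColumnStats) -> float:
    """Estimate the fraction of rows matching `column = constant`."""
    return 1.0 / stats.num_distinct if stats.num_distinct > 0 else 0.0


age = ColumnStats(low=0, high=100, num_distinct=50)
print(range_selectivity(age, 25))   # estimated fraction of rows with age < 25
print(equality_selectivity(age))    # estimated fraction of rows with age = c
```

Real optimizers often refine such estimates with histograms; the sketch only shows how the bounds and distinct-value counts listed above can feed a selectivity estimate.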
The data volume is calculated as follows:

$$\text{dataSize} = \text{selectivity} \times \text{numsOfTableLine} \times \sum_{i=1}^{j} \text{avgColSize}_i$$

where selectivity is the selectivity of the query, numsOfTableLine is the total number of rows of the table or view, $\text{avgColSize}_i$ is the average data size of the i-th column field of the table that needs to be returned, and j is the number of columns of the table.
Compared with the traditional cardinality-based estimation method, this method depends not only on the number of rows produced by intermediate query results but also on the estimated data volume, which improves the accuracy of cost estimation.
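A small worked example with made-up numbers shows why data volume can rank two inputs differently from cardinality. Suppose two tables each have 1,000,000 rows and a selectivity of 0.1, but the query needs a single 4-byte column from the first and two columns averaging 4 and 96 bytes from the second. Taking data volume as selectivity times row count times the summed average sizes of the returned columns:

```latex
\begin{aligned}
\text{rows}_1 = \text{rows}_2 &= 0.1 \times 1{,}000{,}000 = 100{,}000 \\
\text{volume}_1 &= 0.1 \times 1{,}000{,}000 \times 4 = 400{,}000 \ \text{bytes} \\
\text{volume}_2 &= 0.1 \times 1{,}000{,}000 \times (4 + 96) = 10{,}000{,}000 \ \text{bytes}
\end{aligned}
```

A cardinality-based estimate treats the two inputs as equally expensive, whereas the data-volume estimates differ by a factor of 25.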
The advantages of the present invention include:
To address the inaccurate estimates of the traditional cardinality-based cost method, the invention takes characteristics such as columnar file reading into account and adds finer-grained statistics, which effectively improves the accuracy of cost estimation.
Table statistics are stored and maintained by the metadata server, which avoids repeating large amounts of analysis, reduces run-time overhead, and makes cost estimation more efficient.
Brief description of the drawings
Fig. 1 is the overall flowchart of an embodiment of the data-volume-based join-query method of the invention;
Fig. 2 is the query processing architecture used in the current embodiment of the invention;
Fig. 3 is the flowchart for building the metadata server in the current embodiment of the invention;
Fig. 4 is the flowchart for collecting statistics in the current embodiment of the invention;
Fig. 5 is the flowchart for querying statistics in the current embodiment of the invention;
Fig. 6 is the flowchart for estimating data volume in the current embodiment of the invention;
Fig. 7 is the flowchart for generating the join order in the current embodiment of the invention.
Embodiment
The invention proposes a data-volume-based join-query method that performs cost estimation for multi-join queries at query time; the overall flow of the cost estimation method is shown in Fig. 1. First, the metadata server is built. Next, statistics are collected. The metadata server is then queried for the statistics of each table participating in the join, and selectivity, data volume, and related parameters are estimated from those statistics. Finally, the data-volume-based estimation method is used to compute the cost of each execution plan and find the best join order.
To show more intuitively the role the proposed method plays in query optimization, Fig. 2 gives the query processing architecture, which illustrates the relationship between the data-volume-based cost estimation module and the join-order generation module. In the join-order generation module, the optimizer searches for execution plans. The data-volume-based cost estimation module consists mainly of two parts, the cost model and the MetaStore, and performs the cost estimation. A query submitted by a user is parsed and then passed to the join-order optimization method. While searching execution plans, that method calls the cost estimation module to estimate costs, so that the best join order under the given cost model can be found.
The steps of the proposed data-volume-based cost estimation method for multi-join queries are as follows.
Before a join query is run, the metadata server must first be built and the statistics of the tables stored in it.
Choose a relational database and design the table schema to build the metadata server.
So that the data-volume-based cost estimation method can be implemented efficiently, the metadata server is built first. The flow is shown in Fig. 3, and the specific steps are as follows:
According to the actual needs of the enterprise users and the system, choose a suitable relational database (such as MySQL or Derby) as the metadata server of the real-time big data query system;
So that the query system can be given statistics at three granularities (table level, partition level, and column level), design a table schema that satisfies an appropriate normal form and, provided cost estimation can still be completed, avoids unnecessary storage overhead;
Create the metadata database and its tables in the chosen database server according to the designed schema, for use in subsequent steps.
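The build steps above can be sketched as follows. The patent names MySQL, Derby, and Oracle as candidate databases; this sketch substitutes Python's built-in `sqlite3` module purely so that it runs without a server. All table and column names are hypothetical, apart from the average field length, which the patent calls AVG_COL_LEN.

```python
import sqlite3

# Hypothetical schema: one table for table-level statistics and one for
# column-level statistics. Partition-level statistics are omitted for brevity.
SCHEMA = """
CREATE TABLE table_stats (
    table_name TEXT PRIMARY KEY,
    num_rows   INTEGER NOT NULL          -- total rows of the table or view
);
CREATE TABLE column_stats (
    table_name   TEXT NOT NULL REFERENCES table_stats(table_name),
    column_name  TEXT NOT NULL,
    low_value    REAL,                   -- lower bound of values in the column
    high_value   REAL,                   -- upper bound of values in the column
    num_nulls    INTEGER,                -- number of NULL values
    num_distinct INTEGER,                -- number of distinct values
    avg_col_len  REAL,                   -- average field size (AVG_COL_LEN)
    max_col_len  INTEGER,                -- maximum field size
    PRIMARY KEY (table_name, column_name)
);
"""


def build_metadata_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the metadata database and its tables; return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = build_metadata_store()
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Keeping row counts in a separate table-level relation avoids repeating them for every column, which is one way to read the requirement to limit storage overhead.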
According to the designed table schema, analyze each table and store its statistics in the metadata server to complete the collection of statistics.
To optimize the join order of a parsed query, statistics must be collected after the metadata server has been created. The flow is shown in Fig. 4, and the specific steps are as follows:
To reduce the cost of obtaining statistics during cost estimation in join-order optimization, first analyze the tables that are frequently joined, using the appropriate analysis statements or tools;
Collect the statistics of the analyzed tables and store them in the corresponding tables of the metadata server. For data-volume-based cost estimation to work well, column-level statistics such as the average field length AVG_COL_LEN must be collected; the fields for them are provided during table schema design. The statistics comprise: the column name, the lower bound of the values in the column, the upper bound of the values in the column, the number of null values in the column, the number of distinct values in the column, the average data size of the field data in the column, the maximum data size of the field data in the column, and the total number of rows of the table or view.
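A minimal sketch of the collection step for a single column is shown below. The patent leaves the analysis statements or tools unspecified, so the function `collect_column_stats`, the `ColumnStats` type, and the choice to measure field size as the UTF-8 byte length of the value's string form are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ColumnStats:
    column_name: str
    low_value: Optional[object]    # lower bound of non-null values
    high_value: Optional[object]   # upper bound of non-null values
    num_nulls: int
    num_distinct: int
    avg_col_len: float             # average field size in bytes
    max_col_len: int               # maximum field size in bytes


def collect_column_stats(name: str, values: Sequence[object]) -> ColumnStats:
    """Compute column-level statistics for one column of an analyzed table.

    Field size is approximated as the UTF-8 byte length of str(value);
    this is an assumption of the sketch, not the patent's definition.
    """
    non_null = [v for v in values if v is not None]
    sizes = [len(str(v).encode("utf-8")) for v in non_null]
    return ColumnStats(
        column_name=name,
        low_value=min(non_null) if non_null else None,
        high_value=max(non_null) if non_null else None,
        num_nulls=len(values) - len(non_null),
        num_distinct=len(set(non_null)),
        avg_col_len=sum(sizes) / len(sizes) if sizes else 0.0,
        max_col_len=max(sizes) if sizes else 0,
    )


cities = ["Oslo", "Lima", None, "Oslo", "Hangzhou"]
stats = collect_column_stats("city", cities)
print(stats)
```

The total row count of the table is simply `len(values)` here; in a real system it would be stored once per table rather than per column.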
Both the creation of the metadata server (that is, the metadata database) and the collection of statistics are completed offline, before any query is run.
Step 1: submit a query request to the metadata server to obtain the statistics of each table participating in the join.
This step queries and retrieves the statistics. The flow is shown in Fig. 5, and the specific steps are as follows:
To obtain the statistics of each table joined in the query, the query optimization module submits a query request to the metadata server;
The metadata server returns the statistics for each table, completing the retrieval; the statistics are then used to compute the parameters in the next stage.
Because the metadata server is built and the statistics are collected offline, this step incurs little run-time overhead, which substantially reduces the latency of cost estimation.
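Step 1 can be sketched as a lookup against a pre-populated metadata store. The example again uses the built-in `sqlite3` module as a stand-in for the relational databases named in the patent; the schema, the sample figures, and the function `fetch_stats` are hypothetical.

```python
import sqlite3

# Stand-in metadata store, populated offline with made-up statistics.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_stats (table_name TEXT PRIMARY KEY, num_rows INTEGER);
CREATE TABLE column_stats (table_name TEXT, column_name TEXT, avg_col_len REAL);
INSERT INTO table_stats VALUES ('orders', 1000000), ('customers', 50000);
INSERT INTO column_stats VALUES
    ('orders', 'order_id', 8.0), ('orders', 'note', 120.0),
    ('customers', 'customer_id', 8.0), ('customers', 'name', 24.0);
""")


def fetch_stats(conn, table_names):
    """Return {table: {"num_rows": n, "avg_col_len": {column: size}}}."""
    placeholders = ", ".join("?" for _ in table_names)
    result = {}
    # Table-level statistics for every table participating in the join.
    for name, num_rows in conn.execute(
            f"SELECT table_name, num_rows FROM table_stats "
            f"WHERE table_name IN ({placeholders})", list(table_names)):
        result[name] = {"num_rows": num_rows, "avg_col_len": {}}
    # Column-level statistics for the same tables.
    for name, column, size in conn.execute(
            f"SELECT table_name, column_name, avg_col_len FROM column_stats "
            f"WHERE table_name IN ({placeholders})", list(table_names)):
        result[name]["avg_col_len"][column] = size
    return result


stats = fetch_stats(conn, ["orders", "customers"])
print(stats["orders"]["num_rows"], stats["orders"]["avg_col_len"]["note"])
```

At query time the optimizer performs only these reads; the expensive analysis that produced the figures has already happened offline.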
Step 2: estimate, from the statistics obtained, the data volume of all tables in the current execution plan.
An execution plan here means the query executed with a particular table join order.
Before the cost of an execution plan is estimated, the relevant parameters, namely selectivity and data volume, must be estimated. The flow is shown in Fig. 6, and the specific steps are as follows:
Step 2-1: using the statistics obtained in the previous step, first compute the selectivity of each table participating in the join. The query conditions of the join query are evaluated against the statistics to obtain the proportion of rows in the target set that satisfy the conditions.
For any two query conditions in a query, the formula depends on the relationship between them:
When the query must satisfy both condition A and condition B, the selectivity $\text{selectivity}_{(A \text{ and } B)}$ is:

$$\text{selectivity}_{(A \text{ and } B)} = \text{selectivity}_{(A)} \times \text{selectivity}_{(B)} \qquad (1)$$

where $\text{selectivity}_{(A)}$ is the selectivity of condition A alone and $\text{selectivity}_{(B)}$ is the selectivity of condition B alone;
When the query must satisfy condition A or condition B, the selectivity $\text{selectivity}_{(A \text{ or } B)}$ is:

$$\text{selectivity}_{(A \text{ or } B)} = P(A) + P(B) - \text{selectivity}_{(A \text{ and } B)} \qquad (2)$$

where P(A) is the probability that condition A holds and P(B) is the probability that condition B holds;
When the query excludes condition A, the selectivity $\text{selectivity}_{(\text{not } A)}$ is:

$$\text{selectivity}_{(\text{not } A)} = 1 - \text{selectivity}_{(A)} \qquad (3)$$
Any two query conditions A and B can be related by conjunction (both hold) or disjunction (A holds or B holds), and a condition may also be negated (not A). When a query contains several conditions with several such relationships among them, the conditions are combined pairwise using the formulas above, according to the relationship each pair satisfies, to obtain the final selectivity.
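The pairwise combination described above can be sketched in a few lines. The function names are hypothetical, and the sketch takes P(A) and P(B) in formula (2) to be the selectivities of A and B, which is an interpretation rather than something the text states outright. Formula (1) multiplies the two selectivities, which is exact only when the conditions are statistically independent.

```python
def sel_and(sel_a: float, sel_b: float) -> float:
    """Formula (1): both conditions hold (treats A and B as independent)."""
    return sel_a * sel_b


def sel_or(sel_a: float, sel_b: float) -> float:
    """Formula (2): A or B holds, with P(A), P(B) taken as the selectivities."""
    return sel_a + sel_b - sel_and(sel_a, sel_b)


def sel_not(sel_a: float) -> float:
    """Formula (3): condition A is excluded."""
    return 1.0 - sel_a


# (A AND B) OR (NOT C), combined pairwise from the inside out.
sel_a, sel_b, sel_c = 0.5, 0.25, 0.75
combined = sel_or(sel_and(sel_a, sel_b), sel_not(sel_c))
print(combined)
```

Nested conditions are handled by evaluating the innermost pair first and feeding its result into the next combination, as the last lines show.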
Step 2-2: calculate the data volume of each table from the selectivity obtained in step 2-1, using the following formula:

$$\text{dataSize} = \text{selectivity} \times \text{numsOfTableLine} \times \sum_{i=1}^{j} \text{avgColSize}_i \qquad (4)$$

where selectivity is the selectivity computed in step 2-1, numsOfTableLine is the total number of rows of the table or view, $\text{avgColSize}_i$ is the average data size of the i-th column field of the table that needs to be returned, and j is the number of columns of the table.
The data volume of each table computed by formula (4) is fed into the cost model to estimate the cost of the multi-join query, giving the cost of each execution plan. Compared with the traditional cardinality-based estimation method, this method depends not only on the number of rows produced by intermediate query results but also on the estimated data volume, which improves the accuracy of cost estimation.
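The data-volume calculation of step 2-2 can be sketched as a single function, assuming it is the product of the selectivity, the total row count, and the summed average sizes of the returned columns, as the symbol definitions indicate. The function name and the sample figures are hypothetical.

```python
from typing import Sequence


def estimate_data_volume(selectivity: float,
                         num_table_rows: int,
                         avg_col_sizes: Sequence[float]) -> float:
    """Estimated bytes returned: selectivity * rows * sum of avg column sizes."""
    if not 0.0 <= selectivity <= 1.0:
        raise ValueError("selectivity must be between 0 and 1")
    return selectivity * num_table_rows * sum(avg_col_sizes)


# Same table and predicate; only the set of returned columns differs.
narrow = estimate_data_volume(0.5, 1_000_000, [8.0])
wide = estimate_data_volume(0.5, 1_000_000, [8.0, 120.0])
print(narrow, wide)
```

Both calls produce the same row estimate (500,000 rows), so a cardinality-based model could not tell them apart, while the data-volume estimate reflects that a columnar reader fetches far fewer bytes in the first case.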
Step 3: repeat steps 1 and 2 until the search space of execution plans has been traversed, then find the table join order with the smallest data volume and perform the join.
To find the best join order, the data-volume-based cost estimation method proposed by the invention is used during the execution plan search. The flow is shown in Fig. 7, and the specific steps are as follows:
Search the space of execution plans (that is, repeat steps 1 and 2) using the chosen join-order optimization method, which also takes the characteristics of the real-time query system into account and adds pruning techniques to speed up the search and reduce the latency introduced by the algorithm itself;
Obtain the estimated data volume of each execution plan from step 2, find the execution plan that is best under the given cost model, and store it;
Generate the best join order from the optimal execution plan found in the steps above. Because the cost estimation method proposed by the invention is used, the accuracy of cost estimation is effectively improved.
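The search in step 3 can be illustrated with a deliberately simple stand-in. The patent specifies neither the cost model nor the search algorithm in detail, so this sketch makes several assumptions: plans are left-deep, the cost of a plan is the sum of the data volumes of its intermediate results, the search is an exhaustive enumeration without the pruning the text mentions, and all table figures and join selectivities are made up.

```python
from itertools import permutations

# Hypothetical per-table estimates after filtering: (rows, bytes per row).
tables = {
    "orders": (100_000, 16.0),
    "customers": (5_000, 32.0),
    "regions": (10, 24.0),
}
# Hypothetical join selectivities; pairs without a join predicate default to 1.0.
join_sel = {
    frozenset(("orders", "customers")): 1 / 5_000,
    frozenset(("customers", "regions")): 1 / 10,
}


def plan_cost(order):
    """Sum of the data volumes of all intermediate results of a left-deep plan."""
    joined = [order[0]]
    rows, width = tables[order[0]]
    cost = 0.0
    for name in order[1:]:
        next_rows, next_width = tables[name]
        sel = 1.0
        for left in joined:
            sel *= join_sel.get(frozenset((left, name)), 1.0)
        rows = rows * next_rows * sel
        width += next_width
        cost += rows * width      # data volume of this intermediate result
        joined.append(name)
    return cost


def best_join_order(names):
    """Exhaustively traverse the search space and keep the cheapest order."""
    return min(permutations(names), key=plan_cost)


best = best_join_order(list(tables))
print(best, plan_cost(best))
```

With these figures the cheapest plans join `customers` and `regions` first and bring in `orders` last, because that keeps the first intermediate result small in bytes as well as in rows. Exhaustive enumeration grows factorially with the number of tables, which is why a practical optimizer would use dynamic programming or pruning instead.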
Claims (9)
1. A join-query method based on data volume, characterized by comprising:
Step 1: submitting a query request to a metadata server and obtaining the statistics for each table participating in the join;
Step 2: estimating, from the statistics obtained, the data volume of all tables in the current execution plan;
Step 3: repeating steps 1 and 2 until the search space of execution plans has been traversed and the execution plan whose data volume minimizes the query cost has been found, then joining the tables in the join order given by that execution plan.
2. The join-query method based on data volume according to claim 1, characterized in that the metadata server is built by choosing a relational database, designing a column-level table schema, and creating the metadata database and its tables in that relational database according to the designed schema.
3. The join-query method based on data volume according to claim 1, characterized in that the statistics stored in the metadata server are kept per table and are obtained by analyzing each table according to the designed table schema.
4. The join-query method based on data volume according to claim 1, characterized in that the relational database is a MySQL database, a Derby database, or an Oracle database.
5. The join-query method based on data volume according to claim 1, characterized in that the statistics comprise: the column name, the lower bound of the values in the column, the upper bound of the values in the column, the number of null values in the column, the number of distinct values in the column, the average data size of the field data in the column, the maximum data size of the field data in the column, and the total number of rows of the table or view.
6. The join-query method based on data volume according to claim 1, characterized in that both the construction of the metadata server and the storage of the statistics in it are completed offline.
7. The join-query method based on data volume according to claim 1, characterized in that, in step 2, the data volume of each table is calculated from the table's selectivity, the average data size of its fields, and the table's total row count.
8. The join-query method based on data volume according to claim 7, characterized in that the selectivity is estimated by evaluating the query conditions against the statistics to obtain the proportion of rows in the target set that satisfy the query conditions.
9. The join-query method based on data volume according to claim 8, characterized in that the data volume of each table is calculated as follows:

$$\text{dataSize} = \text{selectivity} \times \text{numsOfTableLine} \times \sum_{i=1}^{j} \text{avgColSize}_i$$

where selectivity is the selectivity of the query, numsOfTableLine is the total number of rows of the table or view, $\text{avgColSize}_i$ is the average data size of the i-th column field of the table that needs to be returned, and j is the number of columns of the table.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410124531.1A CN103927346B (en) | 2014-03-28 | 2014-03-28 | Query connection method on basis of data volumes |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410124531.1A CN103927346B (en) | 2014-03-28 | 2014-03-28 | Query connection method on basis of data volumes |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103927346A true CN103927346A (en) | 2014-07-16 |
CN103927346B CN103927346B (en) | 2017-02-15 |
Family
ID=51145567
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410124531.1A Expired - Fee Related CN103927346B (en) | 2014-03-28 | 2014-03-28 | Query connection method on basis of data volumes |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103927346B (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106250567A (en) * | 2016-08-31 | 2016-12-21 | 天津南大通用数据技术股份有限公司 | In distributed data base system, table connects system of selection and the device of data distribution mode |
CN106446170A (en) * | 2016-09-27 | 2017-02-22 | 努比亚技术有限公司 | Data querying method and device |
CN107193813A (en) * | 2016-03-14 | 2017-09-22 | 阿里巴巴集团控股有限公司 | Tables of data connected mode processing method and processing device |
CN108268536A (en) * | 2016-12-30 | 2018-07-10 | 北京国双科技有限公司 | Database aggregation processing method and device |
CN108491516A (en) * | 2018-03-26 | 2018-09-04 | 哈工大大数据(哈尔滨)智能科技有限公司 | Distributed multi-table join selection method based on mixed integer linear programming and device |
CN111625557A (en) * | 2020-04-07 | 2020-09-04 | 上海熙菱信息技术有限公司 | Method for rapidly estimating results of billion-level data volume multi-condition |
CN112395372A (en) * | 2020-12-10 | 2021-02-23 | 四川长虹电器股份有限公司 | Quick statistical method based on two-dimensional table of relational database system |
CN112905591A (en) * | 2021-02-04 | 2021-06-04 | 成都信息工程大学 | Data table connection sequence selection method based on machine learning |
CN113010547A (en) * | 2021-05-06 | 2021-06-22 | 电子科技大学 | Database query optimization method and system based on graph neural network |
CN113656437A (en) * | 2021-07-02 | 2021-11-16 | 阿里巴巴新加坡控股有限公司 | Method and device for determining optimal query plan |
CN114090695A (en) * | 2022-01-24 | 2022-02-25 | 北京奥星贝斯科技有限公司 | Query optimization method and device for distributed database |
CN114461677A (en) * | 2022-04-12 | 2022-05-10 | 天津南大通用数据技术股份有限公司 | Method for transmitting and adjusting connection sequence based on selection degree |
CN117056361A (en) * | 2023-07-03 | 2023-11-14 | 杭州拓数派科技发展有限公司 | Data query method and device for distributed database |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060106777A1 (en) * | 2004-11-18 | 2006-05-18 | International Business Machines Corporation | Method and apparatus for predicting selectivity of database query join conditions using hypothetical query predicates having skewed value constants |
CN101739451A (en) * | 2009-12-03 | 2010-06-16 | 南京航空航天大学 | Joint query adaptive processing method for grid database |
CN102929996A (en) * | 2012-10-24 | 2013-02-13 | 华南理工大学 | XPath query optimization method and system |
CN103164495A (en) * | 2011-12-19 | 2013-06-19 | 中国人民解放军63928部队 | Half-connection inquiry optimizing method based on periphery searching and system thereof |
Non-Patent Citations (2)
Title |
---|
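周强等 (Zhou Qiang et al.): "基于改进DPhyp算法的Impala查询优化" (Impala query optimization based on an improved DPhyp algorithm), 《计算机研究与发展》 (Journal of Computer Research and Development) * |
孟凡辉 (Meng Fanhui): "数据库基于值的查询优化的研究与实践" (Research and practice of value-based database query optimization), 《中国优秀硕士学位论文全文数据库 信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology) * |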
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107193813A (en) * | 2016-03-14 | 2017-09-22 | 阿里巴巴集团控股有限公司 | Tables of data connected mode processing method and processing device |
US11650990B2 (en) | 2016-03-14 | 2023-05-16 | Alibaba Group Holding Limited | Method, medium, and system for joining data tables |
CN106250567A (en) * | 2016-08-31 | 2016-12-21 | 天津南大通用数据技术股份有限公司 | In distributed data base system, table connects system of selection and the device of data distribution mode |
CN106446170A (en) * | 2016-09-27 | 2017-02-22 | 努比亚技术有限公司 | Data querying method and device |
CN108268536A (en) * | 2016-12-30 | 2018-07-10 | 北京国双科技有限公司 | Database aggregation processing method and device |
CN108491516A (en) * | 2018-03-26 | 2018-09-04 | 哈工大大数据(哈尔滨)智能科技有限公司 | Distributed multi-table join selection method based on mixed integer linear programming and device |
CN108491516B (en) * | 2018-03-26 | 2021-09-14 | 哈工大大数据(哈尔滨)智能科技有限公司 | Distributed multi-table connection selection method and device based on mixed integer linear programming |
CN111625557B (en) * | 2020-04-07 | 2023-04-14 | 上海熙菱信息技术有限公司 | Method for quickly estimating result of multi-condition billion-level data volume |
CN111625557A (en) * | 2020-04-07 | 2020-09-04 | 上海熙菱信息技术有限公司 | Method for rapidly estimating results of billion-level data volume multi-condition |
CN112395372A (en) * | 2020-12-10 | 2021-02-23 | 四川长虹电器股份有限公司 | Quick statistical method based on two-dimensional table of relational database system |
CN112905591A (en) * | 2021-02-04 | 2021-06-04 | 成都信息工程大学 | Data table connection sequence selection method based on machine learning |
CN113010547A (en) * | 2021-05-06 | 2021-06-22 | 电子科技大学 | Database query optimization method and system based on graph neural network |
CN113656437A (en) * | 2021-07-02 | 2021-11-16 | 阿里巴巴新加坡控股有限公司 | Method and device for determining optimal query plan |
CN113656437B (en) * | 2021-07-02 | 2023-10-03 | 阿里巴巴新加坡控股有限公司 | Model construction method for predicting execution cost stability of reference |
CN114090695A (en) * | 2022-01-24 | 2022-02-25 | 北京奥星贝斯科技有限公司 | Query optimization method and device for distributed database |
CN114461677A (en) * | 2022-04-12 | 2022-05-10 | 天津南大通用数据技术股份有限公司 | Method for transmitting and adjusting connection sequence based on selection degree |
CN114461677B (en) * | 2022-04-12 | 2022-07-26 | 天津南大通用数据技术股份有限公司 | Method for transmitting and adjusting connection sequence based on selection degree |
CN117056361A (en) * | 2023-07-03 | 2023-11-14 | 杭州拓数派科技发展有限公司 | Data query method and device for distributed database |
Also Published As
Publication number | Publication date |
---|---|
CN103927346B (en) | 2017-02-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103927346A (en) | Query connection method on basis of data volumes | |
US10216793B2 (en) | Optimization of continuous queries in hybrid database and stream processing systems | |
Zhao et al. | Modeling MongoDB with relational model | |
CN110837585B (en) | Multi-source heterogeneous data association query method and system | |
CN102722531B (en) | Query method based on regional bitmap indexes in cloud environment | |
CN103176974A (en) | Method and device used for optimizing access path in data base | |
JPH07319923A (en) | Method and equipment for processing of parallel database of multiprocessor computer system | |
CN104834754A (en) | SPARQL semantic data query optimization method based on connection cost | |
US20110022581A1 (en) | Derived statistics for query optimization | |
CN112328578B (en) | Database query optimization method based on reinforcement learning and graph attention network | |
CN105630881A (en) | Data storage method and query method for RDF (Resource Description Framework) | |
CN107870949B (en) | Data analysis job dependency relationship generation method and system | |
CN108052635A (en) | A kind of heterogeneous data source unifies conjunctive query method | |
CN104137095A (en) | System for evolutionary analytics | |
CN103019728A (en) | Effective complex report parsing engine and parsing method thereof | |
US10726006B2 (en) | Query optimization using propagated data distinctness | |
Simitsis | Modeling and managing ETL processes. | |
CN114691786A (en) | Method and device for determining data blood relationship, storage medium and electronic device | |
CN103793467A (en) | Method for optimizing real-time query on big data on basis of hyper-graphs and dynamic programming | |
CN103678589A (en) | Database kernel query optimization method based on equivalence class | |
US9406027B2 (en) | Making predictions regarding evaluation of functions for a database environment | |
CN104268298A (en) | Method for creating database index and inquiring data | |
CN110795835A (en) | Three-dimensional process model reverse generation method based on automatic synchronous modeling | |
CN111814458A (en) | Rule engine system optimization method and device, computer equipment and storage medium | |
CN110750560A (en) | System and method for optimizing network multi-connection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170215; Termination date: 20200328 |