WO2019019573A1 - A novel OLAP precomputation model and method for generating precomputed results - Google Patents

A novel OLAP precomputation model and method for generating precomputed results

Info

Publication number
WO2019019573A1
WO2019019573A1 PCT/CN2018/073318 CN2018073318W WO2019019573A1 WO 2019019573 A1 WO2019019573 A1 WO 2019019573A1 CN 2018073318 W CN2018073318 W CN 2018073318W WO 2019019573 A1 WO2019019573 A1 WO 2019019573A1
Authority
WO
WIPO (PCT)
Prior art keywords
query
computed
combination
dimension combination
query statement
Prior art date
Application number
PCT/CN2018/073318
Other languages
English (en)
French (fr)
Inventor
施继成
李扬
韩卿
Original Assignee
上海跬智信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海跬智信息技术有限公司
Priority to EP18837795.6A priority Critical patent/EP3709127A4/en
Priority to US15/769,416 priority patent/US20200097483A1/en
Publication of WO2019019573A1 publication Critical patent/WO2019019573A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/244 Grouping and aggregation

Definitions

  • The invention belongs to the field of OLAP precomputation, and in particular relates to a novel OLAP precomputation model and a method for generating precomputed results.
  • Data collection in the big data era is append-only: historical data is usually not corrected, and only new data is continuously added.
  • The OLTP (online transaction processing) approach adopted by traditional data warehouses is not suitable for this scenario, especially when the amount of data is extremely large.
  • The root cause is that OLTP is designed to handle complex application scenarios, including inserts, deletes, updates, queries, and transactions; to implement these functions OLTP makes concessions on query performance and performs poorly when querying large-scale data.
  • Compared with OLTP, OLAP is better suited to modern big data usage scenarios.
  • OLAP provides a precomputation-based solution for improving the efficiency of multidimensional analysis: a "data cube" pre-aggregates the data in the data warehouse by different dimension combinations and saves the results. When an analyst runs an actual business query, the data does not need to be aggregated again; the precomputed results are read directly, which makes it feasible to analyze millions or even hundreds of millions of records.
  • Some technologies use large-scale computer clusters to perform multidimensional precomputation over large data sets. They usually translate business query semantics into retrievals against the data cube, which gives higher performance than querying the raw data.
  • Business scenarios are often complex and involve many dimensions. Precomputing and saving the data cubes of all dimension combinations consumes large amounts of computing resources and storage space. Because the number of dimension combinations grows exponentially with the number of dimensions, precomputing every data cube is an impossible task in some extremely complex scenarios. Moreover, in a traditional OLAP data cube each dimension combination is precomputed into only one fixed result, namely the result for one particular dimension order; actual query scenarios vary and the order of dimensions has a significant effect on query efficiency, so a single dimension order cannot meet the demand.
  • The technical problem to be solved by the present invention is that existing precomputation models precompute according to only one dimension combination and cannot satisfy users' varied query scenarios.
  • The present invention provides a novel OLAP precomputation model, which includes: a query statement statistical analyzer, a dynamic dimension combination generator, and a precomputed result usage monitor.
  • The query statement statistical analyzer is configured to receive an input query statement and perform statistical analysis on it, and to determine, according to the statistical analysis result, whether a precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations.
  • The dynamic dimension combination generator is configured to, when no matching precomputed dimension combination exists, generate an optimal dimension combination corresponding to the query statement and an optimal combination ordering corresponding to that combination, generate from them a precomputed dimension combination that matches the query statement, and store the matching precomputed dimension combination.
  • The query statement statistical analyzer is further configured to perform a precomputed query according to the matching precomputed dimension combination to obtain the desired query result.
  • The precomputed result usage monitor is configured to monitor all precomputed dimension combinations generated by the dynamic dimension combination generator and to determine how many times the precomputed result corresponding to each combination is used within a preset time period; if the number of uses is below a preset threshold, the precomputed result corresponding to that combination is deleted.
  • Beneficial effects of the model: it continuously collects the user's query statements, analyzes the optimal dimension combination each statement expects, and dynamically generates the precomputed result for that combination, which improves the efficiency of subsequent identical or similar queries. As the number of user queries increases, the precomputed results fit the query demand more closely and query efficiency rises. This addresses the problem of OLAP precomputation consuming too many computing and storage resources; arranging the generated dimension combinations in the optimal order further improves query efficiency and satisfies users' varied query scenarios.
  • Further, the query statement statistical analyzer is configured to, when no precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations, select a sub-optimal precomputed dimension combination from them, perform a precomputed query according to the sub-optimal combination to obtain a sub-optimal query result, and aggregate the sub-optimal query result to obtain the desired query result.
  • Further, the query statement statistical analyzer is specifically configured to receive the input query statement and perform statistical analysis on the data tables, dimensions, measures, and filter conditions used in the query statement.
  • Further, the query statement statistical analyzer is configured to, when neither a matching precomputed dimension combination nor a sub-optimal precomputed dimension combination exists, read the data corresponding to the query statement directly from the source data; the data read from the source data is aggregated and filtered to obtain the desired query result.
  • The invention also relates to a novel method for generating precomputed results, which uses the above OLAP precomputation model and comprises: receiving an input query statement and performing statistical analysis on it; determining, according to the statistical analysis result, whether a matching precomputed dimension combination exists among the pre-stored precomputed dimension combinations; when none exists, generating the optimal dimension combination and the optimal combination ordering, generating from them a precomputed dimension combination that matches the query statement, and storing it; performing a precomputed query according to the matching combination to obtain the desired query result; and monitoring all generated precomputed dimension combinations and deleting the precomputed results whose number of uses within a preset time period is below a preset threshold.
  • Beneficial effects of the method: it continuously collects the user's query statements, analyzes the optimal dimension combination each statement expects, and dynamically generates the precomputed result for that combination, which improves the efficiency of subsequent identical or similar queries. As the number of user queries increases, the precomputed results fit the query demand more closely and query efficiency rises. This addresses the problem of OLAP precomputation consuming too many computing and storage resources; arranging the generated dimension combinations in the optimal order further improves query efficiency and satisfies users' varied query scenarios.
  • Further, determining whether a matching precomputed dimension combination exists also includes: when none exists, selecting a sub-optimal precomputed dimension combination from the pre-stored precomputed dimension combinations, performing a precomputed query according to it to obtain a sub-optimal query result, and aggregating that result to obtain the desired query result.
  • Further, receiving the input query statement and performing statistical analysis on it includes: receiving the input query statement and statistically analyzing the data tables, dimensions, measures, and filter conditions used in the query statement.
  • Further, determining whether a matching precomputed dimension combination exists also includes: when neither a matching nor a sub-optimal precomputed dimension combination exists, reading the data corresponding to the query statement directly from the source data, and aggregating and filtering that data to obtain the desired query result.
  • FIG. 1 is a schematic structural diagram of the novel OLAP precomputation model of the present invention.
  • FIG. 2 is a flow chart of the novel method for generating precomputed results of the present invention.
  • As shown in FIG. 1, Embodiment 1 of the present invention provides the novel OLAP precomputation model described above, comprising the query statement statistical analyzer, the dynamic dimension combination generator, and the precomputed result usage monitor, each configured as set out above.
  • In the query statement "SELECT SUM(PRICE), YEAR FROM SALES_TABLE WHERE LOCATION = 'Shanghai' GROUP BY YEAR", YEAR and LOCATION are the required dimensions, and the combination containing only these two dimensions is the optimal combination.
  • YEAR is the column in GROUP BY and LOCATION is the column in the WHERE filter condition; in general, query dimensions are extracted from these two places.
  • When no matching combination exists, an optimal dimension combination corresponding to the query statement and an optimal combination ordering corresponding to that combination are generated; a precomputed dimension combination matching the query statement is generated from them and stored.
  • The columns YEAR and LOCATION are the required dimensions, and the optimal dimension combination contains only these two columns; the sub-optimal combinations may be the column YEAR, the column LOCATION, or the column PRICE.
  • If no precomputed result satisfies the query, the data of the three columns PRICE, YEAR, and LOCATION must be read from the source data, then aggregated and filtered to obtain the final result.
  • Such a query is sent to the "dynamic dimension combination generator" to generate precomputed results that speed up identical or similar query statements.
  • The precomputed result usage monitor is used because user queries keep changing: the queries of interest this month differ from last month's, so previously generated precomputed results may lose their usefulness after a period of time, that is, they are no longer accessed by users.
  • The monitor obtains the relevant information from the query pattern statistical analyzer, identifies precomputed results that have not been accessed for a long time, and clears them or moves them to other storage devices, which frees storage space and reduces storage pressure.
  • In Embodiment 2, the query statement statistical analyzer is further configured to, when no precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations, select a sub-optimal precomputed dimension combination from them.
  • The columns YEAR and LOCATION are the required dimensions, and the optimal dimension combination contains only these two columns. If a dimension combination exists that contains YEAR and LOCATION but is not limited to them, for example the combination YEAR, LOCATION, CATEGORY, that combination is a usable dimension combination for the query.
  • Its precomputed result can still be used to speed up the query, but some simple processing of the precomputed result, such as online aggregation, is needed to obtain the desired query result.
  • The dynamic dimension combination generator receives from the query statement statistical analyzer the dimension combinations that need to be generated dynamically, and then generates new precomputed results for the dimension combination that matches the query, based on already completed precomputed results or on the source data. The generator must not only select the required dimensions but also attend to their order, that is, the storage layout of the final precomputed results.
  • For example, suppose the required dimensions are A, B, and C. Because the query uses C as the query condition, placing C first in the order benefits the query, and the generated dimension order is CAB. In storage, the results look like those shown in Table 1.
  • Once the order of the dimension combination is taken into account, filter conditions on dimension C are more efficient: the data read is more concentrated, and efficiency is higher.
  • In Embodiment 3, the query statement statistical analyzer is specifically configured to receive an input query statement and perform statistical analysis on the data tables, dimensions, measures, and filter conditions used in the query statement.
  • The query statement statistical analyzer collects every query statement of the user and analyzes the following: the data tables used, the dimensions used, the measures used, the filter conditions used, the number of occurrences and probability of identical queries, and the expected optimal dimension combination (including its order).
  • In Embodiment 4, the query statement statistical analyzer is further configured to, when neither a matching precomputed dimension combination nor a sub-optimal precomputed dimension combination exists, read the data corresponding to the query statement directly from the source data; the data read from the source data is aggregated and filtered to obtain the desired query result.
  • The columns YEAR and LOCATION are the required dimensions, and the optimal dimension combination contains only these two columns.
  • Such queries are sent to the "dynamic dimension combination generator" to generate precomputed results that speed up identical or similar queries.
  • In Embodiment 5, the method uses the OLAP precomputation model of Embodiments 1-4.
  • S5: monitor all precomputed dimension combinations generated by the dynamic dimension combination generator and determine how many times the precomputed result corresponding to each combination is used within a preset time period; if the number of uses is below a preset threshold, delete the precomputed result corresponding to that combination.
  • In Embodiment 5, the user's query statements are continuously collected and statistically analyzed to determine whether a precomputed dimension combination matching the query statement exists among the previously stored precomputed dimension combinations. If none exists, the earlier precomputed results contain no matching precomputed dimension combination.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

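A novel OLAP precomputation model and a method for generating precomputed results. The OLAP precomputation model comprises a query statement statistical analyzer, a precomputed result generator, and a precomputed result usage monitor. The method comprises: performing statistical analysis on a query statement; determining, according to the statistical analysis result, whether a matching precomputed dimension combination exists, and if not, generating a matching precomputed dimension combination; and finally obtaining the desired query result according to the matching precomputed dimension combination or, when there is no matching dimension combination, querying the result directly from the source data. The method analyzes the optimal dimension combination expected by each statement and dynamically generates the precomputed result for that combination, which improves the efficiency of subsequent identical or similar queries; as the number of user queries grows, the precomputed results fit the query demand more closely and query efficiency rises accordingly.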

Description

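A novel OLAP precomputation model and method for generating precomputed results
Technical Field
The present invention belongs to the field of OLAP precomputation, and in particular relates to a novel OLAP precomputation model and a method for generating precomputed results.
Background
In the Internet era of data explosion, the scale of collected data keeps growing and its classification becomes ever finer. How to use this data effectively, mine its latent patterns, and ultimately offer forward-looking guidance has become a problem in urgent need of a solution.
Data collection in the big data era is append-only: historical data is usually not corrected, and new data is simply added continuously. The OLTP (online transaction processing) approach used by traditional data warehouses is ill-suited to this scenario, especially when the data volume is extremely large. The root cause is that OLTP is designed to handle complex application scenarios, including inserts, deletes, updates, queries, and transactions; to support these functions OLTP compromises on query performance, and it performs poorly when querying large-scale data.
Compared with OLTP, OLAP is better suited to modern big data usage scenarios. OLAP offers a precomputation-based solution for improving the efficiency of multidimensional analysis: a "data cube" pre-aggregates the data in the data warehouse by different dimension combinations and saves the results. When an analyst runs an actual business query, the data does not need to be aggregated again; the precomputed results are read directly, which makes analysis over millions or even hundreds of millions of records feasible. Technologies that use large-scale computer clusters to perform multidimensional precomputation over large data sets have emerged accordingly. These technologies usually translate business query semantics into retrievals against the data cube, and thereby achieve higher performance than querying the raw data.
In practice, however, business scenarios are often complex and involve many dimensions. Precomputing and saving the data cubes of all dimension combinations would consume large amounts of computing resources and storage space. Because the number of dimension combinations grows exponentially with the number of dimensions, precomputing every data cube is an impossible task in some extremely complex scenarios. Moreover, in a traditional OLAP data cube, each dimension combination is precomputed into only one fixed result, namely the precomputed result for one particular dimension order. Actual query scenarios vary, and the order of dimensions has a significant effect on query efficiency, so a single dimension order cannot meet the demand.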
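The growth described above can be made concrete with a small sketch. The function names are hypothetical and the dimension names are borrowed from the example query used later in the description; the code only enumerates non-empty dimension combinations and the orderings of one combination, which is an illustration of the counting argument rather than anything the patent itself specifies.

```python
from itertools import combinations, permutations


def dimension_combinations(dimensions):
    """Return every non-empty subset of the given dimensions."""
    subsets = []
    for size in range(1, len(dimensions) + 1):
        subsets.extend(combinations(dimensions, size))
    return subsets


def ordered_layouts(combination):
    """Return every possible ordering of one dimension combination."""
    return list(permutations(combination))


dims = ["YEAR", "LOCATION", "CATEGORY"]
print(len(dimension_combinations(dims)))  # 2**n - 1 combinations for n dimensions
print(len(ordered_layouts(dims)))         # n! orderings of the full combination
```

With n dimensions there are 2**n - 1 non-empty combinations, and a combination of k dimensions can be laid out in k! orders, which is why materializing everything in advance quickly becomes impractical.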
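Summary of the Invention
The technical problem to be solved by the present invention is that existing precomputation models precompute according to only one dimension combination and cannot satisfy users' varied query scenarios.
To solve the above technical problem, the present invention provides a novel OLAP precomputation model, comprising: a query statement statistical analyzer, a dynamic dimension combination generator, and a precomputed result usage monitor;
the query statement statistical analyzer is configured to receive an input query statement and perform statistical analysis on the query statement, and to determine, according to the statistical analysis result, whether a precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations;
the dynamic dimension combination generator is configured to, when no matching precomputed dimension combination exists, generate an optimal dimension combination corresponding to the query statement and an optimal combination ordering corresponding to the optimal dimension combination, generate a precomputed dimension combination matching the query statement according to the optimal dimension combination and the optimal combination ordering, and store the matching precomputed dimension combination;
the query statement statistical analyzer is further configured to perform a precomputed query according to the matching precomputed dimension combination to obtain the desired query result;
the precomputed result usage monitor is configured to monitor all precomputed dimension combinations generated by the dynamic dimension combination generator, determine the number of times the precomputed result corresponding to each precomputed dimension combination is used within a preset time period, and, if the number of uses is below a preset threshold, delete the precomputed result corresponding to the precomputed dimension combination that is below the preset threshold.
Beneficial effects of the present invention: the above model continuously collects the user's query statements, analyzes the optimal dimension combination each statement expects, and dynamically generates the precomputed result for that combination, which improves the efficiency of subsequent identical or similar queries. As the number of user queries grows, the precomputed results fit the query demand more closely and query efficiency rises. This addresses the problem of OLAP precomputation consuming too many computing and storage resources; the generated dimension combinations are also arranged in the optimal order, which improves query efficiency and satisfies users' varied query scenarios.
Further, the query statement statistical analyzer is also configured to, when no precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations, select a sub-optimal precomputed dimension combination from the pre-stored precomputed dimension combinations;
perform a precomputed query according to the sub-optimal precomputed dimension combination and the query statement to obtain a sub-optimal query result;
and aggregate the sub-optimal query result to obtain the desired query result.
Further, the query statement statistical analyzer is specifically configured to receive the input query statement and perform statistical analysis on the data tables, dimensions, measures, and filter conditions used in the query statement.
Further, the query statement statistical analyzer is also configured to, when neither a matching precomputed dimension combination nor a sub-optimal precomputed dimension combination exists in the query statement statistical analyzer, read the data corresponding to the query statement directly from the source data;
and aggregate and filter the data read from the source data to obtain the desired query result.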
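The present invention also relates to a novel method for generating precomputed results, which uses the above OLAP precomputation model and comprises:
receiving an input query statement and performing statistical analysis on the query statement;
determining, according to the statistical analysis result, whether a precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations;
when no matching precomputed dimension combination exists, generating the optimal dimension combination corresponding to the query statement and the optimal combination ordering corresponding to the optimal dimension combination, generating a precomputed dimension combination matching the query statement according to the optimal dimension combination and the optimal combination ordering, and storing the matching precomputed dimension combination;
performing a precomputed query according to the matching precomputed dimension combination to obtain the desired query result;
monitoring all generated precomputed dimension combinations, determining the number of times the precomputed result corresponding to each precomputed dimension combination is used within a preset time period, and, if the number of uses is below a preset threshold, deleting the precomputed result corresponding to the precomputed dimension combination that is below the preset threshold.
Beneficial effects of the present invention: the above method continuously collects the user's query statements, analyzes the optimal dimension combination each statement expects, and dynamically generates the precomputed result for that combination, which improves the efficiency of subsequent identical or similar queries. As the number of user queries grows, the precomputed results fit the query demand more closely and query efficiency rises. This addresses the problem of OLAP precomputation consuming too many computing and storage resources; the generated dimension combinations are also arranged in the optimal order, which improves query efficiency and satisfies users' varied query scenarios.
Further, determining, according to the statistical analysis result, whether a precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations also comprises:
when no precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations, selecting a sub-optimal precomputed dimension combination from the pre-stored precomputed dimension combinations;
performing a precomputed query according to the sub-optimal precomputed dimension combination to obtain a sub-optimal query result;
aggregating the sub-optimal query result to obtain the desired query result.
Further, receiving the input query statement and performing statistical analysis on the query statement comprises:
receiving the input query statement and performing statistical analysis on the data tables, dimensions, measures, and filter conditions used in the query statement.
Further, determining, according to the statistical analysis result, whether a precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations also comprises:
when no precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations and no sub-optimal precomputed dimension combination exists either, reading the data corresponding to the query statement directly from the source data;
aggregating and filtering the data read from the source data to obtain the desired query result.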
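Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of the novel OLAP precomputation model of the present invention;
FIG. 2 is a flow chart of the novel method for generating precomputed results of the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings. The examples given serve only to explain the present invention and are not intended to limit its scope.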
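As shown in FIG. 1, Embodiment 1 of the present invention provides a novel OLAP precomputation model, comprising: a query statement statistical analyzer, a dynamic dimension combination generator, and a precomputed result usage monitor;
the query statement statistical analyzer is configured to receive an input query statement and perform statistical analysis on the query statement, and to determine, according to the statistical analysis result, whether a precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations;
the dynamic dimension combination generator is configured to, when no matching precomputed dimension combination exists, generate an optimal dimension combination corresponding to the query statement and an optimal combination ordering corresponding to the optimal dimension combination, generate a precomputed dimension combination matching the query statement according to the optimal dimension combination and the optimal combination ordering, and store the matching precomputed dimension combination;
the query statement statistical analyzer is further configured to perform a precomputed query according to the matching precomputed dimension combination to obtain the desired query result;
the precomputed result usage monitor is configured to monitor all precomputed dimension combinations generated by the dynamic dimension combination generator, determine the number of times the precomputed result corresponding to each precomputed dimension combination is used within a preset time period, and, if the number of uses is below a preset threshold, delete the precomputed result corresponding to the precomputed dimension combination that is below the preset threshold.
It should be noted that in Embodiment 1 the user's query statements are continuously collected and statistically analyzed to determine whether a precomputed dimension combination matching the query statement exists among the previously stored precomputed dimension combinations. If none exists, the earlier precomputed results contain no matching precomputed dimension combination. For example, in the query statement "SELECT SUM(PRICE), YEAR FROM SALES_TABLE WHERE LOCATION = 'Shanghai' GROUP BY YEAR", YEAR and LOCATION are the required dimensions, and the combination containing only these two dimensions is the optimal combination. YEAR is the column in GROUP BY and LOCATION is the column in the WHERE filter condition; in general, query dimensions are extracted by analyzing these two places.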
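The extraction of dimensions from GROUP BY and WHERE can be sketched as follows. This is a minimal illustration for simple single-table aggregate queries such as the sample statement; `analyze_query` and its regular expressions are assumptions made for this sketch, not the analyzer the patent describes, and a real system would use a proper SQL parser.

```python
import re

SAMPLE_QUERY = (
    "SELECT SUM(PRICE), YEAR FROM SALES_TABLE "
    "WHERE LOCATION = 'Shanghai' GROUP BY YEAR"
)


def analyze_query(sql):
    """Extract table, measures and dimensions from a simple aggregate query."""
    text = " ".join(sql.split())  # collapse runs of whitespace
    table = re.search(r"\bFROM (\w+)", text, re.IGNORECASE)
    where = re.search(r"\bWHERE (.+?)(?: GROUP BY | ORDER BY |$)", text, re.IGNORECASE)
    group = re.search(r"\bGROUP BY (.+?)(?: HAVING | ORDER BY |$)", text, re.IGNORECASE)
    # Measures: aggregate functions applied to a column in the SELECT list.
    measures = re.findall(r"\b(SUM|COUNT|AVG|MIN|MAX)\((\w+)\)", text, re.IGNORECASE)
    # Filter dimensions: identifiers that appear left of a comparison operator.
    filter_dims = (
        re.findall(r"(\w+)\s*(?:<=|>=|<>|=|<|>)", where.group(1)) if where else []
    )
    # Grouping dimensions: the comma-separated GROUP BY columns.
    group_dims = [c.strip() for c in group.group(1).split(",")] if group else []
    return {
        "table": table.group(1) if table else None,
        "measures": measures,
        "filter_dimensions": filter_dims,
        "group_dimensions": group_dims,
    }


print(analyze_query(SAMPLE_QUERY))
```

For the sample statement, the sketch reports LOCATION as a filter dimension, YEAR as a grouping dimension, and SUM over PRICE as the measure, which matches the reading given in the paragraph above.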
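When no match exists, the optimal dimension combination corresponding to the query statement and the optimal combination ordering corresponding to that combination are generated; a precomputed dimension combination matching the query statement is generated from them and stored. The query statement statistical analyzer then performs a precomputed query according to the matching precomputed dimension combination, obtains the desired query result, and saves it. Take the query statement "SELECT SUM(PRICE), YEAR FROM SALES_TABLE WHERE LOCATION = 'Shanghai' GROUP BY YEAR" as an example. As the analysis above shows, the columns YEAR and LOCATION are the required dimensions and the optimal dimension combination contains only these two columns, while the sub-optimal combinations may be the column YEAR, the column LOCATION, or the column PRICE. If no precomputed result satisfies the query, the required data must be read from the source data: in this example, the data of the three columns PRICE, YEAR, and LOCATION is read from the source data and then aggregated and filtered to obtain the final result. Such a query statement is sent to the "dynamic dimension combination generator" to generate precomputed results that speed up identical or similar query statements. The precomputed result usage monitor is used because user queries keep changing: the queries of interest this month differ from last month's, so previously generated precomputed results may lose their usefulness after a period of time, that is, they are no longer accessed by users. The monitor obtains the relevant information from the query pattern statistical analyzer, identifies precomputed results that have not been accessed for a long time, and clears them or moves them to other storage devices, which frees storage space and reduces storage pressure.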
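The eviction rule of the usage monitor (count uses per combination within a preset period, delete results below a preset threshold) can be sketched as below. The class name, the in-memory dictionary standing in for the precomputed result store, and the numeric timestamps are all assumptions for illustration; the patent does not specify how uses are recorded or where results are stored.

```python
class UsageMonitor:
    """Counts uses of each precomputed dimension combination and evicts cold ones."""

    def __init__(self, min_uses, window_seconds):
        self.min_uses = min_uses              # preset threshold
        self.window_seconds = window_seconds  # preset time period
        self.use_times = {}                   # combination -> list of use timestamps

    def record_use(self, combination, timestamp):
        """Note that a query was answered from this combination's result."""
        self.use_times.setdefault(combination, []).append(timestamp)

    def evict_cold(self, store, now):
        """Delete results used fewer than min_uses times in the last window."""
        evicted = []
        for combination in list(store):
            recent = [
                t for t in self.use_times.get(combination, ())
                if now - self.window_seconds <= t <= now
            ]
            if len(recent) < self.min_uses:
                del store[combination]
                evicted.append(combination)
        return evicted


store = {("LOCATION", "YEAR"): "result-1", ("CATEGORY",): "result-2"}
monitor = UsageMonitor(min_uses=2, window_seconds=100)
monitor.record_use(("LOCATION", "YEAR"), 950)
monitor.record_use(("LOCATION", "YEAR"), 990)
monitor.record_use(("CATEGORY",), 100)  # outside the window ending at 1000
print(monitor.evict_cold(store, now=1000))
```

The description also allows moving cold results to other storage instead of deleting them; that variant would replace the `del` with a transfer step.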
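In another Embodiment 2, the query statement statistical analyzer is further configured to, when no precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations, select a sub-optimal precomputed dimension combination from the pre-stored precomputed dimension combinations;
perform a precomputed query according to the sub-optimal precomputed dimension combination and the query statement to obtain a sub-optimal query result;
and aggregate the sub-optimal query result to obtain the desired query result.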
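One way to picture the sub-optimal path is a roll-up over a precomputed result that carries more dimensions than the query needs. The sketch below assumes a hypothetical precomputed result over YEAR, LOCATION, and CATEGORY holding SUM(PRICE) per group; the row values and the `roll_up` helper are invented for illustration.

```python
from collections import defaultdict

# Hypothetical precomputed result for the combination (YEAR, LOCATION, CATEGORY).
CUBOID = [
    {"YEAR": 2016, "LOCATION": "Shanghai", "CATEGORY": "Books", "SUM_PRICE": 10.0},
    {"YEAR": 2016, "LOCATION": "Shanghai", "CATEGORY": "Toys", "SUM_PRICE": 20.0},
    {"YEAR": 2017, "LOCATION": "Shanghai", "CATEGORY": "Books", "SUM_PRICE": 5.0},
    {"YEAR": 2016, "LOCATION": "Beijing", "CATEGORY": "Books", "SUM_PRICE": 7.0},
]


def roll_up(cuboid_rows, keep_dims, filters, measure):
    """Filter a wider precomputed result, then re-aggregate it on keep_dims."""
    totals = defaultdict(float)
    for row in cuboid_rows:
        if all(row[dim] == value for dim, value in filters.items()):
            key = tuple(row[dim] for dim in keep_dims)
            totals[key] += row[measure]  # SUM is additive, so partial sums combine
    return dict(totals)


# Answers: SELECT SUM(PRICE), YEAR ... WHERE LOCATION = 'Shanghai' GROUP BY YEAR
print(roll_up(CUBOID, ["YEAR"], {"LOCATION": "Shanghai"}, "SUM_PRICE"))
```

The extra aggregation step works here because SUM can be recombined from partial sums; measures such as AVG would need the underlying sum and count to be stored instead.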
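It can be understood that in Embodiment 2 the query statement "SELECT SUM(PRICE), YEAR FROM SALES_TABLE WHERE LOCATION = 'Shanghai' GROUP BY YEAR" again serves as the example. As the analysis above shows, the columns YEAR and LOCATION are the required dimensions, and the optimal dimension combination contains only these two columns. If a dimension combination exists that contains YEAR and LOCATION but is not limited to them, for example the combination YEAR, LOCATION, CATEGORY, that combination is a usable dimension combination for the query. Its precomputed result can still be used to accelerate the query, but the precomputed result needs some simple processing, such as online aggregation, to obtain the desired query result. The dynamic dimension combination generator receives from the query statement statistical analyzer the dimension combinations that need to be generated dynamically and then, based on already completed precomputed results or on the source data, generates new precomputed results for the dimension combination that matches the query. Note that the dynamic dimension combination generator must not only select the required dimensions but also attend to their order, that is, the storage layout of the final precomputed results.
For example, suppose the required dimensions are A, B, and C. Because the query uses C as the query condition, placing C first in the order benefits the query, and the generated dimension order is CAB. In storage, the results look like those in Table 1:
Dimension C    Dimension A    Dimension B
1              100            7
1              200            5
4              50             10
7              90             8
9              80             3
9              80             4
10             10             4
Table 1
As Table 1 shows, once the order of the dimension combination is taken into account, filter conditions on dimension C are more efficient: the data read is more concentrated, and efficiency is higher.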
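The benefit of placing the filtered dimension first can be illustrated with the rows of Table 1. Because the rows are stored sorted in C, A, B order, all rows for one value of C are contiguous and can be located with two binary searches instead of a full scan. The helper name and the use of Python's `bisect` module are choices made for this sketch, which also assumes numeric values in the second column.

```python
from bisect import bisect_left

# Rows of Table 1, stored in (C, A, B) order and therefore sorted by C first.
ROWS_CAB = [
    (1, 100, 7),
    (1, 200, 5),
    (4, 50, 10),
    (7, 90, 8),
    (9, 80, 3),
    (9, 80, 4),
    (10, 10, 4),
]


def scan_by_leading_dimension(rows, value):
    """Return the contiguous slice of rows whose first dimension equals value."""
    # (value,) sorts before every row that starts with value.
    start = bisect_left(rows, (value,))
    # (value, +inf) sorts after every such row (numeric second column assumed).
    end = bisect_left(rows, (value, float("inf")))
    return rows[start:end]


print(scan_by_leading_dimension(ROWS_CAB, 9))
```

Had the same rows been stored in A, B, C order, the rows with a given C value would be scattered and a filter on C would have to inspect every row.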
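In another Embodiment 3, the query statement statistical analyzer is specifically configured to receive the input query statement and perform statistical analysis on the data tables, dimensions, measures, and filter conditions used in the query statement.
It can be understood that in Embodiment 3 the query statement statistical analyzer collects every query statement of the user and analyzes and records the following information:
1) the data tables used by the query; 2) the dimensions used by the query and related information; 3) the measures used by the query and related information; 4) the filter conditions used by the query; 5) the number of occurrences and probability of identical queries; 6) the expected optimal dimension combination (including its ordering information); 7) other possible information.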
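The items listed above could be held in a record per distinct query plus a counter, as in the sketch below. The class and field names are hypothetical, and treating "probability" as the share of a query among all recorded queries is an assumption; the patent does not define how it is computed.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryProfile:
    """Items 1 to 4 and 6 of the list above for one distinct query."""
    tables: tuple
    dimensions: tuple
    measures: tuple
    filters: tuple
    preferred_order: tuple  # expected optimal combination, in storage order


class QueryStatistics:
    """Tracks item 5: how often each distinct query occurs."""

    def __init__(self):
        self.counts = Counter()

    def record(self, profile):
        self.counts[profile] += 1

    def occurrences(self, profile):
        return self.counts[profile]

    def probability(self, profile):
        total = sum(self.counts.values())
        return self.counts[profile] / total if total else 0.0


sales_by_year = QueryProfile(
    tables=("SALES_TABLE",),
    dimensions=("YEAR", "LOCATION"),
    measures=(("SUM", "PRICE"),),
    filters=(("LOCATION", "=", "Shanghai"),),
    preferred_order=("LOCATION", "YEAR"),
)
stats = QueryStatistics()
stats.record(sales_by_year)
print(stats.occurrences(sales_by_year), stats.probability(sales_by_year))
```

Making the profile frozen lets it serve directly as a counter key, so identical queries collapse into one entry.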
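In another Embodiment 4, the query statement statistical analyzer is further configured to, when neither a matching precomputed dimension combination nor a sub-optimal precomputed dimension combination exists in the query statement statistical analyzer,
read the data corresponding to the query statement directly from the source data;
and aggregate and filter the data read from the source data to obtain the desired query result.
It can be understood that in Embodiment 4 the query statement "SELECT SUM(PRICE), YEAR FROM SALES_TABLE WHERE LOCATION = 'Shanghai' GROUP BY YEAR" again serves as the example. As the analysis above shows, the columns YEAR and LOCATION are the required dimensions, and the optimal dimension combination contains only these two columns. If no precomputed result satisfies the query, the required data must be read from the source data: in this example, the data of the three columns PRICE, YEAR, and LOCATION is read from the source data and then aggregated and filtered to obtain the final result. Such a query statement is sent to the "dynamic dimension combination generator" to generate precomputed results that speed up identical or similar query statements.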
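The fallback path can be tried out by running the sample statement against a small in-memory table. The sketch uses Python's standard `sqlite3` module as a stand-in for the source data store; the table contents are invented, and the patent does not name any particular database.

```python
import sqlite3

SAMPLE_QUERY = (
    "SELECT SUM(PRICE), YEAR FROM SALES_TABLE "
    "WHERE LOCATION = 'Shanghai' GROUP BY YEAR"
)


def build_sample_source():
    """Create an in-memory SALES_TABLE with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE SALES_TABLE "
        "(PRICE REAL, YEAR INTEGER, LOCATION TEXT, CATEGORY TEXT)"
    )
    conn.executemany(
        "INSERT INTO SALES_TABLE VALUES (?, ?, ?, ?)",
        [
            (10.0, 2016, "Shanghai", "Books"),
            (20.0, 2016, "Shanghai", "Toys"),
            (5.0, 2017, "Shanghai", "Books"),
            (7.0, 2016, "Beijing", "Books"),
        ],
    )
    return conn


def query_source(conn):
    """Filter and aggregate directly on the source rows, sorted by YEAR."""
    rows = conn.execute(SAMPLE_QUERY).fetchall()
    return sorted(rows, key=lambda row: row[1])


print(query_source(build_sample_source()))
```

Every execution of this path rescans and re-aggregates the source rows, which is the cost the generated precomputed results are meant to avoid for repeated queries.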
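As shown in FIG. 2, Embodiment 5 of the present invention provides a novel method for generating precomputed results, which uses the OLAP precomputation model of Embodiments 1-4 above and comprises:
S1: receiving an input query statement and performing statistical analysis on the query statement;
S2: determining, according to the statistical analysis result, whether a precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations;
S3: when no matching precomputed dimension combination exists, generating the optimal dimension combination corresponding to the query statement and the optimal combination ordering corresponding to the optimal dimension combination, generating a precomputed dimension combination matching the query statement according to the optimal dimension combination and the optimal combination ordering, and storing the matching precomputed dimension combination;
S4: performing a precomputed query according to the matching precomputed dimension combination to obtain the desired query result;
S5: monitoring all precomputed dimension combinations generated by the dynamic dimension combination generator, determining the number of times the precomputed result corresponding to each precomputed dimension combination is used within a preset time period, and, if the number of uses is below a preset threshold, deleting the precomputed result corresponding to the precomputed dimension combination that is below the preset threshold.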
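The decision made in steps S2 and S3 can be sketched as two small functions. Both are assumptions made for illustration: matching is done on the set of dimensions, the smallest superset is picked as the sub-optimal candidate, and the ordering rule puts filter dimensions before grouping dimensions, in line with the CAB example of Embodiment 2. The patent itself does not fix these rules.

```python
def find_combination(stored, required):
    """Classify the lookup as 'exact', 'suboptimal' or 'source' (steps S2 and S3)."""
    required_set = frozenset(required)
    supersets = []
    for combination in stored:
        if frozenset(combination) == required_set:
            return "exact", combination
        if required_set < frozenset(combination):
            supersets.append(combination)
    if supersets:
        # Assumption: the smallest superset needs the least extra aggregation.
        return "suboptimal", min(supersets, key=len)
    return "source", None


def plan_order(filter_dims, group_dims):
    """Order a new combination with filter dimensions first (assumed heuristic)."""
    ordered = list(filter_dims)
    ordered.extend(dim for dim in group_dims if dim not in filter_dims)
    return tuple(ordered)


stored = [("YEAR", "LOCATION", "CATEGORY")]
print(find_combination(stored, ["YEAR", "LOCATION"]))
print(plan_order(["LOCATION"], ["YEAR"]))
```

In the "suboptimal" and "source" cases the query is still answered, and the planned combination would then be handed to the generator so that later identical queries hit the "exact" case.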
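It can be understood that in Embodiment 5 the user's query statements are continuously collected and statistically analyzed to determine whether a precomputed dimension combination matching the query statement exists among the previously stored precomputed dimension combinations. If none exists, the earlier precomputed results contain no matching precomputed dimension combination. For example, in the query statement "SELECT SUM(PRICE), YEAR FROM SALES_TABLE WHERE LOCATION = 'Shanghai' GROUP BY YEAR", YEAR and LOCATION are the required dimensions, and the combination containing only these two dimensions is the optimal combination, while the sub-optimal combinations may be the column YEAR, the column LOCATION, or the column PRICE. YEAR is the column in GROUP BY and LOCATION is the column in the WHERE filter condition; in general, query dimensions are extracted by analyzing these two places.
When no match exists, the optimal dimension combination corresponding to the query statement and the optimal combination ordering corresponding to that combination are generated; a precomputed dimension combination matching the query statement is generated from them and stored. The query statement statistical analyzer then performs a precomputed query according to the matching precomputed dimension combination, obtains the desired query result, and saves it. Take the query statement "SELECT SUM(PRICE), YEAR FROM SALES_TABLE WHERE LOCATION = 'Shanghai' GROUP BY YEAR" as an example. As the analysis above shows, the columns YEAR and LOCATION are the required dimensions, and the optimal dimension combination contains only these two columns. If no precomputed result satisfies the query, the required data must be read from the source data: in this example, the data of the three columns PRICE, YEAR, and LOCATION is read from the source data and then aggregated and filtered to obtain the final result. Such a query statement is sent to the "dynamic dimension combination generator" to generate precomputed results that speed up identical or similar query statements. The precomputed result usage monitor is used because user queries keep changing: the queries of interest this month differ from last month's, so previously generated precomputed results may lose their usefulness after a period of time, that is, they are no longer accessed by users. The monitor obtains the relevant information from the query pattern statistical analyzer, identifies precomputed results that have not been accessed for a long time, and clears them or moves them to other storage devices, which frees storage space and reduces storage pressure.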
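In another Embodiment 6, determining, according to the statistical analysis result, whether a precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations also comprises:
when no precomputed dimension combination matching the query statement exists among the pre-stored precomputed dimension combinations, selecting a sub-optimal precomputed dimension combination from the pre-stored precomputed dimension combinations;
performing a precomputed query according to the sub-optimal precomputed dimension combination to obtain a sub-optimal query result;
aggregating the sub-optimal query result to obtain the desired query result.
It can be understood that in Embodiment 6 the query statement "SELECT SUM(PRICE), YEAR FROM SALES_TABLE WHERE LOCATION = 'Shanghai' GROUP BY YEAR" again serves as the example. As the analysis above shows, the columns YEAR and LOCATION are the required dimensions, and the optimal dimension combination contains only these two columns. If a dimension combination exists that contains YEAR and LOCATION but is not limited to them, for example the combination YEAR, LOCATION, CATEGORY, that combination is a usable dimension combination for the query. Its precomputed result can still be used to accelerate the query, but the precomputed result needs some simple processing, such as online aggregation, to obtain the desired query result. The dynamic dimension combination generator receives from the query statement statistical analyzer the dimension combinations that need to be generated dynamically and then, based on already completed precomputed results or on the source data, generates new precomputed results for the dimension combination that matches the query. Note that the dynamic dimension combination generator must not only select the required dimensions but also attend to their order, that is, the storage layout of the final precomputed results.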
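For example, suppose the required dimensions are A, B and C. Because the query uses C as its query condition, placing C first in the arrangement benefits the query, and the generated dimension order is CAB. Reflected in storage, the result looks like Table 1: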
Dimension C    Dimension A    Dimension B
1              100            7
1              200            5
4              50             10
7              90             8
9              80             3
9              80             4
10             10             4
Table 1
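As Table 1 shows, once the ordering of the dimension combination is taken into account, a filter condition on dimension C becomes more efficient because the data to be read is more concentrated.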
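The benefit of putting the filtered dimension first can be seen with a binary search over the sorted rows of Table 1. This is only a sketch of the idea: the in-memory list, the separate key list `C_KEYS` and the helper `rows_with_c` are assumptions, and an actual storage engine would apply the same principle to its own sorted layout.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

# Rows of Table 1, stored in C, A, B order and therefore sorted by C first.
ROWS: List[Tuple[int, int, int]] = [
    (1, 100, 7),
    (1, 200, 5),
    (4, 50, 10),
    (7, 90, 8),
    (9, 80, 3),
    (9, 80, 4),
    (10, 10, 4),
]

# Leading-dimension keys, built once alongside the stored rows.
C_KEYS: List[int] = [row[0] for row in ROWS]


def rows_with_c(value: int) -> List[Tuple[int, int, int]]:
    """Return the contiguous slice of rows whose dimension C equals `value`."""
    lo = bisect_left(C_KEYS, value)
    hi = bisect_right(C_KEYS, value)
    return ROWS[lo:hi]
```

Because all rows with the same C value are adjacent, a filter on C reads one contiguous range instead of scanning the whole result.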
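In another Embodiment 7, the receiving an input query statement and performing statistical analysis on the query statement includes:
receiving an input query statement, and performing statistical analysis on the data tables, dimensions, measures and filter conditions used in the query statement.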
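Statistical analysis over tables, dimensions, measures and filter conditions can be sketched with simple frequency counters. The `QueryPattern` record and the `QueryStats` class are hypothetical names, and the sketch assumes each query statement has already been parsed into these four parts.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class QueryPattern:
    table: str
    dimensions: Tuple[str, ...]  # columns from GROUP BY and WHERE
    measures: Tuple[str, ...]    # aggregate expressions such as SUM(PRICE)
    filters: Tuple[str, ...]     # columns used in filter conditions


class QueryStats:
    """Accumulates how often each element appears across observed queries."""

    def __init__(self) -> None:
        self.tables: Counter = Counter()
        self.dimensions: Counter = Counter()
        self.measures: Counter = Counter()
        self.filters: Counter = Counter()
        self.combinations: Counter = Counter()

    def observe(self, query: QueryPattern) -> None:
        self.tables[query.table] += 1
        self.dimensions.update(query.dimensions)
        self.measures.update(query.measures)
        self.filters.update(query.filters)
        # The whole dimension set is counted too, since it is the unit
        # that a pre-computed dimension combination has to cover.
        self.combinations[frozenset(query.dimensions)] += 1
```

Counting whole dimension sets, and not only individual columns, makes it possible to see which combinations recur often enough to be worth pre-computing.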
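In another Embodiment 8, the determining, according to the statistical analysis result, whether a pre-computed dimension combination matching the query statement exists among the pre-stored pre-computed dimension combinations further includes:
when no pre-computed dimension combination matching the query statement exists among the pre-stored pre-computed dimension combinations, and no sub-optimal pre-computed dimension combination exists either, reading the data corresponding to the query statement directly from the source data;
aggregating and filtering the data read from the source data to obtain the expected query result.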
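It can be understood that in Embodiment 8 the query statement "SELECT SUM(PRICE), YEAR FROM SALES_TABLE WHERE LOCATION = 'Shanghai' GROUP BY YEAR" serves as an example. As the above analysis shows, the columns YEAR and LOCATION are the required dimensions, and the optimal dimension combination is the one containing only these two columns. If no pre-computed result satisfies the query, however, the required data must be read from the source data: in this example, the three columns PRICE, YEAR and LOCATION are read from the source data and then aggregated and filtered to obtain the final result. Such a query statement is sent to the "dynamic dimension combination generator" so that a pre-computed result is generated to accelerate identical or similar query statements.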
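The fallback path of Embodiment 8 can be sketched as a direct scan of the source rows followed by a hand-off to the generator. The sample rows, the `generation_queue` list standing in for the hand-off, and the function name `answer_from_source` are all assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical source data limited to the three columns the query needs.
SOURCE_ROWS: List[Dict[str, object]] = [
    {"PRICE": 10.0, "YEAR": 2016, "LOCATION": "Shanghai"},
    {"PRICE": 15.0, "YEAR": 2016, "LOCATION": "Shanghai"},
    {"PRICE": 7.0, "YEAR": 2017, "LOCATION": "Shanghai"},
    {"PRICE": 99.0, "YEAR": 2017, "LOCATION": "Beijing"},
]

# Stand-in for sending the query to the dynamic dimension combination generator.
generation_queue: List[Tuple[str, ...]] = []


def answer_from_source(rows: List[Dict[str, object]],
                       location: str) -> Dict[int, float]:
    """Filter and aggregate source rows, then request a pre-computed result."""
    totals: Dict[int, float] = defaultdict(float)
    for row in rows:
        if row["LOCATION"] == location:          # filter
            totals[row["YEAR"]] += row["PRICE"]  # aggregate SUM(PRICE) per YEAR
    # Ask for the (LOCATION, YEAR) combination so later queries are faster.
    generation_queue.append(("LOCATION", "YEAR"))
    return dict(totals)
```

The first such query pays the full scan cost; once the queued combination has been generated, identical or similar queries can be served from the pre-computed result.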
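In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.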

Claims (8)

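  1. A novel OLAP pre-computation model, characterized in that the OLAP pre-computation model comprises: a query statement statistical analyzer, a dynamic dimension combination generator, and a pre-computed result usage monitor;
    the query statement statistical analyzer is configured to receive an input query statement and perform statistical analysis on the query statement, and to determine, according to the statistical analysis result, whether a pre-computed dimension combination matching the query statement exists among the pre-stored pre-computed dimension combinations;
    the dynamic dimension combination generator is configured to, when no matching pre-computed dimension combination exists, generate an optimal dimension combination corresponding to the query statement and an optimal combination ordering corresponding to the optimal dimension combination, generate a pre-computed dimension combination matching the query statement according to the optimal dimension combination and the optimal combination ordering, and store the matching pre-computed dimension combination;
    the query statement statistical analyzer is further configured to perform a pre-computed query according to the matching pre-computed dimension combination to obtain an expected query result;
    the pre-computed result usage monitor is configured to monitor all pre-computed dimension combinations generated by the dynamic dimension combination generator, determine the number of times the pre-computed result corresponding to each pre-computed dimension combination is used within a preset time period, and, if the number of uses is below a preset threshold, delete the pre-computed result corresponding to the pre-computed dimension combination whose number of uses is below the preset threshold.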
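  2. The OLAP pre-computation model according to claim 1, characterized in that the query statement statistical analyzer is further configured to, when no pre-computed dimension combination matching the query statement exists among the pre-stored pre-computed dimension combinations, select a sub-optimal pre-computed dimension combination from the pre-stored pre-computed dimension combinations;
    perform a pre-computed query according to the sub-optimal pre-computed dimension combination to obtain a sub-optimal query result; and
    perform an aggregation operation on the sub-optimal query result to obtain the expected query result.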
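  3. The OLAP pre-computation model according to claim 1 or 2, characterized in that the query statement statistical analyzer is specifically configured to receive an input query statement and perform statistical analysis on the data tables, dimensions, measures and filter conditions used in the query statement.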
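  4. The OLAP pre-computation model according to claim 3, characterized in that the query statement statistical analyzer is further configured to, when no matching pre-computed dimension combination exists among the pre-stored pre-computed dimension combinations and no sub-optimal pre-computed dimension combination exists either, read the data corresponding to the query statement directly from the source data; and
    aggregate and filter the data read from the source data to obtain the expected query result.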
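  5. A novel method for generating pre-computed results, characterized in that it uses the OLAP pre-computation model according to any one of claims 1-4, the method comprising:
    receiving an input query statement and performing statistical analysis on the query statement;
    determining, according to the statistical analysis result, whether a pre-computed dimension combination matching the query statement exists among the pre-stored pre-computed dimension combinations;
    when no matching pre-computed dimension combination exists, generating the optimal dimension combination corresponding to the query statement and the optimal combination ordering corresponding to the optimal dimension combination, generating a pre-computed dimension combination matching the query statement according to the optimal dimension combination and the optimal combination ordering, and storing the matching pre-computed dimension combination;
    performing a pre-computed query according to the matching pre-computed dimension combination to obtain an expected query result;
    monitoring all generated pre-computed dimension combinations, determining the number of times the pre-computed result corresponding to each pre-computed dimension combination is used within a preset time period, and, if the number of uses is below a preset threshold, deleting the pre-computed result corresponding to the pre-computed dimension combination whose number of uses is below the preset threshold.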
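  6. The method according to claim 5, characterized in that the determining, according to the statistical analysis result, whether a pre-computed dimension combination matching the query statement exists among the pre-stored pre-computed dimension combinations further comprises:
    when no pre-computed dimension combination matching the query statement exists among the pre-stored pre-computed dimension combinations, selecting a sub-optimal pre-computed dimension combination from the pre-stored pre-computed dimension combinations;
    performing a pre-computed query according to the sub-optimal pre-computed dimension combination to obtain a sub-optimal query result;
    performing an aggregation operation on the sub-optimal query result to obtain the expected query result.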
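  7. The method according to claim 6, characterized in that the receiving an input query statement and performing statistical analysis on the query statement comprises:
    receiving an input query statement, and performing statistical analysis on the data tables, dimensions, measures and filter conditions used in the query statement.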
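  8. The method according to claim 7, characterized in that the determining, according to the statistical analysis result, whether a pre-computed dimension combination matching the query statement exists among the pre-stored pre-computed dimension combinations further comprises:
    when no pre-computed dimension combination matching the query statement exists among the pre-stored pre-computed dimension combinations, and no sub-optimal pre-computed dimension combination exists either, reading the data corresponding to the query statement directly from the source data;
    aggregating and filtering the data read from the source data to obtain the expected query result.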
PCT/CN2018/073318 2018-01-11 2018-01-19 Novel OLAP pre-calculation model and method for generating pre-calculation result WO2019019573A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP18837795.6A EP3709127A4 (en) 2018-01-11 2018-01-19 NEW OLAP PRE-CALCULATION MODEL AND METHOD FOR GENERATING A PRE-CALCULATION RESULT
US15/769,416 US20200097483A1 (en) 2018-01-11 2018-01-19 Novel olap pre-calculation model and method for generating pre-calculation result

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810025700.4A CN108376143B (zh) 2018-01-11 2018-01-11 Novel OLAP pre-calculation system and method for generating pre-calculation result
CN201810025700.4 2018-01-11

Publications (1)

Publication Number Publication Date
WO2019019573A1 true WO2019019573A1 (zh) 2019-01-31

Family

ID=63016714

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/073318 WO2019019573A1 (zh) 2018-01-11 2018-01-19 Novel OLAP pre-calculation model and method for generating pre-calculation result

Country Status (4)

Country Link
US (1) US20200097483A1 (zh)
EP (1) EP3709127A4 (zh)
CN (1) CN108376143B (zh)
WO (1) WO2019019573A1 (zh)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
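FR3086412B1 (fr) * 2018-09-20 2020-10-30 Amadeus Sas Recomputation of pre-computed search results
CN110110165B (zh) * 2019-04-01 2021-04-02 跬云(上海)信息科技有限公司 Dynamic routing method and device for a query engine in a pre-computation system
CN110222124A (zh) * 2019-05-08 2019-09-10 跬云(上海)信息科技有限公司 OLAP-based multidimensional data processing method and system
CN110297858B (zh) * 2019-05-27 2021-11-09 苏宁云计算有限公司 Execution plan optimization method and apparatus, computer device and storage medium
CN111143397B (zh) * 2019-12-10 2021-04-13 跬云(上海)信息科技有限公司 Hybrid data query method and device, and storage medium
CN111125264B (zh) * 2019-12-12 2021-05-28 跬云(上海)信息科技有限公司 Method and device for analyzing very large sets based on an extended OLAP model
CN111143411A (zh) * 2019-12-23 2020-05-12 跬云(上海)信息科技有限公司 Dynamic streaming pre-computation method and device, and storage medium
CN112445814A (zh) * 2020-12-15 2021-03-05 北京乐学帮网络技术有限公司 Data acquisition method and apparatus, computer device and storage medium
CN112965991B (zh) * 2021-03-08 2023-12-08 咪咕文化科技有限公司 Pre-computed result generation method and apparatus, electronic device and storage medium
CN116644098B (zh) * 2023-05-15 2024-01-30 绵阳市商业银行股份有限公司 Method for implementing self-identifying flexible query and automated assembly of multidimensional analysis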

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060074901A1 (en) * 2004-09-30 2006-04-06 Pirahesh Mir H Canonical abstraction for outerjoin optimization
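CN106997386A (zh) * 2017-03-28 2017-08-01 上海跬智信息技术有限公司 OLAP pre-calculation model, automatic modeling method, and automatic modeling system
CN107169070A (zh) * 2017-05-08 2017-09-15 山大地纬软件股份有限公司 System and method for constructing a big-data-based social security indicator warehouse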

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7181450B2 (en) * 2002-12-18 2007-02-20 International Business Machines Corporation Method, system, and program for use of metadata to create multidimensional cubes in a relational database
US20090287666A1 (en) * 2008-05-13 2009-11-19 International Business Machines Corporation Partitioning of measures of an olap cube using static and dynamic criteria
US10275484B2 (en) * 2013-07-22 2019-04-30 International Business Machines Corporation Managing sparsity in a multidimensional data structure

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3709127A4 *

Also Published As

Publication number Publication date
CN108376143B (zh) 2019-12-27
CN108376143A (zh) 2018-08-07
EP3709127A4 (en) 2021-01-20
US20200097483A1 (en) 2020-03-26
EP3709127A1 (en) 2020-09-16

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 18837795.6

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2018837795

Country of ref document: EP

Effective date: 20200611

NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18837795

Country of ref document: EP

Kind code of ref document: A1