WO2020187023A1 - Data configuration query method and device - Google Patents

Data configuration query method and device

Info

Publication number
WO2020187023A1
Authority
WO
WIPO (PCT)
Prior art keywords
association
data sets
target data
target
olap
Prior art date
Application number
PCT/CN2020/077710
Other languages
English (en)
French (fr)
Inventor
张逸凡
吴逸飞
李扬
韩卿
Original Assignee
跬云(上海)信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 跬云(上海)信息科技有限公司
Priority to US17/051,008 (US11281698B2)
Publication of WO2020187023A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2453: Query optimisation
    • G06F16/24534: Query rewriting; Transformation
    • G06F16/24542: Plan optimisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/283: Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Definitions

  • This application relates to the technical field of data configuration query, and specifically, to a data configuration query method and device.
  • OLAP: Online Analytical Processing.
  • Data models are the basis of OLAP analysis.
  • The bottom layer of the OLAP analysis architecture is a data warehouse, which contains a series of data tables; modelers design data models based on these tables, according to business analysis requirements, for analysts to use; ultimately, the analysts' analysis operations are converted into a series of SQL queries against the data tables.
  • SQL: Structured Query Language.
  • The data model gives the data tables business meaning and decouples the underlying data from the business requirements. How to effectively associate queries with models, and thereby make maximum use of OLAP analysis to serve the business, is a very important part of this.
  • The OLAP data model is the core element of the data-processing logic of an OLAP analysis engine. It serves SQL business queries, so its feature information is closely tied to the content and form of those queries. The basic information includes fact tables, dimension tables, association (join) methods, dimensions, measures, and so on. Sometimes a business query is associated with one specific model; in relatively complex scenarios that require cross-analysis of different business data, a combination of models is often needed to obtain the final analysis result.
  • The process of associating an SQL query with OLAP models is carried out by the query execution engine.
  • The main process includes: parsing the SQL statement, generating the SQL syntax tree, analyzing the syntax tree and converting it into a query execution plan (query execution flow), confirming the OLAP model, generating the physical execution plan, extracting pre-calculated results, combining and analyzing the pre-calculated results, and outputting the final result.
  • the main purpose of this application is to provide a data configuration query method and device to solve the problems of a large number of OLAP models included in an OLAP query system and a low utilization rate of OLAP models in related technologies.
  • this application provides a data configuration query method, which is applied to an online analytical processing OLAP query system, and the method includes:
  • Determining the at least two target data sets that need to be queried by the query instruction and the orderly association between the target data sets includes:
  • When the association information between two target data sets is included in the equivalent association information, it is determined that the orderly association between the two target data sets is a two-way association.
  • Determining the at least two target data sets that need to be queried by the query instruction and the orderly association between the target data sets further includes:
  • When the association information between two target data sets is not included in the equivalent association information, it is determined that the orderly association between the two target data sets is a one-way association.
  • Outputting the OLAP model that conforms to the target association path in the database includes:
  • When the OLAP model conforms to the target association path, the OLAP model is output.
  • this application also provides a data configuration query device, which is applied to an OLAP query system, and includes:
  • the determining module is used to determine at least two target data sets that are required to be queried by the query instruction and the orderly association between the target data sets, where the orderly association includes at least one-way association and/or two-way association;
  • a generating module for generating a target association path based on the orderly association between the target data sets in at least two target data sets;
  • the output module is used to output the OLAP model that conforms to the target association path in the database.
  • Optionally, the determining module is used for: when the association information between two target data sets is included in the equivalent association information, determining that the orderly association between the two target data sets is a two-way association.
  • Optionally, the determining module is also used for: when the association information between two target data sets is not included in the equivalent association information, determining that the orderly association between the two target data sets is a one-way association.
  • Optionally, the output module is used for:
  • When the OLAP model conforms to the target association path, outputting the OLAP model.
  • this application also provides a computer device, which includes:
  • One or more processors;
  • A memory used to store one or more computer programs;
  • When the one or more computer programs are executed by the one or more processors, the one or more processors are caused to implement the above-mentioned data configuration query method.
  • This application also provides a computer-readable storage medium that stores computer code.
  • When the computer code is executed, the above-mentioned data configuration query method is executed.
  • In the data configuration query method provided by this application, at least two target data sets that need to be queried by the query instruction and the orderly association between the target data sets are determined, where the orderly association includes at least one-way association and/or two-way association; a target association path is generated based on the orderly association between the target data sets in the at least two target data sets; and an OLAP model that conforms to the target association path is output from the database.
  • With this method, two-way associations between target data sets can be identified, so that equivalent or similar OLAP models can be replaced by a single OLAP model. This broadens the scope of application of the OLAP data model, reduces the number of OLAP models required, improves OLAP model utilization, reuses existing OLAP models to the greatest extent, avoids the redundant models previously created to support similar analysis processes, and improves query execution efficiency, thereby solving the technical problems in the related art that an OLAP query system requires a large number of OLAP models and that OLAP model utilization is low.
  • FIG. 1 is a schematic flowchart of a data configuration query method provided by an embodiment of the present application.
  • FIG. 2 is a directed graph of a target association path provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of step 100 provided by an embodiment of the present application.
  • FIG. 4 is another schematic flowchart of step 100 provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of step 300 provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an OLAP model provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a data configuration query device provided by an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a data configuration query method provided by an embodiment of the present application. As shown in FIG. 1, the method includes the following steps 100 to 300:
  • 100 Determine at least two target data sets that are required to be queried by the query instruction and an orderly association between the target data sets, where the orderly association includes at least one-way association and/or two-way association.
  • The query instruction can be an instruction entered by the user through a user terminal to query business data.
  • The query instruction includes at least one sequence of instruction characters.
  • For example, the query instruction is an SQL query instruction for SQL business.
  • The query instruction contains the information of multiple target data sets and the association information between the target data sets, so the orderly association between the target data sets can be determined according to the association information; the orderly association includes at least one-way association and/or two-way association.
  • As an example, the query instruction is an SQL query instruction for SQL business that contains information about four target data sets A, B, C, and D and the association information between them: the association information between A and B is INNER JOIN, between A and D is LEFT JOIN, and between B and C is LEFT JOIN.
  • Therefore, the orderly association between A and D is a one-way association from A to D.
  • The orderly association between B and C is a one-way association from B to C.
  • The association information INNER JOIN between A and B is an equivalent association; that is, "A INNER JOIN B" and "B INNER JOIN A" can be expressed in a unified way, so only one OLAP model needs to be defined.
  • The orderly association between A and B is a two-way association between A and B.
  • the OLAP model can be shown in Figure 6.
  • the query command contains the information of four target data sets A, B, C, and D and the association information between the target data sets.
  • the orderly association between A and D is a one-way association from A to D
  • the orderly association between B and C is a one-way association from B to C
  • the orderly association between A and B is a two-way association between A and B
  • the directed graph representing the target association path is shown in Figure 2.
  • The target association path can start from A or B; that is, the target association path includes two paths.
  • The first path is: A to B and then to C, and A to D.
  • The second path is: B to A and then to D, and B to C. Therefore, the requirements of the query instruction can be met if there is an OLAP model that conforms to the first path or an OLAP model that conforms to the second path.
  • The OLAP model conforming to the first path and the OLAP model conforming to the second path are expressed in the same way, so only one OLAP model needs to be defined in the OLAP query system.
  • the OLAP model conforming to the target association path is matched in the database, and the OLAP model is output.
  • In this way, equivalent or similar OLAP models can be replaced by a single OLAP model, which broadens the scope of application of the OLAP data model, reduces the number of OLAP models required, improves OLAP model utilization, and reuses existing OLAP models to the greatest extent. It avoids the redundant models previously created to support similar analysis processes and improves query execution efficiency.
  • FIG. 3 is a schematic flowchart of step 100 provided in an embodiment of the present application.
  • As shown in FIG. 3, step 100, determining the at least two target data sets that need to be queried by the query instruction and the orderly association between the target data sets, includes the following steps 110 to 130:
  • 110: Based on the character sequence information of the query instruction, identify the at least two target data sets and the association information between the target data sets.
  • 120: Determine whether the association information between two target data sets is included in the equivalent association information.
  • 130: When the association information between the two target data sets is included in the equivalent association information, determine that the orderly association between the two target data sets is a two-way association.
  • Specifically, step 100 includes identifying all target data sets (at least two target data sets) and the association information between them (association characters, such as LEFT JOIN) based on the character sequence information of the query instruction, and then determining whether the association information between two target data sets belongs to the equivalent association information (equivalent association characters, such as INNER JOIN). When it does, the orderly association between the two target data sets is determined to be a two-way association.
  • FIG. 4 is another schematic flowchart of step 100 provided in an embodiment of the present application.
  • As shown in FIG. 4, step 100, determining the at least two target data sets that need to be queried by the query instruction and the orderly association between the target data sets, further includes the following step 140:
  • 140: When the association information between the two target data sets is not included in the equivalent association information, determine that the orderly association between the two target data sets is a one-way association.
  • Specifically, when determining whether the association information between two target data sets belongs to the equivalent association information (equivalent association characters; for example, INNER JOIN is an equivalent association character), if the association information is not included in the equivalent association information (for example, LEFT JOIN is not an equivalent association character), the orderly association between the two target data sets is determined to be a one-way association. In this way, through steps 110 to 140, the orderly association between the target data sets can be determined.
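  • FIG. 5 is a schematic flowchart of step 300 provided by an embodiment of the present application.
  • As shown in FIG. 5, step 300, outputting the OLAP model that conforms to the target association path in the database, includes the following steps 310 to 330:
  • 310: Filter out, from the database, the OLAP models that contain only the at least two target data sets that the query instruction needs to query.
  • 320: Take any one target data set contained in the OLAP model as a candidate center and match it against the target association path to determine whether the OLAP model conforms to the target association path.
  • 330: When the OLAP model conforms to the target association path, output the OLAP model.
  • Specifically, the OLAP models that contain only all of the target data sets are determined based on the at least two target data sets that the query instruction needs to query. For each such OLAP model, any one target data set contained in the model is taken as a candidate center and matched against the target association path to determine whether the model conforms to the target association path. When the OLAP model conforms to the target association path, the OLAP model is output for subsequent processing.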
  • FIG. 7 is a schematic structural diagram of a data configuration query device provided in an embodiment of the present application. As shown in FIG. 7, the device is applied in an OLAP query system, and the device includes:
  • the determining module 10 is configured to determine at least two target data sets that are required to be queried by the query instruction and an orderly association between the target data sets, where the orderly association includes at least one-way association and/or two-way association;
  • the generating module 20 is configured to generate a target association path based on the orderly association between the target data sets in at least two target data sets;
  • the output module 30 is used for outputting the OLAP model that conforms to the target association path in the database.
  • Optionally, the determining module 10 is used to:
  • When the association information between two target data sets is included in the equivalent association information, determine that the orderly association between the two target data sets is a two-way association.
  • Optionally, the determining module 10 is also used to:
  • When the association information between two target data sets is not included in the equivalent association information, determine that the orderly association between the two target data sets is a one-way association.
  • Optionally, the output module 30 is used to:
  • When the OLAP model conforms to the target association path, output the OLAP model.
  • In the data configuration query device provided by this application, the determining module 10 is used to determine at least two target data sets that need to be queried by the query instruction and the orderly association between the target data sets, where the orderly association includes at least one-way association and/or two-way association; the generating module 20 is used to generate a target association path based on the orderly association between the target data sets in the at least two target data sets; and the output module 30 is used to output the OLAP model that conforms to the target association path in the database.
  • In this way, by confirming the two-way associations between target data sets, equivalent or similar OLAP models can be replaced by a single OLAP model, which broadens the scope of application of the OLAP data model, reduces the number of OLAP models required, and improves OLAP model utilization.
  • an embodiment of the present application also provides a computer device, which includes:
  • One or more processors;
  • A memory used to store one or more computer programs;
  • When the one or more computer programs are executed by the one or more processors, the one or more processors are caused to implement the aforementioned data configuration query method.
  • The embodiments of the present application also provide a computer-readable storage medium that stores computer code.
  • When the computer code is executed, the above-mentioned data configuration query method is executed.
  • The modules or steps of the present invention described above can be implemented by a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network composed of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. In this way, the present invention is not limited to any specific combination of hardware and software.
  • The computer program involved in this application can be stored in a computer-readable storage medium, which can include: any physical or virtual device capable of carrying computer program code, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunications signal, other software distribution media, and so on.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Operations Research (AREA)
  • Computational Linguistics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

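This application discloses a data configuration query method and device. The method includes: determining at least two target data sets that a query instruction needs to query and the orderly association between the target data sets, where the orderly association includes at least one-way association and/or two-way association; generating a target association path based on the orderly association between the target data sets in the at least two target data sets; and outputting, from a database, an OLAP model that conforms to the target association path. This application can solve the technical problems in the related art that an OLAP query system requires a large number of OLAP models and that OLAP model utilization is low.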

Description

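Data configuration query method and device
Cross-reference to related applications
This application claims priority to Chinese patent application No. 2019102146157, entitled "Data configuration query method and device", filed with the Chinese Patent Office on March 20, 2019, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the technical field of data configuration query, and specifically to a data configuration query method and device.
Background
In today's data-driven era, how to use OLAP (Online Analytical Processing) to analyze massive, complex data in support of business decisions is an important topic in business intelligence and data analysis, and data models are the basis of OLAP analysis. The bottom layer of the OLAP analysis architecture is a data warehouse, which contains a series of data tables. Modelers design data models based on these tables, according to business analysis requirements, for analysts to use. Ultimately, the analysts' analysis operations are converted into a series of SQL (Structured Query Language) queries against the data tables. The data model gives the data tables business meaning and decouples the underlying data from the business requirements. How to effectively associate queries with models, and thereby make maximum use of OLAP analysis to serve the business, is a very important part of this.
The OLAP data model is the core element of the data-processing logic of an OLAP analysis engine. It serves SQL business queries, so its feature information is closely tied to the content and form of those queries. The basic information includes fact tables, dimension tables, association (join) methods, dimensions, measures, and so on. Sometimes a business query is associated with one specific model; in relatively complex scenarios that require cross-analysis of different business data, a combination of models is often needed to obtain the final analysis result.
The process of associating an SQL query with OLAP models is carried out by the query execution engine. Its main flow includes: parsing the SQL statement, generating the SQL syntax tree, analyzing the syntax tree and converting it into a query execution plan (query execution flow), confirming the OLAP model, generating the physical execution plan, extracting pre-calculated results, combining and analyzing the pre-calculated results, and outputting the final result.
Because the selection and matching logic for OLAP models in the query engine is relatively fixed, the whole process places strict requirements on the target model and cannot adapt to equivalent or similar models. As a result, the number of OLAP models in the system keeps growing as queries are added, which creates difficulties and challenges for the storage, management, and operation and maintenance of the overall system.
No effective solution has yet been proposed for the problems in the related art that an OLAP query system requires a large number of OLAP models and that OLAP model utilization is low.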
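Summary of the invention
The main purpose of this application is to provide a data configuration query method and device, so as to solve the problems in the related art that an OLAP query system requires a large number of OLAP models and that OLAP model utilization is low.
To achieve the above purpose, in a first aspect, this application provides a data configuration query method, which is applied in an online analytical processing (OLAP) query system and includes:
determining at least two target data sets that a query instruction needs to query and the orderly association between the target data sets, where the orderly association includes at least one-way association and/or two-way association;
generating a target association path based on the orderly association between the target data sets in the at least two target data sets;
outputting, from a database, an OLAP model that conforms to the target association path.
Optionally, determining the at least two target data sets that the query instruction needs to query and the orderly association between the target data sets includes:
identifying, based on the character sequence information of the query instruction, the at least two target data sets and the association information between the target data sets;
determining whether the association information between two target data sets is included in equivalent association information;
when the association information between the two target data sets is included in the equivalent association information, determining that the orderly association between the two target data sets is a two-way association.
Optionally, determining the at least two target data sets that the query instruction needs to query and the orderly association between the target data sets further includes:
when the association information between the two target data sets is not included in the equivalent association information, determining that the orderly association between the two target data sets is a one-way association.
Optionally, outputting, from the database, the OLAP model that conforms to the target association path includes:
filtering out, from the database, the OLAP models that contain only the at least two target data sets that the query instruction needs to query;
taking any one target data set contained in the OLAP model as a candidate center and matching it against the target association path to determine whether the OLAP model conforms to the target association path;
when the OLAP model conforms to the target association path, outputting the OLAP model.
In a second aspect, this application also provides a data configuration query device, which is applied in an OLAP query system and includes:
a determining module, used to determine at least two target data sets that a query instruction needs to query and the orderly association between the target data sets, where the orderly association includes at least one-way association and/or two-way association;
a generating module, used to generate a target association path based on the orderly association between the target data sets in the at least two target data sets;
an output module, used to output, from a database, an OLAP model that conforms to the target association path.
Optionally, the determining module is used to:
identify, based on the character sequence information of the query instruction, the at least two target data sets and the association information between the target data sets;
determine whether the association information between two target data sets is included in equivalent association information;
when the association information between the two target data sets is included in the equivalent association information, determine that the orderly association between the two target data sets is a two-way association.
Optionally, the determining module is used to:
when the association information between the two target data sets is not included in the equivalent association information, determine that the orderly association between the two target data sets is a one-way association.
Optionally, the output module is used to:
filter out, from the database, the OLAP models that contain only the at least two target data sets that the query instruction needs to query;
take any one target data set contained in the OLAP model as a candidate center and match it against the target association path to determine whether the OLAP model conforms to the target association path;
when the OLAP model conforms to the target association path, output the OLAP model.
In a third aspect, this application also provides a computer device, which includes:
one or more processors;
a memory, used to store one or more computer programs;
when the one or more computer programs are executed by the one or more processors, the one or more processors are caused to implement the data configuration query method described above.
In a fourth aspect, this application also provides a computer-readable storage medium that stores computer code; when the computer code is executed, the data configuration query method described above is executed.
In the data configuration query method provided by this application, at least two target data sets that a query instruction needs to query and the orderly association between the target data sets are determined, where the orderly association includes at least one-way association and/or two-way association; a target association path is generated based on the orderly association between the target data sets in the at least two target data sets; and an OLAP model that conforms to the target association path is output from the database. With this method, two-way associations between target data sets can be identified, so that equivalent or similar OLAP models can be replaced by a single OLAP model. This broadens the scope of application of the OLAP data model, reduces the number of OLAP models required, improves OLAP model utilization, reuses existing OLAP models to the greatest extent, avoids the redundant models previously created to support similar analysis processes, and improves query execution efficiency, thereby solving the technical problems in the related art that an OLAP query system requires a large number of OLAP models and that OLAP model utilization is low.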
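Brief description of the drawings
The drawings, which form a part of this application, are provided for a further understanding of this application and make its other features, purposes, and advantages more apparent. The drawings of the illustrative embodiments of this application and their descriptions are used to explain this application and do not unduly limit it. In the drawings:
FIG. 1 is a schematic flowchart of a data configuration query method provided by an embodiment of this application;
FIG. 2 is a directed graph of a target association path provided by an embodiment of this application;
FIG. 3 is a schematic flowchart of step 100 provided by an embodiment of this application;
FIG. 4 is another schematic flowchart of step 100 provided by an embodiment of this application;
FIG. 5 is a schematic flowchart of step 300 provided by an embodiment of this application;
FIG. 6 is a schematic diagram of an OLAP model provided by an embodiment of this application;
FIG. 7 is a schematic structural diagram of a data configuration query device provided by an embodiment of this application.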
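Detailed description
To help those skilled in the art better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort shall fall within the scope of protection of this application.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable under appropriate circumstances, so that the embodiments of this application described herein can be implemented. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments can be combined with one another. This application is described in detail below with reference to the drawings and in conjunction with the embodiments.
According to one aspect of this application, an embodiment of this application provides a data configuration query method, which is applied in an OLAP query system. FIG. 1 is a schematic flowchart of a data configuration query method provided by an embodiment of this application. As shown in FIG. 1, the method includes the following steps 100 to 300:
100: Determine at least two target data sets that a query instruction needs to query and the orderly association between the target data sets, where the orderly association includes at least one-way association and/or two-way association.
The query instruction can be an instruction entered by a user through a user terminal to query business data, and it includes at least one sequence of instruction characters. For example, the query instruction is an SQL query instruction for SQL business. It contains the information of multiple target data sets and the association information between the target data sets, so the orderly association between the target data sets can be determined according to the association information; the orderly association includes at least one-way association and/or two-way association.
As an example, the query instruction is an SQL query instruction for SQL business that contains the information of four target data sets A, B, C, and D and the association information between them: the association information between A and B is INNER JOIN, between A and D is LEFT JOIN, and between B and C is LEFT JOIN. It can therefore be determined that the orderly association between A and D is a one-way association from A to D, and that the orderly association between B and C is a one-way association from B to C. The association information INNER JOIN between A and B is an equivalent association; that is, "A INNER JOIN B" and "B INNER JOIN A" can be expressed in a unified way, so only one OLAP model needs to be defined, and the orderly association between A and B is a two-way association. The OLAP model may be as shown in FIG. 6.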
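The identification of target data sets and association information from the character sequence of a query (step 110 below) can be sketched in Python. This is only an illustrative sketch, not the implementation described in this application: the regular expression, the function name `extract_associations`, the column names in the sample query, and the restriction to simple `JOIN ... ON x.col = y.col` clauses are all assumptions made for the example. A real query engine would work from the parsed SQL syntax tree instead.

```python
import re

# Hypothetical pattern: matches "<TYPE> JOIN <table> ON <t1>.<col> = <t2>.<col>".
JOIN_PATTERN = re.compile(
    r"\b(INNER|LEFT|RIGHT|FULL)\s+(?:OUTER\s+)?JOIN\s+(\w+)"
    r"\s+ON\s+(\w+)\.\w+\s*=\s*(\w+)\.\w+",
    re.IGNORECASE,
)


def extract_associations(sql):
    """Return (left data set, association info, right data set) triples."""
    triples = []
    for join_type, joined, lhs, rhs in JOIN_PATTERN.findall(sql):
        # The table named after JOIN is the right side; the other table
        # mentioned in the ON clause is treated as the left side.
        other = lhs if rhs == joined else rhs
        triples.append((other, join_type.upper() + " JOIN", joined))
    return triples


# Sample query shaped like the A, B, C, D example; column names are made up.
sql = """
SELECT * FROM A
INNER JOIN B ON A.id = B.a_id
LEFT JOIN D ON A.id = D.a_id
LEFT JOIN C ON B.id = C.b_id
"""
associations = extract_associations(sql)
```

For the sample query, `associations` holds one triple per join, which is the association information that the later steps classify as one-way or two-way.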
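200: Generate a target association path based on the orderly association between the target data sets in the at least two target data sets.
Specifically, the target data sets that the query instruction needs to query are all associated with one another, so all of them can be chained together according to the orderly associations among them. Because the associations are ordered, the result is a directed chain of target data sets, namely the target association path, which can express the associations between the data sets as a directed graph.
As an example, the query instruction contains the information of four target data sets A, B, C, and D and the association information between them. The orderly association between A and D is a one-way association from A to D, the orderly association between B and C is a one-way association from B to C, and the orderly association between A and B is a two-way association. The directed graph representing the target association path is shown in FIG. 2. The target association path can start from A or from B; that is, it includes two paths. The first path is: A to B and then to C, and A to D. The second path is: B to A and then to D, and B to C. The requirements of the query instruction can therefore be met if there is an OLAP model that conforms to the first path or an OLAP model that conforms to the second path. The OLAP model conforming to the first path and the OLAP model conforming to the second path are expressed in the same way, so only one OLAP model needs to be defined in the OLAP query system.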
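The directed graph of the A, B, C, D example, and the observation that the path can start from either A or B, can be illustrated with a short Python sketch. The representation (a dictionary of adjacency sets) and the helper names `build_graph`, `reachable`, and `candidate_starts` are assumptions chosen for the illustration; the application itself does not prescribe a data structure.

```python
def build_graph(edges):
    """Build adjacency sets from directed (source, destination) edges."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    return graph


def reachable(graph, start):
    """Return every data set that can be reached from start."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def candidate_starts(graph):
    """Data sets from which the whole target association path can be walked."""
    return sorted(n for n in graph if reachable(graph, n) == set(graph))


# The two-way association A <-> B contributes two directed edges;
# the one-way associations A -> D and B -> C contribute one each.
edges = [("A", "B"), ("B", "A"), ("A", "D"), ("B", "C")]
graph = build_graph(edges)
starts = candidate_starts(graph)
```

Here `starts` contains A and B only: C and D have no outgoing edges, so neither can serve as the start of a path that covers all four data sets.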
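300: Output, from the database, an OLAP model that conforms to the target association path.
Specifically, an OLAP model that conforms to the target association path is matched in the database and output. In this way, equivalent or similar OLAP models can be replaced by a single OLAP model, which broadens the scope of application of the OLAP data model, reduces the number of OLAP models required, improves OLAP model utilization, reuses existing OLAP models to the greatest extent, avoids the redundant models previously created to support similar analysis processes, and improves query execution efficiency.
In a feasible implementation, FIG. 3 is a schematic flowchart of step 100 provided by an embodiment of this application. As shown in FIG. 3, step 100, determining the at least two target data sets that the query instruction needs to query and the orderly association between the target data sets, includes the following steps 110 to 130:
110: Based on the character sequence information of the query instruction, identify the at least two target data sets and the association information between the target data sets;
120: Determine whether the association information between two target data sets is included in the equivalent association information;
130: When the association information between the two target data sets is included in the equivalent association information, determine that the orderly association between the two target data sets is a two-way association.
Specifically, step 100 includes identifying all target data sets (at least two target data sets) and the association information between them (association characters, such as LEFT JOIN) based on the character sequence information of the query instruction, and then determining whether the association information between two target data sets belongs to the equivalent association information (equivalent association characters, such as INNER JOIN). When it does, the orderly association between the two target data sets is determined to be a two-way association.
In a feasible implementation, FIG. 4 is another schematic flowchart of step 100 provided by an embodiment of this application. As shown in FIG. 4, step 100, determining the at least two target data sets that the query instruction needs to query and the orderly association between the target data sets, further includes the following step 140:
140: When the association information between the two target data sets is not included in the equivalent association information, determine that the orderly association between the two target data sets is a one-way association.
Specifically, when determining whether the association information between two target data sets belongs to the equivalent association information (equivalent association characters; for example, INNER JOIN is an equivalent association character), if the association information is not included in the equivalent association information (for example, LEFT JOIN is not an equivalent association character), the orderly association between the two target data sets is determined to be a one-way association. In this way, through steps 110 to 140, the orderly association between the target data sets can be determined.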
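Steps 120 to 140 amount to a membership test against a set of equivalent association keywords. A minimal Python sketch, under the assumption that the equivalent association information is a set containing only INNER JOIN (the only equivalent keyword the text names); the names `EQUIVALENT_JOINS` and `orderly_association` are hypothetical:

```python
# Assumption: only INNER JOIN is treated as an equivalent association here.
EQUIVALENT_JOINS = {"INNER JOIN"}


def orderly_association(left, join_type, right):
    """Return the directed edges implied by one association between two data sets."""
    if join_type.upper() in EQUIVALENT_JOINS:
        # Steps 120 and 130: equivalent association, so the edge is two-way.
        return {(left, right), (right, left)}
    # Step 140: any other association is one-way, from left to right.
    return {(left, right)}


a_b = orderly_association("A", "INNER JOIN", "B")
a_d = orderly_association("A", "LEFT JOIN", "D")
```

Representing a two-way association as a pair of opposite directed edges keeps the later graph handling uniform, since every association then reduces to plain directed edges.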
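In a feasible implementation, FIG. 5 is a schematic flowchart of step 300 provided by an embodiment of this application. As shown in FIG. 5, step 300, outputting, from the database, the OLAP model that conforms to the target association path, includes the following steps 310 to 330:
310: Filter out, from the database, the OLAP models that contain only the at least two target data sets that the query instruction needs to query;
320: Take any one target data set contained in the OLAP model as a candidate center and match it against the target association path to determine whether the OLAP model conforms to the target association path;
330: When the OLAP model conforms to the target association path, output the OLAP model.
Specifically, the OLAP models that contain only all of the target data sets are determined based on the at least two target data sets that the query instruction needs to query. For each such OLAP model, any one target data set contained in the model is taken as a candidate center and matched against the target association path to determine whether the model conforms to the target association path. When the OLAP model conforms to the target association path, the OLAP model is output for subsequent processing.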
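One possible reading of steps 310 to 330 is sketched below in Python. The text does not specify how a model is stored or how the match against the path is computed, so everything here is an assumption for illustration: a model is a list of (left, association, right) triples, "conforms" is taken to mean that the model's directed edges equal the target edges once equivalent associations are expanded to two-way edges, and a candidate center is accepted when all data sets can be reached from it. The names `to_edges`, `covers_from`, and `matching_models` are hypothetical.

```python
EQUIVALENT_JOINS = {"INNER JOIN"}  # assumption: only INNER JOIN is equivalent


def to_edges(joins):
    """Expand association triples into directed edges (two-way if equivalent)."""
    edges = set()
    for left, join_type, right in joins:
        edges.add((left, right))
        if join_type in EQUIVALENT_JOINS:
            edges.add((right, left))
    return edges


def covers_from(center, edges, tables):
    """Check that every data set is reachable from the candidate center."""
    seen, stack = {center}, [center]
    while stack:
        node = stack.pop()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen == tables


def matching_models(models, target_tables, target_joins):
    target_edges = to_edges(target_joins)
    matched = []
    for name, joins in models.items():
        tables = {t for left, _, right in joins for t in (left, right)}
        if tables != target_tables:  # step 310: only the target data sets
            continue
        edges = to_edges(joins)
        # Step 320: try each data set in the model as the candidate center.
        if edges == target_edges and any(
            covers_from(center, edges, tables) for center in sorted(tables)
        ):
            matched.append(name)  # step 330: output the conforming model
    return matched


target_joins = [("A", "INNER JOIN", "B"), ("A", "LEFT JOIN", "D"), ("B", "LEFT JOIN", "C")]
models = {
    "m1": [("B", "INNER JOIN", "A"), ("B", "LEFT JOIN", "C"), ("A", "LEFT JOIN", "D")],
    "m2": [("A", "LEFT JOIN", "B"), ("B", "LEFT JOIN", "C"), ("A", "LEFT JOIN", "D")],
    "m3": [("A", "INNER JOIN", "B"), ("B", "LEFT JOIN", "C")],
}
result = matching_models(models, {"A", "B", "C", "D"}, target_joins)
```

In this sketch, `m1` is defined starting from B but still matches the query written starting from A, which is the reuse the method aims for; `m2` is rejected because its A-to-B association is one-way, and `m3` because it lacks data set D.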
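In the data configuration query method provided by this application, at least two target data sets that a query instruction needs to query and the orderly association between the target data sets are determined, where the orderly association includes at least one-way association and/or two-way association; a target association path is generated based on the orderly association between the target data sets in the at least two target data sets; and an OLAP model that conforms to the target association path is output from the database. With this method, two-way associations between target data sets can be identified, so that equivalent or similar OLAP models can be replaced by a single OLAP model. This broadens the scope of application of the OLAP data model, reduces the number of OLAP models required, improves OLAP model utilization, reuses existing OLAP models to the greatest extent, avoids the redundant models previously created to support similar analysis processes, and improves query execution efficiency, thereby solving the technical problems in the related art that an OLAP query system requires a large number of OLAP models and that OLAP model utilization is low.
Based on the same technical concept, this application also provides a data configuration query device. FIG. 7 is a schematic structural diagram of a data configuration query device provided by an embodiment of this application. As shown in FIG. 7, the device is applied in an OLAP query system and includes:
a determining module 10, used to determine at least two target data sets that a query instruction needs to query and the orderly association between the target data sets, where the orderly association includes at least one-way association and/or two-way association;
a generating module 20, used to generate a target association path based on the orderly association between the target data sets in the at least two target data sets;
an output module 30, used to output, from a database, an OLAP model that conforms to the target association path.
Optionally, the determining module 10 is used to:
identify, based on the character sequence information of the query instruction, the at least two target data sets and the association information between the target data sets;
determine whether the association information between two target data sets is included in the equivalent association information;
when the association information between the two target data sets is included in the equivalent association information, determine that the orderly association between the two target data sets is a two-way association.
Optionally, the determining module 10 is used to:
when the association information between the two target data sets is not included in the equivalent association information, determine that the orderly association between the two target data sets is a one-way association.
Optionally, the output module 30 is used to:
filter out, from the database, the OLAP models that contain only the at least two target data sets that the query instruction needs to query;
take any one target data set contained in the OLAP model as a candidate center and match it against the target association path to determine whether the OLAP model conforms to the target association path;
when the OLAP model conforms to the target association path, output the OLAP model.
In the data configuration query device provided by this application, the determining module 10 is used to determine at least two target data sets that a query instruction needs to query and the orderly association between the target data sets, where the orderly association includes at least one-way association and/or two-way association; the generating module 20 is used to generate a target association path based on the orderly association between the target data sets in the at least two target data sets; and the output module 30 is used to output, from the database, an OLAP model that conforms to the target association path. In this way, by confirming the two-way associations between target data sets, equivalent or similar OLAP models can be replaced by a single OLAP model. This broadens the scope of application of the OLAP data model, reduces the number of OLAP models required, improves OLAP model utilization, reuses existing OLAP models to the greatest extent, avoids the redundant models previously created to support similar analysis processes, and improves query execution efficiency, thereby solving the technical problems in the related art that an OLAP query system requires a large number of OLAP models and that OLAP model utilization is low.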
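Based on the same technical concept, an embodiment of this application also provides a computer device, which includes:
one or more processors;
a memory, used to store one or more computer programs;
when the one or more computer programs are executed by the one or more processors, the one or more processors are caused to implement the data configuration query method described above.
Based on the same technical concept, an embodiment of this application also provides a computer-readable storage medium that stores computer code; when the computer code is executed, the data configuration query method described above is executed.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented by a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network composed of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. In this way, the present invention is not limited to any specific combination of hardware and software.
The computer program involved in this application can be stored in a computer-readable storage medium, which can include: any physical or virtual device capable of carrying computer program code, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunications signal, other software distribution media, and so on.
The above are only preferred embodiments of this application and are not intended to limit it. For those skilled in the art, this application may have various modifications and changes. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this application shall fall within the scope of protection of this application.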

Claims (10)

  1. 一种数据配置查询方法,其特征在于,所述方法应用于联机分析处理OLAP查询系统中,所述方法包括:
    确定出查询指令所需要查询的至少两个目标数据集以及所述目标数据集之间的有序关联,其中,所述有序关联至少包括单向关联和/或双向关联;
    基于所述至少两个目标数据集中目标数据集之间的有序关联生成目标关联路径;
    在数据库中输出符合所述目标关联路径的OLAP模型。
  2. 根据权利要求1所述的数据配置查询方法,其特征在于,所述确定出查询指令所需要查询的至少两个目标数据集以及所述目标数据集之间的有序关联,包括:
    基于所述查询指令的字符序列信息,识别出所述至少两个目标数据集以及所述目标数据集之间的关联信息;
    判断两个所述目标数据集之间的所述关联信息是否包含于等价关联信息中;
    当两个所述目标数据集之间的所述关联信息包含于等价关联信息中时,确定该两个所述目标数据集之间的有序关联为双向关联。
  3. 根据权利要求2所述的数据配置查询方法,其特征在于,所述确定出查询指令所需要查询的至少两个目标数据集以及所述目标数据集之间的有序关联,还包括:
    当两个所述目标数据集之间的所述关联信息不包含于等价关联中时,确定该两个所述目标数据集之间的有序关联为单向关联。
  4. 根据权利要求1所述的数据配置查询方法,其特征在于,所述在数据库中输出符合所述目标关联路径的OLAP模型,包括:
    在所述数据库中筛选出仅包含有所述查询指令所需要查询的至少两个目标数据集的所述OLAP模型;
    将所述OLAP模型中包含的任意一个所述目标数据集作为候选中心与所述目标关联路径进行匹配,确定所述OLAP模型是否符合所述目标关联路径;
    当所述OLAP模型符合所述目标关联路径时,输出该所述OLAP模型。
  5. 一种数据配置查询装置,其特征在于,所述装置应用于OLAP查询系统中,所述装置包括:
    确定模块,用于确定出查询指令所需要查询的至少两个目标数据集以及所述目标数据集之间的有序关联,其中,所述有序关联至少包括单向关联和/或双向关联;
    生成模块,用于基于所述至少两个目标数据集中目标数据集之间的有序关联生成目标关联路径;
    输出模块,用于在数据库中输出符合所述目标关联路径的OLAP模型。
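  6. The data configuration query device according to claim 5, characterized in that the determining module is configured to:
    identify, based on character sequence information of the query instruction, the at least two target data sets and association information between the target data sets;
    judge whether the association information between two of the target data sets is contained in equivalent association information;
    when the association information between the two target data sets is contained in the equivalent association information, determine that the ordered association between the two target data sets is a bidirectional association.
  7. The data configuration query device according to claim 6, characterized in that the determining module is configured to:
    when the association information between the two target data sets is not contained in the equivalent associations, determine that the ordered association between the two target data sets is a unidirectional association.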
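  8. The data configuration query device according to claim 5, characterized in that the output module is configured to:
    filter out, from the database, the OLAP models that contain only the at least two target data sets to be queried by the query instruction;
    take any one of the target data sets contained in the OLAP model as a candidate center and match it against the target association path, to determine whether the OLAP model conforms to the target association path;
    when the OLAP model conforms to the target association path, output the OLAP model.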
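  9. A computer device, the computer device comprising:
    one or more processors;
    a memory, configured to store one or more computer programs;
    wherein, when the one or more computer programs are executed by the one or more processors, the one or more processors are caused to implement the data configuration query method according to any one of claims 1-4.
  10. A computer-readable storage medium storing computer code, wherein, when the computer code is executed, the data configuration query method according to any one of claims 1-4 is performed.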
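
The method of claims 1 to 4 can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the applicant's implementation. All names (`to_edge`, `build_path`, `OlapModel`, `edges_reachable_from`, `find_models`) are hypothetical. The sketch assumes that the associations have already been parsed out of the query instruction as `(left, join_type, right)` triples, that inner joins are the only "equivalent" (bidirectional) associations, and that candidate-center matching means traversing a model's associations from a chosen data set while respecting direction.

```python
from dataclasses import dataclass

# Assumption: only inner joins count as "equivalent" associations.
EQUIVALENT_JOINS = frozenset({"INNER"})


def to_edge(left, join_type, right):
    """Classify one association as bidirectional or unidirectional."""
    if join_type.upper() in EQUIVALENT_JOINS:
        # Bidirectional: store the pair sorted so that A-B equals B-A.
        a, b = sorted((left, right))
        return (a, b, True)
    # Unidirectional: the order of the two data sets is significant.
    return (left, right, False)


def build_path(joins):
    """Build an association path as a set of classified edges."""
    return frozenset(to_edge(*join) for join in joins)


@dataclass(frozen=True)
class OlapModel:
    name: str
    tables: frozenset  # data sets contained in the model
    path: frozenset    # association path of the model


def edges_reachable_from(center, path):
    """Collect the edges reachable from `center`, respecting direction."""
    seen, found, stack = {center}, set(), [center]
    while stack:
        node = stack.pop()
        for edge in path:
            a, b, bidirectional = edge
            if a == node:
                nxt = b
            elif b == node and bidirectional:
                nxt = a
            else:
                continue
            found.add(edge)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(found)


def find_models(joins, models):
    """Return the names of the models that conform to the query's path."""
    target_path = build_path(joins)
    target_tables = frozenset(t for a, b, _ in target_path for t in (a, b))
    matches = []
    for model in models:
        # Keep only models that contain exactly the target data sets.
        if model.tables != target_tables:
            continue
        # Try each data set of the model as the candidate center.
        if any(edges_reachable_from(center, model.path) == target_path
               for center in sorted(model.tables)):
            matches.append(model.name)
    return matches
```

Under these assumptions, a query that left-joins `orders` to `customers` matches a model built with the same direction but not one in which `customers` is left-joined to `orders`, whereas an inner join matches in either order.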
PCT/CN2020/077710 2019-03-20 2020-03-04 Data configuration query method and device WO2020187023A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/051,008 US11281698B2 (en) 2019-03-20 2020-03-04 Data configuration query method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910214615.7 2019-03-20
CN201910214615.7A CN109977175B (zh) 2019-03-20 2019-03-20 Data configuration query method and device

Publications (1)

Publication Number Publication Date
WO2020187023A1 (zh) 2020-09-24

Family

ID=67079745

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/077710 WO2020187023A1 (zh) 2019-03-20 2020-03-04 Data configuration query method and device

Country Status (3)

Country Link
US (1) US11281698B2 (zh)
CN (1) CN109977175B (zh)
WO (1) WO2020187023A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112732243A (zh) * 2021-01-11 2021-04-30 京东数字科技控股股份有限公司 Data processing method and device for generating functional components

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
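CN109977175B (zh) 2019-03-20 2021-06-01 跬云(上海)信息科技有限公司 Data configuration query method and device
CN111061910B (zh) * 2019-12-16 2020-12-15 湖南大学 Video feature data query method and system based on HBase and Solr
CN111309726B (zh) * 2020-01-17 2024-03-22 北京明略软件系统有限公司 Directed graph generation method, generation device, and readable storage medium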

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
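CN107729500A (zh) * 2017-10-20 2018-02-23 锐捷网络股份有限公司 Data processing method, device, and back-end equipment for online analytical processing
CN208207819U (zh) * 2018-07-17 2018-12-07 于果鑫 Big data analysis and processing system based on a scalable node cluster
CN109977175A (zh) * 2019-03-20 2019-07-05 跬云(上海)信息科技有限公司 Data configuration query method and device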

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6446059B1 (en) * 1999-06-22 2002-09-03 Microsoft Corporation Record for a multidimensional database with flexible paths
US6961728B2 (en) * 2000-11-28 2005-11-01 Centerboard, Inc. System and methods for highly distributed wide-area data management of a network of data sources through a database interface
CN101197876B (zh) * 2006-12-06 2012-02-29 中兴通讯股份有限公司 Method and system for multidimensional analysis of message service data
CN101286151A (zh) * 2007-04-13 2008-10-15 国际商业机器公司 Method and related system for establishing a mapping between a multidimensional model and a data warehouse schema
CN101673287A (zh) * 2009-10-16 2010-03-17 金蝶软件(中国)有限公司 SQL statement generation method and system
CN102663114B (zh) * 2012-04-17 2013-09-11 中国人民大学 Database query processing method for concurrent OLAP
US20150199378A1 (en) * 2012-06-29 2015-07-16 Nick Alex Lieven REYNTJEN Method and apparatus for realizing a dynamically typed file or object system enabling a user to perform calculations over the fields associated with the files or objects in the system
CN103927337B (zh) * 2014-03-26 2017-12-19 北京国双科技有限公司 Data processing method and device for association relationships in online analytical processing
CN104391928B (zh) * 2014-11-21 2018-08-28 用友网络科技股份有限公司 Device and method for dynamically constructing multidimensional model definitions
CN104361118B (zh) * 2014-12-01 2017-07-21 中国人民大学 Hybrid OLAP query processing method adapted to coprocessors
US10909178B2 (en) * 2015-03-05 2021-02-02 Workday, Inc. Methods and systems for multidimensional analysis of interconnected data sets stored in a graph database
CN105550241B (zh) * 2015-12-07 2019-06-25 珠海多玩信息技术有限公司 Multidimensional database query method and device
CN106372190A (zh) * 2016-08-31 2017-02-01 华北电力大学(保定) Real-time OLAP query method and device
CN106844703B (zh) * 2017-02-04 2019-08-02 中国人民大学 Implementation method for in-memory data warehouse query processing oriented to database all-in-one machines
EP3401808A1 (en) * 2017-05-12 2018-11-14 QlikTech International AB Interactive data exploration
CN109117429B (zh) * 2017-06-22 2020-09-22 北京嘀嘀无限科技发展有限公司 Database query method, device, and electronic equipment
US10726052B2 (en) * 2018-07-03 2020-07-28 Sap Se Path generation and selection tool for database objects

Also Published As

Publication number Publication date
CN109977175B (zh) 2021-06-01
CN109977175A (zh) 2019-07-05
US20210406281A1 (en) 2021-12-30
US11281698B2 (en) 2022-03-22

Similar Documents

Publication Publication Date Title
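WO2020187023A1 (zh) Data configuration query method and device
US11068439B2 (en) Unsupervised method for enriching RDF data sources from denormalized data
CN110633292B (zh) Query method, device, medium, equipment, and system for heterogeneous databases
CN107038207B (zh) Data query method, data processing method, and device
WO2021083239A1 (zh) Method, device, equipment, and storage medium for graph data query
CN106897322B (zh) Access method and device for a database and a file system
TWI706259B (zh) Data query method and query device
CN106033439B (zh) Distributed transaction processing method and system
CN105824957A (zh) Query engine system and query method for a distributed in-memory columnar database
US20150120775A1 (en) Answering relational database queries using graph exploration
CN111177231A (zh) Report generation method and report generation device
CN109144997A (zh) Data association method, device, and storage medium
CN111563101B (zh) Execution plan optimization method, device, equipment, and storage medium
CN108052635A (zh) Unified federated query method for heterogeneous data sources
US20150269234A1 (en) User Defined Functions Including Requests for Analytics by External Analytic Engines
CN111488332B (zh) AI service open middle platform and method
US20170060977A1 (en) Data preparation for data mining
CN106897467A (zh) Database adaptation method for a big data analysis engine
CN106484699B (zh) Method and device for generating database query fields
CN110263104A (zh) JSON string processing method and device
WO2018045610A1 (zh) Method and device for executing distributed computing tasks
CN114820080A (zh) User grouping method, system, device, and medium based on crowd flow
CN108182204A (zh) Processing method and device for data queries based on multidimensional real estate transaction data
CN104408183A (zh) Data import method and device for a data system
CN109710630A (zh) Query method and device for heterogeneous data sources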

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 20773711; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 20773711; Country of ref document: EP; Kind code of ref document: A1
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established. Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 260122)
122 Ep: pct application non-entry in european phase. Ref document number: 20773711; Country of ref document: EP; Kind code of ref document: A1