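WO2023035355A1 - Information processing method and apparatus for batch-stream fusion, and storage medium - Google Patents

Information processing method and apparatus for batch-stream fusion, and storage medium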

Info

Publication number
WO2023035355A1
Authority
WO
WIPO (PCT)
Prior art keywords
index
data segment
created
batch
data source
Prior art date
Application number
PCT/CN2021/123288
Other languages
English (en)
French (fr)
Inventor
曹鲁
马洪宾
张逸凡
陈志雄
李扬
韩卿
Original Assignee
上海跬智信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海跬智信息技术有限公司
Priority to EP21930627.1A (published as EP4170524A4)
Priority to US18/092,326 (published as US20230153308A1)
Publication of WO2023035355A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9017 Indexing; Data structures therefor; Storage structures using directory or table look-up
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G06F16/2386 Bulk updating operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24558 Binary matching operations
    • G06F16/2456 Join operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the invention belongs to the technical field of data processing, and in particular relates to an information processing method, device and storage medium for batch stream fusion.
  • the present invention provides an information processing method for batch flow fusion, comprising steps:
  • the index-based extraction of pre-computed index data segments as query results specifically includes:
  • the pre-computed index data segment specifically includes:
  • the statistical information of the pre-created index is stored in the memory, and the pre-computed index data segment is stored in the storage medium.
  • the pre-created index defined based on a predefined unified model specifically includes:
  • the streaming data source mapping table is a fact table
  • the batch data source mapping table is a shadow table bound to it
  • the index data segment calculated based on the pre-created index specifically includes:
  • pre-created index where the pre-created index is divided into batch index, stream index and fusion index;
  • pre-computation is performed in the data source through the computing engine, where the computing engine includes a batch processing computing engine and a stream processing computing engine, and the data source includes a batch data source and a streaming data source;
  • the pre-calculated index data segment is obtained based on the pre-calculation, wherein the pre-calculated index data segment is divided into a batch index data segment and a stream index data segment.
  • the storing the statistical information of the pre-created index in memory specifically includes:
  • Extracts statistics for a pre-created index where statistics include row count, size, and cardinality.
  • said extracting the newly calculated index data segment to update the query result specifically includes:
  • the index is divided into batch index, stream index and fusion index, and the data source includes batch data source and stream data source;
  • the newly calculated index data segment is obtained through the new calculation of the computing engine, wherein the computing engine includes a batch processing computing engine and a stream processing computing engine, and the newly calculated index data segment is divided into a batch index data segment and a stream index data segment;
  • An information processing device for batch flow fusion characterized in that it includes:
  • the client is used to obtain the index based on the input query statement
  • a query engine for extracting pre-computed index data segments based on the index as a query result
  • the update module is used to extract the newly calculated index data segment and update the query result.
  • the query engine :
  • the query engine includes:
  • a pre-created index definition module configured to define a pre-created index based on a predefined unified model
  • a pre-calculation module configured to perform pre-calculation based on a pre-created index to obtain an index data segment
  • the information storage module stores the statistical information of the pre-created index in the memory, and stores the pre-calculated index data segment in the storage medium.
  • the pre-created index definition module :
  • the streaming data source mapping table is a fact table
  • the batch data source mapping table is a shadow table bound to it
  • the pre-calculation module :
  • pre-created index where the pre-created index is divided into batch index, stream index and fusion index;
  • pre-computation is performed in the data source through the computing engine, where the computing engine includes a batch processing computing engine and a stream processing computing engine, and the data source includes a batch data source and a streaming data source;
  • the pre-calculated index data segment is obtained based on the pre-calculation, wherein the pre-calculated index data segment is divided into a batch index data segment and a stream index data segment.
  • the information storage module :
  • Extracts statistics for a pre-created index where statistics include row count, size, and cardinality.
  • the update module :
  • the index is divided into batch index, stream index and fusion index, and the data source includes batch data source and stream data source;
  • the newly calculated index data segment is obtained through the new calculation of the computing engine, wherein the computing engine includes a batch processing computing engine and a stream processing computing engine, and the newly calculated index data segment is divided into a batch index data segment and a stream index data segment;
  • An electronic device includes a memory and a processor, the memory stores a computer program, and it is characterized in that any one of the above methods can be implemented by executing the computer program in the processor.
  • a storage medium storing a computer program, wherein the computer program is executed in a processor to implement any one of the above methods.
  • by defining a unified model, the present invention makes it possible to analyze historical data and real-time data simultaneously through a single SQL query statement; pre-computation and real-time merging effectively reduce query response time; storing computation results in the storage medium for reuse helps keep the system stable; and continuously storing newly computed results in the storage medium keeps the query result up to date.
  • Fig. 1 is the flow chart of the information processing method of batch flow fusion of the present application
  • Fig. 2 is the pre-defined unified model diagram of the present application
  • FIG. 3 is a flow chart of defining a pre-created index of the present application
  • FIG. 4 is a flow chart of the present application for calculating and obtaining index data segments based on pre-created indexes.
  • the term “storage medium” may refer to various media capable of storing computer programs such as ROM, RAM, magnetic disk or optical disk.
  • the term “processor” can be a CPLD (Complex Programmable Logic Device), FPGA (Field-Programmable Gate Array), MCU (Microcontroller Unit), PLC (Programmable Logic Controller), CPU (Central Processing Unit) or another chip or circuit with data processing functions.
  • the term “electronic device” may refer to any device having a data processing function and a storage function, and may generally include fixed terminals and mobile terminals. Fixed terminals include desktop computers; mobile terminals include mobile phones, PADs and mobile robots. In addition, technical features involved in different embodiments of the present invention described later may be combined with each other as long as they do not conflict.
  • This embodiment provides an information processing method for batch flow fusion, as shown in FIG. 1 , including steps:
  • the acquiring an index based on an input query statement specifically includes:
  • the keyword association index specifically includes:
  • the index-based extraction of the pre-computed index data segment as the query result specifically includes:
  • the obtaining the statistical information of the index specifically includes:
  • the pre-computed index data segment specifically includes:
  • the definition of the pre-created index based on the predefined unified model specifically includes:
  • the calculation of the index data segment based on the pre-created index specifically includes:
  • the storing the statistical information of the pre-created index in memory specifically includes:
  • Extract statistics for pre-created indexes where statistics include but are not limited to row count, size, and cardinality.
  • the extracting the newly calculated index data segment to update the query result specifically includes:
  • the technical effect of updating the query result is realized by continuously storing the newly calculated result in the storage medium.
  • This embodiment provides an information processing device for batch flow fusion, which is characterized in that it includes:
  • the client is used to obtain the index based on the input query statement
  • a query engine for extracting pre-computed index data segments based on the index as a query result
  • the update module is used to extract the newly calculated index data segment and update the query result.
  • the query engine :
  • the query engine includes:
  • a pre-created index definition module configured to define a pre-created index based on a predefined unified model
  • a pre-calculation module configured to perform pre-calculation based on a pre-created index to obtain an index data segment
  • the information storage module stores the statistical information of the pre-created index in the memory, and stores the pre-calculated index data segment in the storage medium.
  • the pre-created index definition module :
  • the streaming data source mapping table is a fact table
  • the batch data source mapping table is a shadow table bound to it
  • the pre-calculation module :
  • pre-created index where the pre-created index is divided into batch index, stream index and fusion index;
  • pre-computation is performed in the data source through the computing engine, where the computing engine includes a batch processing computing engine and a stream processing computing engine, and the data source includes a batch data source and a streaming data source;
  • the pre-calculated index data segment is obtained based on the pre-calculation, wherein the pre-calculated index data segment is divided into a batch index data segment and a stream index data segment.
  • the information storage module :
  • Extracts statistics for a pre-created index where statistics include row count, size, and cardinality.
  • the update module :
  • the index is divided into batch index, stream index and fusion index, and the data source includes batch data source and stream data source;
  • the newly calculated index data segment is obtained through the new calculation of the computing engine, wherein the computing engine includes a batch processing computing engine and a stream processing computing engine, and the newly calculated index data segment is divided into a batch index data segment and a stream index data segment;
  • the dimensions of a pre-created index are insurance salesperson (seller_id) and date (date), and the measure is the sum of policy amounts, sum(amount). Since the number of salespersons may be large, the degree of aggregation of this index may not be very high.
  • the data content corresponding to this pre-created index may be as shown in Table 1 below, which summarizes the transaction amounts in each salesperson's daily sales records:
  • Table 1:
      Seller_Id  Date                 Sum(amount)
      10001      2020-05-01 00:00:00  100
      10001      2020-05-01 00:10:00  200
      10002      2020-05-01 00:20:00  150
      10003      2020-05-01 00:30:00  80
      10003      2020-05-01 00:40:00  30
  • the data content of this table is the batch index data segment.
  • the system computes the pre-created index and saves the pre-computed result, that is, the table data is saved in real time.
  • the steps are:
  • when consuming continuously generated real-time streaming data, the system creates, for this query, an index close to the pre-created index above.
  • the acquiring an index based on an input query statement specifically includes:
  • the keyword association index specifically includes:
  • keywords are retrieved from the query statement SQL 1: salesperson (seller_id), May 1, 2020 (date) and total transaction amount sum(amount); the salesperson and May 1, 2020 keywords are associated with the dimensions insurance salesperson (seller_id) and date (date) respectively, and the total transaction amount sum(amount) is associated with the measure sum of policy amounts, sum(amount).
  • the index-based extraction of the pre-calculated index data segment as the query result specifically includes:
  • the index has three columns, with cardinality recorded for Seller_Id, Date and Sum(amount). The statistics of the pre-created indexes stored in memory are searched to find the pre-created index above, and its pre-computed index data segment, namely Table 1, is located in the storage medium and returned as the query result.
  • the obtaining the statistical information of the index specifically includes:
  • the statistics obtained for the index cover three columns, with cardinality recorded for Seller_Id, Date and Sum(amount).
  • the pre-computed index data segment specifically includes:
  • the definition of the pre-created index based on the predefined unified model specifically includes:
  • the dimensions are the insurance salesperson (seller_id) and the date (date), and the measure is the sum (amount) of the policy amount.
  • the calculation of the index data segment based on the pre-created index specifically includes:
  • pre-computing is performed in the data source through the computing engine, wherein the computing engine includes a batch processing computing engine and a stream processing computing engine, and the data source includes a batch data source and a streaming data source;
  • the batch index data segment obtained by pre-calculation is Table 1.
  • the storing the statistical information of the pre-created index in memory specifically includes:
  • Extract statistics for pre-created indexes where statistics include but are not limited to row count, size, and cardinality.
  • the results in the query will also be updated accordingly.
  • the extracting the newly calculated index data segment to update the query result specifically includes:
  • the embodiment of the present invention also includes an electronic device, including a memory and a processor, the memory stores a computer program, and when the computer program is executed in the processor, the computer program is used to implement the above-mentioned information processing method for batch flow fusion,
  • the method includes:
  • the present invention also provides a readable storage medium, where a computer program is stored in the readable storage medium, and when the computer program is executed by a processor, it is used to implement the above-mentioned information processing method for batch flow fusion , the method includes:
  • the readable storage medium may be a computer storage medium, or a communication medium.
  • Communication media includes any medium that facilitates transfer of a computer program from one place to another.
  • Computer storage media can be any available media that can be accessed by a general purpose or special purpose computer.
  • a readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium.
  • the readable storage medium can also be a component of the processor.
  • the processor and the readable storage medium may be located in application specific integrated circuits (Application Specific Integrated Circuits, ASIC). Additionally, the ASIC may be located in the user equipment.
  • the processor and the readable storage medium can also exist in the communication device as discrete components.
  • the readable storage medium may be read only memory (ROM), random access memory (RAM), CD-ROM, magnetic tape, floppy disk, and optical data storage devices, among others.
  • the present invention also provides a program product, which includes execution instructions, and the execution instructions are stored in a readable storage medium. At least one processor of the device may read the execution instruction from the readable storage medium, and at least one processor executes the execution instruction so that the device implements the methods provided by the above-mentioned various implementations.
  • the processor can be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), and so on.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, and the like.
  • the steps of the method disclosed in conjunction with the present invention can be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.
  • each module or step of the present invention described above can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device, or they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module.
  • the present invention is not limited to any specific combination of hardware and software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

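The present application discloses an information processing method and apparatus for batch-stream fusion, and a storage medium. The method includes: obtaining an index based on an input query statement; extracting a pre-computed index data segment based on the index as a query result; and extracting a newly computed index data segment to update the query result. The present application solves the technical problem that real-time data and offline data are difficult to analyze together.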

Description

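Information processing method and apparatus for batch-stream fusion, and storage medium
Technical Field
The present invention belongs to the technical field of data processing, and in particular relates to an information processing method and apparatus for batch-stream fusion, and a storage medium.
Background Art
With the arrival of Industry 4.0 and the 5G era, demand for timely data processing keeps growing. Studies show that the value of data decreases over time; conversely, the more real-time the data, the higher its value. Against this background, a number of excellent open-source stream computing frameworks such as Storm, Spark, Kafka and Flink have emerged. However, although many stream computing frameworks claim to unify batch and stream processing, batch processing and stream computing differ in usage scenarios and priorities: stream computing focuses on the timeliness of data, whereas batch processing focuses on the completeness and accuracy of data and on computing cost. As a result, batch processing systems such as Hive still cannot be fully replaced. This makes it very difficult to analyze real-time data and offline data together. Inconsistent data definitions, non-uniform semantics, the inability to query across systems, and query performance that falls short of requirements often become gaps that are hard to bridge, troubling big data architects and engineers.
In summary, the prior art has the following technical problem:
It is very difficult to perform fused analysis of real-time data and offline data.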
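Summary of the Invention
To solve the above technical problem, the present invention provides an information processing method for batch-stream fusion, comprising the steps of:
obtaining an index based on an input query statement;
extracting a pre-computed index data segment based on the index as a query result;
extracting a newly computed index data segment to update the query result.
Preferably, extracting the pre-computed index data segment based on the index as the query result specifically includes:
obtaining statistics of the index;
searching memory for a pre-created index having those statistics;
locating, based on the pre-created index, its pre-computed index data segment in the storage medium;
taking the located pre-computed index data segment as the query result.
Preferably, obtaining the pre-computed index data segment specifically includes:
defining a pre-created index based on a predefined unified model;
performing pre-computation based on the pre-created index to obtain the index data segment;
storing the statistics of the pre-created index in memory, and storing the pre-computed index data segment in the storage medium.
Preferably, defining the pre-created index based on the predefined unified model specifically includes:
obtaining a pre-created streaming data source mapping table and batch data source mapping table, wherein the streaming data source mapping table is a fact table and the batch data source mapping table is a shadow table bound to it;
obtaining the pre-created dimensions and measures on which the fact table and the shadow table are joined;
determining, based on the obtained dimensions and measures, the join relationships between the fact table and shadow table and the dimension tables;
defining the pre-created index based on the join relationships.
Preferably, computing the index data segment based on the pre-created index specifically includes:
obtaining the pre-created index, wherein pre-created indexes are divided into batch indexes, stream indexes and fusion indexes;
performing, based on the pre-created index, pre-computation on the data source through a computing engine, wherein the computing engine includes a batch computing engine and a stream computing engine, and the data source includes a batch data source and a streaming data source;
obtaining the pre-computed index data segment from the pre-computation, wherein pre-computed index data segments are divided into batch index data segments and stream index data segments.
Preferably, storing the statistics of the pre-created index in memory specifically includes:
extracting the statistics of the pre-created index, wherein the statistics include row count, size and cardinality.
Preferably, extracting the newly computed index data segment to update the query result specifically includes:
performing new computation on the data source based on the index, wherein indexes are divided into batch indexes, stream indexes and fusion indexes, and the data source includes a batch data source and a streaming data source;
obtaining the newly computed index data segment through the computing engine, wherein the computing engine includes a batch computing engine and a stream computing engine, and newly computed index data segments are divided into batch index data segments and stream index data segments;
storing the newly computed index data segment in the storage medium;
extracting the newly computed index data segment from the storage medium to update the query result.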
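An information processing apparatus for batch-stream fusion, characterized in that it comprises:
a client, configured to obtain an index based on an input query statement;
a query engine, configured to extract a pre-computed index data segment based on the index as a query result;
an update module, configured to extract a newly computed index data segment to update the query result.
Preferably, the query engine:
obtains statistics of the index;
searches memory for a pre-created index having those statistics;
locates, based on the pre-created index, its pre-computed index data segment in the storage medium;
takes the located pre-computed index data segment as the query result.
Preferably, the query engine includes:
a pre-created index definition module, configured to define a pre-created index based on a predefined unified model;
a pre-computation module, configured to perform pre-computation based on the pre-created index to obtain the index data segment;
an information storage module, which stores the statistics of the pre-created index in memory and stores the pre-computed index data segment in the storage medium.
Preferably, the pre-created index definition module:
obtains a pre-created streaming data source mapping table and batch data source mapping table, wherein the streaming data source mapping table is a fact table and the batch data source mapping table is a shadow table bound to it;
obtains the pre-created dimensions and measures on which the fact table and the shadow table are joined;
determines, based on the obtained dimensions and measures, the join relationships between the fact table and shadow table and the dimension tables;
defines the pre-created index based on the join relationships.
Preferably, the pre-computation module:
obtains the pre-created index, wherein pre-created indexes are divided into batch indexes, stream indexes and fusion indexes;
performs, based on the pre-created index, pre-computation on the data source through a computing engine, wherein the computing engine includes a batch computing engine and a stream computing engine, and the data source includes a batch data source and a streaming data source;
obtains the pre-computed index data segment from the pre-computation, wherein pre-computed index data segments are divided into batch index data segments and stream index data segments.
Preferably, the information storage module:
extracts the statistics of the pre-created index, wherein the statistics include row count, size and cardinality.
Preferably, the update module:
performs new computation on the data source based on the index, wherein indexes are divided into batch indexes, stream indexes and fusion indexes, and the data source includes a batch data source and a streaming data source;
obtains the newly computed index data segment through the computing engine, wherein the computing engine includes a batch computing engine and a stream computing engine, and newly computed index data segments are divided into batch index data segments and stream index data segments;
stores the newly computed index data segment in the storage medium;
extracts the newly computed index data segment from the storage medium to update the query result.
An electronic device, comprising a memory and a processor, the memory storing a computer program, characterized in that the computer program, when executed in the processor, can implement any one of the above methods.
A storage medium storing a computer program, characterized in that the computer program, when executed in a processor, can implement any one of the above methods.
By defining a unified model, the present invention makes it possible to analyze historical data and real-time data simultaneously through a single SQL query statement; pre-computation and real-time merging effectively reduce query response time; storing computation results in the storage medium for reuse helps keep the system stable; and continuously storing newly computed results in the storage medium keeps the query result up to date.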
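Brief Description of the Drawings
Fig. 1 is a flow chart of the information processing method for batch-stream fusion of the present application;
Fig. 2 is a diagram of the predefined unified model of the present application;
Fig. 3 is a flow chart of defining a pre-created index in the present application;
Fig. 4 is a flow chart of computing an index data segment based on a pre-created index in the present application.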
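Detailed Description
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that, in the description of the present invention, unless otherwise expressly specified and limited, the term "storage medium" may be any medium capable of storing a computer program, such as ROM, RAM, a magnetic disk or an optical disk. The term "processor" may be a CPLD (Complex Programmable Logic Device), FPGA (Field-Programmable Gate Array), MCU (Microcontroller Unit), PLC (Programmable Logic Controller), CPU (Central Processing Unit) or another chip or circuit with data processing functions. The term "electronic device" may be any device having a data processing function and a storage function, and generally includes fixed terminals and mobile terminals. Fixed terminals include desktop computers; mobile terminals include mobile phones, PADs and mobile robots. In addition, technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict.
Below, the present invention presents some preferred embodiments to teach those skilled in the art how to implement it.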
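Embodiment 1
This embodiment provides an information processing method for batch-stream fusion, as shown in Fig. 1, comprising the steps of:
S100: obtain an index based on an input query statement;
S200: extract a pre-computed index data segment based on the index as a query result;
S300: extract a newly computed index data segment to update the query result.
In a further embodiment, obtaining the index based on the input query statement specifically includes:
S110: obtain the input query statement;
S120: search the query statement for keywords;
S130: associate dimensions and measures according to the keywords found;
S140: create an index from the associated dimensions and measures.
In a still further embodiment, associating the keywords specifically includes:
S131: set keywords based on the kinds of dimensions and measures;
S132: establish a mapping between each keyword and the kind of dimension or measure it represents;
S133: when a keyword is found, obtain the kind of dimension or measure it represents.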
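Steps S110 to S140 and S131 to S133 can be sketched as follows. This is only an illustration under stated assumptions: the names `KEYWORD_MAP`, `Index` and `index_from_query` are hypothetical, and the naive substring match stands in for whatever keyword search the actual system performs.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword map (S131-S132): keyword -> (kind, name it represents).
KEYWORD_MAP = {
    "seller_id": ("dimension", "seller_id"),
    "date": ("dimension", "date"),
    "sum(amount)": ("measure", "sum(amount)"),
}


@dataclass(frozen=True)
class Index:
    dimensions: tuple
    measures: tuple


def index_from_query(sql: str) -> Index:
    """S110-S140: scan the query for known keywords and build an index."""
    # Normalise case and drop whitespace so "sum( amount )" still matches.
    text = re.sub(r"\s+", "", sql.lower())
    dims, measures = [], []
    for keyword, (kind, name) in KEYWORD_MAP.items():
        if keyword in text:  # S133: keyword found, look up what it represents
            (dims if kind == "dimension" else measures).append(name)
    return Index(tuple(sorted(dims)), tuple(sorted(measures)))
```

A substring match is deliberately simple; a real implementation would more likely walk a parsed SQL syntax tree to avoid false matches inside literals.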
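In a further embodiment, extracting the pre-computed index data segment based on the index as the query result specifically includes:
S210: obtain statistics of the index;
S220: search memory for a pre-created index having those statistics;
S230: locate, based on the pre-created index, its pre-computed index data segment in the storage medium;
S240: take the located pre-computed index data segment as the query result.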
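A minimal sketch of steps S210 to S240, assuming the in-memory statistics can be reduced to the set of columns an index covers. `IndexStats`, `CATALOGUE`, `STORAGE` and the segment paths are hypothetical stand-ins; a plain dictionary plays the role of the storage medium.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndexStats:
    columns: frozenset   # column names covered by the pre-created index
    segment_path: str    # where its pre-computed segment is stored


# Hypothetical in-memory catalogue of pre-created index statistics (S220).
CATALOGUE = [
    IndexStats(frozenset({"seller_id", "date", "sum(amount)"}), "segments/seller_by_date"),
    IndexStats(frozenset({"date", "sum(amount)"}), "segments/by_date"),
]

# Hypothetical storage medium: path -> rows of the pre-computed segment.
STORAGE = {
    "segments/seller_by_date": [("10003", "2020-05-01 00:30:00", 80)],
    "segments/by_date": [("2020-05-01 00:30:00", 80)],
}


def answer(query_columns: set) -> Optional[list]:
    """Match statistics in memory, then locate and return the stored segment."""
    wanted = frozenset(query_columns)          # S210: statistics of the query index
    for stats in CATALOGUE:
        if stats.columns == wanted:            # S220: matching pre-created index
            return STORAGE[stats.segment_path]  # S230-S240: locate and return
    return None
```

Keeping only the small statistics records in memory while the segments stay on the storage medium is the split the text describes; the exact matching rule here is an assumption.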
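In a still further embodiment, obtaining the statistics of the index specifically includes:
S211: receive the created index;
S212: collect statistics on the index;
S213: derive the relevant statistics from the result, including but not limited to row count, size and cardinality.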
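The three statistics named in S213 could be collected along these lines. The function name `collect_statistics` and the size estimate (UTF-8 length of each value's string form) are assumptions made for illustration, not the patent's definition of size.

```python
def collect_statistics(columns, rows):
    """Return row count, approximate size in bytes, and per-column cardinality."""
    # Cardinality: number of distinct values seen in each column.
    cardinality = {
        name: len({row[i] for row in rows}) for i, name in enumerate(columns)
    }
    # Size: a rough estimate based on the textual length of every value.
    size = sum(len(str(value).encode("utf-8")) for row in rows for value in row)
    return {"row_count": len(rows), "size_bytes": size, "cardinality": cardinality}
```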
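In a further embodiment, obtaining the pre-computed index data segment specifically includes:
S250: define a pre-created index based on a predefined unified model;
S260: perform pre-computation based on the pre-created index to obtain the index data segment;
S270: store the statistics of the pre-created index in memory, and store the pre-computed index data segment in the storage medium.
In a still further embodiment, defining the pre-created index based on the predefined unified model, as shown in Fig. 2 and Fig. 3, specifically includes:
S251: obtain a pre-created streaming data source mapping table and batch data source mapping table, wherein the streaming data source mapping table is a fact table and the batch data source mapping table is a shadow table bound to it;
S252: obtain the pre-created dimensions and measures on which the fact table and the shadow table are joined;
S253: determine, based on the obtained dimensions and measures, the join relationships between the fact table and shadow table and the dimension tables;
S254: define the pre-created index based on the join relationships.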
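One way to picture the unified model of S251 to S254 is a fact table paired with a shadow table of the same shape, plus optional dimension tables. The sketch below is an assumption about how such a model might be represented; `Table`, `UnifiedModel` and `define_index` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    columns: list


@dataclass
class UnifiedModel:
    fact: Table     # maps the streaming data source (S251)
    shadow: Table   # maps the batch data source, bound to the fact table
    dimension_tables: dict = field(default_factory=dict)  # join key -> Table

    def define_index(self, dimensions, measures):
        """S252-S254: check columns exist in both tables, then record the joins."""
        shared = set(self.fact.columns) & set(self.shadow.columns)
        needed = set(dimensions) | {column for _func, column in measures}
        missing = needed - shared
        if missing:
            raise ValueError(f"columns not in both fact and shadow table: {sorted(missing)}")
        # Join relationships to dimension tables, keyed by the joining dimension.
        joins = {
            key: table.name
            for key, table in self.dimension_tables.items()
            if key in dimensions
        }
        return {"dimensions": list(dimensions), "measures": list(measures), "joins": joins}
```

Requiring every indexed column to exist in both tables is what lets one index definition be computed against either the batch or the streaming source.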
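In a still further embodiment, computing the index data segment based on the pre-created index, as shown in Fig. 4, specifically includes:
S261: obtain the pre-created index, wherein pre-created indexes are divided into batch indexes, stream indexes and fusion indexes;
S262: perform, based on the pre-created index, pre-computation on the data source through a computing engine, wherein the computing engine includes a batch computing engine and a stream computing engine, and the data source includes a batch data source and a streaming data source;
S263: obtain the pre-computed index data segment from the pre-computation, wherein pre-computed index data segments are divided into batch index data segments and stream index data segments.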
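As a rough illustration of S261 to S263, the same aggregation can be applied to rows from each source to produce one segment per source. Both engines are replaced here by a single in-process function, which is an assumption; `precompute` and `build_segments` are hypothetical names.

```python
from collections import defaultdict


def precompute(rows, dimensions, measure):
    """Sum `measure` per combination of dimension values to form a segment."""
    segment = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        segment[key] += row[measure]
    return dict(segment)


def build_segments(batch_rows, stream_rows, dimensions, measure):
    """Produce a batch index data segment and a stream index data segment."""
    return {
        "batch": precompute(batch_rows, dimensions, measure),
        "stream": precompute(stream_rows, dimensions, measure),
    }
```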
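In a still further embodiment, storing the statistics of the pre-created index in memory specifically includes:
extracting the statistics of the pre-created index, wherein the statistics include but are not limited to row count, size and cardinality.
In a further embodiment, extracting the newly computed index data segment to update the query result specifically includes:
S310: perform new computation on the data source based on the index, wherein indexes are divided into batch indexes, stream indexes and fusion indexes, and the data source includes a batch data source and a streaming data source;
S320: obtain the newly computed index data segment through the computing engine, wherein the computing engine includes a batch computing engine and a stream computing engine, and newly computed index data segments are divided into batch index data segments and stream index data segments;
S330: store the newly computed index data segment in the storage medium;
S340: extract the newly computed index data segment from the storage medium to update the query result.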
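Steps S330 and S340 amount to folding each newly computed segment into what is already stored and re-reading it for the query. The sketch below assumes an additive measure (a sum) and uses a dictionary as the storage medium; `merge_segment` and `query_total` are hypothetical names.

```python
def merge_segment(stored, new_segment):
    """S330: fold a newly computed segment into the stored one (sums add up)."""
    for key, value in new_segment.items():
        stored[key] = stored.get(key, 0) + value
    return stored


def query_total(stored, seller_id):
    """S340: re-read the stored segment so the result reflects new data."""
    return sum(value for (sid, _date), value in stored.items() if sid == seller_id)
```

Merging by addition only works for measures such as sums and counts; measures like distinct counts would need a different merge rule.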
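From the above description, it can be seen that the present invention achieves the following technical effects:
1. By defining a unified model, historical data and real-time data can be analyzed simultaneously through a single SQL query statement;
2. Pre-computation and real-time merging effectively reduce query response time;
3. Storing computation results in the storage medium for reuse helps keep the system stable;
4. Continuously storing newly computed results in the storage medium keeps the query result up to date.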
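Embodiment 2
This embodiment provides an information processing apparatus for batch-stream fusion, characterized in that it comprises:
a client, configured to obtain an index based on an input query statement;
a query engine, configured to extract a pre-computed index data segment based on the index as a query result;
an update module, configured to extract a newly computed index data segment to update the query result.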
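The three components can be pictured as cooperating objects. The sketch below only shows how responsibilities might be divided; the class and method names are hypothetical, and the client's index lookup is a placeholder rather than real query parsing.

```python
class QueryEngine:
    """Holds pre-computed segments and returns them as query results."""

    def __init__(self, segments):
        self.segments = segments  # index name -> pre-computed rows

    def extract(self, index):
        return list(self.segments.get(index, []))


class UpdateModule:
    """Appends newly computed rows and returns the refreshed result."""

    def __init__(self, engine):
        self.engine = engine

    def apply(self, index, new_rows):
        self.engine.segments.setdefault(index, []).extend(new_rows)
        return self.engine.extract(index)


class Client:
    """Turns an input query into an index name (placeholder logic only)."""

    def get_index(self, query):
        # A real client would parse the query; here the normalised text is the name.
        return query.strip().lower()
```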
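In a further embodiment, the query engine:
obtains statistics of the index;
searches memory for a pre-created index having those statistics;
locates, based on the pre-created index, its pre-computed index data segment in the storage medium;
takes the located pre-computed index data segment as the query result.
In a further embodiment, the query engine includes:
a pre-created index definition module, configured to define a pre-created index based on a predefined unified model;
a pre-computation module, configured to perform pre-computation based on the pre-created index to obtain the index data segment;
an information storage module, which stores the statistics of the pre-created index in memory and stores the pre-computed index data segment in the storage medium.
In a still further embodiment, the pre-created index definition module:
obtains a pre-created streaming data source mapping table and batch data source mapping table, wherein the streaming data source mapping table is a fact table and the batch data source mapping table is a shadow table bound to it;
obtains the pre-created dimensions and measures on which the fact table and the shadow table are joined;
determines, based on the obtained dimensions and measures, the join relationships between the fact table and shadow table and the dimension tables;
defines the pre-created index based on the join relationships.
In a still further embodiment, the pre-computation module:
obtains the pre-created index, wherein pre-created indexes are divided into batch indexes, stream indexes and fusion indexes;
performs, based on the pre-created index, pre-computation on the data source through a computing engine, wherein the computing engine includes a batch computing engine and a stream computing engine, and the data source includes a batch data source and a streaming data source;
obtains the pre-computed index data segment from the pre-computation, wherein pre-computed index data segments are divided into batch index data segments and stream index data segments.
In a still further embodiment, the information storage module:
extracts the statistics of the pre-created index, wherein the statistics include row count, size and cardinality.
In a further embodiment, the update module:
performs new computation on the data source based on the index, wherein indexes are divided into batch indexes, stream indexes and fusion indexes, and the data source includes a batch data source and a streaming data source;
obtains the newly computed index data segment through the computing engine, wherein the computing engine includes a batch computing engine and a stream computing engine, and newly computed index data segments are divided into batch index data segments and stream index data segments;
stores the newly computed index data segment in the storage medium;
extracts the newly computed index data segment from the storage medium to update the query result.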
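Embodiment 3
In this embodiment, the dimensions of a pre-created index are insurance salesperson (seller_id) and date (date), and the measure is the sum of policy amounts, sum(amount). Since the number of salespersons may be large, the degree of aggregation of this index may not be very high. The data content corresponding to this pre-created index may be as shown in Table 1 below, which summarizes the transaction amounts in each salesperson's daily sales records:
Seller_Id  Date                 Sum(amount)
10001      2020-05-01 00:00:00  100
10001      2020-05-01 00:10:00  200
10002      2020-05-01 00:20:00  150
10003      2020-05-01 00:30:00  80
10003      2020-05-01 00:40:00  30
Table 1
Assuming there are 100,000 salespersons in total, the remaining roughly 100,000 rows of pre-computed results are omitted here.
The data content of this table is the batch index data segment. The system computes the pre-created index and saves the pre-computed result, that is, the table data is saved in real time.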
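Based on the information processing method for batch-stream fusion provided in this embodiment, through the step:
S100: obtain an index based on an input query statement;
the user provides a query:
SQL 1 analyzes the total transaction amount of salesperson No. 10003 from May 1, 2020 to the present:
SELECT sum(amount) FROM transactions WHERE date > timestamp('2020-05-01 00:00:00') AND seller_id = '10003'
When consuming continuously generated real-time streaming data, the system creates, for this query, an index close to the pre-created index above.
In a further embodiment, obtaining the index based on the input query statement specifically includes:
S110: obtain the input query statement;
S120: search the query statement for keywords;
S130: associate dimensions and measures according to the keywords found;
S140: create an index from the associated dimensions and measures.
In a still further embodiment, associating the keywords specifically includes:
S131: set keywords based on the kinds of dimensions and measures;
S132: establish a mapping between each keyword and the kind of dimension or measure it represents;
S133: when a keyword is found, obtain the kind of dimension or measure it represents.
Here the keywords found in query statement SQL 1 are salesperson (seller_id), May 1, 2020 (date) and total transaction amount sum(amount). The salesperson and May 1, 2020 keywords are associated with the dimensions insurance salesperson (seller_id) and date (date) respectively, and the total transaction amount sum(amount) is associated with the measure sum of policy amounts, sum(amount).
Based on the information processing method for batch-stream fusion provided in this embodiment, through the step:
S200: extract a pre-computed index data segment based on the index as a query result;
when the statement in SQL 1 is queried, it is answered directly from the result previously stored in the storage medium, namely the data of Table 1, which is batch data; this safeguards the performance, efficiency and stability of the system.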
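To make the example concrete, the five rows shown in Table 1 can be loaded into an in-memory SQLite table and queried in the spirit of SQL 1. This is a sketch under assumptions: SQLite stands in for the storage medium, the table name `segment` and column name `sum_amount` are hypothetical, and timestamps are compared as ISO-formatted strings instead of through a `timestamp()` function.

```python
import sqlite3

# The five rows shown in Table 1 (the omitted rows are not reproduced).
TABLE_1 = [
    ("10001", "2020-05-01 00:00:00", 100),
    ("10001", "2020-05-01 00:10:00", 200),
    ("10002", "2020-05-01 00:20:00", 150),
    ("10003", "2020-05-01 00:30:00", 80),
    ("10003", "2020-05-01 00:40:00", 30),
]


def total_for_seller(seller_id, since):
    """Answer the query from the pre-computed segment instead of raw data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE segment (seller_id TEXT, date TEXT, sum_amount INTEGER)")
    conn.executemany("INSERT INTO segment VALUES (?, ?, ?)", TABLE_1)
    # ISO-formatted timestamps sort correctly as plain strings.
    (total,) = conn.execute(
        "SELECT SUM(sum_amount) FROM segment WHERE date > ? AND seller_id = ?",
        (since, seller_id),
    ).fetchone()
    conn.close()
    return total


print(total_for_seller("10003", "2020-05-01 00:00:00"))  # prints 110
```

For salesperson 10003 the two matching rows are 80 and 30, so the query over the visible rows returns 110 without touching the underlying transactions.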
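In a further embodiment, extracting the pre-computed index data segment based on the index as the query result specifically includes:
S210: obtain statistics of the index;
S220: search memory for a pre-created index having those statistics;
S230: locate, based on the pre-created index, its pre-computed index data segment in the storage medium;
S240: take the located pre-computed index data segment as the query result.
Here the index has three columns, with cardinality recorded for Seller_Id, Date and Sum(amount). The statistics of the pre-created indexes stored in memory are searched to find the pre-created index above, and its pre-computed index data segment, namely Table 1, is located in the storage medium and returned as the query result.
In a still further embodiment, obtaining the statistics of the index specifically includes:
S211: receive the created index;
S212: collect statistics on the index;
S213: derive the relevant statistics from the result, including but not limited to row count, size and cardinality.
Here the statistics obtained for the index cover three columns, with cardinality recorded for Seller_Id, Date and Sum(amount).
In a further embodiment, obtaining the pre-computed index data segment specifically includes:
S250: define a pre-created index based on a predefined unified model;
S260: perform pre-computation based on the pre-created index to obtain the index data segment;
S270: store the statistics of the pre-created index in memory, and store the pre-computed index data segment in the storage medium.
In a still further embodiment, defining the pre-created index based on the predefined unified model, as shown in Fig. 2 and Fig. 3, specifically includes:
S251: obtain a pre-created streaming data source mapping table and batch data source mapping table, wherein the streaming data source mapping table is a fact table and the batch data source mapping table is a shadow table bound to it;
S252: obtain the pre-created dimensions and measures on which the fact table and the shadow table are joined;
S253: determine, based on the obtained dimensions and measures, the join relationships between the fact table and shadow table and the dimension tables;
S254: define the pre-created index based on the join relationships.
Here the dimensions are insurance salesperson (seller_id) and date (date), and the measure is the sum of policy amounts, sum(amount).
In a still further embodiment, computing the index data segment based on the pre-created index, as shown in Fig. 4, specifically includes:
S261: obtain the pre-created index, wherein pre-created indexes are divided into batch indexes, stream indexes and fusion indexes;
S262: perform, based on the pre-created index, pre-computation on the data source through a computing engine, wherein the computing engine includes a batch computing engine and a stream computing engine, and the data source includes a batch data source and a streaming data source;
S263: obtain the pre-computed index data segment from the pre-computation, wherein pre-computed index data segments are divided into batch index data segments and stream index data segments.
Here the batch index data segment obtained by pre-computation is Table 1.
In a still further embodiment, storing the statistics of the pre-created index in memory specifically includes:
extracting the statistics of the pre-created index, wherein the statistics include but are not limited to row count, size and cardinality.
Based on the information processing method for batch-stream fusion provided in this embodiment, through the step:
S300: extract a newly computed index data segment to update the query result.
As newly computed results are continuously stored in the storage medium, the result of the query is updated accordingly.
In a further embodiment, extracting the newly computed index data segment to update the query result specifically includes:
S310: perform new computation on the data source based on the index, wherein indexes are divided into batch indexes, stream indexes and fusion indexes, and the data source includes a batch data source and a streaming data source;
S320: obtain the newly computed index data segment through the computing engine, wherein the computing engine includes a batch computing engine and a stream computing engine, and newly computed index data segments are divided into batch index data segments and stream index data segments;
S330: store the newly computed index data segment in the storage medium;
S340: extract the newly computed index data segment from the storage medium to update the query result.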
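Embodiment 4
An embodiment of the present invention further includes an electronic device, comprising a memory and a processor, the memory storing a computer program which, when executed in the processor, implements the above information processing method for batch-stream fusion, the method including:
obtaining an index based on an input query statement;
extracting a pre-computed index data segment based on the index as a query result;
extracting a newly computed index data segment to update the query result.
Embodiment 5
In this embodiment, the present invention further provides a readable storage medium storing a computer program which, when executed by a processor, implements the above information processing method for batch-stream fusion, the method including:
obtaining an index based on an input query statement;
extracting a pre-computed index data segment based on the index as a query result;
extracting a newly computed index data segment to update the query result.
The readable storage medium may be a computer storage medium or a communication medium. Communication media include any medium that facilitates transfer of a computer program from one place to another. Computer storage media can be any available media that can be accessed by a general-purpose or special-purpose computer. For example, a readable storage medium is coupled to the processor so that the processor can read information from, and write information to, the readable storage medium. The readable storage medium may also be a component of the processor. The processor and the readable storage medium may be located in an application-specific integrated circuit (ASIC), and the ASIC may be located in user equipment. The processor and the readable storage medium may also exist in a communication device as discrete components. The readable storage medium may be a read-only memory (ROM), random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, or the like.
The present invention further provides a program product including execution instructions stored in a readable storage medium. At least one processor of a device can read the execution instructions from the readable storage medium, and executing them causes the device to implement the methods provided by the various embodiments above.
In the above embodiments of the terminal or server, it should be understood that the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), and so on. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the present invention may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
It should be noted that the steps shown in the flow charts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flow charts, in some cases the steps shown or described may be performed in a different order.
Obviously, those skilled in the art should understand that each module or step of the present invention described above can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device, or they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.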

Claims (15)

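  1. A batch-stream fusion information processing method, comprising:
    obtaining an index based on an input query statement;
    extracting pre-computed index data segments based on the index as the query result;
    extracting newly computed index data segments to update the query result.
  2. The method according to claim 1, wherein extracting pre-computed index data segments based on the index as the query result specifically comprises:
    obtaining the statistics of the index;
    retrieving from memory the pre-created index that has these statistics;
    locating, based on the pre-created index, its pre-computed index data segments in the storage medium;
    taking the located pre-computed index data segments as the query result.
  3. The method according to claim 1, wherein obtaining the pre-computed index data segments specifically comprises:
    defining a pre-created index based on a predefined unified model;
    performing pre-computation based on the pre-created index to obtain index data segments;
    storing the statistics of the pre-created index in memory, and storing the pre-computed index data segments in the storage medium.
  4. The method according to claim 3, wherein defining a pre-created index based on a predefined unified model specifically comprises:
    obtaining a pre-created stream data source mapping table and a pre-created batch data source mapping table, wherein the stream data source mapping table is a fact table and the batch data source mapping table is a shadow table bound to it;
    obtaining the pre-created dimensions and measures on which the fact table and the shadow table are joined;
    determining, based on the obtained dimensions and measures, the join relationship between the fact table and the shadow table and the dimension table;
    defining the pre-created index based on the join relationship.
  5. The method according to claim 3, wherein computing the index data segments based on the pre-created index specifically comprises:
    obtaining the pre-created indexes, wherein the pre-created indexes are divided into batch indexes, stream indexes and fusion indexes;
    performing, based on the pre-created indexes, pre-computation on the data sources by means of computing engines, wherein the computing engines include a batch computing engine and a stream computing engine, and the data sources include a batch data source and a stream data source;
    obtaining pre-computed index data segments from the pre-computation, wherein the pre-computed index data segments are divided into batch index data segments and stream index data segments.
  6. The method according to claim 3, wherein storing the statistics of the pre-created index in memory specifically comprises:
    extracting the statistics of the pre-created index, wherein the statistics include row count, size and cardinality.
  7. The method according to claim 1, wherein extracting newly computed index data segments to update the query result specifically comprises:
    performing new computation on the data sources based on the indexes, wherein the indexes are divided into batch indexes, stream indexes and fusion indexes, and the data sources include a batch data source and a stream data source;
    obtaining newly computed index data segments by means of the computing engines, wherein the computing engines include a batch computing engine and a stream computing engine, and the newly computed index data segments are divided into batch index data segments and stream index data segments;
    storing the newly computed index data segments in the storage medium;
    extracting the newly computed index data segments from the storage medium to update the query result.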
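  8. A batch-stream fusion information processing apparatus, characterized by comprising:
    a client, configured to obtain an index based on an input query statement;
    a query engine, configured to extract pre-computed index data segments based on the index as the query result;
    an update module, configured to extract newly computed index data segments to update the query result.
  9. The apparatus according to claim 8, wherein the query engine is configured to:
    obtain the statistics of the index;
    retrieve from memory the pre-created index that has these statistics;
    locate, based on the pre-created index, its pre-computed index data segments in the storage medium;
    take the located pre-computed index data segments as the query result.
  10. The apparatus according to claim 8, wherein the query engine comprises:
    a pre-created index definition module, configured to define a pre-created index based on a predefined unified model;
    a pre-computation module, configured to perform pre-computation based on the pre-created index to obtain index data segments;
    an information storage module, configured to store the statistics of the pre-created index in memory and to store the pre-computed index data segments in the storage medium.
  11. The apparatus according to claim 10, wherein the pre-created index definition module is configured to:
    obtain a pre-created stream data source mapping table and a pre-created batch data source mapping table, wherein the stream data source mapping table is a fact table and the batch data source mapping table is a shadow table bound to it;
    obtain the pre-created dimensions and measures on which the fact table and the shadow table are joined;
    determine, based on the obtained dimensions and measures, the join relationship between the fact table and the shadow table and the dimension table;
    define the pre-created index based on the join relationship.
  12. The apparatus according to claim 8, wherein the pre-computation module is configured to:
    obtain the pre-created indexes, wherein the pre-created indexes are divided into batch indexes, stream indexes and fusion indexes;
    perform, based on the pre-created indexes, pre-computation on the data sources by means of computing engines, wherein the computing engines include a batch computing engine and a stream computing engine, and the data sources include a batch data source and a stream data source;
    obtain pre-computed index data segments from the pre-computation, wherein the pre-computed index data segments are divided into batch index data segments and stream index data segments.
  13. The apparatus according to claim 8, wherein the update module is configured to:
    perform new computation on the data sources based on the indexes, wherein the indexes are divided into batch indexes, stream indexes and fusion indexes, and the data sources include a batch data source and a stream data source;
    obtain newly computed index data segments by means of the computing engines, wherein the computing engines include a batch computing engine and a stream computing engine, and the newly computed index data segments are divided into batch index data segments and stream index data segments;
    store the newly computed index data segments in the storage medium;
    extract the newly computed index data segments from the storage medium to update the query result.
  14. An electronic device, comprising a memory and a processor, the memory storing a computer program, characterized in that the computer program, when executed by the processor, implements the method of any one of claims 1 to 7.
  15. A storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 7.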
PCT/CN2021/123288 2021-09-08 2021-10-12 Information processing method and apparatus for batch-stream fusion, and storage medium WO2023035355A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21930627.1A EP4170524A4 (en) 2021-09-08 2021-10-12 INFORMATION PROCESSING METHOD AND APPARATUS FOR BATCH STREAM FUSION AND STORAGE MEDIUM
US18/092,326 US20230153308A1 (en) 2021-09-08 2023-01-01 Method and device for processing information by batch-stream fusion, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111049808.5 2021-09-08
CN202111049808.5A CN113918771A (zh) 2021-09-08 2021-09-08 Information processing method and apparatus for batch-stream fusion, and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/092,326 Continuation US20230153308A1 (en) 2021-09-08 2023-01-01 Method and device for processing information by batch-stream fusion, and storage medium

Publications (1)

Publication Number Publication Date
WO2023035355A1 true WO2023035355A1 (zh) 2023-03-16

Family

ID=79234245

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/123288 WO2023035355A1 (zh) 2021-09-08 2021-10-12 Information processing method and apparatus for batch-stream fusion, and storage medium

Country Status (4)

Country Link
US (1) US20230153308A1 (zh)
EP (1) EP4170524A4 (zh)
CN (1) CN113918771A (zh)
WO (1) WO2023035355A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117435596B (zh) * 2023-12-20 2024-04-02 杭州网易云音乐科技有限公司 流批任务一体化方法、装置、存储介质及电子设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180157749A1 (en) * 2016-12-06 2018-06-07 International Business Machines Corporation Building a data query engine that leverages expert data preparation operations
CN109063017A (zh) * 2018-07-12 2018-12-21 广州市闲愉凡生信息科技有限公司 一种云计算平台的数据持久化分布方法
CN111143411A (zh) * 2019-12-23 2020-05-12 跬云(上海)信息科技有限公司 动态流式预计算方法及装置、存储介质
CN112527839A (zh) * 2020-12-10 2021-03-19 上海浦东发展银行股份有限公司 多源数据处理方法、系统、设备及存储介质
CN112835966A (zh) * 2019-11-22 2021-05-25 北京金山云网络技术有限公司 数据查询方法、装置以及电子设备

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130073586A1 (en) * 2011-05-02 2013-03-21 Amadeus S.A.S. Database system using batch-oriented computation
WO2013160721A1 (en) * 2012-04-26 2013-10-31 Amadeus S.A.S. Database system using batch-oriented computation
US10838931B1 (en) * 2017-04-28 2020-11-17 EMC IP Holding Company LLC Use of stream-oriented log data structure for full-text search oriented inverted index metadata
US10474655B1 (en) * 2018-07-23 2019-11-12 Improbable Worlds Ltd Entity database

Also Published As

Publication number Publication date
EP4170524A4 (en) 2023-10-25
US20230153308A1 (en) 2023-05-18
CN113918771A (zh) 2022-01-11
EP4170524A1 (en) 2023-04-26

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021930627

Country of ref document: EP

Effective date: 20220926

NENP Non-entry into the national phase

Ref country code: DE