CN106202408B - Data query server based on OLAP, system and method - Google Patents

Data query server based on OLAP, system and method

Publication number
CN106202408B
Authority
CN
China
Prior art keywords
data
module
inquiry request
request
cpu
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610543412.9A
Other languages
Chinese (zh)
Other versions
CN106202408A (en)
Inventor
王桂兰
周国亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China Electric Power University
Original Assignee
North China Electric Power University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China Electric Power University
Priority to CN201610543412.9A
Publication of CN106202408A
Application granted
Publication of CN106202408B
Legal status: Expired - Fee Related (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/24569 Query processing with adaptation to specific hardware, e.g. adapted for using GPUs or SSDs
    • G06F16/2453 Query optimisation
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Abstract

The present invention relates to an OLAP-based data query server, system and method. The server includes a CPU and a GPU. On the CPU, a request acquisition module obtains a data query request, which includes the dimension table information and fact table information to be queried; a metadata module loads the dimension table corresponding to the dimension table information to be queried; and a dimension filtering module filters the dimension table according to the data query request, obtains the filtered data expressed as key-value pairs, and sends the key-value pair data to the GPU. On the GPU, a cube data module stores the cube data corresponding to the fact table information; a cube filtering module filters the cube data according to the key-value pair data to obtain filtered data; and an aggregation module performs grouped aggregation on the filtered data to obtain the requested data. Using the GPU to accelerate OLAP queries effectively improves OLAP processing efficiency.

Description

Data query server based on OLAP, system and method
Technical field
The present invention relates to the field of database technology, and in particular to an OLAP-based data query server, system and method.
Background art
OLAP (Online Analytical Processing) is an important means of data analysis that supports business decision-making. As data volumes keep growing and users' performance requirements keep rising, performance problems become more prominent. Moreover, OLAP usually analyzes historical data with a certain time lag, which no longer meets the need of modern commercial enterprises to seize fleeting business opportunities; enterprises need to analyze the latest data rather than historical data.
Existing OLAP systems perform data queries on the CPU. Over the past ten years general-purpose CPU technology has advanced considerably, but performance gains have slowed and single-threaded program performance is heavily constrained. On the one hand, instruction-level parallelism in general-purpose programs is low; on the other hand, CPUs are limited by the power wall, the memory wall and the frequency wall, so performance is hard to keep improving. Processors are no longer getting faster, only wider. In current processor designs, most transistors are used for cache rather than for computing units. Although this keeps processor power consumption within a reasonable range, it hinders further performance improvement. CPU-based OLAP systems therefore struggle to meet users' requirements for efficient query performance.
Summary of the invention
In view of this, it is necessary to provide an OLAP-based data query server, system and method with high query efficiency.
An OLAP-based data query server includes a CPU and a GPU;
The CPU includes a request acquisition module, a metadata module and a dimension filtering module; the GPU includes a cube filtering module, an aggregation module and a cube data module;
The request acquisition module is configured to obtain a data query request, the data query request including the dimension table information and fact table information to be queried;
The metadata module is configured to load the dimension table corresponding to the dimension table information to be queried;
The dimension filtering module is configured to filter the dimension table according to the data query request, obtain the filtered data expressed as key-value pairs, and send the key-value pair data to the GPU;
The cube data module is configured to store the cube data corresponding to the fact table information;
The cube filtering module is configured to filter the cube data according to the key-value pair data to obtain filtered data;
The aggregation module is configured to perform grouped aggregation on the filtered data to obtain the requested data.
In one embodiment, the dimension table information includes dimensions and the hierarchy information of the dimensions; the request acquisition module includes a validator and a parser, each of which is connected to the metadata module;
The validator is configured to obtain a service API and to verify whether the service API conforms to the specification and whether the dimensions and dimension hierarchy information encapsulated in the service API are correct;
The parser is configured to parse the service API after the validator's verification succeeds, to obtain the data query request.
In one embodiment, the types of data query request include roll-up, drill-down, slice, dice and pivot.
In one embodiment, the CPU further includes a cache module and a query module;
The cache module is configured to cache historical query data;
The query module is configured to check, when a data query request is received, whether data corresponding to the data query request is stored in the cache module;
The dimension filtering module is started when the cache module does not store data corresponding to the data query request; the aggregation module is further configured to send the requested data obtained by grouped aggregation to the cache module for storage.
In one embodiment, the CPU further includes a write-back module configured to receive a write-back request and to modify, according to the write-back request, the requested data obtained by grouped aggregation.
An OLAP-based data query system includes a presentation layer, a storage layer and the above OLAP-based data query server;
The presentation layer is configured to provide an entry point for inputting data query requests, to encapsulate the data query request input by the user as the corresponding service API and send the service API to the CPU, and to display the requested data returned by the query;
The storage layer is configured to store OLAP-related data, the related data including dimension tables and fact tables; the metadata module is configured to load from the storage layer the dimension table corresponding to the dimension table information to be queried; the cube data module is configured to store the cube data corresponding to the fact table information.
An OLAP-based data query method includes:
The CPU obtains a data query request, the data query request including the dimension table information and fact table information to be queried;
The CPU loads the dimension table corresponding to the dimension table information to be queried;
The CPU filters the dimension table according to the data query request, obtains the filtered data expressed as key-value pairs, and sends the key-value pair data to the GPU;
The GPU filters the cube data that is stored in the GPU and corresponds to the fact table information, according to the key-value pair data, to obtain filtered data;
The GPU performs grouped aggregation on the filtered data to obtain the requested data.
In one embodiment, the step in which the CPU obtains the data query request includes:
The CPU obtains a service API;
The CPU verifies whether the service API conforms to the specification and whether the dimensions and dimension hierarchy information encapsulated in the service API are correct;
When verification succeeds, the CPU parses the service API to obtain the data query request.
In one embodiment, the types of data query request include roll-up, drill-down, slice, dice and pivot.
In one embodiment, after the step of parsing the service API upon successful verification to obtain the data query request, the method further includes: the CPU checks whether data corresponding to the data query request is stored in the cache;
If so, the CPU returns the data corresponding to the data query request from the cache;
If not, the step is executed in which the CPU filters the dimension table according to the data query request, obtains the filtered data expressed as key-value pairs, and sends the key-value pair data to the GPU;
After the step in which the GPU performs grouped aggregation on the filtered data to obtain the requested data, the method further includes:
The GPU sends the requested data obtained by grouped aggregation to the CPU, where it is stored in the cache.
When the above OLAP-based data query server receives a data query request, it filters the dimension table data on the CPU side and performs fact table data filtering and grouped aggregation on the GPU side. Because the GPU offers high performance and efficiency, using its computing resources to accelerate OLAP queries effectively improves OLAP processing efficiency.
Brief description of the drawings
Fig. 1 is a functional block diagram of the OLAP-based data query server of one embodiment;
Fig. 2 shows the storage format of the fact table of one embodiment;
Fig. 3 is a functional block diagram of the OLAP-based data query server of another embodiment;
Fig. 4 is a functional block diagram of the OLAP-based data query system of one embodiment;
Fig. 5 is a flow chart of the OLAP-based data query method of one embodiment;
Fig. 6 is a flow chart of the OLAP-based data query method of another embodiment;
Fig. 7 shows how a where clause is processed in the roll-up query operation of one embodiment;
Fig. 8 shows how another where clause is processed in the roll-up query operation of one embodiment;
Fig. 9 shows the reduction of an integer field in the roll-up query operation of one embodiment;
Fig. 10 shows the result data that satisfies the conditions in the roll-up query operation of one embodiment;
Fig. 11 shows the final result obtained in the roll-up query operation of one embodiment;
Fig. 12 shows the final result of the roll-up query operation of one embodiment as displayed in the front end.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and do not limit it.
In one embodiment, an OLAP-based data query server is provided, which includes a CPU (Central Processing Unit) and a GPU (Graphics Processing Unit). Driven by game players' demanding requirements for graphics, GPU performance has grown geometrically over time in recent years. The GPU has evolved from a dedicated parallel processor composed of several fixed-function units into an architecture built mainly on general-purpose computing resources and supplemented by fixed-function units, known as GPGPU (General Purpose GPU). OLAP is a data-intensive and compute-intensive application; offloading its computation-heavy, performance-critical operations to the GPU can effectively improve OLAP processing efficiency.
As shown in Fig. 1, the OLAP-based data query server 10 includes CPU 12 and GPU 14. CPU 12 includes request acquisition module 120, metadata module 122 and dimension filtering module 124. GPU 14 includes cube filtering module 140, aggregation module 142 and cube data module 144.
Request acquisition module 120 is configured to obtain a data query request. The data query request includes the dimension table information and fact table information to be queried.
Metadata module 122 is configured to load the dimension table corresponding to the dimension table information to be queried.
A dimension table contains detailed information associated with specific attributes in the fact table, for example a product dimension table for product details, or a time dimension table for analysis by time. A dimension is the way people analyze data and the basic element that makes up a cube. It can contain multiple levels, each with multiple members; for example, a dimension table of product information typically divides products into several categories such as food, beverages and non-consumables. The column fields in a dimension table divide the information into a structure of different levels.
Dimension tables are stored as structs of column arrays. In one embodiment, the customer dimension can be stored as such a struct.
The hierarchy information of a dimension is stored in, and described by, a schema file.
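The struct listing for the customer dimension is not reproduced in this text. As a purely illustrative sketch of a column-array layout, the dimension could be held as parallel arrays, one per column. All field names below (customer_key, city, region) and the sample rows are assumptions for illustration, not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CustomerDim:
    """Column-array layout: one list per column, rows share an index."""
    customer_key: List[int] = field(default_factory=list)  # key referenced by the fact table (assumed)
    city: List[str] = field(default_factory=list)          # lower hierarchy level (assumed)
    region: List[str] = field(default_factory=list)        # higher hierarchy level (assumed)

    def append(self, key: int, city: str, region: str) -> None:
        # Appending a row means appending one value to every column.
        self.customer_key.append(key)
        self.city.append(city)
        self.region.append(region)

    def row(self, i: int) -> Tuple[int, str, str]:
        # Reassemble a logical row from the parallel columns.
        return (self.customer_key[i], self.city[i], self.region[i])


dim = CustomerDim()
dim.append(1, "Beijing", "North")
dim.append(2, "Baoding", "North")
dim.append(3, "Shanghai", "East")
```

Keeping each column contiguous means a filter on one attribute only has to scan that column, which is the usual motivation for this kind of layout.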
Dimension filtering module 124 is configured to filter the dimension table according to the data query request, obtain the filtered data expressed as key-value pairs, and send the key-value pair data to the GPU.
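A minimal sketch of this CPU-side step is given below. The text does not say what the key and the value of each pair are; the sketch assumes the key is the dimension's primary key and the value is the attribute the query groups by. The function name filter_dimension and the sample columns are hypothetical.

```python
from typing import Callable, Dict, List


def filter_dimension(keys: List[int],
                     group_values: List[str],
                     predicate_column: List[str],
                     predicate: Callable[[str], bool]) -> Dict[int, str]:
    """Return {dimension key: group-by value} for the rows that pass the predicate."""
    return {
        k: g
        for k, g, p in zip(keys, group_values, predicate_column)
        if predicate(p)
    }


# Hypothetical customer dimension stored as parallel columns.
customer_key = [1, 2, 3]
city = ["Beijing", "Baoding", "Shanghai"]
region = ["North", "North", "East"]

# Keep customers in the North region, grouping later by city.
kv = filter_dimension(customer_key, city, region, lambda r: r == "North")
```

Because dimension tables are normally far smaller than the fact table, only this compact mapping would need to cross from CPU to GPU.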
Cube data module 144 is configured to store the cube data corresponding to the fact table information.
The fact table stores the fact measures and the foreign key values pointing to each dimension; it is the result table generated after aggregating data along certain dimensions, and it is stored on hard disk. When a fact table is queried for the first time, it is loaded from disk to obtain the corresponding cube data. The cube data is stored in the GPU's global memory as a multidimensional array; the cube data of one embodiment is shown in Fig. 2.
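The exact layout of Fig. 2 is not reproduced here. As an illustration of how a multidimensional array can live in a flat block of memory such as GPU global memory, the sketch below uses a row-major mapping from a multidimensional index to a linear offset. Row-major order and the name linear_offset are assumptions, not details confirmed by the text.

```python
from typing import List, Sequence


def linear_offset(index: Sequence[int], shape: Sequence[int]) -> int:
    """Row-major offset of a multidimensional index within a flat array."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same rank")
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for extent {n}")
        offset = offset * n + i
    return offset


# Hypothetical cube with 2 x 3 x 4 cells held in one flat list.
shape = (2, 3, 4)
cube: List[float] = [0.0] * (2 * 3 * 4)
cube[linear_offset((1, 2, 3), shape)] = 42.0
```

For shape (d0, d1, d2) the offset of cell (i, j, k) is (i * d1 + j) * d2 + k, so the last cell of the 2 x 3 x 4 cube lands at offset 23.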
Cube filtering module 140 is configured to filter the cube data according to the key-value pair data to obtain filtered data.
Specifically, for each element of the fact table column corresponding to a dimension, it is determined whether the element satisfies the query condition in the data query request; if so, the corresponding position is set to 1, otherwise to 0, thereby obtaining the filtered data.
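The 1/0 marking described above amounts to building a mask over the fact rows. The following sketch runs serially on the CPU for clarity; on a GPU each position would typically be handled by its own thread. The helper names build_mask and combine_masks, and the idea of AND-ing one mask per dimension, are assumptions for illustration.

```python
from typing import Dict, List


def build_mask(foreign_keys: List[int], kv: Dict[int, str]) -> List[int]:
    """One flag per fact row: 1 if the row's dimension key survived filtering, else 0."""
    return [1 if fk in kv else 0 for fk in foreign_keys]


def combine_masks(*masks: List[int]) -> List[int]:
    """Element-wise AND of the masks produced for several dimensions."""
    return [int(all(bits)) for bits in zip(*masks)]


# Hypothetical foreign-key columns of a five-row fact table.
fact_customer_fk = [1, 3, 2, 1, 3]
fact_year_fk = [2015, 2015, 2016, 2016, 2016]

customer_mask = build_mask(fact_customer_fk, {1: "Beijing", 2: "Baoding"})
year_mask = build_mask(fact_year_fk, {2016: "2016"})
mask = combine_masks(customer_mask, year_mask)
```

Each position is decided independently of the others, which is what makes this step a natural fit for data-parallel hardware.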
Aggregation module 142 is configured to perform grouped aggregation on the filtered data to obtain the requested data.
Grouped aggregation means aggregating the filtered data according to the command of the data query request, whose types include drill-down, roll-up, slice, dice and pivot. The aggregation module performs the grouped aggregation corresponding to the type of query request to obtain the requested data.
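As a rough sketch of grouped aggregation over filtered data, the example below sums a measure per group for the rows whose mask flag is 1. It assumes a sum aggregate and assumes the group label is looked up through the key-value pairs received from the CPU; group_sum and the sample data are hypothetical, and a GPU implementation would use a parallel reduction instead of this serial loop.

```python
from collections import defaultdict
from typing import Dict, List


def group_sum(mask: List[int],
              foreign_keys: List[int],
              measures: List[float],
              kv: Dict[int, str]) -> Dict[str, float]:
    """Sum the measure per group label over rows whose mask flag is 1."""
    totals: Dict[str, float] = defaultdict(float)
    for flag, fk, value in zip(mask, foreign_keys, measures):
        if flag:
            totals[kv[fk]] += value  # kv maps dimension key -> group label
    return dict(totals)


mask = [1, 0, 1, 1, 0]
customer_fk = [1, 3, 2, 1, 3]
sales = [10.0, 20.0, 5.0, 7.5, 1.0]
kv = {1: "Beijing", 2: "Baoding"}

result = group_sum(mask, customer_fk, sales, kv)
```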
When the above OLAP-based data query server receives a data query request, it filters the dimension table data on the CPU side and performs fact table data filtering and grouped aggregation on the GPU side. Because the GPU offers high performance and efficiency, using its computing resources to accelerate OLAP queries effectively improves OLAP processing efficiency.
In another embodiment, the dimension table information includes dimensions and the hierarchy information of the dimensions. As shown in Fig. 3, request acquisition module 120 includes validator 121 and parser 123, each of which is connected to metadata module 122.
Validator 121 is configured to obtain the service API and to verify whether the service API conforms to the specification and whether the dimensions and dimension hierarchy information encapsulated in the service API are correct.
The service API (Application Programming Interface) refers to functions predefined by the system. When the front end receives a query request, it encapsulates the query request as the corresponding service API.
When validator 121 receives the service API corresponding to a data query request, it verifies whether the service API conforms to the specification. Specifically, it verifies whether the query instruction encapsulated in the service API, which corresponds to the data query request, conforms to the specification. The types of query request include roll-up, drill-down, slice, dice and pivot; validator 121 verifies whether the encapsulated query request meets the requirements of the corresponding type and whether the query statement contains errors.
The column fields in a dimension table divide the information into a structure of different levels. Validator 121 verifies whether the dimensions and the corresponding hierarchy information in the query command are correct, that is, whether they are consistent with the information in the dimension table. If they are inconsistent, the query command is erroneous and the query cannot proceed.
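The two checks described above (the request type, and the dimensions and levels against the dimension metadata) can be sketched as follows. The request format, the SCHEMA contents and the operation names are all hypothetical stand-ins; the patent does not define them.

```python
from typing import Dict, List

# Hypothetical dimension hierarchy metadata, e.g. as loaded from a schema file.
SCHEMA: Dict[str, List[str]] = {
    "Customer": ["Region", "City"],
    "Time": ["Year", "Quarter", "Month"],
}
OPERATIONS = {"rollup", "drilldown", "slice", "dice", "pivot"}


def validate(request: dict, schema: Dict[str, List[str]] = SCHEMA) -> List[str]:
    """Return a list of problems; an empty list means the request passes."""
    errors: List[str] = []
    if request.get("op") not in OPERATIONS:
        errors.append(f"unknown operation: {request.get('op')}")
    for dim, level in request.get("levels", []):
        if dim not in schema:
            errors.append(f"unknown dimension: {dim}")
        elif level not in schema[dim]:
            errors.append(f"unknown level {level} in dimension {dim}")
    return errors


ok = validate({"op": "rollup", "levels": [("Customer", "City")]})
bad = validate({"op": "rollup", "levels": [("Customer", "Month")]})
```

Returning every problem rather than stopping at the first is a design choice of this sketch; a request would only be passed on to parsing when the list is empty.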
Parser 123 is configured to parse the service API after the validator's verification succeeds, to obtain the data query request.
Parsing the service API yields the data query request, including the dimension table information and fact table information to be queried and the specific type of query request.
With continued reference to Fig. 3, in another embodiment the CPU further includes cache module 125 and query module 127. Cache module 125 is configured to cache historical query data. Query module 127 is configured to check, when a data query request is received, whether data corresponding to the data query request is stored in cache module 125.
Dimension filtering module 124 is started when cache module 125 does not store data corresponding to the data query request; aggregation module 142 is further configured to send the requested data obtained by grouped aggregation to the cache module for storage.
Cache module 125 caches historical query data. After parser 123 parses the service API to obtain the data query request, the cache is first checked for data corresponding to the data query request. If such data exists, the corresponding data in cache module 125 is returned directly.
If cache module 125 does not store data corresponding to the data query request, the dimension filtering module is started; it filters the dimension table according to the data query request, obtains the filtered data expressed as key-value pairs, and sends the key-value pair data to the GPU.
Meanwhile, after aggregation module 142 of the GPU obtains the requested data by grouped aggregation, it also sends the requested data to cache module 125 for storage.
Although the GPU has high computational efficiency, launching GPU programs and transferring data between the GPU and the CPU are costly. The OLAP-based data query server of this embodiment places a cache module for historical query data on the CPU. When a data query request is obtained by parsing, the CPU's cache module is first searched for data corresponding to the request, and the subsequent query operations are executed only when the cache module holds no such data. This avoids repeated computation and improves the efficiency of data queries.
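The check-then-compute flow described above can be sketched as a small wrapper around the expensive query path. The class name CachedQueryServer, the tuple used as the request key and the stand-in query function are assumptions; the text does not specify how requests are keyed or when cached entries expire.

```python
from typing import Callable, Dict, Hashable


class CachedQueryServer:
    """Serve repeated requests from a CPU-side cache; run the query only on a miss."""

    def __init__(self, run_query: Callable[[Hashable], dict]):
        self._run_query = run_query              # stands in for dimension filtering + GPU work
        self._cache: Dict[Hashable, dict] = {}
        self.misses = 0

    def query(self, request: Hashable) -> dict:
        if request in self._cache:               # hit: no GPU launch, no transfer
            return self._cache[request]
        self.misses += 1
        result = self._run_query(request)        # miss: run the full query path
        self._cache[request] = result            # store the aggregated result for later
        return result


server = CachedQueryServer(lambda req: {"request": req, "total": 42})
first = server.query(("rollup", "sales", "region"))
second = server.query(("rollup", "sales", "region"))
```

In this sketch the second identical request never reaches the query function, which is the saving the paragraph above describes.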
In another embodiment, the CPU further includes write-back module 126, configured to receive a write-back request and to modify, according to the write-back request, the requested data obtained by grouped aggregation, in support of hypothetical analysis such as "what-if".
As OLAP is applied in broader and deeper fields, simple analytical queries can no longer meet users' decision support requirements, such as forecasting, planning and budgeting queries. Traditional OLAP is read-dominated, with data refreshed periodically. User-driven planning queries, however, need to modify data, as in what-if analysis. The system therefore provides write-back module 126, which lets users modify aggregated data in real time. These modifications are not actually applied to the base data set but are stored in the write-back module; the user's query result is produced by combining the real result with the user's write-back.
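One way to picture this is an overlay: user edits are kept apart from the real aggregated result and merged only when a result is read. The sketch below assumes write-backs are keyed by result cell and simply override the real value; the class WriteBackLayer and that merge rule are illustrative assumptions, not the patent's stated design.

```python
from typing import Dict, Hashable


class WriteBackLayer:
    """Hold what-if edits separately and combine them with the real result on read."""

    def __init__(self, aggregated: Dict[Hashable, float]):
        self._aggregated = aggregated                 # real query result, never modified here
        self._overrides: Dict[Hashable, float] = {}   # user write-backs

    def write(self, cell: Hashable, value: float) -> None:
        if cell not in self._aggregated:
            raise KeyError(cell)
        self._overrides[cell] = value

    def result(self) -> Dict[Hashable, float]:
        # Overrides win over real values; the base data stays untouched.
        return {**self._aggregated, **self._overrides}


real = {"North": 100.0, "East": 80.0}
layer = WriteBackLayer(real)
layer.write("East", 95.0)
what_if = layer.result()
```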
In one embodiment, an OLAP-based data query system is provided which, as shown in Fig. 4, includes presentation layer 20, storage layer 30 and the above OLAP-based data query server 10.
Presentation layer 20 is configured to provide an entry point for inputting data query requests, to encapsulate the data query request input by the user as the corresponding service API and send the service API to the CPU, and to display the requested data returned by the query.
To make the system easier to use and operate, this embodiment uses EXCEL as the OLAP client. EXCEL is a widely used spreadsheet tool, favored by users for its simple operation and powerful functions. However, EXCEL has inherent shortcomings: its strength is data presentation, yet most users use it for data storage, and when data volumes are large and files are numerous, the often-cited "EXCEL HELL" arises. Adding a data management server behind EXCEL overcomes these problems and further improves EXCEL's performance and ease of use. EXCEL is also highly programmable: with VBA or other plug-in development tools its functionality is easily extended, so that EXCEL serves as a front-end presentation tool for the OLAP data query system while still making use of EXCEL's data presentation features. In other embodiments, access via the Web is also supported.
Storage layer 30 is configured to store OLAP-related data, the related data including dimension tables and fact tables. The metadata module is configured to load from the storage layer the dimension table corresponding to the dimension table information to be queried; the cube data module is configured to store the cube data corresponding to the fact table information.
In one embodiment, the data query server further includes an ETL module, configured to obtain data from the production data in real time and convert it into the system's data storage format.
ETL (Extract-Transform-Load) is the process of extracting, transforming and loading data to populate and update a data warehouse. In real-time OLAP, the OLAP data must stay as consistent as possible with the production data, so the ETL tool is particularly important.
Real-time ETL puts heavy pressure on both the production database and the OLAP system: the production database must send new data in real time, and the OLAP system must process the new data in real time. To this end, a region of CPU memory is set aside specifically to store new incremental data. When a user query is executed, the GPU and the CPU process it simultaneously but operate on different data, and the two results are finally merged to produce the final result. As the incremental data grows and exceeds a certain threshold, the OLAP system flushes the incremental data into the base data stored on the GPU.
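The incremental scheme above can be sketched with two buffers: a base set standing in for the data resident on the GPU and a delta set held in CPU memory. The class IncrementalStore, the row-count threshold and the use of a plain sum are assumptions; merging two partial results by addition is only valid for additive aggregates such as sum or count.

```python
from typing import List


class IncrementalStore:
    """Base data (GPU-resident in the patent) plus a CPU-side buffer for new rows."""

    def __init__(self, threshold: int):
        self.base: List[float] = []    # stands in for the data stored on the GPU
        self.delta: List[float] = []   # newly arrived rows buffered in CPU memory
        self.threshold = threshold

    def ingest(self, value: float) -> None:
        self.delta.append(value)
        if len(self.delta) > self.threshold:
            # Buffer exceeded the threshold: flush it into the base data.
            self.base.extend(self.delta)
            self.delta.clear()

    def total(self) -> float:
        gpu_part = sum(self.base)      # would run on the GPU
        cpu_part = sum(self.delta)     # would run on the CPU at the same time
        return gpu_part + cpu_part     # merge the two partial results


store = IncrementalStore(threshold=2)
store.ingest(1.0)
store.ingest(2.0)
before_flush = (list(store.base), list(store.delta), store.total())
store.ingest(3.0)
after_flush = (list(store.base), list(store.delta), store.total())
```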
The OLAP-based data query system of this embodiment also provides API functions for client programs to call. The API mainly encapsulates common OLAP operations, chiefly roll-up, drill-down, dice, slice and pivot, as well as support for what-if analysis.
The roll-up (Roll-up) function is: Void** Rollup(string* aggfunction, string* measures, string* groupbys, string* orderbys, string* tables, string* where), where aggfunction is the aggregate function for the roll-up operation, such as sum (multiple functions are supported); measures is the measure field the aggregate function applies to; groupbys is the grouping field; orderbys is the sort field; tables is the tables involved; and where is the WHERE conditions involved, which may use =, >, <, >=, <=, between and in, with conditions joined by AND.
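To illustrate how WHERE conditions of this kind combine, the sketch below evaluates a list of conditions joined by AND against a row. The (column, operator, operand) tuple format is an assumption, since the patent passes the conditions as strings, and between is assumed to be inclusive at both ends as in SQL.

```python
import operator
from typing import Any, Callable, Dict, List, Tuple

# The seven operators named in the text.
OPS: Dict[str, Callable[[Any, Any], bool]] = {
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "between": lambda v, bounds: bounds[0] <= v <= bounds[1],
    "in": lambda v, options: v in options,
}

Condition = Tuple[str, str, Any]  # (column, operator, operand); hypothetical format


def matches(row: Dict[str, Any], conditions: List[Condition]) -> bool:
    """A row qualifies only if every condition holds (conditions are AND-ed)."""
    return all(OPS[op](row[col], operand) for col, op, operand in conditions)


rows = [
    {"year": 2014, "region": "North", "sales": 50},
    {"year": 2015, "region": "North", "sales": 10},
    {"year": 2016, "region": "South", "sales": 30},
    {"year": 2016, "region": "East", "sales": 5},
]
conditions = [
    ("year", "between", (2015, 2016)),
    ("region", "in", {"North", "East"}),
    ("sales", ">=", 10),
]
selected = [r for r in rows if matches(r, conditions)]
```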
Drill-down: the OLAP system does not store aggregated data but obtains it by real-time computation, so a drill-down operation is itself a roll-up operation. The drill-down function is:
Void** drilldown(string* aggfunction, string* measures, string* drilldown), where aggfunction is the drill-down function; measures is the measure field for the drill-down; and drilldown is the dimension level information for the drill-down, that is, on which value of which level of which dimension the drill-down is performed.
The dice (Dicing) operation obtains a portion of the data and is equivalent to a where condition. The dice function is: Void** Dicing(string* measures, string* dimensions1, string* dimensions2), where measures is the measure field to obtain; dimensions1 is the combination of dimension attributes giving the lower bound of the dice; and dimensions2 is the combination of dimension attributes giving the upper bound of the dice.
The slice (Slicing) operation is a special case of the dice operation: a dice is a range filter over multiple dimensions, whereas a slice is an equality filter on one dimension. The slice function is: Void** Dicing(string* measures, string* dimensions), where measures is the measure field to obtain and dimensions is the combination of dimension attributes for the slice.
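The relationship between the two operations can be made concrete with a small sketch: dice keeps rows inside a lower and upper bound on each chosen dimension, and slice is expressed as a dice whose bounds coincide on a single dimension. The function names, the dictionary-based bounds and the inclusive range are assumptions for illustration, not the signatures given above.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def dice(rows: List[Row], lower: Dict[str, Any], upper: Dict[str, Any]) -> List[Row]:
    """Range filter: keep rows with lower[d] <= row[d] <= upper[d] for every diced dimension d."""
    return [r for r in rows if all(lower[d] <= r[d] <= upper[d] for d in lower)]


def slice_(rows: List[Row], dimension: str, value: Any) -> List[Row]:
    """Equality filter on one dimension, written as a dice with equal bounds."""
    return dice(rows, {dimension: value}, {dimension: value})


rows = [
    {"year": 2014, "month": 12, "sales": 4.0},
    {"year": 2015, "month": 3, "sales": 6.0},
    {"year": 2015, "month": 9, "sales": 8.0},
    {"year": 2016, "month": 5, "sales": 2.0},
]
diced = dice(rows, {"year": 2015, "month": 1}, {"year": 2016, "month": 6})
sliced = slice_(rows, "year", 2015)
```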
When the above OLAP-based data query system receives a data query request, it filters the dimension table data on the CPU side and performs fact table data filtering and grouped aggregation on the GPU side. Because the GPU offers high performance and efficiency, using its computing resources to accelerate OLAP queries effectively improves OLAP processing efficiency.
In one embodiment, an OLAP-based data query method is also provided. The method is implemented on the OLAP-based data query server and, as shown in Fig. 5, includes the following steps:
S502: The CPU obtains a data query request, the data query request including the dimension table information and fact table information to be queried.
S504: The CPU loads the dimension table corresponding to the dimension table information to be queried.
S506: The CPU filters the dimension table according to the data query request, obtains the filtered data expressed as key-value pairs, and sends the key-value pair data to the GPU.
S508: The GPU filters the cube data that is stored in the GPU and corresponds to the fact table information, according to the key-value pair data, to obtain filtered data.
Specifically, for each element of the fact table column corresponding to a dimension, it is determined whether the element satisfies the query condition in the data query request; if so, the corresponding position is set to 1, otherwise to 0, thereby obtaining the filtered data.
S510: The GPU performs grouped aggregation on the filtered data to obtain the requested data.
Grouped aggregation means aggregating the filtered data according to the command of the data query request, whose types include drill-down, roll-up, slice, dice and pivot. The aggregation module performs the grouped aggregation corresponding to the type of query request to obtain the requested data.
With the above OLAP-based data query method, when a data query request is received, dimension table data is filtered on the CPU side, and fact table data filtering and grouped aggregation are performed on the GPU side. Because the GPU offers high performance and efficiency, using its computing resources to accelerate OLAP queries effectively improves OLAP processing efficiency.
As shown in Fig. 6, step S502 includes the following steps:
S5021: The CPU obtains the service API.
The service API (Application Programming Interface) refers to functions predefined by the system. When the front end receives a query request, it encapsulates the query request as the corresponding service API.
S5022: The CPU verifies whether the service API conforms to the specification and whether the dimensions and dimension hierarchy information encapsulated in the service API are correct. If the verification passes, step S5023 is executed.
When the service API corresponding to a data query request is received, the CPU verifies whether the service API conforms to the specification. Specifically, it verifies whether the query instruction encapsulated in the service API, which corresponds to the data query request, conforms to the specification. The types of query request include roll-up, drill-down, slice, dice and pivot; the CPU verifies whether the query request encapsulated in the service API meets the requirements of the corresponding query type and whether the query statement contains errors.
The column fields of a dimension table can organize information into a structure of different levels. The CPU verifies whether the dimensions and dimension hierarchy information in the query command of the service API are correct, that is, whether they are consistent with the information in the dimension table. If they are inconsistent, the query command is wrong and the query cannot proceed.
S5023: The CPU parses the service API to obtain the data query request.
Parsing the service API yields the data query request, which includes the dimension table information to be queried, the fact table information, and the query type, such as drill-down.
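The verification and parsing of steps S5022 and S5023 can be sketched as follows. This is only an assumed shape: the patent does not specify how a service API call is represented, so the dictionary layout, the DIMENSION_LEVELS metadata and the level names other than d_year and p_brand1 are hypothetical.

```python
ALLOWED_OPERATIONS = {"roll-up", "drill-down", "slice", "dice", "pivot"}

# Hypothetical metadata: hierarchy levels available in each dimension table.
DIMENSION_LEVELS = {
    "date": ["d_year", "d_yearmonth", "d_datekey"],
    "part": ["p_mfgr", "p_category", "p_brand1"],
}

def validate(service_call):
    """S5022: return a list of problems; an empty list means verification passed."""
    problems = []
    if service_call.get("operation") not in ALLOWED_OPERATIONS:
        problems.append("unknown operation")
    for dimension, level in service_call.get("levels", {}).items():
        if dimension not in DIMENSION_LEVELS:
            problems.append(f"unknown dimension: {dimension}")
        elif level not in DIMENSION_LEVELS[dimension]:
            problems.append(f"unknown level {level} in dimension {dimension}")
    return problems

def parse(service_call):
    """S5023: turn a verified call into a data query request."""
    problems = validate(service_call)
    if problems:
        raise ValueError("; ".join(problems))
    return {
        "type": service_call["operation"],
        "fact_table": service_call["fact_table"],
        "dimension_levels": dict(service_call["levels"]),
    }

good_call = {"operation": "roll-up", "fact_table": "lineorder",
             "levels": {"date": "d_year", "part": "p_brand1"}}
bad_call = {"operation": "roll-up", "fact_table": "lineorder",
            "levels": {"date": "d_week"}}
request = parse(good_call)
bad_problems = validate(bad_call)
```

A call whose level does not exist in the dimension table metadata is rejected before any filtering or GPU work starts, which matches the statement that an inconsistent query command cannot proceed.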
Still referring to Fig. 6, after step S5023 the method further includes:
S5024: The CPU checks whether data corresponding to the data query request is stored in the cache. If so, step S5025 is executed; if not, step S504 is executed.
S5025: The CPU returns the data corresponding to the data query request from the cache.
Historical query data is stored in the cache. After the service API has been parsed and the data query request obtained, the cache is first checked for data corresponding to the data query request. If such data exists, the corresponding data in cache module 125 is returned directly.
If no data corresponding to the data query request is stored in the cache module, the CPU filters the dimension table according to the data query request, obtains the filtered data expressed as key-value pairs, and sends the key-value pair data to the GPU.
In addition, after step S506 the method further includes S512: the GPU sends the requested data obtained by grouped aggregation to the CPU, where it is stored in the cache.
Although the GPU has high computational efficiency, launching a GPU program and transferring data between the GPU and the CPU are costly. The OLAP-based data query server of this embodiment therefore provides a cache module on the CPU for caching historical query data. Once a data query request has been parsed, the server first checks whether the CPU's cache module holds data corresponding to the request, and executes the subsequent query operations only when it does not. This avoids repeated computation and improves the efficiency of data queries.
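The cache check of steps S5024, S5025 and S512 can be illustrated with the short sketch below. The class QueryCache, the function answer and the stand-in fake_gpu_query are hypothetical names; the patent does not describe how cache keys are built, so keying on the sorted request fields is an assumption.

```python
class QueryCache:
    """CPU-side cache of historical query results, keyed by the query request."""

    def __init__(self):
        self._results = {}

    @staticmethod
    def _key(request):
        # Sort the fields so that equal requests always map to the same key.
        return tuple(sorted((name, repr(value)) for name, value in request.items()))

    def lookup(self, request):
        """Return (True, result) on a hit and (False, None) on a miss."""
        key = self._key(request)
        if key in self._results:
            return True, self._results[key]
        return False, None

    def store(self, request, result):
        self._results[self._key(request)] = result


def answer(request, cache, run_on_gpu):
    hit, result = cache.lookup(request)   # S5024: check the cache first
    if hit:
        return result                     # S5025: return the cached data
    result = run_on_gpu(request)          # S504 to S510: full query path
    cache.store(request, result)          # S512: keep the result for later
    return result


calls = []

def fake_gpu_query(request):
    """Stand-in for the costly GPU path; records how often it is invoked."""
    calls.append(request)
    return {"ASIA": 150}

cache = QueryCache()
request = {"type": "slice", "s_region": "ASIA"}
first = answer(request, cache, fake_gpu_query)
second = answer(request, cache, fake_gpu_query)
```

In this sketch the second identical request is served from the cache, so the expensive path runs only once.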
Specifically, the types of query request include roll-up, drill-down, slice, dice and pivot. Drill-down is a special case of the roll-up operation; the OLAP system converts a drill-down operation into a roll-up operation.
Roll-up is essentially an aggregation operation, similar to the GROUP BY operation in SQL.
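One way to read the statement that drill-down is converted into roll-up is that both are the same grouped aggregation applied at different hierarchy levels. The sketch below illustrates that reading; the roll_up function, the d_month level and the sample rows are assumptions, not taken from the patent.

```python
from collections import defaultdict

def roll_up(rows, levels, measure):
    """Sum `measure` grouped by the given hierarchy levels, like GROUP BY."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[level] for level in levels)] += row[measure]
    return dict(totals)

# Hypothetical detail rows with a two-level date hierarchy.
rows = [
    {"d_year": 1992, "d_month": 1, "lo_revenue": 10},
    {"d_year": 1992, "d_month": 2, "lo_revenue": 20},
    {"d_year": 1993, "d_month": 1, "lo_revenue": 30},
]

# Roll-up to the year level.
by_year = roll_up(rows, ["d_year"], "lo_revenue")
# "Drill-down" from year to month, expressed as a roll-up of the
# detail rows at the finer (year, month) level.
by_month = roll_up(rows, ["d_year", "d_month"], "lo_revenue")
```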
The query request command of one embodiment is as follows:
select sum(lo_revenue), d_year, p_brand1
from lineorder, date, part, supplier
where lo_orderdate = d_datekey
and lo_partkey = p_partkey
and lo_suppkey = s_suppkey
and p_brand1 between 'MFGR#2221' and 'MFGR#2228'
and s_region = 'ASIA'
group by d_year, p_brand1
order by d_year, p_brand1;
Specifically, according to the query request, the query includes the following steps:
S1: The CPU obtains the set of p_partkey values {1114, 1587, 3631, 4378, ...} from the part table according to the condition p_brand1 between 'MFGR#2221' and 'MFGR#2228'. The process is shown in Fig. 7.
S2: According to the condition s_region = 'ASIA', the set of s_suppkey values {11, 12, 15, 23, 26, 27, ...} is obtained from the supplier table, as shown in Fig. 8.
S3: The two sets and the two key-value tables (datekey, d_year) and (partkey, p_brand1) are sent from the CPU to the GPU. Non-integer fields must be normalized in this step; for example, p_brand1 is character data here and must be encoded. The data conversion process and the form of the transmitted key-value pairs are shown in Fig. 9.
S4: The GPU launches a kernel program to perform the data filtering. The filter algorithm is roughly as follows: the fact table columns stored on the GPU (lo_revenue, d_year and p_brand1) are filtered by checking whether each element of the two columns p_partkey and s_suppkey satisfies the condition; if it does, the corresponding flag is set to 1, otherwise to 0. A prefix sum scan then yields the filtered data. The process is shown in Fig. 10.
S5: The filtered d_year and p_brand1 columns are linearized according to a certain rule, followed by sorting, prefix scan, reduction and other steps, which finally produce the result. The process is shown in Fig. 11.
S6: The linearized, sorted result is de-linearized and returned from the GPU side to the CPU side, and is finally displayed on the front end, as shown in Fig. 12.
In this way, the complicated and time-consuming star join operation is avoided, while the GPU's capability for aggregation computation is exploited.
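Steps S1 to S6 can be traced end to end with the sequential Python sketch below. It only imitates the data flow: the patent does not state the encoding or linearization rule, so the dictionary encoding of p_brand1 and the rule key = year_code * number_of_brands + brand_code are assumptions, as are the function name run_query and the toy tables.

```python
from itertools import groupby

def run_query(part, supplier, date, fact):
    # S1: part keys whose brand lies in the requested range, kept as key -> brand.
    part_kv = {k: b for k, b in part if "MFGR#2221" <= b <= "MFGR#2228"}
    # S2: supplier keys in the requested region.
    supp_keys = {k for k, region in supplier if region == "ASIA"}
    # S3: encode the character field p_brand1 as integer codes before "transfer".
    brands = sorted(set(part_kv.values()))
    brand_code = {b: i for i, b in enumerate(brands)}
    date_kv = dict(date)                      # d_datekey -> d_year
    years = sorted(set(date_kv.values()))
    year_code = {y: i for i, y in enumerate(years)}
    # S4: one 0/1 flag per fact row (one GPU thread per row in a real kernel).
    flags = [1 if pk in part_kv and sk in supp_keys else 0
             for (_, pk, sk, _) in fact]
    # S5: linearize (d_year, p_brand1) into one integer key, sort, then reduce.
    pairs = sorted(
        (year_code[date_kv[od]] * len(brands) + brand_code[part_kv[pk]], rev)
        for (od, pk, sk, rev), flag in zip(fact, flags) if flag
    )
    reduced = [(key, sum(rev for _, rev in group))
               for key, group in groupby(pairs, key=lambda pair: pair[0])]
    # S6: de-linearize the keys back into (d_year, p_brand1).
    return [(total, years[key // len(brands)], brands[key % len(brands)])
            for key, total in reduced]

# Hypothetical toy tables.
part = [(1, "MFGR#2221"), (2, "MFGR#2225"), (3, "MFGR#1101")]
supplier = [(10, "ASIA"), (11, "EUROPE")]
date = [(19920101, 1992), (19930101, 1993)]
# Fact rows: (lo_orderdate, lo_partkey, lo_suppkey, lo_revenue)
fact = [
    (19920101, 1, 10, 100),
    (19920101, 1, 10, 50),
    (19920101, 2, 11, 70),   # supplier not in ASIA
    (19930101, 2, 10, 30),
    (19930101, 3, 10, 90),   # brand outside the range
    (19920101, 2, 10, 20),
]
rows = run_query(part, supplier, date, fact)
```

Because the linearized key orders first by year and then by brand code, sorting on it also produces the order requested by the ORDER BY clause, and no join between the fact table and the dimension tables is materialized.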
Dicing and slicing obtain a part of the data, which is equivalent to processing a WHERE condition.
The dicing query command of one embodiment is:
select d_datekey, lo_revenue
from lineorder, supplier
where lo_suppkey = s_suppkey
and s_region = 'ASIA';
Specifically, according to the query request, the query includes the following steps:
S1: The set of corresponding s_suppkey values is obtained according to the condition s_region = 'ASIA'.
S2: Each thread determines whether the s_suppkey value of the column is in the set, using either a sequential search or, after sorting, a binary search.
S3: The obtained set is sent to the GPU.
S4: A bit vector is generated; the position corresponding to each row that satisfies the condition is set to 1, otherwise to 0.
S5: A prefix sum over the bit vector determines the final result size n.
S6: The GPU is launched again and space of size 2n is allocated.
S7: The d_datekey and lo_revenue values whose flag bit is 1 are placed into the new space.
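The bit vector, prefix sum and compaction of steps S2 and S4 to S7 can be sketched as follows. This is a sequential Python imitation of what would be parallel GPU kernels; the function names and sample columns are hypothetical, and reading the 2n space as n slots for each of the two output columns is an interpretation rather than something the patent states.

```python
from bisect import bisect_left
from itertools import accumulate

def in_sorted(sorted_keys, key):
    """S2: binary search for `key` in a sorted key list."""
    i = bisect_left(sorted_keys, key)
    return i < len(sorted_keys) and sorted_keys[i] == key

def compact(flags, *columns):
    """S5 to S7: use an exclusive prefix sum to scatter flagged rows."""
    # offsets[i] is the output slot of row i if its flag is 1.
    offsets = [0] + list(accumulate(flags))[:-1] if flags else []
    n = sum(flags)                                 # final result size
    outputs = [[None] * n for _ in columns]        # n slots per output column
    for i, flag in enumerate(flags):               # one GPU thread per row
        if flag:
            for out, column in zip(outputs, columns):
                out[offsets[i]] = column[i]
    return n, outputs

# Hypothetical sample data.
asia_suppkeys = [11, 12, 15]                       # S1, already sorted
lo_suppkey = [11, 13, 15, 12, 99]
d_datekey = [19920101, 19920102, 19920103, 19920104, 19920105]
lo_revenue = [10, 20, 30, 40, 50]

# S4: bit vector with 1 for rows whose supplier is in the set.
flags = [1 if in_sorted(asia_suppkeys, k) else 0 for k in lo_suppkey]
n, (out_datekey, out_revenue) = compact(flags, d_datekey, lo_revenue)
```

Computing the offsets with a prefix sum lets every row write to its own slot without coordination, which is why the result size must be known before the output space is allocated.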
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The embodiments described above express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. An OLAP-based data query server, comprising a CPU and a GPU;
The CPU comprises a request module, a metadata module and a dimension filter module; the GPU comprises a cube filter module, an aggregation module and a cube data module;
The request module is configured to obtain a data query request, the data query request comprising dimension table information to be queried and fact table information;
The metadata module is configured to load the dimension table corresponding to the dimension table information to be queried, the dimension table comprising detailed information related to specified attributes in the fact table;
The dimension filter module is configured to filter the dimension table according to the data query request, obtain filtered key-value pair data expressed as key-value pairs, and send the key-value pair data to the GPU;
The cube data module is configured to store cube data corresponding to the fact table information;
The cube filter module is configured to filter the cube data according to the key-value pair data to obtain filtered data;
The aggregation module is configured to perform grouped aggregation on the filtered data to obtain the requested data.
2. The OLAP-based data query server according to claim 1, characterized in that the dimension table information comprises dimensions and dimension hierarchy information; the request module comprises a validator and a parser; the validator and the parser are each connected to the metadata module;
The validator is configured to obtain a service API and to verify whether the service API conforms to the specification and whether the dimensions and dimension hierarchy information encapsulated in the service API are correct;
The parser is configured to parse the service API after the validator's verification has passed, to obtain the data query request.
3. The OLAP-based data query server according to claim 2, characterized in that the types of the data query request comprise roll-up, drill-down, slice, dice and pivot.
4. The OLAP-based data query server according to claim 2, characterized in that the CPU further comprises a cache module and a query module,
The cache module is configured to cache historical query data;
The query module is configured to, when a data query request is received, check whether data corresponding to the data query request is stored in the cache module;
The dimension filter module is activated when the cache module does not store data corresponding to the data query request; the aggregation module is further configured to send the requested data obtained by grouped aggregation to the cache module for storage.
5. The OLAP-based data query server according to claim 1, characterized in that the CPU further comprises a write-back module configured to receive a write-back request and to modify, according to the write-back request, the requested data obtained by grouped aggregation.
6. An OLAP-based data query system, comprising a presentation layer, a storage layer and the OLAP-based data query server according to any one of claims 1 to 5;
The presentation layer is configured to provide an input entry for data query requests, to encapsulate the data query request entered by a user as the corresponding service API and send the service API to the CPU, and to display the requested data obtained by the query;
The storage layer is configured to store OLAP-related data, the related data comprising dimension tables and fact tables; the metadata module is configured to load, from the storage layer, the dimension table corresponding to the dimension table information to be queried; the cube data module is configured to store cube data corresponding to the fact table information.
7. An OLAP-based data query method, comprising:
The CPU obtains a data query request, the data query request comprising dimension table information to be queried and fact table information;
The CPU loads the dimension table corresponding to the dimension table information to be queried, the dimension table comprising detailed information related to specified attributes in the fact table;
The CPU filters the dimension table according to the data query request, obtains filtered key-value pair data expressed as key-value pairs, and sends the key-value pair data to the GPU;
The GPU filters, according to the key-value pair data, the cube data that is stored on the GPU and corresponds to the fact table information, to obtain filtered data;
The GPU performs grouped aggregation on the filtered data to obtain the requested data.
8. The OLAP-based data query method according to claim 7, characterized in that the step in which the CPU obtains the data query request comprises:
The CPU obtains a service API;
The CPU verifies whether the service API conforms to the specification and whether the dimensions and dimension hierarchy information encapsulated in the service API are correct;
When the verification passes, the CPU parses the service API to obtain the data query request.
9. The OLAP-based data query method according to claim 8, characterized in that the types of the data query request comprise roll-up, drill-down, slice, dice and pivot.
10. The OLAP-based data query method according to claim 8, characterized in that, after the step of parsing the service API to obtain the data query request when the verification passes, the method further comprises: the CPU checks whether data corresponding to the data query request is stored in a cache;
If so, the CPU returns the data corresponding to the data query request from the cache;
If not, the step in which the CPU filters the dimension table according to the data query request, obtains the filtered key-value pair data expressed as key-value pairs, and sends the key-value pair data to the GPU is executed;
After the step in which the GPU performs grouped aggregation on the filtered data to obtain the requested data, the method further comprises:
The GPU sends the requested data obtained by grouped aggregation to the CPU, where it is stored in the cache.
CN201610543412.9A 2016-07-11 2016-07-11 Data query server based on OLAP, system and method Expired - Fee Related CN106202408B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610543412.9A CN106202408B (en) 2016-07-11 2016-07-11 Data query server based on OLAP, system and method

Publications (2)

Publication Number Publication Date
CN106202408A CN106202408A (en) 2016-12-07
CN106202408B true CN106202408B (en) 2019-10-18

Family

ID=57476968

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610543412.9A Expired - Fee Related CN106202408B (en) 2016-07-11 2016-07-11 Data query server based on OLAP, system and method

Country Status (1)

Country Link
CN (1) CN106202408B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165248B (en) * 2018-01-29 2019-09-03 北京数聚鑫云信息技术有限公司 A kind of management system and management method based on API
CN110442653B (en) * 2019-07-03 2023-09-29 平安科技(深圳)有限公司 Method, device, server and storage medium for incrementally constructing CUBE model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103309958A (en) * 2013-05-28 2013-09-18 中国人民大学 OLAP star connection query optimizing method under CPU and GPU mixing framework
CN104246717A (en) * 2012-05-08 2014-12-24 文雅科一番株式会社 Data processing system, server, client, and program for managing data
CN104866608A (en) * 2015-06-05 2015-08-26 中国人民大学 Query optimization method based on join index in data warehouse

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8965866B2 (en) * 2009-12-17 2015-02-24 Business Objects Software Limited Optimizing data transfer time on graphics processor units
US10353923B2 (en) * 2014-04-24 2019-07-16 Ebay Inc. Hadoop OLAP engine

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GPU-Based Aggregation of On-Line Analytical Processing;Guilan Wang et al.;《Communications and Information Processing》;20121231;第234-245页 *

Also Published As

Publication number Publication date
CN106202408A (en) 2016-12-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20191018

Termination date: 20210711