CN106202408B - Data query server based on OLAP, system and method - Google Patents
- Publication number
- CN106202408B (application number CN201610543412.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- module
- inquiry request
- request
- cpu
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/24569—Query processing with adaptation to specific hardware, e.g. adapted for using GPUs or SSDs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/283—Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
Abstract
The present invention relates to an OLAP-based data query server, system and method. The server includes a CPU and a GPU. On the CPU, a request acquisition module obtains a data query request that includes the dimension table information and fact table information to be queried; a metadata module loads the dimension table corresponding to the dimension table information to be queried; and a dimension filter module filters the dimension table according to the data query request, obtains the filtered key-value pair data expressed as key-value pairs, and sends the key-value pair data to the GPU. On the GPU, a cube data module stores the cube data corresponding to the fact table information; a cube filter module filters the cube data according to the key-value pair data to obtain filtered data; and an aggregation module groups and aggregates the filtered data to obtain the requested data. Using the GPU to accelerate OLAP queries effectively improves OLAP processing efficiency.
Description
Technical field
The present invention relates to the field of database technology, and in particular to an OLAP-based data query server, system and method.
Background
OLAP (Online Analytical Processing) is an important means of data analysis that supports business decision-making. As data volumes keep growing and users demand ever higher performance, performance problems become more prominent. Meanwhile, OLAP usually analyzes historical data with a certain time lag, which no longer meets the need of today's commercial enterprises to capture fleeting business opportunities; enterprises need to analyze the latest data rather than historical data.
Existing OLAP systems perform data queries on the CPU. Over the past 10 years, general-purpose CPU technology has made great progress, but performance has improved more and more slowly, and single-threaded program performance is heavily constrained. On the one hand, instruction-level parallelism in general-purpose programs is too low; on the other hand, the CPU is limited by the power wall (Power Wall), the memory wall (Memory Wall) and the frequency wall (Frequency Wall), so its performance is hard to keep improving. Processors will not get faster, only wider. In current processor designs, most transistors are used to build cache (Cache) rather than computing units. Although this keeps processor power consumption within a reasonable range, it hinders further performance improvement. CPU-based OLAP systems therefore struggle to meet the demand for efficient query performance.
Summary of the invention
In view of this, it is necessary to provide an OLAP-based data query server, system and method with high query efficiency.
An OLAP-based data query server includes a CPU and a GPU.
The CPU includes a request acquisition module, a metadata module and a dimension filter module; the GPU includes a cube filter module, an aggregation module and a cube data module.
The request acquisition module is configured to obtain a data query request; the data query request includes the dimension table information and fact table information to be queried.
The metadata module is configured to load the dimension table corresponding to the dimension table information to be queried.
The dimension filter module is configured to filter the dimension table according to the data query request, obtain the filtered key-value pair data expressed as key-value pairs, and send the key-value pair data to the GPU.
The cube data module is configured to store the cube data corresponding to the fact table information.
The cube filter module is configured to filter the cube data according to the key-value pair data to obtain filtered data.
The aggregation module is configured to group and aggregate the filtered data to obtain the requested data.
In one embodiment, the dimension table information includes dimensions and dimension hierarchy information; the request acquisition module includes a validator and a parser, each connected to the metadata module.
The validator is configured to obtain a service API and to verify whether the service API conforms to the specification and whether the dimensions and dimension hierarchy information encapsulated in the service API are correct.
The parser is configured to parse the service API after the validator's verification passes, obtaining the data query request.
In one embodiment, the types of data query request include roll-up, drill-down, slice, dice and pivot.
In one embodiment, the CPU further includes a cache module and a query module.
The cache module is configured to cache historical query data.
The query module is configured to check, when a data query request is received, whether the cache module holds data corresponding to the data query request.
The dimension filter module is started when the cache module holds no data corresponding to the data query request; the aggregation module is further configured to send the requested data obtained by grouped aggregation to the cache module for storage.
In one embodiment, the CPU further includes a write-back module, configured to receive a write-back request and to modify, according to the write-back request, the requested data obtained by grouped aggregation.
An OLAP-based data query system includes a presentation layer, a storage layer and the OLAP-based data query server described above.
The presentation layer provides an entry point for data query requests, encapsulates the data query request entered by the user into the corresponding service API, sends the service API to the CPU, and displays the requested data returned by the query.
The storage layer stores the OLAP-related data, which includes dimension tables and fact tables. The metadata module loads the dimension table corresponding to the dimension table information to be queried from the storage layer; the cube data module stores the cube data corresponding to the fact table information.
An OLAP-based data query method includes:
the CPU obtains a data query request, the data query request including the dimension table information and fact table information to be queried;
the CPU loads the dimension table corresponding to the dimension table information to be queried;
the CPU filters the dimension table according to the data query request, obtains the filtered key-value pair data expressed as key-value pairs, and sends the key-value pair data to the GPU;
the GPU filters the cube data stored in the GPU that corresponds to the fact table information, according to the key-value pair data, to obtain filtered data;
the GPU groups and aggregates the filtered data to obtain the requested data.
In one embodiment, the step in which the CPU obtains the data query request includes:
the CPU obtains a service API;
the CPU verifies whether the service API conforms to the specification and whether the dimensions and dimension hierarchy information encapsulated in the service API are correct;
when the verification passes, the CPU parses the service API to obtain the data query request.
In one embodiment, the types of data query request include roll-up, drill-down, slice, dice and pivot.
In one embodiment, after the step of parsing the service API to obtain the data query request when the verification passes, the method further includes: the CPU checks whether the cache holds data corresponding to the data query request;
if so, the CPU returns the data corresponding to the data query request from the cache;
if not, the step is executed in which the CPU filters the dimension table according to the data query request, obtains the filtered key-value pair data expressed as key-value pairs, and sends the key-value pair data to the GPU.
After the step in which the GPU groups and aggregates the filtered data to obtain the requested data, the method further includes:
the GPU sends the requested data obtained by grouped aggregation to the CPU, where it is stored in the cache.
When the above OLAP-based data query server receives a data query request, dimension table filtering is performed on the CPU side, and fact table filtering and grouped aggregation are performed on the GPU side. Because GPUs offer high performance and efficiency, using the GPU's computing resources to accelerate OLAP queries effectively improves OLAP processing efficiency.
Brief description of the drawings
Fig. 1 is a functional block diagram of the OLAP-based data query server of one embodiment;
Fig. 2 shows the storage format of the fact table of one embodiment;
Fig. 3 is a functional block diagram of the OLAP-based data query server of another embodiment;
Fig. 4 is a functional block diagram of the OLAP-based data query system of one embodiment;
Fig. 5 is a flowchart of the OLAP-based data query method of one embodiment;
Fig. 6 is a flowchart of the OLAP-based data query method of another embodiment;
Fig. 7 shows the processing of a where clause in the roll-up query operation of one embodiment;
Fig. 8 shows the processing of another where clause in the roll-up query operation of one embodiment;
Fig. 9 shows the reduction of an integer field in the roll-up query operation of one embodiment;
Fig. 10 shows the result data that satisfy the condition, obtained in the roll-up query operation of one embodiment;
Fig. 11 shows the final result obtained in the roll-up query operation of one embodiment;
Fig. 12 shows the final result of the roll-up query operation of one embodiment as displayed in the front end.
Detailed description
To make the purpose, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and do not limit it.
In one embodiment, an OLAP-based data query server is provided; the server includes a CPU (Central Processing Unit) and a GPU (Graphics Processing Unit). Driven by gamers' demanding requirements for graphics, GPU performance has in recent years grown geometrically over time. The GPU has evolved from a dedicated parallel processor composed of several fixed-function units (Fixed Function Unit) into an architecture built mainly on general-purpose computing resources and supplemented by fixed-function units, known as GPGPU (General Purpose GPU, general-purpose graphics processing unit). OLAP is a data-intensive and compute-intensive application; using the GPU to accelerate its computation-heavy, performance-critical operations can effectively improve OLAP processing efficiency.
As shown in Fig. 1, the OLAP-based data query server 10 includes a CPU 12 and a GPU 14. The CPU 12 includes a request acquisition module 120, a metadata module 122 and a dimension filter module 124. The GPU 14 includes a cube filter module 140, an aggregation module 142 and a cube data module 144.
The request acquisition module 120 is configured to obtain a data query request. The data query request includes the dimension table information and fact table information to be queried.
The metadata module 122 is configured to load the dimension table corresponding to the dimension table information to be queried.
A dimension table holds the details associated with a particular attribute in the fact table, for example a product dimension table holding product details, or a time dimension table for analysis over time. Dimension tables express the ways in which people analyze data and are the basic elements that make up a cube. A dimension table can contain multiple levels, each with multiple members; for example, a dimension table of product information typically divides products into several category levels such as food, beverage and non-consumable goods. The column fields of a dimension table can organize the information into a structure of different levels.
Dimension tables are stored as a struct of column arrays. In one embodiment, the customer dimension can be stored as a struct as follows:
The hierarchy information of a dimension is stored and expressed in a schema file.
The dimension filter module 124 is configured to filter the dimension table according to the data query request, obtain the filtered key-value pair data expressed as key-value pairs, and send the key-value pair data to the GPU.
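The dimension-filtering step can be sketched as follows. This is a minimal illustration in plain Python, not the patented implementation: the table contents, the column names and the `filter_dimension` helper are assumptions; the text only states that the filtered result is expressed as key-value pairs and sent to the GPU.

```python
# Hypothetical product dimension table stored column-wise (names are illustrative).
product_dim = {
    "product_id": [1, 2, 3, 4],
    "category": ["food", "beverage", "food", "non-consumable"],
}

def filter_dimension(dim, key_col, attr_col, predicate):
    """Return {dimension key: attribute value} for rows that satisfy the predicate."""
    return {
        key: value
        for key, value in zip(dim[key_col], dim[attr_col])
        if predicate(value)
    }

# Key-value pairs for the condition category = 'food'.
# In the described server, this dictionary is what would be sent to the GPU.
kv_pairs = filter_dimension(product_dim, "product_id", "category", lambda c: c == "food")
```

Only the surviving keys travel to the GPU, which keeps the transfer small compared with shipping the whole dimension table.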
The cube data module 144 is configured to store the cube data corresponding to the fact table information.
A fact table stores the fact measures and the foreign key values pointing to each dimension; it is the result table produced after aggregating data along certain dimensions, and it is stored on hard disk. When a fact table is queried for the first time, it is loaded from hard disk to obtain the corresponding cube data. The cube data is stored in the global memory of the GPU as a multidimensional array; the cube data of one embodiment is shown in Fig. 2.
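One common way to hold a multidimensional array in a flat memory buffer is row-major addressing. The sketch below assumes that layout purely for illustration; the text does not specify the memory layout, and the dimension sizes and the `flat_index` helper are hypothetical.

```python
# Hypothetical cube with three dimensions: (product, customer, time).
dim_sizes = (4, 3, 2)

def flat_index(coords, sizes):
    """Row-major offset of a cell addressed by one member index per dimension."""
    offset = 0
    for coord, size in zip(coords, sizes):
        if not 0 <= coord < size:
            raise IndexError("coordinate out of range")
        offset = offset * size + coord
    return offset

# One measure value per cell, kept in a single flat buffer
# (a stand-in for a buffer in GPU global memory).
cube = [0.0] * (dim_sizes[0] * dim_sizes[1] * dim_sizes[2])
cube[flat_index((2, 1, 0), dim_sizes)] = 99.5
```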
The cube filter module 140 is configured to filter the cube data according to the key-value pair data to obtain filtered data.
Specifically, for each element of the fact table column that corresponds to the dimension, the module checks whether the element satisfies the condition in the data query request; if it does, the corresponding position is set to 1, otherwise to 0, which yields the filtered data.
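The 1/0 marking described above amounts to building a bitmap over the fact rows. A minimal sketch, written in plain Python as a stand-in for the GPU kernel; the column contents and the `build_bitmap` name are assumptions for illustration.

```python
# Foreign-key column of a hypothetical fact table (one product_id per fact row).
fact_product_id = [1, 2, 3, 1, 4, 3]

# Keys that survived dimension filtering on the CPU (illustrative values).
kv_pairs = {1: "food", 3: "food"}

def build_bitmap(fk_column, kv):
    """Mark each fact row 1 if its foreign key is among the filtered keys, else 0."""
    return [1 if key in kv else 0 for key in fk_column]

bitmap = build_bitmap(fact_product_id, kv_pairs)
print(bitmap)  # [1, 0, 1, 1, 0, 1]
```

Each row is tested independently of the others, which is why this step maps naturally onto one GPU thread per fact row.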
The aggregation module 142 is configured to group and aggregate the filtered data to obtain the requested data.
Grouped aggregation means aggregating the filtered data as instructed by the data query request. The types of data query request include drill-down, roll-up, slice, dice and pivot. The aggregation module performs the grouped aggregation that matches the type of the query request to obtain the requested data.
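As an illustration of grouped aggregation over filtered rows, the following sketch sums a measure per group while skipping rows whose bitmap entry is 0. The data, the choice of sum as the aggregate and the `group_sum` helper are assumptions; a GPU version would typically use a parallel reduction instead of this sequential loop.

```python
from collections import defaultdict

# Illustrative fact columns and a bitmap produced by the cube filter.
region = ["east", "west", "east", "west", "east", "west"]
sales = [10.0, 5.0, 7.5, 2.5, 4.0, 1.0]
bitmap = [1, 0, 1, 1, 0, 1]

def group_sum(keys, values, mask):
    """Sum values per group key, using only rows whose mask entry is 1."""
    totals = defaultdict(float)
    for key, value, keep in zip(keys, values, mask):
        if keep:
            totals[key] += value
    return dict(totals)

totals = group_sum(region, sales, bitmap)
```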
When the above OLAP-based data query server receives a data query request, dimension table filtering is performed on the CPU side, and fact table filtering and grouped aggregation are performed on the GPU side. Because GPUs offer high performance and efficiency, using the GPU's computing resources to accelerate OLAP queries effectively improves OLAP processing efficiency.
In another embodiment, the dimension table information includes dimensions and dimension hierarchy information. As shown in Fig. 3, the request acquisition module 120 includes a validator 121 and a parser 123, each connected to the metadata module 122.
The validator 121 is configured to obtain a service API and to verify whether the service API conforms to the specification and whether the dimensions and dimension hierarchy information encapsulated in the service API are correct.
A service API (Application Programming Interface) is a function predefined by the system. When the front end receives a query request, it encapsulates the query request into the corresponding service API.
When the validator 121 receives the service API corresponding to a data query request, it verifies whether the service API conforms to the specification. Specifically, it verifies whether the query instruction encapsulated in the service API conforms to the specification. The types of query request include roll-up, drill-down, slice, dice and pivot; the validator 121 verifies whether the query request encapsulated in the service API meets the requirements of the corresponding request type and whether the query statement contains errors.
The column fields of a dimension table can organize the information into a structure of different levels. The validator 121 verifies whether the dimensions and the corresponding dimension hierarchy information in the query command of the service API are correct, that is, whether they are consistent with the information in the dimension table. If they are inconsistent, the query command is wrong and the query cannot proceed.
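The hierarchy check performed by the validator can be sketched as a lookup against the loaded metadata. The metadata layout, the request shape and the `validate` function below are hypothetical and only illustrate the consistency check described above.

```python
# Hypothetical metadata: each dimension maps to its ordered hierarchy levels.
metadata = {
    "product": ["family", "category", "name"],
    "time": ["year", "quarter", "month"],
}

def validate(request, meta):
    """Return a list of error messages; an empty list means verification passed."""
    errors = []
    for dim, level in request.get("levels", []):
        if dim not in meta:
            errors.append(f"unknown dimension: {dim}")
        elif level not in meta[dim]:
            errors.append(f"unknown level {level!r} in dimension {dim}")
    return errors

ok = validate({"levels": [("product", "category"), ("time", "year")]}, metadata)
bad = validate({"levels": [("product", "brand"), ("region", "city")]}, metadata)
```

A request is passed on to the parser only when the returned list is empty.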
The parser 123 is configured to parse the service API after the validator's verification passes, obtaining the data query request.
Parsing the service API yields the data query request, including the dimension table information and fact table information to be queried and the specific type of query request.
Still referring to Fig. 3, in another embodiment the CPU further includes a cache module 125 and a query module 127. The cache module 125 caches historical query data. The query module 127 checks, when a data query request is received, whether the cache module 125 holds data corresponding to the data query request.
The dimension filter module 124 is started when the cache module 125 holds no data corresponding to the data query request; the aggregation module 142 is further configured to send the requested data obtained by grouped aggregation to the cache module for storage.
The cache module 125 caches historical query data. After the parser 123 parses the service API and obtains the data query request, the cache is first checked for data corresponding to the data query request. If such data exists, the corresponding data in the cache module 125 is returned directly.
If the cache module 125 holds no data corresponding to the data query request, the dimension filter module is started; it filters the dimension table according to the data query request, obtains the filtered key-value pair data expressed as key-value pairs, and sends the key-value pair data to the GPU.
Meanwhile, after the aggregation module 142 of the GPU obtains the requested data by grouped aggregation, it also sends the requested data to the cache module 125 for storage.
Although the GPU has high computational efficiency, launching a GPU program and transferring data between the GPU and the CPU are costly. The OLAP-based data query server of this embodiment therefore provides a cache module on the CPU for caching historical query data. When a data query request has been parsed, the CPU's cache module is first checked for data corresponding to the request; the subsequent query operations are executed only when the cache module holds no such data. This avoids repeated computation and improves the efficiency of data queries.
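The cache-first flow can be sketched as follows. The `QueryCache` class, the request key format and the `run_on_gpu` stand-in are assumptions for illustration; the text does not describe how cached results are keyed or evicted.

```python
class QueryCache:
    """CPU-side cache of historical query results, keyed by the parsed request."""

    def __init__(self):
        self._store = {}

    def get(self, request_key):
        return self._store.get(request_key)

    def put(self, request_key, result):
        self._store[request_key] = result

stats = {"gpu_calls": 0}

def run_on_gpu(request_key):
    """Stand-in for dimension filtering, cube filtering and grouped aggregation."""
    stats["gpu_calls"] += 1
    return {"total": 42}

def query(cache, request_key):
    cached = cache.get(request_key)
    if cached is not None:
        return cached                # cache hit: no GPU launch, no transfer
    result = run_on_gpu(request_key)
    cache.put(request_key, result)   # keep the aggregated result for later requests
    return result

cache = QueryCache()
first = query(cache, ("rollup", "sales", "year"))
second = query(cache, ("rollup", "sales", "year"))
```

In this sketch the second identical request is served from the cache, so the GPU path runs only once.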
In another embodiment, the CPU further includes a write-back module 126, configured to receive a write-back request and to modify, according to the write-back request, the requested data obtained by grouped aggregation, supporting "what-if" style hypothetical analysis.
As OLAP is applied in broader and deeper fields, simple analytical queries can no longer meet users' decision-support requirements, such as forecasting, planning and budgeting queries. Traditional OLAP operations on data are mainly reads, with data updated periodically. User-driven planning queries, however, need to modify data, as in what-if analysis. The system therefore provides the write-back module 126, which lets users modify aggregated data in real time. These modifications are not actually applied to the base data set; they are stored in the write-back module, and the query result presented to the user is the combination of the real result and the user's write-back data.
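A minimal sketch of the write-back idea, assuming the modifications are kept in a separate overlay that is merged with the real result at read time. The cell keys, values and function names are hypothetical.

```python
# Aggregated result computed from the base data (illustrative values).
actual = {("2016", "food"): 120.0, ("2016", "beverage"): 80.0}

# User modifications held by the write-back module; the base data is never touched.
write_back = {}

def apply_write_back(key, new_value):
    """Record a what-if value for one aggregated cell."""
    write_back[key] = new_value

def read_result():
    """Combine the real result with the user's write-back values."""
    combined = dict(actual)
    combined.update(write_back)
    return combined

apply_write_back(("2016", "food"), 150.0)
view = read_result()
```

Discarding the overlay restores the original result, which suits exploratory what-if scenarios.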
In one embodiment, an OLAP data query system is provided; as shown in Fig. 4, it includes a presentation layer 20, a storage layer 30 and the OLAP-based data query server 10 described above.
The presentation layer 20 provides an entry point for data query requests, encapsulates the data query request entered by the user into the corresponding service API, sends the service API to the CPU, and displays the requested data returned by the query.
To make the system easier to use and operate, this embodiment uses EXCEL as the OLAP client. EXCEL is a widely used spreadsheet tool, popular with users for its simple operation and powerful features. EXCEL nevertheless has many inherent shortcomings. Its strength is data presentation, but most users use it for data storage; when the data volume is large and the files are numerous, the often-mentioned "EXCEL HELL" appears. Adding a data management server behind EXCEL overcomes these problems and further improves EXCEL's performance and ease of use. EXCEL is also highly programmable: its functionality is easily extended through VBA or other plug-in development tools, so EXCEL can serve as a front-end presentation tool for the OLAP data query system while still making use of its data presentation features. Other embodiments also support access via the Web.
The storage layer 30 stores the OLAP-related data, which includes dimension tables and fact tables. The metadata module loads the dimension table corresponding to the dimension table information to be queried from the storage layer; the cube data module stores the cube data corresponding to the fact table information.
In one embodiment, the data query server further includes an ETL module, configured to obtain data from the production data in real time and convert it into the system's data storage format.
ETL (Extract-Transform-Load) is the process of extracting, transforming and loading data to populate and update a data warehouse. In real-time OLAP, the OLAP data should stay as consistent with the production data as possible, so the ETL tool is particularly important.
Real-time ETL puts great pressure on both the production database and the OLAP system: the production database must send new data over in real time, and OLAP must process the new data in real time. For this purpose, a region of CPU memory is set aside specifically to store the new incremental data. When a user query is executed, the GPU and the CPU process it simultaneously but operate on different data, and the two results are finally merged to produce the final result. As the incremental data grows and exceeds a certain threshold, OLAP flushes the incremental data into the base data stored on the GPU.
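The base-plus-increment scheme can be sketched as follows. The threshold value, the list-based storage and the function names are assumptions for illustration; in the described system the base data resides on the GPU and the two partial results are computed concurrently.

```python
THRESHOLD = 3  # hypothetical row count that triggers a flush

base_rows = [10, 20, 30]  # stands in for the base data held on the GPU
delta_rows = []           # stands in for the increment buffer in CPU memory

def insert(value):
    """Append new production data to the increment; flush once it exceeds the threshold."""
    delta_rows.append(value)
    if len(delta_rows) > THRESHOLD:
        base_rows.extend(delta_rows)
        delta_rows.clear()

def query_sum():
    gpu_part = sum(base_rows)    # partial result over the base data
    cpu_part = sum(delta_rows)   # partial result over the increment
    return gpu_part + cpu_part   # the two partial results are merged

insert(5)
total = query_sum()
```

Merging is straightforward for distributive aggregates such as sum or count; other aggregates may need a different merge step.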
The OLAP-based data query system of this embodiment also provides API functions for client programs to call. The API mainly encapsulates common OLAP operations, chiefly roll-up, drill-down, dice, slice and pivot, as well as support for what-if hypothetical analysis.
The roll-up (Roll-up) function is: Void**Rollup(string*aggfunction, string*measures, string*groupbys, string*orderbys, string*tables, string*where), where aggfunction is the aggregate function for the roll-up operation, such as sum (multiple functions are supported); measures is the measure field the aggregate function applies to; groupbys is the grouping field; orderbys is the sort field; tables is the tables involved; and where is the WHERE conditions involved, which may use =, >, <, >=, <=, between and in, with conditions joined by AND.
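One way to read the Rollup parameters is as the clauses of a SQL aggregation query. That mapping is an interpretation rather than something the text states, and the `rollup_to_sql` helper, table names and column names below are hypothetical.

```python
def rollup_to_sql(aggfunctions, measures, groupbys, orderbys, tables, where):
    """Render Rollup-style arguments as an equivalent SQL string (illustration only)."""
    aggregates = [f"{fn}({m})" for fn, m in zip(aggfunctions, measures)]
    sql = f"SELECT {', '.join(groupbys + aggregates)} FROM {', '.join(tables)}"
    if where:
        sql += " WHERE " + " AND ".join(where)  # conditions are joined with AND
    if groupbys:
        sql += " GROUP BY " + ", ".join(groupbys)
    if orderbys:
        sql += " ORDER BY " + ", ".join(orderbys)
    return sql

sql = rollup_to_sql(
    ["sum"], ["sales"], ["year"], ["year"], ["fact_sales", "dim_time"],
    ["dim_time.year >= 2015", "fact_sales.time_id = dim_time.time_id"],
)
```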
Drill-down (Drill-down): the OLAP system does not store aggregated data but obtains it by real-time computation, so a drill-down operation is itself a roll-up operation. The drill-down function is:
Void**drilldown(string*aggfunction, string*measures, string*drilldown).
Here aggfunction is the drill-down function; measures is the measure field for the drill-down; drilldown is the dimension hierarchy information for the drill-down, that is, on which value of which level of which dimension the drill-down is performed.
The dice (Dicing) operation retrieves a portion of the data and is equivalent to a where condition. The dice function is Void**Dicing(string*measures, string*dimensions1, string*dimensions2), where measures is the measure field to retrieve; dimensions1 is the combination of dimension attributes giving the lower bound of the dice; and dimensions2 is the combination of dimension attributes giving the upper bound of the dice.
The slice (Slicing) operation is a special case of the dice operation: a dice is a range filter over multiple dimensions, whereas a slice is an equality filter on one dimension. The slice function is: Void**Dicing(string*measures, string*dimensions), where measures is the measure field to retrieve and dimensions is the combination of dimension attributes for the slice.
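The relationship between dice and slice can be illustrated with a small sketch. The row layout and the `dice` and `slice_` helpers are hypothetical; they only show a range filter over several dimensions next to an equality filter on one dimension.

```python
# Illustrative fact rows: (year, product_id, sales).
rows = [(2014, 1, 5.0), (2015, 2, 7.0), (2016, 1, 3.0), (2016, 3, 9.0)]

def dice(data, lower, upper):
    """Range filter on the leading dimensions: lower[i] <= row[i] <= upper[i]."""
    return [r for r in data
            if all(lo <= v <= hi for v, lo, hi in zip(r, lower, upper))]

def slice_(data, dim_index, value):
    """Equality filter on a single dimension."""
    return [r for r in data if r[dim_index] == value]

diced = dice(rows, lower=(2015, 1), upper=(2016, 2))
sliced = slice_(rows, dim_index=0, value=2016)
```

A slice on the year is the same as a dice whose lower and upper bounds coincide on that one dimension, which is the special-case relationship described above.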
When the above OLAP-based data query system receives a data query request, dimension table filtering is performed on the CPU side, and fact table filtering and grouped aggregation are performed on the GPU side. Because GPUs offer high performance and efficiency, using the GPU's computing resources to accelerate OLAP queries effectively improves OLAP processing efficiency.
In one embodiment, an OLAP-based data query method is also provided. The method is implemented on the OLAP-based data query server and, as shown in Fig. 5, includes the following steps:
S502: The CPU obtains a data query request; the data query request includes the dimension table information and fact table information to be queried.
S504: The CPU loads the dimension table corresponding to the dimension table information to be queried.
S506: The CPU filters the dimension table according to the data query request, obtains the filtered key-value pair data expressed as key-value pairs, and sends the key-value pair data to the GPU.
S508: The GPU filters the cube data stored in the GPU that corresponds to the fact table information, according to the key-value pair data, to obtain filtered data.
Specifically, for each element of the fact table column that corresponds to the dimension, it is checked whether the element satisfies the condition in the data query request; if it does, the corresponding position is set to 1, otherwise to 0, which yields the filtered data.
S510: The GPU groups and aggregates the filtered data to obtain the requested data.
Grouped aggregation means aggregating the filtered data as instructed by the data query request. The types of data query request include drill-down, roll-up, slice, dice and pivot. The aggregation module performs the grouped aggregation that matches the type of the query request to obtain the requested data.
With the above OLAP-based data query method, when a data query request is received, dimension table filtering is performed on the CPU side, and fact table filtering and grouped aggregation are performed on the GPU side. Because GPUs offer high performance and efficiency, using the GPU's computing resources to accelerate OLAP queries effectively improves OLAP processing efficiency.
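Steps S502 to S510 can be traced end to end with a small example. This is a plain Python sketch under assumed data: the tables, the `min_year` condition and the `run_query` function are hypothetical, and the parts attributed to the GPU are executed sequentially here.

```python
# Illustrative data; all names and values are assumptions.
time_dim = {"time_id": [1, 2, 3], "year": [2015, 2016, 2016]}
fact = {"time_id": [1, 2, 3, 2, 1], "sales": [4.0, 6.0, 1.0, 2.0, 3.0]}

def run_query(min_year):
    # S502/S504: the request names the time dimension; its table is loaded above.
    # S506 (CPU): filter the dimension table into key-value pairs.
    kv = {k: y for k, y in zip(time_dim["time_id"], time_dim["year"]) if y >= min_year}
    # S508 (GPU in the described method): mark matching fact rows 1, others 0.
    bitmap = [1 if k in kv else 0 for k in fact["time_id"]]
    # S510: group the surviving rows by year and sum the measure.
    result = {}
    for k, s, keep in zip(fact["time_id"], fact["sales"], bitmap):
        if keep:
            result[kv[k]] = result.get(kv[k], 0.0) + s
    return result

result = run_query(2016)
```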
As shown in fig. 6, step S502 the following steps are included:
S5021:CPU obtains service API.
Service API (Application Programming Interface, application programming interface) refers to that system is pre-
The function first defined.Front end is encapsulated as corresponding service API in the inquiry request for receiving request, by inquiry request.
Whether S5022:CPU service for checking credentials API meets specification and is encapsulated in the hierarchical information for servicing the dimension of the peacekeeping in API
It is whether correct.If being verified, S5023 is thened follow the steps.
When the service API corresponding to the data query request is received, the CPU verifies that the service API conforms to the specification. Specifically, it verifies that the query instruction encapsulated in the service API, which corresponds to the data query request, conforms to the specification. The query request types are roll-up, drill-down, slice, dice and pivot; the CPU checks that the encapsulated query request meets the requirements of its type and that the query statement contains no errors.
The column fields of a dimension table can divide information into a structure of different levels. The CPU therefore also verifies that the dimensions and the hierarchy information of the dimensions in the query command are correct, that is, consistent with the information in the dimension table. If they are inconsistent, the query command is erroneous and the query cannot be performed.
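The verification in step S5022 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the request layout (`operation`, `dimensions`), the `DIM_SCHEMA` dictionary and the function `validate_request` are hypothetical names chosen for the example.

```python
# Hypothetical metadata: each dimension table maps to its hierarchy levels
# (column fields ordered from coarse to fine). Names are illustrative only.
DIM_SCHEMA = {
    "date": ["d_year", "d_yearmonth", "d_datekey"],
    "part": ["p_mfgr", "p_category", "p_brand1"],
    "supplier": ["s_region", "s_nation", "s_city"],
}

# Query types named in the description: roll-up, drill-down, slice, dice, pivot.
ALLOWED_OPERATIONS = {"roll-up", "drill-down", "slice", "dice", "pivot"}


def validate_request(request, schema=DIM_SCHEMA):
    """Return (ok, reason) for an assumed service-API payload."""
    if request.get("operation") not in ALLOWED_OPERATIONS:
        return False, "unknown operation"
    dims = request.get("dimensions")
    if not dims:
        return False, "no dimensions given"
    for dim, level in dims.items():
        # The dimension and its hierarchy level must both exist in the metadata.
        if dim not in schema:
            return False, f"unknown dimension: {dim}"
        if level not in schema[dim]:
            return False, f"unknown level {level} in dimension {dim}"
    return True, "ok"
```

A request that names a level missing from its dimension table is rejected before any parsing or GPU work takes place, which matches the rule that an inconsistent query command cannot be executed.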
S5023: The CPU parses the service API to obtain the data query request.
Parsing the service API yields the data query request, which includes the dimension table information to be queried, the fact table information, and query operations such as drilling.
Still referring to Fig. 6, the following steps come after step S5023:
S5024: The CPU checks whether data corresponding to the data query request is stored in the cache. If so, step S5025 is executed; if not, step S504 is executed.
S5025: The CPU returns the data corresponding to the data query request from the cache.
The cache holds historical query data. After the service API has been parsed and the data query request obtained, the cache is searched first for data corresponding to the request. If such data is present, the corresponding data in cache module 125 is returned directly.
If the cache module holds no data corresponding to the data query request, the CPU filters the dimension table according to the request, obtains the filtered data expressed as key-value pairs, and sends the key-value pair data to the GPU.
Meanwhile after step S506, further includes: the request data that S512:GPU obtains packet aggregation is sent to CPU
And it stores in the buffer.
Although GPU computational efficiency with higher, start the data transmission of GPU program and GPU and CPU with higher
Cost, the data query server based on OLAP of the present embodiment, by CPU be arranged for caching historical query data
Cache module first searches whether to be stored in the cache module of CPU and data query when parsing obtains data inquiry request
Corresponding data are requested, only when cache module is not stored there are data corresponding with data inquiry request, are just executed subsequent
Inquiry operation improves the efficiency of data query so as to avoid computing repeatedly.
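The cache-first flow of steps S5024, S5025 and S512 can be sketched as below. The class `QueryCache`, the function `run_query` and the callable `execute_on_gpu` are hypothetical names introduced for illustration; the source does not specify how cache keys are formed, so a plain request key is assumed.

```python
class QueryCache:
    """Toy stand-in for the CPU-side cache module (historical query results)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def put(self, key, result):
        self._store[key] = result


def run_query(request_key, cache, execute_on_gpu):
    """Return (result, cache_hit): use the cache if possible, else run the query."""
    cached = cache.get(request_key)
    if cached is not None:
        return cached, True          # cache hit: no GPU launch, no transfer
    result = execute_on_gpu(request_key)
    cache.put(request_key, result)   # corresponds to step S512
    return result, False
```

Only the first occurrence of a given request pays the GPU launch and transfer cost; identical later requests are answered from CPU memory.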
Specifically, the query request types are roll-up, drill-down, slice, dice and pivot. Drill-down is a special case of the roll-up operation: the OLAP system converts a drill-down operation into a roll-up operation.
Roll-up is essentially an aggregation operation, similar to the GROUP BY operation in SQL.
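As a small illustration of roll-up as aggregation, the sketch below sums a measure while moving from a fine level (year, month) to a coarser one (year). The function `roll_up` and the sample figures are assumptions made for the example, not data from the source.

```python
from collections import defaultdict


def roll_up(rows, key_fn):
    """Aggregate (key, measure) rows to a coarser key, like SQL GROUP BY + SUM."""
    totals = defaultdict(int)
    for key, measure in rows:
        totals[key_fn(key)] += measure
    return dict(totals)


# Revenue at (year, month) granularity; values are made up for illustration.
monthly = [((1997, 1), 10), ((1997, 2), 20), ((1998, 1), 5)]

# Roll up from month level to year level by dropping the month component.
yearly = roll_up(monthly, key_fn=lambda k: k[0])
```

Aggregating the same base rows with a finer `key_fn` gives the more detailed view, which is one way to read the statement that drill-down is handled as a roll-up.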
The query request command of one embodiment is as follows:
select sum(lo_revenue), d_year, p_brand1
from lineorder, date, part, supplier
where lo_orderdate = d_datekey
and lo_partkey = p_partkey
and lo_suppkey = s_suppkey
and p_brand1 between 'MFGR#2221' and 'MFGR#2228'
and s_region = 'ASIA'
group by d_year, p_brand1
order by d_year, p_brand1;
Specifically, the query for this request proceeds as follows:
S1: The CPU applies the condition p_brand1 between 'MFGR#2221' and 'MFGR#2228' to the part table and obtains the set of p_partkey values {1114, 1587, 3631, 4378, ...}. The process is shown in Fig. 7.
S2: Applying the condition s_region = 'ASIA' to the supplier table yields the set of s_suppkey values {11, 12, 15, 23, 26, 27, ...}, as shown in Fig. 8.
S3: The two sets and the two key-value tables (datekey, d_year) and (partkey, p_brand1) are sent from the CPU to the GPU. Non-integer fields must be normalized in this step; for example, p_brand1 holds character data and must be encoded. The data conversion process and the form of the transmitted key-value pairs are shown in Fig. 9.
S4: The GPU launches a kernel program to perform data filtering. Roughly, the filtering algorithm filters the fact table columns lo_revenue, d_year and p_brand1 stored on the GPU: each element of the p_partkey and s_suppkey columns is checked against the conditions, and its flag bit is set to 1 if it satisfies them and to 0 otherwise. A prefix-sum scan then produces the filtered data. The process is shown in Fig. 10.
S5: The filtered d_year and p_brand1 columns are linearized according to a fixed rule, then sorted, prefix-scanned and reduced to obtain the final result. The process is shown in Fig. 11.
S6: The linearized, sorted result is de-linearized and returned from the GPU to the CPU, and is finally displayed at the front end, as shown in Fig. 12.
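The CPU-side part of these steps (S1 to S3) can be sketched as follows. The source states only that non-integer fields are encoded; the sorted dictionary encoding used here is an assumption, and the helper names `filter_keys` and `encode_column` and the three sample rows are hypothetical.

```python
def filter_keys(dim_rows, key_col, predicate):
    """S1/S2: scan a dimension table and collect keys of rows that match."""
    return {row[key_col] for row in dim_rows if predicate(row)}


def encode_column(dim_rows, key_col, value_col):
    """S3: build a key -> integer-code table for a non-integer column.

    Codes follow the sorted order of the distinct values, so comparing
    codes gives the same order as comparing the original strings.
    """
    distinct = sorted({row[value_col] for row in dim_rows})
    code_of = {value: code for code, value in enumerate(distinct)}
    key_to_code = {row[key_col]: code_of[row[value_col]] for row in dim_rows}
    return key_to_code, distinct   # `distinct` decodes codes back to strings


# Tiny illustrative part table (not real benchmark data).
part = [
    {"p_partkey": 1114, "p_brand1": "MFGR#2221"},
    {"p_partkey": 1587, "p_brand1": "MFGR#2228"},
    {"p_partkey": 2000, "p_brand1": "MFGR#1101"},
]
part_keys = filter_keys(
    part, "p_partkey",
    lambda r: "MFGR#2221" <= r["p_brand1"] <= "MFGR#2228",
)
brand_codes, brand_names = encode_column(part, "p_partkey", "p_brand1")
```

Only the small key set and the integer-coded key-value table need to cross from CPU to GPU, which keeps the transfer compact and lets the GPU work on fixed-width integers.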
In this way, the complex and time-consuming star join operation is avoided, while the GPU's capability for aggregation computation is exploited.
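The grouped aggregation of steps S5 and S6 can be sketched sequentially as linearize, sort, reduce and de-linearize. This is a CPU-only illustration in plain Python; the linearization rule `year * n_brands + brand_code` is an assumption, since the source says only that the columns are linearized according to some rule, and `group_aggregate` is a hypothetical name.

```python
from itertools import groupby


def group_aggregate(years, brand_codes, revenues, n_brands):
    """Sum revenue per (year, brand) via linearise -> sort -> segmented reduce."""
    # Linearise the two group-by columns into one integer key.
    linear = [y * n_brands + b for y, b in zip(years, brand_codes)]
    # Sort measures by linear key (a GPU would use a parallel sort here).
    pairs = sorted(zip(linear, revenues))
    result = []
    # Reduce each run of equal keys, then invert the linearisation.
    for key, run in groupby(pairs, key=lambda p: p[0]):
        total = sum(rev for _, rev in run)
        year, brand = divmod(key, n_brands)
        result.append((year, brand, total))
    return result


rows = group_aggregate(
    years=[1993, 1992, 1993, 1992],
    brand_codes=[1, 0, 1, 1],
    revenues=[30, 10, 5, 7],
    n_brands=2,
)
```

Because the linear key orders rows by year first and brand second, the sorted output already satisfies the query's order by clause.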
Dicing and slicing each obtain a portion of the data, which is equivalent to processing a where condition.
The dicing query command of one embodiment is:
select d_datekey, lo_revenue
from lineorder, supplier
where lo_suppkey = s_suppkey
and s_region = 'ASIA';
Specifically, the query for this request proceeds as follows:
S1: The set of matching s_suppkey values is obtained from the condition s_region = 'ASIA'.
S2: Each thread determines whether the s_suppkey value of its row is in the set; this can be done by sequential search, or by binary search after sorting.
S3: The resulting set is sent to the GPU.
S4: A bit vector is generated, with 1 at each position that satisfies the condition and 0 elsewhere.
S5: A prefix sum over the bit vector determines the final result size n.
S6: The GPU is started again and space of size 2n is allocated.
S7: The d_datekey and lo_revenue values whose flag bit is 1 are placed into the new space.
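Steps S4 to S7 describe a flag, scan and scatter pattern (often called stream compaction). The sketch below runs it sequentially in plain Python to show the data flow; on a GPU each loop iteration would be handled by a thread. The function name `compact` and the sample values are assumptions for illustration.

```python
from itertools import accumulate


def compact(suppkeys, datekeys, revenues, wanted):
    """Keep (d_datekey, lo_revenue) for rows whose supplier key is in `wanted`."""
    # S4: one flag per row (each GPU thread would test one row).
    flags = [1 if k in wanted else 0 for k in suppkeys]
    # S5: the prefix sum gives the result size n and, shifted, each output slot.
    inclusive = list(accumulate(flags))
    n = inclusive[-1] if inclusive else 0
    slots = [s - f for s, f in zip(inclusive, flags)]   # exclusive prefix sum
    # S6: allocate 2n cells in total, n per output column.
    out_date, out_rev = [None] * n, [None] * n
    # S7: scatter flagged rows to their slots.
    for i, f in enumerate(flags):
        if f:
            out_date[slots[i]] = datekeys[i]
            out_rev[slots[i]] = revenues[i]
    return out_date, out_rev
```

The prefix sum is what lets every thread compute its write position independently, so the scatter in S7 needs no locking. The 2n space corresponds to n cells for each of the two selected columns.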
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of these technical features has been described; however, as long as a combination of these technical features involves no contradiction, it should be regarded as within the scope of this specification.
The embodiments described above express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be pointed out that a person of ordinary skill in the art can make various modifications and improvements without departing from the concept of the invention, and these all fall within the protection scope of the invention. Therefore, the protection scope of this patent shall be determined by the appended claims.
Claims (10)
1. An OLAP-based data query server, comprising a CPU and a GPU;
the CPU comprises a request module, a metadata module and a dimension filtering module; the GPU comprises a cube filtering module, an aggregation module and a cube data module;
the request module is configured to obtain a data query request, the data query request comprising dimension table information to be queried and fact table information;
the metadata module is configured to load the dimension table corresponding to the dimension table information to be queried, the dimension table comprising detailed information associated with specified attributes in the fact table;
the dimension filtering module is configured to filter the dimension table according to the data query request, obtain filtered key-value pair data expressed as key-value pairs, and send the key-value pair data to the GPU;
the cube data module is configured to store cube data corresponding to the fact table information;
the cube filtering module is configured to filter the cube data according to the key-value pair data to obtain filtered data;
the aggregation module is configured to perform grouped aggregation on the filtered data to obtain request data.
2. The OLAP-based data query server according to claim 1, characterized in that the dimension table information comprises dimensions and hierarchy information of the dimensions; the request module comprises a validator and a parser; the validator and the parser are each connected to the metadata module;
the validator is configured to obtain a service API, and to verify whether the service API conforms to the specification and whether the dimensions and the hierarchy information of the dimensions encapsulated in the service API are correct;
the parser is configured to parse the service API after the validator's verification passes, to obtain the data query request.
3. The OLAP-based data query server according to claim 2, characterized in that the types of the data query request comprise roll-up, drill-down, slice, dice and pivot.
4. The OLAP-based data query server according to claim 2, characterized in that the CPU further comprises a cache module and a query module;
the cache module is configured to cache historical query data;
the query module is configured to check, when a data query request is received, whether data corresponding to the data query request is stored in the cache module;
the dimension filtering module is activated when the cache module does not hold data corresponding to the data query request;
the aggregation module is further configured to send the request data obtained by grouped aggregation to the cache module for storage.
5. The OLAP-based data query server according to claim 1, characterized in that the CPU further comprises a write-back module, configured to receive a write-back request and to modify the request data obtained by grouped aggregation according to the write-back request.
6. An OLAP-based data query system, comprising a presentation layer, a storage layer, and the OLAP-based data query server according to any one of claims 1 to 5;
the presentation layer is configured to provide an entry for inputting data query requests, to encapsulate a data query request input by a user as the corresponding service API, to send the service API to the CPU, and to display the request data obtained by the query;
the storage layer is configured to store OLAP-related data, the related data comprising dimension tables and fact tables; the metadata module is configured to load, from the storage layer, the dimension table corresponding to the dimension table information to be queried; the cube data module is configured to store cube data corresponding to the fact table information.
7. An OLAP-based data query method, comprising:
a CPU obtains a data query request, the data query request comprising dimension table information to be queried and fact table information;
the CPU loads the dimension table corresponding to the dimension table information to be queried, the dimension table comprising detailed information associated with specified attributes in the fact table;
the CPU filters the dimension table according to the data query request, obtains filtered key-value pair data expressed as key-value pairs, and sends the key-value pair data to a GPU;
the GPU filters, according to the key-value pair data, the cube data stored in the GPU that corresponds to the fact table information, to obtain filtered data;
the GPU performs grouped aggregation on the filtered data to obtain request data.
8. The OLAP-based data query method according to claim 7, characterized in that the step in which the CPU obtains a data query request comprises:
the CPU obtains a service API;
the CPU verifies whether the service API conforms to the specification and whether the dimensions and the hierarchy information of the dimensions encapsulated in the service API are correct;
when verification passes, the CPU parses the service API to obtain the data query request.
9. The OLAP-based data query method according to claim 8, characterized in that the types of the data query request comprise roll-up, drill-down, slice, dice and pivot.
10. The OLAP-based data query method according to claim 8, characterized in that, after the step of parsing the service API to obtain the data query request when verification passes, the method further comprises: the CPU checks whether data corresponding to the data query request is stored in a cache;
if so, the CPU returns the data corresponding to the data query request from the cache;
if not, the step is executed in which the CPU filters the dimension table according to the data query request, obtains filtered key-value pair data expressed as key-value pairs, and sends the key-value pair data to the GPU;
after the step in which the GPU performs grouped aggregation on the filtered data to obtain request data, the method further comprises:
the GPU sends the request data obtained by grouped aggregation to the CPU, where it is stored in the cache.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610543412.9A CN106202408B (en) | 2016-07-11 | 2016-07-11 | Data query server based on OLAP, system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106202408A CN106202408A (en) | 2016-12-07 |
CN106202408B true CN106202408B (en) | 2019-10-18 |
Family
ID=57476968
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610543412.9A Expired - Fee Related CN106202408B (en) | 2016-07-11 | 2016-07-11 | Data query server based on OLAP, system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106202408B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109165248B (en) * | 2018-01-29 | 2019-09-03 | 北京数聚鑫云信息技术有限公司 | A kind of management system and management method based on API |
CN110442653B (en) * | 2019-07-03 | 2023-09-29 | 平安科技(深圳)有限公司 | Method, device, server and storage medium for incrementally constructing CUBE model |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103309958A (en) * | 2013-05-28 | 2013-09-18 | 中国人民大学 | OLAP star connection query optimizing method under CPU and GPU mixing framework |
CN104246717A (en) * | 2012-05-08 | 2014-12-24 | 文雅科一番株式会社 | Data processing system, server, client, and program for managing data |
CN104866608A (en) * | 2015-06-05 | 2015-08-26 | 中国人民大学 | Query optimization method based on join index in data warehouse |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8965866B2 (en) * | 2009-12-17 | 2015-02-24 | Business Objects Software Limited | Optimizing data transfer time on graphics processor units |
US10353923B2 (en) * | 2014-04-24 | 2019-07-16 | Ebay Inc. | Hadoop OLAP engine |
Non-Patent Citations (1)
Title |
---|
GPU-Based Aggregation of On-Line Analytical Processing; Guilan Wang et al.; Communications and Information Processing; 2012-12-31; pp. 234-245 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20191018 Termination date: 20210711 |