CN106484875A

CN106484875A - MOLAP-based data processing method and device

Info

Publication number: CN106484875A
Application number: CN201610893549.7A
Authority: CN
Inventors: 李寅威
Original assignee: Guangzhou Shiyuan Electronics Technology Co Ltd
Current assignee: Guangzhou Shiyuan Electronics Technology Co Ltd
Priority date: 2016-10-13
Filing date: 2016-10-13
Publication date: 2017-03-08
Anticipated expiration: 2036-10-13
Also published as: CN106484875B

Abstract

The invention discloses a MOLAP-based data processing method and device. The data processing method includes: creating a data cube from a fact table and dimension tables; performing data precomputation on all possible combinations of the dimensions, based on the data recorded in the data cube; and saving the precomputation results to an open-source database, so that query results can be determined from the precomputation results at query time. The method improves on existing data query schemes so that non-technical personnel can also run queries over massive data.

Description

MOLAP-based data processing method and device

Technical field

The embodiments of the present invention relate to the technical field of data processing, and in particular to a MOLAP-based data processing method and device.

Background

An online analytical processing (OLAP) system is the most important application of a data warehouse. It is dedicated to supporting complex analysis operations and emphasizes decision support for decision-makers and senior managers. OLAP can process complex queries over large volumes of data according to the analyst's requirements and present the query results to decision-makers in an intuitive form, so that they can accurately grasp the operating state of the enterprise (company), understand customers' needs, and formulate sound plans.

Depending on the data storage format, OLAP systems can be divided into three types: relational OLAP (ROLAP), multidimensional OLAP (MOLAP), and hybrid OLAP (HOLAP). MOLAP physically stores the multidimensional data used in OLAP analysis as multidimensional arrays, forming a "cube" structure.

Traditional MOLAP engines are limited by software and hardware resources and can only handle data at the gigabyte level or below 10 terabytes; moreover, computing the data of a multidimensional cube places high demands on server configuration. Meanwhile, real-time queries over massive data are commonly served by SQL on Hadoop solutions, i.e. SQL built on a distributed system architecture. On the one hand, their latency can reach several seconds, tens of seconds, or even tens of minutes. On the other hand, some columnar databases generally support fast queries only by row key, while column-level queries are usable only in specific query scenarios. In addition, querying requires writing SQL statements, so non-technical personnel cannot perform queries.

Summary of the invention

In view of this, embodiments of the present invention provide a MOLAP-based data processing method and device that improve on existing data query schemes, so that non-technical personnel can also run queries over massive data.

In a first aspect, an embodiment of the present invention provides a MOLAP-based data processing method, including:

creating a data cube from a fact table and dimension tables;

performing data precomputation on all possible combinations of the dimensions, based on the data recorded in the data cube;

saving the precomputation results to an open-source database, so that query results are determined from the precomputation results at query time.

In a second aspect, an embodiment of the present invention further provides a MOLAP-based data processing device, including:

a cube creation module, configured to create a data cube from a fact table and dimension tables;

a precomputation module, configured to perform data precomputation on all possible combinations of the dimensions, based on the data recorded in the data cube;

a saving module, configured to save the precomputation results to an open-source database, so that query results are determined from the precomputation results at query time.

With the MOLAP-based data processing method and device provided by the embodiments of the present invention, a data cube is created from a fact table and dimension tables, data precomputation is performed on all possible dimension combinations based on the data recorded in the data cube, and the precomputation results are saved to an open-source database. When querying data, the user only needs to drag dimensions and measures in the client page, and the server can determine the query result from the corresponding precomputation results; the user does not need to write SQL statements. At the same time, the characteristics of big data components and of MOLAP are fully exploited, which simplifies the data query process and improves query response speed.

Brief description of the drawings

Other features, objects, and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:

Fig. 1 is a flow chart of a MOLAP-based data processing method provided by Embodiment one of the present invention;

Fig. 2 is a flow chart of a MOLAP-based data processing method provided by Embodiment two of the present invention;

Fig. 3 is a schematic diagram of the hierarchical relationship among all possible combinations of the dimensions of a data cube during data precomputation;

Fig. 4 is a flow chart of a method for creating an open-source database table provided by Embodiment two of the present invention;

Fig. 5 is a flow chart of a data query method provided by Embodiment two of the present invention;

Fig. 6 is a schematic diagram of the user interface during a client query;

Fig. 7 is a schematic structural diagram of a MOLAP-based data processing device provided by Embodiment three of the present invention.

Detailed description

The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire content.

Embodiment one

Fig. 1 is a flow chart of a MOLAP-based data processing method provided by Embodiment one of the present invention. The method provided by this embodiment can be executed by a MOLAP-based data processing device, which can be implemented in software and/or hardware and integrated in a server. Referring to Fig. 1, the method provided by this embodiment can include:

S110, create a data cube from the fact table and the dimension tables.

The fact table and the dimension tables are stored in the data warehouse of the big data platform. A dimension table stores the values of all attributes under its dimension together with the ID of each record. For example, a sales region dimension includes the region ID and the province, city, and county to which the region belongs. Taking Tianhe District of Guangzhou as a value of the sales region dimension, its record in the dimension table is: 01 (the ID of the value), Guangdong Province, Guangzhou, Tianhe District. The fact table stores the fact data, including the ID of each dimension value, the sales volume, the consumption amount, and so on. For example, each sales record in a sales fact table includes the fields: sales region ID (the ID in the sales region dimension table), product ID (the ID in the product dimension table), sales time ID (the ID in the time dimension table), sales volume, consumption amount, and so on. From the IDs attached to a given sales volume, one can tell which product was sold in which region during which period.
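The relationship between the dimension tables and the fact table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries stand in for warehouse tables, and every name (`region_dim`, `sales_fact`, `describe`) and every value other than the Tianhe District record is hypothetical.

```python
# Hypothetical in-memory stand-ins for the warehouse tables described above.
region_dim = {
    "01": {"province": "Guangdong Province", "city": "Guangzhou", "county": "Tianhe District"},
}
product_dim = {"P1": {"name": "kettle"}}           # made-up product record
time_dim = {"T1": {"period": "first quarter"}}     # made-up time record

# Each fact row carries one ID per dimension plus the fact data itself.
sales_fact = [
    {"region_id": "01", "product_id": "P1", "time_id": "T1",
     "sales_volume": 5, "consumption_amount": 250.0},
]

def describe(fact_row):
    """Resolve the dimension IDs of one fact row into readable values."""
    region = region_dim[fact_row["region_id"]]
    product = product_dim[fact_row["product_id"]]
    period = time_dim[fact_row["time_id"]]
    return (region["county"], product["name"], period["period"], fact_row["sales_volume"])

print(describe(sales_fact[0]))
```

The fact row stores only IDs and numbers; the readable region, product, and period come from the dimension tables, which is how a sales volume can be traced back to its region, product, and time.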

Optionally, the fact table and the dimension tables are created in advance. The specific creation rules, for example which dimensions get dimension tables and how many dimension tables are created, can be set according to the actual situation. Normally the number of dimension tables is not limited, and there is one fact table.

Typically, after the fact table and dimension tables are created, data is obtained from an external database that the current server can access and is imported into the corresponding fact table and dimension tables.

Specifically, a data cube, also called a multidimensional cube, can contain at least one dimension. For example, a data cube can include: a region dimension (values: Beijing, Tianjin, Shanghai, and Guangdong), a time dimension (values: first quarter, second quarter, third quarter, and fourth quarter), a commodity dimension (values: sportswear, kettle, sports shoes, parasol, and sun hat), and a user age dimension (values: 0-18, 19-40, 41-60, and over 60 years old). The fact data corresponding to each dimension (consumption amount and sales volume) is stored in the associated fact table. That is, this data cube comprises four dimension tables and one fact table, linked through the IDs of the dimension values recorded in both the dimension tables and the fact table. Fact data can be determined from the data cube: for example, the consumption amount for sportswear in Beijing in the first quarter, the consumption amount for kettles in Tianjin, or the sales volume of sports shoes to consumers aged 19-40 in Shanghai in the fourth quarter.

Optionally, at least one data cube can be created from the fact table and its associated dimension tables; the specific creation rules can be set according to the actual situation.

S120, perform data precomputation on all possible combinations of the dimensions, based on the data recorded in the data cube.

A data cube can include different dimensions, and the dimensions can be combined arbitrarily. The data cube in the example of step S110 contains four dimensions, so there are 16 possible combinations of dimensions, which can be computed as $C_4^0 + C_4^1 + C_4^2 + C_4^3 + C_4^4 = 16$. Taking $C_4^3$ as an example, the corresponding dimension combinations are (region dimension, time dimension, commodity dimension), (region dimension, commodity dimension, user age dimension), (region dimension, time dimension, user age dimension), and (time dimension, commodity dimension, user age dimension). Of course, when the data cube contains only one dimension, the possible combinations are just the combination containing that dimension and the empty set.
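The count of 16 can be checked by enumerating every subset of the four dimensions. The following is a minimal sketch; the function name `all_dimension_combinations` and the short dimension labels are illustrative assumptions, not taken from the patent.

```python
from itertools import combinations

def all_dimension_combinations(dimensions):
    """Return every subset of the dimensions, from the empty set up to the full set."""
    result = []
    for size in range(len(dimensions) + 1):
        result.extend(combinations(dimensions, size))
    return result

dims = ("region", "time", "commodity", "user_age")
combos = all_dimension_combinations(dims)

print(len(combos))                          # number of combinations, empty set included
print([c for c in combos if len(c) == 3])   # the three-dimension combinations
```

For n dimensions this yields 2^n combinations, so the number of combinations to precompute doubles with every added dimension.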

Further, after all possible combinations of the data cube are determined, data precomputation is performed on the fact data corresponding to each combination to obtain the precomputation result of each combination. Optionally, data precomputation is performed according to a specified measure. In this embodiment, a measure is a requirement set on how the precomputation is performed or on its result; it preferably includes an aggregation type (aggregate maximum and/or aggregate minimum). Suppose the dimensions are the region dimension and the time dimension and the aggregation type is aggregate maximum. Precomputing the dimension combinations then means determining the aggregate maximum of the fact data corresponding to (region dimension, time dimension), (region dimension), (time dimension), and the empty set $\varnothing$ respectively. Taking (region dimension) as an example, the precomputation obtains the aggregate maximum of the fact data in the fact table for each of Beijing, Tianjin, Shanghai, and Guangdong in the region dimension.
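A rough sketch of this precomputation for two dimensions and the aggregate maximum follows. The fact rows and the names `rows` and `precompute_max` are made up for illustration, and the brute-force grouping here is not the level-by-level procedure the patent describes later.

```python
from itertools import combinations

# Made-up fact rows: (region, quarter, consumption_amount).
rows = [
    ("Beijing", "Q1", 120),
    ("Beijing", "Q2", 300),
    ("Tianjin", "Q1", 80),
]
dims = ("region", "quarter")

def precompute_max(rows, dims):
    """Map each dimension combination to {dimension values: maximum fact value}."""
    results = {}
    for size in range(len(dims) + 1):
        for combo in combinations(range(len(dims)), size):
            names = tuple(dims[i] for i in combo)
            groups = {}
            for row in rows:
                key = tuple(row[i] for i in combo)
                groups[key] = max(groups.get(key, row[-1]), row[-1])
            results[names] = groups
    return results

cube = precompute_max(rows, dims)
print(cube[("region",)])   # one maximum per region
print(cube[()])            # the empty set: a single overall maximum
```

The empty combination groups every row under the same key, so it holds the maximum over the whole fact table.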

In this way, data precomputation can be performed on every possible dimension combination of the data cube.

S130, save the precomputation results to an open-source database, so that query results are determined from the precomputation results at query time.

In this embodiment, the open-source database is preferably HBase. HBase is a distributed storage system with which a large-scale unstructured storage cluster can be built on commodity servers.

Further, when querying, the user only needs to select the desired dimension combination and measure in the client; the server can then find the corresponding precomputation result in HBase and return it to the client.
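Answering a query then reduces to a lookup. In the sketch below a plain dictionary stands in for the HBase table; the key layout, `DIMENSION_ORDER`, `store`, and `answer_query` are all hypothetical and only illustrate that no SQL is needed once results are precomputed.

```python
# Canonical dimension order, assumed to come from the cube definition.
DIMENSION_ORDER = ("region", "quarter")

# Hypothetical store standing in for the HBase table:
# (dimension combination, dimension values) -> precomputed measures.
store = {
    (("region",), ("Beijing",)): {"max": 300, "min": 120},
    (("region", "quarter"), ("Beijing", "Q1")): {"max": 120, "min": 120},
}

def answer_query(selected, measure):
    """selected maps the chosen dimension names to the chosen values."""
    names = tuple(d for d in DIMENSION_ORDER if d in selected)
    values = tuple(selected[d] for d in names)
    return store[(names, values)][measure]

print(answer_query({"quarter": "Q1", "region": "Beijing"}, "max"))
```

Ordering the selected dimensions by a fixed dimension order means the lookup does not depend on the order in which the user picked them.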

In the technical solution of this embodiment, a data cube is created from a fact table and its associated dimension tables, data precomputation is performed on all possible combinations of the dimensions in the data cube, and the precomputation results are saved to an open-source database, which enables MOLAP-based computation and storage of big data. Moreover, when querying, the user does not need to write SQL statements but only needs to drag the desired dimensions and measures in the client page; the server can then determine the query result from the corresponding precomputation results, which simplifies the data query process and improves query response speed.

Embodiment two

Fig. 2 is a flow chart of a MOLAP-based data processing method provided by Embodiment two of the present invention. The data processing method of this embodiment is an optimization based on the data processing method of the above embodiment. Referring to Fig. 2, the data processing method of this embodiment specifically includes:

S210, create the corresponding fact table and dimension tables according to the table entry requirements for the fact table and dimension tables in a preset data analysis model.

Specifically, the preset data analysis model includes the table entry requirements for the fact table and the dimension tables. The table entry requirements can include the dimension values and dimension hierarchy of each dimension table, and the fact data attributes of the fact table (such as consumption amount). In addition, the data analysis model can also include the measures used in data precomputation (such as aggregate maximum). The specific content of the data analysis model can be set according to the actual situation. In general, the data analysis model can be understood as the plan for the data processing method of this embodiment; that is, the data processing method of this embodiment is executed according to the data analysis model.

Illustratively, the fact table and the dimension tables associated with it are created according to the table entry requirements.

S220, import data from the external database into the fact table and dimension tables according to their table entry requirements.

Specifically, the external database is accessed according to the table entry requirements, and the corresponding data is located in the external database and imported into the fact table and dimension tables. The external database can include the databases of the relevant business management systems as well as databases outside the enterprise.

Further, when data is imported into the fact table and dimension tables, the data warehouse can determine whether the user has specified a particular data format. If so, the data is converted from its original format into the specified format and then imported into the fact table and dimension tables; if not, the data is imported into the fact table and dimension tables in its original format.

S230, create the data cube from the fact table and dimension tables according to the metadata in the data analysis model.

Illustratively, the data analysis model includes metadata, which specifies the attribute parameters and creation rules of the data cube. Optionally, the metadata can include at least one of the following: the creation time of the data cube, the creation location of the data cube, the name of the data cube, the facts of the data cube, the dimensions of the data cube and their order, the measures of the data cube, the aggregation type, configuration information for the programming model, configuration information for the open-source database, the precomputation time, and the storage information of the precomputation results. The configuration information for the programming model tells the programming model, when it is started, how much memory the data cube needs. The configuration information for the open-source database records the correspondence between the data cube's precomputation results and open-source database tables. The storage information of the precomputation results includes the order in which the multiple dimensions involved in the row key are encoded when the precomputation results are saved to the open-source database. For example, given dimension A and dimension B, where dimension A has the three values 1, 2, and 3, it covers the order in which 1, 2, and 3 are encoded and the order in which dimension A and dimension B are encoded when the corresponding precomputation results are stored in the open-source database. Optionally, the configuration information for the open-source database, the precomputation time, and the storage information of the precomputation results can be updated after the precomputation results have been stored.
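How a stored encoding order might turn into a row key can be sketched as follows. The patent does not specify the row key layout, so the fixed-width concatenation, the `storage_info` structure, the values of dimension B, and the function `build_row_key` are all assumptions made purely for illustration.

```python
# Hypothetical metadata fragment: dimension order and per-dimension value codes.
storage_info = {
    "dimension_order": ["A", "B"],
    "value_codes": {
        "A": {1: 0, 2: 1, 3: 2},      # encoding order of dimension A's values 1, 2, 3
        "B": {"x": 0, "y": 1},        # made-up values for dimension B
    },
}

def build_row_key(cell, info=storage_info):
    """Concatenate fixed-width codes of the cell's dimension values in metadata order."""
    parts = []
    for dim in info["dimension_order"]:
        code = info["value_codes"][dim][cell[dim]]
        parts.append(f"{code:02d}")
    return "".join(parts)

print(build_row_key({"B": "y", "A": 3}))
```

Because both the dimension order and the value codes are read from the metadata, the same cell always maps to the same key regardless of how the caller orders its fields.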

From the above metadata, the basic information of the data cube at creation can be determined, such as its creation time, creation location, name, facts, dimensions and their order, and measures. Optionally, this basic information is stored in HBase; that is, when the basic information of a data cube is needed, it can be queried from HBase.

S240, start a precomputation programming model task according to the metadata in the data analysis model, and read the data of all dimension tables and the fact table corresponding to the data cube.

Specifically, the corresponding precomputation programming model task is created and started according to the metadata. In this embodiment, the precomputation programming model is MapReduce, which supports parallel computation over large-scale data sets.

Optionally, the precomputation programming model reads the data of all dimension tables and the fact table corresponding to the data cube from the data platform.

S250, combine the dimensions of all dimension tables to obtain all possible combinations, including the empty set.

S260, perform an aggregation operation on the combination containing all dimensions according to the specified aggregation rule, to obtain aggregate values.

For example, if the data cube includes four dimensions, denoted A, B, C, and D, the combination containing all dimensions is (A, B, C, D).

Optionally, the aggregation rule can include aggregate maximum and/or aggregate minimum. Taking aggregate maximum as an example, the aggregation operation uses an aggregate function to return the maximum MAX(M) of the fact data corresponding to (A, B, C, D) and takes MAX(M) as the aggregate value.

Optionally, the values of each dimension can be dictionary-encoded, and the aggregate values of the combination containing all dimensions computed on the encoded values.
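Dictionary encoding replaces each distinct dimension value with a small integer code. The sketch below shows one simple way to do it; the sorted-order code assignment and the name `build_dictionary` are assumptions, since the patent does not describe the encoding scheme itself.

```python
def build_dictionary(values):
    """Assign an integer code to each distinct value, in sorted order."""
    return {value: code for code, value in enumerate(sorted(set(values)))}

regions = ["Shanghai", "Beijing", "Tianjin", "Beijing"]
region_codes = build_dictionary(regions)

# Encode the column, and keep the reverse mapping for decoding results later.
encoded = [region_codes[r] for r in regions]
decode = {code: value for value, code in region_codes.items()}

print(encoded)
print([decode[c] for c in encoded])
```

Working on compact codes instead of strings keeps keys short, and the reverse dictionary restores the readable values when results are returned.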

S270, take the combination containing all dimensions as the key input of the precomputation programming model, and the aggregate value as the value input of the precomputation programming model.

Specifically, the precomputation programming model computes level by level. During computation, the combination containing all dimensions is the key input of the precomputation programming model, and the aggregate value is its value input.

S280, obtain new dimension combinations and their corresponding aggregate values using the precomputation programming model.

Because the computation proceeds level by level, each newly computed dimension combination is a parent (upper-level) dimension combination of the combination containing all dimensions, and the aggregate value obtained is the aggregate value of that new dimension combination.

For example, Fig. 3 is a schematic diagram of the hierarchical relationship among all possible combinations of the dimensions of a data cube during data precomputation. Referring to Fig. 3, the current data cube contains four dimensions: A, B, C, and D. The bottom level is the dimension combination whose aggregate value is computed first, i.e. the combination containing all dimensions. Computation then proceeds upward level by level according to the hierarchy in Fig. 3 until the empty set is reached.

S290, determine whether data precomputation has been completed for all possible combinations. If so, execute S2120; otherwise, execute S2100.

Whether data precomputation has been completed for all possible dimension combinations can be confirmed by checking whether data precomputation of the empty set has been completed.

S2100, take the new dimension combinations as the key input of the precomputation programming model, and their corresponding aggregate values as the value input of the precomputation programming model.

Illustratively, if data precomputation has not been completed for all possible combinations, the new dimension combinations most recently computed by the precomputation programming model are used as the key input and their corresponding aggregate values as the value input, and data precomputation continues.

S2110, obtain new dimension combinations and their corresponding aggregate values using the precomputation programming model. Return to S290.
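The loop formed by S270 to S2110 can be imitated in plain Python. This sketch is not MapReduce and not the patent's implementation: it only shows how each level's combinations and (maximum, minimum) pairs can be derived from the level below by dropping one dimension at a time. The base-level values and the names `base` and `roll_up` are made up.

```python
# Made-up base level: (A, B, C, D) values -> (MAX(M), MIN(N)).
base = {
    (1, 2, 3, 4): (300, 50),
    (1, 2, 3, 6): (200, 90),
    (1, 2, 4, 3): (500, 60),
    (1, 2, 4, 8): (100, 20),
}

def roll_up(level):
    """Derive the next level: drop one dimension from each combination and merge."""
    parents = {}
    for names, cells in level.items():
        for drop in range(len(names)):
            parent_names = names[:drop] + names[drop + 1:]
            if parent_names in parents:
                continue  # already derived from a sibling combination
            merged = {}
            for values, (hi, lo) in cells.items():
                key = values[:drop] + values[drop + 1:]
                old_hi, old_lo = merged.get(key, (hi, lo))
                merged[key] = (max(old_hi, hi), min(old_lo, lo))
            parents[parent_names] = merged
    return parents

level = {("A", "B", "C", "D"): base}
cube = dict(level)
while () not in cube:        # S290: done once the empty set has been computed
    level = roll_up(level)   # S2100/S2110: previous output becomes the next input
    cube.update(level)

print(len(cube))
print(cube[("A", "B", "C")])
```

Maximum and minimum can be merged from partial results, which is why a parent combination can be computed from any one of its child combinations rather than from the raw fact data.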

Further, after the precomputation results are obtained, the precomputation time can be saved in the metadata.

The data precomputation process for a data cube is illustrated below by example.

Suppose the data cube contains four dimensions, A, B, C, and D, where the value of dimension A is 1, the value of dimension B is 2, the values of dimension C are (3, 4), and the values of dimension D are (4, 6, 3, 8). That is, the data cube contains four dimension tables and one fact table in total.

Further, a precomputation programming model task is started according to the metadata in the data analysis model, and the dimensions of the data cube are determined to have 16 possible combinations. The specified aggregation rules are aggregate maximum and aggregate minimum. After the values of each dimension are dictionary-encoded, data precomputation is performed on the dimension combination (A, B, C, D) to obtain the aggregate maximum MAX(M) and aggregate minimum MIN(N), which completes the first level of computation.

Table 1

Table 1 gives the aggregation results for the dimension combination (A, B, C, D). With (A, B, C, D) as the key input of MapReduce and MAX(M) and MIN(N) as the value input, the new dimension combinations (A, B, C), (A, B, D), (A, C, D), and (B, C, D) are obtained together with their corresponding aggregate values, which completes the second level of computation.

Table 2 lists, merely as an example, the data precomputation results of the dimension combination (A, B, C).

Table 2

A	B	C	MAX(M)	MIN(N)
1	2	3	300	50
1	2	4	500	20

Further, the new dimension combinations obtained by MapReduce in the second level of computation, namely (A, B, C), (A, B, D), (A, C, D), and (B, C, D), are used as the key input, and the MAX(M) and MIN(N) corresponding to each dimension combination as the value input. This yields new dimension combinations and their corresponding aggregate values, which completes the third level of computation. The new dimension combinations obtained are (A, B), (A, C), (A, D), (B, C), (B, D), and (C, D).

Table 3 lists, merely as an example, the data precomputation results of the dimension combination (A, B).

Table 3

A	B	MAX(M)	MIN(N)
1	2	500	20

Further, the new dimension combinations (A, B), (A, C), (A, D), (B, C), (B, D), and (C, D) obtained by MapReduce in the third level of computation are used as the key input, and the MAX(M) and MIN(N) corresponding to each dimension combination as the value input. This yields new dimension combinations and their corresponding aggregate values, which completes the fourth level of computation. The new dimension combinations obtained at this level are (A), (B), (C), and (D).

Table 4 lists, merely as an example, the data precomputation results of the dimension combination (A).

Table 4

A	MAX(M)	MIN(N)
1	500	20

Further, the new dimension combinations (A), (B), (C), and (D) obtained by MapReduce in the fourth level of computation are used as the key input, and the MAX(M) and MIN(N) corresponding to each dimension combination as the value input. This yields a new dimension combination and its corresponding aggregate values, which completes the last level of computation. The new dimension combination obtained at this level is the empty set $\varnothing$.

Table 5 lists the data precomputation result of the dimension combination $\varnothing$. Once the data precomputation result of the empty set is obtained, the precomputation of the data cube is confirmed to be complete.

Table 5

S2120, create an open-source database table for storing the precomputation results.

When the precomputation results are stored in the open-source database, they are logically stored in an open-source database table. Therefore, an open-source database table for storing the precomputation results is created in advance. The precomputation results of different data cubes correspond to different open-source database tables.

Referring to Fig. 4, this step can include:

S2121, determine the storage region capacity that the open-source database sets for open-source database tables.

Here a storage region is a logical storage region. In an open-source database, data is queried by querying open-source database tables. If the amount of data stored in an open-source database table keeps growing, queries become slower. The table can then be partitioned, i.e. logically divided into multiple storage regions. After partitioning, the open-source database table still appears as one complete table; a query simply reads the relevant storage regions of the table.

Typically, the open-source database presets the capacity of each storage region of a partitioned open-source database table. Once set, this capacity is generally kept unchanged as far as possible. For example, if the storage region capacity is set to 256 MB, each storage region can store at most 256 MB of data.

S2122, determine, from the size of the precomputation results, the number of storage regions required to store them in the open-source database table.

For example, if the size of the precomputation results is 3 GB and the storage region capacity is 256 MB, dividing 3 GB by 256 MB gives 12 (3 × 1024 ÷ 256 = 12), so 12 storage regions are required.

Optionally, if the computed number of storage regions is not an integer, the result is rounded up. For example, if the size of the precomputation results is 3.1 GB and the capacity of each storage region is 256 MB, dividing 3.1 GB by 256 MB gives 12.4, which rounds up to 13. That is, these precomputation results require 13 storage regions.
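The two worked examples above amount to a ceiling division, which can be written as a short function. The name `regions_needed` and its parameter names are illustrative; sizes are expressed in MB with 1 GB taken as 1024 MB, as in the text.

```python
import math

def regions_needed(result_size_mb, region_capacity_mb=256):
    """Number of storage regions needed to hold the precomputation results."""
    return math.ceil(result_size_mb / region_capacity_mb)

print(regions_needed(3 * 1024))     # 3 GB of results
print(regions_needed(3.1 * 1024))   # 3.1 GB of results
```

Rounding up guarantees that the last, partially filled region is still counted.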

S2123, send the number to the open-source database, so that the open-source database creates the open-source database table for storing the precomputation results according to that number.

Further, after the number of storage regions required by the precomputation results is determined, the number is sent to the open-source database, informing it of the size of the open-source database table needed to store the current precomputation results.

Specifically, after receiving the number, the open-source database creates the corresponding open-source database table, which contains that number of storage regions.

For example, if the current precomputation results require 12 storage regions, the open-source database creates an open-source database table that actually comprises 12 storage regions.

S2130, start a storage programming model task, with the precomputation results as its input.

Specifically, the storage programming model task is started according to the metadata. In this embodiment, the storage programming model is MapReduce, and the precomputation results are the input of MapReduce.

S2140, generate the corresponding binary format files using the storage programming model.

Further, the output of the storage programming model is a file in the open-source database's default binary format, i.e. a file in HFile format. The advantage is that this avoids the performance impact of inserting the precomputation results into the open-source database one by one.

S2150, import the binary format files into the open-source database table using the open-source database's BulkLoad, so that the precomputation results are stored in the open-source database.

Specifically, when the binary format files are imported into the open-source database table, the open-source database's BulkLoad allows them to be imported into the corresponding storage regions of the table simultaneously.

For example, suppose the binary format files contain 1 GB of precomputation results and the corresponding open-source database table contains 4 storage regions. With BulkLoad, the first 256 MB of precomputation results can be imported into the 1st storage region, the next 256 MB into the 2nd, the following 256 MB into the 3rd, and the last 256 MB into the 4th, all at the same time.
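The idea of loading several regions at once can be imitated with a thread pool. This sketch does not use the HBase BulkLoad API or HFiles; `split_into_regions` and `load_region` are hypothetical stand-ins that only show contiguous slices being handed to separate workers in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_regions(records, region_count):
    """Cut the ordered records into contiguous, nearly equal slices."""
    size = -(-len(records) // region_count)  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]

def load_region(region_id, chunk):
    # Stand-in for writing one slice into one storage region.
    return region_id, len(chunk)

records = [f"row-{i:04d}" for i in range(1000)]   # made-up precomputation rows
chunks = split_into_regions(records, 4)

with ThreadPoolExecutor(max_workers=4) as pool:
    loaded = dict(pool.map(load_region, range(len(chunks)), chunks))

print(loaded)   # rows written per region
```

Keeping each slice contiguous mirrors the example in the text, where consecutive 256 MB portions go to consecutive storage regions.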

In the present embodiment, the benefit carrying out partitioned storage to PostgreSQL database table is：On the one hand, precomputation result is being led Fashionable, multiple nodes can be had simultaneously to enter the operation of row write, accelerate the speed of data write by the principle of load balancing；Another Query capability, when inquiring about data, can be distributed to each destination node (i.e. memory block), be effectively prevented from data and incline by aspect Tiltedly, accelerate the speed of data query.

S2160, saving the correspondence between the precomputation result and the open-source database table into the metadata of the data analysis model.

Specifically, when the precomputation result is saved, the correspondence between the precomputation result and the open-source database table is saved in the configuration information of the metadata that corresponds to the open-source database. When data is queried, the open-source database table corresponding to a precomputation result can then be determined from this correspondence.

Optionally, the above method may further include: at set time intervals, managing the data cube and the metadata in the data analysis model.

Here, all data cubes in the platform and their corresponding metadata may be managed at set time intervals. Management of a data cube may include modification, query, calculation, deletion and the like.

For example, at set time intervals, newly recorded data in the external database is imported into the fact table and dimension table corresponding to the data cube, and the data cube, the corresponding precomputation result and the storage location of the precomputation result are modified in light of the newly imported data. The information about the modified data cube is then saved to the corresponding metadata, which completes the modification and calculation of the data cube.

Optionally, the interval may be set according to the actual situation. For example, in the business off-season the management of the data cube and metadata may be performed every two weeks, and in the business peak season it may be performed every week.
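A minimal Python sketch of such a season-dependent interval is given below. The function names and the boolean `is_peak_season` flag are assumptions made for illustration; how the season is actually determined is not specified here.

```python
from datetime import date, timedelta


def management_interval(is_peak_season):
    """One week in the peak season, two weeks in the off-season."""
    return timedelta(weeks=1) if is_peak_season else timedelta(weeks=2)


def next_management_date(last_run, is_peak_season):
    """Date on which the cube and its metadata are next due for management."""
    return last_run + management_interval(is_peak_season)
```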

The above steps are the query preparation process; the specific query process is described below. Referring to Fig. 5, the data query method provided by this embodiment specifically includes:

S510, when a query request sent by a client is obtained, parsing the query request and converting it into an open-source database query statement.

Fig. 6 is a schematic diagram of the user interface during a client query. Through the user interface, the user can select the dimensions and measures to be queried. As shown in Fig. 6, the user drags dimension A code 1, dimension C1 level 1, dimension D1 level 1 and measure MAX1 (aggregated as the maximum value) into the row primary key 61 as the rows of the query; of course, the user can drag any of the dimensions and indicators of user asset analysis model 1 according to actual needs. Specifically, the client generates a query request according to the user's selection and sends it to the server. Optionally, the user can also enter a filter condition in the filter column 62 in Fig. 6 to filter the query request.

Further, when the server obtains the query request, it converts the query request into a query statement that the open-source database can recognize.
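The conversion can be pictured with the following Python sketch. The request layout, the `to_query_statement` name and the dictionary-shaped "statement" are all hypothetical; the actual statement format depends on the database and is not specified here. The sketch only shows that a drag-and-drop selection carries enough information to identify which precomputed dimension combination to read.

```python
def to_query_statement(request):
    """Turn a client selection into a lookup descriptor (hypothetical format)."""
    # Sorting makes the result independent of the order in which the user
    # dragged the dimensions, so one selection maps to one combination.
    dimensions = sorted(request["dimensions"])
    return {
        "cuboid": "|".join(dimensions),
        "measure": request["measure"],
        "filters": dict(request.get("filters", {})),
    }


request = {
    "dimensions": ["D1 level 1", "A code 1", "C1 level 1"],
    "measure": "MAX1",
    "filters": {"A code 1": "a1"},
}
statement = to_query_statement(request)
```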

S520, sending the open-source database query statement to the open-source database, so as to retrieve the precomputation result according to the open-source database query statement and form a query result.

Specifically, when the open-source database receives the open-source database query statement, it determines the open-source database table corresponding to the statement according to the correspondence between precomputation results and open-source database tables, looks up in that table the precomputation result corresponding to the statement, and forms the query result.

Further, when the open-source database table is queried, only the first and last data stored in each storage area need to be checked at first. If this confirms that the result corresponding to the open-source database query statement falls within the data range stored in a certain storage area, the data query is carried out in that storage area to determine the precomputation result corresponding to the statement.
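The idea of checking only the first and last entries of each storage area can be sketched in Python as below. It assumes that each storage area holds its rows sorted by key, which is what makes the first and last keys a valid range check; `find_area` and `lookup` are hypothetical names, and in-memory lists stand in for the storage areas.

```python
from bisect import bisect_left


def find_area(areas, key):
    """Pick the storage area whose first..last key range contains key."""
    for index, rows in enumerate(areas):
        if rows and rows[0][0] <= key <= rows[-1][0]:
            return index
    return None


def lookup(areas, key):
    """Search only the one storage area whose range can contain key."""
    index = find_area(areas, key)
    if index is None:
        return None
    rows = areas[index]
    keys = [row_key for row_key, _ in rows]
    position = bisect_left(keys, key)
    if position < len(keys) and keys[position] == key:
        return rows[position][1]
    return None


# Two storage areas, each a list of (key, precomputed value) sorted by key.
areas = [[("a", 1), ("c", 3)], [("d", 4), ("f", 6)]]
```

Only the selected storage area is searched in full; every other area costs two key comparisons.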

S530, encapsulating the query result and returning it to the client, so that the client responds to the query result.

Specifically, when the client receives the query result, it parses the query result and displays it in the display area 63 of the graphical user interface in Fig. 6, where the default display format is a table. Optionally, the user can also select the display type of the query result through the display type module 64 in Fig. 6 and enter the number of items to display through the item count module 65.

The technical solution provided by this embodiment combines MOLAP with big data to enable queries over massive data and to improve query speed. In addition, no SQL statement needs to be written at query time, so non-technical users can also perform queries, which improves the user experience.

Embodiment three

Fig. 7 is a schematic structural diagram of a MOLAP-based data processing apparatus provided by Embodiment Three of the present invention. Referring to Fig. 7, the data processing apparatus provided by this embodiment specifically includes a cube creation module 701, a precomputation module 702 and a saving module 703.

The cube creation module 701 is configured to create a data cube according to a fact table and a dimension table. The precomputation module 702 is configured to perform, based on the data recorded in the data cube, data precomputation on all possible combinations of dimensions. The saving module 703 is configured to save the precomputation result to an open-source database, so that a query result is determined according to the precomputation result at query time.

In the technical solution provided by this embodiment, a data cube is created from a fact table and its associated dimension table, data precomputation is performed on all possible combinations of the dimensions in the data cube, and the precomputation result is saved in an open-source database, so that MOLAP-based big data computation and storage can be realized. In addition, when querying, the user does not need to write SQL statements and only needs to drag dimensions and measures in the client page; the server can then determine the query result from the corresponding precomputation result, which simplifies the data query process and improves query response speed.

On the basis of the above embodiment, the cube creation module 701 includes: a data table creating unit, configured to create a corresponding fact table and dimension table according to the table entry requirements for the fact table and dimension table in a preset data analysis model; a data import unit, configured to import data from an external database into the fact table and dimension table according to the table entry requirements of the fact table and dimension table; and a cube creating unit, configured to create a data cube from the fact table and dimension table according to the metadata in the data analysis model, where the metadata indicates the attribute parameters and creation rules of the data cube.

On the basis of the above embodiment, the precomputation module 702 includes: a precomputation task starting unit, configured to start a precomputation programming-model task according to the metadata in the data analysis model and to read the data of all dimension tables and fact tables corresponding to the data cube; a combination unit, configured to permute and combine the dimensions of all the dimension tables to obtain all possible combinations, including the empty set; an aggregation unit, configured to perform an aggregation operation on the combination containing all dimensions according to a set aggregation rule to obtain an aggregate value; an input value determining unit, configured to use the combination containing all dimensions as the key input of the precomputation programming model and the aggregate value as the value input of the precomputation programming model; a result generating unit, configured to obtain, by using the precomputation programming model, a new dimension combination and the aggregate value corresponding to the new dimension combination; and a loop unit, configured to successively use the new dimension combination as the key input of the precomputation programming model and the aggregate value corresponding to the new dimension combination as the value input of the precomputation programming model, and to obtain a further new dimension combination and its corresponding aggregate value by using the precomputation programming model, until all possible combinations and the aggregate values of all possible combinations have been obtained.
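The layer-by-layer loop performed by these units can be sketched in plain Python as follows. This is a single-process illustration, not a MapReduce job: `all_combinations`, `aggregate` and `precompute` are hypothetical names, each pass of the `while` loop plays the role of one programming-model round, and the approach assumes an aggregation that can be merged from partial results (maximum, minimum or sum, but not a plain average).

```python
from itertools import combinations


def all_combinations(dims):
    """Every subset of the dimensions, from the full set down to the empty set."""
    return [c for r in range(len(dims), -1, -1) for c in combinations(dims, r)]


def aggregate(pairs, agg):
    """Merge (key, value) pairs that share a key using a binary aggregation."""
    cells = {}
    for key, value in pairs:
        cells[key] = agg(cells[key], value) if key in cells else value
    return cells


def precompute(rows, dims, measure, agg=max):
    """Map each dimension combination to its {dimension values: aggregate}."""
    dims = tuple(dims)
    # Base layer: aggregate the raw rows on the combination of all dimensions.
    base = aggregate(((tuple(r[d] for d in dims), r[measure]) for r in rows), agg)
    cube = {dims: base}
    frontier = [dims]
    while frontier:
        next_frontier = []
        for parent in frontier:
            for drop in range(len(parent)):
                # New combination: the parent with one dimension removed.
                child = parent[:drop] + parent[drop + 1:]
                if child in cube:
                    continue
                pairs = ((k[:drop] + k[drop + 1:], v) for k, v in cube[parent].items())
                cube[child] = aggregate(pairs, agg)
                next_frontier.append(child)
        frontier = next_frontier
    return cube


rows = [
    {"A": "a1", "C": "c1", "amount": 5},
    {"A": "a1", "C": "c2", "amount": 7},
    {"A": "a2", "C": "c1", "amount": 3},
]
cube = precompute(rows, ["A", "C"], "amount", max)
```

Each new combination is derived from an already aggregated parent rather than from the raw rows, which is why the combination containing all dimensions is computed first.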

On the basis of the above embodiment, the saving module 703 includes: a table building unit, configured to create an open-source database table for storing the precomputation result; a storage task starting unit, configured to start a storage programming-model task with the precomputation result as the input of the storage programming-model task; a file generating unit, configured to generate a corresponding binary format file by using the storage programming model; a result import unit, configured to import the binary format file into the open-source database table by using the BulkLoad of the open-source database, so that the precomputation result is stored in the open-source database; and a relation storing unit, configured to save the correspondence between the precomputation result and the open-source database table into the metadata of the data analysis model.

On the basis of the above embodiment, the table building unit includes: a capacity determining subunit, configured to determine the capacity of the storage area set for the open-source database table in the open-source database; a quantity determining subunit, configured to determine, according to the size of the precomputation result, the number of storage areas required to store the precomputation result in the open-source database table; and a creating subunit, configured to send this number to the open-source database, so that the open-source database creates, according to the number, the open-source database table for storing the precomputation result.
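One plausible reading of the quantity calculation is a ceiling division of the result size by the capacity of one storage area, as in the Python sketch below. The function name and the rule of always creating at least one storage area are assumptions made for illustration.

```python
def required_storage_areas(result_size, area_capacity):
    """Smallest number of storage areas whose total capacity covers result_size."""
    if area_capacity <= 0:
        raise ValueError("area_capacity must be positive")
    # Integer ceiling division; an empty result still gets one storage area.
    return max(1, -(-result_size // area_capacity))
```

For instance, with sizes expressed in M, a 1024M precomputation result and a 256M storage area capacity give four storage areas.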

On the basis of the above embodiment, the apparatus further includes: a management module, configured to manage, at set time intervals, the data cube and the metadata in the data analysis model.

On the basis of the above embodiment, the apparatus further includes: a statement obtaining module, configured to parse a query request sent by a client when the query request is obtained and convert it into an open-source database query statement; a result query module, configured to send the open-source database query statement to the open-source database, so as to retrieve the precomputation result according to the open-source database query statement and form a query result; and a result returning module, configured to encapsulate the query result and return it to the client, so that the client responds to the query result.

On the basis of the above embodiment, the metadata includes at least one of the following: the creation time of the data cube, the creation location of the data cube, the name of the data cube, the facts of the data cube, the dimensions of the data cube and their order, the measures of the data cube, the aggregation type, the configuration information corresponding to the programming model, the configuration information corresponding to the open-source database, the precomputation time, and the storage information of the precomputation result.
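As an illustration only, the metadata items listed above could be held in a record such as the following Python sketch. The class, field names and types are hypothetical; in particular, representing the storage information as a mapping from a precomputation result identifier to a table name is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CubeMetadata:
    name: str
    created_at: str
    location: str
    facts: List[str]
    dimensions: List[str]  # kept in their configured order
    measures: List[str]
    aggregation_type: str
    programming_model_config: Dict[str, str] = field(default_factory=dict)
    database_config: Dict[str, str] = field(default_factory=dict)
    precomputation_time: str = ""
    # Assumed shape: precomputation result identifier -> database table name.
    result_storage: Dict[str, str] = field(default_factory=dict)

    def table_for(self, result_id: str) -> Optional[str]:
        """Look up which table holds a given precomputation result."""
        return self.result_storage.get(result_id)


meta = CubeMetadata(
    name="user asset analysis model 1",
    created_at="2016-10-13",
    location="/cubes/model1",
    facts=["fact_table"],
    dimensions=["A", "C1", "D1"],
    measures=["MAX1"],
    aggregation_type="max",
    result_storage={"model1_result": "model1_table"},
)
```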

The MOLAP-based data processing apparatus provided by the embodiment of the present invention can execute the MOLAP-based data processing method provided by any of the above embodiments, and has the corresponding functions and beneficial effects.

Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them; it may include further equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.

Claims

1. A MOLAP-based data processing method, characterized by comprising:

creating a data cube according to a fact table and a dimension table;

performing, based on the data recorded in the data cube, data precomputation on all possible combinations of dimensions;

saving the precomputation result to an open-source database, so that a query result is determined according to the precomputation result at query time.

2. The data processing method according to claim 1, characterized in that creating a data cube according to a fact table and a dimension table comprises:

creating a corresponding fact table and dimension table according to the table entry requirements for the fact table and dimension table in a preset data analysis model;

importing data from an external database into the fact table and dimension table according to the table entry requirements of the fact table and dimension table;

creating a data cube from the fact table and the dimension table according to the metadata in the data analysis model, wherein the metadata indicates the attribute parameters and creation rules of the data cube.

3. The data processing method according to claim 1, characterized in that performing, based on the data recorded in the data cube, data precomputation on all possible combinations of dimensions comprises:

starting a precomputation programming-model task according to the metadata in a data analysis model, and reading the data of all dimension tables and fact tables corresponding to the data cube;

permuting and combining the dimensions of all the dimension tables to obtain all possible combinations, including the empty set;

performing an aggregation operation on the combination containing all dimensions according to a set aggregation rule to obtain an aggregate value;

using the combination containing all dimensions as the key input of the precomputation programming model, and using the aggregate value as the value input of the precomputation programming model;

obtaining, by using the precomputation programming model, a new dimension combination and the aggregate value corresponding to the new dimension combination;

successively using the new dimension combination as the key input of the precomputation programming model and the aggregate value corresponding to the new dimension combination as the value input of the precomputation programming model, and obtaining a further new dimension combination and its corresponding aggregate value by using the precomputation programming model, until all possible combinations and the aggregate values of all possible combinations have been obtained.

4. The data processing method according to claim 1, characterized in that saving the precomputation result to an open-source database comprises:

creating an open-source database table for storing the precomputation result;

starting a storage programming-model task, with the precomputation result as the input of the storage programming-model task;

generating a corresponding binary format file by using the storage programming model;

importing the binary format file into the open-source database table by using the BulkLoad of the open-source database, so that the precomputation result is stored in the open-source database;

saving the correspondence between the precomputation result and the open-source database table into the metadata of the data analysis model.

5. The data processing method according to claim 4, characterized in that creating an open-source database table for storing the precomputation result comprises:

determining the capacity of the storage area set for the open-source database table in the open-source database;

determining, according to the size of the precomputation result, the number of storage areas required to store the precomputation result in the open-source database table;

sending the number to the open-source database, so that the open-source database creates, according to the number, the open-source database table for storing the precomputation result.

6. The data processing method according to claim 1, characterized by further comprising:

managing, at set time intervals, the data cube and the metadata in the data analysis model.

7. The data processing method according to claim 1, characterized in that, after saving the precomputation result to the open-source database, the method further comprises:

when a query request sent by a client is obtained, parsing the query request and converting it into an open-source database query statement;

sending the open-source database query statement to the open-source database, so as to retrieve the precomputation result according to the open-source database query statement and form a query result;

encapsulating the query result and returning it to the client, so that the client responds to the query result.

8. The data processing method according to any one of claims 2-6, characterized in that the metadata includes at least one of the following:

the creation time of the data cube, the creation location of the data cube, the name of the data cube, the facts of the data cube, the dimensions of the data cube and their order, the measures of the data cube, the aggregation type, the configuration information corresponding to the programming model, the configuration information corresponding to the open-source database, the precomputation time, and the storage information of the precomputation result.

9. A MOLAP-based data processing apparatus, characterized by comprising:

a cube creation module, configured to create a data cube according to a fact table and a dimension table;

a precomputation module, configured to perform, based on the data recorded in the data cube, data precomputation on all possible combinations of dimensions;

a saving module, configured to save the precomputation result to an open-source database, so that a query result is determined according to the precomputation result at query time.

10. The data processing apparatus according to claim 9, characterized in that the cube creation module comprises:

a data table creating unit, configured to create a corresponding fact table and dimension table according to the table entry requirements for the fact table and dimension table in a preset data analysis model;

a data import unit, configured to import data from an external database into the fact table and dimension table according to the table entry requirements of the fact table and dimension table;

a cube creating unit, configured to create a data cube from the fact table and the dimension table according to the metadata in the data analysis model, wherein the metadata indicates the attribute parameters and creation rules of the data cube.