CN106528787A - Mass data multi-dimensional analysis-based query method and device - Google Patents

Publication number
CN106528787A
Authority
CN
China
Prior art keywords
dimension
data
tables
subcube
name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610985200.6A
Other languages
Chinese (zh)
Other versions
CN106528787B (en)
Inventor
翟东波
宋少峰
任永强
江志鹏
周盛
董亚卫
潘柏宇
王冀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING GAODE YUNTU TECHNOLOGY Co.,Ltd.
Original Assignee
1Verge Internet Technology Beijing Co Ltd
Application filed by 1Verge Internet Technology Beijing Co Ltd
Priority to CN201610985200.6A
Publication of CN106528787A
Application granted
Publication of CN106528787B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Abstract

The invention discloses a query method and device based on multidimensional analysis of mass data. The method comprises: receiving a query request, sent by a user, that carries dimension information to be queried; querying, according to the dimension information, the data corresponding to the dimension information in a pre-established subcube table; when the data corresponding to the dimension information is found, returning the data to the user; and when the data is not found, querying the data corresponding to the dimension information in a pre-established cube table, returning the data to the user, and collecting the dimension names included in the dimension information as a dimension combination, wherein the subcube table is composed of some of the columns of the cube table. With this method, the subcube table has fewer rows than the cube table, so querying the pre-established subcube table first effectively improves query efficiency; moreover, the subcube table covers the dimension combinations of only some of the dimensions and not all dimension combinations need to be listed, which effectively reduces the amount of computation.

Description

Query method and device based on multidimensional analysis of mass data
Technical field
The present application relates to the field of computer technology, and in particular to a query method and device based on multidimensional analysis of mass data.
Background
With the continuous development of computer technology, more and more enterprises have begun to use on-line analytical processing (OLAP) to give decision makers data support from multiple angles: the data corresponding to each dimension is queried from different dimensions, and the queried data is analyzed in order to make the corresponding decisions.
At present, an enterprise user can use on-line analytical processing to store the collected multidimensional data in a data warehouse in a fixed storage format. Subsequently, according to the dimensions the enterprise user requires, on-line analytical processing can perform complex queries over large volumes of data quickly and flexibly, and present the query results to the enterprise user in an intuitive, easily understood form.
In the prior art, on-line analytical processing mainly stores and queries multidimensional data in the form of a relational database. The multidimensional data in the database is therefore mainly stored as a cube table (a cube table is a two-dimensional table), in which each dimension is one column. For example, assuming there are only dimension A and dimension B, the cube table is as shown in Table 1:
Dimension A    Dimension B    Fact value
A1             B1             1
A1             B2             1
A2             B1             1
A2             B2             1
Table 1
In addition, on-line analytical processing can also store and query multidimensional data in the form of a multidimensional data organization. In that case, when multidimensional data is stored, the database needs to combine the dimensions, and the combined data is stored in Key-Value form. For example, assume the values of each dimension are as shown in Table 2:
Dimension A    Dimension B
A1             B1
A2             B2
Table 2
The database combines the dimensions (that is, AB, A, and B) and uses Key-Value storage; the keys stored in the K-V store are as shown in Table 3:
A1,B2
A1,B1
A2,B1
A2,B2
A1
A2
B1
B2
Table 3
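The key enumeration behind Table 3 can be sketched as follows. This is a minimal illustration, not the storage engine described in the prior art: the function name `enumerate_keys` and the dictionary layout are assumptions made for the example, and only the dimension values come from Table 2.

```python
from itertools import combinations, product

# Distinct values of each dimension, taken from Table 2.
dimension_values = {"A": ["A1", "A2"], "B": ["B1", "B2"]}


def enumerate_keys(dim_values):
    """Return every key that a full pre-combination of the dimensions would store."""
    names = sorted(dim_values)
    keys = []
    # Every non-empty subset of dimension names: (A), (B), (A, B).
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # Every combination of values for the chosen dimensions.
            for values in product(*(dim_values[name] for name in subset)):
                keys.append(",".join(values))
    return keys


keys = enumerate_keys(dimension_values)
print(len(keys), keys)
```

With two dimensions of two values each, the sketch already produces eight keys; the count grows quickly as dimensions and values are added, which is the cost the following paragraph refers to.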
However, when on-line analytical processing stores and queries multidimensional data in the form of a relational database, if there are many dimensions and the data volume is large, the cube table contains a large number of rows. During a query, the rows of the cube table must be filtered one by one according to the queried dimension information and the filtered data must then be aggregated, which makes queries inefficient. Meanwhile, when on-line analytical processing stores and queries multidimensional data in the form of a multidimensional data organization, all dimensions must be enumerated exhaustively in advance and the related dimension combinations stored for all of them, which entails a huge amount of computation.
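The row-by-row filtering and aggregation on a relational cube table can be pictured with the sketch below. It uses the rows of Table 1; the function name `query_cube`, the column key `fact`, and the choice of summation as the aggregation are assumptions for illustration only.

```python
# Cube table from Table 1, one dict per row (the key names are illustrative).
cube_rows = [
    {"A": "A1", "B": "B1", "fact": 1},
    {"A": "A1", "B": "B2", "fact": 1},
    {"A": "A2", "B": "B1", "fact": 1},
    {"A": "A2", "B": "B2", "fact": 1},
]


def query_cube(rows, dimension_info):
    """Scan every row, keep the rows matching the dimension info, sum their facts."""
    total = 0
    matched = False
    for row in rows:  # cost grows linearly with the number of rows in the cube table
        if all(row[name] == value for name, value in dimension_info.items()):
            total += row["fact"]
            matched = True
    return total if matched else None


result = query_cube(cube_rows, {"A": "A1"})
print(result)
```

Every query touches every row, so the cost of a single lookup is proportional to the size of the whole cube table.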
Summary of the invention
The embodiments of the present application provide a query method and device for multidimensional analysis of mass data, to solve two problems in the prior art: low query efficiency when on-line analytical processing stores and queries multidimensional data in the form of a relational database, and the huge amount of computation caused by exhaustively enumerating all dimensions in advance when on-line analytical processing stores and queries multidimensional data in the form of a multidimensional data organization.
An embodiment of the present application provides a query method for multidimensional analysis of mass data, comprising:
receiving a query request sent by a user that carries dimension information to be queried, wherein the dimension information includes a dimension name and a dimension value;
querying, according to the dimension information, the data corresponding to the dimension information in a pre-established subcube table;
when the data corresponding to the dimension information is found, returning the data to the user; when the data corresponding to the dimension information is not found, querying the data corresponding to the dimension information in a pre-established cube table, returning the data to the user, and collecting the dimension names contained in the dimension information together as one dimension combination, wherein the subcube table is composed of some of the columns of the cube table.
Preferably, query information of the user carrying dimension names to be queried is obtained in advance; the data corresponding to the dimension names is queried according to the dimension names; and the subcube table is established according to the dimension names and the data corresponding to the dimension names.
Preferably, when data in the pre-established cube table is updated, the method further comprises: obtaining the updated data from the pre-established cube table, performing dimension reduction on the obtained data, and updating the dimension-reduced data into the pre-established subcube table.
Preferably, for any piece of obtained data, the subcube tables that contain at least one of the dimension names corresponding to that data are determined; for each subcube table so determined, the dimensions of the data are reduced to match the dimensions of that subcube table, and the subcube table is searched for data with the same dimension values as the data; when data with the same dimension values is found, the data is merged; when no data with the same dimension values is found, the data is added directly to the subcube table.
Preferably, the method further comprises: within a specific time, classifying identical collected dimension combinations into one group and counting the number of times the dimension combination in each group was collected; when the number of times a dimension combination was collected exceeds a preset threshold, creating a new subcube table, querying the pre-established cube table for the data corresponding to the dimension names contained in that dimension combination, merging the data whose dimension values for those dimension names are identical, and adding the merged data to the new subcube table.
An embodiment of the present application provides a query device for multidimensional analysis of mass data, comprising:
a receiving module, configured to receive a query request sent by a user that carries dimension information to be queried, wherein the dimension information includes a dimension name and a dimension value;
a query module, configured to query, according to the dimension information, the data corresponding to the dimension information in a pre-established subcube table;
a data return module, configured to: when the data corresponding to the dimension information is found, return the data to the user; when the data corresponding to the dimension information is not found, query the data corresponding to the dimension information in a pre-established cube table, return the data to the user, and collect the dimension names contained in the dimension information together as one dimension combination, wherein the subcube table is composed of some of the columns of the cube table.
Preferably, the device further comprises a pre-establishment module, configured to obtain in advance query information of the user carrying dimension names to be queried, query the data corresponding to the dimension names according to the dimension names, and establish the subcube table according to the dimension names and the data corresponding to the dimension names.
Preferably, the device further comprises a first update module, configured to: when data in the pre-established cube table is updated, obtain the updated data from the pre-established cube table, perform dimension reduction on the obtained data, and update the dimension-reduced data into the pre-established subcube table.
Preferably, the first update module is specifically configured to: for any piece of obtained data, determine the subcube tables that contain at least one of the dimension names corresponding to that data; for each subcube table so determined, reduce the dimensions of the data to match the dimensions of that subcube table and search the subcube table for data with the same dimension values as the data; when data with the same dimension values is found, merge the data; when no data with the same dimension values is found, add the data directly to the subcube table.
Preferably, the device further comprises a second update module, configured to: within a specific time, classify identical collected dimension combinations into one group and count the number of times the dimension combination in each group was collected; when the number of times a dimension combination was collected exceeds a preset threshold, create a new subcube table, query the pre-established cube table for the data corresponding to the dimension names contained in that dimension combination, merge the data whose dimension values for those dimension names are identical, and add the merged data to the new subcube table.
The embodiments of the present application provide a query method and device for multidimensional analysis of mass data. The method first receives a query request sent by a user that carries dimension information to be queried, wherein the dimension information includes a dimension name and a dimension value, and queries, according to the dimension information, the data corresponding to the dimension information in a pre-established subcube table. When the data is found, it is returned to the user; when it is not found, the data corresponding to the dimension information is queried in a pre-established cube table and returned to the user, and the dimension names contained in the dimension information are collected as a dimension combination, wherein the subcube table is composed of some of the columns of the cube table. Because the subcube table is composed of some of the columns of the cube table, it has fewer rows than the cube table; a query is first run against the pre-established subcube table, which effectively improves query efficiency. Moreover, the subcube table contains the dimension combinations of only some of the dimensions, and not all dimension combinations need to be enumerated, which effectively reduces the amount of computation.
Description of the drawings
The drawings described here are provided for further understanding of the present application and form a part of it. The illustrative embodiments of the present application and their description are used to explain the present application and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of a query process for multidimensional analysis of mass data provided by an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a query device for multidimensional analysis of mass data provided by an embodiment of the present application.
Detailed description
To make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions of the present application are described clearly and completely below with reference to specific embodiments and the corresponding drawings. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments in the present application without creative effort fall within the scope of protection of the present application.
Fig. 1 shows the query process for multidimensional analysis of mass data provided by an embodiment of the present application, which specifically includes the following steps:
S101: Receive a query request sent by a user that carries dimension information to be queried.
In practice, an enterprise user can use on-line analytical processing to store the collected multidimensional data in a data warehouse in a fixed storage format. Subsequently, according to the dimensions the enterprise user requires, on-line analytical processing can perform complex queries over large volumes of data quickly and flexibly and present the query results to the enterprise user in an intuitive, easily understood form.
Before querying the required dimensions, the enterprise user needs to store the collected multidimensional data in the data warehouse.
Further, the present application aims to pull certain columns out of the cube table according to the user's actual needs and to build a separate table from those columns. Reducing the number of columns in this way also reduces the number of rows: the separately built table has fewer columns than the original cube table and, once rows with identical dimension values are merged, fewer rows as well. With fewer rows, queries on this table are faster. Therefore, in the present application, some columns can be extracted from the pre-stored cube table according to actual needs, and a table can be built from the extracted columns.
It should be noted here that, to distinguish clearly between the original cube table and the table built from the columns extracted according to actual needs, the present application defines the latter as a subcube table; that is, a subcube table is composed of some of the columns of the cube table. Moreover, for a given cube table, the user's actual needs usually involve several different dimension combinations. For each of these dimension combinations, a subcube table can be built from the data in the cube table; that is, multiple subcube tables with different dimension combinations can be built from the same cube table.
Further, the present application provides a method for pre-establishing a subcube table, as follows:
Query information of the user carrying dimension names to be queried is obtained in advance; the data corresponding to the dimension names is queried according to the dimension names; and a subcube table is established according to the dimension names and the corresponding data.
It should be noted here that the data corresponding to a dimension name includes the dimension values and the fact values corresponding to those dimension values. The dimension names to be queried may be a single dimension name or a combination of several dimension names; how many are included depends on the user's actual needs.
In addition, when a subcube table is established according to the dimension names and the corresponding data, a subcube table containing the fact value name and the dimension names is created first, with each column corresponding to one dimension name. Fact values whose dimensions (that is, dimension names and dimension values) are identical are merged, and the merged fact value and its dimension values are filled into the subcube table. For example, if fact value 1 corresponds to the dimensions province = Beijing and quarter = first quarter, and fact value 2 also corresponds to province = Beijing and quarter = first quarter, the two fact values can be merged: the merged fact value is 3 (that is, 1 + 2), and it is filled into the fact value column for province = Beijing and quarter = first quarter. If fact value 2 instead corresponds to dimension A (province = Beijing) and dimension C (quarter = second quarter), the fact values cannot be merged.
For example, to explain the present application simply and clearly, assume the cube table in the data warehouse is as shown in Table 4:
Dimension A    Dimension B    Dimension C    Fact value
A1             B1             C1             1
A1             B2             C2             1
A2             B1             C1             1
A2             B2             C2             1
Table 4
Assume user A needs a subcube table built according to actual needs. The data warehouse therefore obtains user A's query information carrying the dimension name to be queried, dimension A, and queries the data corresponding to dimension A, as shown in Table 5:
Dimension A    Fact value
A1             1
A1             1
A2             1
A2             1
Table 5
A subcube table containing the fact value name and dimension A (the dimension name) is created, with each column corresponding to one dimension name. Fact values whose dimensions (dimension name and dimension value) are identical are merged, and the merged fact values and their dimension values are filled into the subcube table, as shown in Table 6:
Dimension A    Fact value
A1             2
A2             2
Table 6
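The construction of Table 6 from Table 4 can be sketched as a projection onto the chosen dimension names followed by a merge of the fact values. This is a minimal illustration under assumptions: the function name `build_subcube`, the dictionary representation, and the column key `fact` are hypothetical, and merging is taken to be addition, as in the 1 + 2 = 3 example above.

```python
from collections import defaultdict

# Cube table from Table 4, one dict per row (the key names are illustrative).
cube_rows = [
    {"A": "A1", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A1", "B": "B2", "C": "C2", "fact": 1},
    {"A": "A2", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A2", "B": "B2", "C": "C2", "fact": 1},
]


def build_subcube(rows, dimension_names):
    """Keep only the given dimension columns and merge rows with equal dimension values."""
    merged = defaultdict(int)
    for row in rows:
        key = tuple(row[name] for name in dimension_names)
        merged[key] += row["fact"]  # merging is assumed to mean summing fact values
    return dict(merged)


subcube_a = build_subcube(cube_rows, ["A"])
print(subcube_a)
```

The four rows of the cube table collapse into two rows, which is why a lookup on the subcube table touches less data than a scan of the cube table.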
Further, once the subcube table has been established, the user can send a query request carrying the dimension information to be queried to the data warehouse through a terminal, to query the data corresponding to the required dimensions.
It should be noted here that a query needs to know in which row and which column the queried data is located; the dimension information to be queried therefore includes both dimension names and dimension values.
In addition, the dimension information to be queried may contain a single dimension name, such as dimension A, or a combination of several dimension names, such as dimension A and dimension B; how many dimension names it contains depends on the user's actual needs.
Continuing the example above, user A wants to query the data corresponding to dimension A = A1, so user A sends a query request carrying the dimension information A = A1 to the data warehouse through a terminal.
S102: According to the dimension information, query the data corresponding to the dimension information in the pre-established subcube table.
After receiving the user's query request carrying the dimension information to be queried, the data warehouse queries the pre-established subcube table directly, using the dimension names and dimension values contained in the dimension information, for the data corresponding to the dimension information.
It should be noted here that the data corresponding to the dimension information may include a fact value.
Continuing the example, after receiving user A's query request, the data warehouse queries the pre-established subcube table (Table 6) directly according to dimension A = A1 and finds that the data corresponding to the dimension information is: fact value = 2.
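Step S102 can be pictured as a keyed lookup rather than a scan. The sketch below holds Table 6 as a dictionary; the names `subcubes` and `query_subcubes` are hypothetical, and matching a query only against a subcube table with exactly the same dimension names is an assumption of this sketch, since the text does not spell out the matching rule.

```python
# Subcube tables keyed by their dimension names; the single entry holds Table 6.
subcubes = {
    ("A",): {("A1",): 2, ("A2",): 2},
}


def query_subcubes(tables, dimension_info):
    """Look up the fact value for the dimension info, or return None when not found."""
    names = tuple(sorted(dimension_info))
    table = tables.get(names)  # assumption: exact match on the dimension names
    if table is None:
        return None
    key = tuple(dimension_info[name] for name in names)
    return table.get(key)


found = query_subcubes(subcubes, {"A": "A1"})
print(found)
```

A result of `None` signals that the data was not found in any subcube table, which is the case handled by step S103.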
S103: When the data corresponding to the dimension information is found, return the data to the user; when it is not found, query the data corresponding to the dimension information in the pre-established cube table, return the data to the user, and collect the dimension names contained in the dimension information together as one dimension combination.
When the data corresponding to the dimension information is found in the pre-established subcube table, the data is returned to the user.
Continuing the example, the data found for the dimension information, fact value = 2, is returned to user A.
However, the dimension names used to pre-establish the subcube table are determined from the user's actual needs, and the user determines those needs only from experience or historical data. Because experience and historical data are limited, in practice the dimension a user queries may not be in any pre-established subcube table. When the data corresponding to the dimension information is not found, it can only be queried in the pre-established cube table and then returned to the user.
Further, in practice, although the data warehouse did not find the data for the dimension information in the pre-established subcube tables, the query itself suggests that users may query this dimension information again later. Moreover, although the current query asked only for one dimension value under a dimension name, the user may later query other dimension values under the same dimension name. Therefore, while the data corresponding to the dimension information is queried in the pre-established cube table and returned to the user, the dimension names contained in the dimension information are collected as a dimension combination.
For example, assume user A from the example above queries the data corresponding to dimension B = B1. The data warehouse does not find the data in the pre-established subcube table (Table 6), so it queries the pre-established cube table (Table 4), obtains fact value = 2, and returns the data to user A, while collecting dimension B as a dimension combination.
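The fallback in step S103 can be sketched as below, using Table 4 as the cube table and Table 6 as the only subcube table. The names `query` and `collected`, the in-memory list used for collection, and the exact-match rule on dimension names are assumptions made for illustration.

```python
# Cube table (Table 4) and the pre-established subcube table (Table 6).
cube_rows = [
    {"A": "A1", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A1", "B": "B2", "C": "C2", "fact": 1},
    {"A": "A2", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A2", "B": "B2", "C": "C2", "fact": 1},
]
subcubes = {("A",): {("A1",): 2, ("A2",): 2}}
collected = []  # dimension combinations that missed every subcube table


def query(dimension_info):
    """Try the subcube tables first; on a miss, scan the cube table and record the miss."""
    names = tuple(sorted(dimension_info))
    key = tuple(dimension_info[name] for name in names)
    table = subcubes.get(names)
    if table is not None and key in table:
        return table[key]
    # Miss: collect the dimension names as one combination, then fall back to the cube.
    collected.append(names)
    return sum(
        row["fact"]
        for row in cube_rows
        if all(row[name] == value for name, value in dimension_info.items())
    )


hit = query({"A": "A1"})   # answered from the subcube table
miss = query({"B": "B1"})  # answered from the cube table, and ("B",) is collected
print(hit, miss, collected)
```

Only the dimension names are recorded, not the dimension values, which matches the reasoning above that other values under the same dimension name are likely to be queried later.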
Further, deciding after every single collection whether to build a subcube table from the dimension names in the collected dimension combination would waste considerable computing resources. Therefore, in the present application, the decision can be made after collecting for a certain period of time, where the period can be set according to actual needs.
Further, the process of deciding whether to build a subcube table from the dimension names contained in the collected dimension combinations is as follows:
Within a specific time, identical collected dimension combinations are classified into one group, and the number of times the dimension combination in each group was collected is counted. When that number exceeds a preset threshold, a new subcube table is created; the data corresponding to the dimension names contained in the dimension combination is queried in the pre-established cube table; the data whose dimension values for those dimension names are identical is merged; and the merged data is added to the new subcube table. The specific time here is the same as the certain period of time mentioned above.
For example, assume the specific time is one day, and the dimension combinations collected during that day are as shown in Table 7:
Dimension B
Dimension B
Dimension A and dimension B
Dimension C
Dimension B
Dimension B and dimension C
Table 7
Identical collected dimension combinations are classified into one group, and the number of times the dimension combination in each group was collected is counted, as shown in Table 8:
Dimension combination          Number of times
Dimension B                    4
Dimension A and dimension B    1
Dimension C                    1
Dimension B and dimension C    1
Table 8
Assume the preset threshold is 3. The data warehouse determines that the dimension combination containing only dimension B was collected more times than the preset threshold of 3, so it creates a new subcube table, queries the pre-established cube table (Table 4) for the data corresponding to dimension B, merges the data whose dimension B values are identical, and adds the merged data to the new subcube table, as shown in Table 9:
Dimension B    Fact value
B1             2
B2             2
Table 9
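The grouping, counting, and threshold check that lead from the collected combinations to Table 9 can be sketched as follows. The list `collected` is chosen to reproduce the counts in Table 8; the names `THRESHOLD`, `build_subcube`, and `new_subcubes` are hypothetical, and merging is again assumed to be summation.

```python
from collections import Counter, defaultdict

# Cube table from Table 4 (the key names are illustrative).
cube_rows = [
    {"A": "A1", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A1", "B": "B2", "C": "C2", "fact": 1},
    {"A": "A2", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A2", "B": "B2", "C": "C2", "fact": 1},
]

# Dimension combinations collected in one day, matching the counts in Table 8.
collected = [("B",), ("B",), ("A", "B"), ("C",), ("B",), ("B",), ("B", "C")]
THRESHOLD = 3  # preset threshold from the example


def build_subcube(rows, dimension_names):
    """Project the cube rows onto the given dimensions and sum equal keys."""
    merged = defaultdict(int)
    for row in rows:
        merged[tuple(row[name] for name in dimension_names)] += row["fact"]
    return dict(merged)


# Group identical combinations and count how often each was collected.
counts = Counter(collected)

# Create a new subcube table for every combination collected more than THRESHOLD times.
new_subcubes = {
    names: build_subcube(cube_rows, names)
    for names, times in counts.items()
    if times > THRESHOLD
}
print(counts[("B",)], new_subcubes)
```

Only the combination consisting of dimension B alone passes the threshold, so a single new subcube table is created and it holds the rows of Table 9.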
By said method, as the subcube tables are synthesized by the part row in cube tables, that is to say, that Line number in subcube tables is less than the line number in cube tables, and subsequently, user is first being pre-build during inquiry Subcube tables in inquired about, so can effectively improve the efficiency of inquiry, also, subcube tables simply include portion The dimension combination of fractional dimension, and go out all of dimension combination without the need for exhaustion, so effectively reduce amount of calculation.
In practice, the data stored in the cube table of the data warehouse may be updated. Because the subcube tables are built from the data in the cube table, an update to the data in the cube table necessarily changes the data in the subcube tables. Therefore, in this application, when data in the pre-established cube table is updated, the data in the pre-established subcube tables must be updated as well.
This application provides a specific way of updating the data in the pre-established subcube tables: obtain the updated data from the pre-established cube table, perform dimension reduction on the obtained data, and update the pre-established subcube tables with the dimension-reduced data.
In addition, when performing dimension reduction on the obtained data, the application may, for each piece of obtained data, determine the subcube tables that contain at least one dimension name corresponding to that data. For each subcube table so determined, the dimensions of the data are reduced to match the dimensions of that subcube table, and the subcube table is searched for data whose dimension values are identical to those of the data. If such data is found, the data is merged with it; if no such data is found, the data is added directly to the subcube table.
For example, assume the tables stored in the data warehouse include Table 4, Table 6 and Table 9, and that user A adds one row of data to Table 4, as shown in Table 10:
Dimension A | Dimension B | Dimension C | True value
A1 | B1 | C1 | 1
A1 | B2 | C2 | 1
A2 | B1 | C1 | 1
A2 | B2 | C2 | 1
A1 | B3 | C1 | 1
Table 10
The data warehouse then obtains the updated data from Table 10 and, for the obtained data, determines the subcube tables that contain at least one dimension name corresponding to the data, namely subcube Table 6 and subcube Table 9.
For subcube Table 6, the dimensions of the data are reduced to match those of subcube Table 6; that is, after dimension reduction the data contains only dimension A. Data with an identical dimension value is found in subcube Table 6, so the data is merged, as shown in Table 11:
Dimension A | True value
A1 | 3
A2 | 2
Table 11
For subcube Table 9, the dimensions of the data are reduced to match those of subcube Table 9; that is, after dimension reduction the data contains only dimension B. No data with an identical dimension value is found in subcube Table 9, so the data is added directly to subcube Table 9, as shown in Table 12:
Dimension B | True value
B1 | 2
B2 | 2
B3 | 1
Table 12
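The update in this example can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the patent's implementation: `apply_update` and the dictionary layout are hypothetical. Subcube Table 6 is not reproduced in this section and is assumed to hold A1 = 2 and A2 = 2 before the update. The sketch also assumes every subcube's dimensions are a subset of the cube's dimensions.

```python
# Subcube tables before the update, keyed by their dimension names.
# Table 6 (dimension A) is an assumed starting state; Table 9 is dimension B.
subcubes = {
    ("A",): {("A1",): 2, ("A2",): 2},
    ("B",): {("B1",): 2, ("B2",): 2},
}

# The row newly added to the cube table (last row of Table 10).
new_row = {"A": "A1", "B": "B3", "C": "C1", "value": 1}


def apply_update(subcubes, row):
    """Fold one updated cube row into every subcube sharing a dimension name with it."""
    row_dims = set(row) - {"value"}
    for dims, table in subcubes.items():
        # Only subcubes containing at least one of the row's dimension names qualify.
        if not set(dims) & row_dims:
            continue
        # Dimension reduction: keep only the dimensions this subcube uses
        # (assumes all of them are present in the row).
        key = tuple(row[d] for d in dims)
        if key in table:
            table[key] += row["value"]   # identical dimension values: merge
        else:
            table[key] = row["value"]    # no match: add directly


apply_update(subcubes, new_row)
print(subcubes[("A",)])  # {('A1',): 3, ('A2',): 2}
print(subcubes[("B",)])  # {('B1',): 2, ('B2',): 2, ('B3',): 1}
```

The two printed tables correspond to Table 11 (merge into an existing row) and Table 12 (a new row is added).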
The above is the query method for multidimensional analysis of mass data provided by the embodiments of this application. Based on the same idea, the embodiments of this application also provide a query device for multidimensional analysis of mass data.
As shown in Fig. 2, the query device for multidimensional analysis of mass data provided by the embodiments of this application includes:
Receiving module 201, configured to receive a query request sent by a user and carrying dimension information to be queried, wherein the dimension information includes dimension names and dimension values;
Query module 202, configured to query, according to the dimension information, a pre-established subcube table for the data corresponding to the dimension information;
Data return module 203, configured to return the data to the user when the data corresponding to the dimension information is found; and, when the data corresponding to the dimension information is not found, to query the pre-established cube table for the data corresponding to the dimension information, return the data to the user, and collect the dimension names contained in the dimension information as a dimension combination, wherein the subcube table is synthesized from some of the rows in the cube table.
The device further includes:
Pre-establishing module 204, configured to obtain in advance a user's query information carrying the dimension names to be queried, query the data corresponding to the dimension names according to the dimension names, and establish the subcube table from the dimension names and the data corresponding to the dimension names.
The device further includes:
First update module 205, configured to, when data in the pre-established cube table is updated, obtain the updated data from the pre-established cube table, perform dimension reduction on the obtained data, and update the pre-established subcube tables with the dimension-reduced data.
The first update module 205 is specifically configured to: for each piece of obtained data, determine the subcube tables that contain at least one dimension name corresponding to that data; for each subcube table so determined, reduce the dimensions of the data to match the dimensions of that subcube table, and search the subcube table for data whose dimension values are identical to those of the data; if such data is found, merge the data with it; if no such data is found, add the data directly to the subcube table.
The device further includes:
Second update module 206, configured to, within a specific time period, place identical collected dimension combinations in the same group, count the number of collected dimension combinations in each group, and, when the count for a dimension combination exceeds a preset threshold, create a new subcube table, query the pre-established cube table for the data corresponding to the dimension names in that combination, merge the data whose values for those dimension names are identical, and add the merged data to the new subcube table.
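How the receiving, query and data return modules fit together can be sketched as a single Python class. This is only an illustrative sketch: the class `QueryDevice`, its attribute names and the in-memory table layout are hypothetical and do not appear in the source, and the cube rows are assumed to be the first four rows of Table 10.

```python
from collections import Counter

# Assumed cube rows (first four rows of Table 10).
CUBE = [
    {"A": "A1", "B": "B1", "C": "C1", "value": 1},
    {"A": "A1", "B": "B2", "C": "C2", "value": 1},
    {"A": "A2", "B": "B1", "C": "C1", "value": 1},
    {"A": "A2", "B": "B2", "C": "C2", "value": 1},
]


class QueryDevice:
    """Answers queries from subcube tables first and falls back to the cube table."""

    def __init__(self, cube, subcubes):
        self.cube = cube              # list of row dicts with a "value" field
        self.subcubes = subcubes      # {dimension names: {dimension values: value}}
        self.collected = Counter()    # dimension combinations that missed the subcubes

    def query(self, dimension_info):
        """`dimension_info` maps each dimension name to the dimension value queried."""
        dims = tuple(sorted(dimension_info))
        key = tuple(dimension_info[d] for d in dims)

        # Query module: look in the pre-established subcube table first.
        table = self.subcubes.get(dims)
        if table is not None and key in table:
            return table[key]

        # Data return module: fall back to the cube table and collect the
        # dimension names as a dimension combination for later statistics.
        self.collected[dims] += 1
        return sum(
            row["value"]
            for row in self.cube
            if all(row[d] == v for d, v in dimension_info.items())
        )


device = QueryDevice(CUBE, {("B",): {("B1",): 2, ("B2",): 2}})
hit = device.query({"B": "B1"})               # served by the subcube table
miss = device.query({"A": "A1", "C": "C1"})   # served by the cube table
print(hit, miss, dict(device.collected))      # 2 1 {('A', 'C'): 1}
```

The `collected` counter plays the role of the input to the second update module: once a combination's count exceeds the preset threshold, a subcube for it would be created.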
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include volatile memory in a computer-readable medium, in forms such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, product, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, product, or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, product, or device that includes the element.
Those skilled in the art will understand that embodiments of this application may be provided as a method, a system, or a computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The foregoing describes only embodiments of this application and is not intended to limit this application. Those skilled in the art may make various modifications and variations to this application. Any modification, equivalent replacement, improvement, or the like made within the spirit and principle of this application shall fall within the scope of the claims of this application.

Claims (10)

1. A query method for multidimensional analysis of mass data, characterized by comprising:
receiving a query request sent by a user and carrying dimension information to be queried, wherein the dimension information comprises dimension names and dimension values;
querying, according to the dimension information, a pre-established subcube table for the data corresponding to the dimension information;
when the data corresponding to the dimension information is found, returning the data to the user; when the data corresponding to the dimension information is not found, querying a pre-established cube table for the data corresponding to the dimension information, returning the data to the user, and collecting the dimension names contained in the dimension information as a dimension combination, wherein the subcube table is synthesized from some of the rows in the cube table.
2. The method of claim 1, characterized in that pre-establishing the subcube table specifically comprises:
obtaining in advance a user's query information carrying the dimension names to be queried;
querying, according to the dimension names, the data corresponding to the dimension names;
establishing the subcube table according to the dimension names and the data corresponding to the dimension names.
3. The method of claim 2, characterized in that, when data in the pre-established cube table is updated, the method further comprises:
obtaining the updated data from the pre-established cube table;
performing dimension reduction on the obtained data;
updating the pre-established subcube table with the dimension-reduced data.
4. The method of claim 3, characterized in that performing dimension reduction on the obtained data specifically comprises:
for each piece of obtained data, determining the subcube tables that contain at least one dimension name corresponding to that data;
for each subcube table so determined, reducing the dimensions of the data to match the dimensions of that subcube table, and searching the subcube table for data whose dimension values are identical to those of the data; if such data is found, merging the data with it; if no such data is found, adding the data directly to the subcube table.
5. The method of claim 1, characterized in that the method further comprises:
within a specific time period, placing identical collected dimension combinations in the same group;
counting the number of collected dimension combinations in each group;
when the count for a dimension combination exceeds a preset threshold, creating a new subcube table, querying the pre-established cube table for the data corresponding to the dimension names in that combination, merging the data whose values for those dimension names are identical, and adding the merged data to the new subcube table.
6. A query device for multidimensional analysis of mass data, characterized by comprising:
a receiving module, configured to receive a query request sent by a user and carrying dimension information to be queried, wherein the dimension information comprises dimension names and dimension values;
a query module, configured to query, according to the dimension information, a pre-established subcube table for the data corresponding to the dimension information;
a data return module, configured to return the data to the user when the data corresponding to the dimension information is found; and, when the data corresponding to the dimension information is not found, to query a pre-established cube table for the data corresponding to the dimension information, return the data to the user, and collect the dimension names contained in the dimension information as a dimension combination, wherein the subcube table is synthesized from some of the rows in the cube table.
7. The device of claim 6, characterized in that the device further comprises:
a pre-establishing module, configured to obtain in advance a user's query information carrying the dimension names to be queried, query the data corresponding to the dimension names according to the dimension names, and establish the subcube table according to the dimension names and the data corresponding to the dimension names.
8. The device of claim 7, characterized in that the device further comprises:
a first update module, configured to, when data in the pre-established cube table is updated, obtain the updated data from the pre-established cube table, perform dimension reduction on the obtained data, and update the pre-established subcube table with the dimension-reduced data.
9. The device of claim 8, characterized in that the first update module is specifically configured to: for each piece of obtained data, determine the subcube tables that contain at least one dimension name corresponding to that data; for each subcube table so determined, reduce the dimensions of the data to match the dimensions of that subcube table, and search the subcube table for data whose dimension values are identical to those of the data; if such data is found, merge the data with it; if no such data is found, add the data directly to the subcube table.
10. The device of claim 6, characterized in that the device further comprises:
a second update module, configured to, within a specific time period, place identical collected dimension combinations in the same group, count the number of collected dimension combinations in each group, and, when the count for a dimension combination exceeds a preset threshold, create a new subcube table, query the pre-established cube table for the data corresponding to the dimension names in that combination, merge the data whose values for those dimension names are identical, and add the merged data to the new subcube table.
CN201610985200.6A 2016-11-09 2016-11-09 query method and device based on multidimensional analysis of mass data Active CN106528787B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610985200.6A CN106528787B (en) 2016-11-09 2016-11-09 query method and device based on multidimensional analysis of mass data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610985200.6A CN106528787B (en) 2016-11-09 2016-11-09 query method and device based on multidimensional analysis of mass data

Publications (2)

Publication Number Publication Date
CN106528787A true CN106528787A (en) 2017-03-22
CN106528787B CN106528787B (en) 2019-12-17

Family

ID=58350619

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610985200.6A Active CN106528787B (en) 2016-11-09 2016-11-09 query method and device based on multidimensional analysis of mass data

Country Status (1)

Country Link
CN (1) CN106528787B (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280046A (en) * 2017-11-30 2018-07-13 深圳市科列技术股份有限公司 A kind of method, battery data server and the user terminal of battery data processing
CN108363819A (en) * 2018-03-23 2018-08-03 联想(北京)有限公司 Query engine matching method, device, server group and readable storage medium storing program for executing
CN108830015A (en) * 2018-07-03 2018-11-16 北京华大九天软件有限公司 A method of utilizing unit performance trend in graphical display analytical unit library
CN108829795A (en) * 2018-06-04 2018-11-16 北京奇艺世纪科技有限公司 Data query method and device
CN108932257A (en) * 2017-05-25 2018-12-04 北京国双科技有限公司 The querying method and device of multi-dimensional data
CN110019186A (en) * 2017-09-07 2019-07-16 北京国双科技有限公司 The method and device of data storage
WO2019161778A1 (en) * 2018-02-22 2019-08-29 Beijing Didi Infinity Technology And Development Co., Ltd. Systems and methods for data storage and querying
CN110334122A (en) * 2019-07-11 2019-10-15 江苏曲速教育科技有限公司 The query analysis method and system of educational data
CN110837511A (en) * 2019-11-15 2020-02-25 金蝶软件(中国)有限公司 Data processing method, system and related equipment
CN112000747A (en) * 2020-07-08 2020-11-27 苏宁云计算有限公司 Data multidimensional analysis method, device and system
CN112948441A (en) * 2021-03-26 2021-06-11 浪潮通用软件有限公司 Financial data-oriented multidimensional data aggregation method and equipment
CN113393190A (en) * 2021-06-10 2021-09-14 北京京东振世信息技术有限公司 Storage information processing method and device, electronic equipment and readable medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101334795A (en) * 2008-08-07 2008-12-31 金蝶软件(中国)有限公司 Data storage method and device
CN102023977A (en) * 2009-09-21 2011-04-20 陈俊 Data filtering method and data filtering system and application thereof
US20120303569A1 (en) * 2001-02-12 2012-11-29 Alexander Tuzhilin System, Process and Software Arrangement for Providing Multidimensional Recommendations/Suggestions
CN103605651A (en) * 2013-08-28 2014-02-26 杭州顺网科技股份有限公司 Data processing showing method based on on-line analytical processing (OLAP) multi-dimensional analysis
CN105224534A (en) * 2014-05-29 2016-01-06 腾讯科技(深圳)有限公司 A kind of method and device of asking response

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120303569A1 (en) * 2001-02-12 2012-11-29 Alexander Tuzhilin System, Process and Software Arrangement for Providing Multidimensional Recommendations/Suggestions
CN101334795A (en) * 2008-08-07 2008-12-31 金蝶软件(中国)有限公司 Data storage method and device
CN102023977A (en) * 2009-09-21 2011-04-20 陈俊 Data filtering method and data filtering system and application thereof
CN103605651A (en) * 2013-08-28 2014-02-26 杭州顺网科技股份有限公司 Data processing showing method based on on-line analytical processing (OLAP) multi-dimensional analysis
CN105224534A (en) * 2014-05-29 2016-01-06 腾讯科技(深圳)有限公司 A kind of method and device of asking response

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108932257A (en) * 2017-05-25 2018-12-04 北京国双科技有限公司 The querying method and device of multi-dimensional data
CN108932257B (en) * 2017-05-25 2021-10-08 北京国双科技有限公司 Multi-dimensional data query method and device
CN110019186A (en) * 2017-09-07 2019-07-16 北京国双科技有限公司 The method and device of data storage
CN108280046A (en) * 2017-11-30 2018-07-13 深圳市科列技术股份有限公司 A kind of method, battery data server and the user terminal of battery data processing
CN110209686A (en) * 2018-02-22 2019-09-06 北京嘀嘀无限科技发展有限公司 Storage, querying method and the device of data
CN111742308A (en) * 2018-02-22 2020-10-02 北京嘀嘀无限科技发展有限公司 System and method for data storage and query
WO2019161778A1 (en) * 2018-02-22 2019-08-29 Beijing Didi Infinity Technology And Development Co., Ltd. Systems and methods for data storage and querying
CN108363819A (en) * 2018-03-23 2018-08-03 联想(北京)有限公司 Query engine matching method, device, server group and readable storage medium storing program for executing
CN108363819B (en) * 2018-03-23 2021-04-13 联想(北京)有限公司 Query engine matching method, device, server group and readable storage medium
CN108829795A (en) * 2018-06-04 2018-11-16 北京奇艺世纪科技有限公司 Data query method and device
CN108830015A (en) * 2018-07-03 2018-11-16 北京华大九天软件有限公司 A method of utilizing unit performance trend in graphical display analytical unit library
CN110334122A (en) * 2019-07-11 2019-10-15 江苏曲速教育科技有限公司 The query analysis method and system of educational data
CN110837511A (en) * 2019-11-15 2020-02-25 金蝶软件(中国)有限公司 Data processing method, system and related equipment
CN110837511B (en) * 2019-11-15 2022-08-23 金蝶软件(中国)有限公司 Data processing method, system and related equipment
CN112000747A (en) * 2020-07-08 2020-11-27 苏宁云计算有限公司 Data multidimensional analysis method, device and system
WO2022007592A1 (en) * 2020-07-08 2022-01-13 苏宁易购集团股份有限公司 Multidimensional data analysis method, apparatus, and system
CN112000747B (en) * 2020-07-08 2022-11-18 苏宁云计算有限公司 Data multidimensional analysis method, device and system
CN112948441A (en) * 2021-03-26 2021-06-11 浪潮通用软件有限公司 Financial data-oriented multidimensional data aggregation method and equipment
CN112948441B (en) * 2021-03-26 2023-09-29 浪潮通用软件有限公司 Multi-dimensional data collection method and equipment for financial data
CN113393190A (en) * 2021-06-10 2021-09-14 北京京东振世信息技术有限公司 Storage information processing method and device, electronic equipment and readable medium
CN113393190B (en) * 2021-06-10 2023-12-05 北京京东振世信息技术有限公司 Warehouse information processing method and device, electronic equipment and readable medium

Also Published As

Publication number Publication date
CN106528787B (en) 2019-12-17

Similar Documents

Publication Publication Date Title
CN106528787A (en) Mass data multi-dimensional analysis-based query method and device
CN104424229B (en) A kind of calculation method and system that various dimensions are split
CN104679778B (en) A kind of generation method and device of search result
CN105488231B (en) A kind of big data processing method divided based on adaptive table dimension
CN108197296B (en) Data storage method based on Elasticissearch index
CA2893912C (en) Systems and methods for optimizing data analysis
CN102054000B (en) Data querying method, device and system
CN107329983B (en) Machine data distributed storage and reading method and system
CN103559300B (en) The querying method and inquiry unit of data
CN105528367A (en) A method for storage and near-real time query of time-sensitive data based on open source big data
CN102402617A (en) Easily compressed database index storage system using fragments and sparse bitmap, and corresponding construction, scheduling and query processing methods
CN101566986A (en) Method and device for processing data in online business processing
CN106055621A (en) Log retrieval method and device
EP3217296A1 (en) Data query method and apparatus
CN103366015A (en) OLAP (on-line analytical processing) data storage and query method based on Hadoop
CN102737123B (en) A kind of multidimensional data distribution method
CN104112011B (en) The method and device that a kind of mass data is extracted
CN110990372A (en) Dimensional data processing method and device and data query method and device
CN104794146A (en) Method and device for real-time screening and ranking of commodities
CN103036921B (en) A kind of user behavior analysis system and method
CN106503196A (en) The structure and querying method of extensible storage index structure in cloud environment
CN107515899B (en) Database joint fragmentation method and device and storage medium
CN107203532A (en) Construction method, the implementation method of search and the device of directory system
CN103200269A (en) Internet information statistical method and Internet information statistical system
Chasparis et al. Experimental evaluation of selectivity estimation on big spatial data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100080 A 5 C, block A, China International Steel Plaza, 8 Haidian Avenue, Haidian District, Beijing.

Applicant after: Youku network technology (Beijing) Co., Ltd.

Address before: 100080 A 5 C, block A, China International Steel Plaza, 8 Haidian Avenue, Haidian District, Beijing.

Applicant before: 1Verge Inc.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200710

Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Patentee after: Alibaba (China) Co.,Ltd.

Address before: 100080 Beijing Haidian District city Haidian street A Sinosteel International Plaza No. 8 block 5 layer A, C

Patentee before: Youku network technology (Beijing) Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20210303

Address after: Room 715, 7-storey, 7-storey, No. 10 Furong Street, Chaoyang District, Beijing, 100102

Patentee after: BEIJING GAODE YUNTU TECHNOLOGY Co.,Ltd.

Address before: 310052 room 508, 5th floor, building 4, No. 699 Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Patentee before: Alibaba (China) Co.,Ltd.
