CN103995879A - Data query method, device and system based on OLAP system - Google Patents

Publication number
CN103995879A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410228109.0A
Other languages
Chinese (zh)
Other versions
CN103995879B (en)
Inventor
张宇
范旭
张兵
朱银聪
王方舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd
Priority to CN201410228109.0A
Publication of CN103995879A
Application granted
Publication of CN103995879B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24532 Query optimisation of parallel queries
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention provides a data query method, device and system based on an OLAP system. The data query method includes the steps that a data query request sent by a user terminal is received, a first data table corresponding to first data table identification is partitioned logically to obtain at least two first partitions, and a corresponding connection mapping relation is built for each first partition; a second data table corresponding to second data table identification is partitioned logically to obtain at least two second partitions, a first foreign key value is obtained from each second partition, and the first foreign key values are within the boundary value range of the first partitions; the connection mapping relations are queried to obtain a first Hash subtable corresponding to the first foreign key values; the first Hash subtable is scanned to obtain corresponding data, and the data are returned to the user terminal. According to the data query method, through setting up the connection mapping relations, data query time is effectively shortened, and overhead of a server is reduced.

Description

Data query method, apparatus, and system based on an OLAP system
Technical field
Embodiments of the present invention relate to computer technology, and in particular to a data query method, apparatus, and system based on an online analytical processing (On-Line Analytical Processing, OLAP for short) system.
Background
OLAP is a typical application scenario for database systems and is mainly used to query data. Queries often involve joining multiple data tables, and the tables are associated with one another through join operations. The most common join operation is the hash join.
In the prior art, a hash-join query mainly proceeds as follows. The background server of the database system receives a data query message sent by a user. According to the first data table identifier and the second data table identifier carried in the message, the server performs a hash operation on the primary key of the first data table, which has fewer rows, and builds a shared hash table. The server then scans the second data table in parallel: each query thread performs a hash operation on its foreign keys to obtain the corresponding hash values, and the threads scan the shared hash table in parallel according to those hash values until all the data the user needs has been obtained.
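For orientation only, the prior-art flow described above might be sketched as follows. This is a minimal sketch, not the implementation of any particular database: the function name `shared_hash_join`, the `(key, payload)` row layout, and the choice to parallelise the build phase behind a single lock are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


def shared_hash_join(first_rows, second_rows, num_threads=4):
    """Hash join in which every thread works on ONE shared hash table.

    first_rows:  list of (primary_key, payload) from the smaller table
    second_rows: list of (foreign_key, payload) from the larger table
    """
    shared = {}
    lock = Lock()  # all builder threads contend for this single lock

    def build(chunk):
        for key, payload in chunk:
            with lock:  # serialises writes to the shared table
                shared[key] = payload

    def probe(chunk):
        return [(fk, shared[fk], payload) for fk, payload in chunk if fk in shared]

    def chunks(rows):
        size = max(1, -(-len(rows) // num_threads))  # ceiling division
        return [rows[i:i + size] for i in range(0, len(rows), size)]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(build, chunks(first_rows)))            # build phase
        results = list(pool.map(probe, chunks(second_rows)))  # probe phase
    return [row for part in results for row in part]
```

The single lock around the shared table is the point of contention that grows as more threads are added.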
However, multi-threaded parallel querying suffers from write conflicts. When there are many parallel threads, the data query time becomes long and the server overhead becomes large.
Summary of the invention
Embodiments of the present invention provide a data query method, apparatus, and system based on an OLAP system, to overcome the prior-art problems of long multi-threaded query time and large server overhead. The first data table and the second data table are partitioned, and connection mapping relations are built. During a multi-threaded query, each thread uses the connection mapping relations to obtain and scan a first hash subtable and thereby obtain the data, which effectively shortens the data query time and reduces server overhead.
A first aspect of the embodiments of the present invention provides a data query method based on an OLAP system, comprising:
receiving a data query request sent by a user terminal, where the data query request comprises query information, a first data table identifier, and a second data table identifier;
performing, according to the data query request, logical partitioning on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and building a corresponding connection mapping relation for each first partition, where the connection mapping relation comprises the boundary value of the first partition and the corresponding hash subtable;
partitioning the second data table corresponding to the second data table identifier to obtain at least two second partitions, and, for each second partition, obtaining a first foreign key value from the second partition, where the first foreign key value falls within the boundary value range of one of the first partitions;
querying the connection mapping relations to obtain a first hash subtable corresponding to the first foreign key value; and
scanning the first hash subtable to obtain the data corresponding to the query information, and returning the data to the user terminal.
With reference to the first aspect, in a first possible implementation of the first aspect, performing logical partitioning on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and building the connection mapping relations, comprises:
partitioning the first data table according to the inherent order of the first data table to obtain at least two first partitions;
for each first partition, performing a hash operation on the primary key of the first partition to build a corresponding hash subtable; and
for each first partition, building a corresponding connection mapping relation according to the hash subtable corresponding to the first partition.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, performing logical partitioning on the first data table corresponding to the first data table identifier to obtain at least two first partitions comprises:
if the first data table is ordered, performing logical partitioning on the first data table according to the inherent order of the first data table to obtain at least two first partitions;
or,
if the first data table is unordered, adding an ordered surrogate column to the first data table as the primary key column, and performing logical partitioning on the first data table according to the order of the ordered surrogate column to obtain at least two first partitions.
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, partitioning the second data table corresponding to the second data table identifier to obtain at least two second partitions comprises:
if the first data table is ordered and the second data table is ordered, partitioning the second data table according to the inherent order of the second data table and the parallel processing capability to obtain at least two second partitions;
or,
if the first data table is ordered and the second data table is unordered, partitioning the second data table according to the parallel processing capability to obtain at least two second partitions;
or,
if the first data table is unordered and the second data table is unordered, replacing the original foreign key values in the second data table with the new primary key values corresponding to the ordered surrogate column of the first data table, and partitioning the second data table after replacement according to the parallel processing capability to obtain at least two second partitions.
With reference to the first aspect or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, scanning the first hash subtable to obtain the data corresponding to the query information comprises:
scanning the first hash subtable, and obtaining, as the data, all data information in the associated rows of the first data table and the second data table that correspond to the query information.
A second aspect of the embodiments of the present invention provides a data query apparatus based on an OLAP system, comprising:
a transceiver module, configured to receive a data query request sent by a user terminal, where the data query request comprises query information, a first data table identifier, and a second data table identifier;
a processing module, configured to perform, according to the data query request, logical partitioning on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and to build a corresponding connection mapping relation for each first partition, where the connection mapping relation comprises the boundary value of the first partition and the corresponding hash subtable;
the processing module being further configured to partition the second data table corresponding to the second data table identifier to obtain at least two second partitions, and, for each second partition, to obtain a first foreign key value from the second partition, where the first foreign key value falls within the boundary value range of one of the first partitions; and
an acquisition module, configured to query the connection mapping relations to obtain a first hash subtable corresponding to the first foreign key value;
the acquisition module being further configured to scan the first hash subtable to obtain the data corresponding to the query information, and to return the data to the user terminal through the transceiver module.
With reference to the second aspect, in a first possible implementation of the second aspect, the processing module is specifically configured to:
partition the first data table according to the inherent order of the first data table to obtain at least two first partitions;
for each first partition, perform a hash operation on the primary key of the first partition to build a corresponding hash subtable; and
for each first partition, build a corresponding connection mapping relation according to the hash subtable corresponding to the first partition.
With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the processing module is configured to:
if the first data table is ordered, perform logical partitioning on the first data table according to the inherent order of the first data table to obtain at least two first partitions;
or,
if the first data table is unordered, add an ordered surrogate column to the first data table as the primary key column, and perform logical partitioning on the first data table according to the order of the ordered surrogate column to obtain at least two first partitions.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the processing module is further configured to:
if the first data table is ordered and the second data table is ordered, partition the second data table according to the inherent order of the second data table and the parallel processing capability to obtain at least two second partitions;
or,
if the first data table is ordered and the second data table is unordered, partition the second data table according to the parallel processing capability to obtain at least two second partitions;
or,
if the first data table is unordered and the second data table is unordered, replace the original foreign key values in the second data table with the new primary key values corresponding to the ordered surrogate column of the first data table, and partition the second data table after replacement according to the parallel processing capability to obtain at least two second partitions.
With reference to the second aspect or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the acquisition module is specifically configured to:
scan the first hash subtable, and obtain, as the data, all data information in the associated rows of the first data table and the second data table that correspond to the query information.
A third aspect of the embodiments of the present invention provides a data query system based on an OLAP system, comprising a user terminal and the data query apparatus based on an OLAP system provided in the second aspect.
In the data query method, apparatus, and system based on an OLAP system according to the embodiments of the present invention, a query request is received from a user terminal, the first data table is logically partitioned to obtain first partitions, and connection mapping relations are built that represent the correspondence between the boundary value of each first partition and its hash subtable. The second data table is then partitioned to obtain second partitions. During a multi-threaded query, each thread uses the connection mapping relations to obtain and scan a first hash subtable and thereby obtain the data, and the data is returned to the user terminal. This solves the prior-art problems of long multi-threaded query time and large server overhead, effectively shortens the data query time, and reduces server overhead.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the accompanying drawings in the following description show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these accompanying drawings without creative effort.
Fig. 1 is a flowchart of Embodiment 1 of the data query method based on an OLAP system according to the present invention;
Fig. 2 is a flowchart of Embodiment 2 of the data query method based on an OLAP system according to the present invention;
Fig. 3 is a schematic structural diagram of an embodiment of the data query apparatus based on an OLAP system according to the present invention;
Fig. 4 is a schematic structural diagram of an embodiment of the data query system based on an OLAP system according to the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 1 is a flowchart of Embodiment 1 of the data query method based on an OLAP system according to the present invention. The solution is mainly applied in OLAP-type database systems, in the process of obtaining data through a join query over multiple data tables. As shown in Fig. 1, the method of this embodiment may comprise:
S101: Receive a data query request sent by a user terminal, where the data query request comprises query information, a first data table identifier, and a second data table identifier.
In this embodiment, when the user terminal needs to obtain data, it sends a data query request to the database system. The first data table and the second data table are two associated data tables that store associated data information.
S102: According to the data query request, perform logical partitioning on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and build a corresponding connection mapping relation for each first partition, where the connection mapping relation comprises the boundary value of the first partition and the corresponding hash subtable.
In this embodiment, the first data table is divided into at least two first partitions. A hash subtable is obtained for each first partition, and connection mapping relations are built to represent the correspondence between the boundary value of each first partition and the hash subtable of that first partition.
The connection mapping relations may be represented in the form of a table. Optionally, they may also be represented in other ways, such as a map, an array, or a set, chosen according to the actual application environment. The present invention imposes no limitation on this.
S103: Partition the second data table corresponding to the second data table identifier to obtain at least two second partitions, and, for each second partition, obtain a first foreign key value from the second partition, where the first foreign key value falls within the boundary value range of one of the first partitions.
In this embodiment, the second data table is obtained according to the second data table identifier and is partitioned to obtain at least two second partitions. The number of second partitions may be the same as or different from the number of first partitions. A foreign key value is selected from each second partition and compared one by one with the boundary values of the first partitions. If the foreign key value falls within the boundary value range of a first partition, it is taken as the first foreign key value of that second partition.
In this embodiment, the boundary value range of a first partition consists of the values between the boundary value of the current first partition and the boundary value of the previous first partition.
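As an illustration of the comparison described above, the boundary values can be kept in a sorted list and searched with a binary search instead of being compared one by one. This is only a sketch under stated assumptions: the function name `find_partition` is hypothetical, and each boundary value is assumed to be the inclusive upper bound of its partition (the text does not say whether the bound is inclusive).

```python
from bisect import bisect_left


def find_partition(boundaries, foreign_key):
    """Return the index of the first partition whose range contains foreign_key.

    boundaries: sorted list with one boundary value per first partition;
                partition i is assumed to cover (boundaries[i-1], boundaries[i]].
    Returns None when foreign_key lies beyond the last boundary.
    """
    i = bisect_left(boundaries, foreign_key)
    return i if i < len(boundaries) else None
```

With boundaries `[10, 20, 30]`, a foreign key value of 11 falls in the second partition (index 1), because it lies between the previous boundary 10 and the current boundary 20.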
In a relational database, each data table has a number of attributes. If a group of attributes uniquely identifies each row of the data table, that group of attributes is a primary key of the table. For example, a student table comprises student number, name, sex, and class; each student's student number is unique, so the student number is the primary key. A foreign key is mainly used to associate a table with another data table. For example, a score table comprises student number, course number, and score. A score can be determined only by the student number and course number together, so the primary key of the score table is the combination of student number and course number. The student number in the score table corresponds to the student number in the student table, so the student number in the score table is a foreign key referencing the student table.
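The student and score tables described above can be written down concretely. The sketch below uses Python's standard `sqlite3` module purely for illustration; the table names, column names, and sample rows are made up and are not taken from the patent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE student (
    student_no TEXT PRIMARY KEY,          -- primary key: unique per student
    name       TEXT,
    sex        TEXT,
    class_name TEXT
);
CREATE TABLE score (
    student_no TEXT REFERENCES student(student_no),  -- foreign key to student
    course_no  TEXT,
    score      INTEGER,
    PRIMARY KEY (student_no, course_no)   -- composite primary key
);
INSERT INTO student VALUES ('S001', 'Student A', 'F', 'Class 1');
INSERT INTO score VALUES ('S001', 'C01', 90), ('S001', 'C02', 85);
""")

# Join the two tables through the foreign key.
rows = conn.execute("""
    SELECT s.name, c.course_no, c.score
    FROM score AS c JOIN student AS s ON s.student_no = c.student_no
    ORDER BY c.course_no
""").fetchall()
```

Inserting a score row whose student number does not exist in the student table is rejected, which is the association the foreign key expresses.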
S104: Query the connection mapping relations to obtain the first hash subtable corresponding to the first foreign key value.
In this embodiment, the first foreign key value is used to query the connection mapping relations and obtain the first hash subtable. The first hash subtable is the one, among the multiple hash subtables, that corresponds to the boundary value of the first partition matched by the first foreign key value.
S105: Scan the first hash subtable to obtain the data corresponding to the query information, and return the data to the user terminal.
In the data query method based on an OLAP system according to this embodiment of the present invention, a query request is received from a user terminal, the first data table is logically partitioned to obtain first partitions, and connection mapping relations are built that represent the correspondence between the boundary value of each first partition and its hash subtable. The second data table is then partitioned to obtain second partitions. During a multi-threaded query, each thread uses the connection mapping relations to obtain and scan a first hash subtable and thereby obtain the data, and the data is returned to the user terminal. This solves the prior-art problems of long multi-threaded query time and large server overhead, effectively shortens the data query time, and reduces server overhead.
Fig. 2 is a flowchart of Embodiment 2 of the data query method based on an OLAP system according to the present invention. As shown in Fig. 2, on the basis of the foregoing embodiment, a specific implementation of S102 comprises the following steps:
S201: Partition the first data table according to the inherent order of the first data table to obtain at least two first partitions.
In this embodiment, in an OLAP-type database system, the primary key of the first data table usually has a partial order, that is, an inherent order. The first data table can therefore be partitioned according to this inherent order into at least two mutually disjoint first partitions.
S202: For each first partition, perform a hash operation on the primary key of the first partition to build a corresponding hash subtable.
In this embodiment, for the first partitions obtained above, a hash operation is performed on the primary key values of each first partition to build a shared hash subtable. The hash subtables corresponding to the first partitions are independent of one another.
S203: For each first partition, build a corresponding connection mapping relation according to the hash subtable corresponding to the first partition.
In this embodiment, the boundary value of each first partition is obtained. Each first partition has exactly one boundary value, which identifies that first partition. The correspondence between this boundary value and the hash subtable of the first partition is obtained, and the connection mapping relation is built from it.
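Steps S201 to S203 can be sketched as follows. This is a minimal sketch rather than the patented implementation: the function name `build_connection_mapping` is hypothetical, rows are assumed to be `(primary_key, payload)` pairs already in their inherent order, equal-sized partitions are an arbitrary choice, and the single boundary value of each partition is assumed to be its largest primary key.

```python
def build_connection_mapping(first_rows, num_partitions):
    """Partition an ordered table and build one hash subtable per partition.

    first_rows: list of (primary_key, payload), sorted by primary_key.
    Returns a list of (boundary_value, hash_subtable) pairs, one per partition.
    """
    size = max(1, -(-len(first_rows) // num_partitions))  # ceiling division
    mapping = []
    for start in range(0, len(first_rows), size):
        partition = first_rows[start:start + size]   # S201: disjoint partition
        subtable = dict(partition)                   # S202: dict hashes the primary key
        boundary = partition[-1][0]                  # assumed: largest key in the partition
        mapping.append((boundary, subtable))         # S203: boundary -> hash subtable
    return mapping
```

For ten rows with keys 1 to 10 and three partitions, the boundary values are 4, 8, and 10, and each hash subtable holds only the rows of its own partition.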
In the data query method based on an OLAP system according to this embodiment of the present invention, a query request is received from a user terminal, the first data table is logically partitioned to obtain first partitions, the hash subtable and the boundary value of each first partition are obtained, and connection mapping relations are built that represent the correspondence between the boundary value of each first partition and its hash subtable. The second data table is then partitioned to obtain second partitions. During a multi-threaded query, each thread uses the connection mapping relations to obtain and scan a first hash subtable and thereby obtain the data, and the data is returned to the user terminal. This solves the prior-art problems of long multi-threaded query time and large server overhead, effectively shortens the data query time, and reduces server overhead.
On the basis of the foregoing embodiments, in S102, performing logical partitioning on the first data table corresponding to the first data table identifier to obtain at least two first partitions has the following two implementations:
First implementation: if the first data table is ordered, perform logical partitioning on the first data table according to the inherent order of the first data table to obtain at least two first partitions.
In this embodiment, if the primary key of the first data table has a fixed order (a partial order), the first data table is logically partitioned according to that fixed order.
Second implementation: if the first data table is unordered, add an ordered surrogate column to the first data table as the primary key column, and perform logical partitioning on the first data table according to the order of the ordered surrogate column to obtain at least two first partitions.
In this embodiment, if the first data table has no fixed order (partial order), an ordered surrogate column is added to the first data table. For example, a sequence number that increases by 1 with each row of the first data table is added as the ordered surrogate column, and the first data table is then logically partitioned according to this newly added ordered surrogate column.
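The second implementation can be illustrated with a short sketch. The function name `add_surrogate_key` and the `(old_key, payload)` row layout are assumptions made for illustration; the text only specifies a sequence number that increases by 1 per row.

```python
def add_surrogate_key(first_rows):
    """Add an ordered surrogate column (1, 2, 3, ...) to an unordered table.

    first_rows: list of (old_key, payload) in arbitrary order.
    Returns (keyed_rows, old_to_new), where keyed_rows holds
    (surrogate_key, old_key, payload) and old_to_new maps each
    original key to its surrogate key.
    """
    keyed_rows = []
    old_to_new = {}
    for surrogate, (old_key, payload) in enumerate(first_rows, start=1):
        keyed_rows.append((surrogate, old_key, payload))
        old_to_new[old_key] = surrogate
    return keyed_rows, old_to_new
```

Because the surrogate keys follow the row order, the table can afterwards be partitioned by contiguous surrogate-key ranges.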
Further, in S103, partitioning the second data table corresponding to the second data table identifier to obtain at least two second partitions has the following three implementations:
First implementation: if the first data table is ordered and the second data table is ordered, partition the second data table according to the inherent order of the second data table and the parallel processing capability to obtain at least two second partitions.
Second implementation: if the first data table is ordered and the second data table is unordered, partition the second data table according to the parallel processing capability to obtain at least two second partitions.
Third implementation: if the first data table is unordered and the second data table is unordered, replace the original foreign key values in the second data table with the new primary key values corresponding to the ordered surrogate column of the first data table, and partition the second data table after replacement according to the parallel processing capability to obtain at least two second partitions.
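The third implementation might look like the sketch below. The helper names `remap_foreign_keys` and `split_for_threads` are hypothetical, "parallel processing capability" is reduced here to a simple thread count, and the `old_to_new` mapping is assumed to have been produced when the surrogate column was added to the first data table.

```python
def remap_foreign_keys(second_rows, old_to_new):
    """Replace each original foreign key with the first table's surrogate key.

    second_rows: list of (foreign_key, payload)
    old_to_new:  dict mapping original key -> surrogate key (assumed given)
    """
    return [(old_to_new[fk], payload) for fk, payload in second_rows]


def split_for_threads(rows, num_threads):
    """Split rows into at most num_threads contiguous logical partitions."""
    size = max(1, -(-len(rows) // num_threads))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]
```

The same `split_for_threads` helper covers the second implementation, where the second data table is unordered and is divided purely by thread count.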
In particular, the first data table and the second data table may have the same inherent order (an identical partial order), which is very common in practice. For example, lineitem and order are two frequently joined data tables in "TPC Benchmark™ H": the primary key order of the order table and the record order of the lineitem table follow the same inherent order. The order table and the lineitem table can therefore be partitioned using the same boundary values, which guarantees that the logical partitions of the two data tables correspond one to one. In this case the hash join does not need the connection mapping relations, and each partition accesses only its own unique hash subtable.
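When both tables share the same order, the one-to-one pairing of partitions can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, rows are `(key, payload)` pairs, boundary values are assumed to be inclusive upper bounds, and the last boundary is assumed to be at least as large as every key.

```python
from bisect import bisect_left


def split_by_boundaries(rows, boundaries):
    """Place each (key, payload) row in the partition whose range contains key."""
    parts = [[] for _ in boundaries]
    for key, payload in rows:
        parts[bisect_left(boundaries, key)].append((key, payload))
    return parts


def copartitioned_join(orders, lineitems, boundaries):
    """Join partition i of one table only against partition i of the other."""
    joined = []
    order_parts = split_by_boundaries(orders, boundaries)
    item_parts = split_by_boundaries(lineitems, boundaries)
    for order_part, item_part in zip(order_parts, item_parts):
        subtable = dict(order_part)  # the only hash subtable this pair touches
        joined.extend((k, subtable[k], item) for k, item in item_part if k in subtable)
    return joined
```

Each loop iteration is independent of the others, so in a threaded setting the pairs could be handed to separate threads without any lookup in the connection mapping relations.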
Specifically, in S105, scanning the first hash sub-table to obtain the data corresponding to the query information includes: scanning the first hash sub-table, and obtaining, as the data, all data information in the associated rows of the first data table and the second data table that correspond to the query information.
The technical solution of the above embodiment is described in detail below by way of an example. Specifically, the first data table stores the names of 50 persons and their corresponding sex and age. The attributes of the first data table are name, sex and age, and the primary key is name. The inherent order of the first data table is the order of the initial letters of the names, that is, the rows are arranged alphabetically according to the 26 letters. The second data table, which is associated with the first data table, stores the names, dates of birth and phone numbers of 100 persons, and among the attributes of the second data table, name is a foreign key referencing the first data table.
The main process of the data query method based on the OLAP system is as follows. First, partition processing is performed on the first data table according to its inherent order (the order of name initials), dividing it into five first partitions; the boundary value of each first partition is the name of the person at the boundary of that partition. A hash calculation is performed on each first partition to obtain the hash sub-table corresponding to each first partition. A join mapping relation table is then established, which stores the boundary value of each first partition (a name; for example, one of the boundary values is Zhang San) and the identifier of the corresponding hash sub-table; the join mapping relation table thus identifies the mapping between the boundary value of each first partition and the corresponding hash sub-table.
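The first step of this example can be sketched in Python as below. The sketch is an assumption for illustration only: the function name `build_join_mapping`, the generated names and the use of a Python dict as the hash sub-table are not from the patent. It splits 50 name-ordered rows into five first partitions, hashes each partition on the primary key, and records the upper boundary value of each partition together with the identifier of its hash sub-table.

```python
def build_join_mapping(sorted_rows, num_partitions):
    """Partition rows already ordered on the primary key (first field).

    Returns the join mapping as (boundary value, sub-table identifier) pairs
    and a dict from identifier to hash sub-table.
    """
    size = max(1, -(-len(sorted_rows) // num_partitions))  # ceiling division
    join_mapping = []
    sub_tables = {}
    for table_id, start in enumerate(range(0, len(sorted_rows), size)):
        partition = sorted_rows[start:start + size]
        sub_tables[table_id] = {row[0]: row for row in partition}
        join_mapping.append((partition[-1][0], table_id))  # last key is the boundary
    return join_mapping, sub_tables


# Hypothetical first data table: (name, sex, age), ordered by name.
first_table = [(f"person{i:02d}", "F" if i % 2 else "M", 20 + i) for i in range(50)]

join_mapping, sub_tables = build_join_mapping(first_table, 5)
```

With this data the five boundary values are `person09`, `person19`, `person29`, `person39` and `person49`, and each hash sub-table holds ten rows.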
Second, in the second data table the foreign key column referencing the first data table is generally unordered. The second data table is therefore not physically partitioned; it is only divided at the logical level. The second data table is divided into three second partitions (the number of threads required for processing is three; alternatively, the table may be divided into as many partitions as the maximum number of threads the system can handle; this can be chosen according to the actual situation and is not limited in this application). All foreign key values referencing the first data table are queried from each second partition, and a name that falls within the boundary value range of some first partition is found as the first foreign key value, for example the name Zhang San. The join mapping relation table is queried according to this first foreign key value to obtain the corresponding hash sub-table. Each thread then independently scans its corresponding hash sub-table in the manner of the prior art, and all required data is obtained from the first data table and the second data table. During this scan, the threads do not interact with one another.
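The probe step described above can be sketched in Python as follows. All names and data here are hypothetical: `boundaries` and `sub_tables` stand in for a join mapping relation table built beforehand, and a binary search with `bisect_left` is one possible way to find the first partition whose boundary range contains a foreign key value. The second data table is only sliced logically, and each slice is probed by its own thread.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor

# Hypothetical join mapping: sorted upper boundary values and, at the same
# index, the hash sub-table of that first partition (name -> (name, sex, age)).
boundaries = ["Bob", "Ken", "Zhang San"]
sub_tables = [
    {"Alice": ("Alice", "F", 30), "Bob": ("Bob", "M", 41)},
    {"Grace": ("Grace", "F", 35), "Ken": ("Ken", "M", 52)},
    {"Zhang San": ("Zhang San", "M", 28)},
]


def find_sub_table(foreign_key):
    """Return the hash sub-table whose boundary range covers foreign_key, if any."""
    idx = bisect_left(boundaries, foreign_key)
    return sub_tables[idx] if idx < len(sub_tables) else None


def probe_slice(rows):
    """Join one logical slice of the second table; reads shared data only."""
    out = []
    for row in rows:
        sub_table = find_sub_table(row[0])
        match = sub_table.get(row[0]) if sub_table is not None else None
        if match is not None:
            out.append(match + row[1:])
    return out


def parallel_probe(rows, num_threads):
    size = max(1, -(-len(rows) // num_threads))  # ceiling division
    slices = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        parts = list(pool.map(probe_slice, slices))
    return [joined for part in parts for joined in part]


# Hypothetical second data table: (name, date of birth, phone number).
second_table = [
    ("Zhang San", "1990-01-01", "555-0100"),
    ("Alice", "1988-03-12", "555-0101"),
    ("Ken", "1966-07-30", "555-0102"),
    ("Bob", "1977-11-05", "555-0103"),
    ("Nobody", "2000-01-01", "555-0104"),  # name absent from the first table
]

result = parallel_probe(second_table, 3)
```

The threads share only the read-only `boundaries` and `sub_tables` structures, which mirrors the statement that the threads do not interact during the scan.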
Finally, the obtained data is returned to the user terminal, completing the whole data query process.
In this embodiment, the first data table (primary key table) is logically partitioned horizontally into N first partitions (N depends on the parallelism the platform can provide). Because of the inherent order on the primary key, the first partitions are mutually disjoint, and boundary values are recorded to distinguish them. After the first data table (primary key table) is partitioned, each first partition is handed to a different thread for parallel scanning; during the scan, a hash function is used to build the hash sub-table of each first partition, and the join mapping relation structure records the mapping between primary key boundary values and hash sub-tables. In this process each thread completes its own hash calculation independently, and the threads do not conflict.
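The conflict-free parallel build can be sketched in Python as follows. The names `build_sub_table` and `parallel_build` and the sample rows are assumptions, and a dict again stands in for a hash sub-table. Each thread receives one disjoint first partition and returns its own boundary value and sub-table, so no thread writes to a structure owned by another.

```python
from concurrent.futures import ThreadPoolExecutor


def build_sub_table(partition):
    """Hash one partition on its primary key (first field); no shared state."""
    return partition[-1][0], {row[0]: row for row in partition}


def parallel_build(sorted_rows, num_threads):
    """Build one hash sub-table per first partition, one partition per thread."""
    size = max(1, -(-len(sorted_rows) // num_threads))  # ceiling division
    partitions = [sorted_rows[i:i + size] for i in range(0, len(sorted_rows), size)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        built = list(pool.map(build_sub_table, partitions))
    boundaries = [boundary for boundary, _ in built]
    sub_tables = [table for _, table in built]
    return boundaries, sub_tables


# Hypothetical primary key table, already ordered on the key.
rows = [(k, f"value{k}") for k in range(1, 13)]
boundaries, sub_tables = parallel_build(rows, 4)
```

In CPython, threads do not speed up CPU-bound hashing because of the global interpreter lock; the sketch shows only the structure of the scheme, in which the join mapping is assembled once from per-thread results.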
In the join, the second data table (foreign key table) is unordered on the foreign key column. The second data table (foreign key table) is therefore not physically partitioned; it is only sliced logically and horizontally according to the degree of parallelism and scanned by multiple threads in parallel. For the first foreign key value read by each thread, the thread first queries the join mapping relation and determines from it the hash sub-table corresponding to that first foreign key value. The multi-threaded queries on the join mapping relation are read-only operations and do not conflict with other operations.
In the data query method based on the OLAP system according to this embodiment of the present invention, a query request from the user terminal is received; the first data table is partitioned to obtain first partitions; the hash sub-table corresponding to each first partition and the boundary value of each first partition are obtained; and a join mapping relation is established, which represents the correspondence between the boundary values of the first partitions and the hash sub-tables. The second data table is then partitioned to obtain second partitions. During the multi-threaded query, each thread uses the join mapping relation to obtain and scan the first hash sub-table, independently scanning the hash sub-table corresponding to each first foreign key value, and thereby obtains the data, which is then returned to the user terminal. This solves the prior-art problems of long multi-threaded query time and high server overhead, effectively shortening the data query time and reducing server overhead.
Fig. 3 is a schematic structural diagram of an embodiment of the data query device based on the OLAP system according to the present invention. As shown in Fig. 3, the device of this embodiment may include a transceiver module 31, a processing module 32 and an acquisition module 33. The transceiver module 31 is configured to receive a data query request sent by a user terminal, the data query request including query information, a first data table identifier and a second data table identifier. The processing module 32 is configured to perform, according to the data query request, partition processing on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and to establish, for each first partition, a corresponding join mapping relation, the join mapping relation including the boundary value of the first partition and the corresponding hash sub-table. The processing module 32 is further configured to perform partition processing on the second data table corresponding to the second data table identifier to obtain at least two second partitions, and, for each second partition, to obtain a first foreign key value from the second partition, the first foreign key value being the same as the boundary value of one of the first partitions. The acquisition module 33 is configured to query the join mapping relation to obtain the first hash sub-table corresponding to the first foreign key value. The acquisition module 33 is further configured to scan the first hash sub-table, obtain the data corresponding to the query information, and return the data to the user terminal through the transceiver module 31.
The data query device based on the OLAP system provided in this embodiment may be used to execute the technical solution of the method embodiment shown in Fig. 1. The transceiver module receives the query request from the user terminal; the processing module logically partitions the first data table to obtain first partitions, obtains the hash sub-table corresponding to each first partition and the boundary value of each first partition, and establishes a join mapping relation, which represents the correspondence between the boundary values of the first partitions and the hash sub-tables; the processing module then partitions the second data table to obtain second partitions. During the multi-threaded query, each thread uses the join mapping relation, and the acquisition module obtains and scans the first hash sub-table to obtain the data, which is then returned to the user terminal. This solves the prior-art problems of long multi-threaded query time and high server overhead, effectively shortening the data query time and reducing server overhead.
In embodiment two of the data query device based on the OLAP system according to the present invention, on the basis of the above embodiment, the processing module 32 is configured to:
if the first data table is ordered, perform logical partition processing on the first data table according to the inherent order of the first data table, to obtain at least two first partitions;
or,
if the first data table is unordered, add an ordered surrogate column to the first data table as the primary key column, and perform logical partition processing on the first data table according to the order of the ordered surrogate column, to obtain at least two first partitions.
Optionally, the processing module 32 is further configured to:
if the first data table is ordered and the second data table is ordered, perform partition processing on the second data table according to the inherent order of the second data table and the parallel processing capability, to obtain at least two second partitions;
or,
if the first data table is ordered and the second data table is unordered, perform partition processing on the second data table according to the parallel processing capability, to obtain at least two second partitions;
or,
if the first data table is unordered and the second data table is unordered, replace the original foreign key values in the second data table with the new primary key values corresponding to the ordered surrogate column in the first data table, and perform partition processing on the second data table after replacement according to the parallel processing capability, to obtain at least two second partitions.
Optionally, if the inherent order of the first data table is the same as that of the second data table, the processing module 32 is configured to: perform partition processing on the second data table according to the inherent order that is the same as that of the first data table, to obtain at least two second partitions, where the number of second partitions is the same as the number of first partitions.
Specifically, the acquisition module 33 is configured to: scan the first hash sub-table, and obtain, as the data, all data information in the associated rows of the first data table and the second data table that correspond to the query information.
The data query device based on the OLAP system provided in this embodiment may be used to execute the technical solution of any of method embodiments one to three. Its implementation principle and technical effect are similar and are not repeated here.
Fig. 4 is a schematic structural diagram of an embodiment of the data query system based on the OLAP system according to the present invention. As shown in Fig. 4, the system includes a user terminal 41 and the data query device 42 based on the OLAP system described in any device embodiment shown in Fig. 3. The user terminal 41 is configured to send a data query message to the data query device 42 based on the OLAP system and to receive the data returned by the data query device 42 based on the OLAP system. The data query device 42 based on the OLAP system is configured to execute the technical solution of any of the method embodiments shown in Fig. 1 and Fig. 2 and in the example. Its implementation principle and technical effect are similar and are not repeated here.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments. The storage medium includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (11)

1. A data query method based on an OLAP system, characterized by comprising:
receiving a data query request sent by a user terminal, the data query request comprising query information, a first data table identifier and a second data table identifier;
performing, according to the data query request, logical partition processing on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and establishing, for each first partition, a corresponding join mapping relation, the join mapping relation comprising the boundary value of the first partition and the corresponding hash sub-table;
performing partition processing on the second data table corresponding to the second data table identifier to obtain at least two second partitions, and, for each second partition, obtaining a first foreign key value from the second partition, the first foreign key value being within the boundary value range of one of the first partitions;
querying the join mapping relation to obtain the first hash sub-table corresponding to the first foreign key value;
scanning the first hash sub-table, obtaining the data corresponding to the query information, and returning the data to the user terminal.
2. The method according to claim 1, characterized in that performing logical partition processing on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and establishing the join mapping relation, comprises:
performing partition processing on the first data table according to the inherent order of the first data table, to obtain at least two first partitions;
for each first partition, performing a hash operation according to the primary key of the first partition, and establishing the corresponding hash sub-table;
for each first partition, establishing the corresponding join mapping relation according to the hash sub-table corresponding to the first partition.
3. The method according to claim 1 or 2, characterized in that performing logical partition processing on the first data table corresponding to the first data table identifier to obtain at least two first partitions comprises:
if the first data table is ordered, performing logical partition processing on the first data table according to the inherent order of the first data table, to obtain at least two first partitions;
or,
if the first data table is unordered, adding an ordered surrogate column to the first data table as the primary key column, and performing logical partition processing on the first data table according to the order of the ordered surrogate column, to obtain at least two first partitions.
4. The method according to claim 3, characterized in that performing partition processing on the second data table corresponding to the second data table identifier to obtain at least two second partitions comprises:
if the first data table is ordered and the second data table is ordered, performing partition processing on the second data table according to the inherent order of the second data table and the parallel processing capability, to obtain at least two second partitions;
or,
if the first data table is ordered and the second data table is unordered, performing partition processing on the second data table according to the parallel processing capability, to obtain at least two second partitions;
or,
if the first data table is unordered and the second data table is unordered, replacing the original foreign key values in the second data table with the new primary key values corresponding to the ordered surrogate column in the first data table, and performing partition processing on the second data table after replacement according to the parallel processing capability, to obtain at least two second partitions.
5. The method according to any one of claims 1 to 4, characterized in that scanning the first hash sub-table and obtaining the data corresponding to the query information comprises:
scanning the first hash sub-table, and obtaining, as the data, all data information in the associated rows of the first data table and the second data table that correspond to the query information.
6. A data query device based on an OLAP system, characterized by comprising:
a transceiver module, configured to receive a data query request sent by a user terminal, the data query request comprising query information, a first data table identifier and a second data table identifier;
a processing module, configured to perform, according to the data query request, logical partition processing on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and to establish, for each first partition, a corresponding join mapping relation, the join mapping relation comprising the boundary value of the first partition and the corresponding hash sub-table;
the processing module being further configured to perform partition processing on the second data table corresponding to the second data table identifier to obtain at least two second partitions, and, for each second partition, to obtain a first foreign key value from the second partition, the first foreign key value being within the boundary value range of one of the first partitions;
an acquisition module, configured to query the join mapping relation to obtain the first hash sub-table corresponding to the first foreign key value;
the acquisition module being further configured to scan the first hash sub-table, obtain the data corresponding to the query information, and return the data to the user terminal through the transceiver module.
7. The device according to claim 6, characterized in that the processing module is specifically configured to:
perform partition processing on the first data table according to the inherent order of the first data table, to obtain at least two first partitions;
for each first partition, perform a hash operation according to the primary key of the first partition, and establish the corresponding hash sub-table;
for each first partition, establish the corresponding join mapping relation according to the hash sub-table corresponding to the first partition.
8. The device according to claim 6 or 7, characterized in that the processing module is configured to:
if the first data table is ordered, perform logical partition processing on the first data table according to the inherent order of the first data table, to obtain at least two first partitions;
or,
if the first data table is unordered, add an ordered surrogate column to the first data table as the primary key column, and perform logical partition processing on the first data table according to the order of the ordered surrogate column, to obtain at least two first partitions.
9. The device according to claim 8, characterized in that the processing module is further configured to:
if the first data table is ordered and the second data table is ordered, perform partition processing on the second data table according to the inherent order of the second data table and the parallel processing capability, to obtain at least two second partitions;
or,
if the first data table is ordered and the second data table is unordered, perform partition processing on the second data table according to the parallel processing capability, to obtain at least two second partitions;
or,
if the first data table is unordered and the second data table is unordered, replace the original foreign key values in the second data table with the new primary key values corresponding to the ordered surrogate column in the first data table, and perform partition processing on the second data table after replacement according to the parallel processing capability, to obtain at least two second partitions.
10. The device according to any one of claims 6 to 9, characterized in that the acquisition module is specifically configured to:
scan the first hash sub-table, and obtain, as the data, all data information in the associated rows of the first data table and the second data table that correspond to the query information.
11. A data query system based on an OLAP system, characterized by comprising: a user terminal and the data query device based on an OLAP system according to any one of claims 6 to 10.
CN201410228109.0A 2014-05-27 2014-05-27 Data query method, apparatus and system based on OLAP system Active CN103995879B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410228109.0A CN103995879B (en) 2014-05-27 2014-05-27 Data query method, apparatus and system based on OLAP system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410228109.0A CN103995879B (en) 2014-05-27 2014-05-27 Data query method, apparatus and system based on OLAP system

Publications (2)

Publication Number Publication Date
CN103995879A true CN103995879A (en) 2014-08-20
CN103995879B CN103995879B (en) 2017-12-15

Family

ID=51310044

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410228109.0A Active CN103995879B (en) 2014-05-27 2014-05-27 Data query method, apparatus and system based on OLAP system

Country Status (1)

Country Link
CN (1) CN103995879B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016165525A1 (en) * 2015-04-16 2016-10-20 华为技术有限公司 Data query method in crossing-partition database, and crossing-partition query device
CN107085570A (en) * 2016-02-14 2017-08-22 华为技术有限公司 Data processing method, application server and router
CN107229692A (en) * 2017-05-19 2017-10-03 哈工大大数据产业有限公司 A kind of distributed multi-table connecting method and system based on streamline
CN107729500A (en) * 2017-10-20 2018-02-23 锐捷网络股份有限公司 A kind of data processing method of on-line analytical processing, device and background devices
WO2018040722A1 (en) * 2016-08-31 2018-03-08 华为技术有限公司 Table data query method and device
CN107818117A (en) * 2016-09-14 2018-03-20 阿里巴巴集团控股有限公司 A kind of method for building up of tables of data, online query method and relevant apparatus
WO2018090557A1 (en) * 2016-11-18 2018-05-24 华为技术有限公司 Method and device for querying data table
CN108427684A (en) * 2017-02-14 2018-08-21 华为技术有限公司 Data query method, apparatus and computing device
CN108874873A (en) * 2018-04-26 2018-11-23 北京空间科技信息研究所 Data query method, apparatus, storage medium and processor
CN108959330A (en) * 2017-05-26 2018-12-07 阿里巴巴集团控股有限公司 A kind of processing of database, data query method and apparatus
CN109189808A (en) * 2018-09-18 2019-01-11 腾讯科技(深圳)有限公司 Data query method and relevant device
CN109582694A (en) * 2017-09-29 2019-04-05 北京国双科技有限公司 A kind of method and Related product generating data query script
CN109885574A (en) * 2019-02-22 2019-06-14 广州荔支网络技术有限公司 A kind of data query method and device
CN110083658A (en) * 2019-03-11 2019-08-02 北京达佳互联信息技术有限公司 Method of data synchronization, device, electronic equipment and storage medium
CN110287213A (en) * 2019-07-03 2019-09-27 中通智新(武汉)技术研发有限公司 Data query method, apparatus and system based on OLAP system
CN112597248A (en) * 2020-12-26 2021-04-02 中国农业银行股份有限公司 Big data partition storage method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120221510A1 (en) * 2010-03-31 2012-08-30 International Business Machines Corporation Method and system for validating data
CN102663117A (en) * 2012-04-18 2012-09-12 中国人民大学 OLAP (On Line Analytical Processing) inquiry processing method facing database and Hadoop mixing platform
CN103235793A (en) * 2013-04-01 2013-08-07 华为技术有限公司 On-line data processing method, equipment and system
CN103309958A (en) * 2013-05-28 2013-09-18 中国人民大学 OLAP star connection query optimizing method under CPU and GPU mixing framework
CN103324724A (en) * 2013-06-26 2013-09-25 华为技术有限公司 Method and device for processing data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
朱阅岸 等: ""一种基于三元组存储的列式OLAP查询执行引擎"", 《软件学报》 *

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016165525A1 (en) * 2015-04-16 2016-10-20 华为技术有限公司 Data query method in crossing-partition database, and crossing-partition query device
CN106156168A (en) * 2015-04-16 2016-11-23 华为技术有限公司 The method of data is being inquired about and across subregion inquiry unit in partitioned data base
CN106156168B (en) * 2015-04-16 2019-10-22 华为技术有限公司 Across the method and across subregion inquiry unit for inquiring data in partitioned data base
CN107085570A (en) * 2016-02-14 2017-08-22 华为技术有限公司 Data processing method, application server and router
CN107784044B (en) * 2016-08-31 2020-02-14 华为技术有限公司 Table data query method and device
WO2018040722A1 (en) * 2016-08-31 2018-03-08 华为技术有限公司 Table data query method and device
CN107818117A (en) * 2016-09-14 2018-03-20 阿里巴巴集团控股有限公司 A kind of method for building up of tables of data, online query method and relevant apparatus
CN107818117B (en) * 2016-09-14 2022-02-15 阿里巴巴集团控股有限公司 Data table establishing method, online query method and related device
WO2018090557A1 (en) * 2016-11-18 2018-05-24 华为技术有限公司 Method and device for querying data table
CN108073641A (en) * 2016-11-18 2018-05-25 华为技术有限公司 The method and apparatus for inquiring about tables of data
CN108073641B (en) * 2016-11-18 2020-06-16 华为技术有限公司 Method and device for querying data table
CN108427684A (en) * 2017-02-14 2018-08-21 华为技术有限公司 Data query method, apparatus and computing device
CN108427684B (en) * 2017-02-14 2020-12-25 华为技术有限公司 Data query method and device and computing equipment
CN107229692A (en) * 2017-05-19 2017-10-03 哈工大大数据产业有限公司 A kind of distributed multi-table connecting method and system based on streamline
CN107229692B (en) * 2017-05-19 2018-05-01 哈工大大数据产业有限公司 A kind of distributed multi-table connecting method and system based on assembly line
CN108959330A (en) * 2017-05-26 2018-12-07 阿里巴巴集团控股有限公司 A kind of processing of database, data query method and apparatus
CN109582694A (en) * 2017-09-29 2019-04-05 北京国双科技有限公司 A kind of method and Related product generating data query script
CN107729500A (en) * 2017-10-20 2018-02-23 锐捷网络股份有限公司 A kind of data processing method of on-line analytical processing, device and background devices
CN108874873A (en) * 2018-04-26 2018-11-23 北京空间科技信息研究所 Data query method, apparatus, storage medium and processor
CN108874873B (en) * 2018-04-26 2022-04-12 北京空间科技信息研究所 Data query method, device, storage medium and processor
CN109189808A (en) * 2018-09-18 2019-01-11 腾讯科技(深圳)有限公司 Data query method and relevant device
CN109885574A (en) * 2019-02-22 2019-06-14 广州荔支网络技术有限公司 A kind of data query method and device
CN110083658A (en) * 2019-03-11 2019-08-02 北京达佳互联信息技术有限公司 Method of data synchronization, device, electronic equipment and storage medium
CN110287213A (en) * 2019-07-03 2019-09-27 中通智新(武汉)技术研发有限公司 Data query method, apparatus and system based on OLAP system
CN110287213B (en) * 2019-07-03 2023-02-17 中通智新(武汉)技术研发有限公司 Data query method, device and system based on OLAP system
CN112597248A (en) * 2020-12-26 2021-04-02 中国农业银行股份有限公司 Big data partition storage method and device
CN112597248B (en) * 2020-12-26 2024-04-12 中国农业银行股份有限公司 Big data partition storage method and device

Also Published As

Publication number Publication date
CN103995879B (en) 2017-12-15

Similar Documents

Publication Publication Date Title
CN103995879A (en) Data query method, device and system based on OLAP system
US11132346B2 (en) Information processing method and apparatus
CN109034809B (en) Block chain generation method and device, block chain node and storage medium
CN106325933B (en) Batch data synchronous method and device
CN107622091B (en) Database query method and device
US20200272610A1 (en) Method, apparatus, device and medium for storing and querying data
JP7133647B2 (en) DATA PROCESSING METHOD, APPARATUS AND COMPUTER-READABLE STORAGE MEDIUM
CN107977396B (en) Method and device for updating data table of KeyValue database
CN103678556A (en) Method for processing column-oriented database and processing equipment
EP2863310A1 (en) Data processing method and apparatus, and shared storage device
EP3251033B1 (en) Hybrid data distribution in a massively parallel processing architecture
CN102129458A (en) Method and device for storing relational database
WO2019165671A1 (en) Method for rapidly importing big data, apparatus, terminal device, and storage medium
CN105868421A (en) Data management method and data management device
CN104216893A (en) Partitioned management method for multi-tenant shared data table, server and system
CN105550306A (en) Multi-copy data reading/writing method and system
CN109981569B (en) Network system access method, device, computer equipment and readable storage medium
CN113326005B (en) Read-write method and device for RAID storage system
CN102521304A (en) Hash based clustered table storage method
CN104573112A (en) Page query method and data processing node for OLTP cluster database
US11531706B2 (en) Graph search using index vertices
CN109542860B (en) Service data management method based on HDFS and terminal equipment
US20130304707A1 (en) Data Archiving Approach Leveraging Database Layer Functionality
CN116521956A (en) Graph database query method and device, electronic equipment and storage medium
US20180373746A1 (en) Table partition configuration method, apparatus and system for database system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant