CN105824957B - Query engine system and query method for a distributed in-memory columnar database - Google Patents
- Publication number: CN105824957B
- Application number: CN201610193220.XA
- Authority: CN (China)
- Prior art keywords: query engine, subtask, state, data, main
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- All classifications fall under G: PHYSICS; G06: COMPUTING; CALCULATING OR COUNTING; G06F: ELECTRIC DIGITAL DATA PROCESSING
- G06F16/24542: Plan optimisation
- G06F16/221: Column-oriented storage; Management thereof
- G06F16/24553: Query execution of query operations
- G06F16/24556: Aggregation; Duplicate elimination
- G06F16/2456: Join operations
- G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F9/5016: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
- G06F9/5027: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F2209/5017: Task decomposition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Operations Research (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a query engine system and a query method for a distributed in-memory columnar database. The query method includes: a resource management module designates a main query engine to be responsible for the session with the user; the main query engine converts the SQL statement sent by the user into a query plan; the resource management module allocates slave query engines to the main query engine; the main query engine splits the query plan into at least two subtasks and assigns a slave query engine to each subtask; a current subtask is executed once all of its predecessor subtasks have finished, the intermediate data produced by the current subtask is transferred to the slave query engine hosting its successor subtask, and the completion status of the current subtask is sent to the main query engine; and the main query engine notifies the client to fetch the final result data from the slave query engine. The query engine system and query method provided by the invention achieve good query efficiency.
Description
Technical field
The present invention relates to the field of database technology, and in particular to a query engine system and a query method for a distributed in-memory columnar database.
Background art
NewSQL is shorthand for a range of new, scalable, high-performance databases. Such databases not only offer the mass-data storage management capability of NoSQL, but also retain traditional database features such as ACID support and SQL. NewSQL systems fall roughly into three classes: new architectures, which use a brand-new database platform with a different design approach, such as Google Spanner, Clustrix, VoltDB and MemSQL; SQL engines, which are highly optimized SQL storage engines that provide the same programming interface as MySQL but scale better than the built-in InnoDB engine; and transparent sharding, a middleware layer that provides sharding so that the database is automatically split across multiple nodes. Over time these three classes have gradually merged, giving rise to large-scale distributed in-memory columnar databases oriented to online analytical processing (OLAP).
The query engine is the core of a database system and is responsible for scheduling the execution of the system's query computing tasks. A SQL statement entered by a user first undergoes lexical and syntactic parsing in the database system to generate a syntax tree. The query optimizer then transforms the syntax tree, which is finally converted into a query plan that the query engine can recognize. The query plan tells the query engine how to execute: how to extract data from the underlying storage engine and how to transform that data into the result the user wants.
HIVE is a data warehouse tool built on Hadoop. It provides a simple SQL query capability by converting SQL statements into MapReduce tasks and running them. For the SQL statement SELECT c_custkey FROM customer JOIN nation ON customer.C_NATIONKEY = nation.N_NATIONKEY JOIN lineitem ON lineitem.L_PARTKEY = customer.C_CUSTKEY, the query plan and task execution flow in HIVE are shown in Fig. 1. What HIVE actually executes are MapReduce tasks, so the query plan is converted into a set of MapReduce tasks; here the original query plan becomes two MapReduce tasks. JOB1 computes Join1, the join of the lineitem table and the customer table. JOB2 computes Join2, the join of the Join1 result and the nation table, and outputs the final result. JOB2 can start only after JOB1 has finished and written its intermediate result data to the external storage system; JOB2 then reads that intermediate result back from the external storage system and carries out its computation. The drawback of HIVE is evident: because its underlying layer uses the MapReduce computing model, two MapReduce tasks can share data only by having the first task write its result to an external storage system (a distributed file system or the local file system) and having the next task read it back. This causes a large amount of disk I/O, so the latency of the whole query is high.
Spark-SQL is another data warehouse tool with functionality similar to HIVE, but its underlying layer uses the Spark computing model rather than MapReduce. For the SQL statement SELECT c_custkey FROM customer JOIN nation ON customer.C_NATIONKEY = nation.N_NATIONKEY JOIN lineitem ON lineitem.L_PARTKEY = customer.C_CUSTKEY, the query plan and task execution flow in Spark-SQL are shown in Fig. 2. Stage1 mainly handles ScanTable(lineitem) and ScanTable(customer) in the query plan, which correspond to RDD1 and RDD2 respectively. Because an RDD is a resilient distributed dataset spread over multiple physical nodes, each of which can execute a corresponding task, an RDD is produced by multiple Tasks executed in parallel; RDD1, for example, is computed by Task1-1 and Task1-2. After the contents of the lineitem and customer tables have been read, Stage2 mainly handles the Join1 operation and the ScanTable(nation) operation, producing RDD3 and RDD4 respectively. Finally, Stage3 completes the Join2 operation. Spark-SQL has much lower computing latency than HIVE, but some drawbacks remain.
First, the underlying layer of Spark-SQL is implemented in Scala and runs entirely on the Java virtual machine, so its memory management depends on the Java virtual machine. The memory management of the Java virtual machine is general-purpose and is not tailored to database query engines, so Spark-SQL consumes a large amount of memory during computation.
Second, Spark-SQL executes tasks stage by stage: Stage2 can start only after Stage1 has finished, and Stage3 can start only after Stage2 has finished. Each Stage contains several Tasks that can run in parallel, and the latency of a Stage is determined by its longest-running Task. This creates a problem: a Task that finishes quickly must wait for the unfinished Tasks in the same Stage, and only when every Task in the Stage has finished can the Tasks of the next Stage start. For example, suppose Stage1 contains Task1-1, Task1-2, Task2-1 and Task2-2, Stage2 contains Task3-1 and Task3-2, and Task3-1 depends on the results of Task1-1, Task1-2 and Task2-1. If Task1-1, Task1-2 and Task2-1 have finished but Task2-2 has not, then even though Task3-1 satisfies its execution conditions, under the constraints of the Spark computing framework it still cannot start and must wait until Task2-2 finishes. If Task2-2 runs for too long, it delays the whole computation.
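The cost of the stage barrier in this example can be illustrated with a minimal sketch. The task durations are hypothetical values chosen for illustration, and the sketch assumes that all Stage1 tasks start in parallel at time 0; neither assumption comes from the patent text.

```python
# Hypothetical task durations in seconds; they are not taken from the patent.
DURATION = {"Task1-1": 2, "Task1-2": 3, "Task2-1": 1, "Task2-2": 10}
STAGE1 = list(DURATION)

# Task3-1 depends only on these three Stage1 tasks, as in the example above.
TASK3_1_DEPS = ["Task1-1", "Task1-2", "Task2-1"]


def earliest_start_with_stage_barrier():
    """A Stage2 task waits for the slowest task of the whole Stage1."""
    return max(DURATION[t] for t in STAGE1)


def earliest_start_with_dependencies(deps):
    """A task waits only for the tasks it actually depends on."""
    return max(DURATION[t] for t in deps)


barrier_start = earliest_start_with_stage_barrier()
dependency_start = earliest_start_with_dependencies(TASK3_1_DEPS)
print(barrier_start, dependency_start)
```

With these made-up numbers, the slow Task2-2 alone accounts for the gap between the two start times, even though Task3-1 never uses its output.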
Summary of the invention
The problem to be solved by the present invention is the low computational efficiency of existing database query engines.
The present invention is achieved through the following technical solutions:
A query engine system for a distributed in-memory columnar database comprises a resource management module, at least one main query engine, and at least one slave query engine. The main query engine converts a SQL statement into a query plan, splits the query plan into at least two subtasks, and is responsible for monitoring and scheduling the execution of the query plan. The slave query engine executes the subtasks assigned by the main query engine. The resource management module is responsible for managing and allocating system resources.
Optionally, the system resources include CPU computing resources and memory resources.
Based on the above query engine system, the present invention also provides a query method for a distributed in-memory columnar database, comprising: the resource management module designates a main query engine to be responsible for the session with the user; the main query engine converts the SQL statement sent by the user into a query plan; the resource management module allocates slave query engines to the main query engine and establishes communication between the slave query engines and the main query engine; the main query engine splits the query plan into at least two subtasks and assigns a slave query engine to each subtask; each slave query engine adds its subtasks to a task queue, executes the current subtask once all of that subtask's predecessor subtasks have finished, transfers the intermediate data produced by the current subtask to the slave query engine hosting the successor subtask, and sends the completion status of the current subtask to the main query engine; and after the whole query plan has completed, the main query engine notifies the client to fetch the final result data from the slave query engine.
The present invention splits the query plan into several interdependent subtasks and dispatches each subtask to the task queue of its slave query engine, which executes the subtasks in its queue in turn. This avoids the drawback of Spark-SQL, in which a task in a later stage cannot start computing even though it already satisfies its execution conditions, because of the constraints of the Spark-SQL execution framework. The query method provided by the invention therefore achieves good query efficiency.
Optionally, each subtask is represented by a physical operator, and the physical operators include at least one of: an extract-column-data operation, a join operation, a condition filter operation, a grouping operation, an aggregate function operation, a sorting operation, and an operation that restores the final result data to a row table.
Optionally, the main query engine assigns a slave query engine to each subtask according to a cost model. With a cost model, each subtask can be assigned to the slave query engine with the lowest execution cost, which further improves query efficiency.
Optionally, assigning a slave query engine to each subtask according to the cost model comprises: obtaining, from the metadata of the slave query engines, the IP of the node on which each slave query engine runs and the database table and column information stored on that node; assigning the execution node IP of each extract-column-data operation in the query plan according to the data-locality principle; and choosing the execution nodes of the operations other than extract-column-data with a greedy algorithm.
Optionally, the states of each subtask include a waiting-for-execution state, a computing state, a distributing-data state, an execution-finished state, and an execution-failed state.
Optionally, the initial state of the current subtask is waiting for execution. After the slave query engine hosting the current subtask has received the intermediate data produced by all of the current subtask's predecessor subtasks, the state of the current subtask changes to computing. When the computation finishes, the state changes to distributing data, and the intermediate data produced is sent to the slave query engine hosting the successor subtask. If the intermediate data is sent successfully, the state changes to execution finished. If a fault occurs between waiting for execution and computing, between computing and distributing data, or between distributing data and execution finished, the state changes to execution failed. Whenever the state of the current subtask changes, the main query engine is notified asynchronously.
Optionally, the intermediate data transferred between slave query engines is compressed column data. In a traditional database execution engine, intermediate data takes the form of tables and is stored by row. In most analytical workloads, however, the user cares about only a few attributes of a relation table. With row storage, attribute data the user does not care about is also loaded during computation, which wastes memory. Column storage solves this problem well.
Optionally, the compression comprises bit compression and dictionary compression. Using dictionary compression together with bit compression further reduces memory overhead and improves memory utilization.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
The query engine system and query method for a distributed in-memory columnar database provided by the invention improve overall computing efficiency by scheduling the execution of each subtask asynchronously: the query plan is split into several interdependent subtasks, each subtask is dispatched to the task queue of its slave query engine, and the slave query engine executes the subtasks in its queue in turn. Further, the data transferred between slave query engines is compressed column data, which solves the memory waste that row storage causes by loading attribute data the user does not care about during computation.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the embodiments of the invention and form part of this application; they do not limit the embodiments of the invention. In the drawings:
Fig. 1 is a schematic diagram of a SQL query plan and the task execution flow of HIVE;
Fig. 2 is a schematic diagram of a SQL query plan and the task execution flow of Spark-SQL;
Fig. 3 is a partial structural diagram of the query engine system of the distributed in-memory columnar database according to an embodiment of the invention;
Fig. 4 is a schematic diagram of a SQL query plan according to an embodiment of the invention;
Fig. 5 is a schematic diagram of the task execution flow according to an embodiment of the invention;
Fig. 6 is the execution state transition diagram of a subtask according to an embodiment of the invention;
Fig. 7 is a schematic diagram of data transfer between slave query engines according to an embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to an embodiment and the accompanying drawings. The exemplary embodiment and its description serve only to explain the invention and do not limit it.
Embodiment
This embodiment provides a query engine system for a distributed in-memory columnar database. The system comprises a resource management module, at least one main query engine, and at least one slave query engine.
Specifically, the main query engine parses the SQL statement to convert it into a query plan, splits the query plan into at least two subtasks, dispatches them to the slave query engines for execution, and is responsible for monitoring and scheduling the execution of the query plan and for fault tolerance. As in the prior art, the query plan is represented as a tree. The slave query engines execute the subtasks assigned by the main query engine, and the resource management module is responsible for managing and allocating system resources. Further, the system resources include CPU computing resources and memory resources. Fig. 3 is a partial structural diagram of the query engine system of this embodiment, in which main query engine 31 corresponds to three slave query engines: slave query engine 32, slave query engine 33 and slave query engine 34.
This embodiment also provides a query method for a distributed in-memory columnar database, based on the above query engine system, comprising:
Step S1: the resource management module designates a main query engine to be responsible for the session with the user. Specifically, when a user has a query request, the resource management module creates a main query engine in the resource pool to handle the session with that user.
Step S2: the main query engine converts the SQL statement sent by the user into a query plan. The main query engine does this through lexical parsing, syntactic parsing and rule-based query optimization. As in the prior art, the query plan is represented as a tree.
Step S3: the resource management module allocates slave query engines to the main query engine and establishes communication between the slave query engines and the main query engine. After the SQL statement has been converted into a query plan, the main query engine requests computing resources from the resource management module. The resource management module allocates slave query engines to the main query engine and establishes network connections between the slave query engines and the main query engine.
Step S4: the main query engine splits the query plan into at least two subtasks and assigns a slave query engine to each subtask. Because the query engine of this embodiment targets a distributed in-memory columnar database, data tables are stored by column in the distributed columnar database, and each column is split into several shards by value range. To exploit this property, this embodiment abstracts several physical operators, each representing a specific subtask in the query plan. The physical operators include at least one of: an extract-column-data operation, a join operation, a condition filter operation, a grouping operation, an aggregate function operation, a sorting operation, and an operation that restores the final result data to a row table.
Extract-column-data operation: the GetColumn operator extracts the data of one column in the columnar database. A GetColumn operator can carry an additional restriction; for example, GetColumn(Teacher.age) with the restriction Teacher.age > 1 extracts the age column of the Teacher table, keeping only the values greater than 1.
Join operation: the Join operator executes Join operations, including Left Join, Right Join, Full Join and so on.
Condition filter operation: the Filter operator executes condition filtering, mainly logical operations such as AND and OR.
Grouping operation: the GroupBy operator executes GroupBy grouping, implementing the GroupBy keyword of SQL statements.
Aggregate function operation: the AGG operator covers common database operations such as Max (maximum) and Avg (average).
Sorting operation: the Order operator sorts the columns that need to be sorted.
Restore-to-row-table operation: the BuildRow operator restores the final result data of the columnar database to a row table that the user can understand, presenting the final result to the user as a relation table.
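The operator set above can be sketched as a small data structure. This is only an illustrative model: the class names, the `get_column` helper and the in-memory `teacher` table are hypothetical and are not taken from the patent, which does not specify an API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Optional


class OperatorType(Enum):
    """The seven physical operators named in the text."""
    GET_COLUMN = "GetColumn"
    JOIN = "Join"
    FILTER = "Filter"
    GROUP_BY = "GroupBy"
    AGG = "AGG"
    ORDER = "Order"
    BUILD_ROW = "BuildRow"


@dataclass
class PhysicalOperator:
    """One subtask of the query plan; children are its input operators."""
    op_type: OperatorType
    name: str
    children: List["PhysicalOperator"] = field(default_factory=list)


def get_column(table: Dict[str, list], column: str,
               restriction: Optional[Callable[[object], bool]] = None) -> list:
    """Hypothetical GetColumn: read one column, optionally restricted."""
    values = table[column]
    if restriction is None:
        return list(values)
    return [v for v in values if restriction(v)]


# Made-up columnar table: each key is a column stored as its own list.
teacher = {"age": [0, 1, 25, 40], "name": ["a", "b", "c", "d"]}
ages = get_column(teacher, "age", lambda age: age > 1)

scan = PhysicalOperator(OperatorType.GET_COLUMN, "GetColumn(Teacher.age)")
root = PhysicalOperator(OperatorType.BUILD_ROW, "BuildRow", children=[scan])
print(ages, root.children[0].name)
```

Only the `age` list is touched when the restriction is evaluated, which is the property of column storage that the text relies on.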
For example, for the specific SQL statement SELECT c_custkey FROM customer JOIN nation ON customer.C_NATIONKEY = nation.N_NATIONKEY JOIN lineitem ON lineitem.L_PARTKEY = customer.C_CUSTKEY, the query plan generated by the main query engine's parsing is shown in Fig. 4, and the subtasks it is split into are shown in Fig. 5. Six slave query engines are involved: Slave-QE1, Slave-QE2, Slave-QE3, Slave-QE4, Slave-QE5 and Slave-QE6.
Assuming each column has two shards, there is one GetColumn operator for each shard of each column. Because each shard of a column covers a value range, a Join operator based on that shard range is also generated for each shard. Referring to Fig. 5, the Join1 node represents the equi-join of columns L_PARTKEY and C_CUSTKEY. In the actual subtasks, Join1 is split into two concrete physical operators, Join1-1 and Join1-2, which handle the equi-join for the value ranges 1-100 and 101-150 respectively. Likewise, Join2 in the query plan is split into two concrete Join operators.
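Splitting Join1 by shard range can be illustrated with a minimal sketch. The column values are made up, and the helper functions are hypothetical; only the two value ranges, 1-100 and 101-150, come from the text. The point shown is that two range-local equi-joins together produce the same result as one join over the whole columns.

```python
def shard_by_range(values, ranges):
    """Split a column into shards, one per inclusive (low, high) value range."""
    return {r: [v for v in values if r[0] <= v <= r[1]] for r in ranges}


def equi_join(left, right):
    """Keep every left value that also occurs on the right side."""
    right_set = set(right)
    return [v for v in left if v in right_set]


RANGES = [(1, 100), (101, 150)]        # value ranges of Join1-1 and Join1-2

# Hypothetical column contents, not taken from the patent.
l_partkey = [5, 42, 120, 120, 149]
c_custkey = [42, 77, 120, 150]

left_shards = shard_by_range(l_partkey, RANGES)
right_shards = shard_by_range(c_custkey, RANGES)

# Each range-local join is an independent subtask (Join1-1, Join1-2).
join1 = {r: equi_join(left_shards[r], right_shards[r]) for r in RANGES}
print(join1)
```

Because matching keys always fall into the same value range, neither sub-join needs data from the other shard, so the two can run on different slave query engines.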
Further, in this embodiment the main query engine assigns a slave query engine to each subtask according to a cost model. Specifically, the main query engine obtains, from the metadata of the slave query engines, the IP of the node on which each slave query engine runs and the database table and column information stored on that node. The execution node IP of each extract-column-data operation in the query plan is assigned according to the data-locality principle. In Fig. 5, for example, the physical node hosting slave query engine Slave-QE1 stores a shard of the L_PARTKEY column, so the GetColumn operator for that shard is assigned to that physical node. Likewise, the GetColumn operator of every shard executes on the node that stores the corresponding data. Execution nodes for non-GetColumn operators are chosen with a greedy algorithm: the candidates are the execution nodes of the operator's child operators; the cost of executing the operator on each child's physical node is computed, and the physical node with the lowest execution cost is selected. The basic cost formula is:
execution cost = network cost + computation cost = inter-node network load × amount of data transferred + node task load × amount of data computed
In Fig. 5, the Join1-1 operator can execute either on slave query engine Slave-QE1 or on slave query engine Slave-QE3. Slave-QE1 is chosen because, after the execution cost of Join1-1 is computed on both nodes, the cost on Slave-QE1 is found to be smaller, so Slave-QE1 is selected as the final execution node.
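The greedy choice for a non-GetColumn operator can be sketched as follows. The cost function follows the formula in the text; the `Candidate` class, its field names and all numeric loads and data volumes are hypothetical and chosen only so that the example is runnable.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A child operator's node considered as the execution node."""
    node: str
    network_load: float      # inter-node network load
    transfer_volume: float   # amount of data that must be transferred
    task_load: float         # task load on the node
    compute_volume: float    # amount of data to compute


def execution_cost(c: Candidate) -> float:
    """execution cost = network cost + computation cost."""
    network_cost = c.network_load * c.transfer_volume
    computation_cost = c.task_load * c.compute_volume
    return network_cost + computation_cost


def pick_execution_node(candidates):
    """Greedy step: take the candidate with the lowest execution cost."""
    return min(candidates, key=execution_cost)


# Made-up figures for the two candidate nodes of Join1-1.
candidates = [
    Candidate("Slave-QE1", network_load=1.0, transfer_volume=50,
              task_load=0.5, compute_volume=200),
    Candidate("Slave-QE3", network_load=1.0, transfer_volume=150,
              task_load=0.5, compute_volume=200),
]
chosen = pick_execution_node(candidates)
print(chosen.node, execution_cost(chosen))
```

The choice is greedy in the sense that each operator is placed by its own local cost, without revisiting the placement of operators that were already assigned.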
Step S5: each slave query engine adds its subtasks to a task queue, executes the current subtask once all of that subtask's predecessor subtasks have finished, transfers the intermediate data produced by the current subtask to the slave query engine hosting the successor subtask, and sends the completion status of the current subtask to the main query engine. Specifically, each subtask has five states: waiting for execution, computing, distributing data, execution finished, and execution failed. Each subtask also maintains a list of its predecessor subtasks and a list of its successor subtasks. The execution state transition diagram of a subtask is shown in Fig. 6.
Take the Join1-1 operator shown in Fig. 4 as an example. Its predecessor operator list is GetColumn(L_PARTKEY Slice 1 [1-100]) and GetColumn(C_CUSTKEY Slice 1 [1-150]), and its successor operator list is the Join2-1 operator. The initial state of Join1-1 is waiting for execution. Once the physical node hosting Join1-1 has received the data sent by all of its predecessor operators, the state changes to computing. When Join1-1 finishes computing, its state changes to distributing data, and the result data is sent over the network to the slave query engine hosting the successor operator. When the data has been sent successfully, the operator's task is complete. If a fault occurs at any step, that is, between waiting for execution and computing, between computing and distributing data, or between distributing data and execution finished, the operator state is set to execution failed. Each time the state of Join1-1 changes, the current state is reported to the main query engine in real time. Operators execute independently of one another: whenever an operator's state changes during execution, the main query engine is notified asynchronously, and the result data is pushed to the physical node hosting the successor operator.
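The five states and their transitions can be sketched as a small state machine. The class and member names are hypothetical, and the `report` callback is a synchronous stand-in for the asynchronous notification to the main query engine described in the text.

```python
from enum import Enum


class State(Enum):
    WAITING = "waiting for execution"
    COMPUTING = "computing"
    DISTRIBUTING = "distributing data"
    FINISHED = "execution finished"
    FAILED = "execution failed"


# Allowed transitions: a fault in any of the three steps leads to FAILED.
ALLOWED = {
    State.WAITING: {State.COMPUTING, State.FAILED},
    State.COMPUTING: {State.DISTRIBUTING, State.FAILED},
    State.DISTRIBUTING: {State.FINISHED, State.FAILED},
    State.FINISHED: set(),
    State.FAILED: set(),
}


class Subtask:
    def __init__(self, name, report):
        self.name = name
        self.state = State.WAITING   # initial state
        self.report = report         # stand-in for notifying the main engine

    def move_to(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.state.name} -> {new_state.name} is not allowed")
        self.state = new_state
        self.report(self.name, new_state)   # every change is reported


reports = []
join = Subtask("Join1-1", lambda name, state: reports.append((name, state)))
for step in (State.COMPUTING, State.DISTRIBUTING, State.FINISHED):
    join.move_to(step)
print(join.state, len(reports))
```

Keeping the transition table separate from the subtask makes it straightforward to check that a state change is legal before it is reported.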
In this way, whether the execution of the subtask forerunner subtask that places one's entire reliance upon completes, without as Spark or
MapReduce is the same, goes execution task stage by stage.
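The dependency-driven behaviour in the Join1-1 example can be illustrated with a single-process simulation. This is a minimal sketch under stated assumptions, not the patented implementation: `run_plan`, `EXAMPLE_PLAN`, and the `compute` callback are hypothetical names, the network transfer is replaced by writing into an in-memory inbox, the `reports` list stands in for the asynchronous notifications sent to the main query engine, and the example plan is simplified so that Join2-1 has Join1-1 as its only predecessor.

```python
from collections import deque

# Simplified plan: operator name -> list of predecessor operator names.
EXAMPLE_PLAN = {
    "GetColumn(L_PARTKEY)": [],
    "GetColumn(C_CUSTKEY)": [],
    "Join1-1": ["GetColumn(L_PARTKEY)", "GetColumn(C_CUSTKEY)"],
    "Join2-1": ["Join1-1"],
}


def run_plan(operators, compute):
    """Run each operator as soon as all predecessor results have arrived.

    operators: dict mapping operator name to its predecessor names (a DAG).
    compute:   function (name, inputs) -> result; a hypothetical stand-in
               for the real physical operator.
    Returns (results, reports), where reports is the ordered list of
    (operator, state) messages that would go to the main query engine.
    """
    successors = {name: [] for name in operators}
    for name, preds in operators.items():
        for pred in preds:
            successors[pred].append(name)

    inbox = {name: {} for name in operators}  # data pushed by predecessors
    reports = [(name, "waiting") for name in operators]
    results = {}
    ready = deque(name for name, preds in operators.items() if not preds)

    while ready:
        name = ready.popleft()
        reports.append((name, "computing"))
        results[name] = compute(name, inbox[name])
        reports.append((name, "distributing"))
        for succ in successors[name]:
            inbox[succ][name] = results[name]      # push the result downstream
            if len(inbox[succ]) == len(operators[succ]):
                ready.append(succ)                 # all inputs present: runnable
        reports.append((name, "completed"))
    return results, reports
```

An operator enters the ready queue the moment its last input arrives, so no global stage barrier is needed; this is the property the text contrasts with stage-by-stage execution.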
In step S5, the intermediate data transmitted between slave query engines is column data that has been compressed. The compression methods used are bit compression and dictionary compression. Taking the data structure shown in Fig. 7 as an example, the intermediate data consists of three vectors: a dictionary vector, an offset vector, and a position vector. The dictionary vector is built by sorting the original data and then removing duplicates; the redundant data is discarded, which saves memory. The offset vector and the position vector both store integers, so a bit compression strategy is applied to them. In a computer, an INT occupies four bytes, i.e. 32 bits, and can represent the range -2147483648 to 2147483647. For the offset vector and position vector shown in Fig. 7, the maximum integer in the vector can be determined, so in most cases storing a number does not need the full 32 bits. Assuming the maximum integer in the offset vector or position vector is A, the number of bits used to store one number is log2(A) rounded up. Compared with the conventional approach of storing integers in INT or LONG variables, this saves more memory.
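A worked example may make the two compression steps concrete. The sketch below is illustrative only and rests on assumptions: Fig. 7 is not reproduced here, so the position vector is taken to hold, for each row, the index of that row's value in the dictionary, and the offset vector is left out. The function names are hypothetical. The text gives the width as log2(A) rounded up; the sketch uses Python's `int.bit_length()`, which gives the same result except when A is an exact power of two, where one more bit is needed to represent A itself.

```python
def dictionary_encode(values):
    """Return (dictionary, positions): the sorted distinct values and,
    for each row, the index of that row's value in the dictionary."""
    dictionary = sorted(set(values))
    index = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [index[v] for v in values]


def bit_width(max_value):
    """Bits needed to store any integer in 0..max_value (at least 1)."""
    return max(1, max_value.bit_length())


def pack(ints, width):
    """Pack non-negative integers into bytes using `width` bits each."""
    acc = 0
    for i, x in enumerate(ints):
        if x < 0 or x >> width:
            raise ValueError(f"{x} does not fit in {width} bits")
        acc |= x << (i * width)
    return acc.to_bytes((len(ints) * width + 7) // 8, "little")


def unpack(data, width, count):
    """Inverse of pack: recover `count` integers of `width` bits each."""
    acc = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]


# Example column with three distinct values in six rows.
column = ["CN", "US", "CN", "DE", "US", "CN"]
dictionary, positions = dictionary_encode(column)
width = bit_width(max(positions))
packed = pack(positions, width)
```

For this column the largest position is 2, so each entry needs 2 bits and the six positions fit in 2 bytes, compared with 24 bytes if each were stored as a 32-bit INT.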
Step S6: after the entire query plan has completed, the main query engine notifies the client to obtain the final result data from the slave query engine. At this point, the whole query is complete.
The specific embodiments described above explain the objectives, technical solutions, and beneficial effects of the present invention in further detail. It should be understood that the foregoing are merely specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (3)
1. A query method for a distributed in-memory columnar database, characterized in that the method is based on a query engine system for the distributed in-memory columnar database, the system comprising:
a resource management module, at least one main query engine, and at least one slave query engine;
the main query engine is used to convert SQL statements into a query plan, divide the query plan into at least two subtasks, and monitor and schedule the execution of the query plan;
the slave query engine is used to execute the subtasks assigned by the main query engine;
the resource management module is responsible for managing and allocating system resources;
the method comprising: the resource management module designates a main query engine to be responsible for the session with the user;
the main query engine converts the SQL statement sent by the user into a query plan;
the resource management module allocates slave query engines to the main query engine and establishes communication between the slave query engines and the main query engine;
the main query engine divides the query plan into at least two subtasks and assigns a slave query engine to each subtask;
the slave query engine adds each subtask to a task queue, executes the current subtask after all predecessor subtasks of the current subtask have finished executing, transmits the intermediate data generated by executing the current subtask to the slave query engine where the successor subtask is located, and sends the completion status of the current subtask to the main query engine;
after the entire query plan has completed, the main query engine notifies the client to obtain the final result data from the slave query engine;
each subtask is represented by a physical operator, the physical operators including at least one of: an extract-column-data operation, a join operation, a conditional filter operation, a grouping operation, an aggregate function operation, a sort operation, and an operation that converts the final result data into a row-oriented table;
the states of each subtask include a waiting-for-execution state, a computing state, a distributing-data state, an execution-completed state, and an execution-failed state;
the initial state of the current subtask is the waiting-for-execution state; after the slave query engine where the current subtask is located has received the intermediate data generated by all predecessor subtasks of the current subtask, the state of the current subtask is changed to the computing state; after the current subtask finishes computing, its state is changed to the distributing-data state, and the intermediate data generated by the computation is sent to the slave query engine where the successor subtask is located; if the intermediate data is sent successfully, the state of the current subtask is changed to the execution-completed state; if a fault occurs between the waiting-for-execution state and the computing state, between the computing state and the distributing-data state, or between the distributing-data state and the execution-completed state, the state of the current subtask is changed to the execution-failed state; whenever the state of the current subtask changes, the main query engine is notified asynchronously; the main query engine assigns a slave query engine to each subtask according to a cost model; the main query engine assigning a slave query engine to each subtask according to the cost model comprises:
obtaining, from the metadata information of the slave query engines, the IP of the node where each slave query engine is located and the database table information and column information stored on that node;
assigning, according to the data locality principle, the execution node IP of each extract-column-data operation in the query plan;
selecting the execution nodes of the non-extract-column-data operations using a greedy algorithm.
2. The query method for a distributed in-memory columnar database according to claim 1, characterized in that the intermediate data transmitted between slave query engines is column data that has been compressed.
3. The query method for a distributed in-memory columnar database according to claim 2, characterized in that the compression processing includes bit compression and dictionary compression.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610193220.XA CN105824957B (en) | 2016-03-30 | 2016-03-30 | The query engine system and querying method of distributed memory columnar database |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105824957A (en) | 2016-08-03 |
CN105824957B (en) | 2019-09-03 |
Family
ID=56524572
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610193220.XA Active CN105824957B (en) | 2016-03-30 | 2016-03-30 | The query engine system and querying method of distributed memory columnar database |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105824957B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103123652A (en) * | 2013-03-14 | 2013-05-29 | 曙光信息产业(北京)有限公司 | Data query method and cluster database system |
CN103324765A (en) * | 2013-07-19 | 2013-09-25 | 西安电子科技大学 | Multi-core synchronization data query optimization method based on column storage |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |