CN105740344A - Sql statement combination method and system independent of database - Google Patents

Sql statement combination method and system independent of database

Info

Publication number
CN105740344A
Authority
CN
China
Prior art keywords
sql statement
queue
merging
statement
initialization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610048596.1A
Other languages
Chinese (zh)
Inventor
孙建洲
宋�莹
孙毓忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN201610048596.1A
Publication of CN105740344A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/242: Query formulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides an SQL statement combination method and system independent of a database, and relates to the field of database acceleration technologies. The method comprises: scanning a selected database according to a database name; acquiring all table name information of the selected database; performing queue initialization according to the table name information so as to generate initialization queues whose number equals the total number of tables in the selected database; acquiring SQL statements; placing the SQL statements into a globally open producer queue, and extracting the SQL statements in order from the producer queue; performing object encapsulation on each extracted SQL statement; storing the encapsulated SQL statements in the initialization queues; performing a dequeuing operation on all the SQL statements in the initialization queues; and combining the dequeued SQL statements according to a combination rule, so that each of the initialization queues corresponds to one total SQL statement.

Description

A database-independent SQL statement merging method and system
Technical field
The present invention relates to the field of database acceleration technology, and in particular to a database-independent SQL statement merging method and system.
Background art
As data in every field keeps growing and being updated, the databases that support the various kinds of data keep developing as well. At the same time, the demands on a database's concurrency performance keep rising, while that performance is to a large extent limited by what the database itself and the hardware can support.
For a database, the concurrency count refers to the number of non-background sessions in the Active state, that is, the number of query requests being sent to the database at the same moment.
Take Oracle: although Oracle can support at most 200 concurrent sessions, in practice reaching even 50 is rare. For a two-node 10g RAC whose nodes are rx8640 machines with 16 CPUs and 64 GB of memory, the load is typically already very heavy below 50 concurrent sessions, and the system is essentially unresponsive at 80.
On a machine with 8 GB of memory running 64-bit Linux, memory reaches the warning level once the number of MySQL connections exceeds 250; combined with MySQL connections not being released, this leads to the "too many connections" error.
For Hive, Map/Reduce jobs are executed sequentially by default; the default concurrency is 8, and the default number of threads used for concurrent submission is also 8.
Impala depends heavily on memory during query processing. On three physical laboratory nodes, each with a 4-core CPU, 48 GB of memory, and a 3 TB hard disk, and with table data on the order of a hundred million rows, its best concurrency when running small queries concurrently was still below 50.
As database engines are increasingly used for data queries across industries, support for database concurrency is becoming more and more important. Existing efforts that tune concurrency from inside the database itself increasingly fail to meet growing demand, so an approach that parallelizes from outside the database is urgently needed.
At present, a large body of merging research targets the weak concurrency performance that results from not making full use of memory, and merging techniques have been proposed to improve database query speed. These fall into two broad classes. The first class works outside the query, that is, it operates on statements before they are handed to the database. The patent "Method and device for merging and classifying massive SQL statements" (publication No. 102945256A) obtains the variable values of an SQL statement, replaces those values with constants to obtain a parsed SQL statement, computes the HASH value of the parsed statement, and classifies and merges the statements according to the HASH value. Statements can therefore be merged only when their variables are entirely consistent, which makes the merging very restrictive. In addition, storing the hash values takes extra space, and receiving a massive number of statements places a further burden on memory.
The article "A Combined Approach to Prevent SQL Injection Attacks" (Evans Dogbe, Richard Millham, Prenitha Singh, Science and Information Conference 2013) proposes a merging strategy in which all known, approved SQL statements are merged into the nodes of a parse tree, so that each node corresponds to one known SQL statement. The tree serves as a rule: every SQL statement received later is compared against the existing nodes of the tree. If a match exists, the statement is processed; if not, it is handled separately.
Both kinds of work above merge statements outside the database. However, the merging in both is done to support a subsequent matching step. It either places very strict requirements on the condition variables of the statements, so that a match is computed only when all variables are consistent, or it simply collects and merges all SQL statements that meet certain requirements. Compared with the statement merging part of this patent, such work covers only a small part of what is merged in this patent.
The patent "Method and system for SQL script objectification" (publication No. 103617273A) discloses a method for decomposing complex statements. One SQL statement is split into multiple query objects, each representing one piece of the query. Once enough objects have accumulated, the modules obtained by splitting each subsequent SQL statement can be assigned directly to existing query objects. The operation corresponding to each existing query object is stored together with its result, so when an identical module operation is encountered the result can be read directly. In the present invention, by contrast, field extraction disassembles each SQL statement and encapsulates it in an object with multiple member variables, so that each statement corresponds to its own object. During statement reconstruction, the modules of the encapsulated statements are merged as a whole, so the final result is a single statement that contains the corresponding content of all modules of the multiple SQL statements.
Moreover, because the two kinds of work above have different purposes, their final result is not one complete SQL statement containing the full content of multiple statements, and the meaning of their merged result is entirely different from that of this patent.
The second class works inside the query. The patent "An SQL statement processing system based on cloud computing" (publication No. 104391895A) proposes an SQL statement processing system based on cloud computing. Its inventive feature is that syntactic analysis disassembles the SQL statement and splits the objects in the syntax tree into atomic objects of minimum granularity; atomic objects with identical content are then extracted and merged into one common atomic object. Its merge optimizer extracts identical expression objects, merges them into one expression object, and places it in a common memory object pool, which allows large objects to be shared. With this merging function each statement can be split into modules and common modules can be executed fewer times, but for n SQL statements received in total, what is finally executed is still n SQL statements.
The above work also splits SQL statements. To achieve sharing, each module is then searched for and matched against all objects in the common memory object pool. If a match is found, the shared operation of several subqueries with the same module needs to be executed only once, which reduces resource consumption to some extent. However, as SQL statements keep arriving, especially large numbers of non-repeating statements, the common memory object pool keeps growing as new statement modules are added. This not only increases memory overhead but also greatly increases the work of searching the pool for a match for each new statement module, which offsets part of the performance gain from executing common modules only once. In addition, because the main purpose is to let identical modules consume as few resources as possible, different modules in different statements are not handled. For n newly received SQL statements, what is executed overall is still n statements, which is of little help when concurrency is at a bottleneck or overall concurrency performance is poor.
As for statement reconstruction, the patent "SQL statement optimization method based on constant replacement" (publication No. 103678621A) builds a binary tree from the where clause. The nodes of the binary tree are Boolean expressions, and its interior nodes are AND nodes or OR nodes. During merging, the binary tree is traversed in order and all leaf nodes are placed in a linked list, forming a replacement list.
That reconstruction mainly reorganizes, as a binary tree, the different Boolean expressions in the where clause of a single complex statement. The statement reconstruction in this patent instead merges each statement module across all the different statements, with the final aim of reconstructing several SQL statements that target the same table into one SQL statement according to certain rules.
The result splitting part, like the statement reconstruction part, is a key point of protection of this patent. It splits the query result set of the merged statement and assigns the pieces to the encapsulation class corresponding to each original SQL statement, so that each query request is bound to its result.
When the split result sets are fed back to the original queries, each split result set is assigned to the result-set member of the class that encapsulates the corresponding query request.
Summary of the invention
In view of the deficiencies of the prior art, the present invention proposes a database-independent SQL statement merging method and system.
The present invention proposes a database-independent SQL statement merging method, comprising:
Step 1: scanning the selected database according to its database name, obtaining all table name information of the selected database, and initializing queues according to the table name information, so that the number of initialization queues generated equals the total number of tables in the selected database;
Step 2: obtaining SQL statements, placing them in a globally open producer queue, extracting them from the producer queue in order, performing object encapsulation on each extracted SQL statement, and storing the encapsulated SQL statement in the initialization queues;
Step 3: dequeuing all SQL statements in the initialization queues and merging the dequeued SQL statements according to a merging rule, so that each initialization queue corresponds to one total SQL statement, thereby completing the merging of the SQL statements.
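Steps 1 and 2 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `SqlObject` class, the `init_queues`, `encapsulate`, and `drain` helpers, and the simplified regular expression, which only handles single-table statements of the form `select ... from t [where ...]`. The table names are passed in directly rather than obtained by scanning a real database.

```python
import queue
import re
from dataclasses import dataclass, field


@dataclass
class SqlObject:
    """Hypothetical wrapper produced by "object encapsulation" of one statement."""
    sql_id: int
    select: list
    table: str
    where: str
    result: list = field(default_factory=list)


def init_queues(table_names):
    # Step 1: one initialization queue per table name found in the database.
    return {name: queue.Queue() for name in table_names}


# Assumption: only "select <fields> from <table> [where <cond>]" is supported.
_PATTERN = re.compile(
    r"select\s+(?P<select>.+?)\s+from\s+(?P<table>\w+)"
    r"(?:\s+where\s+(?P<where>.+))?$",
    re.IGNORECASE,
)


def encapsulate(sql_id, sql):
    # Step 2 (second half): split one statement into its parts.
    m = _PATTERN.match(sql.strip().rstrip(";"))
    if m is None:
        raise ValueError(f"unsupported statement: {sql!r}")
    fields = [f.strip() for f in m.group("select").split(",")]
    return SqlObject(sql_id, fields, m.group("table"), m.group("where") or "")


def drain(producer, queues):
    # Step 2: take statements from the producer queue in order,
    # encapsulate each one, and store it in the queue of its table.
    next_id = 0
    while not producer.empty():
        obj = encapsulate(next_id, producer.get())
        queues[obj.table].put(obj)
        next_id += 1
```

A caller would fill a `queue.Queue()` with statement strings and then call `drain(producer, init_queues([...]))`; a real implementation would need a proper SQL parser and thread-safe handling of concurrent producers.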
In step 2, the rule for storing an SQL statement in the initialization queues is as follows: if the from part of the SQL statement refers to a single table, the object encapsulating the SQL statement is placed in the queue corresponding to that table; if the from part refers to a join of multiple tables, a new queue is created according to the names of the joined tables and is used to store subsequent SQL statements on the same table join.
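The routing rule above might look like the following sketch. The function names `from_tables` and `route` are hypothetical, the from-part parsing is deliberately simplified (comma-separated tables or `join`), and naming the join queue by the sorted table names joined with `+` is an assumption; the text only says the new queue is created according to the joined table names.

```python
import queue
import re

# Simplified from-part extraction: stops at where / group by / order by / limit.
_FROM = re.compile(
    r"\bfrom\s+(.+?)(?:\s+where\b|\s+group\s+by\b|\s+order\s+by\b|\s+limit\b|\s*$)",
    re.IGNORECASE | re.DOTALL,
)


def from_tables(sql):
    """Return the table names mentioned in the from part of a statement."""
    m = _FROM.search(sql.strip().rstrip(";"))
    if m is None:
        raise ValueError("no from part found")
    pieces = re.split(r",|\bjoin\b", m.group(1), flags=re.IGNORECASE)
    # The first token of each piece is the table name; aliases are ignored.
    return [p.split()[0] for p in pieces if p.strip()]


def route(queues, sql, obj=None):
    """Put the statement (or its encapsulating object) in the matching queue."""
    tables = from_tables(sql)
    # Assumption: a join queue is keyed by the sorted table names.
    key = tables[0] if len(tables) == 1 else "+".join(sorted(tables))
    queues.setdefault(key, queue.Queue()).put(obj if obj is not None else sql)
    return key
```

With queues for `users` and `orders` already present, a statement joining both tables creates a third queue the first time it is seen, and later statements on the same join reuse it.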
The method further comprises, between step 2 and step 3:
Step C1: for each SQL statement extracted from the producer queue, adding the statement to its corresponding initialization queue; if during the addition the number of SQL statements reaches the threshold of that initialization queue, obtaining that queue and triggering its merge operation;
Step C2: taking the first extraction of an SQL statement from the producer queue as the initial start time, and, whenever the elapsed interval reaches the set time threshold, performing one global merge in which the SQL statements in all initialization queues are merged together.
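The two triggers in steps C1 and C2 can be sketched as a small helper. The class name `MergeTrigger`, its method names, and the injectable `clock` parameter are all illustrative assumptions; the patent does not prescribe how the thresholds are checked.

```python
import time


class MergeTrigger:
    """Hypothetical helper deciding when a merge should run."""

    def __init__(self, size_threshold, time_threshold, clock=time.monotonic):
        self.size_threshold = size_threshold  # C1: per-queue statement count
        self.time_threshold = time_threshold  # C2: seconds between global merges
        self.clock = clock
        self.last = None  # set on the first extraction from the producer queue

    def mark_extraction(self):
        # C2 measures time from the first extraction only.
        if self.last is None:
            self.last = self.clock()

    def queue_full(self, queue_len):
        # C1: merge this one queue once it holds enough statements.
        return queue_len >= self.size_threshold

    def global_due(self):
        # C2: merge all queues whenever the interval has elapsed.
        if self.last is None:
            return False
        now = self.clock()
        if now - self.last >= self.time_threshold:
            self.last = now
            return True
        return False
```

Passing the clock in as a parameter is only a testing convenience; with the default `time.monotonic`, `global_due()` would typically be polled by the thread that drains the producer queue.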
The merging rule of step 3 is as follows.
Merging the fields of the select part of the SQL statements: when no * appears, a field that occurs in more than one SQL statement is kept only once (later occurrences are omitted), and a field that differs is added to the merged result, with fields separated by commas in accordance with the SQL92 specification; if a * appears, the merged result of the SQL statements merged so far is changed to *, and the select parts of SQL statements merged later no longer take part in merging.
Merging the fields of the where part: the content of the where part starts after the where keyword, which is itself excluded, and ends at the earliest occurrence of one of the subsequent operation keywords group by, order by, or limit, which is likewise excluded.
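A minimal sketch of the two rules, assuming each select part has already been split into a list of field names. The function names `merge_select` and `where_part` are hypothetical. The sketch covers only what the text states: the select-part union with the `*` short-circuit, and the boundaries of the where part. How the extracted where parts of different statements are combined into the total statement is not shown.

```python
import re


def merge_select(field_lists):
    """Merge the select parts of several statements into one field list."""
    merged = []
    for fields in field_lists:
        if "*" in fields:
            # Once * appears, the merged select part is simply *.
            return "*"
        for f in fields:
            if f not in merged:  # repeated fields are kept only once
                merged.append(f)
    return ", ".join(merged)  # comma-separated, as in SQL92


_TAIL = re.compile(r"\b(group\s+by|order\s+by|limit)\b", re.IGNORECASE)


def where_part(sql):
    """Return the text after 'where' up to the first group by/order by/limit."""
    m = re.search(r"\bwhere\b", sql, re.IGNORECASE)
    if m is None:
        return ""
    rest = sql[m.end():]
    tail = _TAIL.search(rest)
    if tail:
        rest = rest[:tail.start()]
    return rest.strip()
```

For example, `merge_select([["id", "name"], ["name", "age"]])` gives `"id, name, age"`. The keyword search is purely textual, so a string literal containing `limit` or `where` would confuse it; a real implementation would tokenize the statement first.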
The method further comprises a query execution step: inputting the merged SQL statement of each initialization queue into the database query engine and obtaining the query result set of the merged statement.
The method further comprises a query result set splitting step: each query result set is split according to a splitting rule and distributed to the object-encapsulated SQL statements in the corresponding initialization queue. The splitting rule traverses each object-encapsulated SQL statement in the initialization queue as follows. First, the columns of the query result set are located according to the content of the select part. Next, the subsequent-content keywords are checked for limit: if limit is present, the corresponding number of rows is first filtered out according to it; if limit is absent, the object encapsulating the where part is looked up and the query result set is given a first screening according to the relation among each key, rela, and value. If the where part of the encapsulated SQL statement is empty, the where screening is skipped and the query result set goes straight to screening by the subsequent content. Finally, the subsequent content is used for the last splitting and sorting pass: for order by, the previously screened result set is sorted by the corresponding keyword; for group by, it is grouped by the corresponding keyword. The result set produced for each statement is placed in the corresponding originally encapsulated SQL statement.
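The where screening, sorting, and select projection of the splitting step can be sketched as follows. This is a partial illustration under several assumptions: the merged result set is a list of dictionaries, each where condition is a `(key, rela, value)` triple, all conditions of one statement must hold together, and the merged result still contains every column used in the conditions. The limit and group by branches are left out, and the name `split_result` is hypothetical.

```python
import operator

# Relations ("rela") supported by this sketch.
_OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def split_result(rows, select, conditions, order_by=None):
    """Extract the rows and columns belonging to one original statement.

    rows       -- merged result set, a list of dicts (assumption)
    select     -- field names of the original statement, or ["*"]
    conditions -- list of (key, rela, value); an empty list skips screening
    order_by   -- optional column name to sort by
    """
    kept = [
        r for r in rows
        if all(_OPS[rela](r[key], value) for key, rela, value in conditions)
    ]
    if order_by is not None:
        kept.sort(key=lambda r: r[order_by])
    if select == ["*"]:
        return [dict(r) for r in kept]
    # Project onto the columns the original statement asked for.
    return [{f: r[f] for f in select} for r in kept]
```

Each encapsulated statement in a queue would call this once against the shared merged result and store the returned rows as its own result set.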
The method further comprises a result set feedback step: the IDs corresponding to all split query result sets are placed in a consumer queue, and the upper-layer application retrieves the corresponding query result set from the consumer queue according to the ID.
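The feedback step could be sketched like this. The class `ResultFeedback` and its methods are hypothetical; keeping the result sets in a dictionary keyed by statement ID, next to a consumer queue that carries only the IDs, is one possible reading of the text rather than the patent's stated design.

```python
import queue


class ResultFeedback:
    """Hypothetical store that hands split result sets back by statement ID."""

    def __init__(self):
        self.consumer = queue.Queue()  # carries IDs of finished statements
        self.results = {}              # ID -> split query result set

    def publish(self, sql_id, rows):
        # Called after splitting: store the rows, then announce the ID.
        self.results[sql_id] = rows
        self.consumer.put(sql_id)

    def fetch(self):
        # Called by the upper-layer application: take the next ID
        # and return it together with its result set.
        sql_id = self.consumer.get()
        return sql_id, self.results.pop(sql_id)
```

`fetch()` blocks until an ID is available, which suits a consumer thread in the upper-layer application; a non-blocking variant would use `get_nowait()` and handle `queue.Empty`.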
The present invention also proposes a database-independent SQL statement merging system, comprising:
a queue initialization module, configured to scan the selected database according to its database name, obtain all table name information of the selected database, and initialize queues according to the table name information, so that the number of initialization queues generated equals the total number of tables in the selected database;
an encapsulation module, configured to obtain SQL statements, place them in a globally open producer queue, extract them from the producer queue in order, perform object encapsulation on each extracted SQL statement, and store the encapsulated SQL statement in the initialization queues;
a merging module, configured to dequeue all SQL statements in the initialization queues and merge the dequeued SQL statements according to a merging rule, so that each initialization queue corresponds to one total SQL statement, thereby completing the merging of the SQL statements.
In the encapsulation module, the rule for storing an SQL statement in the initialization queues is as follows: if the from part of the SQL statement refers to a single table, the object encapsulating the SQL statement is placed in the queue corresponding to that table; if the from part refers to a join of multiple tables, a new queue is created according to the names of the joined tables and is used to store subsequent SQL statements on the same table join.
A merge judgment module is further included between the encapsulation module and the merging module, configured to add each SQL statement extracted from the producer queue to its corresponding initialization queue and, if during the addition the number of SQL statements reaches the threshold of that initialization queue, to obtain that queue and trigger its merge operation;
the merge judgment module is further configured to take the first extraction of an SQL statement from the producer queue as the initial start time and, whenever the elapsed interval reaches the set time threshold, to perform one global merge in which the SQL statements in all initialization queues are merged together.
The merging rule of the merging module is as follows. Merging the fields of the select part of the SQL statements: when no * appears, a field that occurs in more than one SQL statement is kept only once, and a field that differs is added to the merged result, with fields separated by commas in accordance with the SQL92 specification; if a * appears, the merged result of the SQL statements merged so far is changed to *, and the select parts of SQL statements merged later no longer take part in merging. Merging the fields of the where part: the content of the where part starts after the where keyword, which is itself excluded, and ends at the earliest occurrence of one of the subsequent operation keywords group by, order by, or limit, which is likewise excluded.
It can be seen from the above scheme that the advantages of the present invention are as follows:
The present invention operates on SQL statements uniformly, free of the constraints of any particular database query engine, and therefore works equally for different query engines. Multiple SQL statements are merged into a small number of statements according to certain rules before being queried by the database query engine; for a query stage that is at a high-concurrency bottleneck, or an execution engine whose overall concurrency performance is poor, this reduces resource contention to some extent and improves concurrency performance. For SQL statements sent from the upper-layer application, the overall response time after rule-based merged querying and splitting can be significantly lower than the total time of executing the statements one by one, so time performance also improves to varying degrees depending on the situation.
Brief description of the drawings
Fig. 1 is a flow chart of the merging module;
Fig. 2 is a flow chart of the method of the present invention;
Fig. 3 is a structural diagram of the system of the present invention.
Detailed description
Given that database query engines that depend heavily on server memory currently have poor concurrency performance, the technical problem to be solved by the present invention is how to achieve high-concurrency data querying at the functional level independently of the database, while also improving overall query performance to some extent.
The idea of the present invention is as follows: the SQL statements sent by the upper-layer application are merged before they are submitted for database querying. Once a certain condition is met, the merged SQL statement is executed by the database query engine. The query result obtained for the merged statement is then split according to certain rules, the resulting result sets are assigned to the corresponding original SQL statements, and these are returned to the upper-layer application.
As shown in Fig. 2, the present invention proposes a database-independent SQL statement merging method. Its steps are the same as those set out in the Summary of the invention: step 1 (queue initialization from the scanned table names), step 2 (producer queue, object encapsulation, and storage in the initialization queues according to the from-part rule), the trigger steps C1 and C2, and step 3 (merging according to the select-part and where-part rules), followed by the query execution step, the query result set splitting step, and the result set feedback step.
As described in Figure 3, the present invention also proposes a kind of sql statement combination system independent of data base, including:
Initialize Queue module, for selected data storehouse being carried out scan database according to database-name, obtaining all table name information of data base of described choosing, carry out queue initialization according to described table name information, generating quantity is the initialization queue of the total quantity of table in described selected data storehouse;
Package module, for obtaining sql statement, described sql statement is put in the Producer queue that the overall situation is open, and from described Producer queue, extract described sql statement successively, one often extracted described sql statement is carried out object encapsulation process simultaneously, and the sql statement after object encapsulation process is stored in described initialization queue;
Merge module, for sql statements all in described initialization queue are carried out dequeue operation, the sql statement after going out team is merged according to merging rule, make the corresponding total sql statement of each described initialization queue, to complete the merging of sql statement.
In the encapsulation module, the rule for storing an SQL statement in an initialization queue is as follows: if the keyword corresponding to the from part of the SQL statement refers to a single table, the object obtained by encapsulating the SQL statement is put into the queue corresponding to that table; if the keyword corresponding to the from part refers to a join of multiple tables, a new queue is created according to the joined table names, to store the SQL statements that relate to the joined tables.
A merge determination module is further included between the encapsulation module and the merging module, configured to add each SQL statement extracted from the producer queue to its corresponding initialization queue and, if the number of SQL statements reaches the threshold of that initialization queue during the addition, to obtain the initialization queue and trigger its merge operation;
Taking the first extraction of an SQL statement from the producer queue as the initial start time, whenever the elapsed interval reaches the set time threshold, one global merge is performed, in which the SQL statements in all initialization queues are merged in a single pass.
The merging rule of the merging module is as follows. The fields corresponding to the select part of the SQL statements are merged: when no * appears, a field that occurs in more than one SQL statement is kept only once (later occurrences are omitted), and a field not yet present is added to the merged result, with fields separated by commas in accordance with the SQL92 specification; if * appears, the merged result of the SQL statements merged so far is changed to *, and the select parts of subsequent SQL statements no longer participate in merging. The fields corresponding to the where part are then merged, where the content of the where part begins after the where keyword (the keyword itself is excluded) and ends at the earliest occurrence of a subsequent-operation keyword (group by, order by or limit), which is likewise excluded.
The system further includes a query execution module, configured to input the merged SQL statement of each initialization queue into the database query engine and obtain the query result set of the merged statement.
A query result set splitting module, configured to split each query result set according to a splitting rule and distribute it to the corresponding object-encapsulated SQL statements in the initialization queue. The splitting rule is to traverse each object-encapsulated SQL statement in the initialization queue. First, the columns of the query result set are located according to the content of the select part. Next, the keywords of the subsequent-operation part are checked for a limit keyword: if one is present, the corresponding number of rows is first filtered out according to it; if not, the object encapsulating the where part is searched, and a first screening of the query result set is performed according to the relation among each key, rela and value. If the where part in the object encapsulating the SQL statement is empty, the screening by the where part is skipped and processing proceeds to the screening by the subsequent-operation part. Finally, the subsequent-operation part is applied to the query result set as the last splitting and sorting pass: for order by, the previously screened query result set is sorted by the corresponding keyword; for group by, it is grouped by the corresponding keyword. Each resulting query result set is placed in the originally encapsulated SQL statement to which it corresponds.
The system further includes a statement result set feedback module, configured to put the IDs corresponding to all split query result sets into a consumer queue, from which the upper-layer application retrieves the corresponding query result set according to the ID.
The steps of the present invention are further described below with reference to Figure 1. As shown in Figure 1, the present invention comprises the steps: A, collecting SQL statements; B, merging statements; C, executing the database query; D, splitting the query result set; E, feeding back the statement result sets. A specific embodiment is as follows:
A, queue initialization
A1, according to the database name, when the instance object is initialized, the selected database is automatically scanned once to obtain all table name information.
A2, queue initialization is performed according to the obtained table name information, i.e., as many queues are initialized as there are tables in the database.
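Steps A1 and A2 can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the class name `QueueInitializer` and its method names are hypothetical, the queue element type is simplified to `String`, and passing the database name as the JDBC catalog is an assumption that depends on the driver.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical helper: one queue per table, as in steps A1 and A2. */
final class QueueInitializer {

    /** A1: list the table names of the selected database through JDBC metadata. */
    static List<String> scanTableNames(Connection conn, String databaseName) throws SQLException {
        List<String> names = new ArrayList<>();
        DatabaseMetaData meta = conn.getMetaData();
        // Assumption: the database name maps to the JDBC catalog; some drivers use the schema instead.
        try (ResultSet rs = meta.getTables(databaseName, null, "%", new String[] {"TABLE"})) {
            while (rs.next()) {
                names.add(rs.getString("TABLE_NAME"));
            }
        }
        return names;
    }

    /** A2: create exactly one empty queue per table name. */
    static Map<String, BlockingQueue<String>> initQueues(List<String> tableNames) {
        Map<String, BlockingQueue<String>> queues = new ConcurrentHashMap<>();
        for (String table : tableNames) {
            queues.put(table, new LinkedBlockingQueue<>());
        }
        return queues;
    }
}
```

A concurrent map is used here so that queues for joined tables can be added later by other threads; the text does not prescribe a particular container.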
B, collecting SQL statements
B1, receive the SQL statements sent by the upper-layer application.
B2, put all received SQL statements into a thread-safe blocking queue, i.e., the producer queue of a globally open producer-consumer pattern, such as a LinkedBlockingQueue or ArrayBlockingQueue.
B3, statements are continuously pulled from the producer queue, one SQL statement at a time, and each is given an object encapsulation, i.e., the statement is encapsulated in an object containing multiple variables. The variables in this object are mainly the following:
(1) the ID recording this query;
(2) the content corresponding to the select keyword (i.e., the fields to be selected), stored in a String array;
(3) the keyword corresponding to the from part (i.e., the name of the table operated on), stored in a String;
(4) the keywords corresponding to the where part. The where part is first delimited: the where keyword is the left boundary, excluding the where keyword itself, and whichever of the subsequent-operation keywords (group by, order by, limit, etc.) occurs first is the right boundary, again excluding that keyword; if none of these subsequent operations is present, there is no right boundary. The where part is then encapsulated in an object containing three variables: the first, key, is the variable name immediately to the left of "=", ">", "<", "like", "in", etc. in each where condition; the second, rela, stores the relation between each condition variable and its value, such as "=", ">", "<", "like", "in"; the third, value, is the property value corresponding to each key;
(5) the content of the subsequent-operation part (such as group by, order by, limit, etc.), stored in a String;
(6) the simple condition information, i.e., the whole where condition part, excluding the where keyword and the subsequent-operation content, kept so that the merging process can perform a simple merge and reduce overhead;
(7) the result set, whose initial value is null, held so that results can later be matched to the statement and fed back accurately to the upper-layer application; it is stored in a List whose elements are all Strings.
B4, each encapsulated statement is enqueued in its corresponding queue. The enqueue rule has two cases: if the keyword corresponding to the from part of the statement refers to a single table, i.e., the statement contains no join operation, the object obtained by encapsulating the statement is put into the queue of the table it belongs to. If the keyword corresponding to the from part refers to a join of multiple tables, a new queue is created according to the joined table names, dedicated to storing the SQL statements that relate to the join of those tables.
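The encapsulation of step B3 and the queue selection of step B4 can be sketched as below. This is only an illustration under assumptions: `SqlObject`, `Cond`, `parse` and `queueKey` are hypothetical names, the parser handles only a single flat SELECT whose conditions are joined by `and`, and the `+` separator in the key for joined tables is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical encapsulation object for one simple SELECT statement (step B3). */
final class SqlObject {
    /** One where condition split into key, rela and value. */
    static final class Cond {
        final String key, rela, value;
        Cond(String key, String rela, String value) {
            this.key = key; this.rela = rela; this.value = value;
        }
    }

    private static final Pattern COND =
        Pattern.compile("(?i)([\\w.]+)\\s*(>=|<=|=|>|<|like\\b|in\\b)\\s*(.+)");

    final String id;                        // ID of this query
    String[] selectFields = new String[0];  // fields of the select part
    String from = "";                       // from part (table name or join expression)
    String simpleCondition = "";            // raw where text, without keyword and tail
    final List<Cond> conds = new ArrayList<>();
    String tail = "";                       // group by / order by / limit part
    List<String> resultSet = null;          // filled in after splitting

    SqlObject(String id) { this.id = id; }

    static SqlObject parse(String id, String sql) {
        // Collapsing whitespace also alters string literals; acceptable only in a sketch.
        String s = sql.trim().replaceAll("\\s+", " ");
        if (s.endsWith(";")) s = s.substring(0, s.length() - 1).trim();
        String lower = s.toLowerCase(Locale.ROOT);
        int fromAt = lower.indexOf(" from ");
        if (!lower.startsWith("select ") || fromAt < 7) {
            throw new IllegalArgumentException("unsupported statement: " + sql);
        }
        // Right boundary of the where part: the earliest group by / order by / limit.
        int tailAt = s.length();
        for (String kw : new String[] {" group by ", " order by ", " limit "}) {
            int at = lower.indexOf(kw, fromAt);
            if (at >= 0 && at < tailAt) tailAt = at;
        }
        int whereAt = lower.indexOf(" where ", fromAt);
        if (whereAt > tailAt) whereAt = -1;

        SqlObject o = new SqlObject(id);
        o.selectFields = s.substring(7, fromAt).trim().split("\\s*,\\s*");
        o.from = s.substring(fromAt + 6, whereAt >= 0 ? whereAt : tailAt).trim();
        o.tail = s.substring(tailAt).trim();
        if (whereAt >= 0) {
            o.simpleCondition = s.substring(whereAt + 7, tailAt).trim();
            for (String part : o.simpleCondition.split("(?i)\\s+and\\s+")) {
                Matcher m = COND.matcher(part.trim());
                if (m.matches()) {
                    o.conds.add(new Cond(m.group(1),
                        m.group(2).toLowerCase(Locale.ROOT), m.group(3).trim()));
                }
            }
        }
        return o;
    }

    /** B4: a single table maps to its own queue; a join maps to a queue named after all joined tables. */
    static String queueKey(String fromPart) {
        List<String> tables = new ArrayList<>();
        for (String p : fromPart.trim().split("(?i)\\s+join\\s+|\\s*,\\s*")) {
            tables.add(p.trim().split("\\s+")[0].toLowerCase(Locale.ROOT));
        }
        return String.join("+", tables);
    }
}
```

With a map of queues, a caller could then enqueue with something like `queues.computeIfAbsent(SqlObject.queueKey(o.from), k -> new LinkedBlockingQueue<>()).add(o)`, which creates the join queue on first use.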
C, determining the conditions that trigger merging
C1, each SQL statement taken from the producer queue is added to the queue it belongs to. If, during the addition, the number of statements reaches the threshold initially set for the queue, that queue is obtained immediately and its merge operation is triggered; proceed to step D1.
C2, the moment the first SQL statement is pulled from the producer queue is recorded as the initial start time. Whenever the elapsed interval reaches a set time threshold (for example 5 s), one global merge is performed, i.e., the statements of all queues are merged in a single pass; proceed to step D2. The time threshold can be set as needed. Suppose a maximum tolerance time is defined as the total time from when an SQL statement is received from the upper-layer application to when the result is finally returned to it. The time threshold set here should then be less than this maximum tolerance time, because the total time from receiving the SQL statement to returning the query result to the upper-layer application is the interval here plus the merging time, the query time and the splitting time.
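The two trigger conditions of steps C1 and C2 can be sketched as follows. The class `MergeTrigger` and its methods are hypothetical, statements are simplified to strings, and the current time is passed in explicitly so that the behaviour is deterministic. The helper `maxInterval` only restates the timing constraint above: the interval plus merging, query and splitting time must stay within the maximum tolerance time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical trigger logic: C1 is the queue-length threshold, C2 the time threshold. */
final class MergeTrigger {
    private final int queueThreshold;
    private final long intervalMillis;
    private final Map<String, List<String>> queues = new HashMap<>();
    private long windowStart = -1; // set when the first statement is pulled

    MergeTrigger(int queueThreshold, long intervalMillis) {
        this.queueThreshold = queueThreshold;
        this.intervalMillis = intervalMillis;
    }

    /** C1: add a statement; returns the batch to merge if the queue reached its threshold, else an empty list. */
    synchronized List<String> add(String queueName, String stmt, long nowMillis) {
        if (windowStart < 0) windowStart = nowMillis;
        List<String> q = queues.computeIfAbsent(queueName, k -> new ArrayList<>());
        q.add(stmt);
        if (q.size() >= queueThreshold) {
            queues.put(queueName, new ArrayList<>());
            return q;
        }
        return Collections.emptyList();
    }

    /** C2: once the interval has elapsed, drain every non-empty queue for one global merge. */
    synchronized Map<String, List<String>> tick(long nowMillis) {
        Map<String, List<String>> due = new HashMap<>();
        if (windowStart < 0 || nowMillis - windowStart < intervalMillis) return due;
        for (Map.Entry<String, List<String>> e : queues.entrySet()) {
            if (!e.getValue().isEmpty()) {
                due.put(e.getKey(), e.getValue());
                e.setValue(new ArrayList<>());
            }
        }
        windowStart = nowMillis;
        return due;
    }

    /** Largest interval that keeps interval + merge + query + split within the tolerance time. */
    static long maxInterval(long toleranceMillis, long mergeMillis, long queryMillis, long splitMillis) {
        return toleranceMillis - (mergeMillis + queryMillis + splitMillis);
    }
}
```

In a running system `tick` would typically be driven by a scheduled task; restarting the window at each global merge is one possible reading of "whenever the elapsed interval reaches the threshold".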
D, statement merging
D1, for each given queue, all statements in the queue are dequeued one at a time, and during dequeuing each statement is merged, according to a certain rule, into the most recent merged statement, so that the final result is one total merged statement per queue.
The merging rule here operates on the objects into which the SQL statements were previously encapsulated. In the first step, the fields corresponding to the select part are merged. When no * appears, a field that occurs in several statements is kept only once and its later occurrences are omitted, while a field not yet present is added immediately, with fields separated by commas in accordance with the SQL92 specification. If * appears, the merged result of the statements merged so far is changed directly to *, and the select parts of subsequent statements no longer participate in merging; merging then proceeds to the next part.
In the second step, the fields corresponding to the where part are merged. The content of the where part begins after the where keyword (the keyword itself is excluded) and ends at the earliest occurrence of a subsequent-operation keyword such as group by, order by or limit, which is likewise excluded. For the content of the where part, it is first checked whether the merged statement so far already contains this simple condition information. If it does, merging continues with the next statement. If it does not, the simple condition information is added by inserting an or between the simple condition information of the different statements; the where contents of different statements could of course also be combined in more sophisticated ways according to their semantics. Merging then continues with the next statement, and once the where content of all statements has been merged, the next operation is carried out.
Finally, all merged parts are spliced into one total merged statement. Since the statements in the same queue target the same table, the table name is taken directly from the last SQL statement. As for the subsequent-operation content, omitting it does not reduce the overall result set, so it does not participate in the merging process.
The final result of the above merging is that all SQL statements in the queue are merged into one SQL statement.
D2, statement merging is performed on the SQL statements of all queues, the final result being that all statements in each queue are merged into one total SQL statement.
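The merging rule of step D can be sketched as follows: a union of select fields with the * shortcut, distinct simple conditions joined by or, and the table name taken from the last statement. `StatementMerger` and `Stmt` are hypothetical names. One point goes beyond the text and is an assumption of this sketch: if any statement has no where part, the merged statement is emitted without a where clause, since that statement needs every row.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical merger for the statements of one queue (step D1). */
final class StatementMerger {
    /** Minimal stand-in for the encapsulated statement. */
    static final class Stmt {
        final String[] selectFields;
        final String from;
        final String simpleCondition; // empty string when there is no where part
        Stmt(String[] selectFields, String from, String simpleCondition) {
            this.selectFields = selectFields;
            this.from = from;
            this.simpleCondition = simpleCondition;
        }
    }

    /** First step: union of select fields; a '*' replaces the result and ends select merging. */
    static List<String> mergeSelect(List<Stmt> batch) {
        Set<String> fields = new LinkedHashSet<>();
        for (Stmt s : batch) {
            for (String f : s.selectFields) {
                if (f.equals("*")) {
                    List<String> star = new ArrayList<>();
                    star.add("*");
                    return star;
                }
                fields.add(f); // a repeated field is kept only once
            }
        }
        return new ArrayList<>(fields);
    }

    /** Second step: distinct simple conditions joined with or. */
    static String mergeWhere(List<Stmt> batch) {
        Set<String> conds = new LinkedHashSet<>();
        for (Stmt s : batch) {
            // Assumption (not stated in the source): a statement without a where part
            // needs every row, so the merged statement must not filter at all.
            if (s.simpleCondition.isEmpty()) return "";
            conds.add("(" + s.simpleCondition + ")");
        }
        return String.join(" or ", conds);
    }

    /** Final step: splice the parts; the table name comes from the last statement. */
    static String merge(List<Stmt> batch) {
        String table = batch.get(batch.size() - 1).from;
        String where = mergeWhere(batch);
        String sql = "select " + String.join(",", mergeSelect(batch)) + " from " + table;
        return where.isEmpty() ? sql : sql + " where " + where;
    }
}
```

The sketch does not add the where keys to the merged select list. A splitter that re-filters rows by those keys would need them in the merged result, so a fuller version would probably have to include them; the text does not discuss this.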
E, executing the query
E1, the merged SQL statement of each queue is input into the database query engine, and the query result set of the merged statement is obtained.
F, splitting the query result set
F1, each query result set is split according to a certain rule and distributed to each encapsulated object in the corresponding queue.
Splitting rule: traverse each encapsulated SQL statement in the queue. First, the columns of the overall result set of the merged statement are located according to the content of the select part. Next, the keywords of the subsequent-operation part are checked for a limit keyword: if one is present, the corresponding number of rows is first filtered out according to the limit condition; if not, the object encapsulating the where part is searched directly, and a first screening of the result set is performed according to the relation among each key, rela and value. If the where part in the object encapsulating this SQL statement is empty, the screening by the where part is skipped and processing proceeds to the screening by the subsequent-operation part. Finally, the subsequent-operation part is applied to the result set as the last splitting and sorting pass: for order by, the previously screened result set is sorted by the corresponding keyword; for group by, it is grouped by the corresponding keyword.
F2, each matched result set is put into the originally encapsulated SQL statement, so that each ID has its corresponding result set.
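A minimal sketch of the splitting in step F, for one original statement, is given below. All names (`ResultSplitter`, `Cond`, `split`) are hypothetical; rows are modelled as maps from column name to string value; only the relations "=", ">" and "<" are handled; order by compares values as strings. Note one deliberate difference from the text: the sketch screens by the where conditions first, then sorts, then applies limit, and projects last, which is the conventional SQL evaluation order and an assumption here rather than the order the description gives.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical splitter: recovers one statement's rows from the merged result set. */
final class ResultSplitter {
    /** One where condition as key, rela and value. */
    static final class Cond {
        final String key, rela, value;
        Cond(String key, String rela, String value) {
            this.key = key; this.rela = rela; this.value = value;
        }
    }

    /** True if the row satisfies the condition; numeric comparison for ">" and "<". */
    static boolean holds(Map<String, String> row, Cond c) {
        String cell = row.get(c.key);
        if (cell == null) return false;
        switch (c.rela) {
            case "=": return cell.equals(c.value);
            case ">": return Double.parseDouble(cell) > Double.parseDouble(c.value);
            case "<": return Double.parseDouble(cell) < Double.parseDouble(c.value);
            default: throw new IllegalArgumentException("unsupported relation: " + c.rela);
        }
    }

    /** orderBy may be null (no sorting); a negative limit means no limit. */
    static List<Map<String, String>> split(List<Map<String, String>> merged, String[] selectFields,
                                           List<Cond> conds, String orderBy, int limit) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : merged) {
            boolean keep = true;
            for (Cond c : conds) keep = keep && holds(row, c); // an empty list skips screening
            if (keep) out.add(row);
        }
        if (orderBy != null) {
            // Assumes the sort column is present and non-null in every kept row.
            out.sort(Comparator.comparing((Map<String, String> r) -> r.get(orderBy)));
        }
        if (limit >= 0 && out.size() > limit) out = new ArrayList<>(out.subList(0, limit));

        List<Map<String, String>> projected = new ArrayList<>();
        boolean star = selectFields.length == 1 && selectFields[0].equals("*");
        for (Map<String, String> row : out) {
            Map<String, String> p = new LinkedHashMap<>();
            if (star) p.putAll(row);
            else for (String f : selectFields) p.put(f, row.get(f));
            projected.add(p);
        }
        return projected;
    }
}
```

Group by is left out of the sketch, because grouping normally involves aggregate functions that the description does not specify.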
G, statement result set feedback
G1, the IDs corresponding to all split result sets are put into the consumer queue. The upper-layer application retrieves the result set of its SQL statement from the consumer queue according to the SQL statement ID.
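The feedback of step G1 can be sketched as a consumer queue of finished IDs next to a map from ID to result set. `ResultFeedback` and its methods are hypothetical names, and holding the result sets in a separate map (rather than inside the encapsulated statement objects) is a simplification of this sketch.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical feedback channel between the splitter and the upper-layer application. */
final class ResultFeedback {
    private final BlockingQueue<String> consumerQueue = new LinkedBlockingQueue<>();
    private final Map<String, List<String>> resultsById = new ConcurrentHashMap<>();

    /** Called after splitting: store the rows first, then announce the ID on the consumer queue. */
    void publish(String statementId, List<String> rows) {
        resultsById.put(statementId, rows);
        consumerQueue.add(statementId);
    }

    /** Called by the upper-layer application: next finished ID, or null if none is waiting. */
    String nextFinishedId() {
        return consumerQueue.poll();
    }

    /** Fetch and remove the result set for a statement ID (null if unknown). */
    List<String> take(String statementId) {
        return resultsById.remove(statementId);
    }
}
```

Storing the rows before publishing the ID ensures that a consumer which sees the ID can always find the result set.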
The present invention collects in a unified manner the statements sent by the upper-layer application. Based on real-time monitoring of the lengths of the different queues and on the interval between when statements are sent and when they are merged, it merges, according to a certain merging rule, all SQL statements that target the same table of the same database. It then submits the merged SQL statements to the database query engine for querying and, according to a certain splitting rule, splits the result set obtained from the merged query and matches the split result sets one-to-one with the originally submitted SQL statements. While improving overall query performance, this also improves the concurrency performance of statements. It is of practical significance under high concurrency, where the query engine's concurrency performance is poor or has reached a bottleneck, and it has good market prospects and application value.

Claims (10)

1. A database-independent SQL statement merging method, characterized by comprising:
Step 1, scanning the selected database according to its database name, obtaining all table name information of the selected database, and performing queue initialization according to the table name information, generating as many initialization queues as there are tables in the selected database;
Step 2, obtaining SQL statements, putting the SQL statements into a globally open producer queue, extracting the SQL statements from the producer queue one by one, performing object encapsulation on each extracted SQL statement, and storing the encapsulated SQL statement in the corresponding initialization queue;
Step 3, dequeuing all SQL statements in each initialization queue and merging the dequeued SQL statements according to a merging rule, so that each initialization queue corresponds to one total SQL statement, thereby completing the merging of the SQL statements.
2. The database-independent SQL statement merging method according to claim 1, characterized in that, in step 2, the rule for storing an SQL statement in an initialization queue is as follows: if the keyword corresponding to the from part of the SQL statement refers to a single table, the object obtained by encapsulating the SQL statement is put into the queue corresponding to that table; if the keyword corresponding to the from part refers to a join of multiple tables, a new queue is created according to the joined table names, to store the SQL statements that relate to the joined tables.
3. The database-independent SQL statement merging method according to claim 1, characterized in that the following are further included between step 2 and step 3: step C1, adding each SQL statement extracted from the producer queue to its corresponding initialization queue and, if the number of SQL statements reaches the threshold of that initialization queue during the addition, obtaining the initialization queue and triggering its merge operation;
Step C2, taking the first extraction of an SQL statement from the producer queue as the initial start time and, whenever the elapsed interval reaches the set time threshold, performing one global merge, in which the SQL statements in all initialization queues are merged in a single pass.
4. The database-independent SQL statement merging method according to claim 1, characterized in that the merging rule of step 3 is as follows: the fields corresponding to the select part of the SQL statements are merged; when no * appears, a field that occurs in more than one SQL statement is kept only once (later occurrences are omitted), and a field not yet present is added to the merged result, with fields separated by commas in accordance with the SQL92 specification; if * appears, the merged result of the SQL statements merged so far is changed to *, and the select parts of subsequent SQL statements no longer participate in merging;
The fields corresponding to the where part are merged, where the content of the where part begins after the where keyword (the keyword itself is excluded) and ends at the earliest occurrence of a subsequent-operation keyword (group by, order by or limit), which is likewise excluded.
5. The database-independent SQL statement merging method according to claim 1, characterized by further comprising a query execution step of inputting the merged SQL statement of each initialization queue into the database query engine and obtaining the query result set of the merged statement;
and a query result set splitting step, in which each query result set is split according to a splitting rule and distributed to the corresponding object-encapsulated SQL statements in the initialization queue. The splitting rule is to traverse each object-encapsulated SQL statement in the initialization queue. First, the columns of the query result set are located according to the content of the select part. Next, the keywords of the subsequent-operation part are checked for a limit keyword: if one is present, the corresponding number of rows is first filtered out according to it; if not, the object encapsulating the where part is searched, and a first screening of the query result set is performed according to the relation among each key, rela and value. If the where part in the object encapsulating the SQL statement is empty, the screening by the where part is skipped and processing proceeds to the screening by the subsequent-operation part. Finally, the subsequent-operation part is applied to the query result set as the last splitting and sorting pass: for order by, the previously screened query result set is sorted by the corresponding keyword; for group by, it is grouped by the corresponding keyword. Each resulting query result set is placed in the originally encapsulated SQL statement to which it corresponds.
6. The database-independent SQL statement merging method according to claim 1 or 4, characterized by further comprising a statement result set feedback step, in which the IDs corresponding to all split query result sets are put into a consumer queue, and the upper-layer application retrieves the corresponding query result set from the consumer queue according to the ID.
7. A database-independent SQL statement merging system, characterized by comprising:
A queue initialization module, configured to scan the selected database according to its database name, obtain all table name information of the selected database, and perform queue initialization according to the table name information, generating as many initialization queues as there are tables in the selected database;
An encapsulation module, configured to obtain SQL statements, put them into a globally open producer queue, extract the SQL statements from the producer queue one by one, perform object encapsulation on each extracted SQL statement, and store the encapsulated SQL statement in the corresponding initialization queue;
A merging module, configured to dequeue all SQL statements in each initialization queue and merge the dequeued SQL statements according to a merging rule, so that each initialization queue corresponds to one total SQL statement, thereby completing the merging of the SQL statements.
8. The database-independent SQL statement merging system according to claim 7, characterized in that, in the encapsulation module, the rule for storing an SQL statement in an initialization queue is as follows: if the keyword corresponding to the from part of the SQL statement refers to a single table, the object obtained by encapsulating the SQL statement is put into the queue corresponding to that table; if the keyword corresponding to the from part refers to a join of multiple tables, a new queue is created according to the joined table names, to store the SQL statements that relate to the joined tables.
9. The database-independent SQL statement merging system according to claim 7, characterized in that a merge determination module is further included between the encapsulation module and the merging module, configured to add each SQL statement extracted from the producer queue to its corresponding initialization queue and, if the number of SQL statements reaches the threshold of that initialization queue during the addition, to obtain the initialization queue and trigger its merge operation;
Taking the first extraction of an SQL statement from the producer queue as the initial start time, whenever the elapsed interval reaches the set time threshold, one global merge is performed, in which the SQL statements in all initialization queues are merged in a single pass.
10. The database-independent SQL statement merging system according to claim 7, characterized in that the merging rule of the merging module is as follows: the fields corresponding to the select part of the SQL statements are merged; when no * appears, a field that occurs in more than one SQL statement is kept only once (later occurrences are omitted), and a field not yet present is added to the merged result, with fields separated by commas in accordance with the SQL92 specification; if * appears, the merged result of the SQL statements merged so far is changed to *, and the select parts of subsequent SQL statements no longer participate in merging; the fields corresponding to the where part are merged, where the content of the where part begins after the where keyword (the keyword itself is excluded) and ends at the earliest occurrence of a subsequent-operation keyword (group by, order by or limit), which is likewise excluded.
CN201610048596.1A 2016-01-25 2016-01-25 Sql statement combination method and system independent of database Pending CN105740344A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610048596.1A CN105740344A (en) 2016-01-25 2016-01-25 Sql statement combination method and system independent of database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610048596.1A CN105740344A (en) 2016-01-25 2016-01-25 Sql statement combination method and system independent of database

Publications (1)

Publication Number Publication Date
CN105740344A true CN105740344A (en) 2016-07-06

Family

ID=56247552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610048596.1A Pending CN105740344A (en) 2016-01-25 2016-01-25 Sql statement combination method and system independent of database

Country Status (1)

Country Link
CN (1) CN105740344A (en)

Similar Documents

Publication Publication Date Title
CN105740344A (en) Sql statement combination method and system independent of database
Zhang et al. EAGRE: Towards scalable I/O efficient SPARQL query evaluation on the cloud
CN106777108A (en) A kind of data query method and apparatus based on mixing storage architecture
CN103425672B (en) A kind of method for building up of database index and device
JP6964384B2 (en) Methods, programs, and systems for the automatic discovery of relationships between fields in a mixed heterogeneous data source environment.
CN102521406A (en) Distributed query method and system for complex task of querying massive structured data
Giannakouris et al. MuSQLE: Distributed SQL query execution over multiple engine environments
CN103309958A (en) OLAP star join query optimization method under a hybrid CPU and GPU architecture
CN103440246A (en) Intermediate result data sorting method and system for MapReduce
CN104462351B (en) MapReduce-oriented data query model and method
CN106599052A (en) Data query system based on Apache Kylin, and method thereof
Wang et al. A MapReduceMerge-based data cube construction method
Abraham et al. Distributed storage and querying techniques for a semantic web of scientific workflow provenance
CN104142968A (en) Solr technology based distributed searching method and system
Wen et al. Hardware-enhanced association rule mining with hashing and pipelining
Maccioni et al. Augmented access for querying and exploring a polystore
Sridhar et al. RAPID: Enabling scalable ad-hoc analytics on the semantic web
Ding et al. ComMapReduce: An improvement of MapReduce with lightweight communication mechanisms
Anyanwu et al. Algebraic optimization for processing graph pattern queries in the cloud
Ravindra et al. Efficient processing of RDF graph pattern matching on MapReduce platforms
CN106021574A (en) Data storage replication method and system
Sahal et al. Big data multi-query optimisation with Apache Flink
Ptiček et al. MapReduce research on warehousing of big data
Horlova et al. Array-based data management for genomics
CN110990430A (en) Large-scale data parallel processing system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160706
