CN108874849A - Optimization method and system for non-equi correlated subqueries - Google Patents
- Publication number
- CN108874849A (application CN201810097136.7A)
- Authority
- CN
- China
- Prior art keywords
- subquery
- partition
- correlation
- correlated column
- outer table
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an optimization method and system for non-equi correlated subqueries, comprising: obtaining the value set of the outer-table correlated column of the correlated subquery; establishing, according to the type of operator in the correlated subquery and the value set, a mapping from the outer-table correlated column to partitions of the inner-table correlated column; partitioning the inner table of the correlated subquery according to the resulting partition set while, according to the aggregate function the subquery applies to the inner table, computing the subquery's intermediate result state for each partition; and, according to the mapping, traversing the outer-table correlated column and aggregating the intermediate result states of the corresponding partitions to obtain the subquery result for each correlated-column value in the outer table. The technical effect is that the inner table is partitioned and the intermediate result of each partition is reused to produce the final subquery result set, which improves query performance.
Description
Technical field
The present invention relates to the field of relational database systems, and in particular to an optimization method and system for non-equi correlated subqueries.
Background
A subquery is a query statement that appears as a query condition of another statement. A correlated subquery is a subquery whose query condition depends on the parent query. A typical non-equi correlated subquery is as follows:
select X.a, X.b from X where X.c >
(select avg(Y.c) from Y where Y.d 【OPERATOR】 X.d)
Here the operator between the correlated column of the subquery and the outer query, the 【OPERATOR】 above, can be !=, >, >=, <, <=, in, not in, between ... and, and so on. Because the subquery uses results of the outer query, existing techniques implement it in the following three ways:
1. tuple-at-a-time (nested iterator): each time a value of d is obtained from the outer table X, it is passed to the subquery, which is then executed to obtain its result. When X.d contains repeated values, recomputation can be avoided in two ways. One is to cache the subquery result per outer correlated-column value, so that a repeated value hits the cache. The other is to sort the outer table on the correlated column so that rows with the same value are adjacent and the subquery is evaluated once per value. If the inner table has an index on the correlated column, execution is accelerated to some extent.
2. semi join / anti join / outer join: take the Cartesian product of the outer table X and the inner table Y, select from it the records that satisfy the correlation condition, and then obtain the result column of the subquery.
3. sort-merge join: if OPERATOR is <, <=, >, or >= and the select column of the subquery is a max or min aggregate, the correlated columns of the outer and inner tables can each be sorted first, and the subquery result obtained by a comparison similar to the merge step of merge sort; a caching strategy lets partial results be reused. For example, suppose the subquery is select max(Y.c) from Y where Y.d > X.d and both X and Y are sorted on column d in descending order. If, when row k-1 of X.d has been processed, Y.d has been processed up to row m and the inner query result is maxk, then when row k of X.d is processed the inner query result is max(maxk, max(Y.c) over rows m to n of Y), where n is the row number of the smallest Y.d value that is greater than the value in row k of X.d.
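The first approach, with the result cache for repeated values, can be sketched in Python as follows. This is only an illustration under stated assumptions, not code from the patent: the function name `nested_iteration`, the row layouts `(a, b, c, d)` for X and `(c, d)` for Y, and the choice of `>` as the operator are all hypothetical. The counter makes visible that the inner table is still scanned once per distinct value of X.d.

```python
def nested_iteration(outer_rows, inner_rows):
    """Evaluate: select a, b from X where c > (select avg(Y.c) from Y where Y.d > X.d).

    outer_rows: iterable of (a, b, c, d) tuples for X (assumed layout).
    inner_rows: list of (c, d) tuples for Y (assumed layout).
    """
    cache = {}        # subquery result per distinct X.d value
    inner_scans = 0   # number of full scans of the inner table
    selected = []
    for a, b, c, d in outer_rows:
        if d not in cache:
            inner_scans += 1
            matched = [y_c for y_c, y_d in inner_rows if y_d > d]
            cache[d] = sum(matched) / len(matched) if matched else None
        avg = cache[d]
        # A comparison with an empty (NULL) subquery result is not true.
        if avg is not None and c > avg:
            selected.append((a, b))
    return selected, inner_scans
```

With three outer rows that carry two distinct d values, the cache saves one scan, but the number of inner scans still grows with the number of distinct values.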
Each of the above techniques has its own problems:
The first method performs poorly. Although it avoids recomputing the subquery for repeated outer values, each distinct outer value still requires a full scan of the inner table, so the total time to obtain all subquery results is (number of distinct values of the outer correlated column) × (time to scan the inner table once). The present invention reuses intermediate results of the subquery and needs to scan the inner table only once, which improves performance considerably.
The second method has the same problem as the first.
The third method applies only under restricted conditions. First, the correlation condition between the inner and outer tables must be a single expression whose operator is one of <, <=, >, >=; the set operations in and not in, between ... and, the not-equal operator !=, and combinations of several expressions such as Y.d > X.d and Y.d < X.d+10 are not supported. Second, the select column of the subquery supports only max/min/sum/count. Third, the technique requires both the inner and outer tables to be sorted on the correlated column, which is time-consuming. The present invention supports more kinds of correlation operators, including set operators and combinations of several comparison expressions, and is therefore more general.
While studying the optimization of correlated subqueries, and of non-equi correlated subqueries in particular, the inventors found that this defect of the prior art is caused by repeatedly looping over the inner table of the subquery, which recomputes most of the record set. They found that the defect can be resolved by caching the intermediate state of the subquery computation at the finest granularity and then merging these reusable fine-grained intermediate states to form the final subquery result. The present invention provides a high-performance optimization method for non-equi correlated subqueries that supports general correlation expressions and does not sort the inner table. It needs to sort the value set of the outer correlated column only in specific cases, never needs to sort the outer table as a whole, and requires no index on the correlated column. Its query cost is the cost of one scan of the inner table plus one scan of the outer table. Compared with prior-art query performance, the optimization technique of the present invention achieves an order-of-magnitude improvement.
Summary of the invention
The purpose of the present invention is to overcome the problems of the above prior art, namely that the supported subquery correlation operations are not general and that query performance is poor, by proposing an optimization method for non-equi correlated subqueries, comprising:
Step 1: obtain the value set of the outer-table correlated column of the correlated subquery;
Step 2: according to the type of operator in the correlated subquery and the value set, establish a mapping from the outer-table correlated column to partitions of the inner-table correlated column;
Step 3: according to the mapping, partition the inner table of the correlated subquery and, according to the aggregate function the subquery applies to the inner table, obtain the subquery's intermediate result state for each partition;
Step 4: according to the mapping, traverse the outer-table correlated column and aggregate the intermediate result states of the partitions to obtain the subquery result for each correlated-column value in the outer table.
In the optimization method for non-equi correlated subqueries, step 2 further includes:
Step 21: construct an automatic merge partition tree according to the type of operator in the correlated subquery and the value set of the outer-table correlated column, and establish the mapping through that tree, where each leaf node of the tree corresponds to one partition and the collection of these partitions forms the inner-table correlated-column partitions.
In the optimization method for non-equi correlated subqueries, step 2 further includes:
If the operator is a not-equal operation, then according to the number k of values of the outer correlated column, divide the inner table into k+1 partitions, and map each outer correlated-column value to the k partitions other than its own, as the mapping;
If the operator is a comparison operation, then according to the number k of values of the outer correlated column, divide the inner table into k+1 partitions, and map each outer correlated-column value to the corresponding partitions according to the comparison operator, as the mapping;
If the operator is a set operation, then according to the greatest common divisor of the values of the outer correlated column (the common parts shared among the set values), divide the inner table into n+1 partitions, and map each correlated-column value to the corresponding partitions according to the set operator, as the mapping, where n is the greatest common divisor.
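The not-equal case above can be sketched in Python as follows. This is a hedged illustration, not the patent's implementation: the names `partition_not_equal`, `assign_partition`, and the `OTHER` label are hypothetical, and partitions are represented simply by their defining value.

```python
OTHER = "<other>"  # hypothetical label for the partition of all remaining values

def partition_not_equal(outer_values):
    """Build the k+1 partitions and the mapping for the != operator."""
    distinct = sorted(set(outer_values))   # the k distinct values of X.d
    partitions = distinct + [OTHER]        # k value partitions plus one catch-all
    # Each X.d value maps to every partition except its own: k partitions.
    mapping = {v: [p for p in partitions if p != v] for v in distinct}
    return partitions, mapping

def assign_partition(y_d, distinct_values):
    """Return the partition label that an inner-table row with Y.d = y_d falls into."""
    return y_d if y_d in distinct_values else OTHER
```

For outer values 1, 5, 10 this yields four partitions, and the value 5 maps to the partitions for 1, 10, and the catch-all.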
In the optimization method for non-equi correlated subqueries, in step 3, if the aggregate function is avg, the intermediate result state is sum+count; if the aggregate function is sum, max, min, or count, the intermediate result state is sum, max, min, or count respectively.
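The reason avg needs sum+count is that averages of partitions cannot be merged directly, whereas sums and counts can. A minimal Python sketch of such mergeable states follows; the dictionary layout and the function names `new_state`, `accumulate`, `merge`, and `finalize` are assumptions made for illustration only.

```python
def new_state():
    """Empty intermediate state for one partition."""
    return {"sum": 0, "count": 0, "max": None, "min": None}

def accumulate(state, value):
    """Fold one inner-table value (for example Y.c) into a partition state."""
    state["sum"] += value
    state["count"] += 1
    state["max"] = value if state["max"] is None else max(state["max"], value)
    state["min"] = value if state["min"] is None else min(state["min"], value)
    return state

def merge(left, right):
    """Combine the states of two partitions without rescanning any rows."""
    merged = new_state()
    merged["sum"] = left["sum"] + right["sum"]
    merged["count"] = left["count"] + right["count"]
    for key, pick in (("max", max), ("min", min)):
        present = [s[key] for s in (left, right) if s[key] is not None]
        merged[key] = pick(present) if present else None
    return merged

def finalize(state, func):
    """Turn a merged state into the aggregate value; avg is derived from sum and count."""
    if func == "count":
        return state["count"]
    if state["count"] == 0:
        return None  # no rows: sum, avg, max, min are undefined (NULL)
    if func == "avg":
        return state["sum"] / state["count"]
    return state[func]
```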
In the optimization method for non-equi correlated subqueries, step 4 further includes: looping over the outer table and evaluating the correlation predicate against the subquery results from the inner table, to obtain the final result of the correlated subquery.
The invention also provides an optimization system for non-equi correlated subqueries, comprising:
A partition mapping module, configured to obtain the values of the outer-table correlated column of the correlated subquery, collect them into a value set, and, according to the type of operator in the correlated subquery and the value set, establish a mapping from the outer-table correlated column to partitions of the inner-table correlated column;
A result merging module, configured to partition the inner table of the correlated subquery according to the mapping and, according to the aggregate function the subquery applies to the inner table, obtain the subquery's intermediate result state for each partition; and, according to the mapping, to traverse the outer-table correlated column and aggregate the intermediate result states of the partitions to obtain the subquery result for each correlated-column value in the outer table.
In the optimization system for non-equi correlated subqueries, the partition mapping module further: constructs an automatic merge partition tree according to the type of operator in the correlated subquery and the value set of the outer-table correlated column, and establishes the mapping through that tree, where each leaf node of the tree corresponds to one partition and the collection of these partitions forms the inner-table correlated-column partitions.
In the optimization system for non-equi correlated subqueries, the partition mapping module further:
If the operator is a not-equal operation, then according to the number k of values of the outer correlated column, divides the inner table into k+1 partitions, and maps each outer correlated-column value to the k partitions other than its own, as the mapping;
If the operator is a comparison operation, then according to the number k of values of the outer correlated column, divides the inner table into k+1 partitions, and maps each outer correlated-column value to the corresponding partitions according to the comparison operator, as the mapping;
If the operator is a set operation, then according to the greatest common divisor of the values of the outer correlated column, divides the inner table into n+1 partitions, and maps each correlated-column value to the corresponding partitions according to the set operator, as the mapping, where n is the greatest common divisor.
In the optimization system for non-equi correlated subqueries, in the result merging module, if the aggregate function is avg, the intermediate result state is sum+count; if the aggregate function is sum, max, min, or count, the intermediate result state is sum, max, min, or count respectively.
In the optimization system for non-equi correlated subqueries, the result merging module further: loops over the outer table and evaluates the correlation predicate against the subquery results from the inner table, to obtain the final result of the correlated subquery.
The present invention consists mainly of two phases, dynamic partition mapping and partition result merging. Its distinguishing feature is that neither the inner nor the outer table needs to be sorted (the outer correlated column needs sorting only when the correlation condition is a single expression whose operator is a comparison operator). One full scan each of the inner table and the outer table yields the finest-grained reusable intermediate states of the subquery computation, and the final subquery result is obtained by combining several intermediate states.
Dynamic partition mapping phase: construct the AMP-TREE from the distinct values of the outer correlated column and the correlation operator, obtaining the partitions of the inner-table correlated column and the mapping between outer-column values and those partitions.
Partition result merging phase: partition the inner table on the correlated column, traverse all partitions, and obtain the intermediate result state of each partition. Then, using the mapping obtained in the dynamic partition mapping phase, obtain the final subquery result for each outer correlated-column value.
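The two phases can be sketched end to end in Python for one concrete case, the subquery `select avg(Y.c) from Y where Y.d > X.d`. This is a simplified illustration under stated assumptions: it uses a sorted boundary list and `bisect` instead of an AMP-TREE, the function name `correlated_avg_greater` and the `(c, d)` row layout are hypothetical, and the backward running total is just one convenient way to merge the partition states.

```python
from bisect import bisect_left

def correlated_avg_greater(outer_d_values, inner_rows):
    """Return {X.d value: avg(Y.c) over rows with Y.d > X.d} using one inner scan."""
    # Phase 1: dynamic partition mapping. k distinct values give k+1 partitions;
    # partition p holds rows with bounds[p-1] < Y.d <= bounds[p].
    bounds = sorted(set(outer_d_values))
    k = len(bounds)

    # Phase 2a: a single scan of the inner table fills one (sum, count) state per partition.
    sums = [0] * (k + 1)
    counts = [0] * (k + 1)
    for y_c, y_d in inner_rows:
        p = bisect_left(bounds, y_d)   # number of boundaries strictly below y_d
        sums[p] += y_c
        counts[p] += 1

    # Phase 2b: bounds[j] maps to partitions j+1..k; merge them from the top down.
    result = {}
    run_sum = run_count = 0
    for j in range(k - 1, -1, -1):
        run_sum += sums[j + 1]
        run_count += counts[j + 1]
        result[bounds[j]] = run_sum / run_count if run_count else None
    return result
```

The inner table is read once regardless of how many distinct outer values there are, which is the cost profile the text describes.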
The technical effects of the present invention include strong adaptability: this optimization method for non-equi correlated subqueries supports the great majority of correlation operations, dynamically and adaptively partitions the inner table, and reuses the intermediate result of each partition to obtain the final subquery result set, with a large performance improvement over existing techniques.
Description of the drawings
Fig. 1 is a model diagram of the present invention;
Fig. 2 is a schematic diagram of the partition mapping for each correlation relationship;
Fig. 3 is a schematic diagram of constructing the AMP-TREE for the numeric-interval class;
Fig. 4 is a schematic diagram of constructing the AMP-TREE for the set class.
Specific embodiments
To address the shortcomings of existing optimization methods for non-equi correlated subqueries, the present invention provides a new general, adaptive, high-performance optimization method. As shown in Fig. 1, the design effectively solves the poor performance of non-equi correlated subqueries across different correlation expressions and different subquery aggregate functions.
The present invention mainly consists of the following steps:
Step 1: obtain the value set of the outer-table correlated column of the correlated subquery;
Step 2: according to the type of operator in the correlated subquery and the value set, establish a mapping from the outer-table correlated column to partitions of the inner-table correlated column;
Step 3: according to the mapping, partition the inner table of the correlated subquery and, according to the aggregate function the subquery applies to the inner table, obtain the subquery's intermediate result state for each partition;
Step 4: according to the mapping, traverse the outer-table correlated column and aggregate the intermediate result states of the partitions to obtain the subquery result for each correlated-column value in the outer table.
Step 2 further includes step 21: construct an automatic merge partition tree according to the type of operator in the correlated subquery and the value set of the outer-table correlated column, and establish the mapping through that tree, where each leaf node of the tree corresponds to one partition and the collection of these partitions forms the inner-table correlated-column partitions.
Step 2 further includes: if the operator is a not-equal operation, then according to the number k of values of the outer correlated column, divide the inner table into k+1 partitions, and map each outer correlated-column value to the k partitions other than its own, as the mapping; if the operator is a comparison operation, then according to the number k of values of the outer correlated column, divide the inner table into k+1 partitions, and map each outer correlated-column value to the corresponding partitions according to the comparison operator, as the mapping; if the operator is a set operation, then according to the greatest common divisor of the values of the outer correlated column, divide the inner table into n+1 partitions, and map each correlated-column value to the corresponding partitions according to the set operator, as the mapping, where n is the greatest common divisor.
Step 4 further includes: looping over the outer table and evaluating the correlation predicate against the subquery results from the inner table, to obtain the final result of the correlated subquery.
To make the above features and effects of the present invention clearer, embodiments are described in detail below with reference to the accompanying drawings.
The present invention supports many different non-equi correlation expressions. The operators include !=, >, >=, <, <=, in, not in, between ... and, and so on, where in and not in are handled between a single-valued column and a set-valued column. Several subquery aggregate functions are supported: max, min, sum, count, avg, and so on. Reusing subquery intermediate results improves query performance.
Here table X is the outer table, table Y is the inner table, and operator is the correlation operator. Y.d-partition denotes the partition information of the correlated column d of the inner table Y. The invention realizes a model M that is equivalent to a function with two parameters, distinct x.d and operator, and two return values, y.d-partition and map(x.d, y.d-partition). That is:
M(distinct x.d, operator) = (y.d-partition, map(x.d, y.d-partition))
The input is the set of all distinct d values of the outer table X (distinct d) together with the correlation operator; the output is the partition set of the correlated column d of the inner table Y and a map from each distinct value of x.d to the partitions of Y.d.
The concrete steps are as follows:
S1: obtain the value set of x.d from the outer table X in the database, together with the correlation operator.
S2: based on the result of S1, order the value set of x.d by topological sorting according to the total or partial order defined by the operator, as shown in Fig. 2, where x.d1 to x.dk denote the k distinct values of x.d. The purpose of the ordering is to determine the subsequent partition ranges. Topological sorting here means traversing all outer correlated-column values in order to build the AMP tree; the values actually need to be sorted only when the outer correlated column is not set-valued and the correlation condition is a single comparison operator.
S2 further includes step S21: during the ordering in S2, an automatic merge partition tree (Auto-Merge-Partition Tree, AMP tree) is generated. Each leaf node of the tree is a partition. While the AMP-TREE is constructed, the mapping between intervals and outer correlated-column values is obtained.
For the != operator, the number of partitions is k+1: one for each of the k distinct values of X.d, plus one for all other values.
For comparison operators such as >, >=, <, <=, the number of partitions is k+1: sorting on the outer correlated column x.d yields a sequence of distinct interval ranges, each of which is a partition. The between ... and operator and conditions combined from several comparison expressions are handled similarly, but the number of partitions is determined by the smallest intervals obtained after sorting. This process uses the AMP-TREE structure.
For the set-class operators in and not in, the forest at the bottom of Fig. 2 is obtained by extracting, as far as possible, the greatest-common-divisor sets of all values in X.distinct d. From the figure, the number of partitions is n+1, and each leaf node in the forest is a partition.
Next, obtain the correspondence between distinct x.d and y.d-partition.
For the != operator, X.di corresponds to the k partitions other than that of X.di itself, where i is a positive integer not greater than k and X.di is the i-th distinct correlated-column value d in the outer table X.
For >, >=, <, <=, take Y.d > X.d as an example. Each value of X.d corresponds to all intervals that satisfy > X.d. For example, if the values of X.d are 1, 5, and 10, then X.d=1 corresponds to the intervals (1,5), (5,10), and (10, +∞); X.d=5 corresponds to (5,10) and (10, +∞); and X.d=10 corresponds to (10, +∞). between ... and is handled similarly.
For the set-class operators in and not in, Fig. 2 shows that the mapping for in is X.d1->(s123), X.d2->(s2345); for not in the mapping is the complement: X.d1->(not s123), X.d2->(not s2345).
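The comparison-operator example with X.d values 1, 5, and 10 can be reproduced with a short Python sketch. It is illustrative only: the function name `map_greater_than` is hypothetical, the operator is fixed to `>`, and the interval labels are written half-open, for example `(1, 5]`, on the assumption that a row with Y.d equal to a boundary belongs to the interval below it, since 5 satisfies `> 1` but not `> 5`.

```python
def map_greater_than(outer_values):
    """Return the k+1 interval labels and, for each X.d value, the intervals with Y.d > X.d."""
    bounds = sorted(set(outer_values))
    if not bounds:
        return [], {}
    labels = [f"(-inf, {bounds[0]}]"]
    labels += [f"({lo}, {hi}]" for lo, hi in zip(bounds, bounds[1:])]
    labels.append(f"({bounds[-1]}, +inf)")
    # The j-th smallest value maps to every interval that lies above it.
    mapping = {v: labels[j + 1:] for j, v in enumerate(bounds)}
    return labels, mapping
```

Calling it with `[1, 5, 10]` yields four intervals, with 1 mapped to the three intervals above it and 10 mapped only to `(10, +inf)`.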
S3: partition the d column of the subquery's table Y, that is, the correlated column d of the inner table Y, according to the partition set obtained in S2, and at the same time compute and store the intermediate state of the subquery result avg(y.c), namely sum+count. The state kept for each aggregate function is as follows:
aggregate function | intermediate state |
avg | sum+count |
sum | sum |
max | max |
min | min |
count | count |
S4: traverse every row of the outer table and, using the x.d to y.d-partition mapping obtained in S2, aggregate the intermediate states of the mapped partitions to compute the subquery result, then evaluate the correlation predicate. In the embodiments below, the correlation predicate of the correlated subquery is the comparison x.c > (subquery result).
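Step S4 can be sketched in Python as follows, assuming the mapping and the per-partition states have already been produced by S2 and S3. The names `apply_outer_predicate`, `mapping`, and `states`, the `(sum, count)` state layout for avg, and the `(a, b, c, d)` outer row layout are assumptions made for the example.

```python
def apply_outer_predicate(outer_rows, mapping, states):
    """Evaluate x.c > avg(...) per outer row by merging partition states.

    mapping: {x.d value: list of partition ids} from S2 (assumed shape).
    states:  {partition id: (sum, count)} from S3 (assumed shape).
    """
    merged = {}      # merged subquery result per distinct x.d value
    selected = []
    for a, b, c, d in outer_rows:
        if d not in merged:
            total = count = 0
            for p in mapping[d]:
                part_sum, part_count = states.get(p, (0, 0))  # empty partition
                total += part_sum
                count += part_count
            merged[d] = total / count if count else None
        avg = merged[d]
        if avg is not None and c > avg:   # the correlation predicate x.c > subquery
            selected.append((a, b))
    return selected
```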
Note: for a subquery condition of the form X.d in Y.d, where X.d is single-valued and Y.d is set-valued, the complexity of this algorithm is the same as that of directly evaluating X.d in Y.d with X join Y. This optimization method therefore adds no overhead for such queries but also brings no benefit, so this case is set aside.
The present invention is described below separately for comparison operators on scalar values and for operators between a set and a single value:
For the non-set comparison operators >, >=, <, <=, and for conditions in which the inner-table correlated column is compared against several algebraic expressions of the outer-table correlated column, the partitions and the mapping are formed as follows. Assume the query is
select X.a, X.b from X where X.c >
(select avg(Y.c) from Y where Y.d > X.d and Y.d < X.d+20)
S1: suppose X.distinct d = {1, 10, 20}.
S2: loop over X.distinct d to obtain all values of X.d and X.d+20, and map each value onto the partition tree. If the AMP-TREE is empty, the data is cut into several intervals. If the AMP-TREE is not empty, the matching node is found by binary search; if the new interval partially intersects a leaf node, that leaf is split into two new child leaf nodes. The detailed process is shown in Fig. 3. This step also yields, for each outer correlated-column value of the subquery, the mapping to the corresponding AMP-TREE nodes, that is, to the inner-table correlated-column partitions.
During this process the mapping between outer correlated-column values and partitions is obtained, and the corresponding nodes on the AMP-TREE are marked (to indicate that they will be used later):
In this embodiment the partition set is (1,21), (10,21), [21,30), (20,21), [30,40).
S3: map the rows of the inner table into the AMP-TREE partitions according to the correlated-column value, and at the same time compute the intermediate state to be stored in each marked node of the AMP-TREE. Then compute the subquery result for each value in X.distinct d:
Here the aggregate function is the one in the select list of the subquery; for example, in select avg(id) from table, avg is the aggregate function of the query, and its state information is the avg entry, sum+count, of the table given earlier. To illustrate the aggregation over mapped partitions in step 4: in this embodiment, the partition set for X.d=1 is (1,21); for X.d=10 it is (10,21), [21,30); and for X.d=20 it is (20,21), [21,30), [30,40).
S4: loop over the outer table and evaluate the correlation predicate against the subquery results from the inner table to obtain the final query result.
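The range embodiment above (Y.d > X.d and Y.d < X.d+20 with X.d in {1, 10, 20}) can be approximated in Python as follows. This sketch deliberately replaces the AMP-TREE with a flat list of boundaries 1, 10, 20, 21, 30, 40 and splits the number line into alternating "gap" and "boundary point" pieces, so that open range ends are handled exactly; that decomposition, the function names, and the `(c, d)` row layout are assumptions of the sketch, not the tree construction of Fig. 3.

```python
from bisect import bisect_left

def piece_of(bounds, y_d):
    """Index of the elementary piece containing y_d: even = open gap, odd = boundary point."""
    i = bisect_left(bounds, y_d)
    if i < len(bounds) and bounds[i] == y_d:
        return 2 * i + 1
    return 2 * i

def pieces_for(bounds, x_d, width):
    """Pieces lying strictly between x_d and x_d + width (both ends excluded)."""
    lo = bisect_left(bounds, x_d)
    hi = bisect_left(bounds, x_d + width)
    return range(2 * lo + 2, 2 * hi + 1)

def range_avg(outer_d_values, inner_rows, width=20):
    """Return {X.d: avg(Y.c) where X.d < Y.d < X.d + width} with a single inner scan."""
    distinct = set(outer_d_values)
    bounds = sorted({v for x in distinct for v in (x, x + width)})
    states = {}                                   # piece index -> (sum, count)
    for y_c, y_d in inner_rows:
        p = piece_of(bounds, y_d)
        s, n = states.get(p, (0, 0))
        states[p] = (s + y_c, n + 1)
    result = {}
    for x in distinct:
        total = count = 0
        for p in pieces_for(bounds, x, width):
            s, n = states.get(p, (0, 0))
            total += s
            count += n
        result[x] = total / count if count else None
    return result
```

Each outer value is answered by merging a contiguous run of piece states, which plays the role of the partition sets listed in the text.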
For the set-class operators in and not in, the data type of X.d is a set and the inner-table correlated column is single-valued. Assume the query is:
select X.a, X.b from X where X.c >
(select avg(Y.c) from Y where Y.d in X.d)
S1: suppose X.distinct d = {abc}, {ab}, {cd}, {def}, {cef}, where a, b, c, d, and so on are each one element value.
S2: loop over X.distinct d and map each set onto the partition tree. If the AMP-TREE is empty, the value of the root node is this set. If the AMP-TREE is not empty, search the tree for matching elements; where possible the set is expressed by merging several existing nodes, and if it partially intersects a leaf node, that leaf is split into two new child leaf nodes. The detailed process is shown in Fig. 3. This step also yields the mapping from each outer correlated-column value of the subquery to the corresponding partitions on the AMP-TREE.
During this process the mapping between outer correlated-column values and partitions is obtained, and the corresponding nodes on the AMP-TREE are marked (to indicate that they will be used later):
X.d | partition |
{abc} | {ab}{c} |
{ab} | {ab} |
{cd} | {c}{d} |
{def} | {d}{ef} |
{cef} | {c}{ef} |
S3: map the rows of the inner table into the AMP-TREE partitions according to the correlated-column value, and at the same time compute the intermediate state to be stored in each marked node of the AMP-TREE. Then compute the subquery result for every value in X.distinct d.
S4: loop over the outer table and evaluate the correlation predicate to obtain the final query result.
The set-class embodiment illustrates the greatest common divisor of the outer correlated-column values. According to the partition column of the table above, the sets have 4 common parts, {ab}, {c}, {d}, and {ef}; the greatest common divisor is therefore 4, so n=4, and the collection {ab}, {c}, {d}, {ef} is the greatest-common-divisor set.
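One way to arrive at the common parts {ab}, {c}, {d}, {ef} is to group elements that appear in exactly the same outer sets. The Python sketch below uses that grouping as a stand-in for the tree construction; the function name `common_parts` and the grouping rule are assumptions, chosen because they reproduce the table of this embodiment.

```python
from collections import defaultdict

def common_parts(outer_sets):
    """Split the outer set values into the finest shared parts and map each set to its parts."""
    sets = [frozenset(s) for s in outer_sets]
    owners = defaultdict(list)             # element -> indexes of the sets containing it
    for idx, s in enumerate(sets):
        for element in s:
            owners[element].append(idx)
    groups = defaultdict(set)              # identical owner lists -> one common part
    for element, idxs in owners.items():
        groups[tuple(idxs)].add(element)
    parts = sorted((frozenset(g) for g in groups.values()), key=sorted)
    # Every outer set is exactly the union of the parts it contains.
    mapping = [[p for p in parts if p <= s] for s in sets]
    return parts, mapping
```

For the five sets of this embodiment the function returns four parts, so n = 4 and the inner table would be divided into n+1 = 5 partitions once the remaining values are counted as one more partition.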
Note: the size of the AMP-TREE is configurable. When the configured value is large enough, almost every outer correlated-column value corresponds directly to a partition node of the AMP-TREE. When it is smaller, only some of the outer values correspond directly to AMP-TREE nodes; the rest must be obtained by merging finer-grained partitions. When it is very small, most outer values cannot be mapped to partitions at all, and the correlated subquery result must be computed with an existing subquery method, with no intermediate results to reuse. In practice a larger AMP-TREE size is therefore recommended, to ensure high reuse of subquery intermediate results.
The following is a system embodiment corresponding to the method embodiment above; the two can be implemented in cooperation with each other. The technical details mentioned in the method embodiment remain valid for this system embodiment and are not repeated here. Likewise, the technical details mentioned in this system embodiment also apply to the method embodiment.
The invention also provides an optimization system for non-equivalent associated subqueries, comprising:
a partition mapping module, configured to obtain the values of the outer-table associated column of the associated subquery and gather them into a value set, and to establish, according to the type of operator in the associated subquery and the value set, the mapping relations from the outer-table associated column of the associated subquery to the inner-table associated column partitions;
a result merging module, configured to partition the inner table of the associated subquery according to the mapping relations while obtaining, according to the query aggregate function applied to the inner table in the associated subquery, the intermediate result state information of the associated subquery in each partition, and to traverse the outer-table associated column according to the mapping relations and aggregate the intermediate result state information of the partitions, obtaining the subquery result corresponding to each associated column value in the outer table.
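The cooperation of the two modules can be illustrated with a minimal sketch. It assumes a subquery of the form SUM(payload) over inner rows whose key is less than the outer value; the function name `subquery_sums_lt`, the (key, payload) row layout, and the half-open interval boundaries are illustrative assumptions, not details taken from the embodiment.

```python
from bisect import bisect_right


def subquery_sums_lt(outer_values, inner_rows):
    """For each outer value v, return SUM(payload) over inner rows with key < v.

    inner_rows is an iterable of (key, payload) pairs (hypothetical layout).
    An empty match yields 0 here, whereas SQL SUM would yield NULL.
    """
    # Value set of the outer-table associated column, sorted ascending.
    bounds = sorted(set(outer_values))
    # k + 1 partitions: (-inf, b0), [b0, b1), ..., [b_{k-1}, +inf).
    partial = [0] * (len(bounds) + 1)
    # Single scan of the inner table: fold each row into its partition state.
    for key, payload in inner_rows:
        partial[bisect_right(bounds, key)] += payload
    # Traverse the outer values in order. Value bounds[i] maps to
    # partitions 0..i, so a running total reuses the state already merged.
    result, running = {}, 0
    for i, v in enumerate(bounds):
        running += partial[i]
        result[v] = running
    return result
```

Under these assumptions the inner table is read once, and each outer value is answered by merging partition states rather than by rescanning the inner table.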
In the optimization system for non-equivalent associated subqueries, the partition mapping module is further configured to: build an automatic merging partition tree according to the type of operator in the associated subquery and the value set of the outer-table associated column, and establish the mapping relations through the automatic merging partition tree, wherein each leaf node of the automatic merging partition tree corresponds to one partition, and the set of these partitions serves as the inner-table associated column partitions.
In the optimization system for non-equivalent associated subqueries, the partition mapping module is further configured as follows:
if the operator is a not-equal operation, the inner table is divided into k+1 partitions according to the number k of values of the outer-table associated column, and each value of the outer-table associated column is mapped to the k partitions other than its own, as the mapping relations;
if the operator is a comparison operation, the inner table is divided into k+1 partitions according to the number k of values of the outer-table associated column, and each value of the outer-table associated column is mapped to the corresponding partitions according to the comparison operator, as the mapping relations;
if the operator is a set operation, the inner table is divided into n+1 partitions according to the greatest common divisor of the values of the outer-table associated column, and each associated column value is mapped to the corresponding partitions according to the set operator, as the mapping relations, where n is the greatest common divisor.
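The first two mapping rules can be sketched as follows. This is an illustration under stated assumptions: partition ids are plain integers, the not-equal case reserves the last partition for inner rows matching no outer value, and the comparison case is shown only for the `<` operator with half-open intervals. The function names `map_not_equal` and `map_less_than` are hypothetical, and the set-operation rule is not covered.

```python
def map_not_equal(outer_values):
    """Partition ids to merge for `inner.col != outer.col` (illustrative)."""
    values = sorted(set(outer_values))
    k = len(values)
    # Partitions 0..k-1 hold inner rows equal to values[i]; partition k holds
    # all remaining inner rows, giving k + 1 partitions in total.
    # Each outer value maps to the k partitions other than its own.
    return {v: [p for p in range(k + 1) if p != i] for i, v in enumerate(values)}


def map_less_than(outer_values):
    """Partition ids to merge for `inner.col < outer.col` (illustrative)."""
    values = sorted(set(outer_values))
    # Partition 0 is (-inf, values[0]), partition i is [values[i-1], values[i]),
    # and partition k is [values[k-1], +inf): again k + 1 partitions.
    # The i-th smallest value (0-based) maps to partitions 0..i.
    return {v: list(range(i + 1)) for i, v in enumerate(values)}
```

For the outer values 1, 2, 3, the not-equal mapping sends each value to three of the four partitions, while the less-than mapping sends them to one, two, and three partitions respectively.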
In the optimization system for non-equivalent associated subqueries, in the result merging module, if the query aggregate function is Avg, the intermediate result state information is Sum+count; if the query aggregate function is Sum/max/min/count, the intermediate result state information is Sum/max/min/count, respectively.
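The reason Avg needs Sum+count as its intermediate state can be shown with a small sketch: averages of partitions cannot be merged directly, whereas sums and counts can. The class name `AggState` and its methods are illustrative assumptions rather than names from the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AggState:
    """Mergeable intermediate state of one partition for Avg (sum + count)."""
    total: float = 0.0
    count: int = 0

    def add(self, x):
        # Fold one inner-table row into the partition state.
        return AggState(self.total + x, self.count + 1)

    def merge(self, other):
        # Combine the states of two partitions; order does not matter.
        return AggState(self.total + other.total, self.count + other.count)

    def avg(self):
        # Final value; None stands in for SQL NULL on an empty input.
        return self.total / self.count if self.count else None


# Partition 1 holds the values 2 and 4, partition 2 holds the value 6.
p1 = AggState().add(2).add(4)
p2 = AggState().add(6)
merged_avg = p1.merge(p2).avg()
```

Here the merged state gives the average of 2, 4, and 6, while averaging the two partition averages (3 and 6) would give a different, incorrect value. Sum, max, min, and count need no extra state because each can be merged with itself.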
In the optimization system for non-equivalent associated subqueries, the result merging module is further configured to: process, in a loop, the association judgment between the outer table and the subquery results of the inner table, obtaining the final result of the associated subquery.
In conclusion, the present invention adaptively partitions the inner table according to the associated column. For the entire subquery, the inner table of the subquery needs to be scanned only once and no index support is required; the outer-table associated column needs to be sorted only when the association operation is an expression and the operator is a comparison operator. The process of partitioning and mapping constructs an AMP-TREE (automatic merging partition tree). The characteristics of the AMP-TREE data structure are: a) every node is an interval; b) the interval ranges of all leaf nodes are disjoint and together form the complete set; c) the interval range of a parent node equals the union of the interval ranges of all its child nodes, which assists the merging and reuse of subquery intermediate result states. The AMP-TREE is built with a partitioning algorithm from the distinct values of the outer-table associated column and the association operation, using topological sorting for sets and partial-order or total-order sorting for non-sets, thereby forming the mapping relations between the outer-table associated column and the inner-table partitions. After the inner table is partitioned with its associated column as the key, the intermediate result state of each partition is recorded; according to the mapping from the outer-table associated column to inner-table partition sets, the partial results obtained are reused and merged to give the final result of the subquery.
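The three structural properties of the AMP-TREE can be illustrated with a small interval tree. This is a sketch under stated assumptions, not the construction used by the invention: property c) is read as each parent covering the combined range of its children, the tree is built by pairing adjacent nodes, the stored state is a SUM, and query bounds are assumed to coincide with leaf boundaries. The names `Node`, `build`, and `query` are hypothetical.

```python
from dataclasses import dataclass, field
from math import inf


@dataclass
class Node:
    lo: float            # inclusive lower bound of the interval
    hi: float            # exclusive upper bound of the interval
    state: float = 0.0   # merged intermediate result (a SUM in this sketch)
    children: list = field(default_factory=list)


def build(leaves):
    """Pair up adjacent nodes level by level until a single root remains."""
    level = list(leaves)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            group = level[i:i + 2]
            if len(group) == 1:
                nxt.append(group[0])
            else:
                # The parent spans its children and stores their merged state.
                merged = sum(n.state for n in group)
                nxt.append(Node(group[0].lo, group[-1].hi, merged, group))
        level = nxt
    return level[0]


def query(node, lo, hi):
    """Merged state for [lo, hi), reusing any node that lies fully inside."""
    if hi <= node.lo or node.hi <= lo:
        return 0.0
    if lo <= node.lo and node.hi <= hi:
        return node.state
    return sum(query(child, lo, hi) for child in node.children)


# Four disjoint leaf partitions that together cover the whole key range.
leaves = [Node(-inf, 5, 1.0), Node(5, 10, 6.0), Node(10, 20, 24.0), Node(20, inf, 32.0)]
root = build(leaves)
```

In this sketch a request for keys below 10 is answered from the single internal node covering the first two leaves, which is the kind of reuse of merged intermediate states that the tree is meant to support.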
The invention may also have other embodiments. Without departing from the spirit and scope of the invention, any person skilled in the art may make refinements and changes on the basis of the invention; the protection scope of the invention is therefore defined by the claims.
Claims (10)
1. An optimization method for non-equivalent associated subqueries, characterized by comprising:
Step 1, obtaining the value set of the outer-table associated column of the associated subquery;
Step 2, according to the type of operator in the associated subquery and the value set, establishing the mapping relations from the outer-table associated column of the associated subquery to the inner-table associated column partitions;
Step 3, obtaining a partition set according to the inner-table associated column partitions, so as to partition the inner table of the associated subquery, while obtaining, according to the query aggregate function applied to the inner table in the associated subquery, the intermediate result state information of the associated subquery in each partition;
Step 4, according to the mapping relations, traversing the outer-table associated column and aggregating the intermediate result state information of the corresponding partition set, obtaining the subquery result corresponding to each associated column value in the outer table.
2. The optimization method for non-equivalent associated subqueries according to claim 1, characterized in that step 2 further comprises:
Step 21, building an automatic merging partition tree according to the type of operator in the associated subquery and the value set of the outer-table associated column, and establishing the mapping relations through the automatic merging partition tree, wherein each leaf node of the automatic merging partition tree corresponds to one partition, and the set of these partitions serves as the inner-table associated column partitions.
3. The optimization method for non-equivalent associated subqueries according to claim 1, characterized in that step 2 further comprises:
if the operator is a not-equal operation, dividing the inner table into k+1 partitions according to the number k of values of the outer-table associated column, and mapping each value of the outer-table associated column to the k partitions other than its own, as the mapping relations;
if the operator is a comparison operation, dividing the inner table into k+1 partitions according to the number k of values of the outer-table associated column, and mapping each value of the outer-table associated column to the corresponding partitions according to the comparison operator, as the mapping relations;
if the operator is a set operation, dividing the inner table into n+1 partitions according to the greatest common divisor of the values of the outer-table associated column, and mapping each associated column value to the corresponding partitions according to the set operator, as the mapping relations, where n is the greatest common divisor.
4. The optimization method for non-equivalent associated subqueries according to claim 1, characterized in that, in step 3, if the query aggregate function is Avg, the intermediate result state information is Sum+count; if the query aggregate function is Sum/max/min/count, the intermediate result state information is Sum/max/min/count, respectively.
5. The optimization method for non-equivalent associated subqueries according to claim 1, characterized in that step 4 further comprises: processing, in a loop, the association judgment between the outer table and the subquery results of the inner table, obtaining the final result of the associated subquery.
6. An optimization system for non-equivalent associated subqueries, characterized by comprising:
a partition mapping module, configured to obtain the values of the outer-table associated column of the associated subquery and gather them into a value set, and to establish, according to the type of operator in the associated subquery and the value set, the mapping relations from the outer-table associated column of the associated subquery to the inner-table associated column partitions;
a result merging module, configured to partition the inner table of the associated subquery according to the mapping relations while obtaining, according to the query aggregate function applied to the inner table in the associated subquery, the intermediate result state information of the associated subquery in each partition, and to traverse the outer-table associated column according to the mapping relations and aggregate the intermediate result state information of the partitions, obtaining the subquery result corresponding to each associated column value in the outer table.
7. The optimization system for non-equivalent associated subqueries according to claim 6, characterized in that the partition mapping module is further configured to: build an automatic merging partition tree according to the type of operator in the associated subquery and the value set of the outer-table associated column, determine all partitions through the automatic merging partition tree, and establish the mapping relations, wherein each leaf node of the automatic merging partition tree corresponds to one partition, and the set of these partitions serves as the inner-table associated column partitions.
8. The optimization system for non-equivalent associated subqueries according to claim 6, characterized in that the partition mapping module is further configured as follows:
if the operator is a not-equal operation, the inner table is divided into k+1 partitions according to the number k of values of the outer-table associated column, and each value of the outer-table associated column is mapped to the k partitions other than its own, as the mapping relations;
if the operator is a comparison operation, the inner table is divided into k+1 partitions according to the number k of values of the outer-table associated column, and each value of the outer-table associated column is mapped to the corresponding partitions according to the comparison operator, as the mapping relations;
if the operator is a set operation, the inner table is divided into n+1 partitions according to the greatest common divisor of the values of the outer-table associated column, and each associated column value is mapped to the corresponding partitions according to the set operator, as the mapping relations, where n is the greatest common divisor.
9. The optimization system for non-equivalent associated subqueries according to claim 6, characterized in that, in the result merging module, if the query aggregate function is Avg, the intermediate result state information is Sum+count; if the query aggregate function is Sum/max/min/count, the intermediate result state information is Sum/max/min/count, respectively.
10. The optimization system for non-equivalent associated subqueries according to claim 6, characterized in that the result merging module is further configured to: process, in a loop, the association judgment between the outer table and the subquery results of the inner table, obtaining the final result of the associated subquery.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810097136.7A CN108874849B (en) | 2018-01-31 | 2018-01-31 | Optimization method and system for non-equivalent associated sub-query |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108874849A (en) | 2018-11-23 |
CN108874849B (en) | 2020-12-25 |
Family
ID=64325986
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810097136.7A Active CN108874849B (en) | 2018-01-31 | 2018-01-31 | Optimization method and system for non-equivalent associated sub-query |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108874849B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220237191A1 (en) * | 2021-01-25 | 2022-07-28 | Salesforce.Com, Inc. | System and method for supporting very large data sets in databases |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080040334A1 (en) * | 2006-08-09 | 2008-02-14 | Gad Haber | Operation of Relational Database Optimizers by Inserting Redundant Sub-Queries in Complex Queries |
CN103294821A (en) * | 2013-06-17 | 2013-09-11 | 北京工业大学 | XML data query result visiting method based on multi-level subquery result branch trees |
CN104123374A (en) * | 2014-07-28 | 2014-10-29 | 北京京东尚科信息技术有限公司 | Method and device for aggregate query in distributed databases |
CN105975617A (en) * | 2016-05-20 | 2016-09-28 | 北京京东尚科信息技术有限公司 | Multi-partition-table inquiring and processing method and device |
CN107169033A (en) * | 2017-04-17 | 2017-09-15 | 东北大学 | Relation data enquiring and optimizing method with parallel framework is changed based on data pattern |
Non-Patent Citations (1)
Title |
---|
毛思雨等: "面向分布式数据库的相关子查询优化策略", 《华东师范大学学报(自然科学版)》 * |
Also Published As
Publication number | Publication date |
---|---|
CN108874849B (en) | 2020-12-25 |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
CB03 | Change of inventor or designer information | Inventor after: He Wenting, Cheng Xueqi, Zheng Tianqi, Zhang Zhibin, Guo Jiafeng, Zhao Peng. Inventor before: He Wenting, Zheng Tianqi, Zhang Zhibin, Cheng Xueqi |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |