CN106484868B - Data sorting method and data sorting device based on LIMIT semantics - Google Patents
- Publication number
- CN106484868B (CN201610888986.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- sort buffer
- limit
- sorting
- tuple
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24534—Query rewriting; Transformation
- G06F16/24539—Query rewriting; Transformation using cached or materialised query results
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a data sorting method and device based on LIMIT semantics. The method includes: allocating a first sort buffer and a second sort buffer; reading the target data into the first sort buffer in multiple batches; and, each time target data is read in, sorting the data in the first sort buffer and determining whether this is the first time the data in the first sort buffer has been sorted. If it is the first time, the first N tuples in the first sort buffer that satisfy the LIMIT semantic constraint are stored in the second sort buffer. If it is not the first time, the first N tuples in the first sort buffer are merged with the tuples in the second sort buffer, and the first N tuples after the merge are stored in the second sort buffer. The data defined by the LIMIT semantics is then determined from the data in the second sort buffer. This technical solution improves the efficiency of the sort operation.
Description
Technical field
The present invention relates to the field of database technology, and in particular to a data sorting method based on LIMIT semantics and a data sorting device based on LIMIT semantics.
Background
In a database management system, query statements frequently need to sort data. A common form of query carries a clause with the LIMIT keyword, that is, it retrieves row_num tuples from among the first N tuples of the ordered data. When the data volume is very large, the sort operation usually has to sort the data with external sorting, then merge the sorted data with a merge algorithm, and finally count the returned tuples in the layer above the iterator to implement the LIMIT semantics.
However, prior-art data sorting schemes often sort the entire data set, which occupies a large amount of memory and consumes a large share of the computer's system resources. Moreover, because memory is limited, sorting with external sorting generates a large number of I/O operations, which reduces the efficiency of the database management system. In addition, implementing the LIMIT semantics above the iterator rather than inside the sort operation means that the iterator must also count the returned tuples, which adds computation, while the most time-consuming step, the sort itself, is left unoptimized.
Therefore, how to optimize the sorting of data for query statements that carry a LIMIT clause, so as to save computer system resources and improve the speed and efficiency of the sort operation, has become a technical problem that urgently needs to be solved.
Summary of the invention
The present invention aims to solve at least one of the technical problems existing in the prior art or the related art.
To this end, one object of the present invention is to provide a data sorting method based on LIMIT semantics.
Another object of the present invention is to provide a data sorting device based on LIMIT semantics.
To achieve at least one of the above objects, an embodiment of the first aspect of the present invention provides a data sorting method based on LIMIT semantics, comprising: when the LIMIT semantics are pushed down to the sort operation, allocating a first sort buffer and a second sort buffer for the target data; reading the target data into the first sort buffer in multiple batches until all of the target data has been read, wherein each time target data is read into the first sort buffer the following steps are performed: sorting the data in the first sort buffer, and determining whether this is the first time the data in the first sort buffer has been sorted; if it is the first time, storing the first N tuples in the first sort buffer that satisfy the LIMIT semantic constraint in the second sort buffer; if it is not the first time, merging the first N tuples in the first sort buffer with the tuples in the second sort buffer and storing the first N tuples after the merge in the second sort buffer; and determining, according to the data in the second sort buffer, the data defined by the LIMIT semantics.
In the above technical solution, preferably, before the step of pushing the LIMIT semantics down to the sort operation, the data sorting method further comprises: determining the number of tuples N defined by the LIMIT semantics; and, if N is less than or equal to a preset threshold, performing the step of pushing the LIMIT semantics down to the sort operation.
In any of the above technical solutions, preferably, if the current read is not the last read of the target data into the first sort buffer, target data is read in until the first sort buffer is full.
In any of the above technical solutions, preferably, the space of the second sort buffer is greater than or equal to the space needed to store the data of 2N tuples.
In any of the above technical solutions, preferably, the number of first sort buffers is one or more; when there are multiple first sort buffers, the target data is read into the multiple first sort buffers in parallel.
An embodiment of the second aspect of the present invention provides a data sorting device based on LIMIT semantics, comprising: an allocation unit, configured to allocate a first sort buffer and a second sort buffer for the target data when the LIMIT semantics are pushed down to the sort operation; a processing unit, configured to read the target data into the first sort buffer in multiple batches until all of the target data has been read, wherein each time target data is read into the first sort buffer, the processing unit sorts the data in the first sort buffer and determines whether this is the first time the data in the first sort buffer has been sorted; if it is the first time, the processing unit stores the first N tuples in the first sort buffer that satisfy the LIMIT semantic constraint in the second sort buffer; if it is not the first time, the processing unit merges the first N tuples in the first sort buffer with the tuples in the second sort buffer and then stores the first N tuples after the merge in the second sort buffer; and a first determination unit, configured to determine, according to the data in the second sort buffer, the data defined by the LIMIT semantics.
In the above technical solution, preferably, the data sorting device further comprises: a second determination unit, configured to determine the number of tuples N defined by the LIMIT semantics; wherein, when N is less than or equal to a preset threshold, the allocation unit pushes the LIMIT semantics down to the sort operation.
In any of the above technical solutions, preferably, the processing unit is specifically configured to read target data in until the first sort buffer is full whenever the current read is not the last read of the target data into the first sort buffer.
In any of the above technical solutions, preferably, the space of the second sort buffer is greater than or equal to the space needed to store the data of 2N tuples.
In any of the above technical solutions, preferably, the number of first sort buffers is one or more; when there are multiple first sort buffers, the processing unit reads the target data into the multiple first sort buffers in parallel.
With the data sorting method and data sorting device based on LIMIT semantics of the present invention, the sorting of data for queries with a LIMIT clause can be optimized in a database management system, thereby improving the speed and efficiency of the sort operation.
Brief description of the drawings
Fig. 1 shows a schematic flowchart of a data sorting method based on LIMIT semantics according to one embodiment of the present invention;
Fig. 2 shows a schematic flowchart of a data sorting method based on LIMIT semantics according to another embodiment of the present invention; and
Fig. 3 shows a schematic structural diagram of a data sorting device based on LIMIT semantics according to one embodiment of the present invention.
Detailed description of the embodiments
To make the above objects, features, and advantages of the present invention easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, provided there is no conflict, the embodiments of the present application and the features in those embodiments may be combined with one another.
Numerous specific details are set forth in the following description to facilitate a full understanding of the present invention. However, the present invention may also be implemented in ways other than those described here, so the scope of protection of the present invention is not limited by the specific embodiments described below.
Fig. 1 shows a schematic flowchart of a data sorting method based on LIMIT semantics according to one embodiment of the present invention.
As shown in Fig. 1, the data sorting method based on LIMIT semantics according to one embodiment of the present invention comprises:
Step 102: when the LIMIT semantics are pushed down to the sort operation, allocate a first sort buffer and a second sort buffer for the target data.
Step 104: read the target data into the first sort buffer in multiple batches until all of the target data has been read, wherein each time target data is read into the first sort buffer the following steps are performed: sort the data in the first sort buffer, and determine whether this is the first time the data in the first sort buffer has been sorted; if it is the first time, store the first N tuples in the first sort buffer that satisfy the LIMIT semantic constraint in the second sort buffer; if it is not the first time, merge the first N tuples in the first sort buffer with the tuples in the second sort buffer and store the first N tuples after the merge in the second sort buffer.
For example, suppose it is determined that this is not the first time the data in the first sort buffer has been sorted, the first N tuples in the first sort buffer are tuple A, tuple B, and tuple D (i.e., N = 3), and the second sort buffer holds tuple B, tuple C, and tuple E. Merging the first N tuples in the first sort buffer with the tuples in the second sort buffer then yields tuple A, tuple B, tuple B, tuple C, tuple D, and tuple E, and the first 3 tuples (tuple A, tuple B, tuple B) are stored in the second sort buffer.
As another example, in descending order, suppose it is determined that this is not the first time the data in the first sort buffer has been sorted, the first N tuples in the first sort buffer are tuple D, tuple B, and tuple A (i.e., N = 3), and the second sort buffer holds tuple E, tuple C, and tuple B. Merging the first N tuples in the first sort buffer with the tuples in the second sort buffer then yields tuple E, tuple D, tuple C, tuple B, tuple B, and tuple A, and the first 3 tuples (tuple E, tuple D, tuple C) are stored in the second sort buffer.
Step 106: according to the data in the second sort buffer, determine the data defined by the LIMIT semantics.
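As a non-limiting illustration, steps 102 to 106 might be sketched in Python as follows. The names `read_chunks`, `top_n_with_two_buffers`, `b1_capacity`, and `descending` are assumptions introduced for this sketch and do not appear in the original disclosure; the tuples are modeled as plain comparable values, and `heapq.merge` stands in for whatever merge routine an actual implementation would use.

```python
import heapq
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def read_chunks(source: Iterable[T], capacity: int) -> Iterator[List[T]]:
    """Yield batches that fill the first buffer; only the last batch may be short."""
    it = iter(source)
    while True:
        chunk = list(islice(it, capacity))
        if not chunk:
            return
        yield chunk


def top_n_with_two_buffers(source: Iterable[T], n: int, b1_capacity: int,
                           descending: bool = False) -> List[T]:
    """Return the first n tuples of the sorted order without sorting all data at once."""
    b2: List[T] = []      # second buffer: holds at most n tuples between batches
    first_sort = True
    for b1 in read_chunks(source, b1_capacity):   # each batch plays the role of the first buffer
        b1.sort(reverse=descending)               # sort the data in the first buffer
        head = b1[:n]                             # first N tuples of the first buffer
        if first_sort:
            b2 = head                             # first sort: store the first N tuples directly
            first_sort = False
        else:
            merged = heapq.merge(b2, head, reverse=descending)
            b2 = list(islice(merged, n))          # keep the first N after merging, drop the rest
    return b2
```

With `b1_capacity=3` and `n=3`, the input `["E", "B", "C", "D", "A", "B"]` reproduces the two worked examples: the ascending call leaves A, B, B in the second buffer, and the call with `descending=True` leaves E, D, C.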
In this technical solution, two kinds of sort buffers (the first sort buffer and the second sort buffer) are configured in memory. When the sorted tuples are merged, using the two kinds of sort buffers reduces the number of tuples that take part in the merge, which saves I/O (input/output) operations on tuples and avoids occupying excessive CPU (Central Processing Unit) resources. The solution also does not need to execute an external-memory multi-way merge algorithm, so it greatly reduces the I/O operations for reading and writing files and simplifies the steps of data sorting, thereby effectively improving the speed and efficiency of data sorting. In addition, after all of the target data has been processed, the tuples in the second sort buffer are passed to the iterator for the next stage of processing. This spares the iterator from counting the returned tuples, as it must in the related art, and turns an external sort operation into an internal sort operation.
In the above technical solution, preferably, before the step of pushing the LIMIT semantics down to the sort operation, the data sorting method further comprises: determining the number of tuples N defined by the LIMIT semantics; and, if N is less than or equal to a preset threshold, performing the step of pushing the LIMIT semantics down to the sort operation.
In this technical solution, if the number of tuples N defined by the LIMIT semantics is too large, the effectiveness of the data sorting is affected. The above scheme is therefore executed only when N is less than or equal to the preset threshold, which ensures the reliability of the data sorting. In a preferred embodiment, the preset threshold is 100 (this value is a parameter and can be set flexibly according to the actual situation).
In one embodiment, the format of the LIMIT clause is "LIMIT { offset, row_num }", where offset denotes the offset and row_num denotes the number of tuples counted from the offset onward. The number of tuples defined by the LIMIT semantics can be calculated from offset and row_num as N = offset + row_num. For example, if offset = 3 and row_num = 7, then N = 10. In another embodiment, the format of the LIMIT clause is "LIMIT { row_num }", where offset defaults to 0, and the number of tuples defined by the LIMIT semantics is likewise N = offset + row_num. For example, if row_num = 5, then N = 5.
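The calculation of N can be illustrated with a short Python sketch. The helper names `limit_tuple_count` and `apply_limit` are hypothetical. The second helper shows one plausible way to pick the row_num tuples out of the first N tuples once they have been collected; the text does not spell this step out, so it should be read as an assumption.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def limit_tuple_count(row_num: int, offset: int = 0) -> int:
    """N = offset + row_num; offset defaults to 0 for the LIMIT {row_num} form."""
    return offset + row_num


def apply_limit(first_n: Sequence[T], row_num: int, offset: int = 0) -> List[T]:
    """Assumed final step: skip `offset` tuples, then return up to `row_num` tuples."""
    return list(first_n[offset:offset + row_num])
```

For LIMIT {3, 7} this gives N = 10, and `apply_limit` would return the fourth through tenth of the ten retained tuples.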
In any of the above technical solutions, preferably, if the current read is not the last read of the target data into the first sort buffer, target data is read in until the first sort buffer is full.
In this technical solution, only the last read may fail to fill the first sort buffer, because the remaining target data may be smaller than the buffer. On every read other than the last, target data is read in until the first sort buffer is full, so the first sort buffer is fully used and the efficiency of data sorting improves.
In any of the above technical solutions, preferably, the space of the second sort buffer is greater than or equal to the space needed to store the data of 2N tuples.
In this technical solution, because the space of the second sort buffer is greater than or equal to the space needed to store the data of 2N tuples, the second sort buffer has enough room to store the data, which effectively improves the speed and efficiency of data sorting.
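The text does not state why 2N tuples is the chosen size. One plausible reading, offered here only as an assumption, is that the second buffer must hold the N tuples it already retains plus the N incoming tuples while the merge is carried out. The Python sketch below illustrates that reading with a preallocated list of 2N slots; `merge_into_b2` and its parameters are hypothetical names, and tuples are modeled as integers.

```python
from typing import List, Optional


def merge_into_b2(b2: List[Optional[int]], kept: int, head: List[int], n: int) -> int:
    """Merge sorted `head` (at most n tuples) into the sorted prefix b2[:kept].

    b2 is a fixed-size buffer with at least 2 * n slots. Returns the new number
    of retained tuples (at most n).
    """
    assert len(b2) >= 2 * n and kept <= n and len(head) <= n
    total = kept + len(head)
    b2[kept:total] = head                 # incoming tuples occupy the free slots
    b2[:total] = sorted(b2[:total])       # combine the two sorted runs in place
    new_kept = min(n, total)
    for i in range(new_kept, total):      # discard everything after the first n
        b2[i] = None
    return new_kept
```

Starting from `b2 = [None] * 6` with N = 3, storing `[2, 3, 5]` and then merging `[1, 2, 4]` leaves 1, 2, 2 in the first three slots without the buffer ever growing.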
In any of the above technical solutions, preferably, the number of first sort buffers is one or more; when there are multiple first sort buffers, the target data is read into the multiple first sort buffers in parallel.
In this technical solution, reading the target data into multiple first sort buffers in parallel further improves the speed and efficiency of data sorting.
Of course, besides being read into multiple first sort buffers in parallel, the target data may also be read into multiple first sort buffers serially.
In addition, the number of first sort buffers and the number of second sort buffers may be the same or different. If the numbers are the same and both are greater than one, the first sort buffers and the second sort buffers correspond one to one. That is, the multiple first sort buffers and multiple second sort buffers are divided into multiple groups, each group containing one first sort buffer and one second sort buffer, and the sort operation and merge operation of step 104 above are performed on the data in the first and second sort buffers of each group. After the target data has been processed, the data in all of the second sort buffers is merged again, repeatedly, until only the data of one second sort buffer remains, and the first N tuples of the final merge are taken as the data that satisfies the LIMIT semantics.
In this technical solution, processing the target data with multiple first sort buffers and multiple second sort buffers at the same time further improves the speed and efficiency of data sorting.
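The grouped variant might be sketched in Python as follows. This is an illustration under stated assumptions rather than the patented implementation: `group_top_n`, `parallel_top_n`, the round-robin partitioning, and the use of a thread pool are all choices made for the sketch, and the text does not say how the target data is divided among the groups. Merging with an initially empty second buffer has the same effect as the "first sort" branch, so that branch is folded away here.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import List, Sequence


def group_top_n(partition: Sequence[int], n: int, b1_capacity: int) -> List[int]:
    """One group: sort each batch in its first buffer, keep a running first-N in its second buffer."""
    b2: List[int] = []
    for start in range(0, len(partition), b1_capacity):
        b1 = sorted(partition[start:start + b1_capacity])
        b2 = list(islice(heapq.merge(b2, b1[:n]), n))
    return b2


def parallel_top_n(data: Sequence[int], n: int, groups: int, b1_capacity: int) -> List[int]:
    """Run several buffer groups concurrently, then merge their second buffers."""
    partitions = [data[i::groups] for i in range(groups)]   # assumed round-robin split
    with ThreadPoolExecutor(max_workers=groups) as pool:
        b2_buffers = list(pool.map(lambda p: group_top_n(p, n, b1_capacity), partitions))
    # Repeated merges until the data of only one second buffer remains.
    while len(b2_buffers) > 1:
        left, right = b2_buffers.pop(), b2_buffers.pop()
        b2_buffers.append(list(islice(heapq.merge(left, right), n)))
    return b2_buffers[0]
```

Each final merge touches at most 2N tuples, so the cost of combining the groups depends on N and the number of groups, not on the size of the target data.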
In any of the above technical solutions, preferably, the data in the first sort buffer is sorted using the quicksort algorithm.
Fig. 2 shows a schematic flowchart of a data sorting method based on LIMIT semantics according to another embodiment of the present invention.
As shown in Fig. 2, the data sorting method based on LIMIT semantics according to another embodiment of the present invention comprises:
Step 202: the optimizer pushes the LIMIT semantics down.
Step 204: determine whether the query contains an ORDER BY clause. If yes, go to step 206; if no, go to step 234, i.e., sort the data using the prior-art scheme.
Step 206: determine whether the query contains a LIMIT clause. If yes, go to step 208; if no, go to step 234, i.e., sort the data using the prior-art scheme.
Step 208: determine whether the format of the LIMIT clause is LIMIT {offset, row_num} or LIMIT {row_num}. If yes, go to step 210; if no, go to step 234, i.e., sort the data using the prior-art scheme. Here, offset denotes the offset, and row_num denotes the number of tuples counted from the offset onward.
For example, if the LIMIT clause is LIMIT {3, 4}, then offset = 3 and row_num = 4. If the format of the LIMIT clause is LIMIT {row_num}, the offset defaults to 0; for example, LIMIT {4} means offset = 0 and row_num = 4.
Step 210: determine whether the number of tuples N defined by the LIMIT semantics is less than or equal to Kmax (i.e., the preset threshold). If N is less than or equal to Kmax, go to step 212; if N is greater than Kmax, go to step 234, i.e., sort the data using the prior-art scheme. As described above for the embodiment of Fig. 1, N can be calculated from offset and row_num (N = offset + row_num); the details are not repeated here.
Step 212: push the two values offset and row_num down to the sort operation.
Step 214: the iterator optimizes the sort operation.
Step 216: allocate sort buffer B1 (i.e., the first sort buffer).
Step 218: allocate sort buffer B2 (i.e., the second sort buffer).
Step 220: read target data into B1 until sort buffer B1 is full or the last portion of the target data has been read in completely.
Step 222: each time target data is read into sort buffer B1, sort the data in sort buffer B1. Preferably, the quicksort algorithm is used to sort the data in sort buffer B1.
Step 224: determine whether this is the first time the data in sort buffer B1 has been sorted. If it is the first time, go to step 226; if it is not, go to step 228.
Step 226: put the first N tuples in sort buffer B1 into sort buffer B2.
Step 228: merge the first N tuples in sort buffer B1 directly with the tuples in sort buffer B2; after the merge, keep the first N tuples in sort buffer B2 and discard the rest.
Step 230: determine whether all of the target data has been processed. If yes, go to step 232; if no, return to step 220.
Step 232: send the data in sort buffer B2 to the iterator. In the embodiment above, there is one sort buffer B1 and one sort buffer B2, so the data in sort buffer B2 is the data defined by the LIMIT semantics, and it is sent to the iterator.
Of course, there may also be multiple sort buffers B1, in which case the target data can be read into the multiple sort buffers B1 in parallel. When there are multiple sort buffers B1, there may be one or more sort buffers B2. If there are multiple sort buffers B2, then after the target data has been read, each of the sort buffers B2 holds data (each retaining N tuples). The data in the multiple sort buffers B2 is then merged repeatedly, and the first N tuples of the last merge are taken as the data defined by the LIMIT semantics.
Step 234: the original flow, i.e., execute the prior-art scheme.
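The decision path of steps 204 to 212 can be summarized in a short Python sketch. The function `plan_limit_pushdown`, its parameters, and the representation of the LIMIT arguments as a sequence of integers are assumptions made for illustration; only the checks themselves and the example threshold of 100 come from the text.

```python
from typing import Optional, Sequence, Tuple

K_MAX = 100  # preset threshold Kmax; the text gives 100 as one preferred value


def plan_limit_pushdown(has_order_by: bool,
                        limit_args: Optional[Sequence[int]],
                        k_max: int = K_MAX) -> Optional[Tuple[int, int]]:
    """Return (offset, row_num) to push down, or None to fall back to step 234."""
    if not has_order_by:             # step 204: no ORDER BY clause
        return None
    if limit_args is None:           # step 206: no LIMIT clause
        return None
    if len(limit_args) == 1:         # step 208: LIMIT {row_num}, offset defaults to 0
        offset, row_num = 0, limit_args[0]
    elif len(limit_args) == 2:       # step 208: LIMIT {offset, row_num}
        offset, row_num = limit_args
    else:                            # any other format is not optimized
        return None
    if offset + row_num > k_max:     # step 210: N = offset + row_num must not exceed Kmax
        return None
    return offset, row_num           # step 212: values pushed down to the sort operation
```

A `None` result corresponds to the original flow of step 234; any other result means steps 214 to 232 would run with N = offset + row_num.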
In the above technical solution, two kinds of sort buffers, B1 and B2, are allocated. When the sorted tuples are merged, using the two kinds of sort buffers reduces the number of tuples that take part in the merge, which saves I/O operations on tuples and avoids occupying excessive CPU resources. The solution also does not need to execute an external-memory multi-way merge algorithm, so it greatly reduces the I/O operations for reading and writing files and simplifies the steps of data sorting, thereby effectively improving the speed and efficiency of data sorting. In addition, after all of the target data has been processed, the tuples in the second sort buffer are passed to the iterator for the next stage of processing. This spares the iterator from counting the returned tuples, as it must in the related art, and turns an external sort operation into an internal sort operation.
Fig. 3 shows a schematic structural diagram of a data sorting device based on LIMIT semantics according to one embodiment of the present invention.
As shown in Fig. 3, the data sorting device 300 based on LIMIT semantics according to one embodiment of the present invention comprises: an allocation unit 302, a processing unit 304, and a first determination unit 306.
The allocation unit 302 is configured to allocate a first sort buffer and a second sort buffer for the target data when the LIMIT semantics are pushed down to the sort operation. The processing unit 304 is configured to read the target data into the first sort buffer in multiple batches until all of the target data has been read, wherein each time target data is read into the first sort buffer, the processing unit 304 sorts the data in the first sort buffer and determines whether this is the first time the data in the first sort buffer has been sorted; if it is the first time, the processing unit 304 stores the first N tuples in the first sort buffer that satisfy the LIMIT semantic constraint in the second sort buffer; if it is not the first time, the processing unit 304 merges the first N tuples in the first sort buffer with the tuples in the second sort buffer and then stores the first N tuples after the merge in the second sort buffer. The first determination unit 306 is configured to determine, according to the data in the second sort buffer, the data defined by the LIMIT semantics.
In this technical solution, two kinds of sort buffers are configured in memory. When the sorted tuples are merged, using the two kinds of sort buffers reduces the number of tuples that take part in the merge, which saves I/O (input/output) operations on tuples and avoids occupying excessive CPU (Central Processing Unit) resources. The solution also does not need to execute an external-memory multi-way merge algorithm, so it greatly reduces the I/O operations for reading and writing files and simplifies the steps of data sorting, thereby effectively improving the speed and efficiency of data sorting. In addition, after all of the target data has been processed, the tuples in the second sort buffer are passed to the iterator for the next stage of processing. This spares the iterator from counting the returned tuples, as it must in the related art, and turns an external sort operation into an internal sort operation.
In the above technical solution, preferably, the data sorting device 300 further comprises: a second determination unit 308, configured to determine the number of tuples N defined by the LIMIT semantics; wherein, when N is less than or equal to a preset threshold, the allocation unit 302 pushes the LIMIT semantics down to the sort operation.
If the number of tuples N defined by the LIMIT semantics is too large, the effectiveness of the data sorting is affected. In this technical solution, therefore, the above scheme is executed only when N is less than or equal to the preset threshold, which ensures the reliability of the data sorting. In a preferred embodiment, the preset threshold is 100.
In any of the above technical solutions, preferably, if the current read is not the last read of the target data into the first sort buffer, the processing unit 304 reads target data in until the first sort buffer is full.
In this technical solution, only the last read may fail to fill the first sort buffer, because the remaining target data may be smaller than the buffer. On every read other than the last, the first sort buffer is filled completely, so the first sort buffer is fully used and the efficiency of data sorting improves.
In any of the above technical solutions, preferably, the space of the second sort buffer is greater than or equal to the space needed to store the data of 2N tuples.
In this technical solution, because the space of the second sort buffer is greater than or equal to the space needed to store the data of 2N tuples, the second sort buffer has enough room to store the data, which effectively improves the speed and efficiency of data sorting.
In any of the above technical solutions, preferably, the number of first sort buffers is one or more; when there are multiple first sort buffers, the processing unit 304 reads the target data into the multiple first sort buffers in parallel.
In this technical solution, reading the target data into multiple first sort buffers in parallel further improves the speed and efficiency of data sorting.
Of course, besides reading the target data into multiple first sort buffers in parallel, the processing unit 304 may also read the target data into multiple first sort buffers serially.
In addition, the number of first sort buffers and the number of second sort buffers may be the same or different. If the numbers are the same and both are greater than one, the first sort buffers and the second sort buffers correspond one to one. That is, the multiple first sort buffers and multiple second sort buffers are divided into multiple groups, each group containing one first sort buffer and one second sort buffer, and the processing unit 304 performs the above sort operation and merge operation on the data in the first and second sort buffers of each group. After the target data has been processed, the processing unit 304 merges the data in all of the second sort buffers again, repeatedly, until only the data of one second sort buffer remains, and takes the first N tuples of the final merge as the data defined by the LIMIT semantics.
The technical solution of the present invention has been described in detail above with reference to the accompanying drawings. With the data sorting method and data sorting device based on LIMIT semantics of the present invention, the sorting of data for queries with a LIMIT clause can be optimized in a database management system, thereby improving the speed and efficiency of the sort operation.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
1. A data reordering method based on LIMIT semantics, characterized by comprising:
when the LIMIT semantics are pushed down to the sorting operation, allocating a first reordering buffer
and a second reordering buffer for target data;
reading the target data into the first reordering buffer in multiple passes until all of the target
data has been read, wherein each time the target data is read into the first reordering buffer, the
following steps are performed: sorting the data in the first reordering buffer; judging whether this is
the first time the data in the first reordering buffer has been sorted; if it is the first time,
storing the first N tuples in the first reordering buffer that satisfy the LIMIT semantic constraint in
the second reordering buffer; if it is not the first time, merging the first N tuples in the first
reordering buffer with the data in the second reordering buffer, and storing the first N tuples after
merging in the second reordering buffer; and
determining, according to the data in the second reordering buffer, the data defined by the LIMIT
semantics.
2. The data reordering method based on LIMIT semantics according to claim 1, characterized in that,
before the step of pushing the LIMIT semantics down to the sorting operation, the data reordering
method further comprises:
determining the number of tuples N defined by the LIMIT semantics;
if N is less than or equal to a preset threshold, performing the step of pushing the LIMIT semantics
down to the sorting operation.
3. The data reordering method based on LIMIT semantics according to claim 1, characterized in that,
if it is not the last time that the target data is read into the first reordering buffer, the target
data is read until the first reordering buffer is full.
4. The data reordering method based on LIMIT semantics according to claim 1, characterized in that
the space of the second reordering buffer is greater than or equal to the space required to store 2N
tuples.
5. The data reordering method based on LIMIT semantics according to any one of claims 1 to 4,
characterized in that
there are one or more first reordering buffers, and when there are multiple first reordering buffers,
the target data is read into the multiple first reordering buffers in parallel.
6. A data sorting device based on LIMIT semantics, characterized by comprising:
an allocation unit, configured to allocate a first reordering buffer and a second reordering buffer for
target data when the LIMIT semantics are pushed down to the sorting operation;
a processing unit, configured to read the target data into the first reordering buffer in multiple
passes until all of the target data has been read, wherein each time the target data is read into the
first reordering buffer, the processing unit sorts the data in the first reordering buffer and judges
whether this is the first time the data in the first reordering buffer has been sorted; if it is the
first time, the processing unit stores the first N tuples in the first reordering buffer that satisfy
the LIMIT semantic constraint in the second reordering buffer; if it is not the first time, the
processing unit merges the first N tuples in the first reordering buffer with the data in the second
reordering buffer, and then stores the first N tuples after merging in the second reordering buffer; and
a first determination unit, configured to determine, according to the data in the second reordering
buffer, the data defined by the LIMIT semantics.
7. The data sorting device based on LIMIT semantics according to claim 6, characterized in that the data
sorting device further comprises:
a second determination unit, configured to determine the number of tuples N defined by the LIMIT
semantics;
wherein, when N is less than or equal to a preset threshold, the allocation unit pushes the LIMIT
semantics down to the sorting operation.
8. The data sorting device based on LIMIT semantics according to claim 6, characterized in that,
if it is not the last time that the processing unit reads the target data into the first reordering
buffer, the processing unit reads the target data until the first reordering buffer is full.
9. The data sorting device based on LIMIT semantics according to claim 6, characterized in that
the space of the second reordering buffer is greater than or equal to the space required to store 2N
tuples.
10. The data sorting device based on LIMIT semantics according to any one of claims 6 to 9,
characterized in that
there are one or more first reordering buffers, and when there are multiple first reordering buffers,
the processing unit reads the target data into the multiple first reordering buffers in parallel.
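The steps recited in claims 1 to 4 can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the patented implementation: the names `top_n_with_two_buffers`, `sort_with_limit`, `first_buffer_capacity` and `threshold` are hypothetical, buffer sizes are counted in tuples rather than bytes, and the fallback to a full sort when N exceeds the threshold is an assumption, since the claims only state when the pushdown is performed.

```python
import heapq
from itertools import islice


def top_n_with_two_buffers(rows, n, first_buffer_capacity, key=lambda r: r):
    """Return the first n rows in ascending key order, reading rows in batches."""
    if n <= 0:
        return []
    source = iter(rows)
    second_buffer = []  # holds at most n tuples between batches
    first_time = True
    while True:
        # Fill the first buffer; only the last batch may be partly full.
        first_buffer = list(islice(source, first_buffer_capacity))
        if not first_buffer:
            break
        first_buffer.sort(key=key)
        head = first_buffer[:n]
        if first_time:
            # First sort: store the first n tuples directly.
            second_buffer = head
            first_time = False
        else:
            # Later sorts: merge two sorted runs of at most n tuples each,
            # which is why room for up to 2N tuples is enough, then keep n.
            merged = heapq.merge(second_buffer, head, key=key)
            second_buffer = list(islice(merged, n))
    return second_buffer


def sort_with_limit(rows, n, threshold, first_buffer_capacity):
    """Use the two-buffer path only when n is at most the preset threshold."""
    if n <= threshold:
        return top_n_with_two_buffers(rows, n, first_buffer_capacity)
    # Assumed fallback: an ordinary full sort followed by truncation.
    return sorted(rows)[:n]
```

A call such as `sort_with_limit(data, 3, threshold=5, first_buffer_capacity=4)` corresponds to a query of the form `ORDER BY ... LIMIT 3` over `data`, with only one batch and at most N retained tuples held in memory at a time.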
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610888986.XA CN106484868B (en) | 2016-10-11 | 2016-10-11 | Data reordering method and data collator based on LIMIT semanteme |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106484868A CN106484868A (en) | 2017-03-08 |
CN106484868B CN106484868B (en) | 2019-07-09 |
Family
ID=58270561
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610888986.XA Active CN106484868B (en) | 2016-10-11 | 2016-10-11 | Data reordering method and data collator based on LIMIT semanteme |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106484868B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110297858B (en) * | 2019-05-27 | 2021-11-09 | 苏宁云计算有限公司 | Optimization method and device for execution plan, computer equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1581162A (en) * | 2004-03-03 | 2005-02-16 | 北京大学 | Quick-sorting in page method based on quick sorting computation |
CN103605750A (en) * | 2013-11-22 | 2014-02-26 | 厦门雅迅网络股份有限公司 | Rapid distributed data paging method |
CN104598485A (en) * | 2013-11-01 | 2015-05-06 | 国际商业机器公司 | Method and device for processing database table |
CN105224697A (en) * | 2015-11-16 | 2016-01-06 | 北京京东尚科信息技术有限公司 | Sort method with filtercondition and the device for performing described method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160188643A1 (en) * | 2014-12-31 | 2016-06-30 | Futurewei Technologies, Inc. | Method and apparatus for scalable sorting of a data set |
Non-Patent Citations (1)
Title |
---|
基于多线程归并排序算法设计;孙琳琳等;《吉林大学学报(信息科学版)》;20150115;第33卷(第1期);第105-110页 |
Also Published As
Publication number | Publication date |
---|---|
CN106484868A (en) | 2017-03-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 20220418 Address after: Room 403, 4th floor, building 23, East District, yard 10, Xibeiwang East Road, Haidian District, Beijing 100089 Patentee after: BEIJING VSETTAN DATA TECHNOLOGY CO.,LTD. Address before: 100192 West Zone, 10 / F, block a, No. 8 Xueqing Road (Science and technology wealth center), Haidian District, Beijing Patentee before: VSETTAN INFORMATION INDUSTRY DEVELOPMENT CO.,LTD. Patentee before: Beijing Huasheng Xintai Data Technology Co., Ltd |
|