CN106202384A - An indexing method supporting aggregate functions over time series data - Google Patents
An indexing method supporting aggregate functions over time series data
- Publication number: CN106202384A
- Application number: CN201610536956.2A
- Authority: CN (China)
- Prior art keywords: node, sequence number, numbering, forest, data
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
- G06F16/2228—Indexing structures
- G06F16/2272—Management thereof
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An indexing method supporting aggregate functions over time series data, capable of supporting ad hoc queries of simple aggregation operations. Its basic idea is to combine a summary table with a segment tree (Segment Tree): a segment forest model made up of multiple segment trees is built on the summary table, thereby avoiding full table scans of the summary table. At the same time, the segment forest is built dynamically in a bottom-up manner, which avoids the shortcoming that a traditional segment tree does not support growth. In addition, the search algorithm locates index data directly by calculation, avoiding recursive traversal of the segment forest and reducing the number of disk I/O operations. Experimental results show that the computed query approach of summary table plus segment forest used here effectively reduces the number of disk I/O operations and significantly improves query performance.
Description
Technical field
The present invention relates to a method for automatic model selection and parameter configuration of big data systems in the development of big data applications, and belongs to the technical field of computer database management.
Background technology
With the development of sensor technology and the spread of the Internet, the speed of data collection and information dissemination has reached an unprecedented level. Aggregate information such as the extreme values and mean of the data has become particularly important, and how to obtain this aggregate information quickly and accurately is the focus of this work.
To serve this kind of query, the database must support fast aggregation operations over massive data within any time range.
Traditional relational databases mainly use summary tables or materialized views to accelerate aggregate queries. A materialized view preprocesses query commands that involve table joins and saves the results in a view table; when a user issues a query, the database queries the view table directly and returns the result. A summary table computes and saves the corresponding summary information while data is being written, so that when a query occurs the result is read directly from the summary table.
The essence of both approaches is to precompute and store commonly used aggregate information, narrowing the query scope and improving the actual query speed. Their drawback is that they increase the expansion rate of the database, and performance degrades as the data grows.
In NoSQL databases, some systems process these aggregation operations with MapReduce: each aggregate query retrieves the relevant table data from the database in real time and submits it to the Map program. In the Map stage, the program filters out the data that satisfies the conditions and submits it to the Reduce program, which collects it and computes the query result. Other databases, such as MongoDB, propose the concept of the aggregation pipeline (Aggregation Pipeline), which combines the ideas of MapReduce and of Linux pipes. Its principle is that aggregation operations act directly on the data files and, through operations close to native system operations, filter and aggregate the data in the files directly.
MapReduce and the aggregation pipeline are both representative of real-time computation. Although they do not increase the expansion rate of the database, the query process incurs heavy disk and computation overhead; it is inefficient and time-consuming and cannot meet the needs of ad hoc queries.
Plamen Nikolov et al. applied the idea of materialized views to NoSQL: common statistics such as counts and sums are precomputed and saved in a view table, and subsequent updates are applied incrementally, so as to speed up query responses.
Compared with running MapReduce computations on a NoSQL database, this approach is clearly faster, but it also has drawbacks. The way a materialized view is formed means that it cannot support queries over arbitrary ranges. In addition, as the data volume rises, the disk overhead of query operations also increases.
Summary of the invention
In view of the above problems, an indexing mechanism supporting aggregation operations on NoSQL databases is proposed. Its basic idea is to combine the summary table with the segment tree (Segment Tree): a segment forest model made up of multiple segment trees is built on the summary table, thereby avoiding full table scans of the summary table. At the same time, the segment forest is built dynamically in a bottom-up manner, which avoids the shortcoming that a traditional segment tree does not support growth. In addition, the search algorithm locates index data directly by calculation, avoiding recursive traversal of the segment forest and reducing the number of disk I/O operations. The index engine described above was implemented on the Cassandra database, and 2 groups of comparison experiments were designed: direct queries on the raw data and direct queries on the summary table. The experimental results show that the computed query approach of summary table plus segment forest effectively reduces the number of disk I/O operations and significantly improves query performance.
An indexing method supporting aggregate functions over time series data, characterized by comprising two steps:
Step one: define the data model of the time series data and the query requirement
Definition 1 (data item): a data item D (data point) is a triple (s, t, v), where s is the sensor ID, t is the timestamp and v is the sensor value; s and t together form a globally unique identifier. The data items of one sensor over continuous time form the time series data. On this basis, the query problem to be solved is defined as: on the time series data, query statistics such as the extreme values and variance of the time series data within a time window t_1~t_2 (t_1 and t_2 being arbitrary times);
Definition 2 (summary information): in the time series data, the statistics of k temporally consecutive data items, together with their time window, form 1 piece of summary information (data digest);
Definition 3 (leaf node): summary information produced directly from data items, plus specific label information, forms a leaf node;
Definition 4 (intermediate node): the aggregation of 2 leaf nodes or 2 intermediate nodes, plus specific label information, forms an intermediate node (parent node). To avoid recursive operations on the trees and to achieve fast retrieval over the summary forest, the necessary label information is added to leaf nodes and intermediate nodes: the sequence number and the code number;
Definition 5 (sequence number): when the index is first built, each leaf node is assigned 1 sequence number (serial) in order of creation; sequence numbers start at 1 and increase, and intermediate nodes have no sequence number;
Definition 6 (code number): each node is assigned 1 code number (code) following the post-order traversal of the segment forest; code numbers start at 1 and increase;
Definition 7 (summary forest): a summary forest (Synopsis Forest) is the forest made up of the summary trees produced from the nodes.
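The definitions above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the class names `DataItem` and `Digest`, the particular statistics kept (count, sum, sum of squares, minimum, maximum) and the use of population variance are all assumptions chosen so that two digests can be merged without rereading the data items.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    s: str    # sensor ID
    t: int    # timestamp; (s, t) is assumed to be globally unique
    v: float  # sensor value


@dataclass(frozen=True)
class Digest:
    """Summary information for a run of consecutive data items (hypothetical layout)."""
    t_start: int
    t_end: int
    count: int
    total: float
    total_sq: float
    minimum: float
    maximum: float

    @staticmethod
    def of(items):
        """Build the digest of k consecutive data items of one sensor."""
        values = [d.v for d in items]
        return Digest(items[0].t, items[-1].t, len(values), sum(values),
                      sum(x * x for x in values), min(values), max(values))

    def merge(self, other):
        """Combine two digests whose time windows do not overlap."""
        return Digest(min(self.t_start, other.t_start),
                      max(self.t_end, other.t_end),
                      self.count + other.count,
                      self.total + other.total,
                      self.total_sq + other.total_sq,
                      min(self.minimum, other.minimum),
                      max(self.maximum, other.maximum))

    @property
    def mean(self):
        return self.total / self.count

    @property
    def variance(self):
        # Population variance recovered from the sum and the sum of squares.
        return self.total_sq / self.count - self.mean ** 2


items = [DataItem("sensor-1", t, float(t)) for t in range(1, 9)]
left, right = Digest.of(items[:4]), Digest.of(items[4:])
whole = left.merge(right)
```

Because every field is mergeable, an intermediate node can be derived from its two children alone, which is what lets a query combine a handful of stored digests instead of scanning data items.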
Step two: construction and querying of the summary forest
(1) Building the summary forest
The summary forest maintains a stack structure rootStack, used to improve merging efficiency, and at the same time maintains a queue, used to temporarily hold the node information waiting to be flushed to disk.
a. When the i-th leaf node arrives:
i) If i is odd:
1) Add the leaf node directly; the leaf node forms a tree by itself. Its sequence number is i and its code number is 2i-ones(i), where ones(i) is the number of 1s in the binary representation of i;
2) Add the leaf node to rootStack and queue;
ii) If i is even:
1) Adding the leaf node triggers the generation of a new tree. The sequence number of the leaf node is i, and its code number is the code number of leaf node (i-1) plus 1, i.e. 2(i-1)-ones(i-1)+1;
2) Add the leaf node to queue;
3) The root node of the new tree produced by this leaf node has code number 2i-ones(i); the remaining newly generated intermediate nodes have code numbers running from 2(i-1)-ones(i-1)+2 to 2i-ones(i)-1;
4) Push the leaf node onto rootStack;
5) Pop the top two nodes of rootStack; the two nodes have the same height and are both root nodes. Merge them into a new tree; over successive merges the code number of the new root rises from 2(i-1)-ones(i-1)+2 to 2i-ones(i);
6) Put the root node generated in step 1-a-ii-5 into queue;
7) Push the root node generated in step 1-a-ii-5 onto rootStack and repeat step 1-a-ii-5 until the code number of the newly generated root node reaches 2i-ones(i);
b. Flush the nodes held in queue to disk.
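The construction steps above can be sketched as follows. This is an illustrative reading rather than the patent's code (the source gives its pseudocode only in Fig. 4, which is not reproduced here). The names `Node`, `SynopsisForest`, `add_leaf` and `disk` are hypothetical, the digest is reduced to a running sum, and a dictionary stands in for the database. The odd and even cases are folded into one loop, since for odd i the two topmost roots never have equal height.

```python
def ones(i):
    """Number of 1 bits in the binary representation of i."""
    return bin(i).count("1")


class Node:
    def __init__(self, code, height, total, serial=None):
        self.code = code      # post-order code number
        self.height = height  # 0 for a leaf node
        self.total = total    # stand-in digest: a running sum
        self.serial = serial  # leaf sequence number; None for intermediate nodes


class SynopsisForest:
    def __init__(self):
        self.root_stack = []  # roots of the current trees, tallest at the bottom
        self.queue = []       # nodes waiting to be flushed
        self.disk = {}        # code number -> node; stand-in for the database
        self.leaf_count = 0
        self.next_code = 1

    def _emit(self, height, total, serial=None):
        node = Node(self.next_code, height, total, serial)
        self.next_code += 1
        self.queue.append(node)
        self.root_stack.append(node)
        return node

    def add_leaf(self, total):
        self.leaf_count += 1
        i = self.leaf_count
        self._emit(0, total, serial=i)
        # Merge while the two topmost roots have equal height.
        while (len(self.root_stack) >= 2
               and self.root_stack[-1].height == self.root_stack[-2].height):
            right = self.root_stack.pop()
            left = self.root_stack.pop()
            self._emit(left.height + 1, left.total + right.total)
        # The last node created for leaf i carries code number 2i - ones(i).
        assert self.root_stack[-1].code == 2 * i - ones(i)
        # Flush the buffered nodes to the stand-in store.
        for node in self.queue:
            self.disk[node.code] = node
        self.queue.clear()


forest = SynopsisForest()
for value in range(1, 8):  # seven leaves with totals 1..7
    forest.add_leaf(value)
print([(n.code, n.height) for n in forest.root_stack])
# [(7, 2), (10, 1), (11, 0)]
```

After seven leaves the forest holds three trees of 4, 2 and 1 leaves, matching the binary representation 111 of 7. Existing nodes are never rewritten, which is why appending a leaf only ever adds new entries to the store.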
(2) Querying the summary forest
1) First define the query requirement: query the summary information of the data items in the time window t_a~t_b.
2) The query proceeds as follows:
A. Normalize the time window. Suppose t_is<t_a<t_ie and t_js<t_b<t_je, where t_is and t_ie denote the start and end of the time window of the i-th leaf node; the query window can then be divided into 3 time windows: t_a~t_ie, t_(i+1)s~t_(j-1)e and t_js~t_b;
B. For the time windows t_a~t_ie and t_js~t_b, read the data items from t_a to t_ie and from t_js to t_b directly from the database, and compute the summary information of these windows directly from the data items;
C. For the time window t_(i+1)s~t_(j-1)e, find the minimum number of segments in the segment forest such that these segments form a partition of the time window t_(i+1)s~t_(j-1)e. Suppose s segments are needed in total; the summary nodes corresponding to these s segments are read from the database in turn, giving s pieces of summary information. This step is implemented as follows:
a) Read the 2 corresponding summary packets from the database according to the start times t_(i+1)s and t_(j-1)e, and obtain the corresponding sequence numbers i and j from the summary packets;
b) Obtain the lower-bound sequence number: if i is even, add the summary packet corresponding to t_(i+1)s to the pending queue, and the lower-bound sequence number is (i+1); otherwise the lower-bound sequence number is i;
c) Obtain the upper-bound sequence number: if j is odd, add the summary packet corresponding to t_(j-1)e to the pending queue, and the upper-bound sequence number is (j-1); otherwise the upper-bound sequence number is j;
d) From the upper-bound sequence number, compute the corresponding code number and the code number of the topmost node covering the node with that sequence number;
e) From the code number and the code number of the topmost node, compute the sequence number of the leftmost leaf node covered by the topmost node;
f) If the leftmost sequence number is greater than the lower-bound sequence number, add the code number of the topmost node to the to-query queue, set the upper-bound sequence number to the sequence number of the leftmost leaf node minus 1, and go to step d;
g) If the leftmost sequence number is less than the lower-bound sequence number, decrease the code number of the topmost node by 1 and go to step e;
h) If the leftmost sequence number equals the lower-bound sequence number, add the code number of the topmost node to the to-query queue and exit the loop;
i) Finally, look up the corresponding summary packets according to the to-query queue and add them to the pending queue.
The summary information of the time window t_a~t_b can then be computed from the (s+2) pieces of summary information obtained in steps B and C.
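Steps d) to h) can be sketched as pure arithmetic on sequence numbers and code numbers. The sketch below is an assumption-laden reading, not the patent's pseudocode (Fig. 5 is not reproduced here): the helper names `leaf_code`, `top_code` and `decompose` are hypothetical, and the sketch relies on the observation that a node whose code number exceeds the code number of its rightmost leaf by h has height h and covers 2^h leaves. The parity shortcuts of steps b) and c) are not special-cased because the loop handles those leaves as height-0 nodes.

```python
def ones(i):
    """Number of 1 bits in the binary representation of i."""
    return bin(i).count("1")


def leaf_code(serial):
    """Code number of the leaf node with the given sequence number."""
    return 2 * (serial - 1) - ones(serial - 1) + 1


def top_code(serial):
    """Code number of the topmost node whose rightmost leaf is `serial`."""
    return 2 * serial - ones(serial)


def decompose(lower, upper):
    """Cover leaves lower..upper with forest nodes, working from the right.

    Returns (code number, leftmost serial, rightmost serial) triples.
    No node is read: everything is computed from the two bounds.
    """
    to_query = []
    while lower <= upper:
        leaf = leaf_code(upper)
        code = top_code(upper)                      # step d
        while True:
            height = code - leaf
            leftmost = upper - (1 << height) + 1    # step e
            if leftmost >= lower:
                break
            code -= 1                               # step g: try the next node down
        to_query.append((code, leftmost, upper))    # steps f and h
        upper = leftmost - 1
    return to_query


print(decompose(2, 8))
# [(14, 5, 8), (6, 3, 4), (2, 2, 2)]
```

For leaves 2 to 8 only three stored nodes are needed (leaves 5 to 8, leaves 3 to 4, and leaf 2), and their code numbers are obtained without walking any tree. Each piece is the largest forest node ending at the current upper bound that still fits in the range.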
The present invention proposes an efficient indexing method supporting aggregation operations over time series data. Its advantages are:
1. It supports ad hoc queries of simple aggregation operations. During a query, the indexing mechanism avoids a large amount of disk overhead, solving the performance degradation that materialized views and summary tables suffer as the data volume grows;
2. It combines the summary table with the segment tree (Segment Tree), building on the summary table a segment forest model made up of multiple segment trees, thereby avoiding full table scans of the summary table;
3. It builds the segment forest dynamically in a bottom-up manner, avoiding the shortcoming that a traditional segment tree does not support growth. In addition, the search algorithm locates index data directly by calculation, avoiding recursive traversal of the segment forest and reducing the number of disk I/O operations;
4. The indexing mechanism is independent of the underlying database; with a self-implemented Java-based query engine, it can easily be ported to any database platform.
Brief description of the drawings
The preferred embodiments of the present invention are further described below, by way of non-limiting example, with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the summary information corresponding to a group of data items.
Fig. 2 shows the summary forest and the time windows defined by the method of the invention.
Fig. 3 shows the addition of a node with an odd sequence number (top) and with an even sequence number (bottom).
Fig. 4 is the pseudocode of the leaf-node insertion algorithm of the invention.
Fig. 5 is the pseudocode of the query algorithm of the invention.
Detailed description of the invention
The present invention is described in further detail below in conjunction with the accompanying drawings.
1. An indexing method supporting aggregate functions over time series data, characterized by comprising two steps:
Step one: define the data model of the time series data and the query requirement
Definition 1 (data item): a data item D (data point) is a triple (s, t, v), where s is the sensor ID, t is the timestamp and v is the sensor value; s and t together form a globally unique identifier. The data items of one sensor over continuous time form the time series data. On this basis, the query problem to be solved is defined as: on the time series data, query statistics such as the extreme values and variance of the time series data within a time window t_1~t_2 (t_1 and t_2 being arbitrary times);
Definition 2 (summary information): in the time series data, the statistics of k temporally consecutive data items, together with their time window, form 1 piece of summary information (data digest);
Definition 3 (leaf node): summary information produced directly from data items, plus specific label information, forms a leaf node;
Definition 4 (intermediate node): the aggregation of 2 leaf nodes or 2 intermediate nodes, plus specific label information, forms an intermediate node (parent node). To avoid recursive operations on the trees and to achieve fast retrieval over the summary forest, the necessary label information is added to leaf nodes and intermediate nodes: the sequence number and the code number;
Definition 5 (sequence number): when the index is first built, each leaf node is assigned 1 sequence number (serial) in order of creation; sequence numbers start at 1 and increase, and intermediate nodes have no sequence number;
Definition 6 (code number): each node is assigned 1 code number (code) following the post-order traversal of the segment forest; code numbers start at 1 and increase;
Definition 7 (summary forest): a summary forest (Synopsis Forest) is the forest made up of the summary trees produced from the nodes.
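The closed form 2i-ones(i) used for code numbers in the construction step can be motivated by a short counting argument. The derivation below is a sketch under one assumption: roots of equal height are always merged, so that after i leaf nodes the forest holds one perfect binary tree per 1 bit of i. The symbols B(i) (the set of bit positions set in i) and N(i) (the number of nodes after i leaf nodes) are introduced here for illustration only.

```latex
\begin{aligned}
i &= \sum_{b \in B(i)} 2^{b}, \qquad |B(i)| = \operatorname{ones}(i) \\
N(i) &= \sum_{b \in B(i)} \left(2^{b+1} - 1\right) = 2i - \operatorname{ones}(i) \\
\operatorname{code}(\text{leaf } i) &= N(i-1) + 1 = 2(i-1) - \operatorname{ones}(i-1) + 1 \\
i = 6 = 110_2:&\quad N(6) = 12 - 2 = 10, \qquad \operatorname{code}(\text{leaf } 6) = N(5) + 1 = 8 + 1 = 9
\end{aligned}
```

A perfect tree with 2^b leaves has 2^(b+1)-1 nodes, which gives the second line. Since code numbers follow post-order and increase by 1 per node, the last node created when leaf i arrives has code number N(i), and leaf i itself is the first node created after the N(i-1) nodes that already exist.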
Step two: construction and querying of the summary forest
(1) Building the summary forest
The summary forest maintains a stack structure rootStack, used to improve merging efficiency, and at the same time maintains a queue, used to temporarily hold the node information waiting to be flushed to disk.
a. When the i-th leaf node arrives:
i) If i is odd:
1) Add the leaf node directly; the leaf node forms a tree by itself. Its sequence number is i and its code number is 2i-ones(i), where ones(i) is the number of 1s in the binary representation of i;
2) Add the leaf node to rootStack and queue;
ii) If i is even:
1) Adding the leaf node triggers the generation of a new tree. The sequence number of the leaf node is i, and its code number is the code number of leaf node (i-1) plus 1, i.e. 2(i-1)-ones(i-1)+1;
2) Add the leaf node to queue;
3) The root node of the new tree produced by this leaf node has code number 2i-ones(i); the remaining newly generated intermediate nodes have code numbers running from 2(i-1)-ones(i-1)+2 to 2i-ones(i)-1;
4) Push the leaf node onto rootStack;
5) Pop the top two nodes of rootStack; the two nodes have the same height and are both root nodes. Merge them into a new tree; over successive merges the code number of the new root rises from 2(i-1)-ones(i-1)+2 to 2i-ones(i);
6) Put the root node generated in step 1-a-ii-5 into queue;
7) Push the root node generated in step 1-a-ii-5 onto rootStack and repeat step 1-a-ii-5 until the code number of the newly generated root node reaches 2i-ones(i);
b. Flush the nodes held in queue to disk.
(2) Querying the summary forest
1) First define the query requirement: query the summary information of the data items in the time window t_a~t_b.
2) The query proceeds as follows:
A. Normalize the time window. Suppose t_is<t_a<t_ie and t_js<t_b<t_je, where t_is and t_ie denote the start and end of the time window of the i-th leaf node; the query window can then be divided into 3 time windows: t_a~t_ie, t_(i+1)s~t_(j-1)e and t_js~t_b;
B. For the time windows t_a~t_ie and t_js~t_b, read the data items from t_a to t_ie and from t_js to t_b directly from the database, and compute the summary information of these windows directly from the data items;
C. For the time window t_(i+1)s~t_(j-1)e, find the minimum number of segments in the segment forest such that these segments form a partition of the time window t_(i+1)s~t_(j-1)e. Suppose s segments are needed in total; the summary nodes corresponding to these s segments are read from the database in turn, giving s pieces of summary information. This step is implemented as follows:
a) Read the 2 corresponding summary packets from the database according to the start times t_(i+1)s and t_(j-1)e, and obtain the corresponding sequence numbers i and j from the summary packets;
b) Obtain the lower-bound sequence number: if i is even, add the summary packet corresponding to t_(i+1)s to the pending queue, and the lower-bound sequence number is (i+1); otherwise the lower-bound sequence number is i;
c) Obtain the upper-bound sequence number: if j is odd, add the summary packet corresponding to t_(j-1)e to the pending queue, and the upper-bound sequence number is (j-1); otherwise the upper-bound sequence number is j;
d) From the upper-bound sequence number, compute the corresponding code number and the code number of the topmost node covering the node with that sequence number;
e) From the code number and the code number of the topmost node, compute the sequence number of the leftmost leaf node covered by the topmost node;
f) If the leftmost sequence number is greater than the lower-bound sequence number, add the code number of the topmost node to the to-query queue, set the upper-bound sequence number to the sequence number of the leftmost leaf node minus 1, and go to step d;
g) If the leftmost sequence number is less than the lower-bound sequence number, decrease the code number of the topmost node by 1 and go to step e;
h) If the leftmost sequence number equals the lower-bound sequence number, add the code number of the topmost node to the to-query queue and exit the loop;
i) Finally, look up the corresponding summary packets according to the to-query queue and add them to the pending queue.
The summary information of the time window t_a~t_b can then be computed from the (s+2) pieces of summary information obtained in steps B and C.
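Steps A and B, and the final combination of partial results, can be illustrated with a small worked example. This is a sketch only: the helper names `split_window` and `raw_sum`, the sample data (timestamps 0 to 15, k = 4 data items per leaf node) and the use of a plain sum as the statistic are all assumptions. For brevity the middle window is summed leaf by leaf here; in the method described above those leaves would instead be covered by the s forest nodes found in step C. Window bounds are treated as inclusive, and the case where both bounds fall in the same leaf is handled as a single raw read.

```python
from bisect import bisect_right


def split_window(leaf_starts, leaf_ends, ta, tb):
    """Split [ta, tb] into a head window, a range of whole leaves, and a tail window.

    leaf_starts / leaf_ends hold the window bounds of leaf 1, 2, ... in order.
    Returns (head, (first_serial, last_serial), tail); the serial range is
    empty when first_serial > last_serial.
    """
    i = bisect_right(leaf_starts, ta)  # serial of the leaf containing ta
    j = bisect_right(leaf_starts, tb)  # serial of the leaf containing tb
    if i == j:
        return (ta, tb), (i + 1, i), None  # both bounds fall in one leaf
    head = (ta, leaf_ends[i - 1])
    tail = (leaf_starts[j - 1], tb)
    return head, (i + 1, j - 1), tail


def raw_sum(points, window):
    """Aggregate an edge window directly from the data items."""
    if window is None:
        return 0.0
    lo, hi = window
    return sum(v for t, v in points if lo <= t <= hi)


# Hypothetical data: 16 points, 4 per leaf node.
points = [(t, float(t)) for t in range(16)]
k = 4
leaf_starts = [points[n][0] for n in range(0, len(points), k)]
leaf_ends = [points[n + k - 1][0] for n in range(0, len(points), k)]
leaf_sums = [sum(v for _, v in points[n:n + k]) for n in range(0, len(points), k)]

head, middle, tail = split_window(leaf_starts, leaf_ends, 2, 13)
first, last = middle
middle_sum = sum(leaf_sums[first - 1:last])  # stand-in for the s forest digests
answer = raw_sum(points, head) + middle_sum + raw_sum(points, tail)
print(head, middle, tail, answer)
# (2, 3) (2, 3) (12, 13) 90.0
```

Only the two edge windows touch raw data items; everything in between is answered from precomputed summaries, which is where the reduction in disk I/O comes from.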
Claims (1)
1. An indexing method supporting aggregate functions over time series data, characterized by comprising two steps:
Step one: define the data model of the time series data and the query requirement
Definition 1 (data item): a data item D (data point) is a triple (s, t, v), where s is the sensor ID, t is the timestamp and v is the sensor value; s and t together form a globally unique identifier. The data items of one sensor over continuous time form the time series data. On this basis, the query problem to be solved is defined as: on the time series data, query statistics such as the extreme values and variance of the time series data within a time window t_1~t_2 (t_1 and t_2 being arbitrary times);
Definition 2 (summary information): in the time series data, the statistics of k temporally consecutive data items, together with their time window, form 1 piece of summary information (data digest);
Definition 3 (leaf node): summary information produced directly from data items, plus specific label information, forms a leaf node;
Definition 4 (intermediate node): the aggregation of 2 leaf nodes or 2 intermediate nodes, plus specific label information, forms an intermediate node (parent node). To avoid recursive operations on the trees and to achieve fast retrieval over the summary forest, the necessary label information is added to leaf nodes and intermediate nodes: the sequence number and the code number;
Definition 5 (sequence number): when the index is first built, each leaf node is assigned 1 sequence number (serial) in order of creation; sequence numbers start at 1 and increase, and intermediate nodes have no sequence number;
Definition 6 (code number): each node is assigned 1 code number (code) following the post-order traversal of the segment forest; code numbers start at 1 and increase;
Definition 7 (summary forest): a summary forest (Synopsis Forest) is the forest made up of the summary trees produced from the nodes.
Step two: construction and querying of the summary forest
(1) Building the summary forest
The summary forest maintains a stack structure (rootStack), used to improve merging efficiency, and at the same time maintains a queue (queue), used to temporarily hold the node information waiting to be flushed to disk.
a. When the i-th leaf node arrives:
i) If i is odd:
1) Add the leaf node directly; the leaf node forms a tree by itself. Its sequence number is i and its code number is 2i-ones(i), where ones(i) is the number of 1s in the binary representation of i;
2) Add the leaf node to rootStack and queue;
ii) If i is even:
1) Adding the leaf node triggers the generation of a new tree. The sequence number of the leaf node is i, and its code number is the code number of leaf node (i-1) plus 1, i.e. 2(i-1)-ones(i-1)+1;
2) Add the leaf node to queue;
3) The root node of the new tree produced by this leaf node has code number 2i-ones(i); the remaining newly generated intermediate nodes have code numbers running from 2(i-1)-ones(i-1)+2 to 2i-ones(i)-1;
4) Push the leaf node onto rootStack;
5) Pop the top two nodes of rootStack; the two nodes have the same height and are both root nodes. Merge them into a new tree; over successive merges the code number of the new root rises from 2(i-1)-ones(i-1)+2 to 2i-ones(i);
6) Put the root node generated in step 1-a-ii-5 into queue;
7) Push the root node generated in step 1-a-ii-5 onto rootStack and repeat step 1-a-ii-5 until the code number of the newly generated root node reaches 2i-ones(i);
b. Flush the nodes held in queue to disk;
(2) Querying the summary forest
1) First define the query requirement: query the summary information of the data items in the time window t_a~t_b.
2) The query proceeds as follows:
A. Normalize the time window. Suppose t_is<t_a<t_ie and t_js<t_b<t_je, where t_is and t_ie denote the start and end of the time window of the i-th leaf node; the query window can then be divided into 3 time windows: t_a~t_ie, t_(i+1)s~t_(j-1)e and t_js~t_b;
B. For the time windows t_a~t_ie and t_js~t_b, read the data items from t_a to t_ie and from t_js to t_b directly from the database, and compute the summary information of these windows directly from the data items;
C. For the time window t_(i+1)s~t_(j-1)e, find the minimum number of segments in the segment forest such that these segments form a partition of the time window t_(i+1)s~t_(j-1)e. Suppose s segments are needed in total; the summary nodes corresponding to these s segments are read from the database in turn, giving s pieces of summary information. This step is implemented as follows:
a) Read the 2 corresponding summary packets from the database according to the start times t_(i+1)s and t_(j-1)e, and obtain the corresponding sequence numbers i and j from the summary packets;
b) Obtain the lower-bound sequence number: if i is even, add the summary packet corresponding to t_(i+1)s to the pending queue, and the lower-bound sequence number is (i+1); otherwise the lower-bound sequence number is i;
c) Obtain the upper-bound sequence number: if j is odd, add the summary packet corresponding to t_(j-1)e to the pending queue, and the upper-bound sequence number is (j-1); otherwise the upper-bound sequence number is j;
d) From the upper-bound sequence number, compute the corresponding code number and the code number of the topmost node covering the node with that sequence number;
e) From the code number and the code number of the topmost node, compute the sequence number of the leftmost leaf node covered by the topmost node;
f) If the leftmost sequence number is greater than the lower-bound sequence number, add the code number of the topmost node to the to-query queue, set the upper-bound sequence number to the sequence number of the leftmost leaf node minus 1, and go to step d;
g) If the leftmost sequence number is less than the lower-bound sequence number, decrease the code number of the topmost node by 1 and go to step e;
h) If the leftmost sequence number equals the lower-bound sequence number, add the code number of the topmost node to the to-query queue and exit the loop;
i) Finally, look up the corresponding summary packets according to the to-query queue and add them to the pending queue.
The summary information of the time window t_a~t_b can then be computed from the (s+2) pieces of summary information obtained in steps B and C.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610536956.2A CN106202384A (en) | 2016-07-08 | 2016-07-08 | An indexing method supporting aggregate functions over time series data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610536956.2A CN106202384A (en) | 2016-07-08 | 2016-07-08 | An indexing method supporting aggregate functions over time series data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106202384A | 2016-12-07 |
Family
ID=57473329
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610536956.2A Pending CN106202384A (en) | 2016-07-08 | 2016-07-08 | An indexing method supporting aggregate functions over time series data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106202384A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106991137A (en) * | 2017-03-15 | 2017-07-28 | 浙江大学 | Method for indexing time series data based on an HBase hash summary forest |
CN108268589A (en) * | 2017-12-05 | 2018-07-10 | 北京百度网讯科技有限公司 | Aggregate query method, apparatus, computer equipment and the readable medium of time series data |
CN109241121A (en) * | 2017-06-29 | 2019-01-18 | 阿里巴巴集团控股有限公司 | The storage of time series data and querying method, device, system and electronic equipment |
CN109948007A (en) * | 2019-03-21 | 2019-06-28 | 浙江邦盛科技有限公司 | Processing method for querying the maximum numbers of consecutive increases and decreases in time series data statistics |
CN110008544A (en) * | 2019-03-21 | 2019-07-12 | 浙江邦盛科技有限公司 | Processing method for counting the numbers of increases and decreases in time series data |
CN112069164A (en) * | 2019-06-10 | 2020-12-11 | 北京百度网讯科技有限公司 | Data query method and device, electronic equipment and computer readable storage medium |
CN113535712A (en) * | 2021-06-04 | 2021-10-22 | 山东大学 | Method and system for supporting large-scale time sequence data interaction based on line segment KD tree |
CN114547073A (en) * | 2022-02-10 | 2022-05-27 | 清华大学 | Aggregation query method and device for time series data and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101859323A (en) * | 2010-05-31 | 2010-10-13 | 广西大学 | Ciphertext full-text search system |
CN105389370A (en) * | 2015-11-13 | 2016-03-09 | 浙江工业大学 | Social activity organization-faced time aggregation query method |
Non-Patent Citations (1)
Title |
---|
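Huang Xiangdong et al.: "Index supporting aggregate functions over time series data" (支持时序数据聚合函数的索引), Journal of Tsinghua University (Science and Technology) (《清华大学学报(自然科学版)》) * |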
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106991137A (en) * | 2017-03-15 | 2017-07-28 | Zhejiang University | Method for indexing time series data based on an HBase hash summary forest |
CN106991137B (en) * | 2017-03-15 | 2019-10-18 | Zhejiang University | Method for indexing time series data based on an HBase hash summary forest |
CN109241121A (en) * | 2017-06-29 | 2019-01-18 | Alibaba Group Holding Ltd. | Storage and query method, apparatus, and system for time series data, and electronic device |
CN108268589A (en) * | 2017-12-05 | 2018-07-10 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Aggregate query method and apparatus for time series data, computer device, and readable medium |
CN109948007A (en) * | 2019-03-21 | 2019-06-28 | Zhejiang Bangsun Technology Co., Ltd. | Processing method for querying the maximum number of consecutive increases and decreases in time series data statistics |
CN110008544A (en) * | 2019-03-21 | 2019-07-12 | Zhejiang Bangsun Technology Co., Ltd. | Processing method for counting the number of increases and decreases in time series data |
CN109948007B (en) * | 2019-03-21 | 2020-07-14 | Zhejiang Bangsun Technology Co., Ltd. | Processing method for querying the maximum number of consecutive increases and decreases in time series data statistics |
CN112069164A (en) * | 2019-06-10 | 2020-12-11 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Data query method and device, electronic equipment and computer readable storage medium |
CN112069164B (en) * | 2019-06-10 | 2023-08-01 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Data query method, device, electronic equipment and computer readable storage medium |
CN113535712A (en) * | 2021-06-04 | 2021-10-22 | Shandong University | Method and system for supporting large-scale time series data interaction based on line segment KD tree |
CN113535712B (en) * | 2021-06-04 | 2023-09-29 | Shandong University | Method and system for supporting large-scale time series data interaction based on line segment KD tree |
CN114547073A (en) * | 2022-02-10 | 2022-05-27 | Tsinghua University | Aggregation query method and device for time series data and storage medium |
Similar Documents
Publication | Title |
---|---|
CN106202384A (en) | An indexing method supporting aggregate functions on time series data |
CN105488231B (en) | Big data processing method based on adaptive table dimension partitioning |
CN102270232B (en) | Semantic data query system with optimized storage |
US9400815B2 (en) | Method of two pass processing for relational queries in a database system and corresponding database system |
US7761474B2 (en) | Indexing stored data |
CN103823823A (en) | Denormalization strategy selection method based on frequent item set mining algorithm |
CN106599052B (en) | Apache Kylin-based data query system and method |
CN102722553A (en) | Distributed inverted index organization method based on user log analysis |
CN104504008B (en) | Data migration algorithm based on nested SQL to HBase |
CN107247799A (en) | Data processing method and system compatible with multiple big data stores, and modeling method thereof |
CN112015741A (en) | Method and device for storing massive data in different databases and tables |
CN103902544A (en) | Data processing method and system |
CN104346444B (en) | Optimal site selection method based on reverse spatial keyword query in road networks |
CN104504018A (en) | Top-down real-time big data query optimization method based on bushy tree |
CN103678550A (en) | Mass data real-time query method based on dynamic index structure |
CN103150163A (en) | Map/Reduce mode-based parallel relating method |
CN106844089A (en) | Method and apparatus for recovering tree data storage |
CN104731925A (en) | MapReduce-based FP-Growth load balance parallel computing method |
CN111367951A (en) | Method and device for processing stream data |
CN107870956A (en) | Effective item set mining method and apparatus, and data processing device |
CN110019446A (en) | ETL data processing system and method |
CN105359142A (en) | Hash join method, device and database management system |
CN110597805B (en) | Memory index structure processing method |
CN104008205A (en) | Content routing inquiry method and system |
CN103761298B (en) | Distributed-architecture-based entity matching method |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 2016-12-07 |