CN106202384A - An indexing method supporting aggregate functions over time series data - Google Patents

An indexing method supporting aggregate functions over time series data

Info

Publication number: CN106202384A
Authority: CN (China)
Prior art keywords: node, sequence number, numbering, forest, data
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201610536956.2A
Other languages: Chinese (zh)
Inventors: 王建民, 黄向东, 郑亮帆, 康荣, 龙明盛, 刘英博
Current Assignee: Tsinghua University (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Tsinghua University
Priority date: 2016-07-08 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2016-07-08
Publication date: 2016-12-07
Application filed by Tsinghua University
Priority to CN201610536956.2A
Publication of CN106202384A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An indexing method supporting aggregate functions over time series data, capable of serving ad hoc queries for simple aggregation operations. The basic idea is to combine a summary table with a segment tree: a segment forest made up of multiple segment trees is built on top of the summary table, which avoids full table scans of the summary table. The segment forest is constructed dynamically from the bottom up, which sidesteps the inability of a traditional segment tree to grow. In addition, the query algorithm locates index data directly by calculation, which avoids recursive traversal of the segment forest and reduces the number of disk I/O operations. Experimental results show that querying by calculation over the summary table plus segment forest effectively reduces the number of disk I/O operations and markedly improves query performance.

Description

An indexing method supporting aggregate functions over time series data
Technical field
The present invention relates to a method for automatic model selection and parameter configuration of a big data system during big data application development, and belongs to the technical field of computer database management.
Background art
With the development of sensor technology and the spread of the Internet, the rate of data collection and information dissemination has reached an unprecedented level. Aggregate information such as the extreme values and averages of the data has become very important, and how to obtain this aggregate information quickly and accurately is the focus of this work.
To serve such queries, a database must support fast aggregation operations over large volumes of data for any time range.
Traditional relational databases mainly use summary tables or materialized views to accelerate aggregate queries. A materialized view preprocesses query commands that involve table joins and saves the results in a view table; when a user issues a query, the database reads from the view table and returns the result directly. A summary table computes and stores the corresponding summary information at the time data is written, so that a query can be answered directly from the summary table.
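The summary-table idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name `SummaryTable`, the fixed-width time bucket, and the choice of count, sum, min and max as the stored statistics are hypothetical and are not taken from the patent.

```python
class SummaryTable:
    """Hypothetical in-memory stand-in for a summary table.

    Aggregates are updated at write time, so a query for one bucket
    is a single lookup instead of a scan over the raw data.
    """

    def __init__(self, bucket_seconds=3600):
        self.bucket_seconds = bucket_seconds
        self.rows = {}  # (sensor, bucket index) -> aggregate row

    def write(self, sensor, timestamp, value):
        # Locate (or create) the aggregate row for this sensor and bucket.
        key = (sensor, timestamp // self.bucket_seconds)
        row = self.rows.setdefault(
            key, {"count": 0, "sum": 0.0, "min": value, "max": value}
        )
        # Update the stored statistics while the data item is written.
        row["count"] += 1
        row["sum"] += value
        row["min"] = min(row["min"], value)
        row["max"] = max(row["max"], value)

    def aggregate(self, sensor, bucket):
        # Answer the query from the precomputed row; None if no data.
        return self.rows.get((sensor, bucket))
```

A query aligned to a bucket is answered from one row, but a query over an arbitrary window still has to visit every bucket it spans, which is the cost the segment forest described later is meant to reduce.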
Both approaches amount to precomputing and storing commonly used aggregate information, which narrows the query range and speeds up actual queries. The drawback is that they inflate the size of the database, and performance degrades as the data grows.
Among NoSQL databases, some use MapReduce to handle these aggregation operations: each aggregate query reads the relevant table data from the database in real time and submits it to a Map program. In the Map phase, the program filters out the data that satisfies the condition and passes it to a Reduce program, which collects it and computes the query result. Other databases, such as MongoDB, introduce the concept of an aggregation pipeline (Aggregation Pipeline), which combines the idea of MapReduce with that of Linux pipes. Its principle is that aggregation operations act directly on the data files and, through operations resembling native system operations, filter and aggregate the data in those files directly.
MapReduce and the aggregation pipeline are both representative of real-time computation. They do not inflate the database, but the query process incurs heavy disk and computation overhead, is slow, and cannot meet the needs of ad hoc queries.
Plamen Nikolov et al. applied the idea of materialized views to NoSQL: common statistics such as counts and sums are precomputed and saved in a view table, then updated incrementally, in order to speed up query response.
This approach is clearly faster than running MapReduce computations on a NoSQL database, but it has drawbacks of its own. The way a materialized view is formed means that it does not support queries over arbitrary ranges. In addition, the disk overhead of queries grows as the data volume increases.
Summary of the invention
To address the problems above, the present invention proposes an indexing mechanism that supports aggregation operations on NoSQL databases. The basic idea is to combine a summary table with a segment tree: a segment forest made up of multiple segment trees is built on top of the summary table, which avoids full table scans of the summary table. The segment forest is constructed dynamically from the bottom up, which sidesteps the inability of a traditional segment tree to grow. In addition, the query algorithm locates index data directly by calculation, which avoids recursive traversal of the segment forest and reduces the number of disk I/O operations. The index engine described above was implemented on the Cassandra database, and two groups of comparison experiments were designed: direct queries over the raw data and direct queries over the summary table. The experimental results show that querying by calculation over the summary table plus segment forest effectively reduces the number of disk I/O operations and markedly improves query performance.
An indexing method supporting aggregate functions over time series data, characterized by comprising two steps:
Step 1: define the data model of the time series data and the query requirement
Definition 1 (data item): a data item D (data point) is a triple (s, t, v), where s is the sensor ID, t is the timestamp, and v is the sensor value; s and t together form a globally unique identifier. The data items of one sensor over continuous time form the time series data. On this basis the query problem to be solved is defined: on the time series data, query statistical information, such as value and variance statistics, of the time series data within a time window t_1~t_2 (t_1 and t_2 being arbitrary times);
Definition 2 (summary information): in the time series data, the statistical information of k data items that are consecutive in time, together with their time window, constitutes one summary (data digest);
Definition 3 (leaf node): a summary produced directly from data items, plus specific label information, constitutes a leaf node (leaf node);
Definition 4 (intermediate node): aggregating 2 leaf nodes or 2 intermediate nodes and adding specific label information yields an intermediate node (parent node). To avoid recursive operations on the trees and to allow fast retrieval over the summary forest, the necessary label information is attached to leaf nodes and intermediate nodes: the serial number and the code;
Definition 5 (serial number): when the index is first built, each leaf node is assigned one serial number (serial) in order of generation; serial numbers start at 1 and increase, and intermediate nodes have no serial number;
Definition 6 (code): each node is assigned one code (code) in the order of a post-order traversal of the segment forest; codes start at 1 and increase;
Definition 7 (summary forest): the summary forest (Synopsis Forest) is the forest formed by the summary trees that the nodes produce.
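Definitions 1 to 4 can be made concrete with a small sketch. The patent does not fix which statistics a summary holds, so the fields below (count, sum, sum of squares, minimum, maximum) are an assumption chosen so that mean and variance stay mergeable; the names `Summary`, `summarize` and `merge` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A data item is the triple (s, t, v): sensor ID, timestamp, value.
DataItem = Tuple[str, int, float]


@dataclass(frozen=True)
class Summary:
    """Statistics of k consecutive data items plus their time window."""
    t_start: int
    t_end: int
    count: int
    total: float
    sq_total: float
    minimum: float
    maximum: float

    @property
    def mean(self) -> float:
        return self.total / self.count

    @property
    def variance(self) -> float:
        # Population variance recovered from the sum and sum of squares.
        return self.sq_total / self.count - self.mean ** 2


def summarize(items: List[DataItem]) -> Summary:
    """Build a leaf summary from a non-empty, time-ordered list of items."""
    values = [v for _, _, v in items]
    return Summary(items[0][1], items[-1][1], len(values), sum(values),
                   sum(v * v for v in values), min(values), max(values))


def merge(left: Summary, right: Summary) -> Summary:
    """Combine two adjacent summaries into their parent (Definition 4)."""
    return Summary(left.t_start, right.t_end,
                   left.count + right.count,
                   left.total + right.total,
                   left.sq_total + right.sq_total,
                   min(left.minimum, right.minimum),
                   max(left.maximum, right.maximum))
```

Because every field is mergeable, the summary of a parent node never needs the raw data items, which is what allows intermediate nodes to be built from their two children alone.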
Step 2: construction and querying of the summary forest
(1) Summary forest construction
The summary forest maintains a stack, rootStack, to make merging efficient, and a queue that temporarily holds the nodes waiting to be flushed to disk.
a. When the i-th leaf node arrives:
i. If i is odd:
1) The leaf node is added directly and forms a tree on its own. Its serial number is i and its code is 2i-ones(i), where ones(i) is the number of 1s in the binary representation of i;
2) The leaf node is added to rootStack and to the queue;
ii. If i is even:
1) Adding the leaf node also triggers the generation of a new tree. The serial number of the leaf node is i, and its code is the code of leaf node (i-1) plus 1, that is, 2(i-1)-ones(i-1)+1;
2) The leaf node is added to the queue;
3) The root of the new tree produced by this leaf node has code 2i-ones(i); the other newly generated intermediate nodes have codes 2(i-1)-ones(i-1)+2 through 2i-ones(i)-1, in order;
4) The leaf node is pushed onto rootStack;
5) The top two nodes of rootStack are popped; both are roots of the same height, and they are merged into a new tree. Over successive merges the code of the new root rises from 2(i-1)-ones(i-1)+2 to 2i-ones(i);
6) The root generated in step 1-a-ii-5 is added to the queue;
7) The root generated in step 1-a-ii-5 is pushed onto rootStack, and step 1-a-ii-5 is repeated until the code of the newly generated root reaches 2i-ones(i);
b. The nodes held in the queue are flushed to disk.
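The construction procedure above can be sketched as follows. This is an illustration under stated assumptions, not the patent's pseudocode (Fig. 4 is not reproduced in this text): the class `SummaryForest`, the dictionary standing in for disk, and the use of plain numbers with addition as the summary are all hypothetical. The sketch also folds the odd and even cases into one loop that merges while the top two roots have equal height, which yields the same codes as the case split in the text.

```python
import operator


def ones(i: int) -> int:
    """Number of 1 bits in the binary representation of i."""
    return bin(i).count("1")


class SummaryForest:
    def __init__(self, combine=operator.add):
        self.combine = combine   # how two child summaries are merged
        self.root_stack = []     # roots of the current trees (rootStack)
        self.queue = []          # nodes waiting to be flushed
        self.disk = {}           # hypothetical stand-in for disk: code -> node
        self.leaf_count = 0
        self.next_code = 1       # codes follow post-order, starting at 1

    def _new_node(self, height, summary, serial=None):
        node = {"code": self.next_code, "serial": serial,
                "height": height, "summary": summary}
        self.next_code += 1
        self.queue.append(node)
        return node

    def add_leaf(self, summary):
        """Add the next leaf, merge equal-height roots, flush to disk."""
        self.leaf_count += 1
        leaf = self._new_node(0, summary, serial=self.leaf_count)
        self.root_stack.append(leaf)
        # Merge while the two topmost roots have the same height.
        while (len(self.root_stack) >= 2 and
               self.root_stack[-1]["height"] == self.root_stack[-2]["height"]):
            right = self.root_stack.pop()
            left = self.root_stack.pop()
            parent = self._new_node(
                left["height"] + 1,
                self.combine(left["summary"], right["summary"]))
            self.root_stack.append(parent)
        self.flush()
        return leaf["code"]

    def flush(self):
        # Write every queued node to the stand-in for disk.
        for node in self.queue:
            self.disk[node["code"]] = node
        self.queue.clear()
```

For an odd i the loop body never runs, because the previous root has height at least 1, so the leaf becomes a tree of its own as in case i. For an even i the loop performs one merge per trailing zero bit of i, matching the repeated step 1-a-ii-5.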
(2) Summary forest query
1) First define the query requirement: query the summary information of the data items in the time window t_a~t_b.
2) The query proceeds as follows:
A. Normalize the time window. Suppose t_is < t_a < t_ie and t_js < t_b < t_je, where t_ks and t_ke denote the start and end times of the k-th summary; the query window can then be split into 3 time windows: t_a~t_ie, t_(i+1)s~t_(j-1)e, and t_js~t_b;
B. For the windows t_a~t_ie and t_js~t_b, read the data items from t_a to t_ie and from t_js to t_b directly from the database, and compute the summary information of these windows directly from the data items;
C. For the window t_(i+1)s~t_(j-1)e, find the minimum number of segments in the segment forest such that these segments form a partition of the window t_(i+1)s~t_(j-1)e. Suppose s segments are needed in total; read the summary nodes corresponding to these s segments from the database in turn to obtain s summaries. This step is implemented as follows:
a) Read the 2 corresponding summary packets from the database according to the times t_(i+1)s and t_(j-1)e, and obtain the corresponding serial numbers i and j from them;
b) Obtain the lower-bound serial number: if i is even, add the summary packet corresponding to t_(i+1)s to the pending queue, and the lower-bound serial number is (i+1); otherwise the lower-bound serial number is i;
c) Obtain the upper-bound serial number: if j is odd, add the summary packet corresponding to t_(j-1)e to the pending queue, and the upper-bound serial number is (j-1); otherwise the upper-bound serial number is j;
d) From the upper-bound serial number, compute its code and the code of the topmost node that covers the node with this serial number;
e) From that code and the code of the topmost node, compute the serial number of the leftmost leaf node covered by the topmost node;
f) If the leftmost serial number is greater than the lower-bound serial number, add the code of the topmost node to the to-be-queried queue, set the upper-bound serial number to the serial number of the leftmost leaf node minus 1, and go to step d;
g) If the leftmost serial number is less than the lower-bound serial number, decrease the code of the topmost node by 1 and go to step e;
h) If the leftmost serial number is equal to the lower-bound serial number, add the code of the topmost node to the to-be-queried queue and exit the loop;
i) Finally, look up the corresponding summary packets according to the to-be-queried queue and add them to the pending queue.
The summary information of the time window t_a~t_b can then be computed from the (s+2) summaries obtained in steps B and C.
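Steps b) through h) can be sketched as pure arithmetic on serial numbers and codes. The sketch is an interpretation of the text, not the patent's pseudocode (Fig. 5 is not reproduced here). The function names are hypothetical, and two formulas are derived rather than quoted: the height of a node whose rightmost leaf has serial number u is taken as its code minus the code of that leaf, and the node then covers 2^height leaves ending at u. The sketch assumes all leaves up to `upper` have already been added to the forest.

```python
def ones(i: int) -> int:
    """Number of 1 bits in the binary representation of i."""
    return bin(i).count("1")


def leaf_code(serial: int) -> int:
    """Post-order code of the leaf with the given serial number."""
    return 2 * (serial - 1) - ones(serial - 1) + 1


def top_code(serial: int) -> int:
    """Code of the topmost node whose rightmost leaf has this serial."""
    return 2 * serial - ones(serial)


def cover_codes(lower: int, upper: int) -> list:
    """Codes of nodes that partition leaf serials lower..upper (inclusive)."""
    codes = []
    if lower > upper:
        return codes
    # Step b: an even serial is a right child, so it is taken on its own.
    if lower % 2 == 0:
        codes.append(leaf_code(lower))
        lower += 1
    # Step c: an odd serial is a left child, so it is taken on its own.
    if lower <= upper and upper % 2 == 1:
        codes.append(leaf_code(upper))
        upper -= 1
    while lower <= upper:
        top = top_code(upper)                      # step d
        while True:
            height = top - leaf_code(upper)
            leftmost = upper - 2 ** height + 1     # step e
            if leftmost >= lower:
                break
            top -= 1                               # step g: right child
        codes.append(top)                          # steps f and h
        upper = leftmost - 1
    return sorted(codes)
```

No node is read while the cover is computed; only the final list of codes is fetched from storage, which is how the method avoids recursive traversal. For example, leaves 3 to 10 are covered by the three nodes with codes 6, 14 and 18, spanning leaves 3 to 4, 5 to 8 and 9 to 10.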
The present invention proposes an efficient indexing method supporting aggregation operations over time series data. Its advantages are:
1. It supports ad hoc queries for simple aggregation operations. During a query, the indexing mechanism avoids a large amount of disk overhead, which solves the performance decline that materialized views and summary tables suffer as data volume grows;
2. It combines a summary table with a segment tree, building on the summary table a segment forest made up of multiple segment trees, which avoids full table scans of the summary table;
3. The segment forest is built dynamically from the bottom up, which sidesteps the inability of a traditional segment tree to grow. In addition, the query algorithm locates index data directly by calculation, avoiding recursive traversal of the segment forest and reducing the number of disk I/O operations;
4. The indexing mechanism is independent of the underlying database; with the self-implemented Java-based query engine, it can be ported easily to any database platform.
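The direct positioning named in advantage 3 rests on a counting identity behind the expression 2i-ones(i). The derivation below is a reader's reconstruction, not part of the patent text; it assumes that after i leaves the forest holds one perfect binary tree per 1 bit of i, which is what merging equal-height roots produces. B(i) and N(i) are notation introduced here for the set of 1-bit positions of i and the total number of nodes.

```latex
\begin{aligned}
i &= \sum_{k \in B(i)} 2^{k}, \qquad |B(i)| = \mathrm{ones}(i),\\
N(i) &= \sum_{k \in B(i)} \left(2^{k+1} - 1\right) = 2i - \mathrm{ones}(i),\\
\mathrm{code}(\text{leaf } i) &= N(i-1) + 1 = 2(i-1) - \mathrm{ones}(i-1) + 1 .
\end{aligned}
```

Since codes are assigned consecutively in post-order, the last node created when leaf i arrives has code N(i), and leaf i itself takes the first code after the N(i-1) nodes that already exist.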
Brief description of the drawings
The preferred embodiments of the present invention are further described below, by way of non-limiting example, with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the summary information corresponding to a group of data items.
Fig. 2 shows the summary forest and the time windows defined by the method of the invention.
Fig. 3 shows the addition of a node with an odd serial number (top) and with an even serial number (bottom).
Fig. 4 is pseudocode for the add-leaf-node algorithm of the invention.
Fig. 5 is pseudocode for the query algorithm of the invention.
Detailed description of the invention
The embodiment carries out Step 1 (defining the data model of the time series data and the query requirement) and Step 2 (constructing and querying the summary forest) as set out in the Summary of the invention above. Figs. 4 and 5 give pseudocode for adding a leaf node and for the query procedure.

Claims (1)

1. An indexing method supporting aggregate functions over time series data, characterized by comprising two steps:
Step 1: define the data model of the time series data and the query requirement
Definition 1 (data item): a data item D (data point) is a triple (s, t, v), where s is the sensor ID, t is the timestamp, and v is the sensor value; s and t together form a globally unique identifier. The data items of one sensor over continuous time form the time series data. On this basis the query problem to be solved is defined: on the time series data, query statistical information, such as value and variance statistics, of the time series data within a time window t_1~t_2 (t_1 and t_2 being arbitrary times);
Definition 2 (summary information): in the time series data, the statistical information of k data items that are consecutive in time, together with their time window, constitutes one summary (data digest);
Definition 3 (leaf node): a summary produced directly from data items, plus specific label information, constitutes a leaf node (leaf node);
Definition 4 (intermediate node): aggregating 2 leaf nodes or 2 intermediate nodes and adding specific label information yields an intermediate node (parent node). To avoid recursive operations on the trees and to allow fast retrieval over the summary forest, the necessary label information is attached to leaf nodes and intermediate nodes: the serial number and the code;
Definition 5 (serial number): when the index is first built, each leaf node is assigned one serial number (serial) in order of generation; serial numbers start at 1 and increase, and intermediate nodes have no serial number;
Definition 6 (code): each node is assigned one code (code) in the order of a post-order traversal of the segment forest; codes start at 1 and increase;
Definition 7 (summary forest): the summary forest (Synopsis Forest) is the forest formed by the summary trees that the nodes produce.
Step 2: construction and querying of the summary forest
(1) Summary forest construction
The summary forest maintains a stack (rootStack) to make merging efficient, and a queue (queue) that temporarily holds the nodes waiting to be flushed to disk.
a. When the i-th leaf node arrives:
i. If i is odd:
1) The leaf node is added directly and forms a tree on its own. Its serial number is i and its code is 2i-ones(i), where ones(i) is the number of 1s in the binary representation of i;
2) The leaf node is added to rootStack and to the queue;
ii. If i is even:
1) Adding the leaf node also triggers the generation of a new tree. The serial number of the leaf node is i, and its code is the code of leaf node (i-1) plus 1, that is, 2(i-1)-ones(i-1)+1;
2) The leaf node is added to the queue;
3) The root of the new tree produced by this leaf node has code 2i-ones(i); the other newly generated intermediate nodes have codes 2(i-1)-ones(i-1)+2 through 2i-ones(i)-1, in order;
4) The leaf node is pushed onto rootStack;
5) The top two nodes of rootStack are popped; both are roots of the same height, and they are merged into a new tree. Over successive merges the code of the new root rises from 2(i-1)-ones(i-1)+2 to 2i-ones(i);
6) The root generated in step 1-a-ii-5 is added to the queue;
7) The root generated in step 1-a-ii-5 is pushed onto rootStack, and step 1-a-ii-5 is repeated until the code of the newly generated root reaches 2i-ones(i);
b. The nodes held in the queue are flushed to disk;
(2) Summary forest query
1) First define the query requirement: query the summary information of the data items in the time window t_a~t_b.
2) The query proceeds as follows:
A. Normalize the time window. Suppose t_is < t_a < t_ie and t_js < t_b < t_je, where t_ks and t_ke denote the start and end times of the k-th summary; the query window can then be split into 3 time windows: t_a~t_ie, t_(i+1)s~t_(j-1)e, and t_js~t_b;
B. For the windows t_a~t_ie and t_js~t_b, read the data items from t_a to t_ie and from t_js to t_b directly from the database, and compute the summary information of these windows directly from the data items;
C. For the window t_(i+1)s~t_(j-1)e, find the minimum number of segments in the segment forest such that these segments form a partition of the window t_(i+1)s~t_(j-1)e. Suppose s segments are needed in total; read the summary nodes corresponding to these s segments from the database in turn to obtain s summaries. This step is implemented as follows:
a) Read the 2 corresponding summary packets from the database according to the times t_(i+1)s and t_(j-1)e, and obtain the corresponding serial numbers i and j from them;
b) Obtain the lower-bound serial number: if i is even, add the summary packet corresponding to t_(i+1)s to the pending queue, and the lower-bound serial number is (i+1); otherwise the lower-bound serial number is i;
c) Obtain the upper-bound serial number: if j is odd, add the summary packet corresponding to t_(j-1)e to the pending queue, and the upper-bound serial number is (j-1); otherwise the upper-bound serial number is j;
d) From the upper-bound serial number, compute its code and the code of the topmost node that covers the node with this serial number;
e) From that code and the code of the topmost node, compute the serial number of the leftmost leaf node covered by the topmost node;
f) If the leftmost serial number is greater than the lower-bound serial number, add the code of the topmost node to the to-be-queried queue, set the upper-bound serial number to the serial number of the leftmost leaf node minus 1, and go to step d;
g) If the leftmost serial number is less than the lower-bound serial number, decrease the code of the topmost node by 1 and go to step e;
h) If the leftmost serial number is equal to the lower-bound serial number, add the code of the topmost node to the to-be-queried queue and exit the loop;
i) Finally, look up the corresponding summary packets according to the to-be-queried queue and add them to the pending queue.
The summary information of the time window t_a~t_b can then be computed from the (s+2) summaries obtained in steps B and C.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610536956.2A CN106202384A (en) 2016-07-08 2016-07-08 An indexing method supporting aggregate functions over time series data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610536956.2A CN106202384A (en) 2016-07-08 2016-07-08 An indexing method supporting aggregate functions over time series data

Publications (1)

Publication Number Publication Date
CN106202384A (en) 2016-12-07

Family

ID=57473329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610536956.2A An indexing method supporting aggregate functions over time series data (Pending, CN106202384A (en)) 2016-07-08 2016-07-08

Country Status (1)

Country Link
CN (1) CN106202384A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991137A (en) * 2017-03-15 2017-07-28 浙江大学 The method that summary forest is indexed to time series data is hashed based on Hbase
CN108268589A (en) * 2017-12-05 2018-07-10 北京百度网讯科技有限公司 Aggregate query method, apparatus, computer equipment and the readable medium of time series data
CN109241121A (en) * 2017-06-29 2019-01-18 阿里巴巴集团控股有限公司 The storage of time series data and querying method, device, system and electronic equipment
CN109948007A (en) * 2019-03-21 2019-06-28 浙江邦盛科技有限公司 A kind of clock synchronization ordinal number maximum processing method for being increased continuously number and number of increments according to statistics
CN110008544A (en) * 2019-03-21 2019-07-12 浙江邦盛科技有限公司 A kind of processing method of clock synchronization ordinal number number of increments and reduced degree according to statistics
CN112069164A (en) * 2019-06-10 2020-12-11 北京百度网讯科技有限公司 Data query method and device, electronic equipment and computer readable storage medium
CN113535712A (en) * 2021-06-04 2021-10-22 山东大学 Method and system for supporting large-scale time sequence data interaction based on line segment KD tree
CN114547073A (en) * 2022-02-10 2022-05-27 清华大学 Aggregation query method and device for time series data and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101859323A (en) * 2010-05-31 2010-10-13 Guangxi University Ciphertext full-text search system
CN105389370A (en) * 2015-11-13 2016-03-09 Zhejiang University of Technology Time aggregation query method oriented to social activity organization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
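Huang Xiangdong et al.: "An index supporting aggregate functions for time series data", Journal of Tsinghua University (Science and Technology) *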

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991137A (en) * 2017-03-15 2017-07-28 Zhejiang University Method for indexing time series data based on an HBase hash summary forest
CN106991137B (en) * 2017-03-15 2019-10-18 Zhejiang University Method for indexing time series data based on an HBase hash summary forest
CN109241121A (en) * 2017-06-29 2019-01-18 Alibaba Group Holding Ltd. Storage and query method, apparatus, and system for time series data, and electronic device
CN108268589A (en) * 2017-12-05 2018-07-10 Beijing Baidu Netcom Science and Technology Co., Ltd. Aggregate query method and apparatus for time series data, computer device, and readable medium
CN109948007A (en) * 2019-03-21 2019-06-28 Zhejiang Bangsheng Technology Co., Ltd. Processing method for querying the maximum numbers of consecutive increases and decreases in time series data statistics
CN110008544A (en) * 2019-03-21 2019-07-12 Zhejiang Bangsheng Technology Co., Ltd. Processing method for counting the numbers of increases and decreases in time series data
CN109948007B (en) * 2019-03-21 2020-07-14 Zhejiang Bangsheng Technology Co., Ltd. Processing method for querying the maximum numbers of consecutive increases and decreases in time series data statistics
CN112069164A (en) * 2019-06-10 2020-12-11 Beijing Baidu Netcom Science and Technology Co., Ltd. Data query method and apparatus, electronic device, and computer-readable storage medium
CN112069164B (en) * 2019-06-10 2023-08-01 Beijing Baidu Netcom Science and Technology Co., Ltd. Data query method and apparatus, electronic device, and computer-readable storage medium
CN113535712A (en) * 2021-06-04 2021-10-22 Shandong University Method and system for supporting large-scale time series data interaction based on a line-segment KD tree
CN113535712B (en) * 2021-06-04 2023-09-29 Shandong University Method and system for supporting large-scale time series data interaction based on a line-segment KD tree
CN114547073A (en) * 2022-02-10 2022-05-27 Tsinghua University Aggregate query method and apparatus for time series data, and storage medium

Similar Documents

Publication Title
CN106202384A (en) A kind of indexing means supporting time series data aggregate function
CN105488231B (en) A big data processing method based on adaptive table dimension partitioning
CN102270232B (en) Semantic data query system with optimized storage
US9400815B2 (en) Method of two pass processing for relational queries in a database system and corresponding database system
US7761474B2 (en) Indexing stored data
CN103823823A (en) Denormalization strategy selection method based on frequent item set mining algorithm
CN106599052B (en) Apache Kylin-based data query system and method
CN102722553A (en) Distributed type reverse index organization method based on user log analysis
CN104504008B (en) A data migration algorithm based on nested SQL to HBase
CN107247799A (en) Data processing method, system and its modeling method of compatible a variety of big data storages
CN112015741A (en) Method and device for storing massive data in different databases and tables
CN103902544A (en) Data processing method and system
CN104346444B (en) An optimal site selection method based on reverse spatial keyword queries over a road network
CN104504018A (en) Top-down real-time big data query optimization method based on bushy tree
CN103678550A (en) Mass data real-time query method based on dynamic index structure
CN103150163A (en) Map/Reduce mode-based parallel relating method
CN106844089A (en) A method and apparatus for recovering tree data storage
CN104731925A (en) MapReduce-based FP-Growth load balance parallel computing method
CN111367951A (en) Method and device for processing stream data
CN107870956A (en) An effective item set mining method, apparatus and data processing equipment
CN110019446A (en) ETL data processing system and method
CN105359142A (en) Hash join method, device and database management system
CN110597805B (en) Memory index structure processing method
CN104008205A (en) Content routing inquiry method and system
CN103761298B (en) Distributed-architecture-based entity matching method

Legal Events

Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 2016-12-07)