CN106547888A - A kind of building method and system of time series databases - Google Patents
- Publication number
- CN106547888A (application CN201610960702.3A)
- Authority
- CN
- China
- Prior art keywords
- tsdb
- data
- hbase
- time series
- uid
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2474—Sequence data queries, e.g. querying versioned data
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A method for constructing a time series database. The method supports storing monitoring data in full detail, which makes it easier to trace problems during monitoring, and the data storage space grows linearly as the hbase cluster is extended, so storage is not a concern. The method includes: (1) hbase cluster preparation: the time series database is designed around the characteristics of monitoring data and implemented on hbase; (2) hbase table creation: the tables tsdb-uid and tsdb are created; (3) monitoring data is ingested; (4) dictionary data is stored in tsdb-uid and the dictionary mapping is established; (5) the data is stored in tsdb in the form of the mapped integer values; (6) the user queries hbase and obtains the monitoring data. A system for constructing a time series database is also provided.
Description
Technical field
The present invention relates to the technical field of big data processing, and in particular to a method for constructing a time series database and a system for constructing a time series database.
Background
Time series databases are mainly used to store and query data in monitoring systems. In some large companies, for example, the operations team collects many kinds of monitoring data from application services and cluster server nodes, covering network devices, operating systems, application status and so on, and needs to store this data and query it in order to trace problems. Because the data volume is huge, storage and querying face serious challenges. To address storage and query performance, existing time series databases generally use RRD (Round Robin Database). Such a database periodically samples the stored data (the sampling strategies include maximum, minimum and average) and stores the sampled data as the monitoring history. In other words, RRD does not store the detailed historical data that was collected. This is done mainly to reduce the amount of data stored; it also keeps queries from being too slow, because the data volume is small.
The structure of RRD is described in detail below:
RRD stores data in a space of fixed size and keeps a pointer to the position of the newest data. The storage space can be pictured as a circle with many tick marks on it, where each tick mark is a position in which data can be stored. The pointer can be thought of as a straight line from the center of the circle to one of these tick marks, and it moves automatically as data is read and written. Note that the circle has no beginning and no end, so the pointer can keep moving without ever reaching an end point beyond which it cannot advance. Over time, once the whole space has been filled with data, storage starts again from the beginning. The size of the whole storage space is therefore a fixed value.
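The fixed-size circular storage described above can be illustrated with a short Python sketch. This is only an illustration of the general ring-buffer idea, not the actual RRD implementation; the class name `RoundRobinStore` and its methods are hypothetical.

```python
class RoundRobinStore:
    """Fixed-size circular store; the newest value overwrites the oldest."""

    def __init__(self, slots: int) -> None:
        if slots <= 0:
            raise ValueError("slots must be positive")
        self._data = [None] * slots   # the fixed storage space (the "circle")
        self._next = 0                # slot the pointer will write to next
        self._count = 0               # number of slots filled so far

    def append(self, value: float) -> None:
        self._data[self._next] = value
        # Wrap around: after the last slot the pointer returns to slot 0.
        self._next = (self._next + 1) % len(self._data)
        self._count = min(self._count + 1, len(self._data))

    def values(self) -> list:
        """Return the stored values from oldest to newest."""
        if self._count < len(self._data):
            return self._data[:self._count]
        return self._data[self._next:] + self._data[:self._next]


store = RoundRobinStore(slots=3)
for v in [1.0, 2.0, 3.0, 4.0]:
    store.append(v)
print(store.values())  # the first value (1.0) has been overwritten
```

Because the space never grows, the oldest detail is lost as soon as the circle is full, which is the limitation the rest of the document sets out to address.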
Relatedly, a Chinese patent application (application number: CN201610287546.9) provides a monitoring data storage method based on the time series database InfluxDB.
Existing time series databases generally use RRD (Round Robin Database). Data entering RRD is first sampled, and only the sampled data is actually stored, so the historical data held in RRD is sampled data rather than the actual data. RRD therefore cannot adequately meet the need in application scenarios that require the actual detailed data, or when erroneous historical data has to be corrected or traced back.
Summary of the invention
To overcome the defects of the prior art, the technical problem to be solved by the present invention is to provide a method for constructing a time series database that supports storing monitoring data in full detail, makes it easier to trace problems during monitoring, and whose data storage space grows linearly as the hbase cluster is extended, so that storage is not a concern.
The technical solution of the present invention is as follows. The method for constructing a time series database comprises the following steps:
(1) hbase cluster preparation: the time series database is designed around the characteristics of monitoring data and implemented on hbase;
(2) hbase table creation: the tables tsdb-uid and tsdb are created;
(3) monitoring data is ingested;
(4) dictionary data is stored in tsdb-uid and the dictionary mapping is established;
(5) the data is stored in tsdb in the form of the mapped integer values;
(6) the user queries hbase and obtains the monitoring data.
Because the present invention is designed around the characteristics of monitoring data and the time series database is implemented on hbase, it supports storing monitoring data in full detail, which makes it easier to trace problems during monitoring. It also takes advantage of the characteristics of hbase: storage performance and computing performance scale as the hbase cluster is extended, so the data storage space grows linearly with the hbase cluster and storage is not a concern.
A system for constructing a time series database is also provided. The system comprises:
an hbase cluster preparation module, configured so that the time series database is designed around the characteristics of monitoring data and implemented on hbase;
an hbase table creation module, configured to create the tables tsdb-uid and tsdb;
a monitoring data ingestion module, configured to ingest monitoring data;
a mapping module, configured to store dictionary data in tsdb-uid and establish the dictionary mapping;
a storage module, configured to store the data in tsdb in the form of the mapped integer values;
a query module, configured to let the user query hbase and obtain the monitoring data.
Brief description of the drawings
Fig. 1 is a flow chart of the method for constructing a time series database according to the present invention.
Fig. 2 is a schematic diagram of the table tsdb-uid according to the present invention.
Fig. 3 is a schematic diagram of the table tsdb according to the present invention.
Detailed description
As shown in Fig. 1, the method for constructing a time series database comprises the following steps:
(1) hbase cluster preparation: the time series database is designed around the characteristics of monitoring data and implemented on hbase;
(2) hbase table creation: the tables tsdb-uid and tsdb are created;
(3) monitoring data is ingested;
(4) dictionary data is stored in tsdb-uid and the dictionary mapping is established;
(5) the data is stored in tsdb in the form of the mapped integer values;
(6) the user queries hbase and obtains the monitoring data.
Because the present invention is designed around the characteristics of monitoring data and the time series database is implemented on hbase, it supports storing monitoring data in full detail, which makes it easier to trace problems during monitoring. It also takes advantage of the characteristics of hbase: storage performance and computing performance scale as the hbase cluster is extended, so the data storage space grows linearly with the hbase cluster and storage is not a concern.
In addition, as shown in Table 1, the fields in step (1) include: metric, the name of the monitored indicator; tagk and tagv, the keys and values of the attribute tags of the monitoring data; value, the value of the monitored indicator; and timestamp, the time of monitoring.
Field | Description |
---|---|
metric | Name of the monitored indicator |
tags | Attribute tags of the monitoring data |
value | Value of the monitored indicator |
timestamp | Time of monitoring |
Table 1
In addition, in step (2), the tsdb-uid table serves as a dictionary table that stores the mapping relations of metric, tagk and tagv, mapping each of them to a number. When data is actually stored, the concrete values of metric, tagk and tagv are not saved; only the numbers they map to are saved. To make this more concrete, take one monitored data item as an example:
metric: proc.loadavg.1m
timestamp: 1234567890
value: 0.42
tags: host=web42, pool=static
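As a rough illustration, the example item can be written as a small Python record. The `DataPoint` class is hypothetical and is not part of the patent; treating the timestamp as Unix epoch seconds is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataPoint:
    metric: str        # name of the monitored indicator
    timestamp: int     # monitoring time in seconds (assumed Unix epoch)
    value: float       # value of the monitored indicator
    tags: dict = field(default_factory=dict)  # tagk -> tagv


point = DataPoint(
    metric="proc.loadavg.1m",
    timestamp=1234567890,
    value=0.42,
    tags={"host": "web42", "pool": "static"},
)
print(point)
```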
As shown in Fig. 2, the structure of the tsdb-uid table is as follows:
The table serves mainly as a dictionary table that stores the mapping relations of metric, tagk and tagv, mapping each of them to a number: metric -> 3-byte integer, tagk -> 3-byte integer, tagv -> 3-byte integer. When data is actually stored, the concrete values of metric, tagk and tagv are not saved; only the numbers they map to are saved, which reduces the amount of data stored. This is reflected in the tsdb table described below.
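The dictionary mapping can be sketched in Python with an in-memory stand-in for the tsdb-uid table. The class `UidTable`, its method names and the choice of sequential IDs starting at 1 are assumptions made for illustration; the text only states that each name maps to a 3-byte integer.

```python
class UidTable:
    """In-memory stand-in for the tsdb-uid dictionary table."""

    UID_WIDTH = 3  # bytes per UID, as stated in the text

    def __init__(self) -> None:
        # One independent name -> UID mapping per kind (metric, tagk, tagv).
        self._forward = {"metric": {}, "tagk": {}, "tagv": {}}
        self._reverse = {"metric": {}, "tagk": {}, "tagv": {}}

    def get_or_create(self, kind: str, name: str) -> bytes:
        forward = self._forward[kind]
        if name not in forward:
            next_id = len(forward) + 1  # assumed: sequential IDs from 1
            if next_id >= 1 << (8 * self.UID_WIDTH):
                raise OverflowError("3-byte UID space exhausted")
            uid = next_id.to_bytes(self.UID_WIDTH, "big")
            forward[name] = uid
            self._reverse[kind][uid] = name
        return forward[name]

    def name_of(self, kind: str, uid: bytes) -> str:
        """Reverse lookup, needed to turn stored UIDs back into names."""
        return self._reverse[kind][uid]


uids = UidTable()
m = uids.get_or_create("metric", "proc.loadavg.1m")
k = uids.get_or_create("tagk", "host")
v = uids.get_or_create("tagv", "web42")
print(m.hex(), k.hex(), v.hex())
```

Keeping a separate mapping per kind means a tagk and a tagv may share the same number without conflict, since the position in the stored key tells them apart.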
In addition, in step (2), the tsdb table comprises a rowkey and columns. The rowkey contains the fields metric | timestamp | value | host=web42 | pool=static, which are stored as the corresponding 3-byte integers from tsdb-uid. The columns are arranged so that one hour of data is stored in a single row.
As shown in Fig. 3, the structure of the tsdb table is as follows:
1) Rowkey design
To make querying easier, the rowkey is designed to contain the fields:
metric | timestamp | value | host=web42 | pool=static. However, the concrete values are not stored; instead, the corresponding 3-byte integers from the first table, the dictionary table tsdb-uid, are used. In this example the mapping is: proc.loadavg.1m -> 052, host -> 001, web42 -> 028, pool -> 047, static -> 001.
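A possible byte layout for such a rowkey is sketched below in Python. Several details are assumptions rather than statements from the text: the example numbers are read as decimal, the timestamp is aligned to the hour and encoded as a 4-byte big-endian integer, tags are sorted by key, and the value is left out of the key because the column design that follows stores 0.42 as the column value. The helper names are hypothetical.

```python
import struct

# UIDs taken from the example mapping in the text (read as decimal numbers).
METRIC_UID = {"proc.loadavg.1m": 52}
TAGK_UID = {"host": 1, "pool": 47}
TAGV_UID = {"web42": 28, "static": 1}

UID_WIDTH = 3            # bytes per UID, as stated in the text
SECONDS_PER_HOUR = 3600


def uid_bytes(uid: int) -> bytes:
    return uid.to_bytes(UID_WIDTH, "big")


def build_rowkey(metric: str, timestamp: int, tags: dict) -> bytes:
    base = timestamp - timestamp % SECONDS_PER_HOUR   # start of the hour
    key = uid_bytes(METRIC_UID[metric])
    key += struct.pack(">I", base)                    # assumed 4-byte big-endian
    # Assumed: tags sorted by key so one series always yields the same rowkey.
    for tagk in sorted(tags):
        key += uid_bytes(TAGK_UID[tagk]) + uid_bytes(TAGV_UID[tags[tagk]])
    return key


rowkey = build_rowkey("proc.loadavg.1m", 1234567890,
                      {"host": "web42", "pool": "static"})
print(len(rowkey), rowkey.hex())
```

With two tags the key is 3 + 4 + 2 × 6 = 19 bytes, regardless of how long the metric and tag names are.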
2) Column design
To make it easier to save further space later, the design stores one hour of data in a single row. The timestamp 1234567890 above is therefore first aligned to the hour (3600 seconds), giving 1234566000; the remainder is 1890, meaning that the data point falls at the 1890th second within that hour. 1890 is then used as the column name and 0.42 as the column value.
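The arithmetic can be checked with a short Python sketch. The function `split_timestamp`, the helper `put` and the in-memory `rows` dictionary are hypothetical stand-ins for the hbase row and column; they are not part of the patent.

```python
SECONDS_PER_HOUR = 3600


def split_timestamp(timestamp: int) -> tuple:
    """Return (base, offset): the start of the hour and the seconds into it."""
    offset = timestamp % SECONDS_PER_HOUR
    return timestamp - offset, offset


# Hypothetical in-memory row store: rows[base][offset] = value.
rows = {}


def put(timestamp: int, value: float) -> None:
    base, offset = split_timestamp(timestamp)
    rows.setdefault(base, {})[offset] = value


put(1234567890, 0.42)
put(1234567900, 0.40)   # same hour, so it lands in the same row
print(split_timestamp(1234567890))  # (1234566000, 1890)
```

Both points share the base 1234566000, so they occupy one row with columns 1890 and 1900.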
A person of ordinary skill in the art will appreciate that all or part of the steps of the method of the above embodiment can be carried out by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, performs each step of the method of the above embodiment. The storage medium may be ROM/RAM, a magnetic disk, an optical disc, a memory card and so on. Accordingly, corresponding to the method of the present invention, the present invention also includes a system for constructing a time series database, generally represented in the form of functional modules corresponding to the steps of the method. A system using the method comprises:
an hbase cluster preparation module, configured so that the time series database is designed around the characteristics of monitoring data and implemented on hbase;
an hbase table creation module, configured to create the tables tsdb-uid and tsdb;
a monitoring data ingestion module, configured to ingest monitoring data;
a mapping module, configured to store dictionary data in tsdb-uid and establish the dictionary mapping;
a storage module, configured to store the data in tsdb in the form of the mapped integer values;
a query module, configured to let the user query hbase and obtain the monitoring data.
In addition, in the hbase cluster preparation module, the fields include: metric, the name of the monitored indicator; tagk and tagv, the keys and values of the attribute tags of the monitoring data; value, the value of the monitored indicator; and timestamp, the time of monitoring.
In addition, in the hbase table creation module, the tsdb-uid table serves as a dictionary table that stores the mapping relations of metric, tagk and tagv and maps metric, tagk and tagv to numbers; when data is actually stored, only the numbers they map to are saved.
In addition, in the hbase table creation module, the tsdb table comprises a rowkey and columns. The rowkey contains the fields metric | timestamp | value | host=web42 | pool=static, which are stored as the corresponding 3-byte integers from tsdb-uid. The columns are arranged so that one hour of data is stored in a single row.
The beneficial effects of the present invention are as follows:
1. Monitoring data can be stored in full detail, which makes it easier to trace problems during monitoring;
2. The data storage space grows linearly as the hbase cluster is extended, so storage is not a concern;
3. Data query capacity scales as the hbase cluster is extended, making full use of the characteristics of hbase.
The above is only a preferred embodiment of the present invention and does not limit the present invention in any form. Any simple modification, equivalent variation or adaptation made to the above embodiment in accordance with the technical essence of the present invention still falls within the scope of protection of the technical solution of the present invention.
Claims (8)
1. A method for constructing a time series database, characterized in that the method comprises the following steps:
(1) hbase cluster preparation: the time series database is designed around the characteristics of monitoring data and implemented on hbase;
(2) hbase table creation: the tables tsdb-uid and tsdb are created;
(3) monitoring data is ingested;
(4) dictionary data is stored in tsdb-uid and the dictionary mapping is established;
(5) the data is stored in tsdb in the form of the mapped integer values;
(6) the user queries hbase and obtains the monitoring data.
2. The method for constructing a time series database according to claim 1, characterized in that, in step (1), the fields include: metric, which describes the name of the monitored indicator; tagk and tagv, which describe the attribute tags of the monitoring data; value, which describes the value of the monitored indicator; and timestamp, which describes the time of monitoring.
3. The method for constructing a time series database according to claim 2, characterized in that, in step (2), the tsdb-uid table serves as a dictionary table that stores the mapping relations of metric, tagk and tagv and maps metric, tagk and tagv to numbers; when data is actually stored, the concrete values of metric, tagk and tagv are not saved and only the numbers they map to are saved.
4. The method for constructing a time series database according to claim 3, characterized in that, in step (2), the tsdb table comprises a rowkey and columns; the rowkey contains the fields metric | timestamp | value | host=web42 | pool=static, which are stored as the corresponding 3-byte integers from tsdb-uid; the columns are configured so that one hour of data is stored in a single row.
5. A system for constructing a time series database, characterized in that the system comprises:
an hbase cluster preparation module, configured so that the time series database is designed around the characteristics of monitoring data and implemented on hbase;
an hbase table creation module, configured to create the tables tsdb-uid and tsdb;
a monitoring data ingestion module, configured to ingest monitoring data;
a mapping module, configured to store dictionary data in tsdb-uid and establish the dictionary mapping;
a storage module, configured to store the data in tsdb in the form of the mapped integer values;
a query module, configured to let the user query hbase and obtain the monitoring data.
6. The system for constructing a time series database according to claim 5, characterized in that, in the hbase cluster preparation module, the fields include: metric, which describes the name of the monitored indicator; tagk and tagv, which describe the attribute tags of the monitoring data; value, which describes the value of the monitored indicator; and timestamp, which describes the time of monitoring.
7. The system for constructing a time series database according to claim 6, characterized in that, in the hbase table creation module, the tsdb-uid table serves as a dictionary table that stores the mapping relations of metric, tagk and tagv and maps metric, tagk and tagv to numbers; when data is actually stored, only the numbers they map to are saved.
8. The system for constructing a time series database according to claim 7, characterized in that, in the hbase table creation module, the tsdb table comprises a rowkey and columns; the rowkey contains the fields metric | timestamp | value | host=web42 | pool=static, which are stored as the corresponding 3-byte integers from tsdb-uid; the columns are configured so that one hour of data is stored in a single row.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610960702.3A CN106547888A (en) | 2016-11-04 | 2016-11-04 | A kind of building method and system of time series databases |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610960702.3A CN106547888A (en) | 2016-11-04 | 2016-11-04 | A kind of building method and system of time series databases |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106547888A true CN106547888A (en) | 2017-03-29 |
Family
ID=58393981
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610960702.3A Pending CN106547888A (en) | 2016-11-04 | 2016-11-04 | A kind of building method and system of time series databases |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106547888A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111563019A (en) * | 2020-04-29 | 2020-08-21 | 厦门市美亚柏科信息股份有限公司 | Service assembly monitoring method, system and computer storage medium |
CN111813782A (en) * | 2020-07-14 | 2020-10-23 | 杭州海康威视数字技术股份有限公司 | Time sequence data storage method and device |
CN112084226A (en) * | 2019-06-13 | 2020-12-15 | 北京京东尚科信息技术有限公司 | Data processing method, system, device and computer readable storage medium |
CN112506735A (en) * | 2020-11-26 | 2021-03-16 | 中移(杭州)信息技术有限公司 | Service quality monitoring method, system, server and storage medium |
CN112579390A (en) * | 2020-12-04 | 2021-03-30 | 麒麟软件有限公司 | Monitoring data storage method and system based on real-time memory TSDB alarm |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105430030A (en) * | 2014-09-16 | 2016-03-23 | 钛马信息网络技术有限公司 | OSG-based parallel extendable application server |
CN105490864A (en) * | 2014-09-16 | 2016-04-13 | 钛马信息网络技术有限公司 | Business module monitoring method based on OSGI |
CN105930491A (en) * | 2016-04-28 | 2016-09-07 | 安徽四创电子股份有限公司 | Monitoring data storage method based on time sequence database InfluxDB |
Non-Patent Citations (1)
Title |
---|
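SSTUTU: "[介绍解说] 整体认识OpenTSDB:OpenTSDB基于HBase编写的分布式、可扩展的时间序列数据库" [Introduction: an overall look at OpenTSDB, a distributed, scalable time series database built on HBase], 《ABOUTYUN.COM》 * |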
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112084226A (en) * | 2019-06-13 | 2020-12-15 | 北京京东尚科信息技术有限公司 | Data processing method, system, device and computer readable storage medium |
CN111563019A (en) * | 2020-04-29 | 2020-08-21 | 厦门市美亚柏科信息股份有限公司 | Service assembly monitoring method, system and computer storage medium |
CN111813782A (en) * | 2020-07-14 | 2020-10-23 | 杭州海康威视数字技术股份有限公司 | Time sequence data storage method and device |
CN112506735A (en) * | 2020-11-26 | 2021-03-16 | 中移(杭州)信息技术有限公司 | Service quality monitoring method, system, server and storage medium |
CN112506735B (en) * | 2020-11-26 | 2022-12-13 | 中移(杭州)信息技术有限公司 | Service quality monitoring method, system, server and storage medium |
CN112579390A (en) * | 2020-12-04 | 2021-03-30 | 麒麟软件有限公司 | Monitoring data storage method and system based on real-time memory TSDB alarm |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106547888A (en) | A kind of building method and system of time series databases | |
Poorthuis et al. | Making big data small: strategies to expand urban and geographical research using social media | |
CN102902730B (en) | Based on data reading method and the device of data buffer storage | |
CN107038207A (en) | A kind of data query method, data processing method and device | |
CN106202548B (en) | Date storage method, lookup method and device | |
CN107423422B (en) | Spatial data distributed storage and search method and system based on grid | |
CN100468402C (en) | Sort data storage and split catalog inquiry method based on catalog tree | |
Zhang et al. | Hbasespatial: A scalable spatial data storage based on hbase | |
CN110275920A (en) | Data query method, apparatus, electronic equipment and computer readable storage medium | |
CN102231168B (en) | Method for quickly retrieving resume from resume database | |
CN110109910A (en) | Data processing method and system, electronic equipment and computer readable storage medium | |
CN103902653A (en) | Method and device for creating data warehouse table blood relationship graph | |
US20220019739A1 (en) | Item Recall Method and System, Electronic Device and Readable Storage Medium | |
CN104021198A (en) | Relational database information retrieval method and device based on ontology semantic index | |
US10762068B2 (en) | Virtual columns to expose row specific details for query execution in column store databases | |
CN103778133A (en) | Database object changing method and device | |
JP7153420B2 (en) | Using B-Trees to Store Graph Information in a Database | |
CN103455335A (en) | Multilevel classification Web implementation method | |
CN108038018A (en) | Expansible daily record data storage method and device | |
CN107908794A (en) | A kind of method of data mining, system, equipment and computer-readable recording medium | |
CN109933803A (en) | A kind of Chinese idiom information displaying method shows device, electronic equipment and storage medium | |
JP2022137281A (en) | Data query method, device, electronic device, storage medium, and program | |
CN116339643B (en) | Formatting method, formatting device, formatting equipment and formatting medium for disk array | |
CN104391947A (en) | Real-time processing method and system of mass GIS (geographic information system) data | |
JP2014130492A (en) | Generation method for index and computer system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170329 |