CN106547888A

CN106547888A - Method and system for constructing a time series database

Info

Publication number: CN106547888A
Application number: CN201610960702.3A
Authority: CN
Inventors: 何良均; 张翼; 温宗臣; 冯森林; 李冰; 张书凡; 范卫卫; 赵志华
Original assignee: BEIJING GEO POLYMERIZATION TECHNOLOGY Co Ltd
Current assignee: BEIJING GEO POLYMERIZATION TECHNOLOGY Co Ltd
Priority date: 2016-11-04
Filing date: 2016-11-04
Publication date: 2017-03-29

Abstract

A method for constructing a time series database that supports storing detailed monitoring data and makes it easier to monitor and trace problems. Data storage space expands linearly as the HBase cluster is extended, so storage is not a concern. The method includes: (1) HBase cluster preparation: the design is based on the characteristics of monitoring data, and the time series database is implemented on HBase; (2) HBase table creation: create the tables tsdb-uid and tsdb; (3) ingest monitoring data; (4) store dictionary data in tsdb-uid and establish the dictionary mapping; (5) store the data in tsdb in the form of mapped integer values; (6) the user queries HBase to obtain monitoring data. A system for constructing a time series database is also provided.

Description

Method and system for constructing a time series database

Technical field

The present invention relates to the technical field of big data processing, and in particular to a method for constructing a time series database and a system for constructing a time series database.

Background technology

Time series databases are mainly used to store and query data in monitoring systems. In large companies, for example, operations teams collect monitoring data from application services and cluster server nodes, covering the state of network devices, operating systems, applications, and so on, and need to store this data and query it to track down problems. Because the data volume is huge, both storage and querying face serious challenges. To address storage and query performance, existing time series databases generally use RRD (Round Robin Database). Such a database periodically samples the stored data (sampling policies include maximum, minimum, and average) and stores the sampled values as the monitoring history. In other words, RRD does not store the detailed historical data that was collected. The main purpose is to reduce the amount of data stored; queries also respond reasonably quickly because the data volume is small.

The following is a detailed description of the RRD structure:

RRD stores data in a fixed-size space and keeps a pointer to the position of the newest data. The storage space can be pictured as a circle with many tick marks, where each tick mark is a slot in which data is stored. The pointer can be thought of as a line from the center of the circle to one of these marks, and it moves automatically as data is read and written. The circle has no start and no end, so the pointer can keep moving indefinitely without ever running out of positions. Over time, once all of the space has been filled with data, storage starts again from the beginning. The size of the whole storage space is therefore a fixed value.
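The circular layout described above can be sketched as a small ring buffer. This is only an illustration under assumptions: the class name `RingStore` and its methods are hypothetical and do not come from any RRD implementation.

```python
class RingStore:
    """Fixed-size circular store: once full, new writes overwrite the oldest slot."""

    def __init__(self, size):
        self.slots = [None] * size   # the "tick marks" on the circle
        self.pointer = 0             # index of the next slot to write
        self.count = 0               # number of slots currently holding data

    def write(self, value):
        self.slots[self.pointer] = value
        self.pointer = (self.pointer + 1) % len(self.slots)  # wrap around
        self.count = min(self.count + 1, len(self.slots))

    def latest(self):
        """Return the most recently written value, or None if nothing is stored."""
        if self.count == 0:
            return None
        return self.slots[(self.pointer - 1) % len(self.slots)]

    def in_order(self):
        """Return the stored values from oldest to newest."""
        if self.count < len(self.slots):
            return self.slots[:self.count]
        return self.slots[self.pointer:] + self.slots[:self.pointer]


store = RingStore(3)
for v in [0.1, 0.2, 0.3, 0.4]:
    store.write(v)
print(store.in_order())  # the first value written, 0.1, is no longer present
```

The storage footprint never grows beyond the three slots, which is the fixed-size property the description relies on.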

Relatedly, Chinese patent application No. CN201610287546.9 provides a monitoring data storage method based on the time series database InfluxDB.

Existing time series databases generally use RRD (Round Robin Database). Data entering RRD is first sampled, and only the sampled data is actually stored, so the historical data held in RRD is post-sampling data rather than the real data. In scenarios that require the real detailed data, or when erroneous historical data needs to be corrected and traced back, RRD cannot meet the requirement well.
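The effect of this sampling can be sketched in a few lines of Python. This is an illustration under assumptions: the function name `consolidate`, the window size, and the sample values are invented here and are not taken from any RRD implementation.

```python
from statistics import mean


def consolidate(samples, window, how):
    """Reduce each run of `window` raw samples to a single stored value."""
    funcs = {"max": max, "min": min, "avg": mean}
    reduce_fn = funcs[how]
    return [reduce_fn(samples[i:i + window])
            for i in range(0, len(samples), window)]


raw = [0.25, 0.25, 4.0, 0.5, 0.5, 0.5]   # one short spike in the raw data
averaged = consolidate(raw, 3, "avg")
peaks = consolidate(raw, 3, "max")
print(averaged, peaks)
```

With averaging, the spike of 4.0 is stored as 1.5, and with any single policy the six raw readings can no longer be recovered from the two stored values.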

Summary of the invention

To overcome the defects of the prior art, the technical problem to be solved by the present invention is to provide a method for constructing a time series database that supports storing detailed monitoring data, makes it easier to monitor and trace problems, and whose data storage space expands linearly as the HBase cluster is extended, so that storage is not a concern.

The technical solution of the present invention is a method for constructing a time series database, comprising the following steps:

(1) HBase cluster preparation: the design is based on the characteristics of monitoring data, and the time series database is implemented on HBase;

(2) HBase table creation: create the tables tsdb-uid and tsdb;

(3) ingest monitoring data;

(4) store dictionary data in tsdb-uid and establish the dictionary mapping;

(5) store the data in tsdb in the form of mapped integer values;

(6) the user queries HBase to obtain monitoring data.

The present invention is designed around the characteristics of monitoring data and implements the time series database on HBase, so it supports storing detailed monitoring data and makes it easier to monitor and trace problems. Because it builds on the characteristics of HBase, storage performance and computing performance also scale as the HBase cluster is extended. Data storage space therefore expands linearly with the cluster, and storage is not a concern.

A system for constructing a time series database is also provided. The system includes:

an HBase cluster preparation module, configured to carry out a design based on the characteristics of monitoring data, the time series database being implemented on HBase;

an HBase table creation module, configured to create the tables tsdb-uid and tsdb;

a monitoring data access module, configured to ingest monitoring data;

a mapping module, configured to store dictionary data in tsdb-uid and establish the dictionary mapping;

a storage module, configured to store data in tsdb in the form of mapped integer values;

a query module, configured to let the user query HBase to obtain monitoring data.

Brief description of the drawings

Fig. 1 is a flowchart of the method for constructing a time series database according to the present invention.

Fig. 2 is a schematic diagram of the table tsdb-uid according to the present invention.

Fig. 3 is a schematic diagram of the table tsdb according to the present invention.

Detailed description of the embodiments

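As shown in Fig. 1, the method for constructing a time series database comprises the following steps:

(1) HBase cluster preparation: the design is based on the characteristics of monitoring data, and the time series database is implemented on HBase;

(2) HBase table creation: create the tables tsdb-uid and tsdb;

(3) ingest monitoring data;

(4) store dictionary data in tsdb-uid and establish the dictionary mapping;

(5) store the data in tsdb in the form of mapped integer values;

(6) the user queries HBase to obtain monitoring data.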

In addition, as shown in Table 1, the fields in step (1) include: metric, the name of the monitored metric; tagk and tagv, the keys and values of the attribute tags of the monitoring data; value, the value of the monitored metric; and timestamp, the time of the measurement.

Field	Description
metric	Name of the monitored metric
tags	Attribute tags of the monitoring data
value	Value of the monitored metric
timestamp	Time of the measurement

Table 1

In addition, in step (2), the tsdb-uid table serves as a dictionary table that stores the mappings for metric, tagk, and tagv, mapping each of them to a number. When data is actually stored, the concrete values of metric, tagk, and tagv are not saved; only the numbers they map to are saved. To make this concrete, take one monitoring data item as an example:

metric: proc.loadavg.1m

timestamp: 1234567890

value: 0.42

tags: host=web42, pool=static
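As a minimal sketch, the example item above can be held in a small Python record type before it is encoded for storage. The class name `DataPoint` is hypothetical; only the four field names come from the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataPoint:
    metric: str                                 # name of the monitored metric
    timestamp: int                              # Unix time in seconds
    value: float                                # value of the monitored metric
    tags: dict = field(default_factory=dict)    # tagk -> tagv pairs


point = DataPoint(
    metric="proc.loadavg.1m",
    timestamp=1234567890,
    value=0.42,
    tags={"host": "web42", "pool": "static"},
)
print(point)
```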

As shown in Fig. 2, the structure of the tsdb-uid table is as follows:

As a dictionary table, it mainly stores the mapping relations for metric, tagk, and tagv, mapping each of them to a number: metric -> 3-byte integer, tagk -> 3-byte integer, tagv -> 3-byte integer. When data is actually stored, the concrete values of metric, tagk, and tagv are not saved; only the numbers they map to are saved, which reduces the amount of data stored. This is reflected in the table described below.
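The dictionary mapping can be sketched with an in-memory stand-in for tsdb-uid. Everything beyond the 3-byte width is an assumption made for illustration: the class `UidTable`, its method names, and the policy of handing out numbers sequentially per kind are hypothetical, and no HBase client calls are shown.

```python
class UidTable:
    """In-memory stand-in for the tsdb-uid dictionary table (hypothetical helper)."""

    UID_WIDTH = 3  # bytes per mapped integer, as stated in the text

    def __init__(self):
        # One namespace per kind, so a tag key and a tag value may
        # map to the same number without clashing.
        self.forward = {"metric": {}, "tagk": {}, "tagv": {}}
        self.reverse = {"metric": {}, "tagk": {}, "tagv": {}}

    def get_or_create(self, kind, name):
        """Return the 3-byte UID for `name`, assigning a new one if needed."""
        table = self.forward[kind]
        if name not in table:
            uid = len(table) + 1  # assumed policy: next unused number
            if uid >= 1 << (8 * self.UID_WIDTH):
                raise OverflowError("no free 3-byte UIDs left for " + kind)
            table[name] = uid
            self.reverse[kind][uid] = name
        return table[name].to_bytes(self.UID_WIDTH, "big")

    def name_of(self, kind, uid_bytes):
        """Translate a stored UID back to the original string."""
        return self.reverse[kind][int.from_bytes(uid_bytes, "big")]


uids = UidTable()
metric_uid = uids.get_or_create("metric", "proc.loadavg.1m")
host_uid = uids.get_or_create("tagk", "host")
print(metric_uid, uids.name_of("metric", metric_uid))
```

A metric name of any length is stored once in the dictionary and costs three bytes everywhere else, which is where the storage saving comes from.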

In addition, in step (2), the tsdb table comprises a rowkey and columns. The rowkey contains the fields metric | timestamp | value | host=web42 | pool=static, which are stored using the corresponding 3-byte integers from tsdb-uid. The columns are arranged so that one hour of data is stored in a single row.

As shown in Fig. 3, the structure of the tsdb table is as follows:

1) Rowkey design

To make querying easier, the rowkey design contains the fields:

metric | timestamp | value | host=web42 | pool=static. What is stored, however, is not the concrete values but the corresponding 3-byte integers from the dictionary table tsdb-uid. For example, the mapping relations here are: proc.loadavg.1m -> 052, host -> 001, web42 -> 028, pool -> 047, static -> 001.
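A rowkey of this kind could be assembled as in the following sketch. Several details are assumptions made for illustration and are not stated in the text: the mapping values are read as decimal integers, the timestamp is packed as a 4-byte big-endian integer, tags are ordered by tag key, and the value field is left out of the key on the assumption that it is carried as the column value. The helper names `uid_bytes` and `build_rowkey` are hypothetical.

```python
import struct

# Mapping values quoted in the text, kept per kind (metric, tag key, tag value).
UIDS = {
    "metric": {"proc.loadavg.1m": 52},
    "tagk": {"host": 1, "pool": 47},
    "tagv": {"web42": 28, "static": 1},
}


def uid_bytes(kind, name):
    """Look up a name and return its mapped integer as 3 big-endian bytes."""
    return UIDS[kind][name].to_bytes(3, "big")


def build_rowkey(metric, timestamp, tags):
    """Concatenate metric UID, timestamp, and tagk/tagv UID pairs."""
    key = uid_bytes("metric", metric)
    key += struct.pack(">I", timestamp)      # assumed: 4-byte seconds
    for tagk in sorted(tags):                # assumed: fixed tag order
        key += uid_bytes("tagk", tagk) + uid_bytes("tagv", tags[tagk])
    return key


rowkey = build_rowkey(
    "proc.loadavg.1m", 1234567890, {"host": "web42", "pool": "static"}
)
print(rowkey.hex())
```

Because every component has a fixed width, rows for the same metric sort next to each other and in time order, which is what makes range scans on the key practical.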

2) Column design

To make it easier to save further space later, the design stores one hour of data in a single row. The timestamp 1234567890 above is therefore first taken modulo one hour (3600 seconds): the hour-aligned base time is 1234566000 and the remainder is 1890, meaning the data point falls at second 1890 within that hour. 1890 is then used as the column name and 0.42 as the column value.
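The arithmetic above can be checked with a short sketch. A dictionary of dictionaries stands in for the tsdb table, and the names `split_timestamp`, `put`, and `query` are hypothetical; a real deployment would issue HBase reads and writes instead.

```python
SECONDS_PER_HOUR = 3600


def split_timestamp(timestamp):
    """Return (row base time, column offset) for a Unix timestamp in seconds."""
    offset = timestamp % SECONDS_PER_HOUR
    return timestamp - offset, offset


# Stand-in for the tsdb table: row base time -> {column offset: value}.
rows = {}


def put(timestamp, value):
    base, offset = split_timestamp(timestamp)
    rows.setdefault(base, {})[offset] = value


def query(start, end):
    """Return (timestamp, value) pairs with start <= timestamp <= end, in time order."""
    result = []
    first_base, _ = split_timestamp(start)
    for base in range(first_base, end + 1, SECONDS_PER_HOUR):
        for offset, value in sorted(rows.get(base, {}).items()):
            ts = base + offset
            if start <= ts <= end:
                result.append((ts, value))
    return result


put(1234567890, 0.42)
put(1234567900, 0.40)   # same hour, so it lands in the same row
print(split_timestamp(1234567890))
print(query(1234567890, 1234567895))
```

Both data points share the row with base time 1234566000 and differ only in their column names, 1890 and 1900.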

A person of ordinary skill in the art will understand that all or some of the steps of the method in the above embodiment can be carried out by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method. The storage medium may be ROM/RAM, a magnetic disk, an optical disc, a memory card, and so on. Accordingly, corresponding to the method, the present invention also includes a system for constructing a time series database, generally expressed as functional modules corresponding to the steps of the method. The system using this method includes:

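an HBase cluster preparation module, configured to carry out a design based on the characteristics of monitoring data, the time series database being implemented on HBase;

an HBase table creation module, configured to create the tables tsdb-uid and tsdb;

a monitoring data access module, configured to ingest monitoring data;

a mapping module, configured to store dictionary data in tsdb-uid and establish the dictionary mapping;

a storage module, configured to store data in tsdb in the form of mapped integer values;

a query module, configured to let the user query HBase to obtain monitoring data.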

In addition, in the HBase cluster preparation module, the fields include: metric, the name of the monitored metric; tagk and tagv, the keys and values of the attribute tags of the monitoring data; value, the value of the monitored metric; and timestamp, the time of the measurement.

In addition, in the HBase table creation module, the tsdb-uid table serves as a dictionary table that stores the mapping relations for metric, tagk, and tagv, mapping each of them to a number. When data is actually stored, only the numbers they map to are saved.

In addition, in the HBase table creation module, the tsdb table comprises a rowkey and columns. The rowkey contains the fields metric | timestamp | value | host=web42 | pool=static, which are stored using the corresponding 3-byte integers from tsdb-uid. The columns are arranged so that one hour of data is stored in a single row.

The beneficial effects of the present invention are as follows:

1. Storing detailed monitoring data is supported, which makes it easier to monitor and trace problems;

2. Data storage space expands linearly as the HBase cluster is extended, so storage is not a concern;

3. Data query capacity scales as the HBase cluster is extended, making full use of the characteristics of HBase.

The above is only a preferred embodiment of the present invention and does not limit the present invention in any form. Any simple modification, equivalent change, or refinement made to the above embodiment in accordance with the technical essence of the present invention still falls within the scope of protection of the technical solution of the present invention.

Claims

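1. A method for constructing a time series database, characterized in that the method comprises the following steps:

(1) HBase cluster preparation: carrying out a design based on the characteristics of monitoring data, the time series database being implemented on HBase;

(2) HBase table creation: creating the tables tsdb-uid and tsdb;

(3) ingesting monitoring data;

(4) storing dictionary data in tsdb-uid and establishing the dictionary mapping;

(5) storing the data in tsdb in the form of mapped integer values;

(6) the user querying HBase to obtain monitoring data.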

2. The method for constructing a time series database according to claim 1, characterized in that in step (1) the fields include: metric, the name of the monitored metric; tagk and tagv, the keys and values of the attribute tags of the monitoring data; value, the value of the monitored metric; and timestamp, the time of the measurement.

3. The method for constructing a time series database according to claim 2, characterized in that in step (2) the tsdb-uid table serves as a dictionary table that stores the mapping relations for metric, tagk, and tagv, mapping each of them to a number; when data is actually stored, the concrete values of metric, tagk, and tagv are not saved, and only the numbers they map to are saved.

4. The method for constructing a time series database according to claim 3, characterized in that in step (2) the tsdb table comprises a rowkey and columns; the rowkey contains the fields metric | timestamp | value | host=web42 | pool=static, which are stored using the corresponding 3-byte integers from tsdb-uid; the columns are arranged so that one hour of data is stored in a single row.

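5. A system for constructing a time series database, characterized in that the system includes:

an HBase cluster preparation module, configured to carry out a design based on the characteristics of monitoring data, the time series database being implemented on HBase;

an HBase table creation module, configured to create the tables tsdb-uid and tsdb;

a monitoring data access module, configured to ingest monitoring data;

a mapping module, configured to store dictionary data in tsdb-uid and establish the dictionary mapping;

a storage module, configured to store data in tsdb in the form of mapped integer values;

a query module, configured to let the user query HBase to obtain monitoring data.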

6. The system for constructing a time series database according to claim 5, characterized in that in the HBase cluster preparation module the fields include: metric, the name of the monitored metric; tagk and tagv, the keys and values of the attribute tags of the monitoring data; value, the value of the monitored metric; and timestamp, the time of the measurement.

7. The system for constructing a time series database according to claim 6, characterized in that in the HBase table creation module the tsdb-uid table serves as a dictionary table that stores the mapping relations for metric, tagk, and tagv, mapping each of them to a number; when data is actually stored, only the numbers they map to are saved.

8. The system for constructing a time series database according to claim 7, characterized in that in the HBase table creation module the tsdb table comprises a rowkey and columns; the rowkey contains the fields metric | timestamp | value | host=web42 | pool=static, which are stored using the corresponding 3-byte integers from tsdb-uid; the columns are arranged so that one hour of data is stored in a single row.