CN105005617B

CN105005617B - A storage method and device for time series data

Info

Publication number: CN105005617B
Application number: CN201510429895.5A
Authority: CN
Inventors: 李江颖; 吴培荣; 程衍明; 陈刚; 杨超
Original assignee: NAVIMENTUM INFORMATION SYSTEM CO Ltd
Current assignee: NAVIMENTUM INFORMATION SYSTEM CO Ltd
Priority date: 2015-07-21
Filing date: 2015-07-21
Publication date: 2018-10-12
Anticipated expiration: 2035-07-21
Also published as: CN105005617A

Abstract

The invention discloses a storage method and device for time series data. The method includes: storing data to be stored into an Hbase database; counting the data volume in the Hbase database to obtain a statistical result; merging the data in the Hbase database during the Nth merge statistics period to obtain a merged file; obtaining, from the statistical result, the data storage amount s for the (N+1)th merge statistics period; judging whether the data storage amount s exceeds the preset data storage amount of the (N+1)th merge statistics period; and, if not, splitting the merged file. The invention solves the technical problem that the prior art can neither meet the storage capacity requirements of the data nor guarantee the stability of data storage.

Description

A storage method and device for time series data

Technical field

The present invention relates to the field of computer technology, and in particular to a storage method and device for time series data.

Background art

In industrial informatization, the real-time historical database has always been a key research direction in the industrial field; the industries involved include petroleum, electric power, metallurgy, the chemical industry, and so on. The real-time historical database occupies a core position in data monitoring, management, and storage. It describes production data with the measuring point as the basic unit, where one measuring point represents one actual data source, such as the voltage on a particular power transmission line or the temperature at a particular test point. To meet the growing transactional demands of industrial applications, a great deal of systematic research on real-time historical databases has been carried out in every field.

With the rapid development of the Internet of Things, smart devices of all kinds are widely used in the industrial field, and more and more data is collected into real-time historical databases. Because production control processes run without interruption 24 hours a day, the total amount of historical data keeps accumulating over time and can reach the TB or even PB level. Current real-time historical databases are deployed on high-performance servers, yet massive historical data still puts pressure on current mainstream server hardware, and the problem is difficult to solve even by expanding the hardware.

Industrial data has typical time series characteristics. The data uploaded by smart devices is usually a data stream with time tags, and the time tag is an important filter condition when querying historical data. Current real-time historical databases use special storage schemes to improve their ability to process time-related sequence data, but the scalability of their storage is still a bottleneck.

Hbase (Hadoop Database, a distributed storage system), whose underlying layer is a distributed file system, has been widely used in recent years as a low-cost distributed storage solution. However, Hbase shows a certain delay when writing massive data: a large volume of data entering Hbase continually triggers split operations on large files inside Hbase, and a file split usually happens after a file merge operation, when the I/O load in the cluster is extremely high. Meanwhile, a file split temporarily takes the server offline, data archiving requests are blocked, the system response time becomes extremely unstable, and the archiving speed of the data cannot be guaranteed.

In summary, faced with the massive data of the industrial field, there is currently no method that both meets the storage capacity requirements of the data and guarantees the stability of data storage.

Summary of the invention

The present application provides a storage method and device for time series data, which solve the technical problem that the prior art can neither meet the storage capacity requirements of the data nor guarantee the stability of data storage.

An embodiment of the present invention provides a storage method for time series data, including:

storing data to be stored into an Hbase database;

counting the data volume in the Hbase database to obtain a statistical result;

merging the data in the Hbase database during the Nth merge statistics period to obtain a merged file;

obtaining, from the statistical result, the data storage amount s for the (N+1)th merge statistics period;

judging whether the data storage amount s exceeds the preset data storage amount of the (N+1)th merge statistics period;

if not, splitting the merged file.

Further, storing the data to be stored into the Hbase database specifically includes:

storing the data to be stored in a buffer;

judging whether the data capacity in the buffer reaches a preset data capacity threshold;

and/or

judging whether the storage period of the buffer reaches a preset storage period threshold;

if the data capacity in the buffer reaches the preset data capacity threshold and/or the storage period of the buffer reaches the preset storage period threshold, storing the data to be stored into the Hbase database.

Further, storing the data to be stored into the Hbase database specifically includes:

using the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database;

using the content of the data to be stored as the archive data of the Hbase database;

storing the row key and the archive data into the Hbase database as a data item of the Hbase database.

Further, counting the data volume in the Hbase database specifically includes:

judging whether the storage moment T₀ of the data to be stored is earlier than a preset earliest statistics moment;

if so, discarding the data to be stored;

judging whether the storage moment T₀ of the data to be stored lies between the preset earliest statistics moment and a preset latest statistics moment;

if so, adding the data to be stored to the existing data volume in the Hbase database to complete the statistics;

judging whether the storage moment T₀ of the data to be stored is later than the preset latest statistics moment;

if so, taking the storage moment T₀ of the data to be stored as the new latest statistics moment, and adding the data to be stored to the existing data volume in the Hbase database to complete the statistics.

Further, merging the data in the Hbase database specifically includes:

judging whether the amount of data written to the Hbase database in the Nth merge statistics period reaches a preset data write threshold;

if so, writing the data items in the Hbase database to an independent file in ascending row key order;

judging whether the number of files under the corresponding server in the Hbase database exceeds a preset file merge threshold;

if so, merging the files in the corresponding server.

An embodiment of the present invention further provides a storage device for time series data, including:

a data storage module, configured to store data to be stored into an Hbase database;

a data statistics module, configured to count the data volume in the Hbase database to obtain a statistical result;

a data merging module, configured to merge the data in the Hbase database during the Nth merge statistics period to obtain a merged file;

a data processing module, configured to obtain, from the statistical result, the data storage amount s for the (N+1)th merge statistics period;

a judging module, configured to judge whether the data storage amount s exceeds the preset data storage amount of the (N+1)th merge statistics period;

a splitting module, configured to split the merged file if the judgment result of the judging module is no.

Further, the data storage module specifically includes:

a data storage execution unit, configured to store the data to be stored in a buffer;

a first judging unit, configured to judge whether the data capacity in the buffer reaches a preset data capacity threshold;

a first data storage subunit, configured to store the data to be stored into the Hbase database if the judgment result of the first judging unit is yes;

a second judging unit, configured to judge whether the storage period of the buffer reaches a preset storage period threshold;

a second data storage subunit, configured to store the data to be stored into the Hbase database if the judgment result of the second judging unit is yes.

Further, the first data storage subunit is specifically configured, if the judgment result of the first judging unit is yes, to use the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database, to use the content of the data to be stored as the archive data of the Hbase database, and to store the row key and the archive data into the Hbase database as a data item of the Hbase database;

the second data storage subunit is specifically configured, if the judgment result of the second judging unit is yes, to use the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database, to use the content of the data to be stored as the archive data of the Hbase database, and to store the row key and the archive data into the Hbase database as a data item of the Hbase database.

Further, the data statistics module specifically includes:

a third judging unit, configured to judge whether the storage moment T₀ of the data to be stored is earlier than a preset earliest statistics moment;

a data discarding unit, configured to discard the data to be stored if the judgment result of the third judging unit is yes;

a fourth judging unit, configured to judge whether the storage moment T₀ of the data to be stored lies between the preset earliest statistics moment and a preset latest statistics moment;

a first statistics object, configured, if the judgment result of the fourth judging unit is yes, to add the data to be stored to the existing data volume in the Hbase database, completing the statistics and obtaining the statistical result;

a fifth judging unit, configured to judge whether the storage moment T₀ of the data to be stored is later than the preset latest statistics moment;

a second statistics object, configured, if the judgment result of the fifth judging unit is yes, to take the storage moment T₀ of the data to be stored as the new latest statistics moment and to add the data to be stored to the existing data volume in the Hbase database, completing the statistics and obtaining the statistical result.

Further, the data merging module specifically includes:

a sixth judging unit, configured to judge whether the amount of data written to the Hbase database in the Nth merge statistics period reaches a preset data write threshold;

a file generating unit, configured to write the data items in the Hbase database to an independent file in ascending row key order if the judgment result of the sixth judging unit is yes;

a seventh judging unit, configured to judge whether the number of files under the corresponding server in the Hbase database exceeds a preset file merge threshold;

a merge execution unit, configured to merge the files in the corresponding server to obtain the merged file if the judgment result of the seventh judging unit is yes.

The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:

1. The data to be stored is stored in the Hbase database, and the data volume in the Hbase database is counted to obtain a statistical result; the data storage amount s for the next merge statistics period is obtained from the statistical result; it is judged whether s exceeds the preset storage amount of data in a merge statistics period; if not, the merged file is split. Because Hbase is a distributed database that supports massive data storage and offers scalable storage capacity and data redundancy, the embodiments of the present invention can meet the storage requirements of historical time series data. In addition, in the prior art Hbase merges and splits its internal data files using static policies based on fixed parameters, and is therefore poorly suited to the persistence requirements of massive time series data. Through a scheme based on dynamic statistics, the present application improves the efficiency of merging and splitting Hbase's internal data files, which not only reduces the latency of data access but also ensures its stability, and is well suited to the persistence and access requirements of massive time series data.

2. The data to be stored is first cached, and the data cached over a period of time is then written to Hbase in one operation. This reduces the number of disk I/O operations and network communications during data persistence, and thereby greatly improves the persistence efficiency of the data and the data throughput of Hbase.

3. The identifier and the storage timestamp of the data to be stored are used as the row key of the Hbase database; the content of the data to be stored is used as the archive data of the Hbase database; and the row key and the archive data are stored into the Hbase database as a data item. The embodiments of the present invention thus propose a scheme for storing time series data in an Hbase database and ensure that the Hbase database can meet the storage requirements of historical data. Because the embodiments sort the row keys of the Hbase database by the identifier of the data to be stored and store them in a distributed manner, on the one hand the data is distributed reasonably across the nodes of the Hbase database in a distributed environment when it is stored, without triggering read/write hot spots in the Hbase database; on the other hand, the data of the same observation point is not scattered in physical storage, which would cause high query overhead and low efficiency. In addition, the contiguous arrangement of the data lets historical data queries readily use the scan mechanism of the Hbase database, reducing the network transmission volume required by each query.

Description of the drawings

Fig. 1 is a flow chart of the storage method for time series data provided by Embodiment one of the present invention;

Fig. 2 is a flow chart of storing industrial historical data by the storage method for time series data provided by Embodiment one of the present invention;

Fig. 3 is a schematic diagram of the principle by which the storage method for time series data provided by Embodiment one of the present invention stores data in Hbase;

Fig. 4 is a flow chart of counting, by the storage method for time series data provided by Embodiment one of the present invention, the amount of historical data written in each merge statistics period on each server in the cluster;

Fig. 5 is a flow chart of adaptively splitting Hbase files by the storage method for time series data provided by Embodiment one of the present invention;

Fig. 6 is a module diagram of the storage device for time series data provided by Embodiment two of the present invention.

Detailed description of the embodiments

By providing a storage method and device for time series data, the embodiments of the present invention solve the technical problem that the prior art can neither meet the storage capacity requirements of the data nor guarantee the stability of data storage.

To solve the above technical problem, the general idea of the technical solution in the embodiments of the present invention is as follows:

The data to be stored is stored in the Hbase database, and the data volume in the Hbase database is counted to obtain a statistical result; the data storage amount s for the next merge statistics period is obtained from the statistical result; it is judged whether s exceeds the preset storage amount of data in a merge statistics period; if not, the merged file is split. Because Hbase is a distributed database that supports massive data storage and offers scalable storage capacity and data redundancy, the embodiments of the present invention can meet the storage requirements of historical time series data. In addition, in the prior art Hbase merges and splits its internal data files using static policies based on fixed parameters, and is therefore poorly suited to the persistence requirements of massive time series data. Through a scheme based on dynamic statistics, the embodiments of the present invention improve the efficiency of merging and splitting Hbase's internal data files, which not only reduces the latency of data access but also ensures its stability, and is well suited to the persistence and access requirements of massive time series data.

For a better understanding of the above technical solution, it is described in detail below with reference to the accompanying drawings and specific embodiments.

Embodiment one

Referring to Fig. 1, the storage method for time series data provided by this embodiment of the present invention includes:

Step S110: storing data to be stored into an Hbase database;

To explain this step, step S110 specifically includes:

storing the data to be stored in a buffer;

judging whether the data capacity in the buffer reaches a preset data capacity threshold;

and/or

judging whether the storage period of the buffer reaches a preset storage period threshold;

if the data capacity in the buffer reaches the preset data capacity threshold and/or the storage period of the buffer reaches the preset storage period threshold, storing the data to be stored into the Hbase database;

otherwise, performing no processing.
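The buffering rule of step S110 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `MeasuringPointBuffer`, the item-count capacity, the injected `clock`, and the `flush` callback (standing in for a batched write to Hbase) are all assumptions made for illustration.

```python
import time
from typing import Callable, List, Optional


class MeasuringPointBuffer:
    """Buffers values and flushes when full or when the storage period elapses."""

    def __init__(self, capacity: int, period_s: float,
                 flush: Callable[[List[bytes]], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity    # preset data capacity threshold (item count, assumed)
        self.period_s = period_s    # preset storage period threshold, in seconds
        self.flush = flush          # hypothetical sink, e.g. one batched write to Hbase
        self.clock = clock          # injectable so the timing rule can be tested
        self.items: List[bytes] = []
        self.started: Optional[float] = None

    def add(self, value: bytes) -> bool:
        """Store one value; return True if this call triggered a flush."""
        if self.started is None:
            self.started = self.clock()   # the storage period starts at the first write
        self.items.append(value)
        return self.maybe_flush()

    def maybe_flush(self) -> bool:
        """Apply the capacity and/or period test; flush the whole batch if either holds."""
        if not self.items:
            return False
        full = len(self.items) >= self.capacity
        expired = self.clock() - self.started >= self.period_s
        if not (full or expired):
            return False                  # otherwise: no processing
        batch, self.items, self.started = self.items, [], None
        self.flush(batch)
        return True
```

In a deployment, `maybe_flush` would also be called from a timer so that a slow measuring point is still archived once its period expires.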

The specific steps of storing the data to be stored into the Hbase database in step S110 are as follows:

using the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database;

using the content of the data to be stored as the archive data of the Hbase database;

storing the row key and the archive data into the Hbase database as a data item of the Hbase database.

In this embodiment, the length of each data item in the Hbase database depends on the actual size of the historical data at the time it is archived.
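A small Python sketch of how such a data item might be composed is given below. The byte layout is an assumption, since the text does not specify one: the key formed from the identifier and the storage timestamp is encoded here as a 4-byte identifier followed by an 8-byte millisecond timestamp, both big-endian, and the function names are hypothetical.

```python
import struct
from typing import Tuple


def make_row_key(point_id: int, timestamp_ms: int) -> bytes:
    """Key = data identifier, then storage timestamp (assumed fixed-width, big-endian)."""
    return struct.pack(">IQ", point_id, timestamp_ms)


def make_item(point_id: int, timestamp_ms: int, content: bytes) -> Tuple[bytes, bytes]:
    """A data item pairs the key with the archive data (the buffered content as bytes)."""
    return make_row_key(point_id, timestamp_ms), content


# Byte-wise ordering of the keys groups items by identifier, then by time,
# which is what keeps one measuring point's data contiguous for range scans.
keys = sorted([make_row_key(2, 100), make_row_key(1, 300), make_row_key(1, 200)])
```

Fixed-width big-endian fields are chosen so that lexicographic byte order matches numeric order for non-negative values; a variable-length text encoding would not guarantee this.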

Step S120: counting the data volume in the Hbase database to obtain a statistical result;

To explain this step, step S120 specifically includes:

judging whether the storage moment T₀ of the data to be stored is earlier than a preset earliest statistics moment;

if so, the insertion of the data to be stored is invalid, and the data to be stored is discarded;

judging whether the storage moment T₀ of the data to be stored lies between the preset earliest statistics moment and a preset latest statistics moment;

if so, adding the data to be stored to the existing data volume in the Hbase database to complete the statistics;

judging whether the storage moment T₀ of the data to be stored is later than the preset latest statistics moment;

if so, taking the storage moment T₀ of the data to be stored as the new latest statistics moment, and adding the data to be stored to the existing data volume in the Hbase database to complete the statistics.

Step S130: merging the data in the Hbase database during the Nth merge statistics period to obtain a merged file;

To explain this step, step S130 specifically includes:

judging whether the amount of data written to the Hbase database in the Nth merge statistics period reaches a preset data write threshold;

if so, writing the data items in the Hbase database to an independent file in ascending row key order;

if not, performing no processing.

judging whether the number of files under the corresponding server in the Hbase database exceeds a preset file merge threshold;

if so, merging the files in the corresponding server;

if not, performing no processing.

Step S140: obtaining, from the statistical result, the data storage amount s for the (N+1)th merge statistics period;

Step S150: judging whether the data storage amount s exceeds the preset data storage amount of the (N+1)th merge statistics period;

if so, performing no processing;

if not, splitting the merged file.

Referring to Fig. 2, the specific steps for storing industrial historical data by the method provided by this embodiment of the present invention are as follows:

Step 301: Initialization. Create the buffers and the Hbase client pools. The detailed process is as follows. Open the Hbase data table using the API (Application Programming Interface) functions of Hbase. If the return value is null, there is no formal historical data table in Hbase, and a table H is created according to the unified naming rule HisData. Otherwise, a formal historical data table already exists in Hbase. Next, judge whether table H contains the column family that holds the historical data. If not, create the column family F according to the unified naming rule. If so, read the configuration information of all measuring points from the database and allocate two buffers of equal size to each measuring point; the number of data values that each measuring point can store in a buffer is fixed. Then start the timer thread. According to the configuration, create two fixed-size HTable pools, Pool_F and Pool_A, to manage the writes of the measuring points; this ensures the safety of the data in a multi-threaded environment. All Hbase clients connect to ZooKeeper and keep their sessions alive. In addition, configure every write request issued through a client in pool Pool_F to be delivered directly to the corresponding server, and configure every object in pool Pool_A to enable its built-in cache space and to write data in automatic write mode. The number of objects in each pool depends on the performance of the current server and the needs of the actual application scenario.

Step 302: Receive a data insertion request, store the data in the buffer of measuring point N_ID whose read-write state is the write state, and then update the buffered data count DataNum of the buffer corresponding to measuring point N_ID.

Step 303: Judge whether the buffer corresponding to measuring point N_ID is full, that is, whether the data capacity in the buffer has reached the preset data capacity threshold. If so, execute step 304; if not, execute step 305.

Step 304: Request a currently available idle object C_F from HTable pool Pool_F. If an empty object is returned, there is currently no idle object; wait 1 millisecond and try again to obtain an idle object C_F. Otherwise, the currently available idle object C_F in HTable pool Pool_F has been obtained. Referring to Fig. 3, generate the row key RowKey from the identifier of the data to be inserted and the request moment, generate the archive data ColumnData from the buffer data as a byte array, and compose the data item Item to be archived in Hbase from RowKey and ColumnData. Using the write interface provided by the available idle object C_F, deliver Item immediately to the corresponding server in Hbase for processing.

Step 305: Judge whether the write time of the buffer of measuring point N_ID exceeds the buffer archive time set by the timer. If it does, execute step 306; if it does not, end this flow.

Step 306: Request a currently available idle object C_A from HTable pool Pool_A. If an empty object is returned, there is currently no idle object; wait 1 millisecond and try again to obtain an idle object C_A. Otherwise, the currently available idle object C_A in HTable pool Pool_A has been obtained. Generate the row key RowKey from the identifier of the data to be inserted and the request moment, generate the archive data ColumnData from the buffer data as a byte array, and compose the data item Item to be archived in Hbase from RowKey and ColumnData. Using the interface provided by the available idle object C_A, deliver Item to the corresponding server in Hbase for processing.
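The acquire-and-retry behaviour described in steps 304 and 306 can be sketched in Python as follows. This is a generic object pool, not the Hbase HTable pool API: the class `ClientPool` and its methods are hypothetical, and the pooled clients are plain placeholders.

```python
import queue
import time
from typing import Any, Iterable, Optional


class ClientPool:
    """Fixed-size pool of client objects shared by writer threads."""

    def __init__(self, clients: Iterable[Any]) -> None:
        self._idle: "queue.Queue[Any]" = queue.Queue()   # thread-safe idle list
        for client in clients:
            self._idle.put(client)

    def try_acquire(self) -> Optional[Any]:
        """Return an idle client, or None (the 'empty object') if none is free."""
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            return None

    def acquire(self, retry_s: float = 0.001) -> Any:
        """Retry every millisecond until an idle client becomes available."""
        while True:
            client = self.try_acquire()
            if client is not None:
                return client
            time.sleep(retry_s)

    def release(self, client: Any) -> None:
        """Hand the client back once the data item has been delivered."""
        self._idle.put(client)
```

A blocking `queue.Queue.get()` would achieve the same result without polling; the polling loop is kept here only because it mirrors the 1 millisecond wait in the text.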

It should be noted that, after the industrial historical data enters Hbase, the embodiment of the present invention continuously counts the amount of historical data written in each merge statistics period on each server in the cluster. When a file split is about to be carried out, it is judged whether the amount of historical data written in the next merge statistics period is within the data value preset by the system; if so, the file split is carried out; otherwise, no action is taken.

The specific steps for counting the amount of historical data written in each merge statistics period on each server in the cluster, and for splitting files, are described separately below.

Referring to Fig. 4, the amount of historical data written in each merge statistics period on each server in the cluster is counted as follows:

Step 401: Initialization. After every process in the Hbase cluster has started normally, create a data statistics queue Queue in each Region Server. The basic unit of the queue is encapsulated as a Writing Record object; each statistics object Writing Record is responsible for counting the amount of historical data archived in one merge statistics period T, and the time ranges covered by the statistics objects in the queue increase successively without gaps. Initialize each statistics object Writing Record in the queue, that is, set its internal variable records to 0. The length of the data statistics queue Queue is LEN, and LEN remains unchanged from the creation of the queue to its destruction. In this embodiment, the merge statistics period T is 20 seconds.

Step 402: After a Region Server receives a data insertion request, judge whether the value of the write-ahead log switch variable in Hbase is True. If so, the write-ahead log is enabled, and the data item Item is written to the log; otherwise, the data item Item is inserted directly into the corresponding position in memory according to its row key, keeping the data items in memory ordered as a whole. If the data is inserted successfully, execute step 403.

Step 403: Obtain the current data insertion time T₀, and synchronously obtain the start time T_S and the end time T_E of the time range covered by the data statistics queue Queue. The start time T_S is the preset earliest statistics moment, and the end time T_E is the preset latest statistics moment. If T₀ is less than T_S, the insertion time of the current historical data is invalid: discard the current historical data, leave the data statistics queue untouched, and end this flow. If T₀ is greater than T_E, no object in the current data statistics queue Queue is responsible for counting the data written in the period T that contains T₀: execute step 404. If T₀ is greater than or equal to T_S and less than or equal to T_E, the insertion time of the current historical data lies within the time range covered by the data statistics queue Queue: execute step 405 directly.

Step 404: Obtain the head object WR_HEAD of the data statistics queue Queue and remove it from the queue; synchronously obtain the tail object WR_TAIL of the data statistics queue Queue and the end time QT_E covered by the queue. Create a new statistics object WR′ and make it responsible for counting the amount of historical data written in the time range from QT_E to QT_E+T. Then append the newly created object to the tail of the data statistics queue Queue, and repeat step 403.

Step 405: According to the current data insertion time T₀, synchronously obtain the corresponding statistics object WR₀ in the data statistics queue Queue, increase its internal counter variable records by the size of the current data, and end this flow.
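Steps 401 to 405 amount to a fixed-length sliding window of per-period counters. The Python sketch below is one possible reading, not the patented code: the names `WritingRecord` and `StatsQueue` are illustrative, time is a plain number, and a value of T₀ exactly equal to T_E is credited to the last period (an assumption, since the text treats T_E as inclusive without saying which object owns it).

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WritingRecord:
    start: float       # start of the merge statistics period this object covers
    records: int = 0   # amount of data written during [start, start + T]


class StatsQueue:
    """Fixed-length queue of per-period write counters (steps 401 to 405)."""

    def __init__(self, length: int, period: float, start: float) -> None:
        self.period = period
        # Step 401: LEN objects with contiguous, increasing ranges, all initialised to 0.
        self.queue = deque(WritingRecord(start + i * period) for i in range(length))

    @property
    def t_s(self) -> float:   # earliest statistics moment covered by the queue
        return self.queue[0].start

    @property
    def t_e(self) -> float:   # latest statistics moment covered by the queue
        return self.queue[-1].start + self.period

    def record(self, t0: float, size: int) -> bool:
        if t0 < self.t_s:
            return False                       # step 403: invalid insertion time, discard
        while t0 > self.t_e:                   # step 404: slide the window forward
            new_start = self.t_e
            self.queue.popleft()               # drop the head object
            self.queue.append(WritingRecord(new_start))
        # Step 405: locate the object whose period contains t0 and add the size.
        index = min(int((t0 - self.t_s) // self.period), len(self.queue) - 1)
        self.queue[index].records += size
        return True
```

For example, with `StatsQueue(length=3, period=20, start=0)` the window initially covers 0 to 60; recording at time 70 drops the oldest period and extends coverage to 80. The length stays at LEN throughout because every removal is paired with one append.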

Referring to Fig. 5, in this embodiment of the present invention, the specific steps for adaptively splitting Hbase files are as follows:

Step 501: Insert the data item Item into the designated position in the Hbase in-memory store MemStore according to its row key, and judge whether the amount of data written to the current MemStore in one merge statistics period has reached the preset data write threshold. If not, end this flow; otherwise, write all data items in the MemStore to an independent file StoreFile in ascending row key order, and execute step 502.
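As an illustration of step 501, the following Python sketch models the in-memory store as a dictionary that is sorted when flushed. It is a simplification under stated assumptions: the class and method names are hypothetical, the written amount is measured in key plus value bytes, the per-period reset of the counter is omitted, and a real store would keep its items ordered in memory rather than sorting at flush time.

```python
from typing import Dict, List, Optional, Tuple


class MemStore:
    """Toy in-memory store that flushes to a sorted 'file' at a write threshold."""

    def __init__(self, flush_threshold: int) -> None:
        self.flush_threshold = flush_threshold   # preset data write threshold, in bytes
        self.items: Dict[bytes, bytes] = {}
        self.written = 0                         # bytes written since the last flush

    def insert(self, row_key: bytes, data: bytes) -> Optional[List[Tuple[bytes, bytes]]]:
        """Insert one item; return the flushed file contents, or None if no flush."""
        self.items[row_key] = data
        self.written += len(row_key) + len(data)
        if self.written < self.flush_threshold:
            return None                          # threshold not reached: end of flow
        store_file = sorted(self.items.items())  # ascending key order
        self.items, self.written = {}, 0
        return store_file
```

The returned list stands in for the independent file; persisting it and registering it with the server are outside the scope of the sketch.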

Step 502: Judge whether the number of files under the current Region Server exceeds the preset file merge threshold Compact Threshold and whether the current Region Server is in the running state. If not, end this flow; otherwise, execute step 503.

Step 503: Select at least Min_C and at most Max_C files StoreFile from the current Region Server in order from smallest to largest, merge them into one large file File_C, and continue with step 504. Min_C and Max_C are configured according to the needs of the actual application scenario. In this embodiment, Min_C is 64 and Max_C is 256.
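The selection rule of step 503 can be sketched in Python as below. The function name and the representation of a file as a `(name, size)` pair are assumptions, as is the reading that "from small to large" refers to file size.

```python
from typing import List, Tuple


def select_for_merge(files: List[Tuple[str, int]], min_c: int, max_c: int) -> List[str]:
    """Pick between min_c and max_c files, smallest first; [] if too few exist."""
    if len(files) < min_c:
        return []                                   # not enough files to merge
    by_size = sorted(files, key=lambda f: f[1])     # ascending by size (assumed meaning)
    return [name for name, _ in by_size[:max_c]]    # cap the selection at max_c files
```

With the values of this embodiment, the call would be `select_for_merge(files, min_c=64, max_c=256)`; merging the smallest files first keeps the cost of each merge low relative to the number of files it removes.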

Step 504: Select the N most recent statistics objects Writing Record from the data statistics queue Queue, where N is less than the queue length LEN. Using a moving average algorithm, calculate the amount s of historical data expected to be written in the next merge statistics period T, counted from the end time T_E of the data statistics queue Queue.
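A minimal Python sketch of the prediction in step 504 follows, assuming the averaging is an unweighted mean of the N most recent per-period write amounts; the text does not give the exact formula, and the function name is hypothetical.

```python
from typing import Sequence


def predict_next_writing(records: Sequence[int], n: int) -> float:
    """Estimate s for the next period as the mean of the n most recent counters."""
    if not 0 < n < len(records):
        raise ValueError("n must be positive and less than the queue length LEN")
    recent = list(records)[-n:]      # the n statistics objects nearest in time
    return sum(recent) / n
```

Here `records` lists the per-period write amounts from oldest to newest, so the slice takes the tail of the queue. A larger N smooths out bursts, while a smaller N reacts faster to a change in write rate.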

Step 505: Judge whether s exceeds the preset upper limit Writing Upper Limit on the amount of data written per unit period, that is, whether s exceeds the preset storage amount of data in a merge statistics period. If it does, take no action and end this flow; otherwise, execute step 506. In this embodiment, Writing Upper Limit is 64M.

Step 506: Perform the file split operation on the merged file File_C, and end this flow.
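The decision in steps 505 and 506 reduces to a single comparison, sketched below in Python. Reading "64M" as 64 × 1024 × 1024 bytes is an assumption, and the function name is illustrative.

```python
WRITING_UPPER_LIMIT = 64 * 1024 * 1024   # "64M" in this embodiment, read here as bytes


def should_split(predicted_writing: float,
                 upper_limit: int = WRITING_UPPER_LIMIT) -> bool:
    """Split the merged file only when the predicted write load does not exceed the limit."""
    return predicted_writing <= upper_limit
```

The effect is to defer the split while heavy writing is expected, so that the split's I/O cost and the brief unavailability it causes do not coincide with a write peak.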

Embodiment two

Referring to Fig. 6, the storage device for time series data provided by this embodiment of the present invention includes：

Data memory module 100, for storing the data to be stored into the Hbase database；

To describe data memory module 100 in more detail, data memory module 100 specifically includes：

Data storage execution unit, for storing the data to be stored in a buffer；

First judging unit, for judging whether the data volume in the buffer reaches a preset data capacity threshold；

First data storage subunit, for storing the data to be stored into the Hbase database if the judging result of the first judging unit is yes；

Specifically, in this embodiment, the first data storage subunit is configured to, if the judging result of the first judging unit is yes, use the identifier and storage timestamp of the data to be stored as the row key of the Hbase database and the content of the data to be stored as the column data of the Hbase database, and store the row key and the column data into the Hbase database as a data item of the Hbase database；

Second judging unit, for judging whether the storage period of the buffer reaches a preset storage period threshold；

Second data storage subunit, for storing the data to be stored into the Hbase database if the judging result of the second judging unit is yes；

Specifically, in this embodiment, the second data storage subunit is configured to, if the judging result of the second judging unit is yes, use the identifier and storage timestamp of the data to be stored as the row key of the Hbase database and the content of the data to be stored as the column data of the Hbase database, and store the row key and the column data into the Hbase database as a data item of the Hbase database；

Data statistics module 200, for counting the data volume in the Hbase database to obtain a statistical result；

To describe data statistics module 200 in more detail, in this embodiment, data statistics module 200 specifically includes：

Third judging unit, for judging whether the storage moment T₀ of the data to be stored is earlier than the preset earliest statistics moment；

Data discarding unit, for discarding the data to be stored if the judging result of the third judging unit is yes, since this indicates that storing the data to be stored is invalid；

Fourth judging unit, for judging whether the storage moment T₀ of the data to be stored lies between the preset earliest statistics moment and the preset latest statistics moment；

First statistics unit, for adding the data to be stored to the data volume in the original Hbase database if the judging result of the fourth judging unit is yes, thereby completing the statistics and obtaining the statistical result；

Fifth judging unit, for judging whether the storage moment T₀ of the data to be stored is later than the preset latest statistics moment；

Second statistics unit, for, if the judging result of the fifth judging unit is yes, taking the storage moment T₀ of the data to be stored as the new latest statistics moment and adding the data to be stored to the data volume in the original Hbase database, thereby completing the statistics and obtaining the statistical result.

Data merging module 300, for merging the data in the Hbase database in the N-th merge statistics period to obtain a merged file；

To describe data merging module 300 in more detail, in this embodiment, data merging module 300 specifically includes：

Sixth judging unit, for judging whether the data write volume of the Hbase database in the N-th merge statistics period reaches a preset data write volume threshold；

File generating unit, for writing the data items in the Hbase database into an independent file in ascending row-key order if the judging result of the sixth judging unit is yes；

Seventh judging unit, for judging whether the number of files on the corresponding server in the Hbase database exceeds a preset file merge threshold；

Merge execution unit, for merging the files on the corresponding server if the judging result of the seventh judging unit is yes, to obtain the merged file.

Data processing module 400, for obtaining, from the statistical result, the data storage volume s in the (N+1)-th merge statistics period；

Judgment module 500, for judging whether the data storage volume s exceeds the preset data storage volume of the (N+1)-th merge statistics period；

Splitting module 600, for splitting the merged file if the judging result of judgment module 500 is no.
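The three timestamp checks performed by data statistics module 200 above can be sketched as a single update routine. This is an illustrative sketch only: the class name `WriteStatistics` and its fields are hypothetical, and "adding the data to be stored to the data volume" is assumed to mean adding the size of the incoming data to a running total.

```python
from dataclasses import dataclass


@dataclass
class WriteStatistics:
    earliest: float        # preset earliest statistics moment
    latest: float          # preset latest statistics moment
    total_volume: int = 0  # accumulated data volume (assumed to be a size)

    def record(self, t0: float, size: int) -> bool:
        """Account for one write at storage moment t0.

        Returns False when the data is discarded, True when it is counted.
        """
        if t0 < self.earliest:
            return False       # earlier than the window: treated as invalid
        if t0 > self.latest:
            self.latest = t0   # T0 becomes the new latest statistics moment
        self.total_volume += size
        return True
```

The two counting branches differ only in whether the latest statistics moment is advanced, which is why they share one accumulation step in the sketch.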

【Technical effects】

1. The data to be stored is stored into the Hbase database, and the data volume in the Hbase database is counted to obtain a statistical result; the data storage volume s in the next merge statistics period is obtained from the statistical result; it is judged whether s exceeds the preset data storage volume for a merge statistics period; if not, the merged file is split. Because Hbase is a distributed database that supports massive data storage and offers scalable storage capacity and data redundancy, the embodiments of the present invention can meet the storage requirements of historical time series data. In addition, because Hbase in the prior art merges and splits its internal data files using static policies based on preset parameters, it cannot adapt well to the persistence requirements of massive time series data. Through a scheme based on dynamic statistics, the embodiments of the present invention improve the efficiency of merging and splitting data files inside Hbase, which not only reduces data access latency but also ensures the stability of data access, and so adapts well to the persistence and data access requirements of massive time series data.

Through a well-designed data structure in Hbase, the embodiments of the present invention improve the efficiency of storing and querying time series data. In addition, the embodiments use the statistical results of periodic data storage volume to predict the data storage volume of future statistics periods, and use the prediction to decide on the merging and splitting of Hbase data files. This strategy not only greatly reduces the occurrence of potential file merge or split storms inside Hbase and effectively improves data read/write performance, but also ensures the round-the-clock availability of the system's services and the write speed of massive data.
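The data structure referred to above uses the identifier and storage timestamp of the data as the row key. The source does not specify the byte layout, so the following is only one possible sketch: the function name `make_row_key`, the zero-byte separator, and the 8-byte big-endian timestamp encoding are all assumptions chosen so that keys for the same identifier sort chronologically.

```python
import struct


def make_row_key(point_id: str, timestamp_ms: int) -> bytes:
    """Build a row key from a measuring-point identifier and a timestamp.

    Assumed layout: UTF-8 identifier, a zero-byte separator, then the
    timestamp as an unsigned 8-byte big-endian integer. Big-endian encoding
    makes byte-wise ordering match numeric ordering of the timestamps.
    """
    return point_id.encode("utf-8") + b"\x00" + struct.pack(">Q", timestamp_ms)


# Keys for one measuring point sort in time order.
earlier = make_row_key("p1", 1000)
later = make_row_key("p1", 2000)
```

With keys stored in ascending order, all records of one measuring point are contiguous and ordered by time, which suits range queries over a time interval.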

Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims

1. A storage method for time series data, characterized by comprising：

Storing data to be stored into an Hbase database；

Merging the data in the Hbase database in the N-th merge statistics period to obtain a merged file；

If not, splitting the merged file；

Wherein counting the data volume in the Hbase database specifically comprises：

If so, discarding the data to be stored；

Judging whether the storage moment T₀ of the data to be stored lies between the preset earliest statistics moment and a preset latest statistics moment；

Judging whether the storage moment T₀ of the data to be stored is later than the preset latest statistics moment；

If so, taking the storage moment T₀ of the data to be stored as the new latest statistics moment, and adding the data to be stored to the data volume in the original Hbase database, thereby completing the statistics.

2. The method according to claim 1, characterized in that storing the data to be stored into the Hbase database specifically comprises：

Storing the data to be stored in a buffer；

And/or

If the data volume in the buffer reaches the preset data capacity threshold and/or the storage period of the buffer reaches the preset storage period threshold, storing the data to be stored into the Hbase database.

3. The method according to claim 2, characterized in that storing the data to be stored into the Hbase database specifically comprises：

Storing the row key and the column data into the Hbase database as a data item of the Hbase database.

4. The method according to claim 3, characterized in that merging the data in the Hbase database specifically comprises：

Judging whether the data write volume of the Hbase database in the N-th merge statistics period reaches a preset data write volume threshold；

If so, merging the files on the corresponding server.

5. A storage device for time series data, characterized by comprising：

Data memory module, for storing data to be stored into an Hbase database；

Data merging module, for merging the data in the Hbase database in the N-th merge statistics period to obtain a merged file；

Data processing module, for obtaining, from the statistical result, the data storage volume s in the (N+1)-th merge statistics period；

Judgment module, for judging whether the data storage volume s exceeds the preset data storage volume of the (N+1)-th merge statistics period；

Splitting module, for splitting the merged file if the judging result of the judgment module is no；

The data statistics module specifically includes：

Third judging unit, for judging whether the storage moment T₀ of the data to be stored is earlier than the preset earliest statistics moment；

Data discarding unit, for discarding the data to be stored if the judging result of the third judging unit is yes；

Fourth judging unit, for judging whether the storage moment T₀ of the data to be stored lies between the preset earliest statistics moment and the preset latest statistics moment；

First statistics unit, for adding the data to be stored to the data volume in the original Hbase database if the judging result of the fourth judging unit is yes, thereby completing the statistics and obtaining the statistical result；

Fifth judging unit, for judging whether the storage moment T₀ of the data to be stored is later than the preset latest statistics moment；

Second statistics unit, for, if the judging result of the fifth judging unit is yes, taking the storage moment T₀ of the data to be stored as the new latest statistics moment and adding the data to be stored to the data volume in the original Hbase database, thereby completing the statistics and obtaining the statistical result.

6. The device according to claim 5, characterized in that the data memory module specifically includes：

First data storage subunit, for storing the data to be stored into the Hbase database if the judging result of the first judging unit is yes；

Second data storage subunit, for storing the data to be stored into the Hbase database if the judging result of the second judging unit is yes.

7. The device according to claim 6, characterized in that：

The first data storage subunit is specifically configured to, if the judging result of the first judging unit is yes, use the identifier and storage timestamp of the data to be stored as the row key of the Hbase database and the content of the data to be stored as the column data of the Hbase database, and store the row key and the column data into the Hbase database as a data item of the Hbase database；

The second data storage subunit is specifically configured to, if the judging result of the second judging unit is yes, use the identifier and storage timestamp of the data to be stored as the row key of the Hbase database and the content of the data to be stored as the column data of the Hbase database, and store the row key and the column data into the Hbase database as a data item of the Hbase database.

8. The device according to claim 7, characterized in that the data merging module specifically includes：

Sixth judging unit, for judging whether the data write volume of the Hbase database in the N-th merge statistics period reaches a preset data write volume threshold；

File generating unit, for writing the data items in the Hbase database into an independent file in ascending row-key order if the judging result of the sixth judging unit is yes；

Seventh judging unit, for judging whether the number of files on the corresponding server in the Hbase database exceeds a preset file merge threshold；

Merge execution unit, for merging the files on the corresponding server if the judging result of the seventh judging unit is yes, to obtain the merged file.