CN105005617A - Storage method and device of time sequence data - Google Patents


Publication number
CN105005617A
Authority
CN
China
Prior art keywords
data
stored
hbase database
statistics
moment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510429895.5A
Other languages
Chinese (zh)
Other versions
CN105005617B (en)
Inventor
李江颖
吴培荣
程衍明
陈刚
杨超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NAVIMENTUM INFORMATION SYSTEM CO Ltd
Original Assignee
NAVIMENTUM INFORMATION SYSTEM CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NAVIMENTUM INFORMATION SYSTEM CO Ltd
Priority to CN201510429895.5A
Publication of CN105005617A
Application granted
Publication of CN105005617B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries

Abstract

The invention discloses a storage method and device for time series data. The method comprises the following steps: storing data to be stored in an Hbase database; collecting statistics on the data volume in the Hbase database to obtain a statistical result; merging the data in the Hbase database in the Nth merge statistics period to obtain a merged file; obtaining, from the statistical result, the data storage amount s for the (N+1)th merge statistics period; determining whether the data storage amount s exceeds the preset data storage amount for the (N+1)th merge statistics period; and, if it does not, splitting the merged file. The storage method and device address the prior-art technical problems that data storage capacity requirements cannot be met and data storage stability cannot be guaranteed.

Description

Storage method and device for time series data
Technical field
The present invention relates to the field of computer technology, and in particular to a storage method and device for time series data.
Background
In industrial informatization, the real-time historical database has long been a key research focus in industry; the sectors involved include petroleum, electric power, metallurgy and chemicals. A real-time historical database plays a core role in data monitoring, management and storage. It describes production data using the measurement point as its basic unit. A measurement point represents one actual data source, for example the voltage on a particular power transmission line or the temperature at a particular monitoring point. To meet the growing transactional demands of industrial applications, many fields have carried out extensive systematic research on real-time historical databases.
With the rapid development of the Internet of Things, smart devices of all kinds are widely used in industry, and more and more data is collected into real-time historical databases. Because production control processes run around the clock without interruption, the total amount of historical data keeps accumulating over time and can reach the TB or even PB level. Current real-time historical databases are deployed on high-performance servers, but massive historical data still puts pressure on current mainstream server hardware, and the problem is difficult to solve even by expanding the hardware.
Industrial data is typical time series data: the data uploaded by smart devices is usually a data stream carrying time tags, and the time tag is an important filter condition when querying historical data. Current real-time historical databases all adopt specialized storage schemes to improve their handling of time series data, but the scalability of their storage remains a bottleneck.
Hbase (Hadoop Database, a distributed storage system), which is built on an underlying distributed file system, has been widely used in recent years as a low-cost distributed storage solution. However, Hbase exhibits a certain latency when handling large volumes of writes. Large volumes of data entering Hbase continually trigger split operations on large files, and a file split usually occurs after a file merge operation, when the I/O load in the cluster is extremely high. In addition, a file split takes the server temporarily offline, so data archiving requests are blocked, system response time becomes very unstable, and the data archiving speed cannot be guaranteed.
In summary, for the massive data of the industrial field, there is currently no method that both meets the storage capacity requirements of current data and guarantees data storage stability.
Summary of the invention
The present application provides a storage method and device for time series data, which solve the prior-art technical problems that data storage capacity requirements cannot be met and data storage stability cannot be guaranteed.
An embodiment of the present invention provides a storage method for time series data, comprising:
Storing data to be stored in an Hbase database;
Collecting statistics on the data volume in the Hbase database to obtain a statistical result;
Merging the data in the Hbase database in the Nth merge statistics period to obtain a merged file;
Obtaining, from the statistical result, the data storage amount s for the (N+1)th merge statistics period;
Determining whether the data storage amount s exceeds the preset data storage amount for the (N+1)th merge statistics period;
If not, splitting the merged file.
Further, storing the data to be stored in the Hbase database specifically comprises:
Storing the data to be stored in a buffer;
Determining whether the data capacity in the buffer reaches a preset data capacity threshold;
And/or,
Determining whether the storage period of the buffer reaches a preset storage period threshold;
If the data capacity in the buffer reaches the preset data capacity threshold and/or the storage period of the buffer reaches the preset storage period threshold, storing the data to be stored in the Hbase database.
Further, storing the data to be stored in the Hbase database specifically comprises:
Using the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database;
Using the content of the data to be stored as the archived data of the Hbase database;
Storing the row key and the archived data in the Hbase database as a data item of the Hbase database.
Further, collecting statistics on the data volume in the Hbase database specifically comprises:
Determining whether the storage time T0 of the data to be stored is earlier than a preset earliest statistics time;
If so, discarding the data to be stored;
Determining whether the storage time T0 of the data to be stored lies between the preset earliest statistics time and a preset latest statistics time;
If so, adding the data to be stored to the existing data volume in the Hbase database, completing the statistics;
Determining whether the storage time T0 of the data to be stored is later than the preset latest statistics time;
If so, taking the storage time T0 of the data to be stored as the new latest statistics time, and adding the data to be stored to the existing data volume in the Hbase database, completing the statistics.
Further, merging the data in the Hbase database specifically comprises:
Determining whether the amount of data written to the Hbase database in the Nth merge statistics period reaches a preset write volume threshold;
If so, writing the data items in the Hbase database to a separate file in ascending row-key order;
Determining whether the number of files on the corresponding server in the Hbase database exceeds a preset file merge threshold;
If so, merging the files on the corresponding server.
An embodiment of the present invention further provides a storage device for time series data, comprising:
A data storage module, configured to store data to be stored in an Hbase database;
A data statistics module, configured to collect statistics on the data volume in the Hbase database to obtain a statistical result;
A data merging module, configured to merge the data in the Hbase database in the Nth merge statistics period to obtain a merged file;
A data processing module, configured to obtain, from the statistical result, the data storage amount s for the (N+1)th merge statistics period;
A judging module, configured to determine whether the data storage amount s exceeds the preset data storage amount for the (N+1)th merge statistics period;
A splitting module, configured to split the merged file if the result of the judging module is no.
Further, the data storage module specifically comprises:
A data storage execution unit, configured to store the data to be stored in a buffer;
A first judging unit, configured to determine whether the data capacity in the buffer reaches a preset data capacity threshold;
A first data storage subunit, configured to store the data to be stored in the Hbase database if the result of the first judging unit is yes;
A second judging unit, configured to determine whether the storage period of the buffer reaches a preset storage period threshold;
A second data storage subunit, configured to store the data to be stored in the Hbase database if the result of the second judging unit is yes.
Further, the first data storage subunit is specifically configured to, if the result of the first judging unit is yes, use the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database, use the content of the data to be stored as the archived data of the Hbase database, and store the row key and the archived data in the Hbase database as a data item of the Hbase database;
The second data storage subunit is specifically configured to, if the result of the second judging unit is yes, use the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database, use the content of the data to be stored as the archived data of the Hbase database, and store the row key and the archived data in the Hbase database as a data item of the Hbase database.
Further, the data statistics module specifically comprises:
A third judging unit, configured to determine whether the storage time T0 of the data to be stored is earlier than the preset earliest statistics time;
A data discarding unit, configured to discard the data to be stored if the result of the third judging unit is yes;
A fourth judging unit, configured to determine whether the storage time T0 of the data to be stored lies between the preset earliest statistics time and the preset latest statistics time;
A first statistics unit, configured to, if the result of the fourth judging unit is yes, add the data to be stored to the existing data volume in the Hbase database, completing the statistics and obtaining the statistical result;
A fifth judging unit, configured to determine whether the storage time T0 of the data to be stored is later than the preset latest statistics time;
A second statistics unit, configured to, if the result of the fifth judging unit is yes, take the storage time T0 of the data to be stored as the new latest statistics time and add the data to be stored to the existing data volume in the Hbase database, completing the statistics and obtaining the statistical result.
Further, the data merging module specifically comprises:
A sixth judging unit, configured to determine whether the amount of data written to the Hbase database in the Nth merge statistics period reaches the preset write volume threshold;
A file generation unit, configured to, if the result of the sixth judging unit is yes, write the data items in the Hbase database to a separate file in ascending row-key order;
A seventh judging unit, configured to determine whether the number of files on the corresponding server in the Hbase database exceeds the preset file merge threshold;
A merge execution unit, configured to, if the result of the seventh judging unit is yes, merge the files on the corresponding server to obtain the merged file.
The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
1. Data to be stored is stored in the Hbase database, and statistics are collected on the data volume in the Hbase database to obtain a statistical result; the data storage amount s for the next merge statistics period is obtained from the statistical result; it is determined whether s exceeds the preset data storage amount for the merge statistics period; if not, the merged file is split. Because Hbase is a distributed database that supports massive data storage and offers scalable storage capacity and data redundancy, the embodiments of the present invention can meet the storage requirements of historical time series data. In addition, prior-art Hbase uses a static policy based on preset parameters for merging and splitting its internal data files, and therefore does not adapt well to the persistence requirements of massive time series data. By using a scheme based on dynamic statistics, the present application improves the efficiency of merging and splitting Hbase internal data files. This both reduces data access latency and keeps data access stable, and thus adapts well to the persistence and data access requirements of massive time series data.
2. The data to be stored is first cached, and the data cached over a period of time is then written to Hbase in a single operation. This reduces the number of disk I/O operations and network communications during data persistence, and thus significantly improves data persistence efficiency and the data throughput of Hbase.
3. The identifier and the storage timestamp of the data to be stored are used as the row key of the Hbase database; the content of the data to be stored is used as the archived data of the Hbase database; and the row key and the archived data are stored in the Hbase database as a data item. The embodiments of the present invention thus propose a scheme for storing time series data in an Hbase database that ensures the Hbase database can meet the storage requirements of historical data. Because the embodiments sort the row keys of the Hbase database by the identifier of the data to be stored and store them in a distributed manner, data is, on the one hand, distributed sensibly across the nodes of the Hbase database in a distributed environment without triggering read/write hotspots; on the other hand, data from the same measurement point is not scattered in physical storage, which would make queries expensive and inefficient. In addition, the contiguous ordering of the data lets historical data queries conveniently use the scan mechanism of the Hbase database, reducing the amount of data transferred per query.
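As a rough illustration of why a row key ordered by identifier and then timestamp helps range queries, the following Python sketch keeps keys in sorted order and selects one measurement point's time range with two binary searches. The tuple keys, the function name scan_point and the sample values are assumptions made for illustration; they stand in for Hbase byte keys and the Hbase scan mechanism and are not taken from the patent.

```python
from bisect import bisect_left, bisect_right


def scan_point(sorted_keys, point_id, t_start, t_end):
    """Return the keys of one point with t_start <= timestamp <= t_end.

    sorted_keys is a list of (point_id, timestamp) tuples in ascending order,
    a stand-in for row keys that sort by identifier first and time second.
    """
    lo = bisect_left(sorted_keys, (point_id, t_start))
    hi = bisect_right(sorted_keys, (point_id, t_end))
    # All matching keys are contiguous, so one slice is enough.
    return sorted_keys[lo:hi]


keys = sorted([(2, 15), (1, 30), (1, 10), (1, 20)])
print(scan_point(keys, 1, 15, 30))  # [(1, 20), (1, 30)]
```

Because every key of a given point sits in one contiguous run, the query touches only that run and never the data of other points.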
Brief description of the drawings
Fig. 1 is a flowchart of the storage method for time series data provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of storing industrial historical data using the storage method for time series data provided by Embodiment 1 of the present invention;
Fig. 3 is a schematic diagram of the principle by which the storage method for time series data provided by Embodiment 1 of the present invention stores data in Hbase;
Fig. 4 is a flowchart of collecting statistics on the amount of historical data written in each merge statistics period on each server in the cluster, using the storage method for time series data provided by Embodiment 1 of the present invention;
Fig. 5 is a flowchart of adaptive splitting of Hbase files using the storage method for time series data provided by Embodiment 1 of the present invention;
Fig. 6 is a module diagram of the storage device for time series data provided by Embodiment 2 of the present invention.
Detailed description
By providing a storage method and device for time series data, the embodiments of the present invention solve the prior-art technical problems that data storage capacity requirements cannot be met and data storage stability cannot be guaranteed.
The general idea of the technical solutions in the embodiments of the present invention for solving the above technical problems is as follows:
Data to be stored is stored in the Hbase database, and statistics are collected on the data volume in the Hbase database to obtain a statistical result; the data storage amount s for the next merge statistics period is obtained from the statistical result; it is determined whether s exceeds the preset data storage amount for the merge statistics period; if not, the merged file is split. Because Hbase is a distributed database that supports massive data storage and offers scalable storage capacity and data redundancy, the embodiments of the present invention can meet the storage requirements of historical time series data. In addition, prior-art Hbase uses a static policy based on preset parameters for merging and splitting its internal data files, and therefore does not adapt well to the persistence requirements of massive time series data. By using a scheme based on dynamic statistics, the embodiments of the present invention improve the efficiency of merging and splitting Hbase internal data files. This both reduces data access latency and keeps data access stable, and thus adapts well to the persistence and data access requirements of massive time series data.
To better understand the above technical solutions, they are described in detail below with reference to the accompanying drawings and specific embodiments.
Embodiment 1
Referring to Fig. 1, the storage method for time series data provided by this embodiment of the present invention comprises:
Step S110: storing data to be stored in the Hbase database;
To describe this step in more detail, step S110 specifically comprises:
Storing the data to be stored in a buffer;
Determining whether the data capacity in the buffer reaches a preset data capacity threshold;
And/or,
Determining whether the storage period of the buffer reaches a preset storage period threshold;
If the data capacity in the buffer reaches the preset data capacity threshold and/or the storage period of the buffer reaches the preset storage period threshold, storing the data to be stored in the Hbase database;
Otherwise, taking no action.
The specific steps in step S110 for storing the data to be stored in the Hbase database are as follows:
Using the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database;
Using the content of the data to be stored as the archived data of the Hbase database;
Storing the row key and the archived data in the Hbase database as a data item of the Hbase database.
In this embodiment, the length of each data item in the Hbase database depends on the actual size of the historical data when it is archived.
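A minimal sketch of how such a data item might be assembled is given below. The 4-byte identifier, the 8-byte millisecond timestamp, the big-endian layout and the float64 encoding of the values are all assumptions, as are the names make_row_key and make_item; the embodiment only states that the identifier and the storage timestamp form the row key and that the content forms the archived data.

```python
import struct


def make_row_key(point_id: int, ts_ms: int) -> bytes:
    """Build a row key from a point identifier and a storage timestamp.

    Hypothetical layout: unsigned 4-byte id followed by unsigned 8-byte
    timestamp, both big-endian, so that byte-wise ordering matches
    (id, timestamp) ordering.
    """
    return struct.pack(">IQ", point_id, ts_ms)


def make_item(point_id: int, ts_ms: int, values: list) -> tuple:
    """Combine the row key and the serialized content into one data item."""
    row_key = make_row_key(point_id, ts_ms)
    # Assumed encoding: each buffered value is stored as a big-endian float64,
    # so the item length grows with the amount of archived data.
    column_data = struct.pack(f">{len(values)}d", *values)
    return row_key, column_data
```

With this layout the item length is 12 bytes of key plus 8 bytes per value, which matches the remark that the length of a data item depends on how much data is archived.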
Step S120: collecting statistics on the data volume in the Hbase database to obtain a statistical result;
To describe this step in more detail, step S120 specifically comprises:
Determining whether the storage time T0 of the data to be stored is earlier than the preset earliest statistics time;
If so, the storage time of the data to be stored is invalid, and the data to be stored is discarded;
Determining whether the storage time T0 of the data to be stored lies between the preset earliest statistics time and the preset latest statistics time;
If so, adding the data to be stored to the existing data volume in the Hbase database, completing the statistics;
Determining whether the storage time T0 of the data to be stored is later than the preset latest statistics time;
If so, taking the storage time T0 of the data to be stored as the new latest statistics time, and adding the data to be stored to the existing data volume in the Hbase database, completing the statistics.
Step S130: merging the data in the Hbase database in the Nth merge statistics period to obtain a merged file;
To describe this step in more detail, step S130 specifically comprises:
Determining whether the amount of data written to the Hbase database in the Nth merge statistics period reaches the preset write volume threshold;
If so, writing the data items in the Hbase database to a separate file in ascending row-key order;
If not, taking no action.
Determining whether the number of files on the corresponding server in the Hbase database exceeds the preset file merge threshold;
If so, merging the files on the corresponding server;
If not, taking no action.
Step S140: obtaining, from the statistical result, the data storage amount s for the (N+1)th merge statistics period;
Step S150: determining whether the data storage amount s exceeds the preset data storage amount for the (N+1)th merge statistics period;
If so, taking no action;
If not, splitting the merged file.
Referring to Fig. 2, the specific steps for storing industrial historical data using the method provided by this embodiment of the present invention are as follows:
Step 301: Initialization. Create the buffers and the Hbase client pools. In detail: use an Hbase API (Application Programming Interface) function to open the Hbase data table. If the return value is empty, there is no formal historical data table in Hbase, so create a table H according to the unified naming rule HisData. Otherwise, a formal historical data table already exists in Hbase. Next, determine whether the historical data column family exists in table H; if not, create a column family F according to the unified naming rule; if so, read the configuration information of all measurement points from the database, allocate two buffers of equal size to each measurement point (the number of data values each measurement point can store in a buffer is fixed), and start the timer thread. According to the configuration, create two fixed-size HTable pools, Pool_f and Pool_a, to manage writes for the measurement points; this ensures data safety in a multithreaded environment. All Hbase clients connect to ZooKeeper and maintain a session. In addition, every object in Pool_f is configured so that client write requests are delivered directly to the corresponding server, while every object in Pool_a is configured with its built-in cache enabled so that data is written automatically. The number of objects in each pool depends on the performance of the current server and the needs of the actual application scenario.
Step 302: Receive a data insertion request, store the data in the buffer of measurement point N_ID whose read/write state is "write", and then update the buffered data count DataNum of that buffer.
Step 303: Determine whether the buffer of measurement point N_ID is full, that is, whether the data capacity in the buffer reaches the preset data capacity threshold; if so, perform step 304; if not, perform step 305.
Step 304: Request the currently available idle object C_f from HTable pool Pool_f. If an empty object is returned, no idle object is currently available; wait 1 millisecond and then try again to obtain an idle object C_f. Otherwise, the currently available idle object C_f in Pool_f is obtained. Referring to Fig. 3, generate the row key RowKey from the identifier of the data in the insertion request and the request time, generate the archived data ColumnData from the buffer's data as a byte array, and combine RowKey and ColumnData into the data item Item to be archived in Hbase. Using the write interface provided by the idle object C_f, immediately deliver Item to the corresponding server in Hbase for processing;
Step 305: Determine whether the write time of the buffer of measurement point N_ID exceeds the archiving time set by the buffer's timer; if so, perform step 306; if not, end the flow.
Step 306: Request the currently available idle object C_a from HTable pool Pool_a. If an empty object is returned, no idle object is currently available; wait 1 millisecond and then try again to obtain an idle object C_a. Otherwise, the currently available idle object C_a in Pool_a is obtained. Generate the row key RowKey from the identifier of the data in the insertion request and the request time, generate the archived data ColumnData from the buffer's data as a byte array, and combine RowKey and ColumnData into the data item Item to be archived in Hbase. Using the interface provided by the idle object C_a, immediately deliver Item to the corresponding server in Hbase for processing.
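The buffering behaviour of steps 302 to 306 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the embodiment's implementation: it uses a single buffer per measurement point instead of two, checks the elapsed time on each insertion instead of in a timer thread, and replaces the HTable pool objects with a plain callback. The class name PointBuffer and all parameter names are hypothetical.

```python
import time


class PointBuffer:
    """Buffer for one measurement point, archived when full or too old."""

    def __init__(self, capacity, max_age_s, flush, clock=time.monotonic):
        self.capacity = capacity      # preset data capacity threshold
        self.max_age_s = max_age_s    # archiving time set for the buffer
        self.flush = flush            # callback standing in for the Hbase write
        self.clock = clock            # injectable clock, useful for testing
        self.values = []
        self.opened_at = clock()

    def _archive(self):
        # Hand the buffered batch over in one call, then start a fresh buffer.
        if self.values:
            self.flush(self.values)
        self.values = []
        self.opened_at = self.clock()

    def insert(self, value):
        self.values.append(value)                    # step 302
        if len(self.values) >= self.capacity:        # step 303 -> step 304
            self._archive()
        elif self.clock() - self.opened_at > self.max_age_s:  # step 305 -> 306
            self._archive()
```

For example, PointBuffer(capacity=3, max_age_s=10.0, flush=batches.append) hands over a batch of three values as soon as the third one arrives, and hands over a partial batch once the buffer has been open for more than ten seconds.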
It should be noted that after industrial historical data enters Hbase, the embodiment of the present invention continually collects statistics on the amount of historical data written in each merge statistics period on each server in the cluster. When a file split is to be performed, it is determined whether the amount of historical data written in the next merge statistics period is within the system's preset value; if so, the file split is performed; otherwise, no action is taken.
The specific steps for collecting statistics on the amount of historical data written in each merge statistics period on each server in the cluster, and for file splitting, are described separately below.
Referring to Fig. 4, the specific steps for collecting statistics on the amount of historical data written in each merge statistics period on each server in the cluster are as follows:
Step 401: Initialization. After every process in the Hbase cluster has started normally, a data statistics queue Queue is created in each Region Server. The basic unit of the queue is encapsulated as a WritingRecord object. Each statistics object WritingRecord is responsible for counting the amount of historical data archived in one merge statistics period T, and the time ranges covered by the WritingRecord objects in the queue are contiguous and increase in order. Initialize each WritingRecord in the queue by setting its internal variable records to 0. The length of the data statistics queue Queue is LEN, which remains unchanged from the creation of the queue to its destruction. In this embodiment, the merge statistics period T is 20 seconds.
Step 402: After receiving a data insertion request, the Region Server determines whether the write-ahead log switch variable in Hbase is True. If so, the write-ahead log is enabled and the data item Item is written to the log; otherwise, Item is inserted directly at the corresponding position in memory according to its row key, keeping the data items in memory ordered overall. If the data is inserted successfully, perform step 403.
Step 403: Obtain the current data insertion time T0, and synchronously obtain the start time T_s and the end time T_e of the time range covered by the data statistics queue Queue. The start time T_s is the preset earliest statistics time, and the end time T_e is the preset latest statistics time. If T0 is less than T_s, the insertion time of the current historical data is invalid: discard the current historical data, leave the queue unchanged, and end the flow. If T0 is greater than T_e, no object in Queue is currently responsible for the write statistics of the period T containing T0; perform step 404. If T0 is greater than or equal to T_s and less than or equal to T_e, the insertion time of the current historical data lies within the time range covered by Queue; proceed directly to step 405.
Step 404: Obtain the head object WR_HEAD of the data statistics queue Queue and remove it from the queue; synchronously obtain the tail object WR_TAIL of Queue and the end time QT_e covered by the queue. Create a new statistics object WR' and make it responsible for counting the historical data written in the time range from QT_e to QT_e+T. Once that is done, append the newly created object to the tail of Queue, and repeat step 403.
Step 405: According to the current data insertion time T0, synchronously obtain the corresponding statistics object WR_0 in the data statistics queue Queue, add the size of the current data to WR_0's internal counter variable records, and end the flow.
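Steps 401 to 405 amount to a fixed-length sliding window of per-period counters. The Python sketch below illustrates one way this could work. The class name WriteStats, the use of plain integers in place of WritingRecord objects, and the half-open period boundaries (a time exactly equal to the end of the covered range is treated as belonging to the next period) are assumptions made for the sketch rather than details given in the embodiment.

```python
from collections import deque


class WriteStats:
    """Fixed-length queue of write counters, one per merge statistics period."""

    def __init__(self, length, period, start=0):
        self.period = period                 # merge statistics period T
        self.start = start                   # T_s, earliest covered time
        self.records = deque([0] * length)   # step 401: all counters start at 0

    @property
    def end(self):
        # T_e: end of the time range covered by the queue.
        return self.start + len(self.records) * self.period

    def record(self, t0, size):
        """Account for `size` bytes written at time t0; False if discarded."""
        if t0 < self.start:                  # step 403: earlier than T_s
            return False
        while t0 >= self.end:                # step 404: slide the window
            self.records.popleft()           # drop the head object
            self.records.append(0)           # new object covering [QT_e, QT_e+T)
            self.start += self.period
        idx = int((t0 - self.start) // self.period)
        self.records[idx] += size            # step 405: update the counter
        return True
```

Because the queue length never changes, memory use stays constant however long the server runs, which mirrors the requirement that LEN remain fixed from creation to destruction.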
Referring to Fig. 5, in the embodiment of the present invention, the specific steps for adaptively splitting Hbase files are as follows:
Step 501: Insert the data item Item into the specified position in the Hbase in-memory store MemStore according to its row key, and judge whether the write volume of the current MemStore within the merge statistics period has reached the preset data write threshold. If not, end the flow; otherwise, write all data items in the MemStore, in ascending order of row key, to an independent file StoreFile, and perform step 502.
Step 502: Judge whether the number of files under the current Region Server exceeds the preset file merge threshold Compact Threshold and whether the current Region Server is in a running state. If not, end the flow; otherwise perform step 503.
Step 503: From the files in the current Region Server, select in order from smallest to largest at least Min_c and at most Max_c StoreFile files, merge them into one large file File_c, and continue to step 504. Min_c and Max_c are set according to the requirements of the actual application scenario. In this embodiment, Min_c is 64 and Max_c is 256.
Step 504: Select the N most recent statistics objects Writing Record in the data statistics queue Queue, where N is less than the queue length LEN. Using a moving average algorithm, calculate the write volume s of historical data in the next merge statistics period T after the cutoff time T_e of the data statistics queue Queue.
Step 505: Judge whether s exceeds the preset upper limit Writing Upper Limit on the data write volume per unit period, that is, whether s exceeds the preset data storage volume for a merge statistics period. If it does, take no action and end the flow; otherwise perform step 506. In this embodiment, Writing Upper Limit is 64M.
Step 506: Perform a file Split operation on the merged file File_c, and end the flow.
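The decision logic of steps 503 to 506 can be illustrated with the short sketch below. It is a sketch under stated assumptions rather than the embodiment's code: the function names are hypothetical, "from smallest to largest" is read as ordering by file size, the value 64M is read as 64 MiB in bytes, and the list counts stands for the count values of the statistics objects, oldest first.

```python
from typing import List, Sequence

# Assumption: "64M" is interpreted as 64 MiB expressed in bytes.
WRITING_UPPER_LIMIT = 64 * 1024 * 1024


def select_files_to_merge(sizes: Sequence[int], min_c: int = 64,
                          max_c: int = 256) -> List[int]:
    """Step 503: pick the smallest files, at least min_c and at most max_c."""
    if len(sizes) < min_c:
        return []  # not enough files to merge
    return sorted(sizes)[:max_c]


def predict_next_write_volume(counts: Sequence[int], n: int) -> float:
    """Step 504: simple moving average over the n most recent periods."""
    if not 0 < n < len(counts):
        raise ValueError("n must be positive and less than the queue length")
    recent = counts[-n:]
    return sum(recent) / n


def should_split(counts: Sequence[int], n: int,
                 upper_limit: int = WRITING_UPPER_LIMIT) -> bool:
    """Steps 505-506: split only if the predicted volume s is within the limit."""
    s = predict_next_write_volume(counts, n)
    return s <= upper_limit
```

For example, with per-period counts of 10, 20, 30 and 40 bytes and N = 2, the predicted volume s is the mean of the last two values, 35, which is far below the limit, so the merged file would be split.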
Embodiment two
Referring to Fig. 6, the storage device for time series data provided by the embodiment of the present invention comprises:
A data storage module 100, configured to store data to be stored in the Hbase database;
The data storage module 100 is described below. The data storage module 100 specifically comprises:
A data storage execution unit, configured to place the data to be stored in a buffer area;
A first judging unit, configured to judge whether the data capacity in the buffer area has reached a preset data capacity threshold;
A first data storage subunit, configured to store the data to be stored in the Hbase database if the judgment result of the first judging unit is yes;
Specifically, in this embodiment, the first data storage subunit is configured, if the judgment result of the first judging unit is yes, to use the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database, to use the content of the data to be stored as the archived data of the Hbase database, and to store the row key and the archived data in the Hbase database as a data item of the Hbase database;
A second judging unit, configured to judge whether the storage period of the buffer area has reached a preset storage period threshold;
A second data storage subunit, configured to store the data to be stored in the Hbase database if the judgment result of the second judging unit is yes;
Specifically, in this embodiment, the second data storage subunit is configured, if the judgment result of the second judging unit is yes, to use the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database, to use the content of the data to be stored as the archived data of the Hbase database, and to store the row key and the archived data in the Hbase database as a data item of the Hbase database;
In this embodiment, the length of each data item in the Hbase database depends on the actual size of the historical data at the time it is archived.
A data statistics module 200, configured to count the data volume in the Hbase database and obtain a statistical result;
The data statistics module 200 is described below. In this embodiment, the data statistics module 200 specifically comprises:
A third judging unit, configured to judge whether the storage moment T_0 of the data to be stored is earlier than the preset earliest statistics moment;
A data discarding unit, configured, if the judgment result of the third judging unit is yes, which indicates that the data to be stored was stored at an invalid time, to discard the data to be stored;
A fourth judging unit, configured to judge whether the storage moment T_0 of the data to be stored lies between the preset earliest statistics moment and the preset latest statistics moment;
A first statistics object, configured, if the judgment result of the fourth judging unit is yes, to add the data to be stored to the existing data volume in the Hbase database, completing the count and obtaining the statistical result;
A fifth judging unit, configured to judge whether the storage moment T_0 of the data to be stored is later than the preset latest statistics moment;
A second statistics object, configured, if the judgment result of the fifth judging unit is yes, to take the storage moment T_0 of the data to be stored as the new latest statistics moment, and to add the data to be stored to the existing data volume in the Hbase database, completing the count and obtaining the statistical result.
A data merging module 300, configured to merge the data in the Hbase database within N merge statistics periods and obtain a merged file;
The data merging module 300 is described below. In this embodiment, the data merging module 300 specifically comprises:
A sixth judging unit, configured to judge whether the data write volume of the Hbase database within the N merge statistics periods has reached a preset data write threshold;
A file generating unit, configured, if the judgment result of the sixth judging unit is yes, to write the data items in the Hbase database, in ascending order of row key, to an independent file;
A seventh judging unit, configured to judge whether the number of files under the corresponding server in the Hbase database exceeds a preset file merge threshold;
A merge execution unit, configured, if the judgment result of the seventh judging unit is yes, to merge the files in the corresponding server and obtain the merged file.
A data processing module 400, configured to obtain, according to the statistical result, the data storage volume s in the (N+1)-th merge statistics period;
A judging module 500, configured to judge whether the data storage volume s exceeds the preset data storage volume of the (N+1)-th merge statistics period;
A splitting module 600, configured to split the merged file if the judgment result of the judging module 500 is no.
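As an illustration of how the buffering behaviour of the data storage module 100 might look in code, the sketch below flushes a buffer when either the capacity threshold or the storage period threshold is reached. It rests on several assumptions: the class name BufferedWriter and its parameters are hypothetical, the flush_batch callable merely stands in for a batch write to the Hbase database, and the period is only checked when new data arrives, whereas a real device would more likely use a timer.

```python
import time
from typing import Callable, List, Tuple

Item = Tuple[bytes, bytes]  # (row key, archived content)


class BufferedWriter:
    """Buffers data items and hands them to `flush_batch` in one batch."""

    def __init__(self, flush_batch: Callable[[List[Item]], None],
                 max_bytes: int, max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._flush_batch = flush_batch  # stand-in for a batch write
        self._max_bytes = max_bytes      # preset data capacity threshold
        self._max_age = max_age_seconds  # preset storage period threshold
        self._clock = clock
        self._items: List[Item] = []
        self._size = 0
        self._started = clock()

    def put(self, row_key: bytes, content: bytes) -> None:
        """Buffer one item, then flush if either threshold is reached."""
        self._items.append((row_key, content))
        self._size += len(content)
        too_big = self._size >= self._max_bytes                # first judging unit
        too_old = self._clock() - self._started >= self._max_age  # second judging unit
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        """Write all buffered items in one batch and reset the buffer."""
        if self._items:
            self._flush_batch(self._items)
        self._items = []
        self._size = 0
        self._started = self._clock()
```

Passing the clock in as a parameter is a design choice made only to keep the sketch testable without waiting for real time to elapse.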
[Technical effects]
1. The data to be stored is stored in the Hbase database, and the data volume in the Hbase database is counted to obtain a statistical result; the data storage volume s in the next merge statistics period is obtained from the statistical result; it is judged whether the data storage volume s exceeds the preset data storage volume for a merge statistics period; if not, the merged file is split. Because Hbase is a distributed database that supports mass data storage and offers scalable storage capacity and data redundancy, the embodiment of the present invention can meet the storage requirements of historical time series data. In addition, Hbase in the prior art uses a static policy based on preset parameters for merging and splitting its internal data files, and therefore cannot adapt well to the persistence requirements of massive time series data. By using a scheme based on dynamic statistics, the embodiment of the present invention improves the efficiency of merging and splitting Hbase internal data files, which not only reduces data access latency but also keeps data access stable, and adapts well to the persistence and data access requirements of massive time series data.
2. The data to be stored is first cached, and the data cached over a period of time is then written to Hbase in one batch. This reduces the number of disk I/O operations and network communications during data persistence, and thus significantly improves the persistence efficiency of the data and the data throughput of Hbase.
3. The identifier and the storage timestamp of the data to be stored are used as the row key of the Hbase database; the content of the data to be stored is used as the archived data of the Hbase database; the row key and the archived data are stored in the Hbase database as a data item of the Hbase database. The embodiment of the present invention proposes a scheme for storing time series data in an Hbase database and ensures that the Hbase database can meet the storage requirements of historical data. Because the embodiment of the present invention sorts the row keys of the Hbase database by category according to the identifier of the data to be stored and stores them in a distributed manner, on the one hand the data is distributed reasonably across the nodes of the Hbase database in a distributed environment when stored, without triggering read/write hotspots in the Hbase database; on the other hand, the data of the same observation point is not scattered in physical storage, which would cause high overhead and inefficient queries. In addition, the contiguous ordering of the data allows historical data queries to use the scan mechanism of the Hbase database conveniently, reducing the amount of data transferred for each query.
Through a well-designed data structure in Hbase, the embodiment of the present invention improves the efficiency of storing and querying time series data. In addition, the embodiment of the present invention uses statistics on the periodic data storage volume to predict the future data storage volume, and uses the prediction to decide on merging and splitting Hbase data files. This strategy not only greatly reduces the occurrence of potential file merge or split storms inside Hbase and effectively improves data read/write performance, but also ensures continuous service availability of the system and the write speed for mass data.
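The effect of combining the identifier and the storage timestamp into the row key can be illustrated with a small, self-contained sketch. It does not use the Hbase client API: a sorted Python list stands in for the key-ordered storage, and the key layout (identifier, a '#' separator, and a 13-digit zero-padded timestamp) is an assumption chosen so that keys sort first by observation point and then by time.

```python
from bisect import bisect_left
from typing import List, Tuple


def make_row_key(point_id: str, timestamp_ms: int) -> str:
    """Row key = identifier + zero-padded timestamp (assumed layout)."""
    return f"{point_id}#{timestamp_ms:013d}"


def scan(rows: List[Tuple[str, bytes]], point_id: str,
         start_ms: int, stop_ms: int) -> List[Tuple[str, bytes]]:
    """Return rows of one point with start_ms <= timestamp < stop_ms.

    `rows` must be sorted by row key, mimicking a key-ordered store.
    """
    keys = [key for key, _ in rows]
    lo = bisect_left(keys, make_row_key(point_id, start_ms))
    hi = bisect_left(keys, make_row_key(point_id, stop_ms))
    return rows[lo:hi]  # one contiguous slice, no other point is touched
```

Because all rows of one observation point are adjacent and ordered by time, a historical query reduces to reading one contiguous key range, which is the property that a range scan exploits.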
Although preferred embodiments of the present invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (10)

1. A storage method for time series data, characterized by comprising:
Storing data to be stored in an Hbase database;
Counting the data volume in the Hbase database to obtain a statistical result;
Merging the data in the Hbase database within N merge statistics periods to obtain a merged file;
Obtaining, according to the statistical result, the data storage volume s in the (N+1)-th merge statistics period;
Judging whether the data storage volume s exceeds the preset data storage volume of the (N+1)-th merge statistics period;
If not, splitting the merged file.
2. The method as claimed in claim 1, characterized in that storing the data to be stored in the Hbase database specifically comprises:
Placing the data to be stored in a buffer area;
Judging whether the data capacity in the buffer area has reached a preset data capacity threshold;
And/or,
Judging whether the storage period of the buffer area has reached a preset storage period threshold;
If the data capacity in the buffer area has reached the preset data capacity threshold and/or the storage period of the buffer area has reached the preset storage period threshold, storing the data to be stored in the Hbase database.
3. The method as claimed in claim 2, characterized in that storing the data to be stored in the Hbase database specifically comprises:
Using the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database;
Using the content of the data to be stored as the archived data of the Hbase database;
Storing the row key and the archived data in the Hbase database as a data item of the Hbase database.
4. The method as claimed in claim 1, characterized in that counting the data volume in the Hbase database specifically comprises:
Judging whether the storage moment T_0 of the data to be stored is earlier than a preset earliest statistics moment;
If so, discarding the data to be stored;
Judging whether the storage moment T_0 of the data to be stored lies between the preset earliest statistics moment and a preset latest statistics moment;
If so, adding the data to be stored to the existing data volume in the Hbase database, completing the count;
Judging whether the storage moment T_0 of the data to be stored is later than the preset latest statistics moment;
If so, taking the storage moment T_0 of the data to be stored as the new latest statistics moment, and adding the data to be stored to the existing data volume in the Hbase database, completing the count.
5. The method as claimed in claim 3, characterized in that merging the data in the Hbase database specifically comprises:
Judging whether the data write volume of the Hbase database within the N merge statistics periods has reached a preset data write threshold;
If so, writing the data items in the Hbase database, in ascending order of row key, to an independent file;
Judging whether the number of files under the corresponding server in the Hbase database exceeds a preset file merge threshold;
If so, merging the files in the corresponding server.
6. A storage device for time series data, characterized by comprising:
A data storage module, configured to store data to be stored in an Hbase database;
A data statistics module, configured to count the data volume in the Hbase database and obtain a statistical result;
A data merging module, configured to merge the data in the Hbase database within N merge statistics periods and obtain a merged file;
A data processing module, configured to obtain, according to the statistical result, the data storage volume s in the (N+1)-th merge statistics period;
A judging module, configured to judge whether the data storage volume s exceeds the preset data storage volume of the (N+1)-th merge statistics period;
A splitting module, configured to split the merged file if the judgment result of the judging module is no.
7. The device as claimed in claim 6, characterized in that the data storage module specifically comprises:
A data storage execution unit, configured to place the data to be stored in a buffer area;
A first judging unit, configured to judge whether the data capacity in the buffer area has reached a preset data capacity threshold;
A first data storage subunit, configured to store the data to be stored in the Hbase database if the judgment result of the first judging unit is yes;
A second judging unit, configured to judge whether the storage period of the buffer area has reached a preset storage period threshold;
A second data storage subunit, configured to store the data to be stored in the Hbase database if the judgment result of the second judging unit is yes.
8. The device as claimed in claim 7, characterized in that:
The first data storage subunit is specifically configured, if the judgment result of the first judging unit is yes, to use the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database, to use the content of the data to be stored as the archived data of the Hbase database, and to store the row key and the archived data in the Hbase database as a data item of the Hbase database;
The second data storage subunit is specifically configured, if the judgment result of the second judging unit is yes, to use the identifier and the storage timestamp of the data to be stored as the row key of the Hbase database, to use the content of the data to be stored as the archived data of the Hbase database, and to store the row key and the archived data in the Hbase database as a data item of the Hbase database.
9. The device as claimed in claim 6, characterized in that the data statistics module specifically comprises:
A third judging unit, configured to judge whether the storage moment T_0 of the data to be stored is earlier than a preset earliest statistics moment;
A data discarding unit, configured to discard the data to be stored if the judgment result of the third judging unit is yes;
A fourth judging unit, configured to judge whether the storage moment T_0 of the data to be stored lies between the preset earliest statistics moment and a preset latest statistics moment;
A first statistics object, configured, if the judgment result of the fourth judging unit is yes, to add the data to be stored to the existing data volume in the Hbase database, completing the count and obtaining the statistical result;
A fifth judging unit, configured to judge whether the storage moment T_0 of the data to be stored is later than the preset latest statistics moment;
A second statistics object, configured, if the judgment result of the fifth judging unit is yes, to take the storage moment T_0 of the data to be stored as the new latest statistics moment, and to add the data to be stored to the existing data volume in the Hbase database, completing the count and obtaining the statistical result.
10. The device as claimed in claim 8, characterized in that the data merging module specifically comprises:
A sixth judging unit, configured to judge whether the data write volume of the Hbase database within the N merge statistics periods has reached a preset data write threshold;
A file generating unit, configured, if the judgment result of the sixth judging unit is yes, to write the data items in the Hbase database, in ascending order of row key, to an independent file;
A seventh judging unit, configured to judge whether the number of files under the corresponding server in the Hbase database exceeds a preset file merge threshold;
A merge execution unit, configured, if the judgment result of the seventh judging unit is yes, to merge the files in the corresponding server and obtain the merged file.
CN201510429895.5A 2015-07-21 2015-07-21 A kind of storage method and device of time series data Active CN105005617B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510429895.5A CN105005617B (en) 2015-07-21 2015-07-21 A kind of storage method and device of time series data

Publications (2)

Publication Number Publication Date
CN105005617A true CN105005617A (en) 2015-10-28
CN105005617B CN105005617B (en) 2018-10-12

Family

ID=54378293

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510429895.5A Active CN105005617B (en) 2015-07-21 2015-07-21 A kind of storage method and device of time series data

Country Status (1)

Country Link
CN (1) CN105005617B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant