CN109492008A - Network big data model design method and system based on HBase - Google Patents

Info

Publication number
CN109492008A
Authority
CN
China
Legal status
Pending
Application number
CN201811343471.7A
Other languages
Chinese (zh)
Inventor
袁守正
姚磊
曹征
吴舸
Current Assignee
Shanghai Ideal Information Industry Group Co Ltd
Original Assignee
Shanghai Ideal Information Industry Group Co Ltd

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a network big data model design method and system based on HBase. The method includes the following steps: step S1, creating multiple HBase-based data storage tables; step S2, aggregating the collected network device data by time into the corresponding data tables according to an aggregation algorithm; step S3, storing the raw data and the aggregated data in the corresponding HBase data storage tables; step S4, receiving a user's query conditions, generating retrieval conditions from them, and querying the HBase-based data storage tables according to the generated retrieval conditions. The invention supports reliable storage and efficient querying of the massive data that network devices report in a network management system.

Description

Network big data model design method and system based on HBase
Technical field
The present invention relates to the field of database technology, and in particular to a network big data model design method and system based on HBase.
Background
China Telecom's network management expert service is coupled with China Telecom's backbone network monitoring center and provides customers with a one-stop remote monitoring and management service covering lines and network devices. In actual operation and maintenance, however, customers and O&M engineers have to deal with many platforms: third-party platforms, self-developed platforms, the layer-2 network self-service monitoring platform, layer-7 traffic monitoring, the equipment-room manager, the ITSM platform, and so on. Third-party platform providers respond slowly to customer needs, and the data, customer information, and configuration of the individual platforms are relatively independent and hard to unify, which degrades the customer experience. To solve these problems, some companies have developed network management systems that unify the data, customer information, and configuration of the platforms, speeding up the response to customer needs and improving the customer experience.
However, traditional network management systems usually store their data in relational databases such as MySQL, SQL Server, or Oracle. The prominent problems are that they cannot handle high-frequency insertion and querying of large data volumes, and that such solutions are costly and scale poorly.
Summary of the invention
To overcome the deficiencies of the prior art described above, the object of the present invention is to provide a network big data model design method and system based on HBase that supports reliable storage and efficient querying of the massive data reported by network devices in a network management system.
To achieve the above object, the present invention proposes a network big data model design method based on HBase, including the following steps:
Step S1: create multiple HBase-based data storage tables;
Step S2: aggregate the collected network device data by time into the corresponding data tables according to an aggregation algorithm;
Step S3: store the raw data and the aggregated data in the corresponding HBase-based data storage tables;
Step S4: receive the user's query conditions, generate retrieval conditions from them, and query the HBase-based data storage tables according to the generated retrieval conditions.
Preferably, the data storage tables include some or all of the following: a raw data table, a 5-minute data table, a 15-minute data table, an hourly data table, a daily data table, a weekly data table, a monthly data table, and a yearly data table.
Preferably, the structure of the raw data table includes a row key RowKey and a column family Value, where:
RowKey = reversed NE_ID + reversed SERVICE_ITEM_ID + TIME;
Here NE_ID is the network element ID, SERVICE_ITEM_ID is the ID of the collected metric, and TIME is a time string accurate to the hour; the reversal operation arranges the characters of the string in reverse order.
The column family Value has columns 0 to 59; each column stores the data of a specific metric SERVICE_ITEM_ID of network element NE_ID at a given time.
Preferably, the structure of the 5-minute, 15-minute, hourly, daily, weekly, monthly, and yearly data tables includes a row key RowKey and a column family Value, where:
RowKey = reversed NE_ID + reversed SERVICE_ITEM_ID + TIME;
Here NE_ID is the network element ID; SERVICE_ITEM_ID is the ID of the collected metric; TIME is a string giving the ordinal number, counted from 1900, of the 5-minute period (or 15-minute period, hour, day, week, month, or year) in which the current collection time falls; the reversal operation arranges the characters of the string in reverse order.
The column family Value includes five columns, MAX, MIN, AVG, SUM, and COUNTS, where MAX is the maximum value, MIN is the minimum value, AVG is the average value, SUM is the sum of the aggregated data, and COUNTS is the number of aggregated records.
Preferably, NE_ID is 8 characters long and is left-padded with "0" when shorter than 8; SERVICE_ITEM_ID is 4 characters long and is left-padded with "0" when shorter than 4; TIME is 12 characters long and is left-padded with "0" when shorter than 12.
Preferably, step S2 further comprises:
Step S200: extract the network element ID NE_ID and the metric ID SERVICE_ITEM_ID from the raw data of the network device collected this time;
Step S201: using the network element ID NE_ID, the metric ID SERVICE_ITEM_ID, and the time, check whether the corresponding data storage table contains data satisfying the query condition;
Step S202: if no such data exists, insert the raw data of the network device collected this time into the column family Value of the corresponding data storage table as the first record of that time period; if data satisfying the query condition exists, merge the historical data according to the aggregation algorithm and compute the column family Value.
Preferably, in step S202, the aggregation algorithm updates each column of the column family Value of each data table using the following formulas:
COUNTS column: COUNTS_new = COUNTS_old + 1;
where COUNTS_new is the number of aggregated records after merging the current raw value, and COUNTS_old is the historical number of aggregated records returned by the retrieval condition;
MIN column: MIN_new = min(MIN_old, current raw value);
where MIN_new is the minimum after merging the current raw value, and MIN_old is the historical minimum returned by the retrieval condition;
MAX column: MAX_new = max(MAX_old, current raw value);
where MAX_new is the maximum after merging the current raw value, and MAX_old is the historical maximum returned by the retrieval condition;
SUM column: SUM_new = SUM_old + current raw value;
where SUM_new is the sum of the aggregated data after merging the current raw value, and SUM_old is the historical sum returned by the retrieval condition;
AVG column: AVG_new = (SUM_old + current raw value) / (COUNTS_old + 1);
where AVG_new is the average after merging the current raw value, SUM_old is the historical sum returned by the retrieval condition, and COUNTS_old is the historical number of aggregated records.
Preferably, step S3 further comprises:
Step S300: generate the row key RowKey according to the row key generation rule for the type of table;
Step S301: store the current raw value in the Value column family of the raw data table, in the column corresponding to the minute of the collection time; for the other data tables, store the new data aggregated in step S2 as the columns of the Value column family in the corresponding HBase-based data storage table.
Preferably, in step S4, the query conditions are concatenated into retrieval conditions according to the following formulas:
reversed NE_ID + reversed SERVICE_ITEM_ID + STARTTIME
reversed NE_ID + reversed SERVICE_ITEM_ID + ENDTIME
where NE_ID is the network element ID, SERVICE_ITEM_ID is the ID of the collected metric, STARTTIME is the retrieval start time, ENDTIME is the retrieval end time, and the reversal operation arranges the characters of the string in reverse order.
To achieve the above object, the present invention also provides a network big data model design system based on HBase, comprising:
a creation unit, for creating multiple HBase-based data storage tables;
a data aggregation unit, for aggregating the collected network device data by time into the corresponding data tables according to an aggregation algorithm;
a data storage unit, for storing the raw data and the aggregated data in the corresponding HBase data storage tables;
a query and retrieval unit, for receiving the user's query conditions, generating retrieval conditions from them, and querying the HBase-based data storage tables according to the generated retrieval conditions.
Compared with the prior art, the network big data model design method and system based on HBase of the present invention create HBase-based data storage tables, aggregate the collected network device data with an aggregation algorithm, and store the aggregated data in the created tables. The invention takes full account of the characteristics of the network management domain and of HBase's column-oriented storage. It stores data efficiently, meets user needs in different application scenarios, and balances system scalability with even data distribution, so it can satisfy the big data storage and fast retrieval requirements of the network management expert service.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the HBase-based network big data model design method of the present invention;
Fig. 2 is a system architecture diagram of the HBase-based network big data model design system of the present invention;
Fig. 3 is a detailed structure diagram of the data aggregation unit in a specific embodiment of the present invention;
Fig. 4 is a flow chart of one embodiment of the HBase-based network big data model design method of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described below through specific examples and with reference to the drawings; those skilled in the art can readily understand further advantages and effects of the invention from the content disclosed in this specification. The invention can also be implemented or applied through other specific examples, and the details in this specification can be modified and changed in various ways for different viewpoints and applications without departing from the spirit of the invention.
Fig. 1 is a flow chart of the steps of the HBase-based network big data model design method of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S1: create multiple HBase-based data storage tables. In a specific embodiment of the invention, the created data storage tables include some or all of the following: a raw data table, a 5-minute data table, a 15-minute data table, an hourly data table, a daily data table, a weekly data table, a monthly data table, and a yearly data table.
Specifically, the structure of each created data storage table is as follows:
1) Raw data table: includes a row key RowKey and a column family Value.
In a specific embodiment of the invention, the row key of the raw data table is RowKey = NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + TIME (12 characters);
Here NE_ID is the reversed string of the network element ID, left-padded with "0" when shorter than 8 characters; SERVICE_ITEM_ID is the reversed string of the ID of the collected metric, left-padded with "0" when shorter than 4 characters; TIME is a time string accurate to the hour, left-padded with "0" when shorter than 12 characters; the reversal operation arranges the characters of the string in reverse order.
In a specific embodiment of the invention, the column family Value of the raw data table has columns 0 to 59; each column stores the data of a specific metric (SERVICE_ITEM_ID) of a network element (NE_ID) at a given time (for example, one minute).
2) 5-minute, 15-minute, hourly, daily, weekly, monthly, and yearly data tables: each includes a row key RowKey and a column family Value.
In a specific embodiment of the invention, the row key of these data tables is RowKey = NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + TIME (12 characters);
Here NE_ID is the network element ID; SERVICE_ITEM_ID is the ID of the collected metric; TIME is a string giving the ordinal number, counted from 1900, of the 5-minute period (or 15-minute period, hour, day, week, month, or year) in which the current collection time falls, left-padded with "0" when shorter than 12 characters; the reversal operation arranges the characters of the string in reverse order.
In a specific embodiment of the invention, the column family Value of these data tables includes, but is not limited to, five columns: MAX (maximum), MIN (minimum), AVG (average), SUM (sum of the aggregated data), and COUNTS (number of aggregated records).
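The row key rule above can be sketched in a few lines of Python. This is only an illustration: the function names `reverse_padded` and `build_row_key` are hypothetical, and the sketch assumes that padding is applied before reversal, which the text does not state explicitly. Reversing the identifiers puts their fastest-varying digits first, which presumably helps spread rows evenly across the key space.

```python
def reverse_padded(value: str, width: int) -> str:
    """Left-pad `value` with '0' to `width` characters, then reverse it.

    Assumption: padding happens before reversal; the source does not say.
    """
    if len(value) > width:
        raise ValueError(f"{value!r} is longer than {width} characters")
    return value.rjust(width, "0")[::-1]


def build_row_key(ne_id: str, service_item_id: str, time_field: str) -> str:
    """RowKey = reversed NE_ID (8) + reversed SERVICE_ITEM_ID (4) + TIME (12)."""
    if len(time_field) > 12:
        raise ValueError("TIME must fit in 12 characters")
    return (
        reverse_padded(ne_id, 8)
        + reverse_padded(service_item_id, 4)
        + time_field.rjust(12, "0")  # TIME is padded but not reversed
    )
```

Every key produced this way is exactly 24 characters long, so plain string comparison orders the keys of one network element and metric by time.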
Step S2: aggregate the collected network device data by time according to the aggregation algorithm. In a specific embodiment of the invention, the collected network device data can be divided by time into 5-minute data, 15-minute data, hourly data, daily data, weekly data, monthly data, and yearly data, but is not limited to these.
Specifically, step S2 further comprises:
Step S200: extract the network element ID (NE_ID) and the metric ID (SERVICE_ITEM_ID) from the raw data of the network device collected this time;
Step S201: using the network element ID (NE_ID), the metric ID (SERVICE_ITEM_ID), and the time (for example, for 5-minute data, the ordinal number of the current 5-minute period counted from 1900; for 15-minute data, the ordinal number of the current 15-minute period counted from 1900), check whether the corresponding data storage table contains data satisfying the query condition;
Step S202: if no such data exists, insert the raw data of the network device collected this time into the column family Value of the corresponding data storage table as the first record of that time period, i.e. COUNTS_new = 1, MIN_new = current raw value, MAX_new = current raw value, SUM_new = current raw value, AVG_new = current raw value. For example, if the raw data collected this time is 5-minute data, it is inserted into the 5-minute data table as the first record of that 5-minute period. If the data storage table does contain data satisfying the query condition, merge the historical data according to the aggregation algorithm and compute the Value column family. Specifically, the aggregation algorithm updates each column of the column family Value of each data table using the following formulas:
COUNTS column: COUNTS_new = COUNTS_old + 1;
where COUNTS_new is the number of aggregated records after merging the current raw value, and COUNTS_old is the historical number of aggregated records returned by the retrieval condition;
MIN column: MIN_new = min(MIN_old, current raw value);
where MIN_new is the minimum after merging the current raw value, and MIN_old is the historical minimum returned by the retrieval condition;
MAX column: MAX_new = max(MAX_old, current raw value);
where MAX_new is the maximum after merging the current raw value, and MAX_old is the historical maximum returned by the retrieval condition;
SUM column: SUM_new = SUM_old + current raw value;
where SUM_new is the sum of the aggregated data after merging the current raw value, and SUM_old is the historical sum returned by the retrieval condition;
AVG column: AVG_new = (SUM_old + current raw value) / (COUNTS_old + 1);
where AVG_new is the average after merging the current raw value, SUM_old is the historical sum returned by the retrieval condition, and COUNTS_old is the historical number of aggregated records.
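Steps S201 and S202 amount to a read, merge, and write cycle on one row. The Python sketch below illustrates that cycle under stated assumptions: a plain dictionary stands in for an HBase table, the names `aggregate` and `upsert` are hypothetical, and concurrent writers are ignored, since the text does not discuss them.

```python
from typing import Dict, Optional

Row = Dict[str, float]


def aggregate(old: Optional[Row], sample: float) -> Row:
    """Merge one raw sample into the aggregate row of its time period."""
    if old is None:
        # First record of the period: every column starts from the sample.
        return {"COUNTS": 1, "MIN": sample, "MAX": sample,
                "SUM": sample, "AVG": sample}
    counts = old["COUNTS"] + 1
    total = old["SUM"] + sample
    return {
        "COUNTS": counts,
        "MIN": min(old["MIN"], sample),
        "MAX": max(old["MAX"], sample),
        "SUM": total,
        "AVG": total / counts,  # (SUM_old + sample) / (COUNTS_old + 1)
    }


def upsert(table: Dict[str, Row], row_key: str, sample: float) -> Row:
    """Look the row up (S201), then insert or merge it (S202)."""
    table[row_key] = aggregate(table.get(row_key), sample)
    return table[row_key]
```

Because AVG is always recomputed from SUM and COUNTS, the stored AVG_old never enters the calculation.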
Step S3: store the raw data and the aggregated data in the corresponding HBase data storage tables.
Specifically, step S3 further comprises:
Step S300: generate the row key RowKey according to the type of table (raw data table, or 5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table) and the row key generation rule in step S1;
Step S301: store the current raw value in the Value column family of the raw data table, in the column corresponding to the minute of the collection time; for the other data tables, store the new data aggregated in step S2 as the columns of the Value column family in the corresponding 5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table in HBase, for use by queries.
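For the raw data table, step S301 means that one row holds an hour of samples and the minute selects the column. The sketch below shows one way to derive both. The function name `raw_cell` is hypothetical, and the hour format `YYYYMMDDHH` is purely an assumption: the text only says that TIME is "accurate to the hour" and padded to 12 characters.

```python
from datetime import datetime
from typing import Tuple


def raw_cell(ne_id: str, service_item_id: str,
             collected_at: datetime) -> Tuple[str, str]:
    """Return (row key, column qualifier) for one raw sample."""
    # Assumed hour format: YYYYMMDDHH, left-padded with '0' to 12 characters.
    hour_field = collected_at.strftime("%Y%m%d%H").rjust(12, "0")
    row_key = (
        ne_id.rjust(8, "0")[::-1]
        + service_item_id.rjust(4, "0")[::-1]
        + hour_field
    )
    qualifier = str(collected_at.minute)  # one of the columns 0 to 59
    return row_key, qualifier
```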
Step S4: receive the user's query conditions, generate retrieval conditions from them, and query the HBase-based data storage tables according to the generated retrieval conditions. Specifically, given the type of table to be queried (5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table), the network element ID (NE_ID), and the metric ID (SERVICE_ITEM_ID), convert the retrieval time range entered by the user into ordinal numbers, counted from 1900, of the 5-minute period, 15-minute period, hour, day, week, month, or year (determined by the table type) in which each retrieval time falls, left-padded with "0" to 12 characters. The query conditions are then concatenated into the retrieval conditions NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + STARTTIME (12 characters) and NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + ENDTIME (12 characters); the matching data is queried in the HBase-based data storage table and the query result is returned.
Fig. 2 is a system architecture diagram of the HBase-based network big data model design system of the present invention. As shown in Fig. 2, the system comprises:
Creation unit 20, for creating multiple HBase-based data storage tables. In a specific embodiment of the invention, the created data storage tables include some or all of the following: a raw data table, a 5-minute data table, a 15-minute data table, an hourly data table, a daily data table, a weekly data table, a monthly data table, and a yearly data table.
Specifically, the structure of each created data storage table is as follows:
1) Raw data table: includes a row key RowKey and a column family Value.
In a specific embodiment of the invention, the row key of the raw data table is RowKey = NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + TIME (12 characters);
Here NE_ID is the reversed string of the network element ID, left-padded with "0" when shorter than 8 characters; SERVICE_ITEM_ID is the reversed string of the ID of the collected metric, left-padded with "0" when shorter than 4 characters; TIME is a time string accurate to the hour, left-padded with "0" when shorter than 12 characters; the reversal operation arranges the characters of the string in reverse order.
In a specific embodiment of the invention, the column family Value of the raw data table has columns 0 to 59; each column stores the data of a specific metric (SERVICE_ITEM_ID) of a network element (NE_ID) at a given time (for example, one minute).
2) 5-minute, 15-minute, hourly, daily, weekly, monthly, and yearly data tables: each includes a row key RowKey and a column family Value.
In a specific embodiment of the invention, the row key of these data tables is RowKey = NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + TIME (12 characters);
Here NE_ID is the network element ID; SERVICE_ITEM_ID is the ID of the collected metric; TIME is a string giving the ordinal number, counted from 1900, of the 5-minute period (or 15-minute period, hour, day, week, month, or year) in which the current collection time falls, left-padded with "0" when shorter than 12 characters; the reversal operation arranges the characters of the string in reverse order.
In a specific embodiment of the invention, the column family Value of these data tables includes five columns: MAX (maximum), MIN (minimum), AVG (average), SUM (sum of the aggregated data), and COUNTS (number of aggregated records).
Data aggregation unit 21, for aggregating the collected network device data by time according to the aggregation algorithm. In a specific embodiment of the invention, the collected network device data can be divided by time into 5-minute data, 15-minute data, hourly data, daily data, weekly data, monthly data, and yearly data.
Specifically, as shown in Fig. 3, data aggregation unit 21 further comprises:
Data parsing unit 210, for extracting the network element ID (NE_ID) and the metric ID (SERVICE_ITEM_ID) from the raw data of the network device collected this time;
Retrieval unit 211, for using the network element ID (NE_ID), the metric ID (SERVICE_ITEM_ID), and the time (for example, for 5-minute data, the ordinal number of the current 5-minute period counted from 1900; for 15-minute data, the ordinal number of the current 15-minute period counted from 1900) to check whether the corresponding data storage table contains data satisfying the query condition;
Retrieval result processing unit 212: if the result from retrieval unit 211 is that no such data exists, it inserts the raw data of the network device collected this time into the column family Value of the corresponding data storage table as the first record of that time period, i.e. COUNTS_new = 1, MIN_new = current raw value, MAX_new = current raw value, SUM_new = current raw value, AVG_new = current raw value. For example, if the raw data collected this time is 5-minute data, it is inserted into the 5-minute data table as the first record of that 5-minute period. If the result from retrieval unit 211 is that the data storage table contains data satisfying the query condition, it merges the historical data according to the aggregation algorithm and computes the Value column family. Specifically, the aggregation algorithm updates each column of the column family Value of each data table using the following formulas:
COUNTS column: COUNTS_new = COUNTS_old + 1;
where COUNTS_new is the number of aggregated records after merging the current raw value, and COUNTS_old is the historical number of aggregated records returned by the retrieval condition;
MIN column: MIN_new = min(MIN_old, current raw value);
where MIN_new is the minimum after merging the current raw value, and MIN_old is the historical minimum returned by the retrieval condition;
MAX column: MAX_new = max(MAX_old, current raw value);
where MAX_new is the maximum after merging the current raw value, and MAX_old is the historical maximum returned by the retrieval condition;
SUM column: SUM_new = SUM_old + current raw value;
where SUM_new is the sum of the aggregated data after merging the current raw value, and SUM_old is the historical sum returned by the retrieval condition;
AVG column: AVG_new = (SUM_old + current raw value) / (COUNTS_old + 1);
where AVG_new is the average after merging the current raw value, SUM_old is the historical sum returned by the retrieval condition, and COUNTS_old is the historical number of aggregated records;
Data storage unit 22, for storing the raw data and the aggregated data in the corresponding HBase data storage tables.
Specifically, data storage unit 22 further comprises:
a row key generation unit, for generating the row key RowKey according to the type of table (raw data table, or 5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table) and the row key generation rule in creation unit 20;
a storage unit, for storing the current raw value in the Value column family of the raw data table, in the column corresponding to the minute of the collection time; for the other data tables, the new data aggregated by data aggregation unit 21 is stored as the columns of the Value column family in the corresponding raw data table, 5-minute data table, 15-minute data table, hourly data table, daily data table, weekly data table, monthly data table, or yearly data table in HBase, for use by queries.
Query and retrieval unit 23, for receiving the user's query conditions, generating retrieval conditions from them, and querying the HBase-based data storage tables according to the generated retrieval conditions. Specifically, given the type of table to be queried (5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table), the network element ID (NE_ID), and the metric ID (SERVICE_ITEM_ID), query and retrieval unit 23 converts the retrieval time range entered by the user into ordinal numbers, counted from 1900, of the 5-minute period, 15-minute period, hour, day, week, month, or year (determined by the table type) in which each retrieval time falls, left-padded with "0" to 12 characters. It then concatenates the query conditions into the retrieval conditions NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + STARTTIME (12 characters) and NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + ENDTIME (12 characters), queries the matching data in the HBase-based data storage table, and returns the query result.
Fig. 4 shows one embodiment of the HBase-based network big data model design method of the present invention. The system collects network device information, aggregates the data according to the aggregation algorithm, and stores it in HBase-based database tables; the user's system can then query the data stored in HBase through the present invention:
Step 1: create the HBase-based data storage tables
Create the raw data table, 5-minute data table, 15-minute data table, hourly data table, daily data table, weekly data table, monthly data table, and yearly data table. The raw data table includes a row key RowKey and a column family Value; its Value column family has columns 0 to 59, and each column stores one minute of data for a specific metric (SERVICE_ITEM_ID) of a network element (NE_ID). The 5-minute, 15-minute, hourly, daily, weekly, monthly, and yearly data tables each include a row key RowKey and a column family Value, and the Value column family includes five columns: MAX (maximum), MIN (minimum), AVG (average), SUM (sum of the aggregated data), and COUNTS (number of aggregated records).
Step 2: aggregate the collected network device data according to the aggregation algorithm
1) 5-minute data aggregation:
Suppose the raw value collected this time is 0.7. Extract the network element ID (NE_ID) (for example a port, CPU, or memory) and the metric ID (SERVICE_ITEM_ID) (for example CPU, memory, or network throughput metrics) from the raw data of the network device collected this time. Using the three query conditions, network element ID (NE_ID), metric ID (SERVICE_ITEM_ID), and time (the ordinal number of the current 5-minute period counted from 1900), check whether the 5-minute data table contains data satisfying the query condition. If it does not, the 5-minute data stored this time is: COUNTS_new = 1, MIN_new = 0.7, MAX_new = 0.7, SUM_new = 0.7, AVG_new = 0.7;
If it does, merge the historical data according to the following algorithm:
The historical data obtained by the query is: MIN_old = 0.6, MAX_old = 0.65, SUM_old = 1.3, AVG_old = 0.75, COUNTS_old = 2. The data is then aggregated:
COUNTS column: COUNTS_new = COUNTS_old + 1 = 2 + 1 = 3;
MIN column: MIN_new = min(MIN_old, current raw value) = min(0.6, 0.7) = 0.6;
MAX column: MAX_new = max(MAX_old, current raw value) = max(0.65, 0.7) = 0.7;
SUM column: SUM_new = SUM_old + current raw value = 1.3 + 0.7 = 2.0;
AVG column: AVG_new = (SUM_old + current raw value) / (COUNTS_old + 1) = (1.3 + 0.7) / (2 + 1) ≈ 0.67;
2) In this embodiment, the aggregation algorithm and procedure for 15-minute, hourly, daily, weekly, monthly, and yearly data are essentially the same as for 5-minute data. The only difference is the retrieval time: depending on the type of data table being queried, the time in the retrieval condition is the ordinal number, counted from 1900, of the 15-minute period, hour, day, week, month, or year in which the current collection time falls.
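Since the tables differ only in how the time is bucketed, the period number can be computed by one function for every fixed-width granularity. The sketch below makes two assumptions that the text does not spell out: counting starts at 1900-01-01 00:00 and the number is the count of complete periods elapsed since then. Under these assumptions the sketch reproduces the 5-minute period numbers quoted in the retrieval example of this document. The name `period_index` is hypothetical, and weekly, monthly, and yearly buckets are left out because they depend on calendar rules the text does not specify.

```python
from datetime import datetime, timedelta

# Assumed origin of the count: midnight at the start of 1900.
EPOCH = datetime(1900, 1, 1)

# Fixed-width granularities only.
PERIODS = {
    "5min": timedelta(minutes=5),
    "15min": timedelta(minutes=15),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}


def period_index(collected_at: datetime, granularity: str) -> int:
    """Number of complete periods between EPOCH and `collected_at`."""
    return (collected_at - EPOCH) // PERIODS[granularity]
```

For example, `period_index(datetime(2018, 1, 1, 12, 0), "5min")` evaluates to 12412656, and every timestamp up to 12:04:59 on that day falls into the same bucket.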
Step 3, the data after initial data and merger are stored into the table data store accordingly based on Hbase
According to the classification of table (raw data table, 5 minutes, 15 minutes, hour, day, week, the moon, annual data table), according to step 1 line unit create-rule generates line unit Rowkey, this initial data is stored according to the minute where acquisition time to original In the Value column family of tables of data in the column of which corresponding minute;Other tables of data using the new data of merger in step 2 as Corresponding 5-minute data table, 15 minute data tables, hour data table, number of days based on Hbase are arrived in the column of Value column family, storage According in table, weekly data table, moon tables of data, annual data table, used for inquiry.
For example, if TIME in the 5-minute data table is the 4116226th 5-minute interval since 1900, the network element ID is 10102643, and the metric ID is 0001, then the RowKey value used when storing the data to HBase is:
RowKey = NE_ID (8 digits, reversed) + SERVICE_ITEM_ID (4 digits, reversed) + TIME (12 digits, padded with "0" in front if shorter than 12)
= 34620101 + 1000 + 000004116226
= 346201011000000004116226
The 5-minute data obtained by merging in step 2 and this RowKey are inserted as one record into the 5-minute data table in HBase.
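The RowKey construction in this example can be reproduced with a short Python sketch. The function name `build_rowkey` is hypothetical, and padding each field to its fixed width before reversing it is an assumption about the order of operations; for full-width inputs such as those in the example the order makes no difference.

```python
def build_rowkey(ne_id: str, service_item_id: str, time_index: int) -> str:
    """Build reversed NE_ID (8) + reversed SERVICE_ITEM_ID (4) + zero-padded TIME (12)."""
    ne_part = ne_id.zfill(8)[::-1]              # pad to 8 digits, then reverse
    item_part = service_item_id.zfill(4)[::-1]  # pad to 4 digits, then reverse
    time_part = str(time_index).zfill(12)       # pad to 12 digits, not reversed
    return ne_part + item_part + time_part


rowkey = build_rowkey("10102643", "0001", 4116226)
print(rowkey)
```

Reversing the identifiers places their fastest-varying digits first, which is a common way to spread consecutive IDs across the key space; the fixed widths keep every RowKey at 24 characters.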
Step 4, data retrieval:
According to the category of the table to be retrieved (5-minute, 15-minute, hour, day, week, month, or year data table), the network element ID (NE_ID), and the metric ID (SERVICE_ITEM_ID), the retrieval time range entered by the user is converted into the number of the 5-minute interval, 15-minute interval, hour, day, week, month, or year (determined by the table category), counted from 1900, in which each retrieval time falls, padded with "0" in front if shorter than 12 digits. The query condition is then spliced into the search condition: NE_ID (8 digits, reversed) + SERVICE_ITEM_ID (4 digits, reversed) + STARTTIME (12 digits), and NE_ID (8 digits, reversed) + SERVICE_ITEM_ID (4 digits, reversed) + ENDTIME (12 digits). The HBase database is queried for the data that meet the condition, and the query result is returned.
For example, query the 5-minute data for network element ID 10102643 and metric ID 0001, from 12:00 on January 1, 2018 to 12:00 on January 2, 2018.
Start RowKey of the search condition = NE_ID (8 digits, reversed) + SERVICE_ITEM_ID (4 digits, reversed) + STARTTIME (12:00 on January 1, 2018 is the 12412656th 5-minute interval since 1900; padded with "0" in front if shorter than 12 digits)
= 34620101 + 1000 + 000012412656
= 346201011000000012412656
End RowKey of the search condition = NE_ID (8 digits, reversed) + SERVICE_ITEM_ID (4 digits, reversed) + ENDTIME (12:00 on January 2, 2018 is the 12412944th 5-minute interval since 1900; padded with "0" in front if shorter than 12 digits)
= 34620101 + 1000 + 000012412944
= 346201011000000012412944
The records whose RowKey lies between the start RowKey and the end RowKey are retrieved from the HBase-based 5-minute data table, and the query result set is returned.
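The range retrieval in this example can be illustrated with an in-memory sketch. It does not call HBase: a sorted Python list stands in for the table's lexicographically ordered row keys, and the names `build_rowkey` and `scan_range` are hypothetical. Both endpoints are treated as inclusive here, which is an assumption about what "between" means; an HBase scan normally excludes its stop row unless configured otherwise.

```python
from bisect import bisect_left, bisect_right


def build_rowkey(ne_id: str, service_item_id: str, time_index: int) -> str:
    """Reversed NE_ID (8) + reversed SERVICE_ITEM_ID (4) + zero-padded TIME (12)."""
    return ne_id.zfill(8)[::-1] + service_item_id.zfill(4)[::-1] + str(time_index).zfill(12)


def scan_range(sorted_keys: list, start_key: str, end_key: str) -> list:
    """Return all keys k with start_key <= k <= end_key from a sorted key list."""
    lo = bisect_left(sorted_keys, start_key)
    hi = bisect_right(sorted_keys, end_key)
    return sorted_keys[lo:hi]


start = build_rowkey("10102643", "0001", 12412656)  # 2018-01-01 12:00
end = build_rowkey("10102643", "0001", 12412944)    # 2018-01-02 12:00

# A tiny stand-in table: one key just before, three inside, one just after the range.
table = sorted(
    build_rowkey("10102643", "0001", t)
    for t in (12412655, 12412656, 12412800, 12412944, 12412945)
)
hits = scan_range(table, start, end)
print(hits)
```

Because TIME is zero-padded to a fixed width, string order on the RowKey matches numeric order on the interval number, so a single contiguous key range covers the whole time window for one network element and metric.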
In conclusion a kind of network big data design methods and system based on HBase of the present invention pass through wound The table data store based on HBase is built, merger is carried out to the network equipment data of acquisition using conflation algorithm and will be after merger Data are stored in the table data store based on HBase created, and the present invention has fully considered field of network management feature and HBase The characteristic of column storage, realizes the efficient storage of data, meets the demand under user's different application scene, and effectively simultaneous The scalability and data balancing of system are cared for, the present invention can meet the industry of network management expert's business big data storage and quick-searching Business demand.
The above-described embodiments merely illustrate the principles and effects of the present invention, and is not intended to limit the present invention.Any Without departing from the spirit and scope of the present invention, modifications and changes are made to the above embodiments by field technical staff.Therefore, The scope of the present invention, should be as listed in the claims.

Claims (10)

1. A network big data design method based on HBase, comprising the following steps:
Step S1: create multiple HBase-based data storage tables;
Step S2: merge the collected network device data with the corresponding data table according to time, using a merge algorithm;
Step S3: store the raw data and the merged data into the corresponding HBase-based data storage tables;
Step S4: receive the user's query condition, generate a search condition from it, and perform data query and retrieval in the HBase-based data storage tables according to the generated search condition.
2. The network big data design method based on HBase according to claim 1, characterized in that: the data storage tables include some or all of a raw data table, a 5-minute data table, a 15-minute data table, an hour data table, a day data table, a week data table, a month data table, and a year data table.
3. The network big data design method based on HBase according to claim 2, characterized in that the structure of the raw data table includes a row key RowKey and a Value column family, in which:
RowKey = reversed NE_ID + reversed SERVICE_ITEM_ID + TIME;
where NE_ID is the network element ID, SERVICE_ITEM_ID is the ID of the collected metric, TIME is a time character string accurate to the hour, and the reversal operation arranges the character string in reverse order,
the Value column family has columns 0 to 59, and each column stores the data of the specific metric SERVICE_ITEM_ID of network element NE_ID for the corresponding minute.
4. The network big data design method based on HBase according to claim 3, characterized in that: the structure of the 5-minute data table, 15-minute data table, hour data table, day data table, week data table, month data table, and year data table includes a row key RowKey and a Value column family, in which
RowKey = reversed NE_ID + reversed SERVICE_ITEM_ID + TIME;
where NE_ID is the network element ID; SERVICE_ITEM_ID is the ID of the collected metric; TIME is a character string giving the number of the 5-minute interval, 15-minute interval, hour, day, week, month, or year, counted from 1900, in which the current acquisition time falls; the reversal operation arranges the character string in reverse order,
the column family Value includes five columns, MAX, MIN, AVG, SUM, and COUNTS, where MAX is the maximum value, MIN is the minimum value, AVG is the average value, SUM is the sum of the merged data, and COUNTS is the number of merged records.
5. The network big data design method based on HBase according to claim 3 or 4, characterized in that: NE_ID is 8 digits and is padded with "0" in front when shorter than 8 digits; SERVICE_ITEM_ID is 4 digits and is padded with "0" in front when shorter than 4 digits; TIME is 12 digits and is padded with "0" in front when the string is shorter than 12 digits.
6. The network big data design method based on HBase according to claim 5, characterized in that step S2 further comprises:
Step S200: extract the network element ID NE_ID and the metric ID SERVICE_ITEM_ID from the raw data of the network device collected this time;
Step S201: according to the network element ID NE_ID, the metric ID SERVICE_ITEM_ID, and the time, check whether data meeting the query condition exists in the corresponding data storage table;
Step S202: if no such data exists, insert the raw data of the network device collected this time, as the first record for that time period, into the column family Value of the corresponding data storage table; if data meeting the query condition exists, merge it with the historical data according to the merge algorithm to compute the column family Value.
7. The network big data design method based on HBase according to claim 6, characterized in that, in step S202, the merge algorithm merges each column of the column family Value of each data table using the following formulas:
COUNTS column merge formula: COUNTS_new = COUNTS_old + 1;
where COUNTS_new is the number of merged records after merging the current raw data, and COUNTS_old is the historical number of merged records returned by the search condition;
MIN column merge formula: MIN_new = min(MIN_old, current raw data);
where MIN_new is the minimum value after merging the current raw data, and MIN_old is the historical minimum value returned by the search condition;
MAX column merge formula: MAX_new = max(MAX_old, current raw data);
where MAX_new is the maximum value after merging the current raw data, and MAX_old is the historical maximum value returned by the search condition;
SUM column merge formula: SUM_new = SUM_old + current raw data;
where SUM_new is the sum of the merged data after merging the current raw data, and SUM_old is the historical sum of merged data returned by the search condition;
AVG column merge formula: AVG_new = (SUM_old + current raw data) / (COUNTS_old + 1);
where AVG_new is the average value after merging the current raw data, SUM_old is the historical sum of merged data returned by the search condition, and COUNTS_old is the historical number of merged records.
8. The network big data design method based on HBase according to claim 1, characterized in that step S3 further comprises:
Step S300: generate the row key RowKey according to the row key generation rule and the category of the table,
Step S301: store the current raw data, according to the minute of its acquisition time, in the column of the Value column family of the raw data table that corresponds to that minute; for the other data tables, use the new data merged in step S2 as the columns of the Value column family and store it into the corresponding HBase-based data storage table.
9. The network big data design method based on HBase according to claim 8, characterized in that: in step S4, the query condition is spliced into the search condition according to the following formulas:
reversed NE_ID + reversed SERVICE_ITEM_ID + STARTTIME
reversed NE_ID + reversed SERVICE_ITEM_ID + ENDTIME
where NE_ID is the network element ID, SERVICE_ITEM_ID is the ID of the collected metric, STARTTIME is the retrieval start time, ENDTIME is the retrieval end time, and the reversal operation arranges the character string in reverse order.
10. A network big data modelling system based on HBase, comprising:
a creating unit, for creating multiple HBase-based data storage tables;
a data merge unit, for merging the collected network device data with the corresponding data table according to time, using a merge algorithm;
a data storage unit, for storing the raw data and the merged data into the corresponding HBase-based data storage tables;
a query and retrieval unit, for receiving the user's query condition, generating a search condition from it, and performing data query and retrieval in the HBase-based data storage tables according to the generated search condition.
CN201811343471.7A 2018-11-13 2018-11-13 A kind of network big data design methods and system based on HBase Pending CN109492008A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811343471.7A CN109492008A (en) 2018-11-13 2018-11-13 A kind of network big data design methods and system based on HBase

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811343471.7A CN109492008A (en) 2018-11-13 2018-11-13 A kind of network big data design methods and system based on HBase

Publications (1)

Publication Number Publication Date
CN109492008A true CN109492008A (en) 2019-03-19

Family

ID=65694323

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811343471.7A Pending CN109492008A (en) 2018-11-13 2018-11-13 A kind of network big data design methods and system based on HBase

Country Status (1)

Country Link
CN (1) CN109492008A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110502543A (en) * 2019-08-07 2019-11-26 京信通信系统(中国)有限公司 Device performance data storage method, device, equipment and storage medium
CN111538728A (en) * 2020-04-27 2020-08-14 中国科学技术大学 Method for archiving and querying historical data of large scientific device
CN113114508A (en) * 2021-04-15 2021-07-13 上海理想信息产业(集团)有限公司 Multistage variable-frequency network monitoring data acquisition method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1747398A (en) * 2004-09-08 2006-03-15 大唐移动通信设备有限公司 Mass performance data statistical method in network element management system
CN103200046A (en) * 2013-03-28 2013-07-10 青岛海信传媒网络技术有限公司 Method and system for monitoring network cell device performance
CN104216962A (en) * 2014-08-22 2014-12-17 南京邮电大学 Mass network management data indexing design method based on HBase
CN104216963A (en) * 2014-08-22 2014-12-17 南京邮电大学 Mass network management data collection and storage method based on HBase
US20160373292A1 (en) * 2015-06-22 2016-12-22 Arista Networks, Inc. Tracking state of components within a network element
CN107273482A (en) * 2017-06-12 2017-10-20 北京市天元网络技术股份有限公司 Alarm data storage method and device based on HBase

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨建东: "云环境下网管数据查询系统设计", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Similar Documents

Publication Publication Date Title
US11347776B2 (en) Index mechanism for report generation
CN103365929B (en) The management method of a kind of data base connection and system
CN103548019B (en) Method and system for providing statistical information according to data warehouse
CN103049521B (en) Virtual table directory system and the method for many attributes multiple condition searching can be realized
CN111459985B (en) Identification information processing method and device
CN109492008A (en) A kind of network big data design methods and system based on HBase
US11734292B2 (en) Cloud inference system
CN103930888B (en) Selected based on the many grain size subpopulation polymerizations updating, storing and response constrains
CN105930446B (en) A kind of telecom client label generating method based on Hadoop distributed computing technology
CN107844879A (en) Order allocation method and device
CN103793493B (en) A kind of method and system for handling car-mounted terminal mass data
CN105608188A (en) Data processing method and data processing device
CN106972978A (en) A kind of ALM method for pushing and device
WO2012100349A1 (en) Statistics forecast for range partitioned tables
US20170351762A1 (en) Generating exemplar electronic documents using semantic context
CN109871380A (en) A kind of crowd's packet application method and system based on Redis
CN109886618A (en) A kind of method and device optimizing logistics operation
CN109299089A (en) The calculating and storage method and calculating of a kind of label data of drawing a portrait and storage system
CN115358679B (en) Medical material intelligent management method based on cloud bin
CN115774717A (en) Data searching method and device, electronic equipment and computer readable storage medium
CN107682180A (en) A kind of communication network device performance indications collecting method
CN108614818B (en) Data storage, updating and query method and device
CN106326295A (en) Method and device for storing semantic data
CN110147424A (en) A kind of Top-k interblock space keyword query method and system
RU2680743C1 (en) Method of preserving and changing reference and initial records in an information data management system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190319