CN109492008A - Network big data modeling method and system based on HBase - Google Patents
- Publication number
- CN109492008A (application CN201811343471.7A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a network big data modeling method and system based on HBase. The method includes the following steps: step S1, creating multiple data storage tables based on HBase; step S2, aggregating the acquired network device data by time into the corresponding data tables according to an aggregation algorithm; step S3, storing the raw data and the aggregated data in the corresponding HBase data storage tables; step S4, receiving the user's query conditions, generating search conditions from them, and querying the HBase-based data storage tables according to the generated search conditions. The invention supports reliable storage and efficient querying of the massive data reported by network devices in a network management system.
Description
Technical field
The present invention relates to the field of database technology, and in particular to a network big data modeling method and system based on HBase.
Background art
China Telecom's Network Management Expert service is linked with China Telecom's backbone network monitoring center and provides customers with a one-stop remote monitoring and management service covering lines and network equipment. In actual operation and maintenance, however, customers and O&M engineers have to deal with multiple platforms, such as third-party platforms, in-house platforms, the layer-2 network self-service monitoring platform, layer-7 traffic monitoring, the equipment-room manager, and the ITSM platform. Third-party platform providers respond slowly to customer requirements, and the data, customer information, and configuration of each platform are relatively independent and difficult to unify, which degrades the customer experience. To solve these problems, some companies have developed network management systems that unify the data, customer information, and configuration information of the platforms, improving the response speed to customer requirements and the customer experience.
However, traditional network management systems usually store data in relational databases such as MySQL, SQL Server, or Oracle. The prominent problem is that these cannot handle high-frequency insertion and querying of large data volumes; such solutions are also costly and scale poorly.
Summary of the invention
To overcome the above deficiencies of the prior art, the object of the present invention is to provide a network big data modeling method and system based on HBase that support reliable storage and efficient querying of the massive data reported by network devices in a network management system.
To achieve the above object, the present invention proposes a network big data modeling method based on HBase, comprising the following steps:
Step S1: create multiple data storage tables based on HBase;
Step S2: aggregate the acquired network device data by time into the corresponding data tables according to an aggregation algorithm;
Step S3: store the raw data and the aggregated data in the corresponding HBase-based data storage tables;
Step S4: receive the user's query conditions, generate search conditions from them, and query the HBase-based data storage tables according to the generated search conditions.
Preferably, the data storage tables include some or all of the following: a raw data table, a 5-minute data table, a 15-minute data table, an hourly data table, a daily data table, a weekly data table, a monthly data table, and a yearly data table.
Preferably, the structure of the raw data table includes a row key (RowKey) and a Value column family, in which:
RowKey = reversed NE_ID + reversed SERVICE_ITEM_ID + TIME;
where NE_ID is the network element identifier, SERVICE_ITEM_ID is the identifier of the collected metric, and TIME is a time string accurate to the hour; the reversal operation arranges the characters of the string in reverse order.
The Value column family has columns 0 to 59; each column stores the data of a specific metric (SERVICE_ITEM_ID) of a network element (NE_ID) for a given time.
Preferably, the structure of the 5-minute, 15-minute, hourly, daily, weekly, monthly, and yearly data tables includes a row key (RowKey) and a Value column family, in which:
RowKey = reversed NE_ID + reversed SERVICE_ITEM_ID + TIME;
where NE_ID is the network element identifier; SERVICE_ITEM_ID is the identifier of the collected metric; TIME is a string giving the ordinal, counted from the year 1900, of the 5-minute period (or 15-minute period, hour, day, week, month, or year) in which the current acquisition time falls; the reversal operation arranges the characters of the string in reverse order.
The Value column family includes five columns, MAX, MIN, AVG, SUM, and COUNTS, where MAX is the maximum value, MIN is the minimum value, AVG is the average value, SUM is the sum of the aggregated data, and COUNTS is the number of aggregated records.
Preferably, NE_ID is 8 characters long and is left-padded with "0" when shorter; SERVICE_ITEM_ID is 4 characters long and is left-padded with "0" when shorter; TIME is 12 characters long and is left-padded with "0" when the string is shorter than 12 characters.
Preferably, step S2 further comprises:
Step S200: extract the network element identifier NE_ID and the metric identifier SERVICE_ITEM_ID from the raw data of the network device acquired this time;
Step S201: using NE_ID, SERVICE_ITEM_ID, and the time as query conditions, check whether the corresponding data storage table contains data that meets them;
Step S202: if no such data exists, insert the raw data acquired this time into the Value column family of the corresponding data storage table as the first record for that time period; if data meeting the query conditions exists, merge the new value with the historical data according to the aggregation algorithm and recompute the Value column family.
Preferably, in step S202, the aggregation algorithm updates each column of the Value column family of each data table with the following formulas:
COUNTS column: COUNTS_new = COUNTS_old + 1;
where COUNTS_new is the number of aggregated records after the current raw value is merged, and COUNTS_old is the historical record count returned by the search condition;
MIN column: MIN_new = min(MIN_old, current raw value);
where MIN_new is the minimum after the current raw value is merged, and MIN_old is the historical minimum returned by the search condition;
MAX column: MAX_new = max(MAX_old, current raw value);
where MAX_new is the maximum after the current raw value is merged, and MAX_old is the historical maximum returned by the search condition;
SUM column: SUM_new = SUM_old + current raw value;
where SUM_new is the sum of the aggregated data after the current raw value is merged, and SUM_old is the historical sum returned by the search condition;
AVG column: AVG_new = (SUM_old + current raw value) / (COUNTS_old + 1);
where AVG_new is the average after the current raw value is merged, SUM_old is the historical sum returned by the search condition, and COUNTS_old is the historical record count.
Preferably, step S3 further comprises:
Step S300: generate the row key (RowKey) according to the row key generation rule for the type of table;
Step S301: store the current raw value in the Value column family of the raw data table, in the column corresponding to the minute of its acquisition time; for the other data tables, store the newly aggregated data from step S2 as the columns of the Value column family in the corresponding HBase-based data storage table.
Preferably, in step S4, the query conditions are concatenated into search conditions according to the following formulas:
reversed NE_ID + reversed SERVICE_ITEM_ID + STARTTIME
reversed NE_ID + reversed SERVICE_ITEM_ID + ENDTIME
where NE_ID is the network element identifier, SERVICE_ITEM_ID is the identifier of the collected metric, STARTTIME is the start time of the search, and ENDTIME is the end time of the search; the reversal operation arranges the characters of the string in reverse order.
To achieve the above object, the present invention also provides a network big data modeling system based on HBase, comprising:
a creation unit, for creating multiple data storage tables based on HBase;
a data aggregation unit, for aggregating the acquired network device data by time into the corresponding data tables according to an aggregation algorithm;
a data storage unit, for storing the raw data and the aggregated data in the corresponding HBase data storage tables;
a query and retrieval unit, for receiving the user's query conditions, generating search conditions from them, and querying the HBase-based data storage tables according to the generated search conditions.
Compared with the prior art, the network big data modeling method and system based on HBase of the present invention create HBase-based data storage tables, aggregate the acquired network device data using the aggregation algorithm, and store the aggregated data in the created tables. The invention takes full account of the characteristics of the network management domain and of HBase's column-oriented storage, achieves efficient data storage, meets user needs in different application scenarios, and effectively balances system scalability and data balancing. It can thereby meet the Network Management Expert service's requirements for big data storage and fast retrieval.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the network big data modeling method based on HBase of the present invention;
Fig. 2 is a system architecture diagram of the network big data modeling system based on HBase of the present invention;
Fig. 3 is a detailed structural diagram of the data aggregation unit in a specific embodiment of the invention;
Fig. 4 is a flow chart of one embodiment of the network big data modeling method based on HBase of the present invention.
Detailed description of the embodiments
The embodiments of the present invention are described below through specific examples with reference to the drawings; those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The invention can also be implemented or applied through other specific examples, and the details in this specification can be modified and changed in various ways, from different viewpoints and for different applications, without departing from the spirit of the invention.
Fig. 1 is a flow chart of the steps of the network big data modeling method based on HBase of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S1: create multiple data storage tables based on HBase. In a specific embodiment of the invention, the tables created include some or all of the following: a raw data table, a 5-minute data table, a 15-minute data table, an hourly data table, a daily data table, a weekly data table, a monthly data table, and a yearly data table.
Specifically, the structure of each data storage table created is as follows:
1) Raw data table: includes a row key (RowKey) and a Value column family.
In a specific embodiment of the invention, the row key of the raw data table is RowKey = NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + TIME (12 characters);
where NE_ID is the reversed string of the network element identifier, left-padded with "0" when shorter than 8 characters; SERVICE_ITEM_ID is the reversed string of the identifier of the collected metric, left-padded with "0" when shorter than 4 characters; TIME is a time string accurate to the hour, left-padded with "0" when shorter than 12 characters; the reversal operation arranges the characters of the string in reverse order.
In a specific embodiment of the invention, the Value column family of the raw data table has columns 0 to 59, and each column stores the data of a specific metric (SERVICE_ITEM_ID) of a network element (NE_ID) for a given time (for example, one minute).
2) 5-minute, 15-minute, hourly, daily, weekly, monthly, and yearly data tables: each includes a row key (RowKey) and a Value column family.
In a specific embodiment of the invention, the row key of these tables is RowKey = NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + TIME (12 characters);
where NE_ID is the network element identifier; SERVICE_ITEM_ID is the identifier of the collected metric; TIME is a string giving the ordinal, counted from the year 1900, of the 5-minute period (or 15-minute period, hour, day, week, month, or year) in which the current acquisition time falls, left-padded with "0" when shorter than 12 characters; the reversal operation arranges the characters of the string in reverse order.
In a specific embodiment of the invention, the Value column family of these tables includes, but is not limited to, five columns: MAX (maximum value), MIN (minimum value), AVG (average value), SUM (sum of the aggregated data), and COUNTS (number of aggregated records).
Step S2: aggregate the acquired network device data by time according to the aggregation algorithm. In a specific embodiment of the invention, the acquired network device data can be divided by time into 5-minute data, 15-minute data, hourly data, daily data, weekly data, monthly data, and yearly data, but is not limited to these.
Specifically, step S2 further comprises:
Step S200: extract the network element ID (NE_ID) and the metric ID (SERVICE_ITEM_ID) from the raw data of the network device acquired this time;
Step S201: using the network element ID (NE_ID), the metric ID (SERVICE_ITEM_ID), and the time (for 5-minute data, the ordinal of the current 5-minute period counted from 1900; for 15-minute data, the ordinal of the current 15-minute period counted from 1900) as query conditions, check whether the corresponding data storage table contains data that meets them;
Step S202: if no such data exists, insert the raw data acquired this time into the Value column family of the corresponding data storage table as the first record for that time period, that is, COUNTS_new = 1, MIN_new = current raw value, MAX_new = current raw value, SUM_new = current raw value, AVG_new = current raw value. For example, if the raw data acquired is 5-minute data, it is inserted into the 5-minute data table as the first record of that 5-minute period. If the data storage table contains data meeting the query conditions, the new value is merged with the historical data according to the aggregation algorithm and the Value column family is recomputed. Specifically, the aggregation algorithm updates each column of the Value column family of each data table with the following formulas:
COUNTS column: COUNTS_new = COUNTS_old + 1;
where COUNTS_new is the number of aggregated records after the current raw value is merged, and COUNTS_old is the historical record count returned by the search condition;
MIN column: MIN_new = min(MIN_old, current raw value);
where MIN_new is the minimum after the current raw value is merged, and MIN_old is the historical minimum returned by the search condition;
MAX column: MAX_new = max(MAX_old, current raw value);
where MAX_new is the maximum after the current raw value is merged, and MAX_old is the historical maximum returned by the search condition;
SUM column: SUM_new = SUM_old + current raw value;
where SUM_new is the sum of the aggregated data after the current raw value is merged, and SUM_old is the historical sum returned by the search condition;
AVG column: AVG_new = (SUM_old + current raw value) / (COUNTS_old + 1);
where AVG_new is the average after the current raw value is merged, SUM_old is the historical sum returned by the search condition, and COUNTS_old is the historical record count.
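The five update rules can be restated compactly as a single recurrence. In the sketch below, x denotes the current raw value; the symbol x and the final identity on the last line are added here for illustration and do not appear in the original text.

```latex
\begin{aligned}
\mathrm{COUNTS}_{\mathrm{new}} &= \mathrm{COUNTS}_{\mathrm{old}} + 1 \\
\mathrm{MIN}_{\mathrm{new}} &= \min(\mathrm{MIN}_{\mathrm{old}},\, x) \\
\mathrm{MAX}_{\mathrm{new}} &= \max(\mathrm{MAX}_{\mathrm{old}},\, x) \\
\mathrm{SUM}_{\mathrm{new}} &= \mathrm{SUM}_{\mathrm{old}} + x \\
\mathrm{AVG}_{\mathrm{new}} &= \frac{\mathrm{SUM}_{\mathrm{old}} + x}{\mathrm{COUNTS}_{\mathrm{old}} + 1}
  = \frac{\mathrm{SUM}_{\mathrm{new}}}{\mathrm{COUNTS}_{\mathrm{new}}}
\end{aligned}
```

Each update therefore needs only the existing aggregate row and the new value, not the earlier raw values, and the AVG column can always be rederived from SUM and COUNTS.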
Step S3: store the raw data and the aggregated data in the corresponding HBase data storage tables.
Specifically, step S3 further comprises:
Step S300: according to the type of table (raw, 5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table), generate the row key (RowKey) using the row key generation rule given in step S1;
Step S301: store the current raw value in the Value column family of the raw data table, in the column corresponding to the minute of its acquisition time; for the other data tables, store the newly aggregated data from step S2 as the columns of the Value column family in the corresponding 5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table in HBase, for use by queries.
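As a minimal sketch of how a raw sample could be addressed under step S301, the function below returns the row key and the minute column for one sample. The function name and the hour-precision format `YYYYMMDDHH` are assumptions made for illustration; the source only states that TIME is accurate to the hour and is left-padded with "0" to 12 characters. Padding before reversing is also an assumption, chosen because it matches the source's example of metric ID 0001 becoming 1000.

```python
from datetime import datetime
from typing import Tuple


def raw_cell_address(ne_id: str, item_id: str, ts: datetime) -> Tuple[str, str]:
    """Return (row_key, column_qualifier) for one raw sample (hypothetical helper)."""
    # Left-pad with "0" to the fixed width, then reverse the characters.
    ne_part = ne_id.zfill(8)[::-1]
    item_part = item_id.zfill(4)[::-1]
    # ASSUMPTION: the hour-precision time string is YYYYMMDDHH, padded to 12.
    time_part = ts.strftime("%Y%m%d%H").zfill(12)
    # One column per minute of the hour: qualifiers "0" through "59".
    return ne_part + item_part + time_part, str(ts.minute)


row_key, column = raw_cell_address("10102643", "1", datetime(2018, 1, 1, 12, 37))
```

With this layout, all 60 one-minute samples of a metric within the same hour share one row and differ only in the column.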
Step S4: receive the user's query conditions, generate search conditions from them, and query the HBase-based data storage tables according to the generated search conditions. Specifically, based on the type of table to be searched (5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table), the network element ID (NE_ID), and the metric ID (SERVICE_ITEM_ID), the time range entered by the user is converted into the ordinal, counted from 1900, of the 5-minute period, 15-minute period, hour, day, week, month, or year (depending on the table type) in which each search time falls, left-padded with "0" when shorter than 12 characters. The query conditions are then concatenated into the search conditions NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + STARTTIME (12 characters) and NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + ENDTIME (12 characters); the matching data is retrieved from the HBase-based data storage table and the query result is returned.
Fig. 2 is a system architecture diagram of the network big data modeling system based on HBase of the present invention. As shown in Fig. 2, the system comprises:
A creation unit 20, for creating multiple data storage tables based on HBase. In a specific embodiment of the invention, the tables created include some or all of the following: a raw data table, a 5-minute data table, a 15-minute data table, an hourly data table, a daily data table, a weekly data table, a monthly data table, and a yearly data table.
Specifically, the structure of each data storage table created is as follows:
1) Raw data table: includes a row key (RowKey) and a Value column family.
In a specific embodiment of the invention, the row key of the raw data table is RowKey = NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + TIME (12 characters);
where NE_ID is the reversed string of the network element identifier, left-padded with "0" when shorter than 8 characters; SERVICE_ITEM_ID is the reversed string of the identifier of the collected metric, left-padded with "0" when shorter than 4 characters; TIME is a time string accurate to the hour, left-padded with "0" when shorter than 12 characters; the reversal operation arranges the characters of the string in reverse order.
In a specific embodiment of the invention, the Value column family of the raw data table has columns 0 to 59, and each column stores the data of a specific metric (SERVICE_ITEM_ID) of a network element (NE_ID) for a given time (for example, one minute).
2) 5-minute, 15-minute, hourly, daily, weekly, monthly, and yearly data tables: each includes a row key (RowKey) and a Value column family.
In a specific embodiment of the invention, the row key of these tables is RowKey = NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + TIME (12 characters);
where NE_ID is the network element identifier; SERVICE_ITEM_ID is the identifier of the collected metric; TIME is a string giving the ordinal, counted from the year 1900, of the 5-minute period (or 15-minute period, hour, day, week, month, or year) in which the current acquisition time falls, left-padded with "0" when shorter than 12 characters; the reversal operation arranges the characters of the string in reverse order.
In a specific embodiment of the invention, the Value column family of these tables includes five columns: MAX (maximum value), MIN (minimum value), AVG (average value), SUM (sum of the aggregated data), and COUNTS (number of aggregated records).
A data aggregation unit 21, for aggregating the acquired network device data by time according to the aggregation algorithm. In a specific embodiment of the invention, the acquired network device data can be divided by time into 5-minute data, 15-minute data, hourly data, daily data, weekly data, monthly data, and yearly data.
Specifically, as shown in Fig. 3, the data aggregation unit 21 further comprises:
A data parsing unit 210, for extracting the network element ID (NE_ID) and the metric ID (SERVICE_ITEM_ID) from the raw data of the network device acquired this time;
A retrieval unit 211, for checking, using the network element ID (NE_ID), the metric ID (SERVICE_ITEM_ID), and the time (for 5-minute data, the ordinal of the current 5-minute period counted from 1900; for 15-minute data, the ordinal of the current 15-minute period counted from 1900) as query conditions, whether the corresponding data storage table contains data that meets them;
A search result processing unit 212, which, if the retrieval unit 211 finds no such data, inserts the raw data acquired this time into the Value column family of the corresponding data storage table as the first record for that time period, that is, COUNTS_new = 1, MIN_new = current raw value, MAX_new = current raw value, SUM_new = current raw value, AVG_new = current raw value. For example, if the raw data acquired is 5-minute data, it is inserted into the 5-minute data table as the first record of that 5-minute period. If the retrieval unit 211 finds data meeting the query conditions in the data storage table, the new value is merged with the historical data according to the aggregation algorithm and the Value column family is recomputed. Specifically, the aggregation algorithm updates each column of the Value column family of each data table with the following formulas:
COUNTS column: COUNTS_new = COUNTS_old + 1;
where COUNTS_new is the number of aggregated records after the current raw value is merged, and COUNTS_old is the historical record count returned by the search condition;
MIN column: MIN_new = min(MIN_old, current raw value);
where MIN_new is the minimum after the current raw value is merged, and MIN_old is the historical minimum returned by the search condition;
MAX column: MAX_new = max(MAX_old, current raw value);
where MAX_new is the maximum after the current raw value is merged, and MAX_old is the historical maximum returned by the search condition;
SUM column: SUM_new = SUM_old + current raw value;
where SUM_new is the sum of the aggregated data after the current raw value is merged, and SUM_old is the historical sum returned by the search condition;
AVG column: AVG_new = (SUM_old + current raw value) / (COUNTS_old + 1);
where AVG_new is the average after the current raw value is merged, SUM_old is the historical sum returned by the search condition, and COUNTS_old is the historical record count.
A data storage unit 22, for storing the raw data and the aggregated data in the corresponding HBase data storage tables.
Specifically, the data storage unit 22 further comprises:
A row key generation unit, for generating the row key (RowKey) according to the type of table (raw, 5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table), using the row key generation rule given for the creation unit 20;
A storage unit, for storing the current raw value in the Value column family of the raw data table, in the column corresponding to the minute of its acquisition time; for the other data tables, the newly aggregated data from the data aggregation unit 21 is stored as the columns of the Value column family in the corresponding 5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table in HBase, for use by queries.
A query and retrieval unit 23, for receiving the user's query conditions, generating search conditions from them, and querying the HBase-based data storage tables according to the generated search conditions. Specifically, based on the type of table to be searched (5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table), the network element ID (NE_ID), and the metric ID (SERVICE_ITEM_ID), the query and retrieval unit 23 converts the time range entered by the user into the ordinal, counted from 1900, of the 5-minute period, 15-minute period, hour, day, week, month, or year (depending on the table type) in which each search time falls, left-padded with "0" when shorter than 12 characters. It then concatenates the query conditions into the search conditions NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + STARTTIME (12 characters) and NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + ENDTIME (12 characters), retrieves the matching data from the HBase-based data storage table, and returns the query result.
Fig. 4 shows one embodiment of the network big data modeling method based on HBase of the present invention: the system collects network device information, aggregates the data according to the aggregation algorithm, and stores it in HBase-based database tables; through the invention, the user's system can then query the data stored in HBase:
Step 1: create the HBase-based data storage tables.
Create the raw data table, 5-minute data table, 15-minute data table, hourly data table, daily data table, weekly data table, monthly data table, and yearly data table. The raw data table includes a row key (RowKey) and a Value column family; its Value column family has columns 0 to 59, and each column stores one minute of data for a specific metric (SERVICE_ITEM_ID) of a network element (NE_ID). The 5-minute, 15-minute, hourly, daily, weekly, monthly, and yearly data tables each include a row key (RowKey) and a Value column family; the Value column family includes five columns: MAX (maximum value), MIN (minimum value), AVG (average value), SUM (sum of the aggregated data), and COUNTS (number of aggregated records).
Step 2: aggregate the acquired network device data according to the aggregation algorithm.
1) 5-minute data aggregation:
Assume the raw value acquired this time is 0.7. The network element ID (NE_ID) (for example, a port, CPU, or memory) and the metric ID (SERVICE_ITEM_ID) (for example, CPU, memory, or network throughput metrics) are extracted from the raw data of the network device acquired this time. Using three query conditions, namely the network element ID (NE_ID), the metric ID (SERVICE_ITEM_ID), and the time (the ordinal of the current 5-minute period counted from 1900), the 5-minute data table is checked for data that meets them. If none exists, the 5-minute data stored this time is: COUNTS_new = 1, MIN_new = 0.7, MAX_new = 0.7, SUM_new = 0.7, AVG_new = 0.7;
If such data exists, the new value is merged with the historical data as follows:
Suppose the historical data returned by the query is: MIN_old = 0.6, MAX_old = 0.65, SUM_old = 1.3, AVG_old = 0.75, COUNTS_old = 2. The aggregation is then:
COUNTS column: COUNTS_new = COUNTS_old + 1 = 2 + 1 = 3;
MIN column: MIN_new = min(MIN_old, current raw value) = min(0.6, 0.7) = 0.6;
MAX column: MAX_new = max(MAX_old, current raw value) = max(0.65, 0.7) = 0.7;
SUM column: SUM_new = SUM_old + current raw value = 1.3 + 0.7 = 2.0;
AVG column: AVG_new = (SUM_old + current raw value) / (COUNTS_old + 1) = (1.3 + 0.7) / (2 + 1) ≈ 0.67;
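The update above can be sketched in a few lines of Python. This is an illustrative sketch, not the patent's implementation: the names `Aggregate` and `merge_sample` are hypothetical, and the aggregate row is modeled as a plain dictionary keyed by the five column names rather than as an HBase row.

```python
from typing import Optional, TypedDict


class Aggregate(TypedDict):
    MAX: float
    MIN: float
    AVG: float
    SUM: float
    COUNTS: int


def merge_sample(old: Optional[Aggregate], value: float) -> Aggregate:
    """Merge one raw value into an aggregate row (hypothetical helper)."""
    if old is None:
        # No row for this time period yet: the value becomes the first record.
        return {"MAX": value, "MIN": value, "AVG": value, "SUM": value, "COUNTS": 1}
    total = old["SUM"] + value
    count = old["COUNTS"] + 1
    return {
        "MAX": max(old["MAX"], value),
        "MIN": min(old["MIN"], value),
        "AVG": total / count,  # derived from SUM and COUNTS, not from the old AVG
        "SUM": total,
        "COUNTS": count,
    }


# Historical values taken from the worked example in the text.
history: Aggregate = {"MAX": 0.65, "MIN": 0.6, "AVG": 0.75, "SUM": 1.3, "COUNTS": 2}
merged = merge_sample(history, 0.7)
first = merge_sample(None, 0.7)
```

Note that the stored AVG_old plays no part in the update; only SUM_old and COUNTS_old are needed to compute the new average.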
2) In this embodiment, the aggregation algorithm and procedure for 15-minute, hourly, daily, weekly, monthly, and yearly data are essentially the same as for 5-minute data. The only difference is the search time, which depends on the type of data table being searched: the time in the search condition is, respectively, the ordinal, counted from 1900, of the 15-minute period, hour, day, week, month, or year in which the current acquisition time falls.
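One way to compute the time ordinal for each table type is sketched below. The source does not spell out the exact counting rule, so several points are assumptions: the epoch is taken as 1900-01-01 00:00, ordinals are obtained by floor division, weeks are fixed 7-day spans from the epoch, and months and years are counted by calendar. The name `bucket_index` and the granularity labels are hypothetical. Under these assumptions the 5-minute result for 12:00 on January 1, 2018 matches the figure 12412656 quoted in the source's query example.

```python
from datetime import datetime

# ASSUMPTION: periods are counted from midnight on 1 January 1900.
EPOCH = datetime(1900, 1, 1)

# Fixed-width periods, in seconds (labels are hypothetical).
FIXED_WIDTH_SECONDS = {
    "5min": 300,
    "15min": 900,
    "hour": 3600,
    "day": 86400,
    "week": 604800,
}


def bucket_index(ts: datetime, granularity: str) -> int:
    """Return the ordinal of the period containing ts, counted from EPOCH."""
    if granularity in FIXED_WIDTH_SECONDS:
        elapsed = int((ts - EPOCH).total_seconds())
        return elapsed // FIXED_WIDTH_SECONDS[granularity]
    if granularity == "month":
        # ASSUMPTION: calendar months elapsed since January 1900.
        return (ts.year - 1900) * 12 + (ts.month - 1)
    if granularity == "year":
        # ASSUMPTION: calendar years elapsed since 1900.
        return ts.year - 1900
    raise ValueError(f"unknown granularity: {granularity}")


noon = datetime(2018, 1, 1, 12, 0)
five_minute = bucket_index(noon, "5min")
```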
Step 3: store the raw data and the aggregated data in the corresponding HBase-based data storage tables.
According to the type of table (raw, 5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table), the row key (RowKey) is generated using the row key generation rule of step 1. The current raw value is stored in the Value column family of the raw data table, in the column corresponding to the minute of its acquisition time. For the other data tables, the newly aggregated data from step 2 is stored as the columns of the Value column family in the corresponding HBase-based 5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table, for use by queries.
For example, if TIME in the 5-minute data table is the 4116226th 5-minute period since 1900, the network element ID obtained is 10102643, and the metric ID is 0001, then the RowKey used when the data is stored in HBase is:
RowKey = NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + TIME (12 characters, left-padded with "0" when shorter)
= 34620101 + 1000 + 000004116226
= 346201011000000004116226
The 5-minute data obtained by aggregation in step 2 is inserted, together with this RowKey, as one record into the 5-minute data table in HBase.
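The row key construction in this example can be reproduced with a short helper. This is a sketch under one stated assumption: each identifier is left-padded with "0" before being reversed, which is the order that turns the metric ID 0001 into 1000 as in the example. The function name `build_row_key` is hypothetical.

```python
def build_row_key(ne_id: str, item_id: str, time_index: int) -> str:
    """Build a 24-character row key for an aggregate table (hypothetical helper)."""
    # Pad to the fixed width first, then reverse the characters.
    ne_part = ne_id.zfill(8)[::-1]
    item_part = item_id.zfill(4)[::-1]
    # TIME is not reversed; it is only left-padded to 12 characters.
    time_part = str(time_index).zfill(12)
    return ne_part + item_part + time_part


key = build_row_key("10102643", "0001", 4116226)
```

Because TIME is fixed-width and zero-padded, keys for the same network element and metric sort in time order, which is what makes a range scan over a time window possible.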
Step 4: data retrieval.
Based on the type of table to be searched (5-minute, 15-minute, hourly, daily, weekly, monthly, or yearly data table), the network element ID (NE_ID), and the metric ID (SERVICE_ITEM_ID), the time range entered by the user is converted into the ordinal, counted from 1900, of the 5-minute period, 15-minute period, hour, day, week, month, or year (depending on the table type) in which each search time falls, left-padded with "0" when shorter than 12 characters. The query conditions are concatenated into the search conditions NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + STARTTIME (12 characters) and NE_ID (8 characters, reversed) + SERVICE_ITEM_ID (4 characters, reversed) + ENDTIME (12 characters); the matching data is retrieved from the HBase database and the query result is returned.
For example, to query the 5-minute data of network element ID 10102643 and indicator ID 0001 from 12:00 on January 1, 2018 to 12:00 on January 2, 2018:
Start RowKey of the search condition = NE_ID (8 digits, reversed) + SERVICE_ITEM_ID (4 digits, reversed) + STARTTIME (12:00 on January 1, 2018 is the 12412656th 5-minute interval since 1900; left-padded with "0" to 12 digits)
= 34620101 + 1000 + 000012412656
= 346201011000000012412656
End RowKey of the search condition = NE_ID (8 digits, reversed) + SERVICE_ITEM_ID (4 digits, reversed) + ENDTIME (12:00 on January 2, 2018 is the 12412944th 5-minute interval since 1900; left-padded with "0" to 12 digits)
= 34620101 + 1000 + 000012412944
= 346201011000000012412944
The rows of the HBase-based 5-minute data table whose RowKey lies between the start RowKey and the end RowKey are retrieved and returned as the query result set.
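The interval numbers in this example can be reproduced with a short Python sketch. It assumes that counting starts at 1900-01-01 00:00 and that the interval number is the floor of the elapsed time divided by five minutes; the text does not state either detail, but this choice yields the figures quoted above. The names `five_minute_index` and `scan_range` are hypothetical.

```python
from datetime import datetime
from typing import Tuple

EPOCH = datetime(1900, 1, 1)  # assumed origin of the "since 1900" count


def five_minute_index(moment: datetime) -> int:
    """Number of whole 5-minute intervals elapsed since EPOCH."""
    delta = moment - EPOCH
    return (delta.days * 86400 + delta.seconds) // 300


def scan_range(ne_id: str, service_item_id: str,
               start: datetime, end: datetime) -> Tuple[str, str]:
    """Build the start and end RowKey for a 5-minute table query."""
    prefix = ne_id.zfill(8)[::-1] + service_item_id.zfill(4)[::-1]
    start_key = prefix + str(five_minute_index(start)).zfill(12)
    end_key = prefix + str(five_minute_index(end)).zfill(12)
    return start_key, end_key


start_key, end_key = scan_range(
    "10102643", "0001",
    datetime(2018, 1, 1, 12, 0), datetime(2018, 1, 2, 12, 0),
)
print(start_key)  # 346201011000000012412656
print(end_key)    # 346201011000000012412944
```

Because both keys share the same 12-character prefix and TIME is zero-padded to a fixed width, lexicographic order of the keys matches chronological order, so a single range scan covers the requested period. Note that an HBase scan treats the stop row as exclusive by default, so whether the end interval itself is returned depends on how the scan is configured; the text does not specify this.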
In conclusion a kind of network big data design methods and system based on HBase of the present invention pass through wound
The table data store based on HBase is built, merger is carried out to the network equipment data of acquisition using conflation algorithm and will be after merger
Data are stored in the table data store based on HBase created, and the present invention has fully considered field of network management feature and HBase
The characteristic of column storage, realizes the efficient storage of data, meets the demand under user's different application scene, and effectively simultaneous
The scalability and data balancing of system are cared for, the present invention can meet the industry of network management expert's business big data storage and quick-searching
Business demand.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Any person skilled in the art may modify and change the above embodiments without departing from the spirit and scope of the invention. Therefore, the scope of protection of the invention shall be as set out in the claims.
Claims (10)
1. A network big data design method based on HBase, comprising the following steps:
Step S1: create multiple HBase-based data storage tables;
Step S2: merge the collected network device data, according to time and the corresponding data table, using a merging algorithm;
Step S3: store the raw data and the merged data in the corresponding HBase-based data storage tables;
Step S4: receive the user's query conditions, generate search conditions from them, and perform data query and retrieval in the HBase-based data storage tables according to the generated search conditions.
2. The network big data design method based on HBase according to claim 1, characterized in that the data storage tables include some or all of the raw data table, 5-minute data table, 15-minute data table, hourly data table, daily data table, weekly data table, monthly data table, and yearly data table.
3. The network big data design method based on HBase according to claim 2, characterized in that the structure of the raw data table comprises a row key RowKey and a Value column family, wherein:
RowKey = reversed NE_ID + reversed SERVICE_ITEM_ID + TIME;
where NE_ID is the network element ID, SERVICE_ITEM_ID is the ID of the collected indicator, TIME is a time string accurate to the hour, and the reversal operation arranges the characters of a string in reverse order;
the Value column family has columns 0 to 59, each column storing the data of the given indicator SERVICE_ITEM_ID of network element NE_ID for the corresponding minute.
4. The network big data design method based on HBase according to claim 3, characterized in that the structure of each of the 5-minute, 15-minute, hourly, daily, weekly, monthly, and yearly data tables comprises a row key RowKey and a Value column family, wherein:
RowKey = reversed NE_ID + reversed SERVICE_ITEM_ID + TIME;
where NE_ID is the network element ID; SERVICE_ITEM_ID is the ID of the collected indicator; TIME is a string giving the number of the 5-minute interval, 15-minute interval, hour, day, week, month, or year, counted from 1900, in which the acquisition time falls; and the reversal operation arranges the characters of a string in reverse order;
the Value column family comprises five columns, MAX, MIN, AVG, SUM, and COUNTS, where MAX is the maximum value, MIN is the minimum value, AVG is the average value, SUM is the sum of the merged data, and COUNTS is the number of merged records.
5. The network big data design method based on HBase according to claim 3 or 4, characterized in that NE_ID has 8 digits and is left-padded with "0" when shorter; SERVICE_ITEM_ID has 4 digits and is left-padded with "0" when shorter; and TIME has 12 digits and is left-padded with "0" when the string is shorter than 12 digits.
6. The network big data design method based on HBase according to claim 5, characterized in that step S2 further comprises:
Step S200: extract the network element ID NE_ID and the indicator ID SERVICE_ITEM_ID from the raw data of the network device collected this time;
Step S201: using NE_ID, SERVICE_ITEM_ID, and the time, check whether the corresponding data storage table contains data matching the query condition;
Step S202: if no such data exists, compute the Value column family with the raw data collected this time as the first record for that time and insert it into the corresponding data storage table; if data matching the query condition exists, merge with the historical data according to the merging algorithm to compute the Value column family.
7. The network big data design method based on HBase according to claim 6, characterized in that, in step S202, the merging algorithm merges each column of the Value column family of each data table using the following formulas:
Merge formula for the COUNTS column: COUNTS_new = COUNTS_old + 1;
where COUNTS_new is the number of merged records after merging the current raw data, and COUNTS_old is the historical number of merged records found by the search condition;
Merge formula for the MIN column: MIN_new = min(MIN_old, current raw data);
where MIN_new is the minimum value after merging the current raw data, and MIN_old is the historical minimum value found by the search condition;
Merge formula for the MAX column: MAX_new = max(MAX_old, current raw data);
where MAX_new is the maximum value after merging the current raw data, and MAX_old is the historical maximum value found by the search condition;
Merge formula for the SUM column: SUM_new = SUM_old + current raw data;
where SUM_new is the sum of the merged data after merging the current raw data, and SUM_old is the historical sum of the merged data found by the search condition;
Merge formula for the AVG column: AVG_new = (SUM_old + current raw data) / (COUNTS_old + 1);
where AVG_new is the average value after merging the current raw data, SUM_old is the historical sum of the merged data found by the search condition, and COUNTS_old is the historical number of merged records.
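The merge formulas of claim 7, together with the first-record case of step S202, can be sketched in Python as follows. The function name `merge_value` and the dictionary representation of the Value column family are hypothetical; initializing all five columns from the first sample is an assumption consistent with, but not spelled out by, step S202.

```python
from typing import Dict, Optional


def merge_value(old: Optional[Dict[str, float]],
                sample: float) -> Dict[str, float]:
    """Merge one raw sample into the Value column family of a rollup row.

    `old` is the existing row (None if no row exists yet for this RowKey).
    """
    if old is None:
        # Assumed first-record case: every statistic starts from the sample.
        return {"MAX": sample, "MIN": sample, "AVG": sample,
                "SUM": sample, "COUNTS": 1}
    total = old["SUM"] + sample   # SUM_new = SUM_old + sample
    count = old["COUNTS"] + 1     # COUNTS_new = COUNTS_old + 1
    return {
        "MAX": max(old["MAX"], sample),
        "MIN": min(old["MIN"], sample),
        "AVG": total / count,     # (SUM_old + sample) / (COUNTS_old + 1)
        "SUM": total,
        "COUNTS": count,
    }


row = None
for sample in (10.0, 4.0, 7.0):
    row = merge_value(row, sample)
print(row)  # {'MAX': 10.0, 'MIN': 4.0, 'AVG': 7.0, 'SUM': 21.0, 'COUNTS': 3}
```

Because AVG is always recomputed from SUM and COUNTS, each new sample can be merged in constant time without rereading the raw data for the period.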
8. The network big data design method based on HBase according to claim 1, characterized in that step S3 further comprises:
Step S300: generate the row key RowKey according to the row key generation rule for the type of table;
Step S301: store the raw data collected this time in the Value column family of the raw data table, in the column corresponding to the minute of the acquisition time; for the other data tables, store the data newly merged in step S2 as the columns of the Value column family in the corresponding HBase-based data storage table.
9. The network big data design method based on HBase according to claim 8, characterized in that, in step S4, the query conditions are spliced into the search conditions according to the following formulas:
reversed NE_ID + reversed SERVICE_ITEM_ID + STARTTIME
reversed NE_ID + reversed SERVICE_ITEM_ID + ENDTIME
where NE_ID is the network element ID, SERVICE_ITEM_ID is the ID of the collected indicator, STARTTIME is the retrieval start time, ENDTIME is the retrieval end time, and the reversal operation arranges the characters of a string in reverse order.
10. A network big data modeling system based on HBase, comprising:
a creating unit, for creating multiple HBase-based data storage tables;
a data merging unit, for merging the collected network device data, according to time and the corresponding data table, using a merging algorithm;
a data storage unit, for storing the raw data and the merged data in the corresponding HBase-based data storage tables;
a query and retrieval unit, for receiving the user's query conditions, generating search conditions from them, and performing data query and retrieval in the HBase-based data storage tables according to the generated search conditions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811343471.7A CN109492008A (en) | 2018-11-13 | 2018-11-13 | A kind of network big data design methods and system based on HBase |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109492008A true CN109492008A (en) | 2019-03-19 |
Family
ID=65694323
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811343471.7A Pending CN109492008A (en) | 2018-11-13 | 2018-11-13 | A kind of network big data design methods and system based on HBase |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109492008A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110502543A (en) * | 2019-08-07 | 2019-11-26 | 京信通信系统(中国)有限公司 | Device performance data storage method, device, equipment and storage medium |
CN111538728A (en) * | 2020-04-27 | 2020-08-14 | 中国科学技术大学 | Method for archiving and querying historical data of large scientific device |
CN113114508A (en) * | 2021-04-15 | 2021-07-13 | 上海理想信息产业(集团)有限公司 | Multistage variable-frequency network monitoring data acquisition method and device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1747398A (en) * | 2004-09-08 | 2006-03-15 | 大唐移动通信设备有限公司 | Mass performance data statistical method in network element management system |
CN103200046A (en) * | 2013-03-28 | 2013-07-10 | 青岛海信传媒网络技术有限公司 | Method and system for monitoring network cell device performance |
CN104216962A (en) * | 2014-08-22 | 2014-12-17 | 南京邮电大学 | Mass network management data indexing design method based on HBase |
CN104216963A (en) * | 2014-08-22 | 2014-12-17 | 南京邮电大学 | Mass network management data collection and storage method based on HBase |
US20160373292A1 (en) * | 2015-06-22 | 2016-12-22 | Arista Networks, Inc. | Tracking state of components within a network element |
CN107273482A (en) * | 2017-06-12 | 2017-10-20 | 北京市天元网络技术股份有限公司 | Alarm data storage method and device based on HBase |
Non-Patent Citations (1)
Title |
---|
杨建东: "云环境下网管数据查询系统设计", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190319 |