CN102209118A - Distributed mass data gathering method - Google Patents

Publication number
CN102209118A
CN102209118A CN2011101541171A CN201110154117A CN102209118A CN 102209118 A CN102209118 A CN 102209118A CN 2011101541171 A CN2011101541171 A CN 2011101541171A CN 201110154117 A CN201110154117 A CN 201110154117A CN 102209118 A CN102209118 A CN 102209118A
Authority
CN
China
Prior art keywords
data
harvester
center machine
rule file
timer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011101541171A
Other languages
Chinese (zh)
Inventor
周关力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Qinzhi Digital Technology Co Ltd
Original Assignee
Chengdu Qinzhi Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Qinzhi Digital Technology Co Ltd
Priority to CN2011101541171A
Publication of CN102209118A
Legal status: Pending

Abstract

The invention discloses a distributed mass data gathering method. The method comprises the following steps: A, a user configures harvester connection information on a center machine; B, a device administrator configures multiple groups of gathering rule files on the center machine according to the differences in each harvester's data structure; C, the user selects the corresponding gathering rule files for each harvester on the center machine; D, the center machine starts a collection timer and automatically connects to the harvesters according to the connection information; E, the center machine obtains the required data according to the gathering rule files configured for each harvester, and the obtained data are sent back to the center machine; F, the center machine compresses the obtained data into statistics according to the gathering rule files configured for each harvester, and stores the results in the storage modules; and G, the center machine starts a graded gathering timer to gather the data in the storage modules periodically, level by level. The harvesters and the center machine can be produced by different manufacturers and can have different data structures.

Description

A distributed mass data gathering method
Technical field
The present invention relates to data warehouse and data mining technology.
Background art
With the rapid development of informatization, large enterprises have begun to use network services to manage enterprise information, and as business grows, the number of devices those network services require keeps increasing. To keep the network services available, all devices need to be monitored automatically. To ensure the validity of the monitoring data, most large enterprises use several kinds of dedicated monitoring equipment to monitor the whole network comprehensively and in real time.
As monitoring equipment runs over long periods, it accumulates a huge volume of monitoring data, which must be gathered so that it can be inspected efficiently. At the same time, as monitoring devices from multiple vendors come into use, all monitoring data must be analyzed centrally, invalid and duplicate data rejected, and the monitoring data stored in a statistical structure. The conventional method is: 1) set up a data center machine, then manually install and create a database; 2) manually copy the data of each monitoring system to the data center machine, keeping the source data structure and the raw data on the data center machine; 3) manually write database scripts that store data with the same data structure in the same storage device, which requires setting unique identifiers manually or semi-automatically; 4) when comprehensive statistics are needed, use a different query command for each data structure, fetch the required data piece by piece, and then compute the statistics. A method is therefore needed to solve or mitigate these problems.
Summary of the invention
The purpose of the present invention is to address problems of the existing technology, namely that the huge volume of network monitoring data makes retrieval difficult and that monitoring data is hard to integrate when monitoring systems and data structures are diverse. To this end it proposes a distributed mass data gathering method that improves the efficiency of inspecting and classifying monitoring data.
To achieve this purpose, the invention discloses a distributed mass data gathering method comprising the following steps:
A, the user configures the harvester connection information on the center machine;
The connection information in step A comprises: 1) the connection information of the harvester device; 2) the harvester's data acquisition mode and call parameters. There are two acquisition modes: the direct database connection mode and the system interface mode.
The parameters differ according to the selected data acquisition mode. The parameters of the direct database connection mode are: database network address, database port number, database type, database name, database login user name, and database login password. The parameters of the system interface mode are: system service network address, system service port number, system privilege user name, system privilege password, and the framework or system that the interface belongs to.
For the direct database connection mode, the database type is one of: Oracle, MySql, SqlServer, Sybase, DB.
For the system interface mode, the framework or system that the interface belongs to is one of: webservice, corba, socket, snmp, TL1.
To ensure that the configured harvester connection information is correct, the center machine performs a test connection; once the test passes, the center machine stores the connection information of that harvester.
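The connection record described in step A can be sketched as follows. This is a minimal illustration, not the patented implementation: the class names, field names, and the `validate` helper are hypothetical, and the check shown is only a static sanity check that would run before the real test connection, which is not shown here. The allowed database types and interface frameworks are the ones listed above; the comparison is made case-insensitive as an assumption.

```python
from dataclasses import dataclass
from typing import Union

# Allowed values taken from the lists in the description (compared case-insensitively).
DB_TYPES = {"oracle", "mysql", "sqlserver", "sybase", "db"}
INTERFACE_FRAMEWORKS = {"webservice", "corba", "socket", "snmp", "tl1"}


@dataclass(frozen=True)
class DbDirectParams:
    """Parameters of the direct database connection mode (hypothetical field names)."""
    host: str
    port: int
    db_type: str
    db_name: str
    user: str
    password: str


@dataclass(frozen=True)
class InterfaceParams:
    """Parameters of the system interface mode (hypothetical field names)."""
    host: str
    port: int
    user: str
    password: str
    framework: str


ConnectionParams = Union[DbDirectParams, InterfaceParams]


def validate(params: ConnectionParams) -> list:
    """Return a list of problems; an empty list means the record is well-formed."""
    problems = []
    if not params.host:
        problems.append("missing network address")
    if not 0 < params.port < 65536:
        problems.append("port out of range")
    if isinstance(params, DbDirectParams) and params.db_type.lower() not in DB_TYPES:
        problems.append(f"unsupported database type: {params.db_type}")
    if isinstance(params, InterfaceParams) and params.framework.lower() not in INTERFACE_FRAMEWORKS:
        problems.append(f"unsupported framework: {params.framework}")
    return problems
```

A record that passes this check would still need the test connection described above before being stored.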
The center machine in step A is the machine that drives this method; it may be a single center machine or a cluster of center machines. The harvester is the central machine of various network performance collection devices. The center machine and the harvesters may come from different vendors, different operators, or different operation and maintenance providers, and may use different data structures. Harvesters from different vendors, operators, or operation and maintenance providers may likewise use data structures that differ from one another.
B, the device administrator configures multiple groups of gathering rule files on the center machine according to the differences in each harvester's data structure;
A gathering rule file mainly describes how the data is obtained, how the obtained data is parsed, and how it is stored on the center machine.
In step B, rule files fall into two templates according to the data acquisition mode. Template one is the rule file for the direct database connection mode; the file mainly defines the data structure of the data to be obtained, the query command, and the data structure location on the center machine that corresponds to the fetched data. Template two is the rule file for the system interface mode; the file mainly defines the name of the interface method to call, the parameters to pass, the format of the returned data and its parsing template, and the data structure location on the center machine that corresponds to each item in the returned data.
Several rule files can be configured as one group of rule files according to the harvester's internal data structure.
Step B is performed by the center machine's device administrator or by operation and maintenance personnel.
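The two rule file templates and their grouping might be modelled as below. This is a sketch under assumptions: the source does not specify a file format, so the rule files are represented as in-memory objects, and every field name, table name, method name, and target location in the example is hypothetical. Only the group name `version1.1` appears in the source.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class DbRule:
    """Template one: rule for the direct database connection mode."""
    source_structure: str  # data structure to read on the harvester
    query: str             # query command to run against the harvester database
    target: str            # location on the center machine for the fetched data


@dataclass
class InterfaceRule:
    """Template two: rule for the system interface mode."""
    method: str            # name of the interface method to call
    params: dict           # parameters to pass
    response_format: str   # format of the returned data
    field_targets: dict    # returned field -> location on the center machine


@dataclass
class RuleGroup:
    """Several rule files configured as one group."""
    name: str
    rules: list = field(default_factory=list)

    def add(self, rule: Union[DbRule, InterfaceRule]) -> None:
        self.rules.append(rule)


# Hypothetical contents for a group named as in the embodiment.
group = RuleGroup("version1.1")
group.add(DbRule("perf_cpu", "SELECT ts, value FROM perf_cpu", "cpu.hour"))
group.add(InterfaceRule("getTraffic", {"port": "all"}, "xml", {"inOctets": "traffic.in"}))
```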
C, the user selects the corresponding gathering rule files for each harvester on the center machine;
In step C, the user selects, for each configured harvester, the gathering rule files that correspond to it.
A harvester can select several rule files from the same group, and the same group of rule files can be selected by several harvesters that share a data structure. In this way gathering rule files are reused effectively and can be modified in one place, which improves working efficiency.
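The many-to-many relationship between harvesters and rule file groups can be illustrated with a small registry. The class, its methods, and the harvester and rule file names are all hypothetical; the sketch only shows that one group can serve several harvesters and that a harvester may pick a subset of the files in a group.

```python
class RuleRegistry:
    """Hypothetical registry of rule file groups and per-harvester selections."""

    def __init__(self):
        self.groups = {}     # group name -> set of rule file names
        self.selection = {}  # harvester name -> (group name, chosen rule files)

    def add_group(self, group, rule_files):
        self.groups[group] = set(rule_files)

    def select(self, harvester, group, rule_files):
        """Record which rule files of one group a harvester uses."""
        chosen = set(rule_files)
        unknown = chosen - self.groups[group]
        if unknown:
            raise ValueError(f"not in group {group}: {sorted(unknown)}")
        self.selection[harvester] = (group, chosen)

    def harvesters_using(self, group):
        """List the harvesters affected by a change to this group."""
        return sorted(h for h, (g, _) in self.selection.items() if g == group)
```

Because every selection points at a shared group, editing a rule file in that group changes the behaviour of all harvesters returned by `harvesters_using`.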
D, the center machine starts the collection timer and automatically connects to the harvesters according to the connection information;
In step D, the center machine automatically invokes the collection timer to connect to the harvesters. The collection timer runs once a day, and the specific execution time is configured by the device administrator.
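A once-a-day timer with a configurable execution time reduces to two small date calculations: when the next run is due, and which day's data that run should fetch. The sketch below uses only the standard library; the function names are hypothetical, and treating "the previous day" as the calendar day before the run date is an assumption.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, run_at: time) -> datetime:
    """Return the next moment at or after `now` when the daily timer should fire."""
    candidate = datetime.combine(now.date(), run_at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def previous_day_window(run: datetime):
    """Return [start, end) covering the calendar day before the run date."""
    end = datetime.combine(run.date(), time.min)
    return end - timedelta(days=1), end
```

For example, with the timer configured for 02:00, a check made at 03:00 schedules the next run for 02:00 on the following day.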
E, the center machine obtains monitoring data according to the gathering rule files assigned to each harvester;
In step E, the center machine obtains the previous day's data according to the gathering rule files selected for the harvester, either by querying the required data directly or by calling the harvester's system interface and taking the data that the interface returns.
F, the center machine compresses the obtained data into statistics according to the rule files assigned to each harvester and stores the results in the storage modules;
The storage modules in step F: the center machine divides its storage into multiple storage modules according to data source, harvester type, and harvester-side data structure.
In step F, the center machine compresses the obtained data into statistics by time and stores the resulting data in the different storage modules using a unified data structure. During this step, different operations are taken according to the data acquisition mode selected in the connection information:
1, when the data acquisition mode is the direct database connection mode, the data the center machine obtains is already the required data, so it only needs to be compressed into statistics by time in units of one hour, yielding the hourly maximum, mean, minimum, and total values and the time points at which the maximum and minimum occur. The resulting data is stored with a unified data structure in the hour tables of the different storage modules. The center machine also computes one day's statistics from the hour tables and stores the result in the day table of each storage module.
2, when the data acquisition mode is the system interface mode, the center machine first parses and formats the returned data according to the rule file to obtain the required data. After invalid and extraneous data are rejected according to the rule file, the valid data is compressed into statistics by time in units of one hour, yielding the hourly maximum, mean, minimum, and total values and the time points at which the maximum and minimum occur. The resulting data is stored in the different storage modules using a unified data structure.
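The hourly compression described in both cases can be sketched as a single function. It assumes the samples have already been fetched (and, for the interface mode, parsed and cleaned) into `(timestamp, value)` pairs; the function name and the keys of the result are hypothetical. A sample count is stored alongside the statistics as an added assumption, because it lets later levels recompute an exact mean.

```python
from collections import defaultdict
from datetime import datetime


def hourly_stats(samples):
    """Compress (datetime, value) samples into one statistics row per hour."""
    buckets = defaultdict(list)
    for ts, value in samples:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append((ts, value))

    result = {}
    for hour, rows in sorted(buckets.items()):
        max_ts, max_v = max(rows, key=lambda r: r[1])
        min_ts, min_v = min(rows, key=lambda r: r[1])
        total = sum(v for _, v in rows)
        result[hour] = {
            "max": max_v, "max_at": max_ts,   # maximum and when it occurred
            "min": min_v, "min_at": min_ts,   # minimum and when it occurred
            "total": total,
            "mean": total / len(rows),
            "count": len(rows),               # assumption: kept for exact roll-ups
        }
    return result
```

Each returned row corresponds to one record in an hour table; when several samples tie for the maximum or minimum, the earliest one in input order is reported.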
The center machine treats several kinds of monitoring data as the same data type; their data structure on the center machine is then identical, which makes unified storage and unified querying convenient.
Each storage module in the center machine contains multiple hour tables, day tables, week tables, month tables, and year tables. Specifically, based on the preset data types, each storage module creates one set of time tables, namely hour, day, week, month, and year tables, for each data type that belongs to the module.
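One way to picture the set of time tables is as a naming scheme keyed by storage module and data type. The naming pattern below is purely an assumption for illustration; the source does not say how tables are named.

```python
LEVELS = ("hour", "day", "week", "month", "year")


def table_names(module: str, data_type: str) -> dict:
    """Return the hypothetical table name for each time level of one data type."""
    return {level: f"{module}_{data_type}_{level}" for level in LEVELS}
```

With this scheme, a module holding two data types would own ten tables, five per type.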
G, the center machine starts the graded gathering timers and periodically gathers the data in the storage modules level by level. The graded gathering timers automatically gather the data in the storage modules at three levels, week, month, and year, which pre-processes the data for queries. When a user queries statistics through the center machine, the center machine automatically queries the table of the appropriate level according to the queried time range.
In step G, the center machine automatically starts the graded gathering timers, namely the weekly gathering timer, the monthly gathering timer, and the yearly gathering timer. Execution schedule: the weekly gathering timer runs once a week, the monthly gathering timer runs once a month, and the yearly gathering timer runs once a year. The specific execution time is configurable and is set by the device administrator.
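Each graded timer works on the period that has just ended, so it needs the date bounds of the previous week, month, or year. A minimal sketch using the standard library follows; the function names are hypothetical, and treating weeks as running Monday to Sunday is an assumption, since the source does not define the week boundary.

```python
from datetime import date, timedelta


def previous_week(today: date):
    """Monday..Sunday of the week before the one containing `today` (assumed convention)."""
    this_monday = today - timedelta(days=today.weekday())
    return this_monday - timedelta(days=7), this_monday - timedelta(days=1)


def previous_month(today: date):
    """First and last day of the month before the one containing `today`."""
    last_of_previous = today.replace(day=1) - timedelta(days=1)
    return last_of_previous.replace(day=1), last_of_previous


def previous_year(today: date):
    """First and last day of the year before the one containing `today`."""
    year = today.year - 1
    return date(year, 1, 1), date(year, 12, 31)
```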
The weekly gathering timer computes the previous week's statistics from the day tables and stores the result in the week table of each storage module;
The monthly gathering timer computes the previous month's statistics from the day tables and stores the result in the month table of each storage module;
The yearly gathering timer computes the previous year's statistics from the month tables and stores the result in the year table of each storage module.
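Rolling statistics up from a finer table to a coarser one can be sketched as a merge of statistics rows. The row layout (`max`, `max_at`, `min`, `min_at`, `total`, `count`) is hypothetical. Carrying a sample count is an assumption added here so that the merged mean is exact; averaging the finer means directly would be wrong whenever the periods hold different numbers of samples.

```python
def merge_stats(rows):
    """Combine finer-grained statistics rows into one coarser row."""
    rows = list(rows)
    if not rows:
        raise ValueError("nothing to aggregate")
    highest = max(rows, key=lambda r: r["max"])
    lowest = min(rows, key=lambda r: r["min"])
    total = sum(r["total"] for r in rows)
    count = sum(r["count"] for r in rows)
    return {
        "max": highest["max"], "max_at": highest["max_at"],
        "min": lowest["min"], "min_at": lowest["min_at"],
        "total": total,
        "count": count,
        "mean": total / count,  # exact because totals and counts are carried
    }
```

The same function would serve all three levels: day rows merged into a week or month row, and month rows merged into a year row.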
The main advantages of the above distributed mass data gathering method are: 1) the center machine integrates a database automatically, so the customer need not assign personnel to install one separately, which reduces staffing and saves cost; 2) the center machine automatically obtains the data of each harvester according to the rules, with no manual copying, which a) avoids mistakes caused by manual operation, b) improves working efficiency through fully automatic collection, and c) makes the automatic collection time configurable, so that data collection is more timely and accurate; 3) the center machine automatically rejects invalid data and classifies the collected data according to the configuration files, so no tedious manual operation is needed and the risk of errors in hand-written rejection and classification commands is avoided; 4) the center machine automatically stores data in a unified format in different storage modules according to data type, which gives the data a more hierarchical structure; 5) the center machine pre-processes the data through timers, which improves query speed when statistics are queried and makes the resulting statistics more accurate.
Description of drawings
The present invention is illustrated by way of example and with reference to the accompanying drawings, in which:
Fig. 1 is the overall workflow diagram.
Fig. 2 is a schematic diagram of the data extraction and gathering method.
Fig. 3 is a schematic diagram of the rule-based statistics method.
Fig. 4 is a flow chart of the data extraction method.
Embodiment
All features disclosed in this specification, and all steps of any method or process disclosed, may be combined in any way, except for mutually exclusive features and/or steps.
Any feature disclosed in this specification (including any accompanying claims, abstract, and drawings) may, unless specifically stated otherwise, be replaced by an alternative feature that is equivalent or serves a similar purpose. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.
The present invention is further described below with reference to the accompanying drawings.
As shown in Fig. 1, the basic flow of the method is as follows: the user configures, on the gathering center machine, the harvester's connection information, data acquisition mode, and call parameters, and selects the gathering rule files configured by the device administrator; the center machine verifies the configuration automatically and, once verification passes, stores the configured information; the center machine automatically invokes the collection timer to collect data to the center machine on schedule; inside the center machine, the gathering timers are invoked automatically to gather the data by time level: week, month, and year.
In the present invention, the source of the center machine's data is shown in Fig. 2. The data comes from the harvesters; the center machine itself does not collect data from specific devices. The center machine first obtains monitoring data according to the configured harvester connection information and gathering rule files; it then checks the validity of the data, rejects invalid data, and compresses the data into statistics according to the gathering rule files; finally it stores the resulting data in the corresponding storage module according to data source, harvester type, and harvester-side data structure.
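The validity check that precedes compression might look like the sketch below. In the method, what counts as invalid is defined by the gathering rule file; the criteria used here (non-numeric or non-finite values, and repeated timestamps, with the first occurrence kept) are assumptions chosen only to make the step concrete, and the function name is hypothetical.

```python
import math


def reject_invalid(samples):
    """Keep (timestamp, value) pairs with a finite numeric value; drop repeated timestamps."""
    seen = set()
    kept = []
    for ts, value in samples:
        # Assumed rule: only real numbers count as valid readings.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        if math.isnan(value) or math.isinf(value):
            continue
        # Assumed rule: a timestamp that was already accepted is a duplicate.
        if ts in seen:
            continue
        seen.add(ts)
        kept.append((ts, value))
    return kept
```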
The specific steps of data compression statistics and storage are as follows:
1) the data is aggregated in units of one hour, and the result is stored in the corresponding hour table of the storage module;
2) based on the data aggregated in 1), the data is aggregated again in units of one day, and the result is stored in the corresponding day table of the storage module.
The specific steps by which the center machine obtains the collected data are shown in Fig. 4 and are as follows:
A, the user configures the harvester connection information on the center machine: data acquisition mode: direct database connection; database network address: 172.16.104.2; database type: mysql; database service port number: 3066; database service name: oral; database user name: root; database login password: root; selected gathering rule files: the version1.1 file group.
B, the center machine starts the collection timer, which actively connects to the harvester according to the connection information configured by the user.
C, after the center machine connects to the harvester successfully, it queries and fetches the data to be collected according to the query information configured in the version1.1 file group, and the fetched data is sent back to the center machine.
D, the center machine computes statistics on the fetched data by time. It first aggregates by hour, computing the hourly maximum, minimum, mean, and total values and the specific time points of the maximum and minimum; it then aggregates by day, computing the daily maximum, minimum, mean, and total values and the specific time points of the maximum and minimum.
E, the resulting data is stored in partitions according to data source, system, and indicator type, so that the data is layered, partitioned, and graded for subsequent querying and statistics.
F, the center machine automatically invokes the gathering timers to process the stored data further, as shown in Fig. 3. The weekly gathering timer starts once a week and summarizes the previous week's data from the center machine's day tables, storing the previous week's statistics in the corresponding week tables; the monthly gathering timer starts once a month and summarizes the previous month's data from the day tables, storing the previous month's statistics in the corresponding month tables; the yearly gathering timer starts once a year and summarizes the previous year's data from all month tables, storing the previous year's statistics in the corresponding year tables. When a user queries statistics through the center machine, the center machine automatically queries the table of the appropriate level according to the queried time range.
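Choosing a table level from the queried time range can be sketched as a simple threshold function. The source states only that the level is chosen automatically from the range; the thresholds below and the function name are assumptions for illustration.

```python
from datetime import datetime, timedelta


def pick_level(start: datetime, end: datetime) -> str:
    """Pick the table level to query for the range [start, end) (assumed thresholds)."""
    if end <= start:
        raise ValueError("end must be after start")
    span = end - start
    if span <= timedelta(days=1):
        return "hour"
    if span <= timedelta(days=7):
        return "day"
    if span <= timedelta(days=31):
        return "week"
    if span <= timedelta(days=366):
        return "month"
    return "year"
```

Under these assumed thresholds, a query over a single day reads hour tables, while a query spanning several years reads only the pre-computed year tables, which is what makes the pre-processing pay off at query time.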
The present invention is not limited to the foregoing embodiments. It extends to any new feature or any new combination of features disclosed in this specification, and to any new method or process step, or any new combination of steps, disclosed.

Claims (9)

1. A distributed mass data gathering method, comprising the following steps:
A, the user configures the harvester connection information on the center machine;
B, the device administrator configures multiple groups of gathering rule files on the center machine according to the differences in each harvester's data structure;
C, the user selects the corresponding gathering rule files for each harvester on the center machine;
D, the center machine starts the collection timer and automatically connects to the harvesters according to the connection information;
E, the center machine obtains monitoring data according to the gathering rule files assigned to each harvester;
F, the center machine compresses the obtained data into statistics according to the gathering rule files assigned to each harvester and stores the results in the storage modules;
G, the center machine starts the graded gathering timers and periodically gathers the data in the storage modules level by level.
2. The distributed mass data gathering method according to claim 1, characterized in that: the center machine is the machine that drives the method and may be a single center machine or a cluster of center machines; the harvester is the central machine of various network performance collection devices; the center machine and the harvesters may come from different vendors, different operators, or different operation and maintenance providers and may use different data structures; harvesters from different vendors, operators, or operation and maintenance providers may likewise use data structures that differ from one another.
3. The method according to claim 1 or 2, characterized in that: in step A, the user configures the harvester connection information on the center machine, and the connection information comprises: 1) the connection information of the harvester device; 2) the harvester's data acquisition mode and call parameters; there are two acquisition modes: the direct database connection mode and the system interface mode.
4. The distributed mass data gathering method according to claim 3, characterized in that: in step B, the device administrator configures multiple groups of gathering rule files on the center machine according to the differences in each harvester's data structure; a gathering rule file mainly describes how the data is obtained, how the obtained data is parsed, and how it is stored on the center machine; several rule files can be configured as one group of rule files according to the harvester's internal data structure.
5. The distributed mass data gathering method according to claim 4, characterized in that: in step C, the user selects the corresponding gathering rule files for each harvester on the center machine; a harvester can select several rule files from the same group, and the same group of rule files can be selected by several harvesters that share a data structure, so that gathering rule files are reused effectively and can be modified in one place.
6. The distributed mass data gathering method according to claim 5, characterized in that: in step D, the center machine starts the collection timer and automatically connects to the harvesters according to the connection information; the center machine automatically invokes the collection timer to connect to the harvesters, the collection timer runs once a day, and the specific execution time is configured by the device administrator.
7. The distributed mass data gathering method according to claim 6, characterized in that: in step E, the center machine obtains monitoring data according to the gathering rule files assigned to each harvester, and in step F, the center machine compresses the obtained data into statistics according to the gathering rule files assigned to each harvester and stores the results in the storage modules; the center machine obtains the previous day's data according to the gathering rule files selected for the harvester, either by querying the required data directly or by calling the harvester's system interface and taking the data that the interface returns; the center machine compresses the obtained data into statistics by time and stores the resulting data in the different storage modules using a unified data structure.
8. The distributed mass data gathering method according to claim 7, characterized in that: compressing into statistics by time means compressing the valid data by time in units of one hour, yielding the hourly maximum, mean, minimum, and total values and the time points at which the maximum and minimum occur.
9. The distributed mass data gathering method according to claim 7, characterized in that: in step G, the center machine starts the graded gathering timers and periodically gathers the data in the storage modules level by level; the graded gathering timers automatically gather the data in the storage modules at three levels, week, month, and year, which pre-processes the data for queries; when a user queries statistics through the center machine, the center machine automatically queries the table of the appropriate level according to the queried time range.
CN2011101541171A 2011-06-10 2011-06-10 Distributed mass data gathering method Pending CN102209118A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011101541171A CN102209118A (en) 2011-06-10 2011-06-10 Distributed mass data gathering method

Publications (1)

Publication Number Publication Date
CN102209118A true CN102209118A (en) 2011-10-05

Family

ID=44697777

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011101541171A Pending CN102209118A (en) 2011-06-10 2011-06-10 Distributed mass data gathering method

Country Status (1)

Country Link
CN (1) CN102209118A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750182A (en) * 2012-06-12 2012-10-24 苏州微逸浪科技有限公司 Processing method of active acquisition based on custom task scheduling
CN103853713A (en) * 2012-11-28 2014-06-11 成都勤智数码科技股份有限公司 Efficient storage method of mass data
CN103853719A (en) * 2012-11-28 2014-06-11 成都勤智数码科技股份有限公司 Extensible mass data collection system
CN106254172A (en) * 2016-07-14 2016-12-21 东软集团股份有限公司 Heterogeneous applications collecting method and device
CN106407205A (en) * 2015-07-29 2017-02-15 腾讯科技(深圳)有限公司 Data aggregation method and apparatus
CN106484857A (en) * 2016-10-09 2017-03-08 珠海经济特区远宏科技有限公司大连分公司 Data collecting system and its method
CN107229628A (en) * 2016-03-23 2017-10-03 中兴通讯股份有限公司 The method and device of distributed data base pretreatment
CN108052551A (en) * 2017-11-28 2018-05-18 北京航天云路有限公司 A kind of method for a large amount of time series datas of storage realized on REDIS
CN110262407A (en) * 2019-04-27 2019-09-20 南京联澳科技有限公司 A method of realizing that electroplating machine data exchange immediately
CN113791955A (en) * 2021-09-17 2021-12-14 济南浪潮数据技术有限公司 Data aggregation device and method for monitoring system and server

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101188569A (en) * 2006-11-16 2008-05-28 饶大平 Method for constructing data quanta space in network and distributed file storage system
CN101488026A (en) * 2009-02-26 2009-07-22 福州欣创摩尔电子科技有限公司 Distributed data acquisition control platform system
CN101867613A (en) * 2010-06-08 2010-10-20 中兴通讯股份有限公司 Content delivery CDN sub system and data synchronization method

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750182A (en) * 2012-06-12 2012-10-24 苏州微逸浪科技有限公司 Processing method of active acquisition based on custom task scheduling
CN103853713B (en) * 2012-11-28 2018-04-24 勤智数码科技股份有限公司 The efficient storage method of mass data
CN103853713A (en) * 2012-11-28 2014-06-11 成都勤智数码科技股份有限公司 Efficient storage method of mass data
CN103853719A (en) * 2012-11-28 2014-06-11 成都勤智数码科技股份有限公司 Extensible mass data collection system
CN103853719B (en) * 2012-11-28 2018-05-22 勤智数码科技股份有限公司 Easily extension mass data collection system
CN106407205B (en) * 2015-07-29 2019-12-20 腾讯科技(深圳)有限公司 Data aggregation method and device
CN106407205A (en) * 2015-07-29 2017-02-15 腾讯科技(深圳)有限公司 Data aggregation method and apparatus
CN107229628A (en) * 2016-03-23 2017-10-03 中兴通讯股份有限公司 The method and device of distributed data base pretreatment
CN106254172A (en) * 2016-07-14 2016-12-21 东软集团股份有限公司 Heterogeneous applications collecting method and device
CN106484857A (en) * 2016-10-09 2017-03-08 珠海经济特区远宏科技有限公司大连分公司 Data collecting system and its method
CN108052551A (en) * 2017-11-28 2018-05-18 北京航天云路有限公司 A kind of method for a large amount of time series datas of storage realized on REDIS
CN110262407A (en) * 2019-04-27 2019-09-20 南京联澳科技有限公司 A method of realizing that electroplating machine data exchange immediately
CN113791955A (en) * 2021-09-17 2021-12-14 济南浪潮数据技术有限公司 Data aggregation device and method for monitoring system and server

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20111005