CN104714875A - Distributed automatic collecting method - Google Patents
- Publication number
- CN104714875A (application CN201510106013.1)
- Authority
- CN
- China
- Prior art keywords
- server
- management server
- collecting
- collection
- servers
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a distributed automatic collecting method. The method comprises the following steps: a single-machine collecting program is deployed on every server; one server serves as a management server controlling the other servers, and the addresses of the other servers are configured on it; the addresses to be collected are placed in the management server's database, and tasks are allocated; the management server runs the collecting program of each collecting server through crawler technology, and the type of data to be collected is controlled by the management server; collecting work carried out at a fixed time each month is configured in the management server's database, and timed tasks are set; a server monitoring system judges the collecting status of the servers, information on abnormally collecting servers is sent to an administrator, and collecting tasks are evenly allocated to other idle servers. The method avoids the risk of the whole system collapsing, reduces manual maintenance, and uses the server monitoring system to judge the collecting status of child nodes.
Description
Technical field
The present invention relates to the field of computer data processing, and specifically to a distributed automatic collection method.
Background art
In practice, large amounts of data are produced all the time. Some of it needs to be analyzed and some needs to be stored. The volume of data to be processed is huge and the data share similarities, so this big data must be analyzed and processed to extract the data that is needed.
Data collection requires that the data share similarities, so that the corresponding data can be extracted according to their patterns; the collected data must also have value or a use that justifies collecting and extracting it. Data collection also requires a collection method or scheme, so that it can proceed in planned steps, and the conditions for collection, such as equipment and technology.
With tens or hundreds of servers, if an operator has to log in to each server every time to configure collection tasks and run the collection program, many problems arise:
1) The maintenance workload is large: remote connection, repetitive configuration, and collection.
2) Server resources are wasted. The resources of each server cannot be fully used: because an operator cannot detect promptly whether a server has finished collecting, the next program cannot be started in time.
3) The error rate increases. Because a large number of manual operations are required, the error rate of the configuration information also increases.
4) The data volume is large, and crawl types and collection configurations differ. Different websites present their data in different ways and need different configuration methods.
Summary of the invention
The technical task of the present invention is to provide a distributed automatic collection method.
The technical task of the present invention is achieved in the following manner. The steps of the method are as follows:
Step 1: deploy the single-machine collection program on each server;
Step 2: use one of the servers as the management server that controls the other servers, and configure the addresses of the other servers on it;
Step 3: put the addresses to be collected into the management server's database, and allocate the tasks;
Step 4: the management server runs the collection program on the collection servers by means of crawler technology; what type of data is collected, when collection starts, and when it ends are all controlled by the management server;
Step 5: configure the collection work that takes place at a fixed time each month in the management server's database, and set up timed tasks;
Step 6: use the server monitoring system to judge the collection status of the servers, send the information of servers that are collecting abnormally to the administrator, and distribute the collection tasks evenly among the other idle servers.
In step 3, the task allocation is determined by the management server according to the collection speed of each server.
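The speed-based allocation of step 3 can be illustrated with a minimal sketch. The text does not say how collection speed is measured or how the split is computed, so the proportional split, the function name `allocate_by_speed`, and the speed unit below are all assumptions made for illustration.

```python
from typing import Dict, List


def allocate_by_speed(addresses: List[str], speeds: Dict[str, float]) -> Dict[str, List[str]]:
    """Split addresses among servers in proportion to each server's collection speed.

    `speeds` maps a server address to a measured speed (for example pages per
    minute); the unit and the way it is measured are assumptions of this sketch.
    """
    if not speeds or any(s <= 0 for s in speeds.values()):
        raise ValueError("every server needs a positive collection speed")

    total_speed = sum(speeds.values())
    plan: Dict[str, List[str]] = {}
    start = 0
    cumulative = 0.0
    for server, speed in speeds.items():
        cumulative += speed
        # Cut the list at this server's cumulative share of the total speed.
        end = round(len(addresses) * cumulative / total_speed)
        plan[server] = addresses[start:end]
        start = end
    return plan
```

Cutting at cumulative shares, rather than rounding each share separately, ensures that every address is assigned to exactly one server.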
In step 6, the information of servers that are collecting abnormally is sent to the administrator by email.
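A hedged sketch of the two actions in step 6 follows: composing the notification for the administrator and dealing tasks out evenly to idle servers. The helper names `build_alert` and `redistribute`, the message wording, and the round-robin strategy are assumptions; the text only states that abnormal server information is sent by email and that tasks are distributed evenly. The sketch builds the message without sending it, since the mail transport is not described.

```python
from email.message import EmailMessage
from typing import Dict, List


def build_alert(server: str, reason: str, admin: str) -> EmailMessage:
    """Build (but do not send) the notification email for one abnormal server."""
    msg = EmailMessage()
    msg["To"] = admin
    msg["Subject"] = f"Abnormal collection on {server}"
    msg.set_content(f"Server {server} reported: {reason}")
    return msg


def redistribute(pending: List[str], idle_servers: List[str]) -> Dict[str, List[str]]:
    """Deal the pending tasks round-robin so idle servers get an even share."""
    if not idle_servers:
        raise ValueError("no idle server available")
    plan: Dict[str, List[str]] = {s: [] for s in idle_servers}
    for i, task in enumerate(pending):
        plan[idle_servers[i % len(idle_servers)]].append(task)
    return plan
```

With round-robin dealing, the task counts of any two idle servers differ by at most one.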
Compared with the prior art, the distributed automatic collection method of the present invention balances the collection load on each server and improves efficiency. It avoids the risk of the whole system collapsing because a single node fails. It reduces manual maintenance: the management server allocates tasks according to the configured collection content, and the server monitoring system judges the collection status of the child nodes.
Brief description of the drawing
Figure 1 is a flow block diagram of the distributed automatic collection method.
Detailed description of the embodiments
Embodiment 1:
The steps of the method are as follows:
Step 1: deploy the single-machine collection program on each server;
Step 2: use one of the servers as the management server that controls the other servers, and configure the addresses of the other servers on it;
Step 3: put the addresses to be collected into the management server's database; the management server allocates the tasks according to the collection speed of each server;
Step 4: the management server runs the collection program on the collection servers by means of crawler technology; what type of data is collected, when collection starts, and when it ends are all controlled by the management server;
Step 5: configure the collection work that takes place at a fixed time each month in the management server's database, and set up timed tasks;
Step 6: use the server monitoring system to judge the collection status of the servers, send the information of servers that are collecting abnormally to the administrator by email, and distribute the collection tasks evenly among the other idle servers.
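The monthly timed task of step 5 can be sketched as a function that computes the next run time of a job that fires on a fixed day of each month. The function name `next_monthly_run`, the hour parameter, and the clamping of out-of-range days to the last day of the month are assumptions of this sketch; the text does not describe the scheduling mechanism.

```python
import calendar
from datetime import datetime


def next_monthly_run(now: datetime, day: int, hour: int = 0) -> datetime:
    """Return the first run time strictly after `now` for a job that fires
    on `day` of every month at `hour`:00. Days past the end of a month are
    clamped to that month's last day (an assumption of this sketch)."""
    year, month = now.year, now.month
    while True:
        last_day = calendar.monthrange(year, month)[1]
        candidate = datetime(year, month, min(day, last_day), hour)
        if candidate > now:
            return candidate
        # Roll over to the next month, wrapping December into January.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
```

A management server could store `day` and `hour` with each task in its database and call such a function after every run to set the next trigger.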
Embodiment 2:
Each collection task is different, the collection times differ, and the child node addresses change. The platform is set up for this through the following steps:
1) Deploy the single-machine collection program on the child nodes and start the collection program.
2) Configure the addresses of the collection nodes on the management server and confirm that the test passes.
3) Call the interface of the server monitoring system.
4) Deploy the data nodes.
5) Configure the collection tasks and confirm that the test passes.
The single-machine collection programs deployed on multiple servers are managed in a unified way, and collection tasks are distributed and received by sending and receiving messages. This balances the collection load on each server by shifting the load from a single node to multiple nodes, which improves efficiency. It avoids the risk of the whole system collapsing because a single node fails. It reduces manual maintenance: the management server allocates tasks according to the configured collection content, and the server monitoring system judges the collection status of the child nodes. The observer pattern is adopted: through back-end management, the servers carry out data collection work according to the scheme set in the back end and report their status to the back-end server, which realizes the management of the servers.
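The message-based task distribution and the observer pattern described above can be sketched in a single process. The class `ManagementServer`, the function `run_node`, the in-memory queues standing in for network messages, and the status strings are all assumptions; the text does not specify the message transport or the feedback format.

```python
import queue
from typing import Callable, Dict, List

StatusObserver = Callable[[str, str, str], None]  # (server, task, status)


class ManagementServer:
    """Hands out tasks as messages and lets observers watch node feedback."""

    def __init__(self) -> None:
        self.inboxes: Dict[str, "queue.Queue[str]"] = {}
        self.observers: List[StatusObserver] = []

    def register_node(self, server: str) -> None:
        self.inboxes[server] = queue.Queue()

    def subscribe(self, observer: StatusObserver) -> None:
        self.observers.append(observer)

    def send_task(self, server: str, task: str) -> None:
        self.inboxes[server].put(task)

    def report(self, server: str, task: str, status: str) -> None:
        # Notify every observer of the node's feedback.
        for observer in self.observers:
            observer(server, task, status)


def run_node(manager: ManagementServer, server: str, collect: Callable[[str], bool]) -> None:
    """Drain one node's inbox, run the collection callable, and report back."""
    inbox = manager.inboxes[server]
    while not inbox.empty():
        task = inbox.get()
        status = "done" if collect(task) else "abnormal"
        manager.report(server, task, status)
```

In this sketch a monitoring component would subscribe as an observer and react to `"abnormal"` reports, for example by notifying the administrator.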
From the embodiments above, those skilled in the art can readily implement the present invention. It should be understood, however, that the present invention is not limited to the embodiments described above. On the basis of the disclosed embodiments, those skilled in the art may combine different technical features in any way to obtain different technical solutions.
Claims (3)
1. A distributed automatic collection method, characterized in that the steps of the method are as follows:
Step 1: deploy the single-machine collection program on each server;
Step 2: use one of the servers as the management server that controls the other servers, and configure the addresses of the other servers on it;
Step 3: put the addresses to be collected into the management server's database, and allocate the tasks;
Step 4: the management server runs the collection program on the collection servers by means of crawler technology; what type of data is collected, when collection starts, and when it ends are all controlled by the management server;
Step 5: configure the collection work that takes place at a fixed time each month in the management server's database, and set up timed tasks;
Step 6: use the server monitoring system to judge the collection status of the servers, send the information of servers that are collecting abnormally to the administrator, and distribute the collection tasks evenly among the other idle servers.
2. The distributed automatic collection method according to claim 1, characterized in that, in said step 3, the task allocation is determined by the management server according to the collection speed of each server.
3. the method for a kind of distributed automation collection according to claim 1, is characterized in that, in described step 6, the server info of improper collection is sent to keeper by lettergram mode.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510106013.1A CN104714875A (en) | 2015-03-11 | 2015-03-11 | Distributed automatic collecting method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104714875A true CN104714875A (en) | 2015-06-17 |
Family
ID=53414234
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510106013.1A Pending CN104714875A (en) | 2015-03-11 | 2015-03-11 | Distributed automatic collecting method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104714875A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007156946A (en) * | 2005-12-07 | 2007-06-21 | Nec Engineering Ltd | Trace device of distributed program |
CN101370024A (en) * | 2007-08-15 | 2009-02-18 | 北京灵图软件技术有限公司 | Distributed information collection method and system |
CN101867226A (en) * | 2010-06-07 | 2010-10-20 | 国电南瑞科技股份有限公司 | Wide-area distribution type data collection method for dispatching automation system |
CN102508709A (en) * | 2011-11-30 | 2012-06-20 | 国电南瑞科技股份有限公司 | Distributed-cache-based acquisition task scheduling method in purchase, supply and selling integrated electric energy acquiring and monitoring system |
CN103246592A (en) * | 2013-05-13 | 2013-08-14 | 北京搜狐新媒体信息技术有限公司 | Monitoring acquisition system and method |
CN103856565A (en) * | 2014-03-18 | 2014-06-11 | 浪潮集团有限公司 | E-commerce tax source management cloud collection monitoring method |
CN104036025A (en) * | 2014-06-27 | 2014-09-10 | 蓝盾信息安全技术有限公司 | Distribution-base mass log collection system |
CN104158878A (en) * | 2014-08-18 | 2014-11-19 | 浪潮(北京)电子信息产业有限公司 | Adaptive scheduling distributive monitoring data acquisition method and system |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107222564A (en) * | 2017-07-04 | 2017-09-29 | 贵州数据宝网络科技有限公司 | Collecting method and device |
CN110968755A (en) * | 2018-09-29 | 2020-04-07 | 北京国双科技有限公司 | Method and device for crawling data |
CN109522183A (en) * | 2018-10-23 | 2019-03-26 | 东软集团股份有限公司 | Instrument operating condition monitoring method and system, collector, server and storage medium |
CN109522183B (en) * | 2018-10-23 | 2022-04-12 | 东软集团股份有限公司 | Working state monitoring method and system, collector, server and storage medium |
CN111130900A (en) * | 2019-12-30 | 2020-05-08 | 智慧神州(北京)科技有限公司 | Data acquisition method and device based on distributed interconnection of coordination services |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104915259A (en) | Task scheduling method applied to distributed acquisition system | |
CN108769121A (en) | Intelligent industrial equips the method for uploading of internet of things data acquisition system and gathered data | |
CN110311990B (en) | Configurable Internet of things data acquisition system and configuration method | |
CN103699063B (en) | The harvester of off-line data and method in a kind of Manufacturing Executive System MES | |
CN104714875A (en) | Distributed automatic collecting method | |
CN106302017B (en) | The small capaciated flow network velocity-measuring system of high concurrent and method | |
CN104298194B (en) | The data volume compression method of data is gathered and transmitted in elevator remote monitoring system | |
CN107959620B (en) | Fully mechanized mining equipment identification method, device, system, gateway and storage medium | |
CN107992392A (en) | A kind of automatic monitoring repair system and method for cloud rendering system | |
CN105376101A (en) | Method and system for enabling physical device to be connected into virtual network | |
CN104301244A (en) | Cluster communication system and method of large-scale power distribution network system | |
CN102222112A (en) | Resource management device and resource management method | |
CN105790978A (en) | Network manager communication message processing method and device, server and main control board | |
CN104283958B (en) | A kind of system task dispatching method | |
CN105187490B (en) | A kind of transfer processing method of internet of things data | |
CN108093075A (en) | A kind of implementation method of application system gray scale issue | |
CN102480369A (en) | Network management system and method for collecting performance | |
CN105743676B (en) | A kind of multi-data source synthetical collection device and method | |
CN105490879A (en) | Automatic distributed performance test system of large-scale integrated network | |
CN110837242A (en) | Hot water supply equipment running state monitoring system based on Internet of things | |
CN202385116U (en) | Distributed operation and maintenance data acquisition device | |
CN106161339A (en) | Obtain the method and device of IP access relation | |
CN106707859A (en) | Grouting site information processing system and grouting site information processing method based on raspberry Pi | |
CN103618641A (en) | Data packet detecting and monitoring system based on multiple-core network processor and capable of being deployed fast | |
CN107942888B (en) | Work control method after fault recovery of data acquisition device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20150617 |