CN104714875A - Distributed automatic collecting method - Google Patents

Info

Publication number
CN104714875A
Authority
CN
China
Prior art keywords
server
management server
collecting
collection
servers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510106013.1A
Other languages
Chinese (zh)
Inventor
孙海峰
王传超
徐宏伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Group Co Ltd
Original Assignee
Inspur Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Group Co Ltd
Priority to CN201510106013.1A
Publication of CN104714875A

Abstract

The invention discloses a distributed automatic collecting method. The method comprises the following steps: a single-machine collection program is deployed on every server; one server acts as the management server that controls the other servers, and the addresses of the other servers are configured on it; the addresses to be collected are placed in the management server database, and tasks are allocated; the management server runs the collection programs of the collection servers by means of crawler technology, and the type of data to be collected is controlled by the management server; collection work that takes place at a fixed time each month is configured in the management server database, and timed tasks are set; a server monitoring system judges the collection status of the servers, information about servers that are collecting abnormally is sent to an administrator, and their collection tasks are distributed evenly to other idle servers. The method avoids the danger of the whole system collapsing, reduces manual maintenance, and uses the server monitoring system to judge the collection status of the child nodes.

Description

A method of distributed automated collection
Technical field
The present invention relates to the field of computer data processing, and specifically to a method of distributed automated collection.
Background
In practice, large amounts of data are produced all the time. Some of it needs to be analyzed and some needs to be stored. The quantity of data to be processed is huge and the data share similarities, so this large volume of data must be analyzed and processed to extract the data that is needed.
Data collection requires the data to have similarity, so that the corresponding data can be extracted according to its regularities; the collected data must also have a value or use that justifies collecting and extracting it. Data collection also requires a collection method or scheme, so that it can proceed in a planned, step-by-step manner, and it requires the conditions for collection, such as equipment and technology.
When faced with tens or even hundreds of servers, if the operator has to log in to each server every time to configure collection tasks and run the collection program, many problems arise:
1) The maintenance workload is large: remote connection, repetitive configuration, and collection.
2) Server resources are wasted: the resources of each server cannot be fully used. Because a person cannot discover promptly whether a server has finished collecting, the next program cannot be started in time.
3) The error rate increases: because a large number of manual operations are required, the error rate of the configuration information also increases.
4) The data volume is large, and the types to be crawled and the collection configurations differ. Different websites present their data in different ways and need different configuration methods.
Summary of the invention
The technical task of the present invention is to provide a method of distributed automated collection.
The technical task of the present invention is achieved in the following manner. The steps of the method are as follows:
Step 1: deploy the single-machine collection program on each server;
Step 2: use one of the servers as the management server that controls the other servers, and configure the addresses of the other servers;
Step 3: put the addresses to be collected into the management server database and allocate tasks;
Step 4: the management server runs the collection programs of the collection servers by means of crawler technology; what type of data is collected, when collection starts, and when it ends are all controlled by the management server;
Step 5: configure the collection work that takes place at a fixed time each month in the management server database, and set timed tasks;
Step 6: use the server monitoring system to judge the collection status of the servers, send information about servers that are collecting abnormally to the administrator, and distribute their collection tasks evenly to other idle servers.
In step 3, the task allocation is determined by the management server according to the collection speed of each server.
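The speed-based allocation in step 3 could look roughly like the following Python sketch. The patent does not specify how collection speed is measured or how shares are computed; the function name `allocate_by_speed`, the speed values, and the proportional split are all assumptions made for illustration.

```python
from typing import Dict, List


def allocate_by_speed(addresses: List[str],
                      speeds: Dict[str, float]) -> Dict[str, List[str]]:
    """Split addresses across servers in proportion to collection speed.

    `speeds` maps a server name to a measured speed (hypothetical unit:
    pages per minute). Proportional splitting is an assumed policy.
    """
    if not speeds or any(s <= 0 for s in speeds.values()):
        raise ValueError("every server needs a positive speed")
    total = sum(speeds.values())
    plan: Dict[str, List[str]] = {}
    start, cumulative = 0, 0.0
    last = len(speeds) - 1
    for i, (server, speed) in enumerate(speeds.items()):
        cumulative += speed
        # Cut the list at the cumulative share; the last server takes the rest
        # so that every address is assigned exactly once.
        if i == last:
            end = len(addresses)
        else:
            end = min(len(addresses), round(len(addresses) * cumulative / total))
        plan[server] = addresses[start:end]
        start = end
    return plan


if __name__ == "__main__":
    urls = [f"http://example.com/page/{n}" for n in range(6)]
    print(allocate_by_speed(urls, {"node-a": 2.0, "node-b": 1.0}))
```

In this sketch a server that collects twice as fast receives about twice as many addresses, which is one plausible reading of "according to the collection speed of each server".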
In step 6, the information about servers that are collecting abnormally is sent to the administrator by email.
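As an illustration of the email notification, the sketch below builds an alert message with Python's standard `email.message.EmailMessage`. The addresses, the subject wording, and the `build_alert` helper are hypothetical; the patent only states that the information is sent by email.

```python
from email.message import EmailMessage


def build_alert(admin: str, sender: str, server: str, reason: str) -> EmailMessage:
    """Build an alert email about one abnormally collecting server.

    All field contents here are assumptions; the patent does not define
    the message format.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = admin
    msg["Subject"] = f"Abnormal collection on {server}"
    msg.set_content(f"Server: {server}\nReason: {reason}\n")
    return msg


if __name__ == "__main__":
    alert = build_alert("admin@example.com", "monitor@example.com",
                        "10.0.0.5", "no status report received")
    print(alert)
```

The message could then be handed to `smtplib.SMTP.send_message` once a mail server is available; that part is left out because it depends on the deployment.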
Compared with the prior art, the method of distributed automated collection of the present invention balances the collection load on each server and improves efficiency. It avoids the danger of the whole system collapsing because a single node fails. It reduces manual maintenance: the management server allocates tasks according to the configured collection content, and the server monitoring system judges the collection status of the child nodes.
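The failure handling of step 6 can be sketched in Python as follows. The patent does not say how abnormal collection is detected, so the report-timeout check in `find_abnormal` is an assumption, as are the function names and the round-robin way of spreading tasks evenly over idle servers.

```python
from typing import Dict, List


def find_abnormal(last_report: Dict[str, float], now: float,
                  timeout: float) -> List[str]:
    """Return servers whose last status report is older than `timeout` seconds.

    Timeout-based detection is an assumed criterion, not taken from the patent.
    """
    return sorted(s for s, t in last_report.items() if now - t > timeout)


def redistribute(assignments: Dict[str, List[str]], failed: str,
                 idle: List[str]) -> Dict[str, List[str]]:
    """Move the failed server's tasks to idle servers, one at a time in turn."""
    if not idle:
        raise ValueError("no idle server available")
    # Copy every healthy server's task list; the failed server is dropped.
    result = {s: list(tasks) for s, tasks in assignments.items() if s != failed}
    for i, task in enumerate(assignments.get(failed, [])):
        target = idle[i % len(idle)]  # round-robin keeps the split even
        result.setdefault(target, []).append(task)
    return result


if __name__ == "__main__":
    reports = {"s1": 100.0, "s2": 190.0}
    print(find_abnormal(reports, now=200.0, timeout=60.0))
    print(redistribute({"s1": ["a", "b", "c"], "s2": ["d"]}, "s1", ["s3", "s4"]))
```

Because the tasks of a failed node are handed to other nodes, the loss of one collection server does not stop the overall collection run, which matches the advantage claimed above.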
Description of the drawings
Figure 1 is a flow block diagram of a method of distributed automated collection.
Embodiments
Embodiment 1:
The steps of the method are as follows:
Step 1: deploy the single-machine collection program on each server;
Step 2: use one of the servers as the management server that controls the other servers, and configure the addresses of the other servers;
Step 3: put the addresses to be collected into the management server database; the management server allocates tasks according to the collection speed of each server;
Step 4: the management server runs the collection programs of the collection servers by means of crawler technology; what type of data is collected, when collection starts, and when it ends are all controlled by the management server;
Step 5: configure the collection work that takes place at a fixed time each month in the management server database, and set timed tasks;
Step 6: use the server monitoring system to judge the collection status of the servers, send information about servers that are collecting abnormally to the administrator by email, and distribute their collection tasks evenly to other idle servers.
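The monthly timed task of step 5 needs some way to decide when a configured job should run next. A minimal Python sketch is given below; the `next_monthly_run` helper and the restriction to days 1 through 28 are assumptions, since the patent does not describe the scheduling mechanism.

```python
from datetime import datetime


def next_monthly_run(now: datetime, day: int, hour: int = 0,
                     minute: int = 0) -> datetime:
    """Return the first run time strictly after `now` for a fixed monthly slot.

    `day` is limited to 1..28 (an assumed simplification) so that the slot
    exists in every month.
    """
    if not 1 <= day <= 28:
        raise ValueError("day must be in 1..28")
    candidate = now.replace(day=day, hour=hour, minute=minute,
                            second=0, microsecond=0)
    if candidate > now:
        return candidate
    # This month's slot has passed: roll over to the next month.
    if now.month == 12:
        return candidate.replace(year=now.year + 1, month=1)
    return candidate.replace(month=now.month + 1)


if __name__ == "__main__":
    print(next_monthly_run(datetime(2015, 3, 11, 9, 0), day=1, hour=2))
```

A management server could store `day`, `hour`, and `minute` per task in its database and call such a function after each run to set the next trigger time.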
Embodiment 2:
Each collection task is different, the collection times differ, and the child node addresses change. To handle this, the platform is set up by the following steps:
1) Deploy the single-machine collection program on the child node and start the collection program.
2) Configure the address of the collection node on the management server and confirm that the test passes.
3) Call the interface of the server monitoring system.
4) Deploy the back end.
5) Configure the collection tasks and confirm that the test passes.
The single-machine collection programs deployed on multiple servers are managed in a unified way, and collection tasks are distributed and received by sending and receiving messages. This balances the collection load on each server: the load is shifted from a single node to multiple nodes, which improves efficiency. It avoids the danger of the whole system collapsing because a single node fails. It reduces manual maintenance: the management server allocates tasks according to the configured collection content, and the server monitoring system judges the collection status of the child nodes. The observer pattern is adopted: under back-end management, the servers carry out data collection tasks according to the scheme set by the back end and report their status to the back-end server, which achieves management of the servers.
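The observer pattern and message exchange described above might be sketched in Python as follows. The class names, the message fields (`task`, `data_type`), and the `accepted` state are hypothetical; in a real deployment the messages would travel over the network rather than through direct method calls.

```python
from typing import Dict, List


class CollectionNode:
    """Observer: receives task messages and reports its state back."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.received: List[dict] = []

    def update(self, message: dict) -> dict:
        # Store the task and answer with a status message (assumed format).
        self.received.append(message)
        return {"node": self.name, "task": message["task"], "state": "accepted"}


class ManagementServer:
    """Subject: publishes the collection scheme and records node feedback."""

    def __init__(self) -> None:
        self._observers: List[CollectionNode] = []
        self.feedback: Dict[str, str] = {}

    def attach(self, node: CollectionNode) -> None:
        self._observers.append(node)

    def publish(self, message: dict) -> None:
        for node in self._observers:
            reply = node.update(message)
            self.feedback[reply["node"]] = reply["state"]


if __name__ == "__main__":
    manager = ManagementServer()
    for name in ("node-1", "node-2"):
        manager.attach(CollectionNode(name))
    manager.publish({"task": "crawl", "data_type": "price"})
    print(manager.feedback)
```

Keeping the scheme on the management server and letting nodes only react to published messages is what allows node addresses and tasks to change without reconfiguring each node by hand.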
Through the above embodiments, those skilled in the art can readily implement the present invention. It should be understood, however, that the present invention is not limited to the above embodiments. On the basis of the disclosed embodiments, those skilled in the art can combine different technical features in any way to realize different technical solutions.

Claims (3)

1. A method of distributed automated collection, characterized in that the steps of the method are as follows:
Step 1: deploy the single-machine collection program on each server;
Step 2: use one of the servers as the management server that controls the other servers, and configure the addresses of the other servers;
Step 3: put the addresses to be collected into the management server database and allocate tasks;
Step 4: the management server runs the collection programs of the collection servers by means of crawler technology; what type of data is collected, when collection starts, and when it ends are all controlled by the management server;
Step 5: configure the collection work that takes place at a fixed time each month in the management server database, and set timed tasks;
Step 6: use the server monitoring system to judge the collection status of the servers, send information about servers that are collecting abnormally to the administrator, and distribute their collection tasks evenly to other idle servers.
2. The method of distributed automated collection according to claim 1, characterized in that, in said step 3, the task allocation is determined by the management server according to the collection speed of each server.
3. The method of distributed automated collection according to claim 1, characterized in that, in said step 6, the information about servers that are collecting abnormally is sent to the administrator by email.
CN201510106013.1A 2015-03-11 2015-03-11 Distributed automatic collecting method Pending CN104714875A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510106013.1A CN104714875A (en) 2015-03-11 2015-03-11 Distributed automatic collecting method

Publications (1)

Publication Number Publication Date
CN104714875A true CN104714875A (en) 2015-06-17

Family

ID=53414234

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510106013.1A Pending CN104714875A (en) 2015-03-11 2015-03-11 Distributed automatic collecting method

Country Status (1)

Country Link
CN (1) CN104714875A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007156946A (en) * 2005-12-07 2007-06-21 Nec Engineering Ltd Trace device of distributed program
CN101370024A (en) * 2007-08-15 2009-02-18 北京灵图软件技术有限公司 Distributed information collection method and system
CN101867226A (en) * 2010-06-07 2010-10-20 国电南瑞科技股份有限公司 Wide-area distribution type data collection method for dispatching automation system
CN102508709A (en) * 2011-11-30 2012-06-20 国电南瑞科技股份有限公司 Distributed-cache-based acquisition task scheduling method in purchase, supply and selling integrated electric energy acquiring and monitoring system
CN103246592A (en) * 2013-05-13 2013-08-14 北京搜狐新媒体信息技术有限公司 Monitoring acquisition system and method
CN103856565A (en) * 2014-03-18 2014-06-11 浪潮集团有限公司 E-commerce tax source management cloud collection monitoring method
CN104036025A (en) * 2014-06-27 2014-09-10 蓝盾信息安全技术有限公司 Distribution-base mass log collection system
CN104158878A (en) * 2014-08-18 2014-11-19 浪潮(北京)电子信息产业有限公司 Adaptive scheduling distributive monitoring data acquisition method and system

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107222564A (en) * 2017-07-04 2017-09-29 贵州数据宝网络科技有限公司 Collecting method and device
CN110968755A (en) * 2018-09-29 2020-04-07 北京国双科技有限公司 Method and device for crawling data
CN109522183A (en) * 2018-10-23 2019-03-26 东软集团股份有限公司 Instrument operating condition monitoring method and system, collector, server and storage medium
CN109522183B (en) * 2018-10-23 2022-04-12 东软集团股份有限公司 Working state monitoring method and system, collector, server and storage medium
CN111130900A (en) * 2019-12-30 2020-05-08 智慧神州(北京)科技有限公司 Data acquisition method and device based on distributed interconnection of coordination services

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150617
