CN103701661B - Method and system for implementing node monitoring

Info

Publication number
CN103701661B
Authority
CN
China
Prior art keywords
data node
control instruction
information
server
master server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310717518.2A
Other languages
Chinese (zh)
Other versions
CN103701661A (en)
Inventor
刘璧怡
郭美思
宗栋瑞
吴楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN201310717518.2A
Publication of CN103701661A
Application granted
Publication of CN103701661B
Legal status: Active
Anticipated expiration

Abstract

This application discloses a method and system for implementing node monitoring. The system comprises one master server and a corresponding independent proxy server running on each data node. The master server is connected to the name node and obtains cluster configuration information from it; based on a heartbeat protocol, it issues status commands and control instructions to the proxy servers, and it receives the node status information uploaded by the proxy servers in order to update the cluster configuration information. Each proxy server receives the status commands and control instructions from the master server, obtains the status information of its data node according to the status command, and uploads it to the master server; it also controls the working state of each component of the data node according to the control instruction and feeds the control instruction result back to the master server. Because the proxy servers receive the status commands and control instructions of the master server, collect data node status information, apply the control instructions, and feed back the results, the invention achieves monitoring and management of the data nodes.

Description

A method and system for implementing node monitoring
Technical field
The present invention relates to big data processing technology, and in particular to a method and system for implementing node monitoring on a big data platform based on the Hadoop distributed system architecture.
Background
With the continuous development of digital life, the volume of data is growing at an astonishing rate, and the resulting big data is becoming increasingly difficult to process. Big data refers to the intelligence resources and knowledge service capabilities formed by integrating, sharing, and cross-reusing data through data processing and application models based on cloud computing. A big data platform is the foundation that supports the application of big data technology.
Hadoop, currently the most popular big data platform, is a distributed system infrastructure developed by the Apache Foundation. It lets users develop distributed programs without understanding the low-level details of distribution, and it makes full use of the power of a cluster for high-speed computation and storage. A Hadoop cluster often comprises tens, hundreds, or even thousands of data nodes; at this scale, monitoring and managing the data nodes in the cluster quickly and accurately becomes very difficult.
At present, a Hadoop big data platform lets users check the node status of the cluster through the cluster shell command line or a browser. To perform a control operation on a particular node in the cluster, one must log in to that node individually and operate on it through shell commands. When a node in the cluster goes down abnormally, the services that were running on the node before the crash must be restored manually, and the node must then be re-added to the cluster before the cluster can resume normal operation. In a large-scale cluster environment, manual recovery is cumbersome, consumes considerable manpower, and easily introduces new errors, which makes monitoring and recovering cluster nodes very inconvenient.
Summary of the invention
To solve the above technical problem, the invention discloses a method and system for implementing node monitoring that can effectively monitor the status information of data nodes and, when a data node goes down abnormally, recover the crashed data node promptly and effectively.
The present invention provides a system for implementing node monitoring, comprising:
One master server and a corresponding independent proxy server running on each data node; wherein,
The master server is connected to the name node and is configured to obtain cluster configuration information from the name node; to issue, based on a heartbeat protocol, status commands and control instruction information to the proxy servers; and to receive the node status information uploaded by the proxy servers in order to update the cluster configuration information;
The proxy server is configured to receive the status commands and control instruction information from the master server; to obtain data node status information according to the status command and upload it to the master server; and to control the working state of each component of the data node according to the control instruction and feed the control instruction result back to the master server.
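The four kinds of information exchanged between the master server and a proxy server can be pictured as a small message model. The following is a minimal sketch only: the patent does not define a message format, so the names `MessageKind`, `Message`, `node_id`, `payload`, and `is_downstream` are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class MessageKind(Enum):
    """Hypothetical message kinds; the source names the concepts, not a format."""
    STATUS_COMMAND = "status_command"            # master -> proxy: report your state
    CONTROL_INSTRUCTION = "control_instruction"  # master -> proxy: change a component's state
    STATUS_REPORT = "status_report"              # proxy -> master: data node status information
    CONTROL_RESULT = "control_result"            # proxy -> master: control instruction result


@dataclass
class Message:
    kind: MessageKind
    node_id: str                                  # which data node the message concerns
    payload: Dict[str, str] = field(default_factory=dict)


def is_downstream(msg: Message) -> bool:
    """True for messages that the master server issues to a proxy server."""
    return msg.kind in (MessageKind.STATUS_COMMAND, MessageKind.CONTROL_INSTRUCTION)
```

Separating the two downstream kinds from the two upstream kinds mirrors the division of roles described above: the master server only issues, and the proxy server only reports and feeds back.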
Further, the master server is also configured, when a data node goes down abnormally, to send a control instruction for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information;
The proxy server is also configured to have the data node restore, according to the control instruction, the working state of each of its components as recorded in the cluster configuration information, and to feed the control instruction result back to the master server.
Further, the master server is also configured to determine, from the data node status information obtained by the proxy server, whether a data node has gone down abnormally.
Further, the master server is specifically configured to issue the status commands and control instruction information through a message queue.
Further, the proxy server is specifically configured to use a message queue to upload the data node status information and to feed back the control instruction result information.
In another aspect, the application also provides a method for implementing node monitoring, comprising:
Setting up one master server on the name node, and setting up a corresponding independent proxy server on each data node;
The master server obtains cluster configuration information from the name node and, based on a heartbeat protocol, issues status commands and control instruction information to the proxy servers;
The proxy server obtains data node status information according to the status command information, and controls the working state of each component of the data node according to the control instruction information;
The data node status information and the control instruction result information are sent to the master server, and the cluster configuration information is updated.
Further, the method also includes:
When a data node goes down abnormally, the master server sends control instruction information for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information;
The proxy server, according to the control instruction, has the data node restore the working state of each of its components as recorded in the cluster configuration information.
Further, the master server determines, from the data node status information obtained by the proxy server, whether a data node has gone down abnormally.
Further, the master server issues the status commands and control instruction information through a message queue.
Further, the proxy server uses a message queue to upload the data node status information and to feed back the control instruction result information.
The technical solution provided by this application comprises one master server and a corresponding independent proxy server running on each data node. The master server is connected to the name node and is configured to obtain cluster configuration information from the name node; to issue, based on a heartbeat protocol, status commands and control instruction information to the proxy servers; and to receive the node status information uploaded by the proxy servers in order to update the cluster configuration information. The proxy server is configured to receive the status commands and control instruction information from the master server; to obtain data node status information according to the status command and upload it to the master server; and to control the working state of each component of the data node according to the control instruction and feed the control instruction result back to the master server. Because the proxy servers receive the status commands and control instruction information of the master server, collect data node status information, apply the control instructions, and feed back the control instruction result information, the invention achieves monitoring and management of the data nodes.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the invention and form a part of this application. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a structural block diagram of the system for implementing node monitoring according to the invention;
Fig. 2 is a flow chart of the method for implementing node monitoring according to the invention.
Detailed description
So that the technical solution can be fully understood, the heartbeat protocol is first summarized. Data is received and sent over a network through sockets, such as SOCKET in Windows. If a socket has been disconnected, sending and receiving data will inevitably fail. Whether a socket is still usable is determined with a heartbeat protocol. TCP in fact already implements a heartbeat mechanism (keepalive): if it is enabled, TCP sends a configured number of probes within a certain time, and these probes do not affect the protocol defined on top of the connection. A "heartbeat" simply means periodically sending a custom structure so that the peer knows the service is still "online", which ensures the validity of the link.
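The TCP-level mechanism mentioned above corresponds to the standard `SO_KEEPALIVE` socket option. The sketch below shows only how that option is switched on with Python's `socket` module; the helper name `make_keepalive_socket` is an assumption, and the patent does not say whether its heartbeat relies on TCP keepalive at all.

```python
import socket


def make_keepalive_socket() -> socket.socket:
    """Create a TCP socket with the operating system's keepalive probes enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_KEEPALIVE asks the TCP stack to probe an idle connection periodically.
    # Probe timing (idle time, interval, count) is left at the OS defaults here.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock
```

Keepalive only reveals that the connection is dead; it carries no application data, which is why a monitoring system would typically still send its own heartbeat messages.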
Periodically sending a custom structure (a heartbeat packet) to ensure the validity of the connection is the essence of the heartbeat protocol.
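A custom heartbeat structure can be as small as a fixed-size binary record. The layout below (a 4-byte marker, a sequence number, and a millisecond timestamp) is purely an illustrative assumption, as are the names `HEARTBEAT_FORMAT`, `MAGIC`, `pack_heartbeat`, and `unpack_heartbeat`; the patent does not specify the packet contents.

```python
import struct

# Assumed layout, network byte order: 4-byte marker, 32-bit sequence, 64-bit timestamp (ms).
HEARTBEAT_FORMAT = "!4sIQ"
MAGIC = b"HRTB"


def pack_heartbeat(seq: int, timestamp_ms: int) -> bytes:
    """Serialize one heartbeat packet."""
    return struct.pack(HEARTBEAT_FORMAT, MAGIC, seq, timestamp_ms)


def unpack_heartbeat(data: bytes):
    """Parse a heartbeat packet and return (seq, timestamp_ms)."""
    magic, seq, timestamp_ms = struct.unpack(HEARTBEAT_FORMAT, data)
    if magic != MAGIC:
        raise ValueError("not a heartbeat packet")
    return seq, timestamp_ms
```

A sender would call `pack_heartbeat` on a timer and write the bytes to the connection; the receiver treats a long gap between successfully parsed packets as a sign that the peer is no longer online.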
Fig. 1 is a structural block diagram of the system for implementing node monitoring according to the invention. As shown in Fig. 1, the system comprises:
One master server and a corresponding independent proxy server running on each data node; wherein,
The master server is connected to the name node and is configured to obtain cluster configuration information from the name node; to issue, based on a heartbeat protocol, status commands and control instruction information to the proxy servers; and to receive the node status information uploaded by the proxy servers in order to update the cluster configuration information;
The proxy server is configured to receive the status commands and control instruction information from the master server; to obtain data node status information according to the status command and upload it to the master server; and to control the working state of each component of the data node according to the control instruction and feed the control instruction result back to the master server.
Further, the master server is also configured, when a data node goes down abnormally, to send a control instruction for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information;
The proxy server is also configured to have the data node restore, according to the control instruction, the working state of each of its components as recorded in the cluster configuration information, and to feed the control instruction result back to the master server.
The master server is also configured to determine, from the data node status information obtained by the proxy server, whether a data node has gone down abnormally.
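The source does not state the criterion by which the master server decides that a data node has gone down. One common choice, shown here only as an assumption, is a timeout on the most recent status report. The names `HEARTBEAT_TIMEOUT_S` and `find_down_nodes`, and the 30-second threshold, are hypothetical.

```python
from typing import Dict, List

HEARTBEAT_TIMEOUT_S = 30.0  # assumed threshold; the source specifies no value


def find_down_nodes(last_report_time: Dict[str, float], now: float,
                    timeout: float = HEARTBEAT_TIMEOUT_S) -> List[str]:
    """Return the IDs of data nodes whose last status report is older than `timeout` seconds."""
    return sorted(node for node, seen in last_report_time.items()
                  if now - seen > timeout)
```

For example, with reports last seen at t=100 s and t=60 s and the current time at t=110 s, only the second node exceeds the 30-second window and would be treated as down.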
The master server is specifically configured to issue the status commands and control instructions through a message queue.
The proxy server is specifically configured to use a message queue to upload the data node status information and to feed back the control instruction results.
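The message-queue exchange can be sketched with two in-process queues, one per direction. This is a simplified stand-in under stated assumptions: the patent does not name a message queue product, and `proxy_handle`, the tuple layouts, and the `"ok"` result value are all hypothetical. A real deployment would use a networked queue between machines.

```python
import queue
from typing import Dict


def proxy_handle(downstream: queue.Queue, upstream: queue.Queue,
                 node_id: str, component_states: Dict[str, str]) -> None:
    """Proxy-server side: drain pending messages from the master and reply upstream."""
    while True:
        try:
            kind, body = downstream.get_nowait()
        except queue.Empty:
            break  # nothing left to process in this round
        if kind == "status_command":
            # Upload a snapshot of the data node status information.
            upstream.put(("status_report", node_id, dict(component_states)))
        elif kind == "control_instruction":
            component, desired = body
            component_states[component] = desired  # stand-in for starting/stopping a service
            upstream.put(("control_result", node_id, (component, desired, "ok")))
```

Using separate queues for the two directions keeps issuing and reporting decoupled, so the master server never blocks waiting for a slow data node.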
Fig. 2 is a flow chart of the method for implementing node monitoring according to the invention. As shown in Fig. 2, the method comprises:
Step 200: set up one master server on the name node, and set up a corresponding independent proxy server on each data node.
Step 201: the master server obtains cluster configuration information from the name node and, based on a heartbeat protocol, issues status commands and control instruction information to the proxy servers.
In this step, the master server issues the status commands and control instructions through a message queue.
Step 202: the proxy server obtains data node status information according to the status command information, and controls the working state of each component of the data node according to the control instruction information.
In this step, the proxy server uses a message queue to upload the data node status information and to feed back the control instruction result information.
Step 203: the data node status information and the control instruction result information are sent to the master server, and the cluster configuration information is updated.
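Steps 200 to 203 can be traced in a small in-memory model. The sketch below replaces the heartbeat protocol and the message queue with direct method calls to keep it short; the class and method names (`ProxyServer`, `MasterServer`, `heartbeat_round`, `control`) and the component name `DataNode` are illustrative assumptions, not taken from the patent.

```python
from typing import Dict


class ProxyServer:
    """Runs on one data node and controls its components."""

    def __init__(self, node_id: str, components: Dict[str, str]) -> None:
        self.node_id = node_id
        self.components = components  # component name -> working state

    def handle_status_command(self) -> Dict[str, str]:
        # Step 202: collect the data node status information.
        return dict(self.components)

    def handle_control_instruction(self, component: str, state: str) -> str:
        # Step 202: change the working state of one component.
        self.components[component] = state
        return "ok"


class MasterServer:
    def __init__(self, cluster_config: Dict[str, Dict[str, str]]) -> None:
        # Step 201: obtained from the name node in the patent; passed in directly here.
        self.cluster_config = cluster_config
        self.proxies: Dict[str, ProxyServer] = {}
        self.results: Dict[str, str] = {}

    def register(self, proxy: ProxyServer) -> None:
        # Step 200: one proxy server per data node.
        self.proxies[proxy.node_id] = proxy

    def heartbeat_round(self) -> None:
        # Steps 201 and 203: issue a status command to every proxy and
        # update the cluster configuration information with the replies.
        for node_id, proxy in self.proxies.items():
            self.cluster_config[node_id] = proxy.handle_status_command()

    def control(self, node_id: str, component: str, state: str) -> None:
        # Steps 201 and 203: issue a control instruction and record the fed-back result.
        self.results[node_id] = self.proxies[node_id].handle_control_instruction(component, state)
```

A typical round would register the proxies, call `heartbeat_round()` on each heartbeat tick, and call `control()` whenever an operator or a recovery policy wants a component started or stopped.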
The method of the invention also includes:
When a data node goes down abnormally, the master server sends control instruction information for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information.
In this step, the master server determines, from the data node status information obtained by the proxy server, whether a data node has gone down abnormally.
The proxy server, according to the control instruction, has the data node restore the working state of each of its components as recorded in the cluster configuration information.
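The recovery step can be read as comparing the component states recorded in the cluster configuration information with the states a recovering node reports, then issuing one control instruction per difference. This reading is an assumption, since the source does not describe how the instructions are derived; the function names and the component names `DataNode` and `NodeManager` are illustrative.

```python
from typing import Dict, List, Tuple


def recovery_instructions(recorded: Dict[str, str],
                          reported: Dict[str, str]) -> List[Tuple[str, str]]:
    """Master side: list (component, target_state) for each component that differs from the record."""
    return [(name, state) for name, state in sorted(recorded.items())
            if reported.get(name) != state]


def apply_instructions(components: Dict[str, str],
                       instructions: List[Tuple[str, str]]) -> Dict[str, str]:
    """Proxy side: apply each instruction and return a per-component result to feed back."""
    results = {}
    for name, state in instructions:
        components[name] = state  # stand-in for restarting the actual service
        results[name] = "ok"
    return results
```

Deriving the instructions from the recorded configuration, rather than hard-coding them, is what lets the same procedure restore any node without manual login.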
A person of ordinary skill in the art will appreciate that all or some of the steps of the above method can be carried out by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Alternatively, all or some of the steps of the above embodiments can be implemented with one or more integrated circuits. Accordingly, each module/unit in the above embodiments can be implemented in hardware or as a software function module. This application is not limited to any particular combination of hardware and software.
The above are only preferred embodiments of the invention and are not intended to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (2)

1. A system for implementing node monitoring, characterized in that it is applied to a big data platform based on the Hadoop distributed system architecture and comprises: one master server and a corresponding independent proxy server running on each data node; wherein,
The master server is connected to the name node and is configured to obtain cluster configuration information from the name node; to issue, based on a heartbeat protocol, status commands and control instructions to the proxy servers; to receive the node status information uploaded by the proxy servers in order to update the cluster configuration information; to determine, from the data node status information obtained by the proxy server, whether a data node has gone down abnormally; and, when a data node goes down abnormally, to send a control instruction for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information;
The proxy server is configured to receive the status commands and control instructions from the master server; to obtain data node status information according to the status command and upload it to the master server; to control the working state of each component of the data node according to the control instruction and feed the control instruction result information back to the master server; and to have the data node restore, according to the control instruction, the working state of each of its components as recorded in the cluster configuration information, and feed the control instruction result information back to the master server;
Wherein the master server is specifically configured to issue the status commands and control instructions through a message queue;
The proxy server is specifically configured to use a message queue to upload the data node status information and to feed back the control instruction result information.
2. A method for implementing node monitoring, characterized in that it is applied to a big data platform based on the Hadoop distributed system architecture and comprises:
Setting up one master server on the name node, and setting up a corresponding independent proxy server on each data node;
The master server obtains cluster configuration information from the name node and, based on a heartbeat protocol, issues status commands and control instructions to the proxy servers;
The proxy server obtains data node status information according to the status command, and controls the working state of each component of the data node according to the control instruction;
The data node status information and the control instruction result information are sent to the master server, and the cluster configuration information is updated;
Wherein the master server determines, from the data node status information obtained by the proxy server, whether a data node has gone down abnormally;
When a data node goes down abnormally, the master server sends a control instruction for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information;
The proxy server, according to the control instruction, has the data node restore the working state of each of its components as recorded in the cluster configuration information;
The master server issues the status commands and control instructions through a message queue;
The proxy server uses a message queue to upload the data node status information and to feed back the control instruction result information.
CN201310717518.2A 2013-12-23 2013-12-23 A kind of method and system for realizing monitoring nodes Active CN103701661B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310717518.2A CN103701661B (en) 2013-12-23 2013-12-23 A kind of method and system for realizing monitoring nodes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310717518.2A CN103701661B (en) 2013-12-23 2013-12-23 A kind of method and system for realizing monitoring nodes

Publications (2)

Publication Number Publication Date
CN103701661A CN103701661A (en) 2014-04-02
CN103701661B true CN103701661B (en) 2017-08-25

Family

ID=50363064

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310717518.2A Active CN103701661B (en) 2013-12-23 2013-12-23 A kind of method and system for realizing monitoring nodes

Country Status (1)

Country Link
CN (1) CN103701661B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109324834A (en) * 2018-09-19 2019-02-12 郑州云海信息技术有限公司 A kind of system and method that distributed storage server is restarted automatically

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104135388B (en) * 2014-08-15 2017-06-06 曙光信息产业(北京)有限公司 The method for managing security of back end in a kind of distributed system
CN104866380B (en) * 2015-06-18 2018-07-06 北京搜狐新媒体信息技术有限公司 A kind for the treatment of method and apparatus of the state conversion of cluster management system
CN105007193A (en) * 2015-08-19 2015-10-28 浪潮(北京)电子信息产业有限公司 Multi-layer information processing method, system thereof and cluster management node
CN105915405A (en) * 2016-03-29 2016-08-31 深圳市中博科创信息技术有限公司 Large-scale cluster node performance monitoring system
CN105872055A (en) * 2016-03-31 2016-08-17 浪潮通用软件有限公司 Online monitoring method and system for computer systems in network distributed deployment mode
CN106126283B (en) * 2016-06-21 2019-05-14 浪潮电子信息产业股份有限公司 A kind of method, apparatus and system of product allocation
CN106557543A (en) * 2016-10-14 2017-04-05 深圳前海微众银行股份有限公司 Node switching method and system
CN106506203B (en) * 2016-10-25 2019-12-10 杭州云象网络技术有限公司 Node monitoring system applied to block chain
CN106802852A (en) * 2017-01-19 2017-06-06 郑州云海信息技术有限公司 A kind of method of Linux platform component unified monitoring
CN107819553B (en) * 2017-09-28 2020-10-30 青岛海信网络科技股份有限公司 Control instruction feedback method and device
CN108363610A (en) * 2018-02-09 2018-08-03 华为技术有限公司 A kind of control method and equipment of virtual machine monitoring plug-in unit
CN109656570B (en) * 2018-12-18 2022-03-22 江苏满运软件科技有限公司 Cluster system, operation method thereof, electronic device and storage medium
WO2020206638A1 (en) * 2019-04-10 2020-10-15 Beijing Voyager Technology Co., Ltd. Systems and methods for data storage
CN113051102B (en) * 2019-12-26 2024-03-19 中国移动通信集团云南有限公司 File backup method, device, system, storage medium and computer equipment
CN111506480B (en) * 2020-04-23 2024-03-08 上海达梦数据库有限公司 Method, device and system for detecting states of components in cluster
CN115379012B (en) * 2022-10-25 2023-03-24 航天云网数据研究院(广东)有限公司 Industrial interconnection platform message queue deployment method and device based on identification analysis

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101667034A (en) * 2009-09-21 2010-03-10 北京航空航天大学 Scalable monitoring system supporting hybrid clusters
CN102104628A (en) * 2010-12-29 2011-06-22 北京新媒传信科技有限公司 Server cluster system and management method thereof
CN102394791A (en) * 2011-10-26 2012-03-28 浪潮(北京)电子信息产业有限公司 Downtime recovery method and system
CN102761570A (en) * 2011-04-28 2012-10-31 同济大学 System and method for monitoring grid resources based on agents
KR20130073372A (en) * 2011-12-23 2013-07-03 주식회사 포스코 Heat recovery apparatus of coke oven and method of the same

Similar Documents

Publication Publication Date Title
CN103701661B (en) A kind of method and system for realizing monitoring nodes
US11469970B2 (en) Methods and apparatus for providing adaptive private network centralized management system data visualization processes
EP3526994B1 (en) Network management interface
US20170286517A1 (en) Systems and methods for managing distributed database deployments
EP2849064B1 (en) Method and apparatus for network virtualization
CN102801559B (en) Intelligent local area network data collecting method
CN103916264B (en) Optimize the virtual network of physical network
WO2019042571A1 (en) Asynchronous gradient averaging distributed stochastic gradient descent
EP2934036B1 (en) System and method for managing cwsn communication data based on gui interaction
CN102571420B (en) Method and system for network element data management
JP2010507298A (en) Method for logical deployment, undeployment, and monitoring of a target IP network
CN107463582A (en) The method and device of distributed deployment Hadoop clusters
CN104391697B (en) The cloud resource management system and method for application program
CN108718347A (en) A kind of domain name analytic method, system, device and storage medium
EP3890243A1 (en) Method and apparatus for network verification
CN104243230B (en) The method and apparatus of monitoring data in a kind of acquisition Linux server
CN106464584A (en) Providing router information according to a programmatic interface
CN112437072A (en) Virtual machine flow traction system, method, equipment and medium in cloud platform
CN110062054A (en) Internet of things equipment long-range control method and system
EP3172682A1 (en) Distributing and processing streams over one or more networks for on-the-fly schema evolution
CN102148702B (en) Method for managing network by utilizing network configuration protocol
Porter et al. The Design and Implementation of a RESTful IoT Service Using the MERN Stack
CN108345518A (en) A kind of data recovery system after software crash and its restoration methods
CN109561025A (en) A kind of information processing method and relevant device
CN105847039A (en) Network monitoring method and network monitoring system based on dynamic executable script

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant