CN103701661B

CN103701661B - A method and system for node monitoring

Info

Publication number: CN103701661B
Application number: CN201310717518.2A
Authority: CN
Inventors: 刘璧怡; 郭美思; 宗栋瑞; 吴楠
Original assignee: Inspur Beijing Electronic Information Industry Co Ltd
Current assignee: Inspur Beijing Electronic Information Industry Co Ltd
Priority date: 2013-12-23
Filing date: 2013-12-23
Publication date: 2017-08-25
Anticipated expiration: 2033-12-23
Also published as: CN103701661A

Abstract

This application discloses a method and system for node monitoring, comprising one master server and a corresponding independent proxy server running on each data node. The master server is connected to the name node and obtains the cluster configuration information; based on a heartbeat protocol, it issues status commands and control instructions to the proxy servers, and it receives the node status information uploaded by the proxy servers in order to update the cluster configuration information. Each proxy server receives the status commands and control instructions from the master server, obtains the data node status information according to the status command, and uploads it to the master server; it also controls the working state of each component of the data node according to the control instruction and feeds the result of the control instruction back to the master server. By having the proxy servers receive the master server's status commands and control instructions, obtain data node status information, carry out control instructions, and feed back their results, the invention achieves monitoring and management of the data nodes.

Description

A method and system for node monitoring

Technical field

The present invention relates to big data processing technology, and in particular to a method and system for node monitoring on a big data platform built on the distributed system architecture Hadoop.

Background

With the continuing development of digital life, the volume of data is growing at an astonishing rate, and the resulting big data is becoming increasingly difficult to process. Big data is a data processing and application model based on cloud computing; through the integration, sharing, and cross-reuse of data, it forms intellectual resources and knowledge service capabilities. The big data platform is the foundation that supports big data applications.

Hadoop, currently the most popular big data platform, is a distributed system infrastructure platform developed by the Apache Foundation. It lets users develop distributed programs without knowing the low-level details of distribution, making full use of the cluster for high-speed computation and storage. A Hadoop cluster often comprises tens, hundreds, or even thousands of data nodes; at this scale, monitoring and managing the data nodes in the cluster quickly and accurately becomes extremely difficult.

At present, the Hadoop big data platform lets users check the status of the nodes in a cluster through the cluster shell command line or a browser. To perform a control operation on a particular node in the cluster, the user must log in to that node individually and operate on it with shell commands. When a node in the cluster crashes abnormally, the services that were running on the node before the crash must be restored manually, and the node must then be rejoined to the cluster before the cluster can resume normal operation. In a large-scale cluster environment, manual recovery is cumbersome, consumes a great deal of labor, and easily introduces new errors, which makes monitoring and recovering cluster nodes very inconvenient.

Summary of the invention

To solve the above technical problem, the invention discloses a method and system for node monitoring that can effectively monitor the status information of data nodes and, when a data node crashes abnormally, recover the crashed data node in a timely and effective manner.

The present invention provides a system for node monitoring, comprising:

One master server and a corresponding independent proxy server running on each data node, wherein:

The master server is connected to the name node and is configured to obtain the cluster configuration information from the name node; to issue, based on a heartbeat protocol, status commands and control instructions to the proxy servers; and to receive the node status information uploaded by the proxy servers in order to update the cluster configuration information.

The proxy server is configured to receive the status commands and control instructions from the master server; to obtain the data node status information according to the status command and upload it to the master server; and to control the working state of each component of the data node according to the control instruction and feed the result of the control instruction back to the master server.
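The proxy server's two duties described above, answering a status command and applying a control instruction to a component, can be illustrated with a minimal sketch. The class name `ProxyServer`, its method names, the message shapes, and the component names are all assumptions made for illustration; the patent does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyServer:
    """Hypothetical agent running on one data node (names are illustrative)."""
    node_id: str
    # Component name -> working state; the component names are assumptions.
    components: dict = field(default_factory=lambda: {
        "datanode": "running", "tasktracker": "running"})

    def handle_status_command(self) -> dict:
        """Collect the data node status information to upload to the master."""
        return {"node": self.node_id, "components": dict(self.components)}

    def handle_control_instruction(self, component: str, action: str) -> dict:
        """Apply a start/stop action to one component and report the result."""
        if component not in self.components or action not in ("start", "stop"):
            return {"node": self.node_id, "component": component, "ok": False}
        self.components[component] = "running" if action == "start" else "stopped"
        return {"node": self.node_id, "component": component, "ok": True}


proxy = ProxyServer("dn1")
result = proxy.handle_control_instruction("tasktracker", "stop")
status = proxy.handle_status_command()
```

In this sketch the returned dictionaries stand in for the status upload and the control-instruction result that the proxy server feeds back to the master server.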

Further, the master server is also configured to, when a data node crashes abnormally, send a control instruction for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information.

The proxy server is also configured to, according to the control instruction, control the data node to restore the working state of each of its components according to the cluster configuration information, and to feed the result of the control instruction back to the master server.

Further, the master server is also configured to determine, from the data node status information obtained by the proxy server, whether the data node has crashed abnormally.

Further, the master server is specifically configured to issue the status commands and control instructions by means of a message queue.

Further, the proxy server is specifically configured to upload the data node status information and feed back the control instruction results by means of a message queue.
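The message-queue exchange can be sketched with Python's standard `queue` module. This is only an in-process illustration under stated assumptions: the patent does not name a message queue product, and the queue names, message fields, and the `shutdown` message used to stop the worker are hypothetical.

```python
import queue
import threading

commands = queue.Queue()  # master -> proxy: status commands, control instructions
reports = queue.Queue()   # proxy -> master: status information, instruction results


def proxy_loop(node_id: str) -> None:
    """Consume messages from the master until a 'shutdown' message arrives."""
    while True:
        msg = commands.get()
        if msg["type"] == "shutdown":
            break
        if msg["type"] == "status":
            reports.put({"node": node_id, "type": "status", "state": "alive"})
        elif msg["type"] == "control":
            reports.put({"node": node_id, "type": "result",
                         "instruction": msg["instruction"], "ok": True})


worker = threading.Thread(target=proxy_loop, args=("dn1",))
worker.start()
commands.put({"type": "status"})
commands.put({"type": "control", "instruction": "restart datanode"})
commands.put({"type": "shutdown"})
worker.join()

# The master side drains the uploads in the order the proxy produced them.
received = [reports.get() for _ in range(2)]
```

Decoupling the two directions into separate queues lets the master server issue commands without waiting for each proxy server to answer; a real deployment would presumably use a networked queue rather than in-process objects.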

In another aspect, the application also provides a method for node monitoring, comprising:

Setting up one master server on the name node, and setting up a corresponding independent proxy server on each data node;

The master server obtaining the cluster configuration information from the name node and, based on a heartbeat protocol, issuing status commands and control instructions to the proxy servers;

The proxy server obtaining the data node status information according to the status command, and controlling the working state of each component of the data node according to the control instruction;

Sending the data node status information and the control instruction results to the master server, and updating the cluster configuration information.

Further, the method also includes:

When a data node crashes abnormally, the master server sends a control instruction for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information;

The proxy server, according to the control instruction, controls the data node to restore the working state of each of its components according to the cluster configuration information.

Further, the master server determines, from the data node status information obtained by the proxy server, whether the data node has crashed abnormally.

Further, the master server issues the status commands and control instructions by means of a message queue.

Further, the proxy server uploads the data node status information and feeds back the control instruction results by means of a message queue.

The technical solution provided by this application comprises one master server and a corresponding independent proxy server running on each data node. The master server is connected to the name node and is configured to obtain the cluster configuration information from the name node; to issue, based on a heartbeat protocol, status commands and control instructions to the proxy servers; and to receive the node status information uploaded by the proxy servers in order to update the cluster configuration information. The proxy server is configured to receive the status commands and control instructions from the master server, to obtain the data node status information according to the status command and upload it to the master server, and to control the working state of each component of the data node according to the control instruction and feed the result back to the master server. By having the proxy servers receive the master server's status commands and control instructions, obtain data node status information, carry out control instructions, and feed back their results, the invention achieves monitoring and management of the data nodes.

Brief description of the drawings

The accompanying drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:

Fig. 1 is a structural block diagram of the system for node monitoring according to the present invention;

Fig. 2 is a flow chart of the method for node monitoring according to the present invention.

Detailed description of the embodiments

So that the technical solution can be fully understood, the heartbeat protocol is first briefly described. Receiving and sending data over a network is done through sockets (SOCKET in Windows). If the socket has been disconnected, sending and receiving data will inevitably fail. Whether a socket is still usable is determined by means of a heartbeat protocol. TCP itself already implements a mechanism of this kind, known as keep-alive: if it is enabled, TCP sends the configured number of heartbeats within a certain period, and this information does not interfere with the user-defined protocol. A so-called "heartbeat" is a custom structure sent at regular intervals so that the other side knows the service is "online", which ensures the validity of the link.

Sending a custom structure (a heartbeat packet) at regular intervals to ensure the validity of the connection is the essence of the heartbeat protocol.
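As a rough illustration of a "custom structure" sent as a heartbeat packet, the sketch below encodes one with Python's `struct` module and checks link validity from the time of the last packet. The field layout, the magic value `HBT1`, and the rule of tolerating three missed intervals are assumptions; the patent does not define the packet format or the timeout policy.

```python
import struct

# Assumed layout (network byte order): 4-byte magic, 4-byte unsigned node
# number, 8-byte float send time. None of these fields come from the patent.
HEARTBEAT = struct.Struct("!4sId")
MAGIC = b"HBT1"


def pack_heartbeat(node_no: int, sent_at: float) -> bytes:
    """Encode one heartbeat packet."""
    return HEARTBEAT.pack(MAGIC, node_no, sent_at)


def unpack_heartbeat(data: bytes) -> tuple:
    """Decode a heartbeat packet and return (node_no, sent_at)."""
    magic, node_no, sent_at = HEARTBEAT.unpack(data)
    if magic != MAGIC:
        raise ValueError("not a heartbeat packet")
    return node_no, sent_at


def link_is_valid(last_seen: float, now: float, interval: float,
                  missed: int = 3) -> bool:
    """Treat the link as valid until `missed` intervals pass without a packet."""
    return now - last_seen <= interval * missed


packet = pack_heartbeat(7, 1000.0)
node_no, sent_at = unpack_heartbeat(packet)
```

The sender would call `pack_heartbeat` on a timer, and the receiver would record the arrival time of each packet and pass it to `link_is_valid` as `last_seen`.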

Fig. 1 is a structural block diagram of the system for node monitoring according to the present invention. As shown in Fig. 1, the system includes one master server and a corresponding independent proxy server running on each data node.

Further, the master server is also configured to, when a data node crashes abnormally, send a control instruction for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information.

The master server is also configured to determine, from the data node status information obtained by the proxy server, whether the data node has crashed abnormally.

The master server is specifically configured to issue the status commands and control instructions by means of a message queue.

The proxy server is specifically configured to upload the data node status information and feed back the control instruction results by means of a message queue.

Fig. 2 is a flow chart of the method for node monitoring according to the present invention. As shown in Fig. 2, the method includes:

Step 200: set up one master server on the name node, and set up a corresponding independent proxy server on each data node.

Step 201: the master server obtains the cluster configuration information from the name node and, based on a heartbeat protocol, issues status commands and control instructions to the proxy servers.

In this step, the master server issues the status commands and control instructions by means of a message queue.

Step 202: the proxy server obtains the data node status information according to the status command, and controls the working state of each component of the data node according to the control instruction.

In this step, the proxy server uploads the data node status information and feeds back the control instruction results by means of a message queue.

Step 203: the data node status information and the control instruction results are sent to the master server, and the cluster configuration information is updated.
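Steps 201 to 203 can be read as one monitoring round. The sketch below walks through such a round for proxy servers that are assumed to have been deployed already in step 200. The function name, the shape of the configuration data, and the use of plain callables in place of networked proxy servers are illustrative assumptions, not details from the patent.

```python
def run_monitoring_round(name_node_config: dict, proxies: dict) -> dict:
    """Run one round of steps 201-203 and return the updated configuration.

    `name_node_config` maps node id -> list of expected components (assumed shape).
    `proxies` maps node id -> callable that answers a status command; every
    key in `proxies` is expected to appear in `name_node_config`.
    """
    # Step 201: the master server obtains the cluster configuration information.
    cluster_config = {node: {"components": list(comps), "status": "unknown"}
                      for node, comps in name_node_config.items()}
    for node, query_status in proxies.items():
        # Steps 201-202: issue a status command; the proxy gathers node status.
        reported = query_status()
        # Step 203: the upload is used to update the cluster configuration.
        cluster_config[node]["status"] = reported
    return cluster_config


name_node_config = {"dn1": ["datanode"], "dn2": ["datanode", "tasktracker"]}
proxies = {"dn1": lambda: "healthy", "dn2": lambda: "healthy"}
updated = run_monitoring_round(name_node_config, proxies)
```

A node that has no reachable proxy server keeps the status `"unknown"` in this sketch, which is one possible signal for the master server to investigate further.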

The method of the invention also includes:

When a data node crashes abnormally, the master server sends a control instruction for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information.

In this step, the master server determines, from the data node status information obtained by the proxy server, whether the data node has crashed abnormally.

The proxy server, according to the control instruction, controls the data node to restore the working state of each of its components according to the cluster configuration information.
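One way to picture the recovery step is as a comparison between the state recorded in the cluster configuration information and the state the proxy servers report. The sketch below derives the control instructions from that difference. The function name, the data shapes, the instruction tuples, and the rule that a missing node or component counts as not running are all assumptions for illustration.

```python
def plan_recovery(cluster_config: dict, reported: dict) -> list:
    """Build control instructions that restore each node to its configured state.

    `cluster_config`: node id -> {component: expected state}.
    `reported`: node id -> {component: observed state}; a missing node or
    component is treated as not running (an assumption of this sketch).
    """
    instructions = []
    for node, expected in sorted(cluster_config.items()):
        observed = reported.get(node, {})
        for component, state in sorted(expected.items()):
            if state == "running" and observed.get(component) != "running":
                instructions.append((node, "start", component))
    return instructions


cluster_config = {
    "dn1": {"datanode": "running", "tasktracker": "running"},
    "dn2": {"datanode": "running", "tasktracker": "stopped"},
}
# dn2 sent no report at all, as a crashed node would.
reported = {"dn1": {"datanode": "running", "tasktracker": "stopped"}}
plan = plan_recovery(cluster_config, reported)
```

Because the plan is computed from the stored configuration, components that were deliberately stopped before the crash (such as `tasktracker` on `dn2` here) are not restarted.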

A person of ordinary skill in the art will appreciate that all or some of the steps of the above method may be carried out by a program instructing the relevant hardware, and that the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Alternatively, all or some of the steps of the above embodiments may be implemented with one or more integrated circuits. Accordingly, each module or unit in the above embodiments may be implemented in hardware or as a software function module. The application is not limited to any particular combination of hardware and software.

The above are merely preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A system for node monitoring, characterized in that it is applied to a big data platform of the distributed system architecture Hadoop and comprises one master server and a corresponding independent proxy server running on each data node, wherein:

The master server is connected to the name node and is configured to obtain the cluster configuration information from the name node; to issue, based on a heartbeat protocol, status commands and control instructions to the proxy servers; to receive the node status information uploaded by the proxy servers in order to update the cluster configuration information; to determine, from the data node status information obtained by the proxy server, whether a data node has crashed abnormally; and, when a data node crashes abnormally, to send a control instruction for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information;

The proxy server is configured to receive the status commands and control instructions from the master server; to obtain the data node status information according to the status command and upload it to the master server; to control the working state of each component of the data node according to the control instruction and feed the control instruction result back to the master server; and, according to the control instruction, to control the data node to restore the working state of each of its components according to the cluster configuration information and feed the control instruction result back to the master server;

Wherein the master server is specifically configured to issue the status commands and control instructions by means of a message queue;

The proxy server is specifically configured to upload the data node status information and feed back the control instruction results by means of a message queue.

2. A method for node monitoring, characterized in that it is applied to a big data platform of the distributed system architecture Hadoop and comprises:

Setting up one master server on the name node, and setting up a corresponding independent proxy server on each data node;

The master server obtaining the cluster configuration information from the name node and, based on a heartbeat protocol, issuing status commands and control instructions to the proxy servers;

The proxy server obtaining the data node status information according to the status command, and controlling the working state of each component of the data node according to the control instruction;

Sending the data node status information and the control instruction results to the master server, and updating the cluster configuration information;

Wherein the master server determines, from the data node status information obtained by the proxy server, whether a data node has crashed abnormally;

When a data node crashes abnormally, the master server sends a control instruction for restoring the configuration of the crashed node to the proxy server according to the cluster configuration information;

The proxy server, according to the control instruction, controls the data node to restore the working state of each of its components according to the cluster configuration information;

The master server issues the status commands and control instructions by means of a message queue;

The proxy server uploads the data node status information and feeds back the control instruction results by means of a message queue.