CN103810136A - Computer cluster, management method and management system for computer cluster - Google Patents

Publication number
CN103810136A
Authority
CN
China
Prior art keywords
information
node
database
data
proxy server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210453826.4A
Other languages
Chinese (zh)
Inventor
王明仁
于立杰
赖传霖
郭嘉真
张西亚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201210453826.4A
Publication of CN103810136A
Legal status: Pending

Landscapes

  • Computer And Data Communications (AREA)

Abstract

The invention discloses a computer cluster comprising at least one node and a management system. Each node comprises a proxy server (agent) and corresponds to role data. The proxy server collects software operational data of the node and, when the node produces event information, transmits node information comprising the role data, the software operational data, and the event information. The management system comprises a proxy server management module and a database. The proxy server management module searches the database according to the node information and, if established solution information related to the node information is found, returns it to the node. The proxy server of the node then produces a complete solution according to the established solution information, in combination with the role data. The invention also discloses a management method and a management system for the computer cluster.

Description

Computer cluster, and management method and management system for computer cluster
Technical field
The present invention relates to a computing device, and in particular to a computer cluster and to a management method and a management system for a computer cluster.
Background art
In recent years, the film "Avatar" set off a wave of enthusiasm for three-dimensional (3D) display, and render farms emerged along with it. A render farm is a kind of computer cluster used mainly to carry out the large volume of image rendering work associated with 3D computer graphics. More specifically, a render farm is a system built from many closely cooperating computers to handle heavy computational workloads; it is typically used for computations such as image shading, frame compositing, and cloth simulation. Each computer is called a node of the computer cluster.
In a computer cluster, the software installation and configuration procedure may differ from node to node because the nodes have different roles (for example, cluster supervisor, license server, or computing engine), different hardware configurations, or different operating systems. How to troubleshoot a node efficiently and with minimal manpower when a problem occurs during its operation is therefore a topic worth exploring.
An existing method and system for managing and installing operating system images (OS images), for example the technology disclosed in U.S. Patent Application Publication No. 2008/0046708 A1, can be implemented in an operating system deployment system. The operating system deployment system comprises at least one target device, at least one server device, and a policy store. The server device comprises an operating system management server, and the policy data in the policy store defines associations between specific policy criteria data instances and OS image instances.
A client agent of the target device collects the policy criteria data (also called configuration data) of the target device and sends it to the operating system management server. The operating system management server searches the policy store according to the policy criteria data from the target device; if a pre-existing OS image corresponding to the policy criteria data is found, that pre-existing OS image is downloaded to and installed on the target device. The policy criteria data comprises hardware configuration data (for example, a microprocessor identifier, a blade slot location, or a memory size) and user input data (for example, a user identifier).
However, the above prior art mainly provides the target device with a pre-existing OS image that corresponds to its policy criteria data (that is, its hardware configuration data and user input data) for installation. For any node in a computer cluster, the prior art can only overwrite the node's original OS image with the pre-existing OS image in order to recover the node's operating system; it can hardly provide a complete solution tailored to a problem that occurs during the node's operation, nor troubleshoot that problem automatically.
Summary of the invention
An object of the present invention is to provide a computer cluster.
The object of the present invention and the solution to its technical problem are achieved by the following technical solution.
The computer cluster of the present invention comprises at least one node and a management system. The node comprises a proxy server and corresponds to preset role data. The proxy server collects software operational data of the node and, when the node produces event information, transmits node information comprising the role data, the software operational data, and the event information. The management system communicates with the node and comprises a proxy server management module and a database electrically connected to the proxy server management module; the database contains at least one piece of established solution information. The proxy server management module searches the database according to the node information from the proxy server of the node and, if established solution information related to the node information is found in the database, returns that established solution information to the node. The established solution information indicates the action that the node needs to carry out.
The proxy server of the node also produces, according to the established solution information related to the node information and in combination with the role data, a complete solution corresponding to the event information; the complete solution comprises at least one instruction to be executed on the node.
The object of the present invention and the solution to its technical problem can be further achieved by the following technical measures.
Preferably, in the aforesaid computer cluster, the proxy server of the node also collects hardware configuration data of the node, and produces the complete solution corresponding to the event information according to the established solution information related to the node information, in combination with the role data and the hardware configuration data.
Preferably, in the aforesaid computer cluster, the proxy server of the node collects the software operational data and the hardware configuration data of the node according to the role data corresponding to the node and software/hardware environment setting data related to the node.
Preferably, in the aforesaid computer cluster, the database further comprises at least one criterion and a correspondence between the criterion and the established solution information.
Preferably, in the aforesaid computer cluster, the proxy server management module of the management system derives a set of query conditions from the node information and then searches the database according to the set of query conditions. If the proxy server management module finds in the database a criterion that matches the set of query conditions, the established solution information related to the node information has been found: the established solution information corresponding to the matching criterion is the established solution information related to the node information.
Preferably, in the aforesaid computer cluster, the criterion in the database comprises established role data, established event information, and an established key data group. The proxy server management module first obtains the role data and the event information from the node information, then extracts a relevant key data group from the software operational data according to at least one of the role data and the event information, and finally searches the database using the role data, the event information, and the key data group as the set of query conditions.
Preferably, in the aforesaid computer cluster, the management system further comprises a database update interface module. If the proxy server management module does not find established solution information related to the node information in the database, the database update interface module provides a database update interface through which a user can create new established solution information related to the node information.
Another object of the present invention is to provide a management method for a computer cluster.
The object of the present invention and the solution to its technical problem are achieved by the following technical solution.
In the management method for a computer cluster according to the present invention, the computer cluster comprises at least one node and a management system that communicates with the node. The node corresponds to preset role data, the management system comprises a database, and the database contains at least one piece of established solution information. The method comprises the following steps:
(A) collecting, by the node, software operational data of the node;
(B) when the node produces event information, transmitting, by the node, node information comprising the role data, the software operational data, and the event information;
(C) searching, by the management system, the database according to the node information from the node;
(D) if established solution information related to the node information is found in the database, returning, by the management system, that established solution information to the node, the established solution information indicating the action that the node needs to carry out; and
(E) producing, by the node, a complete solution corresponding to the event information according to the established solution information related to the node information, in combination with the role data, the complete solution comprising at least one instruction to be executed on the node.
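Steps (A) to (E) can be illustrated with a minimal Python sketch. It is not part of the disclosure: the class name NodeInfo, the dictionary SOLUTIONS, the functions manage and complete_solution, and the role, event, and action strings are all hypothetical, and the database is reduced to an in-memory mapping keyed by role and event.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeInfo:
    role: str             # preset role data of the node
    software_state: dict  # software operational data collected in step (A)
    event: str            # event information produced in step (B)


# Hypothetical database contents: (role, event) -> established solution information.
SOLUTIONS = {("render_worker", "driver_missing"): "install_driver"}


def manage(info):
    """Steps (C)-(D): search the database and return the matching action, or None."""
    return SOLUTIONS.get((info.role, info.event))


def complete_solution(action, info):
    """Step (E): the node combines the returned action with its own role data."""
    if action is None:
        return []
    return [f"{action} --role={info.role}"]


# Step (B): the node packages role, software state, and event into node information.
info = NodeInfo("render_worker", {"renderer": "stopped"}, "driver_missing")
commands = complete_solution(manage(info), info)
```

The management side only selects an action; turning that action into concrete instructions is left to the node, which is the division of work that steps (D) and (E) describe.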
The object of the present invention and the solution to its technical problem can be further achieved by the following technical measures.
Preferably, in the aforesaid management method for a computer cluster, step (A) further collects hardware configuration data of the node; and step (E) produces the complete solution corresponding to the event information according to the established solution information related to the node information, in combination with the role data and the hardware configuration data.
Preferably, in the aforesaid management method for a computer cluster, step (A) collects the software operational data and the hardware configuration data of the node according to the role data corresponding to the node and software/hardware environment setting data related to the node.
Preferably, in the aforesaid management method for a computer cluster, the database further comprises at least one criterion and a correspondence between the criterion and the established solution information. In step (C), a set of query conditions is derived from the node information, and the database is then searched according to the set of query conditions. In step (D), if a criterion that matches the set of query conditions is found in the database, the established solution information related to the node information has been found: the established solution information corresponding to the matching criterion is the established solution information related to the node information.
Preferably, in the aforesaid management method for a computer cluster, the criterion in the database comprises established role data, established event information, and an established key data group, and step (C) comprises the following sub-steps:
(c-1) obtaining the role data and the event information from the node information;
(c-2) extracting a relevant key data group from the software operational data according to at least one of the role data and the event information; and
(c-3) searching the database using the role data, the event information, and the key data group as the set of query conditions.
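Sub-steps (c-1) to (c-3) can be sketched as follows. This is an illustration under assumptions, not the disclosed implementation: the table RELEVANT_KEYS, the function build_query, and the field names are hypothetical, and the sketch selects the key data group using both the role and the event, which is one way of satisfying "at least one of" the two.

```python
# Hypothetical table: which software-state entries matter for a given (role, event).
RELEVANT_KEYS = {
    ("render_worker", "error"): ["renderer_version", "license_status"],
}


def build_query(node_info):
    """Derive the set of query conditions from the node information."""
    role = node_info["role"]    # (c-1) role data
    event = node_info["event"]  # (c-1) event information
    wanted = RELEVANT_KEYS.get((role, event), [])
    state = node_info["software_state"]
    # (c-2) keep only the entries of the software operational data that are relevant
    key_data = {k: state[k] for k in wanted if k in state}
    # (c-3) these three items together form the set of query conditions
    return {"role": role, "event": event, "key_data": key_data}


query = build_query({
    "role": "render_worker",
    "event": "error",
    "software_state": {
        "renderer_version": "2.1",
        "license_status": "expired",
        "uptime_hours": 12,
    },
})
```

Filtering the software operational data down to a key data group keeps the search narrow: unrelated state, such as uptime_hours here, never reaches the database query.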
Preferably, in the aforesaid management method for a computer cluster, if no criterion that matches the set of query conditions is found in the database in step (D), the management system provides a database update interface through which a user can create new established solution information related to the node information.
A further object of the present invention is to provide a management system for a computer cluster.
The object of the present invention and the solution to its technical problem are achieved by the following technical solution.
The management system for a computer cluster according to the present invention communicates with at least one node of the computer cluster. The node comprises a proxy server and corresponds to preset role data. The proxy server collects software operational data of the node and, when the node produces event information, transmits node information comprising the role data, the software operational data, and the event information to the management system. The management system comprises:
a database containing at least one piece of established solution information; and
a proxy server management module electrically connected to the database, wherein the proxy server management module searches the database according to the node information from the proxy server of the node and, if established solution information related to the node information is found in the database, returns that established solution information to the node, the established solution information indicating the action that the node needs to carry out.
The object of the present invention and the solution to its technical problem can be further achieved by the following technical measures.
Preferably, in the aforesaid management system for a computer cluster, the database further comprises at least one criterion and a correspondence between the criterion and the established solution information.
Preferably, in the aforesaid management system for a computer cluster, the proxy server management module derives a set of query conditions from the node information and then searches the database according to the set of query conditions. If the proxy server management module finds in the database a criterion that matches the set of query conditions, the established solution information related to the node information has been found: the established solution information corresponding to the matching criterion is the established solution information related to the node information.
Preferably, in the aforesaid management system for a computer cluster, the criterion in the database comprises established role data, established event information, and an established key data group. The proxy server management module first obtains the role data and the event information from the node information, then extracts a relevant key data group from the software operational data according to at least one of the role data and the event information, and finally searches the database using the role data, the event information, and the key data group as the set of query conditions.
Preferably, the aforesaid management system for a computer cluster further comprises a database update interface module. If the proxy server management module does not find established solution information related to the node information in the database, the database update interface module provides a database update interface through which a user can create new established solution information related to the node information.
The beneficial effect of the present invention is that the proxy server of each node, working together with the management system of the computer cluster, can provide a complete solution tailored to a problem that occurs during the node's operation, so that troubleshooting can be carried out automatically.
Brief description of the drawings
Fig. 1 is an architecture diagram of a preferred embodiment of the computer cluster of the present invention;
Fig. 2 is a flowchart of a preferred embodiment of the management method for a computer cluster of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to the accompanying drawings and a preferred embodiment.
Referring to Fig. 1, a preferred embodiment of the computer cluster 1 of the present invention comprises at least one node 2 and a management system 3. There may be one or more nodes 2; each node 2 comprises a proxy server (agent) 21 and corresponds to preset role data. The node 2 is a computer, and the proxy server 21 is implemented in software and installed on the node 2. The management system 3 can communicate with the node 2 over a network (intranet/internet) 4. The management system 3 comprises a proxy server management module 31, a database 32 electrically connected to the proxy server management module 31, a software repository 33 electrically connected to the proxy server management module 31, and a database update interface module 34 electrically connected to the proxy server management module 31 and the database 32. The database 32 contains at least one piece of established solution information.
For example, the computer cluster 1 is a render farm comprising multiple nodes 2, whose corresponding role data comprises one render supervisor and multiple render workers. The node 2 that corresponds to the render supervisor is mainly used to dispatch jobs to the nodes 2 that correspond to the render workers. The management system 3 is mainly used to manage the software environment of the nodes 2, for example to build, restore, and repair problems in the software environment of each node 2.
The proxy server 21 of each node 2 collects software operational (software behavior) data and hardware configuration data of the node 2. When the node 2 produces event information, the proxy server 21 also transmits node information to the management system 3. The node information comprises the role data, the software operational data, and the event information.
The proxy server management module 31 of the management system 3 searches the database 32 according to the node information from the proxy server 21 of the node 2. If established solution information related to the node information is found in the database 32, it is returned to the node 2. The established solution information indicates the action that the node 2 needs to carry out; the proxy server 21 of the node 2 produces a complete solution corresponding to the event information according to the established solution information, in combination with the role data. When the established solution information relates to the hardware environment settings of the node 2, the proxy server 21 must combine the hardware configuration data as well as the role data in order to produce the complete solution, which comprises at least one instruction (command) executable on the node 2. If no such established solution information is found, the database update interface module 34 provides a database update interface through which a user (for example, an administrator) manually creates new established solution information related to the node information and stores it in the database 32.
The operation between the at least one node 2 and the management system 3 is further explained below in conjunction with a preferred embodiment of the management method for a computer cluster of the present invention. Because every node 2 interacts with the management system 3 in a similar way, only the operation between a single node 2 and the management system 3 is described.
It is worth noting that, in the initial build stage of the software environment of the node 2, the proxy server 21 must first be installed on the node 2. During installation of the proxy server 21, the user can manually enter, according to the role data corresponding to the node 2, the software/hardware environment setting data related to the node 2 (for example, the software components to be installed on the node 2, firewall settings, and Internet Protocol (IP) settings). After the user finishes entering the software/hardware environment setting data and performs an input-complete operation (for example, clicking a confirm button in a settings interface provided by the proxy server 21), the node 2 produces event information corresponding to a software environment installation event. The proxy server 21 then transmits node information, which comprises the event information, to the management system 3. The management system 3 returns established solution information to the node 2 according to the event information in the node information; this established solution information indicates that the node 2 needs an initial installation. The proxy server 21 of the node 2 then produces a complete solution according to the established solution information, in combination with the software/hardware environment setting data; the complete solution comprises an ordered sequence of software installation instructions together with the software installation paths and software/hardware configuration values associated with those instructions. Finally, the node 2 executes the software installation instructions in order, thereby building the software environment.
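The initial installation described above amounts to expanding the software/hardware environment setting data into an ordered list of instructions. The following Python sketch shows one possible shape of that expansion; the function initial_install_plan, the setting keys, and the instruction strings (install, set-ip, set-firewall) are hypothetical placeholders, not commands named in the disclosure.

```python
def initial_install_plan(settings):
    """Expand environment settings into an ordered list of installation instructions."""
    plan = []
    # One installation instruction per software component, in the order entered.
    for component in settings["components"]:
        plan.append(f"install {component} --prefix={settings['install_path']}")
    # Configuration values entered by the user are applied after the components.
    plan.append(f"set-ip {settings['ip']}")
    plan.append(f"set-firewall {settings['firewall']}")
    return plan


plan = initial_install_plan({
    "components": ["renderer", "license-client"],
    "install_path": "/opt/farm",
    "ip": "10.0.0.5",
    "firewall": "allow-7000",
})
```

The node would then execute the entries of plan from first to last, which corresponds to carrying out the software installation instructions in order.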
Referring to Figs. 1 and 2, the management method for a computer cluster comprises the following steps:
In step 501, the proxy server 21 of the node 2 collects software operational data and hardware configuration data of the node 2 according to the role data corresponding to the node 2 and the related software/hardware environment setting data. The software operational data is operating state data of the software installed on the node 2.
In step 502, the proxy server 21 determines whether event information has been produced. If so, processing continues with step 503; otherwise, it returns to step 501.
In this preferred embodiment, after the user modifies the software/hardware environment setting data of the proxy server 21 and performs the input-complete operation, the node 2 produces event information corresponding to a setting data modification event. Alternatively, when an error occurs during operation of the node 2, the node 2 produces event information corresponding to an error event. Alternatively, when the proxy server 21 receives a request to monitor the software state of the node 2, the node 2 produces event information corresponding to the software state monitoring request; the request can be initiated by a client PC (not shown) that can communicate with the management system 3, and is forwarded by the management system 3 to the proxy server 21 of the node 2.
In step 503, the proxy server 21 transmits node information to the management system 3. The node information comprises the role data, the software operational data, and the event information.
In step 504, the proxy server management module 31 of the management system 3 searches the database 32 for established solution information related to the node information, according to the node information from the proxy server 21. The established solution information indicates the action that the node 2 needs to carry out.
In this preferred embodiment, the database 32 comprises at least one criterion, at least one piece of established solution information, and a correspondence between the criterion and the established solution information. The criterion comprises established role data, established event information, and an established key data group. The proxy server management module 31 derives a set of query conditions from the node information and then searches the database 32 according to the set of query conditions. More specifically, the proxy server management module 31 first obtains the role data and the event information from the node information, then extracts a relevant key data group from the software operational data according to at least one of the role data and the event information, and finally searches the database 32 using the role data, the event information, and the key data group as the set of query conditions.
In step 505, the proxy server management module 31 determines whether established solution information related to the node information has been found. If so, processing continues with step 508; otherwise, it continues with step 506.
In this preferred embodiment, if the proxy server management module 31 finds in the database 32 a criterion that matches the set of query conditions, the established solution information related to the node information has been found: the established solution information corresponding to the matching criterion is the established solution information related to the node information. Otherwise, no established solution information related to the node information has been found.
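The criterion lookup of steps 504 and 505 can be sketched with an in-memory SQLite table. The disclosure does not name a database engine or a schema, so everything here is an assumption made for illustration: the table criteria, its columns, the function find_solution, and the choice to store the key data group as a JSON string with sorted keys so that an exact match can be tested with a plain equality comparison.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Each row pairs one criterion (role, event, key data group) with its solution.
conn.execute(
    "CREATE TABLE criteria (role TEXT, event TEXT, key_data TEXT, solution TEXT)"
)
conn.execute(
    "INSERT INTO criteria VALUES (?, ?, ?, ?)",
    (
        "render_worker",
        "error",
        json.dumps({"license_status": "expired"}, sort_keys=True),
        "renew_license",
    ),
)


def find_solution(conn, role, event, key_data):
    """Return the solution whose criterion matches the query conditions, or None."""
    row = conn.execute(
        "SELECT solution FROM criteria "
        "WHERE role = ? AND event = ? AND key_data = ?",
        (role, event, json.dumps(key_data, sort_keys=True)),
    ).fetchone()
    return row[0] if row else None


hit = find_solution(conn, "render_worker", "error", {"license_status": "expired"})
miss = find_solution(conn, "render_worker", "error", {"license_status": "valid"})
```

A None result corresponds to the "otherwise" branch of step 505, in which processing continues with step 506.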
In step 506, the proxy server management module 31 presents the user with a system error message related to the node information.
In step 507, the database update interface module 34 provides the database update interface, through which the user manually creates new established solution information related to the node information and stores it in the database 32. Processing then returns to step 504.
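The loop formed by steps 504 to 507 (search, report a miss, let a user add an entry, search again) can be sketched as below. The function resolve and its parameters are hypothetical: database is a plain dictionary, ask_admin stands in for the database update interface, and log collects the system error messages of step 506.

```python
def resolve(database, query, ask_admin, log):
    """Search for a solution; on a miss, have a user add one and search again."""
    while query not in database:                # step 505: nothing found
        log.append(f"no solution for {query}")  # step 506: system error message
        database[query] = ask_admin(query)      # step 507: store the new entry
    return database[query]                      # found on the repeated step 504


database = {}
log = []
query = ("render_worker", "error")
first = resolve(database, query, lambda q: "restart_renderer", log)
second = resolve(database, query, lambda q: "unused", log)
```

Because the new entry is stored before the search is repeated, a later occurrence of the same query is answered directly from the database without involving the user again.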
In step 508, the proxy server management module 31 returns the established solution information it has found to the node 2. If the established solution information requires software stored in the software repository 33, the established solution information also includes the storage path of that software in the software repository 33.
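One way to model the returned information of step 508 is a record with an optional storage path, as in the sketch below. The names SolutionInfo, REPOSITORY, and package_solution, as well as the example path, are hypothetical and serve only to show that the path is present solely when repository software is required.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SolutionInfo:
    action: str                           # action the node needs to carry out
    software_path: Optional[str] = None   # set only when repository software is needed


# Hypothetical repository index: action -> storage path in the software repository.
REPOSITORY = {"install_driver": "/repo/drivers/hw_a.pkg"}


def package_solution(action):
    """Attach the storage path when the action relies on repository software."""
    return SolutionInfo(action, REPOSITORY.get(action))


with_path = package_solution("install_driver")
without_path = package_solution("restart_service")
```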
In step 509, the proxy server 21 of the node 2 produces a complete solution according to the established solution information, in combination with the role data. When the established solution information relates to the hardware environment settings of the node 2, the proxy server 21 combines the hardware configuration data as well as the role data to produce the complete solution. The complete solution comprises at least one instruction executable on the node 2: it may consist of a single instruction, or of multiple instructions arranged in a specific order.
For example, the established solution information indicates that the node 2 needs to install a driver for hardware A, and the complete solution comprises the series of instructions that must be executed to install the driver for hardware A on the node 2, together with the software/hardware configuration values associated with those instructions. In other words, because the role data corresponding to each node 2 and the hardware configuration data collected from it differ, the proxy server 21 must produce a customized complete solution that fits the role data and the hardware configuration data.
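The customization performed in step 509 can be sketched as template expansion: the same action yields different instruction sequences depending on the role and the collected hardware configuration. The sketch uses string.Template from the Python standard library; the table TEMPLATES, the function build_complete_solution, the instruction strings, and the gpu_model field are hypothetical.

```python
from string import Template

# Hypothetical instruction templates for one action, keyed by node role.
TEMPLATES = {
    "install_driver": {
        "render_supervisor": [
            "pause-dispatch",
            "fetch $path",
            "install-driver --gpu=$gpu_model",
            "resume-dispatch",
        ],
        "render_worker": [
            "fetch $path",
            "install-driver --gpu=$gpu_model",
        ],
    }
}


def build_complete_solution(action, role, hardware, path):
    """Fill the role-specific templates with hardware values and the software path."""
    values = {"path": path, **hardware}
    return [Template(t).substitute(values) for t in TEMPLATES[action][role]]


worker_plan = build_complete_solution(
    "install_driver", "render_worker", {"gpu_model": "X1"}, "/repo/drivers/hw_a.pkg"
)
supervisor_plan = build_complete_solution(
    "install_driver", "render_supervisor", {"gpu_model": "X1"}, "/repo/drivers/hw_a.pkg"
)
```

Keeping the templates on the node side means the management system only has to return the action and, where needed, a storage path; the node supplies the role-specific and hardware-specific details.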
In step 510, the node 2 executes the instructions of the total solution.
In step 511, the proxy server 21 of the node 2 checks whether the node 2 has completed the processing corresponding to the event information; if so, the process returns to step 501; otherwise, it proceeds to step 512.
For example, when the event information corresponds to an error event, the proxy server 21 checks whether the error event has been eliminated or repaired; if so, the process returns to step 501; otherwise, it proceeds to step 512.
In step 512, the proxy server 21 of the node 2 determines whether the time spent processing the event information has exceeded a preset time limit; if so, the process goes to step 506; otherwise, it returns to step 501.
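Steps 510 to 512 amount to a small control flow, sketched below under stated assumptions. The function `handle_event`, its parameters, and the injectable clock are hypothetical; the sketch only reports which step number comes next and does not model what steps 501 or 506 do.

```python
import time
from typing import Callable, List


def handle_event(
    commands: List[str],
    run: Callable[[str], None],
    is_resolved: Callable[[], bool],
    started_at: float,
    time_limit_s: float,
    now: Callable[[], float] = time.monotonic,
) -> str:
    """Run the total solution and return the next step number as a string."""
    for command in commands:               # step 510: execute in order
        run(command)
    if is_resolved():                      # step 511: event handled?
        return "501"
    if now() - started_at > time_limit_s:  # step 512: time limit exceeded?
        return "506"
    return "501"


executed: List[str] = []
resolved = handle_event(["cmd-1", "cmd-2"], executed.append,
                        lambda: True, 0.0, 60.0, now=lambda: 10.0)
timed_out = handle_event(["cmd-1"], executed.append,
                         lambda: False, 0.0, 60.0, now=lambda: 120.0)
pending = handle_event([], executed.append,
                       lambda: False, 0.0, 60.0, now=lambda: 30.0)
```

Passing the clock as a parameter is a design choice of this sketch; it keeps the time-limit branch testable without waiting for real time to elapse.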
In summary, through the proxy server 21 of each node 2 working together with the management system 3, the computer cluster 1 of the present invention can provide a total solution tailored to a problem that occurs during the operation of the node 2 and automatically carry out the remediation for it.

Claims (18)

1. A computer cluster, characterized by comprising:
At least one node, the node comprising a proxy server and corresponding to a preset role data, the proxy server being used to collect a software operational data of the node and, when the node produces an event information, to transmit a nodal information comprising the role data, the software operational data, and the event information; and
A management system in communication with the node, the management system comprising a proxy server administration module and a database electrically connected to the proxy server administration module, the database comprising at least one established solution information, the proxy server administration module being used to search the database according to the nodal information from the proxy server of the node and, if an established solution information related to the nodal information is found in the database, to return that established solution information to the node, the established solution information indicating the corresponding action that the node needs to carry out,
The proxy server of the node also being used to produce, according to the established solution information related to the nodal information in combination with the role data, a total solution corresponding to the event information, the total solution comprising at least one instruction to be executed on the node.
2. The computer cluster according to claim 1, characterized in that: the proxy server of the node is also used to collect a hardware configuration data of the node, and the proxy server produces the total solution corresponding to the event information according to the established solution information related to the nodal information in combination with the role data and the hardware configuration data.
3. The computer cluster according to claim 2, characterized in that: the proxy server of the node collects the software operational data and the hardware configuration data of the node according to the role data corresponding to the node and a software/hardware environment setting data related to the node.
4. The computer cluster according to claim 1, characterized in that: the database further comprises at least one standard and a correspondence between the standard and the established solution information.
5. The computer cluster according to claim 4, characterized in that: the proxy server administration module of the management system obtains a set of query conditions from the nodal information and then searches the database according to the set of query conditions; if the proxy server administration module finds in the database a standard that matches the set of query conditions, this indicates that the established solution information related to the nodal information has been found, the established solution information corresponding to the matching standard being the established solution information related to the nodal information.
6. The computer cluster according to claim 5, characterized in that: the standard in the database comprises an established role data, an established event information, and an established critical data group; the proxy server administration module first obtains the role data and the event information from the nodal information, then extracts a related critical data group from the software operational data according to at least one of the role data and the event information, and then searches the database using the role data, the event information, and the critical data group as the set of query conditions.
7. The computer cluster according to claim 1, characterized in that: the management system further comprises a database update interface module; if the proxy server administration module does not find in the database an established solution information related to the nodal information, the database update interface module provides a database update interface through which a user creates a new established solution information related to the nodal information.
8. A management method for a computer cluster, characterized in that: the computer cluster comprises at least one node and a management system that communicates with the node, the node corresponds to a preset role data, the management system comprises a database, the database comprises at least one established solution information, and the method comprises the following steps:
(A) using the node to collect a software operational data of the node;
(B) when the node produces an event information, using the node to transmit a nodal information comprising the role data, the software operational data, and the event information;
(C) using the management system to search the database according to the nodal information from the node;
(D) if an established solution information related to the nodal information is found in the database, using the management system to return that established solution information to the node, the established solution information indicating the corresponding action that the node needs to carry out; and
(E) using the node to produce, according to the established solution information related to the nodal information in combination with the role data, a total solution corresponding to the event information, the total solution comprising at least one instruction to be executed on the node.
9. The management method for a computer cluster according to claim 8, characterized in that: in step (A), a hardware configuration data of the node is also collected; in step (E), the total solution corresponding to the event information is produced according to the established solution information related to the nodal information in combination with the role data and the hardware configuration data.
10. The management method for a computer cluster according to claim 9, characterized in that: in step (A), the software operational data and the hardware configuration data of the node are collected according to the role data corresponding to the node and a software/hardware environment setting data related to the node.
11. The management method for a computer cluster according to claim 8, characterized in that: the database further comprises at least one standard and a correspondence between the standard and the established solution information; in step (C), a set of query conditions is obtained from the nodal information, and the database is then searched according to the set of query conditions; in step (D), if a standard that matches the set of query conditions is found in the database, this indicates that the established solution information related to the nodal information has been found, the established solution information corresponding to the matching standard being the established solution information related to the nodal information.
12. The management method for a computer cluster according to claim 11, characterized in that: the standard in the database comprises an established role data, an established event information, and an established critical data group, wherein step (C) comprises the following sub-steps:
(c-1) obtaining the role data and the event information from the nodal information;
(c-2) extracting a related critical data group from the software operational data according to at least one of the role data and the event information; and
(c-3) searching the database using the role data, the event information, and the critical data group as the set of query conditions.
13. The management method for a computer cluster according to claim 8, characterized in that: in step (D), if no established solution information related to the nodal information is found in the database, the management system is used to provide a database update interface through which a user creates a new established solution information related to the nodal information.
14. A management system for a computer cluster, characterized in that: the management system communicates with at least one node of the computer cluster, the node comprises a proxy server and corresponds to a preset role data, the proxy server collects a software operational data of the node and, when the node produces an event information, transmits to the management system a nodal information comprising the role data, the software operational data, and the event information, the management system comprising:
A database comprising at least one established solution information; and
A proxy server administration module electrically connected to the database, the proxy server administration module being used to search the database according to the nodal information from the proxy server of the node and, if an established solution information related to the nodal information is found in the database, to return that established solution information to the node, the established solution information indicating the corresponding action that the node needs to carry out.
15. The management system for a computer cluster according to claim 14, characterized in that: the database further comprises at least one standard and a correspondence between the standard and the established solution information.
16. The management system for a computer cluster according to claim 15, characterized in that: the proxy server administration module obtains a set of query conditions from the nodal information and then searches the database according to the set of query conditions; if the proxy server administration module finds in the database a standard that matches the set of query conditions, this indicates that the established solution information related to the nodal information has been found, the established solution information corresponding to the matching standard being the established solution information related to the nodal information.
17. The management system for a computer cluster according to claim 16, characterized in that: the standard in the database comprises an established role data, an established event information, and an established critical data group; the proxy server administration module first obtains the role data and the event information from the nodal information, then extracts a related critical data group from the software operational data according to at least one of the role data and the event information, and then searches the database using the role data, the event information, and the critical data group as the set of query conditions.
18. The management system for a computer cluster according to claim 14, characterized in that: the management system further comprises a database update interface module; if the proxy server administration module does not find in the database an established solution information related to the nodal information, the database update interface module provides a database update interface through which a user creates a new established solution information related to the nodal information.
CN201210453826.4A 2012-11-13 2012-11-13 Computer cluster, management method and management system for computer cluster Pending CN103810136A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210453826.4A CN103810136A (en) 2012-11-13 2012-11-13 Computer cluster, management method and management system for computer cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210453826.4A CN103810136A (en) 2012-11-13 2012-11-13 Computer cluster, management method and management system for computer cluster

Publications (1)

Publication Number Publication Date
CN103810136A true CN103810136A (en) 2014-05-21

Family

ID=50706926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210453826.4A Pending CN103810136A (en) 2012-11-13 2012-11-13 Computer cluster, management method and management system for computer cluster

Country Status (1)

Country Link
CN (1) CN103810136A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005055072A1 (en) * 2003-11-26 2005-06-16 Hewlett-Packard Development Company, L.P. System and method for management and installation of operating system images for computers
CN1851686A (en) * 2005-04-22 2006-10-25 天津曙光计算机产业有限公司 Method for self-constructing group operating system core and intelligent constructor
CN1983199A (en) * 2005-12-15 2007-06-20 联想(新加坡)私人有限公司 System and method for analyzing out-of-work of computer intellectually
CN101315618A (en) * 2008-05-30 2008-12-03 中国科学院计算技术研究所 Cluster system for utility computation and its environment management method in operation

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106575247A (en) * 2014-08-13 2017-04-19 微软技术许可有限责任公司 Fault tolerant federation of computing clusters
CN106575247B (en) * 2014-08-13 2020-05-15 微软技术许可有限责任公司 Fault-tolerant federation of computing clusters
US11290524B2 (en) 2014-08-13 2022-03-29 Microsoft Technology Licensing, Llc Scalable fault resilient communications within distributed clusters
CN109388213A (en) * 2017-08-09 2019-02-26 广达电脑股份有限公司 Server system, computer implemented method and non-transitory computer-readable medium
CN109388213B (en) * 2017-08-09 2021-02-02 广达电脑股份有限公司 Server system, computer-implemented method, and non-transitory computer-readable medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140521