CN104333459A - Method and device for fault management of cloud data center

Info

Publication number: CN104333459A
Application number: CN201410363945.XA
Authority: CN
Inventors: 陈光新; 朱波
Original assignee: Inspur Beijing Electronic Information Industry Co Ltd
Current assignee: Inspur Beijing Electronic Information Industry Co Ltd
Priority date: 2014-07-28
Filing date: 2014-07-28
Publication date: 2015-02-04

Abstract

The invention provides a method and a device for fault management of a cloud data center. In the method, an organization and administration server receives a fault message submitted by a user terminal and judges whether an organization and administration fault type corresponding to the fault message can be determined. If it can, the organization and administration server repairs the fault according to that organization and administration fault type. If it cannot, the organization and administration server sends the fault message to a system management server, which determines a system management fault type according to the fault message and repairs the fault according to that system management fault type. Faults can thus be handled promptly, so that system resources are used effectively.

Description

Cloud data center fault management method and device

Technical field

The present invention relates to the field of cloud computing technology, and in particular to a cloud data center fault management method and device.

Background technology

As cloud computing technology continues to mature, cloud computing is steadily becoming a development hot spot for the industry. A cloud data center operating system is the only solution by which a data center completes the transition from hardware to resource pools: it fuses a large number of heterogeneous devices into logical resource pools with unified standards and schedules them dynamically for cloud applications, thereby delivering services to terminals. At the same time, the cloud data center operating system carries the application interfaces above it, schedules functions in the middle, and manages the hardware below it. It is the only link between hardware and applications, occupies a core position, and has a decisive influence on the technical indicators of both the application systems and the hardware.

The Cloud Sea operating system is a complete cloud data center solution that covers all the needs of a cloud data center in the form of a suite. The system has a three-layer architecture consisting of an interaction layer, a platform management layer, and a resource virtualization layer. The interaction layer is implemented by iPortal and is divided into an administrator interface and a user interface, so that different roles use one unified platform to obtain resource services. The platform management layer consists of functional modules such as resource pool scheduling (iCloudManager), resource management (iResourceManager), statistics and billing (iCharge), and self-service (iService). The resource virtualization layer is provided by conventional server virtualization software, which virtualizes the physical resources.

In the Cloud Sea operating system, virtual resources such as virtual machines and networks may develop faults during use, for example the network becomes unavailable or a virtual machine cannot be powered on. Because access to these resources is restricted, however, the user cannot resolve the fault, and system resources therefore cannot be used effectively.

Summary of the invention

To solve the above technical problem, the invention provides a cloud data center fault management method and device that can handle faults promptly, so that system resources are used effectively.

To achieve the object of the invention, the invention provides a cloud data center fault management method, comprising: an organization and administration server receives a fault message submitted by a user terminal and judges whether the organization and administration fault type corresponding to the fault message can be determined; if the organization and administration fault type corresponding to the fault message can be determined, the organization and administration server repairs the fault according to the organization and administration fault type; if the organization and administration fault type corresponding to the fault message cannot be determined, the organization and administration server sends the fault message to a system management server, and the system management server determines a system management fault type according to the fault message and repairs the fault according to the system management fault type.

Further, the method also comprises: the organization and administration server presets an organization and administration fault type collection and modifies that collection as required; the system management server presets a system management fault type collection and modifies that collection as required.

Further, the step in which the organization and administration server receives the fault message submitted by the user terminal and judges whether the organization and administration fault type corresponding to the fault message can be determined comprises: the organization and administration server receives the fault message submitted by the user terminal and compares the fault message with the preset organization and administration fault type collection to judge whether the organization and administration fault type corresponding to the fault message can be determined; if the organization and administration fault type collection contains an organization and administration fault type corresponding to the fault message, it is judged that the type can be determined; if the collection contains no organization and administration fault type corresponding to the fault message, it is judged that the type cannot be determined.
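As an illustration only, the comparison against the preset collection could be sketched in Python as follows. The source does not say how a fault message is matched against a fault type, so the keyword lookup, the `ORG_FAULT_TYPES` table, and the function name `determine_fault_type` are assumptions made for this sketch.

```python
from typing import Dict, Optional, Tuple

# Hypothetical preset collection: fault type name -> keywords that identify it.
# The two types mirror the examples in the text; the keywords are invented.
ORG_FAULT_TYPES: Dict[str, Tuple[str, ...]] = {
    "network_unavailable": ("network", "unreachable"),
    "vm_cannot_power_on": ("power on", "boot"),
}


def determine_fault_type(
    fault_message: str, type_collection: Dict[str, Tuple[str, ...]]
) -> Optional[str]:
    """Return the first fault type whose keyword occurs in the message, else None."""
    text = fault_message.lower()
    for fault_type, keywords in type_collection.items():
        if any(keyword in text for keyword in keywords):
            return fault_type
    # No entry in the collection corresponds to the message:
    # the type "cannot be determined".
    return None
```

A return value of `None` corresponds to the case in which the type cannot be determined and the fault message would be passed on to the system management server.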

Further, the method also comprises: recording the state of the organization and administration fault repair, the states including fault message submitted, fault message transferred, fault being processed, fault solved, or fault closed; and/or recording the state of the system management fault repair, the states including fault message submitted, fault being processed, fault solved, or fault closed.
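The two state lists can be modelled, for example, with an enumeration. The following Python sketch is hypothetical: the names `RepairState`, `ORG_STATES`, `SYSTEM_STATES`, and `RepairRecord` do not appear in the source, and no ordering of the states is enforced because the source does not define one.

```python
from enum import Enum


class RepairState(Enum):
    SUBMITTED = "fault message submitted"
    TRANSFERRED = "fault message transferred"
    PROCESSING = "fault being processed"
    SOLVED = "fault solved"
    CLOSED = "fault closed"


# The source lists no "transferred" state for the system management server.
ORG_STATES = frozenset(RepairState)
SYSTEM_STATES = ORG_STATES - {RepairState.TRANSFERRED}


class RepairRecord:
    """Keeps the state history of one fault so that a user can check progress."""

    def __init__(self, allowed_states):
        self.allowed_states = allowed_states
        self.history = []

    def record(self, state):
        # Reject states that the owning server is not expected to record.
        if state not in self.allowed_states:
            raise ValueError(f"{state.name} is not valid for this server")
        self.history.append(state)

    @property
    def current(self):
        # Most recently recorded state, or None if nothing was recorded yet.
        return self.history[-1] if self.history else None
```

Keeping the full history rather than only the latest state is a design choice of the sketch; it lets a user see that a fault message was transferred before it was processed.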

The invention also provides a cloud data center fault management system, comprising: a user terminal, configured to send a fault message to an organization and administration server; the organization and administration server, configured to receive the fault message submitted by the user terminal, judge whether the organization and administration fault type corresponding to the fault message can be determined, repair the fault according to the organization and administration fault type if that type can be determined, and send the fault message to a system management server if that type cannot be determined; and the system management server, configured to receive the fault message from the organization and administration server, determine a system management fault type according to the fault message, and repair the fault according to the system management fault type.

Further, the organization and administration server is also configured to preset an organization and administration fault type collection and to modify that collection as required; the system management server is also configured to preset a system management fault type collection and to modify that collection as required.

Further, the organization and administration server is specifically configured to: receive the fault message submitted by the user terminal and compare the fault message with the preset organization and administration fault type collection to judge whether the organization and administration fault type corresponding to the fault message can be determined; if the organization and administration fault type collection contains an organization and administration fault type corresponding to the fault message, judge that the type can be determined; if the collection contains no organization and administration fault type corresponding to the fault message, judge that the type cannot be determined.

Further, the organization and administration server is also configured to record the state of the organization and administration fault repair, the states including fault message submitted, fault message transferred, fault being processed, fault solved, or fault closed; the system management server is also configured to record the state of the system management fault repair, the states including fault message submitted, fault being processed, fault solved, or fault closed.

Compared with the prior art, the present invention comprises: an organization and administration server receives a fault message submitted by a user terminal and judges whether the organization and administration fault type corresponding to the fault message can be determined; if that type can be determined, the organization and administration server repairs the fault according to the organization and administration fault type; if that type cannot be determined, the organization and administration server sends the fault message to a system management server, and the system management server determines a system management fault type according to the fault message and repairs the fault according to the system management fault type. With the present invention, once a user finds a fault, the fault message is sent promptly to the organization and administration server through the user terminal. The organization and administration server determines the fault type from the fault message, or forwards the fault message to the system management server, which determines the fault type. The organization and administration server or the system management server then repairs the fault according to the fault type, so that faults are handled promptly and system resources are used effectively.

Brief description of the drawings

Fig. 1 is a schematic flow chart of the cloud data center fault management method of the present invention.

Fig. 2 is a schematic structural diagram of the cloud data center fault management device of the present invention.

Embodiments

The present invention is described below with reference to the embodiments shown in the drawings.

In the Cloud Sea operating system, the roles in a cloud data center are the system administrator, the organization administrator, and the user. The system administrator manages the whole architecture and divides the unified data center resources into multiple cloud data centers, each managed by an organization administrator. The organization administrator allocates its cloud data center to different users on request. The system administrator, the organization administrator, and the user communicate through the system management server, the organization and administration server, and the user terminal, respectively.

Fig. 1 is a schematic flow chart of the cloud data center fault management method of the present invention. As shown in Fig. 1, the method comprises:

Step 11: the organization and administration server presets an organization and administration fault type collection, and the system management server presets a system management fault type collection.

In this step, because the organization and administration server and the system management server use different virtual resources, each server sets its own fault type collection. A fault type collection may include fault types such as network unavailable or virtual machine cannot be powered on.

The organization and administration server and the system management server may modify their fault type collections as required, for example by adding or deleting a fault type.

Step 12: the organization and administration server receives a fault message submitted by a user terminal and judges whether the organization and administration fault type corresponding to the fault message can be determined. If it can, proceed to step 13; if it cannot, proceed to step 14.

In this step, when a user finds a fault, the user can submit a fault message to the organization and administration server through the user terminal. The organization and administration server compares the fault message with the preset organization and administration fault type collection to judge whether the organization and administration fault type corresponding to the fault message can be determined.

Step 13: if the organization and administration fault type corresponding to the fault message can be determined, the organization and administration server repairs the fault according to that organization and administration fault type.

In this step, if the organization and administration fault type collection contains an organization and administration fault type corresponding to the fault message, it is judged that the type can be determined.

The organization and administration server repairs the fault according to the organization and administration fault type and records the state of the repair, for example fault message submitted, fault message transferred, fault being processed, fault solved, or fault closed, so that the user can query the organization and administration server for the fault handling state.

Step 14: if the organization and administration fault type corresponding to the fault message cannot be determined, the organization and administration server sends the fault message to the system management server.

In this step, if the organization and administration fault type collection contains no organization and administration fault type corresponding to the fault message, it is judged that the type cannot be determined.

Step 15: the system management server compares the fault message with the preset system management fault type collection, determines the system management fault type corresponding to the fault message, and repairs the fault according to that system management fault type.

In this step, the system management server repairs the fault according to the system management fault type and records the state of the repair, for example fault message submitted, fault being processed, fault solved, or fault closed, so that the user can query the system management server for the fault handling state.

With the present invention, once a user finds a fault, the fault message is sent promptly to the organization and administration server through the user terminal. The organization and administration server determines the fault type from the fault message, or forwards the fault message to the system management server, which determines the fault type. The organization and administration server or the system management server then repairs the fault according to the fault type, so that faults are handled promptly and system resources are used effectively.
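The flow of steps 12 to 15 can be sketched end to end as follows. This is a minimal Python sketch under stated assumptions, not the implementation described by the source: the class `ManagementServer`, its methods, the substring-based `classify` rule, and the example fault types are all hypothetical. The source also does not say what happens when the system management server cannot determine a type, so the sketch simply raises an error in that case.

```python
from typing import Callable, Dict, List, Optional


class ManagementServer:
    """Hypothetical model of either server; `upstream` is set only on the
    organization and administration server."""

    def __init__(self, name: str, repairs: Dict[str, Callable[[str], None]],
                 upstream: Optional["ManagementServer"] = None) -> None:
        self.name = name
        self.repairs = repairs        # preset fault type -> repair routine
        self.upstream = upstream      # the system management server, if any
        self.states: List[str] = []   # recorded repair states

    def classify(self, message: str) -> Optional[str]:
        # Assumption: a fault type matches when its name occurs in the message.
        text = message.lower()
        return next((t for t in self.repairs if t in text), None)

    def handle(self, message: str) -> str:
        """Repair locally if the type is known, otherwise forward the message.
        Returns the name of the server that performed the repair."""
        self.states.append("submitted")
        fault_type = self.classify(message)
        if fault_type is None:
            if self.upstream is None:
                # Not covered by the source; raising here is an assumption.
                raise LookupError(f"{self.name}: fault type cannot be determined")
            self.states.append("transferred")       # step 14
            return self.upstream.handle(message)    # step 15
        self.states.append("processing")            # step 13 (or 15 upstream)
        self.repairs[fault_type](message)
        self.states.append("solved")
        return self.name


# Example wiring; the repair routines only log the message they were given.
repaired: List[str] = []
system_server = ManagementServer(
    "system", {"network": repaired.append, "storage": repaired.append})
org_server = ManagementServer(
    "organization", {"power on": repaired.append}, upstream=system_server)
```

Calling `org_server.handle("VM will not power on")` is repaired by the organization and administration server itself, whereas `org_server.handle("Storage volume offline")` is transferred and repaired by the system management server.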

Fig. 2 is a schematic structural diagram of the cloud data center fault management system of the present invention. As shown in Fig. 2, the system comprises:

A user terminal, configured to send a fault message to the organization and administration server;

An organization and administration server, configured to preset an organization and administration fault type collection; receive the fault message submitted by the user terminal and judge whether the organization and administration fault type corresponding to the fault message can be determined; if that type can be determined, repair the fault according to the organization and administration fault type; and if that type cannot be determined, send the fault message to the system management server;

A system management server, configured to preset a system management fault type collection; receive the fault message from the organization and administration server, compare the fault message with the preset system management fault type collection, determine the system management fault type corresponding to the fault message, and repair the fault according to that system management fault type.

The organization and administration server and the system management server may modify their fault type collections as required, for example by adding or deleting a fault type.
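A modifiable, per-server fault type collection could look like the following Python sketch. The class name `FaultTypeCollection`, its methods, and the example fault type names other than the two mentioned in the text are hypothetical.

```python
class FaultTypeCollection:
    """A preset collection of fault types that its owning server may modify."""

    def __init__(self, preset):
        # Copy the preset so that each server keeps an independent collection.
        self._types = set(preset)

    def add(self, fault_type):
        self._types.add(fault_type)

    def delete(self, fault_type):
        # discard() does nothing if the type is absent, so deleting twice is safe.
        self._types.discard(fault_type)

    def __contains__(self, fault_type):
        return fault_type in self._types

    def __len__(self):
        return len(self._types)


# Each server presets its own collection and then adjusts it as required.
org_types = FaultTypeCollection({"network_unavailable", "vm_cannot_power_on"})
org_types.add("snapshot_failed")          # hypothetical additional type
org_types.delete("network_unavailable")
```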

The organization and administration server compares the fault message with the preset organization and administration fault type collection to judge whether the organization and administration fault type corresponding to the fault message can be determined. If the organization and administration fault type collection contains an organization and administration fault type corresponding to the fault message, it is judged that the type can be determined; the organization and administration server then repairs the fault according to that organization and administration fault type and records the state of the repair, for example fault message submitted, fault message transferred, fault being processed, fault solved, or fault closed, so that the user can query the organization and administration server for the fault handling state. If the collection contains no organization and administration fault type corresponding to the fault message, it is judged that the type cannot be determined.

The system management server repairs the fault according to the system management fault type and records the state of the repair, for example fault message submitted, fault being processed, fault solved, or fault closed, so that the user can query the system management server for the fault handling state.

It should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. The specification is written this way only for the sake of clarity. Those skilled in the art should treat the specification as a whole, and the technical solutions in the embodiments may be suitably combined to form other embodiments that those skilled in the art can understand.

The detailed descriptions listed above are merely specific illustrations of feasible embodiments of the present invention and are not intended to limit its scope of protection. All equivalent embodiments or modifications that do not depart from the technical spirit of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A cloud data center fault management method, characterized by comprising:

An organization and administration server receives a fault message submitted by a user terminal and judges whether the organization and administration fault type corresponding to the fault message can be determined;

If the organization and administration fault type corresponding to the fault message can be determined, the organization and administration server repairs the fault according to the organization and administration fault type;

If the organization and administration fault type corresponding to the fault message cannot be determined, the organization and administration server sends the fault message to a system management server, and the system management server determines a system management fault type according to the fault message and repairs the fault according to the system management fault type.

2. The cloud data center fault management method according to claim 1, characterized in that the method further comprises:

The organization and administration server presets an organization and administration fault type collection and modifies the organization and administration fault type collection as required;

The system management server presets a system management fault type collection and modifies the system management fault type collection as required.

3. The cloud data center fault management method according to claim 2, characterized in that the step in which the organization and administration server receives the fault message submitted by the user terminal and judges whether the organization and administration fault type corresponding to the fault message can be determined comprises:

The organization and administration server receives the fault message submitted by the user terminal and compares the fault message with the preset organization and administration fault type collection to judge whether the organization and administration fault type corresponding to the fault message can be determined;

If the organization and administration fault type collection contains an organization and administration fault type corresponding to the fault message, it is judged that the organization and administration fault type corresponding to the fault message can be determined;

If the organization and administration fault type collection contains no organization and administration fault type corresponding to the fault message, it is judged that the organization and administration fault type corresponding to the fault message cannot be determined.

4. The cloud data center fault management method according to claim 1, characterized in that the method further comprises:

Recording the state of the organization and administration fault repair, the states of the organization and administration fault repair including fault message submitted, fault message transferred, fault being processed, fault solved, or fault closed; and/or,

Recording the state of the system management fault repair, the states of the system management fault repair including fault message submitted, fault being processed, fault solved, or fault closed.

5. A cloud data center fault management system, characterized by comprising:

An organization and administration server, configured to receive a fault message submitted by a user terminal and judge whether the organization and administration fault type corresponding to the fault message can be determined; if the organization and administration fault type corresponding to the fault message can be determined, the organization and administration server repairs the fault according to the organization and administration fault type; if the organization and administration fault type corresponding to the fault message cannot be determined, the organization and administration server sends the fault message to a system management server;

A system management server, configured to receive the fault message from the organization and administration server, determine a system management fault type according to the fault message, and repair the fault according to the system management fault type.

6. The cloud data center fault management system according to claim 5, characterized in that the organization and administration server is further configured to preset an organization and administration fault type collection and to modify the organization and administration fault type collection as required;

The system management server is further configured to preset a system management fault type collection and to modify the system management fault type collection as required.

7. The cloud data center fault management system according to claim 6, characterized in that the organization and administration server is specifically configured to: receive the fault message submitted by the user terminal and compare the fault message with the preset organization and administration fault type collection to judge whether the organization and administration fault type corresponding to the fault message can be determined;

8. The cloud data center fault management system according to claim 6, characterized in that the organization and administration server is further configured to record the state of the organization and administration fault repair, the states of the organization and administration fault repair including fault message submitted, fault message transferred, fault being processed, fault solved, or fault closed;

The system management server is further configured to record the state of the system management fault repair, the states of the system management fault repair including fault message submitted, fault being processed, fault solved, or fault closed.