CN104333459A - Method and device for fault management of cloud data center - Google Patents
Method and device for fault management of cloud data center
- Publication number
- CN104333459A (application CN201410363945.XA)
- Authority
- CN
- China
- Prior art keywords
- fault
- organization
- administration
- message
- fault type
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention provides a method and a device for fault management of a cloud data center. The method comprises the following steps that an organization and management server receives fault information submitted by a user terminal and judges whether an organization and management fault type corresponding to the fault information can be determined; if the organization and management fault type corresponding to the fault information can be determined, the organization and management server repairs a fault according to the organization and management fault type; if the organization and management fault type corresponding to the fault information cannot be determined, the organization and management server transmits the fault information to a system management server, the system management server determines a system management fault type according to the fault information, and repairs the fault according to the system management fault type. According to the method and the device, the fault can be timely processed, so that system resources are effectively utilized.
Description
Technical field
The present invention relates to the field of cloud computing, and in particular to a fault management method and device for a cloud data center.
Background art
As cloud computing technology matures, cloud computing has gradually become a focus of industry development. The cloud data center operating system is the only means by which a data center completes the conversion from hardware to resource pools: it fuses a large number of heterogeneous devices into a standardized, unified logical resource pool, schedules those resources dynamically for cloud applications, and thereby delivers services to terminals. At the same time, the cloud data center operating system exposes interfaces to the applications above it, provides management functions in the middle, and schedules and manages the hardware below it. It is the only link between hardware and applications, occupies a core position, and has a decisive influence on the technical indicators of both the application systems and the hardware.
The Cloud Sea operating system is a complete cloud data center solution that covers the needs of a cloud data center in the form of a suite. The system has a three-layer architecture: an interaction layer, a platform management layer, and a resource virtualization layer. The interaction layer is implemented by iPortal and is divided into an administrator interface and a user interface, so that different roles obtain resource services through one unified platform. The platform management layer consists of functional modules such as resource pool scheduling (iCloudManager), resource management (iResourceManager), statistics and billing (iCharge), and self-service (iService). The resource virtualization layer is provided by conventional server virtualization software, which virtualizes the physical resources.
In the Cloud Sea operating system, virtual resources such as virtual machines and networks may develop faults during use, for example an unavailable network or a virtual machine that cannot be powered on. Because access to the resources is restricted, the user cannot resolve such faults, and system resources therefore cannot be used effectively.
Summary of the invention
To solve the above technical problem, the present invention provides a fault management method and device for a cloud data center that can handle faults in a timely manner, so that system resources are used effectively.
To achieve the object of the invention, the present invention provides a fault management method for a cloud data center, comprising: an organization and management server receives fault information submitted by a user terminal and judges whether an organization and management fault type corresponding to the fault information can be determined; if the fault type can be determined, the organization and management server repairs the fault according to the organization and management fault type; if the fault type cannot be determined, the organization and management server sends the fault information to a system management server, and the system management server determines a system management fault type according to the fault information and repairs the fault according to the system management fault type.
Further, the method also comprises: the organization and management server presets an organization and management fault type set and modifies the set as required; the system management server presets a system management fault type set and modifies the set as required.
Further, the step in which the organization and management server receives the fault information submitted by the user terminal and judges whether the corresponding organization and management fault type can be determined comprises: the organization and management server receives the fault information submitted by the user terminal and compares the fault information with the preset organization and management fault type set; if the set contains an organization and management fault type corresponding to the fault information, the server judges that the fault type can be determined; if the set contains no organization and management fault type corresponding to the fault information, the server judges that the fault type cannot be determined.
Further, the method also comprises: recording the state of the organization and management fault repair, where the state is one of fault information submitted, fault information transferred, fault being processed, fault resolved, or fault closed; and/or recording the state of the system management fault repair, where the state is one of fault information submitted, fault being processed, fault resolved, or fault closed.
The present invention also provides a fault management system for a cloud data center, comprising: a user terminal, configured to send fault information to an organization and management server; the organization and management server, configured to receive the fault information submitted by the user terminal, judge whether an organization and management fault type corresponding to the fault information can be determined, repair the fault according to the organization and management fault type if it can be determined, and send the fault information to a system management server if it cannot be determined; and the system management server, configured to receive the fault information from the organization and management server, determine a system management fault type according to the fault information, and repair the fault according to the system management fault type.
Further, the organization and management server is also configured to preset an organization and management fault type set and modify the set as required; the system management server is also configured to preset a system management fault type set and modify the set as required.
Further, the organization and management server is specifically configured to: receive the fault information submitted by the user terminal and compare the fault information with the preset organization and management fault type set; if the set contains an organization and management fault type corresponding to the fault information, judge that the fault type can be determined; if the set contains no organization and management fault type corresponding to the fault information, judge that the fault type cannot be determined.
Further, the organization and management server is also configured to record the state of the organization and management fault repair, where the state is one of fault information submitted, fault information transferred, fault being processed, fault resolved, or fault closed; the system management server is also configured to record the state of the system management fault repair, where the state is one of fault information submitted, fault being processed, fault resolved, or fault closed.
Compared with the prior art, the present invention comprises: an organization and management server receives fault information submitted by a user terminal and judges whether an organization and management fault type corresponding to the fault information can be determined; if the fault type can be determined, the organization and management server repairs the fault according to the organization and management fault type; if the fault type cannot be determined, the organization and management server sends the fault information to a system management server, and the system management server determines a system management fault type according to the fault information and repairs the fault according to the system management fault type. In the present invention, after a user finds a fault, the user terminal promptly sends the fault information to the organization and management server. Either the organization and management server determines the fault type from the fault information, or it forwards the fault information to the system management server, which determines the fault type. The organization and management server or the system management server then repairs the fault according to the fault type. Faults can thus be handled in a timely manner, and system resources are used effectively.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the cloud data center fault management method of the present invention.
Fig. 2 is a schematic structural diagram of the cloud data center fault management device of the present invention.
Detailed description
The present invention is described below with reference to the embodiments shown in the drawings.
In the Cloud Sea operating system, the roles of a cloud data center are system administrator, organization administrator, and user. The system administrator manages the entire infrastructure and divides the unified data center resources into multiple cloud data centers, each of which is managed by an organization administrator. The organization administrator delivers the resources of its cloud data center to different users on request. The system administrator, the organization administrator, and the user communicate through the system management server, the organization and management server, and the user terminal, respectively.
Fig. 1 is a schematic flow chart of the cloud data center fault management method of the present invention. As shown in Fig. 1, the method comprises:
Step 11: the organization and management server presets an organization and management fault type set, and the system management server presets a system management fault type set.
In this step, because the organization and management server and the system management server use different virtual resources, each server sets up its own fault type set. A fault type set may include fault types such as an unavailable network or a virtual machine that cannot be powered on.
The organization and management server and the system management server can each modify their fault type set as required, for example by adding or deleting a fault type.
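As a minimal sketch of step 11, each server could keep its fault types in a small mutable container. The class name `FaultTypeSet`, the identifier strings, and the `disk_full` fault type are illustrative assumptions; the text specifies only that each server has its own set and can add or delete fault types.

```python
from dataclasses import dataclass, field


@dataclass
class FaultTypeSet:
    """A modifiable set of fault types owned by one management server."""
    owner: str
    types: set[str] = field(default_factory=set)

    def add(self, fault_type: str) -> None:
        # Adding a type that already exists leaves the set unchanged.
        self.types.add(fault_type)

    def remove(self, fault_type: str) -> None:
        # discard() does nothing if the type is not in the set.
        self.types.discard(fault_type)


# Each server presets its own set. The two example fault types come from the
# text; their spelling as identifiers is an assumption.
org_types = FaultTypeSet("organization", {"network_unavailable", "vm_cannot_power_on"})
sys_types = FaultTypeSet("system", {"network_unavailable", "vm_cannot_power_on"})

# Modify one set as required without affecting the other.
org_types.add("disk_full")              # hypothetical new fault type
org_types.remove("vm_cannot_power_on")
```

Keeping the two sets as separate objects reflects the statement that the servers manage different virtual resources, so a change on one server does not propagate to the other.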
Step 12: the organization and management server receives the fault information submitted by the user terminal and judges whether an organization and management fault type corresponding to the fault information can be determined. If so, go to step 13; if not, go to step 14.
In this step, when a user finds a fault, the user submits the fault information to the organization and management server through the user terminal. The organization and management server compares the fault information with the preset organization and management fault type set to judge whether a corresponding organization and management fault type can be determined.
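The text does not say how the fault information is compared with the fault type set. The sketch below assumes, purely for illustration, that the fault information is free text and that each fault type is identified by a few keywords; `ORG_FAULT_TYPES` and `classify` are hypothetical names.

```python
from typing import Optional

# Hypothetical fault type set: each fault type maps to keywords that identify it.
ORG_FAULT_TYPES = {
    "network_unavailable": ("network", "unreachable"),
    "vm_cannot_power_on": ("power on", "boot"),
}


def classify(fault_info: str, fault_types: dict[str, tuple[str, ...]]) -> Optional[str]:
    """Return the first fault type with a keyword in fault_info, else None."""
    text = fault_info.lower()
    for fault_type, keywords in fault_types.items():
        if any(keyword in text for keyword in keywords):
            return fault_type
    return None  # no corresponding fault type: the type cannot be determined


print(classify("VM will not power on after reboot", ORG_FAULT_TYPES))  # vm_cannot_power_on
print(classify("License server rejects my key", ORG_FAULT_TYPES))      # None
```

A `None` result corresponds to the branch that leads to step 14; any other result leads to step 13.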
Step 13: if the organization and management fault type corresponding to the fault information can be determined, the organization and management server repairs the fault according to this fault type.
In this step, if the organization and management fault type set contains a fault type corresponding to the fault information, the server judges that the organization and management fault type can be determined.
The organization and management server repairs the fault according to this fault type and records the state of the repair, for example fault information submitted, fault information transferred, fault being processed, fault resolved, or fault closed, so that the user can query the organization and management server for the state of fault handling.
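The recorded repair states can be sketched as a small state machine. The five states come from the text; the order in which they may follow one another (the `ALLOWED` table) and the names `RepairState` and `RepairRecord` are assumptions made for this example.

```python
from enum import Enum


class RepairState(Enum):
    SUBMITTED = "fault information submitted"
    TRANSFERRED = "fault information transferred"
    PROCESSING = "fault being processed"
    RESOLVED = "fault resolved"
    CLOSED = "fault closed"


# Assumed transitions: the text lists the states but not their order.
ALLOWED = {
    RepairState.SUBMITTED: {RepairState.PROCESSING, RepairState.TRANSFERRED},
    RepairState.TRANSFERRED: {RepairState.PROCESSING},
    RepairState.PROCESSING: {RepairState.RESOLVED},
    RepairState.RESOLVED: {RepairState.CLOSED},
    RepairState.CLOSED: set(),
}


class RepairRecord:
    """Keeps the state history of one fault so a user can query it."""

    def __init__(self, fault_id: str) -> None:
        self.fault_id = fault_id
        self.history = [RepairState.SUBMITTED]

    @property
    def state(self) -> RepairState:
        return self.history[-1]

    def advance(self, new_state: RepairState) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state.name} to {new_state.name}")
        self.history.append(new_state)


record = RepairRecord("fault-001")
record.advance(RepairState.PROCESSING)
record.advance(RepairState.RESOLVED)
print(record.state.value)  # fault resolved
```

On the system management server the same record could be used without the `TRANSFERRED` state, since the text lists only four states for system management fault repair.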
Step 14: if the organization and management fault type corresponding to the fault information cannot be determined, the organization and management server sends the fault information to the system management server.
In this step, if the organization and management fault type set contains no fault type corresponding to the fault information, the server judges that the organization and management fault type cannot be determined.
Step 15: the system management server compares the fault information with the preset system management fault type set, determines the system management fault type corresponding to the fault information, and repairs the fault according to this fault type.
In this step, the system management server repairs the fault according to the system management fault type and records the state of the repair, for example fault information submitted, fault being processed, fault resolved, or fault closed, so that the user can query the system management server for the state of fault handling.
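Steps 12 to 15 together form a two-tier escalation, which can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the class `ManagementServer`, the substring-based type matching, the repair callbacks, and the `LookupError` raised when the top tier cannot determine a type are all hypothetical, since the text does not describe that last case.

```python
from typing import Callable, Optional

Repair = Callable[[str], str]  # takes the fault information, returns a result note


class ManagementServer:
    def __init__(self, name: str, repairs: dict[str, Repair],
                 upstream: Optional["ManagementServer"] = None) -> None:
        self.name = name
        self.repairs = repairs        # fault type set, mapped to repair actions
        self.upstream = upstream      # next tier, or None for the top tier
        self.states: list[str] = []   # recorded repair states

    def determine_type(self, fault_info: str) -> Optional[str]:
        # Assumed comparison: a type matches if its name occurs in the report.
        text = fault_info.lower()
        return next((t for t in self.repairs if t in text), None)

    def handle(self, fault_info: str) -> str:
        self.states.append("submitted")
        fault_type = self.determine_type(fault_info)
        if fault_type is None:
            if self.upstream is None:
                raise LookupError(f"{self.name}: no fault type for {fault_info!r}")
            self.states.append("transferred")        # step 14: forward upstream
            return self.upstream.handle(fault_info)  # step 15 runs on the next tier
        self.states.append("processing")             # step 13: repair locally
        result = self.repairs[fault_type](fault_info)
        self.states.append("resolved")
        return f"{self.name}: {result}"


system = ManagementServer("system", {"storage": lambda info: "remounted storage pool"})
org = ManagementServer("organization", {"network": lambda info: "reset virtual network"},
                       upstream=system)

print(org.handle("Network unavailable on VM 7"))  # organization: reset virtual network
print(org.handle("Storage volume is read-only"))  # system: remounted storage pool
```

The user terminal only ever talks to the organization tier; whether a fault is repaired there or forwarded is decided by the lookup in that tier's own fault type set.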
In the present invention, after a user finds a fault, the user terminal promptly sends the fault information to the organization and management server. Either the organization and management server determines the fault type from the fault information, or it forwards the fault information to the system management server, which determines the fault type. The organization and management server or the system management server then repairs the fault according to the fault type. Faults can thus be handled in a timely manner, and system resources are used effectively.
Fig. 2 is a schematic structural diagram of the cloud data center fault management system of the present invention. As shown in Fig. 2, the system comprises:
A user terminal, configured to send fault information to the organization and management server;
An organization and management server, configured to: preset an organization and management fault type set; receive the fault information submitted by the user terminal and judge whether an organization and management fault type corresponding to the fault information can be determined; if the fault type can be determined, repair the fault according to the organization and management fault type; if the fault type cannot be determined, send the fault information to the system management server;
A system management server, configured to: preset a system management fault type set; receive the fault information from the organization and management server, compare the fault information with the preset system management fault type set, determine the system management fault type corresponding to the fault information, and repair the fault according to the system management fault type.
The organization and management server and the system management server can each modify their fault type set as required, for example by adding or deleting a fault type.
The organization and management server compares the fault information with the preset organization and management fault type set to judge whether a corresponding organization and management fault type can be determined. If the set contains a fault type corresponding to the fault information, the server judges that the fault type can be determined, repairs the fault according to this fault type, and records the state of the repair, for example fault information submitted, fault information transferred, fault being processed, fault resolved, or fault closed, so that the user can query the organization and management server for the state of fault handling. If the set contains no fault type corresponding to the fault information, the server judges that the fault type cannot be determined.
The system management server repairs the fault according to the system management fault type and records the state of the repair, for example fault information submitted, fault being processed, fault resolved, or fault closed, so that the user can query the system management server for the state of fault handling.
In the present invention, after a user finds a fault, the user terminal promptly sends the fault information to the organization and management server. Either the organization and management server determines the fault type from the fault information, or it forwards the fault information to the system management server, which determines the fault type. The organization and management server or the system management server then repairs the fault according to the fault type. Faults can thus be handled in a timely manner, and system resources are used effectively.
It should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. The specification is written this way only for clarity. Those skilled in the art should treat the specification as a whole, and the technical solutions in the embodiments may be appropriately combined to form other embodiments that those skilled in the art can understand.
The detailed descriptions listed above are only specific illustrations of feasible embodiments of the present invention and are not intended to limit its scope of protection. Any equivalent embodiment or modification that does not depart from the spirit of the invention shall fall within the scope of protection of the present invention.
Claims (8)
1. A fault management method for a cloud data center, characterized by comprising:
An organization and management server receives fault information submitted by a user terminal and judges whether an organization and management fault type corresponding to said fault information can be determined;
If the organization and management fault type corresponding to said fault information can be determined, the organization and management server repairs the fault according to said organization and management fault type;
If the organization and management fault type corresponding to said fault information cannot be determined, the organization and management server sends the fault information to a system management server, and the system management server determines a system management fault type according to said fault information and repairs the fault according to said system management fault type.
2. The fault management method for a cloud data center according to claim 1, characterized in that the method further comprises:
The organization and management server presets an organization and management fault type set, and said organization and management server modifies the organization and management fault type set as required;
The system management server presets a system management fault type set, and said system management server modifies the system management fault type set as required.
3. The fault management method for a cloud data center according to claim 2, characterized in that said organization and management server receiving the fault information submitted by the user terminal and judging whether the organization and management fault type corresponding to said fault information can be determined comprises:
The organization and management server receives the fault information submitted by the user terminal, compares the fault information with the preset organization and management fault type set, and judges whether the organization and management fault type corresponding to said fault information can be determined;
If the organization and management fault type set contains an organization and management fault type corresponding to said fault information, it is judged that the organization and management fault type corresponding to said fault information can be determined;
If the organization and management fault type set contains no organization and management fault type corresponding to said fault information, it is judged that the organization and management fault type corresponding to said fault information cannot be determined.
4. The fault management method for a cloud data center according to claim 1, characterized in that the method further comprises:
Recording the state of the organization and management fault repair, where the state of said organization and management fault repair comprises fault information submitted, fault information transferred, fault being processed, fault resolved, or fault closed; and/or,
Recording the state of the system management fault repair, where the state of said system management fault repair comprises fault information submitted, fault being processed, fault resolved, or fault closed.
5. A fault management system for a cloud data center, characterized by comprising:
A user terminal, configured to send fault information to an organization and management server;
The organization and management server, configured to receive the fault information submitted by the user terminal and judge whether an organization and management fault type corresponding to said fault information can be determined; if the organization and management fault type corresponding to said fault information can be determined, to repair the fault according to said organization and management fault type; and if the organization and management fault type corresponding to said fault information cannot be determined, to send the fault information to a system management server;
The system management server, configured to receive the fault information from the organization and management server, determine a system management fault type according to said fault information, and repair the fault according to said system management fault type.
6. The fault management system for a cloud data center according to claim 5, characterized in that said organization and management server is further configured to: preset an organization and management fault type set, and modify the organization and management fault type set as required;
Said system management server is further configured to: preset a system management fault type set, and modify the system management fault type set as required.
7. The fault management system for a cloud data center according to claim 6, characterized in that said organization and management server is specifically configured to: receive the fault information submitted by the user terminal, compare the fault information with the preset organization and management fault type set, and judge whether the organization and management fault type corresponding to said fault information can be determined;
If the organization and management fault type set contains an organization and management fault type corresponding to said fault information, judge that the organization and management fault type corresponding to said fault information can be determined;
If the organization and management fault type set contains no organization and management fault type corresponding to said fault information, judge that the organization and management fault type corresponding to said fault information cannot be determined.
8. The fault management system for a cloud data center according to claim 6, characterized in that said organization and management server is further configured to: record the state of the organization and management fault repair, where the state of said organization and management fault repair comprises fault information submitted, fault information transferred, fault being processed, fault resolved, or fault closed;
Said system management server is further configured to: record the state of the system management fault repair, where the state of said system management fault repair comprises fault information submitted, fault being processed, fault resolved, or fault closed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410363945.XA CN104333459A (en) | 2014-07-28 | 2014-07-28 | Method and device for fault management of cloud data center |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104333459A (en) | 2015-02-04 |
Family
ID=52408118
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410363945.XA Pending CN104333459A (en) | 2014-07-28 | 2014-07-28 | Method and device for fault management of cloud data center |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104333459A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10361919B2 (en) | 2015-11-09 | 2019-07-23 | At&T Intellectual Property I, L.P. | Self-healing and dynamic optimization of VM server cluster management in multi-cloud platform |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101621404A (en) * | 2008-07-05 | 2010-01-06 | 中兴通讯股份有限公司 | Method and system for layering processing of failure |
CN102053873A (en) * | 2011-01-13 | 2011-05-11 | 浙江大学 | Method for ensuring fault isolation of virtual machines of cache-aware multi-core processor |
CN103167004A (en) * | 2011-12-15 | 2013-06-19 | 中国移动通信集团上海有限公司 | Cloud platform host system fault correcting method and cloud platform front control server |
CN103685463A (en) * | 2013-11-08 | 2014-03-26 | 浪潮(北京)电子信息产业有限公司 | Access control method and system in cloud computing system |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10361919B2 (en) | 2015-11-09 | 2019-07-23 | At&T Intellectual Property I, L.P. | Self-healing and dynamic optimization of VM server cluster management in multi-cloud platform |
US10616070B2 (en) | 2015-11-09 | 2020-04-07 | At&T Intellectual Property I, L.P. | Self-healing and dynamic optimization of VM server cluster management in multi-cloud platform |
US11044166B2 (en) | 2015-11-09 | 2021-06-22 | At&T Intellectual Property I, L.P. | Self-healing and dynamic optimization of VM server cluster management in multi-cloud platform |
US11616697B2 (en) | 2015-11-09 | 2023-03-28 | At&T Intellectual Property I, L.P. | Self-healing and dynamic optimization of VM server cluster management in multi-cloud platform |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9600380B2 (en) | Failure recovery system and method of creating the failure recovery system | |
CN108039964B (en) | Fault processing method, device and system based on network function virtualization | |
CN109815043B (en) | Fault processing method, related equipment and computer storage medium | |
US8626936B2 (en) | Protocol independent server replacement and replication in a storage area network | |
WO2017114325A1 (en) | Fault processing method, device and system | |
CN104899095A (en) | Resource adjustment method and system for virtual machine | |
CN102984214B (en) | A kind of method and device realizing business migration in telecom cloud | |
CN104094230A (en) | System and method for supporting live migration of virtual machines in virtualization environment | |
CN103201724A (en) | Providing application high availability in highly-available virtual machine environments | |
CN104516789A (en) | Method and system for failover detection and treatment in checkpoint systems | |
CN104205060A (en) | Providing application based monitoring and recovery for a hypervisor of an ha cluster | |
CN110912991A (en) | Super-fusion-based high-availability implementation method for double nodes | |
CN103559124B (en) | Fast fault detection method and device | |
CN103516802A (en) | Method and device for achieving seamless transference of across heterogeneous virtual switch | |
CN103118130A (en) | Cluster management method and cluster management system for distributed service | |
CN112948063B (en) | Cloud platform creation method and device, cloud platform and cloud platform implementation system | |
US10353786B2 (en) | Virtualization substrate management device, virtualization substrate management system, virtualization substrate management method, and recording medium for recording virtualization substrate management program | |
CN102708027B (en) | A kind of method and system avoiding outage of communication device | |
CN103823708B (en) | The method and apparatus that virtual machine read-write requests are processed | |
WO2018137520A1 (en) | Service recovery method and apparatus | |
CN102929769A (en) | Virtual machine internal-data acquisition method based on agency service | |
CN104780075A (en) | Method for evaluating availability of cloud computing system | |
CN111679889B (en) | Conversion migration method and system of virtual machine | |
CN105556473A (en) | I/O task processing method, device and system | |
CN112099916B (en) | Virtual machine data migration method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20150204 |