CN103152414B - A high-availability system based on cloud computing - Google Patents

A high-availability system based on cloud computing

Info

Publication number
CN103152414B
CN103152414B CN201310065647.8A CN201310065647A CN103152414B CN 103152414 B CN103152414 B CN 103152414B CN 201310065647 A CN201310065647 A CN 201310065647A CN 103152414 B CN103152414 B CN 103152414B
Authority
CN
China
Prior art keywords
layer
subsystem
management
control agents
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310065647.8A
Other languages
Chinese (zh)
Other versions
CN103152414A (en)
Inventor
王电钢
常健
王铁军
周毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SICHUAN ELECTRIC POWER Corp INFORMATION COMMUNICATION CO Ltd
Original Assignee
SICHUAN ELECTRIC POWER Corp INFORMATION COMMUNICATION CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SICHUAN ELECTRIC POWER Corp INFORMATION COMMUNICATION CO Ltd
Priority to CN201310065647.8A
Publication of CN103152414A
Application granted
Publication of CN103152414B
Legal status: Active
Anticipated expiration

Landscapes

  • Hardware Redundancy (AREA)

Abstract

The invention discloses a high-availability system based on cloud computing and an implementation method thereof. The system comprises a central control management service subsystem and an autonomous control agent subsystem, which are interconnected through a protocol. The central control management service subsystem comprises five layers, including core service, resource management and task management; the autonomous control agent subsystem comprises five layers, including core framework, state acquisition and process monitoring. The method comprises steps such as creating an image of the information application, monitoring the running state of the application, starting the image virtual machine of a failed application, and shutting down that image virtual machine once the failed application has recovered. The invention changes the ratio of active to standby servers from the 1:1 of a traditional dual-machine hot-standby high-availability system to N:1, thereby saving a large amount of standby server resources, improving the utilization of server resources, and offering good flexibility and extensibility.

Description

A high-availability system based on cloud computing
Technical field
The present invention relates to a high-availability system based on cloud computing and an implementation method thereof.
Background art
High availability (HA) refers to improving the availability of systems and applications by minimizing the downtime caused by routine maintenance operations (planned) and sudden system crashes (unplanned). An HA system is currently the most effective means for an enterprise to keep its core computer systems from shutting down because of faults. With the development of information applications, data is used ever more widely within enterprises, and improving the high availability of information applications has become one of the top priorities in building a robust computer system. Information applications usually adopt dual-machine hot-standby technology to improve system availability.
Dual-machine hot standby refers specifically to hot standby (or high availability) based on two servers in a high-availability system. Dual-machine high availability is a system (software or hardware) that addresses unavoidable planned or unplanned downtime: any fault that causes a system to go down and a service to be interrupted triggers a corresponding procedure for error determination and fault isolation, after which the interrupted service is restored. By the switching mode used in operation, dual-machine high availability can be divided into active/standby mode (Active-Standby mode) and dual-active mode (Active-Active mode). In active/standby mode, one server is in the active state for a given service while the other server is in the standby state for that service. In dual-active mode, two different services run on the two servers, each server being active for one service and standby for the other (i.e. the Active-Standby and Standby-Active states).
There are currently three main schemes for building dual-machine hot standby: the shared-storage (disk array) mode, the full-redundancy (dual machines with dual storage) mode, and the data-replication mode.
The shared-storage (disk array) mode is the most commonly used. It relies mainly on the disk array to guarantee data integrity and continuity after a switchover. User data is generally placed on the disk array; after the primary host goes down, the standby host continues to read the original data from the disk array. The traditional single-storage dual-machine hot-standby mode consists of one primary server, one standby server and one disk array. Because it uses a single storage device, practitioners often describe it as having a disk single point of failure. In general, however, storage is highly reliable, so if storage device failure is disregarded, this is also the hot-standby mode most widely adopted in the industry.
The traditional single-storage dual-machine hot-standby mode does have a single point of failure, and making storage highly available in order to achieve storage redundancy is increasingly accepted by users. This can be understood as follows: dual-machine hot standby originated as a solution to planned shutdowns and unplanned downtime of servers, but it cannot handle the server outages caused by planned shutdowns and unplanned downtime of the storage device. Because the storage device is the only device that stores data in a dual-machine hot-standby system, a failure of it often brings down the entire hot-standby system.
The high-availability dual-machine hot-standby scheme based on two storage devices eliminates the single point of failure caused by the shutdown of a single storage device, yielding a fully redundant dual-machine hot-standby mode with no single point of failure.
The fully redundant dual-machine hot-standby mode consists of two storage devices, one primary server and one standby server. Its advantages are: (1) data replication between the storage devices does not pass through the network but takes place directly between the two storage devices; (2) replication between the two storage devices is fully real-time, with no delay at any time; (3) the switchover time between primary and standby storage is less than 500 ms, so that the system experiences no delay in storage access; (4) the disk identifiers and partitions of the hard disks do not change when the primary and standby storage are switched; (5) a server switchover does not affect initialization, incremental synchronization or data replication between the storage devices; (6) a planned shutdown of one storage device does not affect the operation of the whole server hot-standby system; (7) data deduplication is used between the storage devices to perform incremental synchronization; (8) it is a true 7×24-hour fully redundant scheme. However, this mode is costly and complex to manage, and is not suitable for small-scale information applications.
The data-replication mode mainly relies on data synchronization to keep the data on the active and standby servers consistent. Distributed Replicated Block Device (DRBD) is an open-source data clustering solution that provides dynamic data synchronization between hosts. DRBD receives data, writes it to the local disk, and then sends it to the other host, which in turn stores the data on its own disk. The other required components are a cluster membership service, such as TurboHA or Heartbeat, and applications that can run on a block device, for example raw I/O, a file system with fsck, or a database with recovery capability.
All three of the above hot-standby modes require at least two physical servers and achieve high availability of the information system through redundancy. Most of the time, the redundant equipment sits in standby. As the number of information systems in an enterprise grows, ensuring their high availability inevitably brings a large amount of redundant equipment. For small and medium-sized enterprises, this necessarily increases construction and maintenance costs.
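The cost argument above can be made concrete with a short calculation. The following is only an illustrative sketch: the function name `standby_servers` and the figure of six applications sharing one standby host are assumptions chosen for the example, not values taken from the patent.

```python
def standby_servers(num_apps: int, apps_per_standby: int = 1) -> int:
    """Return how many standby physical servers protect num_apps applications.

    apps_per_standby=1 models classic 1:1 dual-machine hot standby;
    a larger value models N applications sharing one standby host (N:1).
    Both parameters are hypothetical knobs for this illustration.
    """
    if num_apps < 0 or apps_per_standby < 1:
        raise ValueError("num_apps must be >= 0 and apps_per_standby >= 1")
    # Ceiling division without importing math.
    return -(-num_apps // apps_per_standby)


apps = 12
one_to_one = standby_servers(apps)                   # one standby per application
shared = standby_servers(apps, apps_per_standby=6)   # assumed N = 6
print(f"1:1 needs {one_to_one} standby servers, 6:1 needs {shared}")
```

With twelve protected applications, the 1:1 scheme keeps twelve servers idle in standby, whereas sharing a standby host among six applications needs only two.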
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing a cloud-computing-based high-availability system, and an implementation method thereof, that reduce enterprise costs while ensuring the high availability of information applications.
The object of the invention is achieved through the following technical solution: a high-availability system based on cloud computing, comprising one central control management service subsystem and at least one autonomous control agent subsystem, which are interconnected through a protocol. The central control management service subsystem comprises a core service layer, a resource management layer, a task management layer, an intelligent scheduling layer, a monitoring and alarm layer and an image management layer. The autonomous control agent subsystem comprises a core framework layer, a host state acquisition layer, a state acquisition layer, an event management layer, a process monitoring layer and a Joblet runtime environment layer.
The core service layer provides the basic framework for system operation, including at least security management, event management and log management. It is responsible for establishing communication with the autonomous control agent subsystem and for listening for and collecting the information sent by all managed servers; for managing communication with the underlying Lightweight Directory Access Protocol (LDAP) directory service and the database server; and for managing communication with other systems that communicate in a RESTful manner.
The resource management layer provides unified management of the resource status, resource usage and running-state information of all physical machines and virtual machines in the high-availability system.
The task management layer is used to create and modify tasks, to schedule them and to monitor their execution, so that virtual machines complete start, stop and migration operations when required.
The intelligent scheduling layer performs intelligent scheduling of the physical machines and virtual machines in the high-availability system, including at least high-availability scheduling, resource-balancing scheduling and energy-saving scheduling.
The monitoring and alarm layer collects, aggregates and presents the running-state data of information applications and virtual machines, and notifies the persons responsible for an abnormal application by raising an alarm to them.
The image management layer is responsible for creating, deleting, querying and modifying virtual machine image files.
The core framework layer corresponds to the core service layer of the central control management service subsystem and provides the foundation for system security, logging, network connections and the RESTful framework in the autonomous control agent subsystem.
The host state acquisition layer periodically collects the running state of the physical machines and virtual machines in the resource pool, including static and dynamic information about CPU, memory, disk and network, and reports the collected information to the central control management service subsystem through the core framework layer.
The state acquisition layer collects the system running state of the information application servers and uploads the collected information to the central control management service subsystem through the core framework layer.
The event management layer manages the events generated in the autonomous control agent subsystem, including creating events, deleting events and querying event status.
The process monitoring layer monitors the critical processes on the information application servers on which the autonomous control agent subsystem is deployed. When it finds that a critical process has failed, it sends a process failure event to the central control management service subsystem to trigger the corresponding virtual machine, thereby ensuring the high availability of the information application. Critical processes are the processes that administrators manually configure for monitoring according to the particular information application, including databases and Web services.
The Joblet runtime environment layer provides the foundation for running Joblets in the autonomous control agent subsystem. Joblets and task Jobs correspond one to one: a Job executes in the central control management service subsystem and is responsible for initializing its Joblet and managing its execution, while Joblets are distributed to and executed on the physical machines in the resource pool, where they carry out the actual task.
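The one-to-one relationship between a Job on the central side and a Joblet on a physical machine can be sketched as follows. This is a minimal illustration under stated assumptions: the class names, fields and the `dispatch` and `run` methods are hypothetical, and the patent does not specify how Jobs and Joblets are represented or transported.

```python
import itertools
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class JobletState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"


@dataclass
class Joblet:
    """Agent-side unit of work; carries the id of the Job it belongs to."""
    job_id: int
    host: str
    command: str
    state: JobletState = JobletState.PENDING

    def run(self) -> None:
        # Placeholder for the real work (e.g. starting a virtual machine).
        self.state = JobletState.RUNNING
        self.state = JobletState.DONE


_job_ids = itertools.count(1)  # hypothetical id generator for the sketch


@dataclass
class Job:
    """Central-side task that initializes and tracks exactly one Joblet."""
    command: str
    host: str
    job_id: int = field(default_factory=lambda: next(_job_ids))
    joblet: Optional[Joblet] = None

    def dispatch(self) -> Joblet:
        # One Job creates one Joblet, preserving the one-to-one mapping.
        self.joblet = Joblet(self.job_id, self.host, self.command)
        return self.joblet


job = Job(command="start_vm app01-image", host="cloud-host-1")
joblet = job.dispatch()
joblet.run()
print(job.job_id, joblet.host, joblet.state.value)
```

Keeping the Job as the only owner of its Joblet makes it straightforward for the central side to query the state of the work it has handed out.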
An implementation method for the high-availability system based on cloud computing comprises the following steps:
(1) Create the information application image, i.e. configure on the cloud host server the active/standby relationship between each information application server and its virtual machine;
(2) Install and deploy the Agent components, i.e. configure the Agent information of the cloud host servers and the information application servers;
(3) The Agent layer monitors the running of the application: the autonomous control agent of the virtual cloud host and the autonomous control agent of the information application host monitor the application by collecting the running state of the virtual machines, cloud host servers and information application servers, and report the monitoring information to the central control management service layer;
(4) When an application failure is detected, the central control management service layer sends the Agent layer a task Job that starts the emergency measures;
(5) The Agent layer automatically starts the image virtual machine of the failed application according to the instructions carried in the Joblet;
(6) The Agent layer continues to monitor the running of the application;
(7) After the failed application is detected to have recovered, the image virtual machine of the failed application is shut down.
The implementation method further comprises a step of raising an alarm to the person responsible for the abnormal application when a server abnormality is detected.
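Steps (3) to (7) above amount to a small control loop around the image virtual machine. The sketch below is one possible reading, with assumed names (`VmState`, `failover_step`, `simulate`) and with the health check reduced to a boolean; real failure detection, Job delivery and virtual machine control are outside its scope.

```python
from enum import Enum
from typing import Iterable, List, Tuple


class VmState(Enum):
    STOPPED = "stopped"
    RUNNING = "running"


def failover_step(app_healthy: bool, vm_state: VmState) -> Tuple[VmState, str]:
    """Decide the next image-VM state and the action to take for one observation."""
    if not app_healthy and vm_state is VmState.STOPPED:
        # Steps (4)-(5): failure detected, start the image virtual machine.
        return VmState.RUNNING, "start_image_vm"
    if app_healthy and vm_state is VmState.RUNNING:
        # Step (7): the application has recovered, shut the image VM down.
        return VmState.STOPPED, "stop_image_vm"
    # Steps (3) and (6): nothing to change, keep monitoring.
    return vm_state, "none"


def simulate(observations: Iterable[bool]) -> List[str]:
    """Replay a sequence of health observations and collect the actions."""
    state = VmState.STOPPED
    actions = []
    for healthy in observations:
        state, action = failover_step(healthy, state)
        actions.append(action)
    return actions


print(simulate([True, False, False, True, True]))
```

Because the decision depends only on the current observation and the current virtual machine state, repeated failure reports do not start the image virtual machine twice.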
The beneficial effects of the invention are as follows:
(1) The invention changes the ratio of active to standby servers from the 1:1 of a traditional dual-machine hot-standby high-availability system to N:1, thereby saving a large amount of standby server resources and improving the utilization of server resources;
(2) Operating data for physical servers, virtual machines and critical processes is collected separately by the autonomous control agents; the collected data is stored using H2 in-memory database technology, and monitoring-data analysis models and fast algorithms adapted to various policies are established, thereby meeting requirements such as real-time data analysis, abnormality early warning and resource scheduling;
(3) The invention is functionally extensible and flexible: its task management technology is based on script orchestration, so staff need only write Python scripts to modify and extend functionality;
(4) Using computing-resource intelligent scheduling, the invention treats all physical servers in the resource pool as shared standby resources that uniformly provide HA support for all information applications. When heartbeat detection finds an abnormality, the system automatically schedules the image virtual machine corresponding to the abnormal information application to run on a lightly loaded physical server in the resource pool and take over the failed information application;
(5) Abnormality alarms are pushed in a variety of ways, including message delivery mechanisms such as e-mail, SMS and instant messaging, which ensures that important information reaches the responsible person reliably and promptly, so that operations staff learn of information application failures in time and can take follow-up measures promptly;
(6) The autonomous control agent is deployed in the operating system of the information application and establishes, over the network, a heartbeat connection between the information application and the central control management server. The intelligent agent monitors the running state of the information application in real time according to the specified policy, and when it detects an application abnormality it performs the corresponding policy action;
(7) The invention adopts SIGAR (System Information Gatherer And Reporter), which is open, mature and lightweight, as the state acquisition method and integrates the SIGAR component into the intelligent agent, so that monitoring data can be delivered to the central control management service in real time and saved periodically at a given sampling frequency.
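Effect (4) describes placing the image virtual machine on a lightly loaded physical server. A minimal sketch of such a placement rule is given below; the `Host` fields, the memory filter and the tie-breaking order are assumptions made for illustration, since the patent does not disclose its scheduling algorithm.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Host:
    name: str
    cpu_load: float    # fraction of CPU in use, 0.0 to 1.0 (assumed metric)
    free_mem_mb: int   # free memory in MB (assumed metric)


def pick_host(hosts: Iterable[Host], required_mem_mb: int) -> Optional[Host]:
    """Pick the least-loaded host that can hold the image virtual machine."""
    # Keep only hosts with enough free memory for the VM.
    candidates = [h for h in hosts if h.free_mem_mb >= required_mem_mb]
    if not candidates:
        return None
    # Lowest CPU load wins; more free memory breaks ties.
    return min(candidates, key=lambda h: (h.cpu_load, -h.free_mem_mb))


pool = [
    Host("cloud-01", cpu_load=0.72, free_mem_mb=16384),
    Host("cloud-02", cpu_load=0.18, free_mem_mb=8192),
    Host("cloud-03", cpu_load=0.10, free_mem_mb=2048),
]
chosen = pick_host(pool, required_mem_mb=4096)
print(chosen.name if chosen else "no host available")
```

Here `cloud-03` has the lowest load but too little free memory, so the rule selects `cloud-02`. Returning `None` when no host qualifies leaves the caller free to raise an alarm instead of overcommitting a server.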
Brief description of the drawings
Fig. 1 is a block diagram of the system of the invention;
Fig. 2 is a schematic flow chart of the method of the invention;
Fig. 3 is a physical architecture diagram of the system of the invention.
Detailed description of the embodiments
The technical solution of the invention is described in further detail below with reference to the drawings, but the scope of protection of the invention is not limited to what follows.
As shown in Fig. 1, a high-availability system based on cloud computing comprises one central control management service subsystem and at least one autonomous control agent subsystem, which are interconnected through a protocol. The central control management service subsystem comprises a core service layer, a resource management layer, a task management layer, an intelligent scheduling layer, a monitoring and alarm layer and an image management layer. The autonomous control agent subsystem comprises a core framework layer, a host state acquisition layer, a state acquisition layer, an event management layer, a process monitoring layer and a Joblet runtime environment layer.
The core service layer is the core of the whole high-availability system. It provides the basic framework for system operation, including at least functions such as security management, event management and log management. It is also responsible for establishing communication with the autonomous control agent subsystem and for listening for and collecting the information sent by all managed servers; for managing communication with the underlying LDAP directory service and the database server; and for managing communication with other systems that communicate in a RESTful manner.
The resource management layer provides unified management of the resource status, resource usage and running-state information of all physical machines and virtual machines in the high-availability system.
The task management layer handles management work such as creating and modifying tasks, scheduling them and monitoring their execution, so that virtual machines complete start, stop and migration operations when required.
The intelligent scheduling layer performs intelligent scheduling of the physical machines and virtual machines in the high-availability system, including at least high-availability scheduling, resource-balancing scheduling and energy-saving scheduling.
The monitoring and alarm layer collects, aggregates and presents the running-state data of information applications and virtual machines, and notifies the persons responsible for an abnormal application by raising an alarm to them.
The image management layer is responsible for operations on virtual machine image files such as creating, deleting, querying and modifying them.
The core framework layer corresponds to the core service layer of the central control management service subsystem and provides the foundation for system security, logging, network connections and the RESTful framework in the autonomous control agent subsystem.
The host state acquisition layer periodically collects the running state of the physical machines and virtual machines in the resource pool, including static and dynamic information such as CPU, memory, disk and network, and reports the collected information to the central control management service subsystem through the core framework layer.
The state acquisition layer collects the system running state of the information application servers and uploads the collected information to the central control management service subsystem through the core framework layer.
The event management layer manages the events generated in the autonomous control agent subsystem, including creating events, deleting events and querying event status.
The process monitoring layer monitors the critical processes on the information application servers on which the autonomous control agent subsystem is deployed. When it finds that a critical process has failed, it sends a process failure event to the central control management service subsystem to trigger the corresponding virtual machine, thereby ensuring the high availability of the information application. Critical processes are the processes that administrators manually configure for monitoring according to the particular information application, such as databases and Web services.
The Joblet runtime environment layer provides the foundation for running Joblets in the autonomous control agent subsystem. Joblets and task Jobs correspond one to one: a Job executes in the central control management service subsystem and is responsible for initializing its Joblet and managing its execution, while Joblets are distributed to and executed on the physical machines in the resource pool, where they carry out the actual task.
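The behaviour of the process monitoring layer can be illustrated with a small check that compares the administrator-configured critical processes against the processes currently running. Everything named here (`find_failed_processes`, the event fields, the process names) is a hypothetical choice for the sketch; how the agent actually enumerates processes and encodes events is not stated in the patent.

```python
from typing import Dict, Iterable, List, Set


def find_failed_processes(
    host: str, critical: Iterable[str], running: Set[str]
) -> List[Dict[str, str]]:
    """Build one failure event per configured critical process that is not running."""
    return [
        {"type": "process_failure", "host": host, "process": name}
        for name in critical
        if name not in running
    ]


# Administrator-configured critical processes for one information application.
critical_processes = ["mysqld", "httpd"]
# Snapshot of running process names; a real agent would query the operating system.
running_now = {"sshd", "httpd", "crond"}

for event in find_failed_processes("app-server-01", critical_processes, running_now):
    # In the described system the event would be sent to the central subsystem.
    print(event)
```

Passing the snapshot of running processes in as an argument keeps the check independent of any particular operating system interface and easy to test.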
As shown in Fig. 2, an implementation method for the high-availability system based on cloud computing comprises the following steps:
(1) Create the information application image, i.e. configure on the cloud host server the active/standby relationship between each information application server and its virtual machine;
(2) Install and deploy the Agent components, i.e. configure the Agent information of the cloud host servers and the information application servers;
(3) The Agent layer monitors the running of the application: the autonomous control agent of the virtual cloud host and the autonomous control agent of the information application host monitor the application by collecting the running state of the virtual machines, cloud host servers and information application servers, and report the monitoring information to the central control management service layer;
(4) When an application failure is detected, the central control management service layer sends the Agent layer a task Job that starts the emergency measures;
(5) The Agent layer automatically starts the image virtual machine of the failed application according to the instructions carried in the Joblet;
(6) The Agent layer continues to monitor the running of the application;
(7) After the failed application is detected to have recovered, the image virtual machine of the failed application is shut down.
The implementation method further comprises a step of raising an alarm to the person responsible for the abnormal application when a server abnormality is detected.
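The alarm step can be pictured as fanning one message out over several delivery channels, such as the e-mail, SMS and instant-messaging mechanisms listed among the beneficial effects. The sketch below uses stand-in channel functions; the `Channel` signature, `send_alert` and the recipient name are assumptions, and no real mail or SMS API is called.

```python
from typing import Callable, Dict, List

# A channel takes (recipient, message) and reports whether delivery succeeded.
Channel = Callable[[str, str], bool]


def send_alert(channels: Dict[str, Channel], recipient: str, message: str) -> List[str]:
    """Try every channel and return the names of those that delivered the alert."""
    delivered = []
    for name, send in channels.items():
        try:
            if send(recipient, message):
                delivered.append(name)
        except Exception:
            # A failing channel must not prevent the remaining ones from being tried.
            continue
    return delivered


def fake_mail(recipient: str, message: str) -> bool:
    return True  # stand-in for a real e-mail gateway


def fake_sms(recipient: str, message: str) -> bool:
    raise ConnectionError("SMS gateway unreachable")  # simulated outage


def fake_im(recipient: str, message: str) -> bool:
    return True  # stand-in for an instant-messaging gateway


channels = {"mail": fake_mail, "sms": fake_sms, "im": fake_im}
print(send_alert(channels, "ops-oncall", "app01: critical process mysqld is down"))
```

Trying all channels rather than stopping at the first success trades some duplicate notifications for a better chance that the responsible person is reached when one gateway is down.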
As shown in Fig. 3, in terms of physical structure the high-availability system based on cloud computing can be divided into three parts: the central control management server, the cloud host server resource pool and the information application server resource pool. The cloud host server resource pool comprises at least one cloud host server, and the information application server resource pool comprises at least one information application server; the cloud host servers and information application servers communicate with the central control management server over the network. The servers in the cloud host server resource pool are configured with virtual machines, which have an active/standby relationship with the servers in the information application server resource pool, and the central control management server is responsible for task management and intelligent scheduling across all servers. In terms of logical architecture, the system can be divided into an Agent layer and a Server layer. The Agent layer is responsible for monitoring and reporting the operating data of the information applications (information such as the workload of all servers and virtual machines and the running state of critical processes), and for receiving, interpreting and executing commands from the Server layer. The Server layer collects the running state of the hosts and sends control commands to the appropriate hosts according to the scheduling algorithm, thereby scheduling and managing resources.
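On the Server layer, deciding that a host has gone silent can be as simple as tracking when each Agent last reported. The following sketch assumes a timeout-based rule and hypothetical names (`HeartbeatTable`, `record`, `expired`); the patent mentions heartbeat detection but does not specify the interval or the detection rule.

```python
from typing import Dict, List


class HeartbeatTable:
    """Server-side record of the last report time of each monitored host."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s  # assumed threshold, not from the patent
        self._last_seen: Dict[str, float] = {}

    def record(self, host: str, now: float) -> None:
        # Called whenever an Agent report arrives from `host`.
        self._last_seen[host] = now

    def expired(self, now: float) -> List[str]:
        # Hosts whose last report is older than the timeout, in sorted order.
        return sorted(
            host for host, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        )


table = HeartbeatTable(timeout_s=15.0)
table.record("app-server-01", now=0.0)
table.record("app-server-02", now=10.0)
print(table.expired(now=20.0))
```

Timestamps are passed in explicitly so that the rule can be exercised without waiting on a real clock; a deployment would typically supply a monotonic time source such as `time.monotonic()`.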

Claims (1)

1. A high-availability system based on cloud computing, characterized in that: it comprises one central control management service subsystem and at least one autonomous control agent subsystem, which are interconnected through a protocol; the central control management service subsystem comprises a core service layer, a resource management layer, a task management layer, an intelligent scheduling layer, a monitoring and alarm layer and an image management layer; the autonomous control agent subsystem comprises a core framework layer, a host state acquisition layer, a state acquisition layer, an event management layer, a process monitoring layer and a Joblet runtime environment layer;
the core service layer is the core of the whole high-availability system and provides the basic framework for system operation, including at least security management, event management and log management; it is responsible for establishing communication with the autonomous control agent subsystem and for listening for and collecting the information sent by all managed servers, for managing communication with the underlying Lightweight Directory Access Protocol (LDAP) directory service and the database server, and for managing communication with other systems that communicate in a RESTful manner;
the resource management layer provides unified management of the resource status, resource usage and running-state information of all physical machines and virtual machines in the high-availability system;
the task management layer is used to create and modify tasks, to schedule them and to monitor their execution, so that virtual machines complete start, stop and migration operations when required;
the intelligent scheduling layer performs intelligent scheduling of the physical machines and virtual machines in the high-availability system, including at least high-availability scheduling, resource-balancing scheduling and energy-saving scheduling;
the monitoring and alarm layer collects, aggregates and presents the running-state data of information applications and virtual machines, and notifies the persons responsible for an abnormal application by raising an alarm to them;
the image management layer is responsible for creating, deleting, querying and modifying virtual machine image files;
the core framework layer corresponds to the core service layer of the central control management service subsystem and provides the foundation for system security, logging, network connections and the RESTful framework in the autonomous control agent subsystem;
the host state acquisition layer periodically collects the running state of the physical machines and virtual machines in the autonomous control agent subsystem, including static and dynamic information about CPU, memory, disk and network, and reports the collected information to the central control management service subsystem through the core framework layer;
the state acquisition layer collects the system running state of the information application servers and uploads the collected information to the central control management service subsystem through the core framework layer;
the event management layer manages the events generated in the autonomous control agent subsystem, including creating events, deleting events and querying event status;
the process monitoring layer monitors the critical processes on the information application servers on which the autonomous control agent subsystem is deployed and, when it finds that a critical process has failed, sends a process failure event to the central control management service subsystem to trigger the corresponding virtual machine, thereby ensuring the high availability of the information application, wherein the critical processes are the processes that administrators manually configure for monitoring according to the particular information application, including database processes and Web service processes;
the Joblet runtime environment layer provides the foundation for running Joblets in the autonomous control agent subsystem, wherein Joblets and task Jobs correspond one to one, a Job executes in the central control management service subsystem and is responsible for initializing its Joblet and managing its execution, and Joblets are distributed to and executed on the physical machines in the autonomous control agent subsystem, where they carry out the actual task.
CN201310065647.8A 2013-03-01 2013-03-01 A kind of high-availability system based on cloud computing Active CN103152414B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310065647.8A CN103152414B (en) 2013-03-01 2013-03-01 A kind of high-availability system based on cloud computing

Publications (2)

Publication Number Publication Date
CN103152414A (en) 2013-06-12
CN103152414B (en) 2016-03-30

Family

ID=48550273

Country Status (1)

Country Link
CN (1) CN103152414B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant