CN109995560A - Cloud resource pool management system and method - Google Patents


Info

Publication number
CN109995560A
Authority
CN
China
Prior art keywords
management
production
data center
virtual machine
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711492177.8A
Other languages
Chinese (zh)
Inventor
焦阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Guizhou Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Guizhou Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Guizhou Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201711492177.8A priority Critical patent/CN109995560A/en
Publication of CN109995560A publication Critical patent/CN109995560A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 - Arrangements for executing specific programs
    • G06F9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 - Hypervisors; Virtual machine monitors
    • G06F9/45558 - Hypervisor-specific management and integration aspects
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 - Management of faults, events, alarms or notifications
    • H04L41/0654 - Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0659 - Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities
    • H04L41/0661 - Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities by reconfiguring faulty entities
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/01 - Protocols
    • H04L67/10 - Protocols in which an application is distributed across nodes in the network
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 - Arrangements for executing specific programs
    • G06F9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 - Hypervisors; Virtual machine monitors
    • G06F9/45558 - Hypervisor-specific management and integration aspects
    • G06F2009/45595 - Network integration; Enabling network access in virtual machine instances

Abstract

The present invention provides a cloud resource pool management system and method. The system includes: production VCs, at least one of which is deployed in each data center and each of which manages the business virtual machines of its data center; and a management VC, which uniformly manages and schedules all production VCs. According to embodiments of the present invention, a two-level management architecture is adopted in which the management VC manages the production resources and business virtual machines and provides high availability for the business virtual machines of the production VCs, thereby achieving unified management and scheduling of the resource pool. In addition, the VCs distributed across multiple data centers are managed uniformly, which improves resource management efficiency and the operation and maintenance capability for cloud resources.

Description

Cloud resource pool management system and method
Technical field
The present invention relates to the technical field of cloud resource pool management, and in particular to a cloud resource pool management system and method.
Background art
The prior art manages a resource pool mainly by directly deploying a VMware virtual resource pool system. In smaller resource pool systems, a single VC (VirtualCenter, the virtual infrastructure management center, a management software for virtual infrastructure provided by VMware) is usually used to take over management of all resources. For large-scale resource pools, each VC system can only manage a certain portion of the resources independently and handles permission management for its own users. Operation and maintenance (O&M) personnel log in to different VC management interfaces with different accounts and passwords to carry out daily O&M operations. Each VC is an isolated island of resources: resources cannot be managed or allocated across VCs, there is no global view of resources, O&M has no unified interface, and O&M efficiency is low.
Most of the time, the existing resource pool management architecture runs well. When a host running business systems in the resource pool goes down because of a server hardware fault, service can be recovered quickly through the HA (High Availability) function of the cluster. For server maintenance or upgrade operations, the administrator can use the vMotion function through the VC to proactively migrate virtual machines to other servers, leaving business systems unaffected.
The ability of a cloud resource pool to schedule computing resources flexibly rests on hosts sharing storage resources, so that virtual machines can switch between hosts when a host fails. However, when the storage device or the network connecting the storage suffers performance degradation or link congestion, the shared storage used by the virtual machines turns this into a wide-area failure: many hosts and virtual machines fail at the same time, and the VC, which is itself deployed on the shared storage as a virtual machine, is affected as well. In this failure scenario, a large number of business systems urgently need to be recovered while the VC management system is also in a failed state. Moreover, once the VC has failed, the administrator cannot use the fast fault-resolution tools that come with the VC, such as vMotion, Storage vMotion, VDP (backup), and Replication, to restore the VC itself. Even third-party backup software needs the cooperation of the VC to restore a backed-up virtual machine. The VC virtual machine is also large, usually tens to hundreds of GB, so recovery normally takes at least one hour. While the VC is being restored, O&M personnel cannot use its management functions to determine the scope of the fault, find the cause quickly, or handle the failure, which leaves them completely in the dark. With hundreds of hosts, the faulty hosts cannot be located in time; personnel can only log in to the hosts one by one to investigate, which is very inefficient and makes fault recovery very slow.
In large-scale cloud resource pool management, as the resource pool grows, the existing management architecture has two major problems: (1) the resources of different data centers are managed by different VCs, so there is no unified management interface and resources cannot be scheduled uniformly, while placing all resources under a single VC easily creates a performance bottleneck; (2) when the hosts, network, or storage on which the VC management system runs cause a service failure, the VC cannot be repaired in time, so O&M personnel cannot locate the fault quickly through the management system and the main business recovers slowly.
Summary of the invention
Embodiments of the present invention provide a cloud resource pool management system and method. Through a two-level management architecture, a management VC manages the production resources and business virtual machines and provides high availability for the business virtual machines of the production VCs, thereby achieving unified management and scheduling of the resource pool. In addition, the VCs distributed across multiple data centers are managed uniformly, which improves resource management efficiency and the operation and maintenance capability for cloud resources.
In a first aspect, an embodiment of the present invention provides a cloud resource pool management system, the system comprising:
production VCs, at least one production VC being deployed in each data center, each production VC being used to manage the business virtual machines of its data center; and
a management VC, used to uniformly manage and schedule all production VCs.
Further, the system includes a PSC (Platform Services Controller, the infrastructure controller) deployed in each data center. The PSCs of all data centers are cascaded, and the production VC of each data center interacts with the management VC through the PSC of its data center.
Further, at least two PSCs are deployed in each data center; any one PSC of a data center is set as the primary server and the remaining PSCs are set as standby servers.
Further, the management VC is also used to obtain the configuration data of a failed production VC after that production VC fails, and to quickly recover the failed production VC using the configuration data.
Further, the management VC and the production VCs are deployed as virtual machines in a management cluster composed of multiple physical servers.
Further, the management VC is also used, when a physical server in the management cluster fails, to automatically restart the production VC virtual machines of the failed physical server on other physical servers that have spare capacity.
Further, the management VC periodically uses a replication engine to copy the configuration data of a production VC from shared storage to the local hard disk of a physical server other than the one running that production VC. After the production VC fails, the management VC obtains the configuration data of the failed production VC from the physical server storing it and uses that data to quickly recover the failed production VC.
Further, the production VCs and the management VC run on different physical servers.
Further, the management VC uses the replication engine to copy its own configuration data to a physical server in the management cluster other than the one running the management VC.
In a second aspect, an embodiment of the present invention provides a cloud resource pool management method, comprising:
managing the business virtual machines of a data center through a production VC, wherein at least one production VC is deployed in each data center; and
uniformly managing and scheduling all production VCs through a management VC.
The cloud resource pool management system and method provided by embodiments of the present invention have the following advantages:
(1) Through the two-level management architecture, the management VC manages the production resources and business virtual machines and provides high availability for the business virtual machines of the production VCs, thereby achieving unified management and scheduling of the resource pool. In addition, the VCs distributed across multiple data centers are managed uniformly, which improves resource management efficiency and the operation and maintenance capability for cloud resources.
(2) By deploying the production VCs in a cascaded manner, user permission information and resource catalog information are synchronized, and the ability to cope with production VC failures and physical server failures is improved.
(3) The management VC and the production VCs are deployed on physical servers of the management cluster located in separate cabinets, which gives the management cluster host redundancy and access network redundancy, further improves its reliability, and avoids the extreme risk of an entire cabinet losing power.
(4) Production VC data is backed up to local hard disks using the VR (vSphere Replication) function, and management VC data is backed up to local hard disks using a cold backup method, so that a shared storage failure does not leave the system unrecoverable.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly described below. A person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic structural diagram of the cloud resource pool management system provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of a preferred structure of the cloud resource pool management system provided by an embodiment of the present invention;
Fig. 3 is a flowchart of the cloud resource pool management method provided by an embodiment of the present invention;
Fig. 4 shows an example of the cloud resource pool management system provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of the two-level-management multi-VC deployment architecture of the cloud resource pool management system provided by an embodiment of the present invention;
Fig. 6 shows the resource pool management architectures in physical machine mode and in management cluster mode provided by an exemplary embodiment of the present invention;
Fig. 7 is a schematic diagram of fault restart in a cloud resource pool management system provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of upgrading a cloud resource pool management system provided by an embodiment of the present invention;
Fig. 9 is a deployment diagram of the Gui'an management cluster provided by an exemplary embodiment of the present invention;
Fig. 10 is a schematic diagram of the principle of fast fault recovery in a cloud resource pool management system provided by an embodiment of the present invention.
Detailed description of the embodiments
The features and exemplary embodiments of various aspects of the present invention are described in detail below. To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is further described in detail with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention, not to limit it. To those skilled in the art, the present invention can be practiced without some of these specific details. The following description of the embodiments is provided only to give a better understanding of the present invention by showing examples of it.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", or any other variant are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element introduced by the phrase "including ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
As shown in Fig. 1, an embodiment of the present invention provides a cloud resource pool management system, which includes:
production VC101, at least one production VC101 being deployed in each data center, each production VC101 being used to manage the business virtual machines of its data center; and
a management VC102, used to uniformly manage and schedule all production VC101.
The entire cloud resource pool can be divided into multiple data centers according to where the servers are deployed, and each data center is responsible for managing part of the resource pool.
Each data center deploys one or more production VC101 according to the scale of its resource pool. In Fig. 1, for example, data center 1 is divided into two resource pools and each resource pool is assigned one production VC101.
A VC mainly provides the management and configuration functions of the resource pool and is responsible for resource scheduling (the DRS function), management of distributed switches, and similar functions. When the resource pool is small, few virtual machine configuration changes are made, so a VC failure has little impact on management work. Once the resource pool carries thousands of virtual machines, and especially after upper-layer cloud platform software has been deployed, a VC failure severely affects normal production work, particularly when it coincides with a failure of the main business. Therefore, to handle large-scale cloud resource pools, the cloud resource pool management system of this embodiment transforms the existing single-layer VC management architecture into a two-level VC management architecture: the first-level VC serves as the management VC102 and the second-level VCs serve as production VC101, with each data center deploying one or more production VC101 according to the scale of its resource pool. Through this two-level architecture, the management VC102 manages the production resources and business virtual machines and provides high availability for the business virtual machines of the production VC101, thereby achieving unified management and scheduling of the resource pool. In addition, the production VC101 distributed across multiple data centers are managed uniformly, which improves resource management efficiency and the operation and maintenance capability for cloud resources.
Further, an embodiment of the present invention uses a linked deployment of multiple VCs, as shown in Fig. 2: each data center is deployed with a PSC (Platform Services Controller, the infrastructure controller) 103, the PSC103 of all data centers are cascaded, and the production VC101 of each data center interacts with the management VC102 through the PSC103 of its data center. With this cascaded system, an administrator who logs in using the management IP address of any production VC101 can see, according to his or her own permissions, all or some of the other production VC101 and the resources they manage. In other words, all production VC101 are presented in one unified management interface and the resources managed by each production VC101 are aggregated under the same directory service, which makes it possible to allocate and migrate virtual machines across production VC101.
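The permission-filtered unified view described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the VC names, cluster names, and the `visible_inventory` helper are hypothetical and do not correspond to any VMware API; the shared directory is modelled as a plain dictionary.

```python
def visible_inventory(permissions: set[str], directory: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return the part of the shared directory that a user is allowed to see.

    `directory` maps a production VC name to the resources it manages
    (hypothetical names); `permissions` is the set of VC names the user may view.
    """
    return {vc: list(resources) for vc, resources in directory.items() if vc in permissions}


# Hypothetical shared directory aggregated across the cascaded PSCs.
directory = {
    "vc-a": ["cluster-1", "cluster-2"],
    "vc-b": ["cluster-3"],
    "vc-c": ["cluster-4"],
}

admin_view = visible_inventory({"vc-a", "vc-b", "vc-c"}, directory)
operator_view = visible_inventory({"vc-c"}, directory)
```

Whichever production VC the user logs in through, the same filter is applied to the same aggregated directory, which is what makes the view "unified".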
Further, at least two PSC103 are deployed in each data center. Any one PSC103 of a data center is set as the primary server and the remaining PSC103 in that data center are set as standby servers. A production VC101 connects to one PSC103 of its data center; if that PSC103 fails, the production VC101 can be switched quickly to another PSC103 by command. If the data center has a load balancer, the load balancer can be configured for automatic switching so that traffic moves to another PSC103 automatically when a PSC103 fails. A PSC103 failure only affects users logging in to the production VC101 and the management VC102; users who have already logged in are unaffected.
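The primary/standby selection can be sketched as follows. This is a minimal illustration, not the behaviour of any real load balancer or VMware component: the `PSC` class, its fields, and `select_psc` are hypothetical, and health is reduced to a boolean flag.

```python
from dataclasses import dataclass


@dataclass
class PSC:
    name: str
    role: str            # "primary" or "standby"
    healthy: bool = True


def select_psc(pscs: list[PSC]) -> PSC:
    """Return the PSC a production VC should use.

    The primary is preferred; if it is unhealthy, the first healthy
    standby is chosen instead.
    """
    # Sorting on a boolean key puts the primary (False) before standbys (True).
    for psc in sorted(pscs, key=lambda p: p.role != "primary"):
        if psc.healthy:
            return psc
    raise RuntimeError("no healthy PSC in this data center")


pscs = [PSC("PSC-1", "primary"), PSC("PSC-2", "standby")]
active_before = select_psc(pscs).name   # primary is healthy
pscs[0].healthy = False                 # simulate a failure of PSC-1
active_after = select_psc(pscs).name    # falls back to the standby
```

With a load balancer in automatic mode, a check like `select_psc` would run on every new login; sessions that already exist are not re-evaluated, which matches the statement that logged-in users are unaffected.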
As shown in Fig. 4, taking Guizhou as an example, the PSC103 linked mode of VMware vSphere 6.0 is used to establish one Guizhou management VC102 that uniformly manages the cloud resource pools of the whole of Guizhou. Two PSC103 are then deployed in each data center in Guizhou and all of these PSC103 are cascaded, so that user permission information and resource catalog information are synchronized. Taking the Gui'an data center and the Jinyang data center as examples, two production VC101 (GA-VC1 and GA-VC2 in Fig. 4) are deployed according to the resource pool scale of the Gui'an data center and one production VC101 (JY-VC in Fig. 4) is deployed according to the resource pool scale of the Jinyang data center. The production VC101 of each data center interacts with the management VC102 through the PSC103 of its own data center.
Based on the above two-level VC management architecture, the management system is separated from the production system. The management VC102 provides high availability for the production VC101 and protects the production VC101 with the various high-availability schemes offered by VMware; that is, when a production VC101 fails, the failed production VC101 is quickly recovered through the management VC102.
Fig. 5 shows the two-level-management multi-VC deployment architecture according to an exemplary embodiment of the present invention. The first-level VC is Mgmt-VC (the management VC102); Mgmt-VC uses an embedded PSC and runs in the Gui'an management cluster 104. For the second-level VCs, different numbers of production VC101 are deployed according to the sizes of the different resource pools: GA-VC1 and GA-VC2 run in the Gui'an management cluster 104 and manage Kweiyang resource pool-1 and Kweiyang resource pool-2 respectively, while JY-VC runs in the Jinyang management cluster 104 and manages the Jinyang resource pool. The management cluster of each data center deploys two PSC103, which run on the physical machines of that management cluster. All PSC103 in the system are cascaded; the production VC101 interact through the PSC103, and Mgmt-VC controls each production VC101 through its embedded PSC. PSC-1 and PSC-2 of the same data center act as primary and standby servers for each other: when PSC-1 fails, the standby PSC-2 is started to keep the system running normally.
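The two-level relationship in Fig. 5 can be modelled with a short sketch. The VC names (Mgmt-VC, GA-VC1, GA-VC2, JY-VC) come from the description; the classes, method names, and virtual machine identifiers are assumptions made only for illustration and are not part of any VMware interface.

```python
from dataclasses import dataclass, field


@dataclass
class ProductionVC:
    """Second-level VC: manages the business VMs of one resource pool."""
    name: str
    data_center: str
    business_vms: list[str] = field(default_factory=list)


@dataclass
class ManagementVC:
    """First-level VC: manages and schedules every production VC."""
    name: str
    production_vcs: dict[str, ProductionVC] = field(default_factory=dict)

    def register(self, vc: ProductionVC) -> None:
        self.production_vcs[vc.name] = vc

    def inventory(self) -> dict[str, list[str]]:
        """Unified view: data center -> names of its production VCs."""
        view: dict[str, list[str]] = {}
        for vc in self.production_vcs.values():
            view.setdefault(vc.data_center, []).append(vc.name)
        return view

    def all_business_vms(self) -> list[str]:
        return [vm for vc in self.production_vcs.values() for vm in vc.business_vms]


mgmt = ManagementVC("Mgmt-VC")
mgmt.register(ProductionVC("GA-VC1", "Gui'an", ["vm-001", "vm-002"]))  # VM ids are hypothetical
mgmt.register(ProductionVC("GA-VC2", "Gui'an", ["vm-003"]))
mgmt.register(ProductionVC("JY-VC", "Jinyang", ["vm-004"]))
```

Adding a data center or splitting a large resource pool then amounts to registering one more `ProductionVC`, without changing how the first level is queried.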
Further, the management VC102 is also used to obtain the configuration data of a failed production VC101 after that production VC101 fails, and to quickly recover the failed production VC101 using the configuration data.
Further, physical server failures are handled through cluster HA and vMotion. Specifically, the management VC102 and the production VC101 are deployed as virtual machines in a management cluster composed of multiple physical servers. The VC deployment architecture thus evolves from a single physical server to a management cluster, and each VC changes from a physical server into a virtual machine. Virtualizing the VC management server solves the problem that a VC running on a single physical machine has no host-level high availability.
The vMotion function of VMware keeps the IT environment running normally and provides the flexibility and availability needed to meet the growing demands of the business and of end users. It migrates virtual machines with zero downtime, moving a running virtual machine from one physical server to another without affecting end users.
The production VC101 and the management VC102 run on different physical servers, so that the management VC102 and a production VC101 never run on the same physical server at the same time. This gives the management cluster host redundancy and access network redundancy, further improves its reliability, and avoids the extreme risk of an entire cabinet losing power.
Based on the above management cluster architecture, the management VC102 is also used, when a physical server in the management cluster fails, to automatically restart the production VC101 virtual machines of the failed physical server on other physical servers that have spare capacity. As shown in Fig. 7, after a physical server in the management cluster fails, VMware HA automatically restarts the affected virtual machines of the failed physical server on other physical servers that have capacity. After a virtual machine starts successfully it loads its services automatically; the loading time varies with the type of service. VC services usually recover within 15 minutes and stored data is restored within 1 hour, and business systems are unaffected during fault recovery. As shown in Fig. 8, the vMotion function can also be used to upgrade the production VC101 and the management VC102, replace components, and perform similar operations, so that the virtual machines of the production VC101 and the management VC102 experience zero downtime during hardware maintenance.
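The idea of restarting affected virtual machines on hosts with spare capacity can be sketched as a simple placement plan. This is not VMware HA's actual algorithm; it is a greedy illustration in which capacity is a single hypothetical memory figure, and the host names and sizes are invented for the example.

```python
def plan_ha_restart(hosts: dict[str, dict], failed: str) -> dict[str, str]:
    """Map each VM of the failed host to a surviving host with enough spare memory.

    `hosts` maps host name -> {"capacity": int, "vms": {vm_name: memory}}.
    VMs that fit on no surviving host are left out of the plan.
    """
    spare = {
        name: h["capacity"] - sum(h["vms"].values())
        for name, h in hosts.items()
        if name != failed
    }
    plan: dict[str, str] = {}
    # Place the largest VMs first so that small ones do not crowd them out.
    for vm, need in sorted(hosts[failed]["vms"].items(), key=lambda kv: -kv[1]):
        target = max(spare, key=spare.get, default=None)
        if target is not None and spare[target] >= need:
            plan[vm] = target
            spare[target] -= need
    return plan


# Hypothetical three-host management cluster (memory in GB).
hosts = {
    "esxi-1": {"capacity": 64, "vms": {"GA-VC1": 24, "PSC-1": 8}},
    "esxi-2": {"capacity": 64, "vms": {"GA-VC2": 24}},
    "esxi-3": {"capacity": 64, "vms": {"Mgmt-VC": 24, "PSC-2": 8}},
}
plan = plan_ha_restart(hosts, "esxi-1")
```

The sketch also shows why spare capacity has to be reserved in advance: if the surviving hosts were already full, the plan would come back empty and nothing could be restarted.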
According to an exemplary embodiment of the present invention, Fig. 6 takes the Gui'an resource pool as an example and shows the resource pool management architecture in physical machine mode and in management cluster mode. In the existing physical machine mode, a single management physical machine 106 manages the entire Gui'an resource pool, which offers poor resilience to risk. Under the architecture of this embodiment, the Gui'an resource pool instead creates a management cluster 107 from three ESXi hosts to carry multiple VC virtual machines and the virtual machines of the related management suite. To implement the multi-VC linked scheme, the single VC is decomposed from one virtual machine into four virtual machines: GA-VC1, GA-VC2, PSC-1, and PSC-2. The HA function provided by VMware handles unexpected physical server outages, and the vMotion function shrinks the downtime windows caused by physical server hardware maintenance and software upgrades.
As shown in Fig. 5, the management clusters of the Jinyang data center and the Gui'an data center each consist of three physical servers. The Jinyang management cluster carries JY-VC, the PSCs, and the other management virtual machines related to the Jinyang resource pool. The Gui'an management cluster carries the two production VC101 of the Gui'an resource pool (GA-VC1 and GA-VC2), the related PSCs, and other management virtual machines, and additionally carries the management VC102; the number of production VCs can be expanded further. Fig. 9 shows the deployment of the Gui'an management cluster. Each square in Fig. 9 represents a cabinet and each fill pattern represents a different host model. The management VC102 and the production VC101 are deployed in the same Gui'an management cluster, whose three hosts are the No. 1 servers of cabinets 4A-8, 6A-8, and 7A-8; these servers are each connected to the access switch of their own column. In addition, the virtual machine to host affinity function of DRS (Distributed Resource Scheduler) is used to pin the management VC102 and the two production VCs to different hosts, so that the management VC102 and a production VC101 never run on the same host at the same time. This gives the management cluster host redundancy and access network redundancy, further improves its reliability, and avoids the extreme risk of an entire cabinet losing power.
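The separation rule for the three VC virtual machines can be checked with a few lines of code. The sketch below only verifies a given placement; it does not configure DRS. The cabinet labels come from the description, while the host naming scheme, the `anti_affinity_conflicts` helper, and the placement of PSC-1 are assumptions for illustration.

```python
from itertools import combinations


def anti_affinity_conflicts(placement: dict[str, str], group: set[str]) -> list[tuple[str, str]]:
    """Return every pair of VMs in `group` that is placed on the same host."""
    by_host: dict[str, list[str]] = {}
    for vm in sorted(group):
        by_host.setdefault(placement[vm], []).append(vm)
    return [pair for vms in by_host.values() for pair in combinations(vms, 2)]


# One host per cabinet, as in the Gui'an management cluster.
placement = {
    "Mgmt-VC": "4A-8/server-1",
    "GA-VC1": "6A-8/server-1",
    "GA-VC2": "7A-8/server-1",
    "PSC-1": "4A-8/server-1",   # not part of the separated group (assumed placement)
}
vc_group = {"Mgmt-VC", "GA-VC1", "GA-VC2"}

ok = anti_affinity_conflicts(placement, vc_group)
bad = anti_affinity_conflicts({**placement, "GA-VC2": "4A-8/server-1"}, vc_group)
```

Because each host sits in a different cabinet, a placement with no conflicts also means that a single cabinet losing power takes down at most one of the three VCs.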
Because new virtual machines go online and configurations change every day, the configuration information of a production VC101 changes considerably. Therefore, based on the two-level VC management architecture, VMware vSphere Replication (data replication) is used to handle data backup and recovery for the production VC101. vSphere Replication is VMware's proprietary replication engine; it replicates powered-on virtual machines between vSphere hosts over the network by copying the changed data blocks to the recovery site, which keeps bandwidth usage low and allows more demanding recovery point objectives to be met. During initial synchronization it can use a "seed copy" of the virtual machine data, and afterwards it tracks the changed disk regions and replicates only the incremental data, which ensures efficient use of the network.
The management VC102 of this embodiment uses the VR (vSphere Replication, the replication engine) function of VMware to periodically copy the configuration data of each production VC101 from shared storage to the local hard disk of a physical server other than the one running that production VC101, thereby achieving regular automatic data replication for the production VC101. Under the two-level VC management architecture, VR works only while a VC is operating normally, so a production VC101 cannot restore itself through VR; recovery is possible only through the VR function of the management VC102, that is, the management VC102 performs fast recovery of the production VC101. Specifically, after a production VC101 fails, the management VC102 obtains that production VC101's configuration data from the local hard disk of the physical server holding the replica (a server different from the one that ran the failed production VC101) and uses the configuration data to quickly recover the failed production VC101 on a suitable physical server. The chosen server can be the physical server storing the backup data or another physical server in the management cluster with sufficient spare capacity. This avoids a situation in which the production VC101 cannot be recovered because the shared storage has failed.
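The replicate-then-recover flow can be illustrated with an in-memory sketch. It does not call vSphere Replication; `ReplicaStore`, `pick_replica_host`, the host names, and the configuration dictionary are all hypothetical stand-ins for local disks and replicated VMDK data.

```python
def pick_replica_host(vc_host: str, cluster_hosts: list[str]) -> str:
    """Choose a host other than the one currently running the production VC."""
    candidates = [h for h in cluster_hosts if h != vc_host]
    if not candidates:
        raise ValueError("replication needs at least two hosts in the cluster")
    return candidates[0]


class ReplicaStore:
    """In-memory stand-in for the local disks of the management cluster hosts."""

    def __init__(self) -> None:
        self._local_disks: dict[str, dict[str, dict]] = {}  # host -> {vc_name: config}

    def replicate(self, vc_name: str, config: dict, vc_host: str, cluster_hosts: list[str]) -> str:
        """Copy the VC configuration to the local disk of a different host."""
        target = pick_replica_host(vc_host, cluster_hosts)
        self._local_disks.setdefault(target, {})[vc_name] = dict(config)
        return target

    def recover(self, vc_name: str) -> tuple[str, dict]:
        """Return (host holding the replica, copy of the replicated configuration)."""
        for host, replicas in self._local_disks.items():
            if vc_name in replicas:
                return host, dict(replicas[vc_name])
        raise KeyError(vc_name)


cluster = ["esxi-1", "esxi-2", "esxi-3"]
store = ReplicaStore()
target = store.replicate("GA-VC1", {"version": 7}, vc_host="esxi-1", cluster_hosts=cluster)
replica_host, restored = store.recover("GA-VC1")
```

The replica never lands on the host running the VC and never depends on shared storage, so it survives both a single-host failure and a shared-storage failure, which is the property the recovery path relies on.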
Further, the virtual machine of the management VC102 is deployed on the shared storage, and the management VC102 uses the replication engine to copy its own configuration data to a physical server in the management cluster other than the one running the management VC102. When the management VC102 goes down, it is recovered from the backup data stored outside the shared storage. The configuration data of the management VC102 is backed up by periodic replication. Each backup is a full backup, so a ready-made virtual machine already exists and the recovery time of the management VC102 is simply the time needed to restart that virtual machine. After replication finishes, operation and maintenance personnel can immediately run functional verification on the replicated virtual machine to ensure that the service works properly when the management VC102 is restored; the recovery operation usually completes within 15 minutes. This cold backup of the management VC102 avoids a situation in which the management VC102 cannot be recovered because the shared storage has failed. In this embodiment, the cold backup method includes: (1) replicating the management VC102 at a fixed interval (for example, once a month); and (2) having the management VC102 use the replication engine to replicate the changed data immediately after a change in its configuration data is detected.
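The two cold backup triggers, (1) a fixed interval and (2) a detected configuration change, can be expressed as a small decision function. This is a sketch under assumptions: change detection is done here by hashing a configuration dictionary, the 30-day period stands in for "one month", and the function and label names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timedelta


def config_digest(config: dict) -> str:
    """Stable fingerprint of the configuration, used to detect changes."""
    return hashlib.sha256(json.dumps(config, sort_keys=True).encode()).hexdigest()


def backup_action(
    now: datetime,
    last_full: datetime,
    last_digest: str,
    config: dict,
    period: timedelta = timedelta(days=30),
) -> str:
    """Decide what to replicate: "full", "changed-data", or "none"."""
    if now - last_full >= period:
        return "full"            # trigger (1): the fixed interval has elapsed
    if config_digest(config) != last_digest:
        return "changed-data"    # trigger (2): the configuration has changed
    return "none"
```

A scheduler would call `backup_action` regularly, and after each replication it would store the new digest and, for a full copy, the new timestamp.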
As shown in Figure 10, the management cluster consists of three ESXi hosts, and a VMDK is deployed on each physical host. VMDK (VMware Virtual Machine Disk Format) is the virtual hard disk format created by VMware for virtual machines. VMDK files reside in the VMware file system, known as VMFS (Virtual Machine File System); each VMDK file represents one physical hard disk drive of a virtual machine on VMFS, and all user data and configuration information for the virtual server are stored in VMDK files. With the VR data replication solution, each of the multiple production VC101 on the shared storage is copied to the VMDK of a physical server other than the one running that production VC101. For the management VC102, the configuration information is stored on another physical server by cold backup. Regular, automatic data replication is thus achieved for both the production VC101 and the management VC102, so that system failures can be handled.
Based on the above cloud resource pool management system, an embodiment of the present invention provides a cloud resource pool management method which, as shown in Figure 3, comprises:
Step S1: managing the business virtual machines of a data center through a production VC101, wherein at least one production VC101 is deployed in each data center;
Step S2: uniformly managing and scheduling all production VC101 through the management VC102.
Each data center is deployed with PSC103, and the PSC103 of all data centers are cascaded; the production VC101 of each data center interacts with the management VC102 through the PSC103 of its own data center. Further, at least two PSC103 are deployed in each data center; any one PSC103 in a data center is set as the primary server, and the remaining PSC103 are set as standby servers.
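The primary/standby arrangement of the PSCs in one data center can be modelled as follows. This is a hypothetical sketch: the class names are illustrative, and promoting the first healthy standby when the primary is unavailable is an assumed failover policy, since the text only states which PSC is primary and which are standby.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PSC:
    """One PSC instance in a data center (hypothetical model)."""
    name: str
    healthy: bool = True


@dataclass
class DataCenter:
    name: str
    pscs: List[PSC] = field(default_factory=list)
    primary_index: int = 0  # any one PSC may be designated as primary

    def __post_init__(self) -> None:
        # The text requires at least two PSCs per data center.
        if len(self.pscs) < 2:
            raise ValueError("each data center needs at least two PSCs")

    def active_psc(self) -> Optional[PSC]:
        """Return the PSC through which the production VC should interact.

        Assumed policy: keep the primary while it is healthy, otherwise
        promote the first healthy standby.
        """
        if self.pscs[self.primary_index].healthy:
            return self.pscs[self.primary_index]
        for i, psc in enumerate(self.pscs):
            if psc.healthy:
                self.primary_index = i
                return psc
        return None


dc = DataCenter("dc-a", [PSC("psc-a1"), PSC("psc-a2")])
print(dc.active_psc().name)   # psc-a1
dc.pscs[0].healthy = False    # simulate a failure of the primary
print(dc.active_psc().name)   # psc-a2
```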
The management VC102 and the production VC101 are deployed as virtual machines in a management cluster composed of multiple physical servers.
The production VC101 and the management VC102 run on different physical servers.
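The separation requirement can be checked with a short helper. The sketch below assumes a simple mapping from virtual machine name to physical server; the function name and the sample names are hypothetical.

```python
from typing import Dict, List


def colocated_production_vcs(placement: Dict[str, str], management_vc: str) -> List[str]:
    """Return the production VCs that share a physical server with the management VC.

    `placement` maps a VC virtual machine name to the physical server it runs on.
    An empty result means the separation described in the text holds.
    """
    mgmt_host = placement[management_vc]
    return sorted(
        vc for vc, host in placement.items()
        if vc != management_vc and host == mgmt_host
    )


placement = {
    "mgmt-vc": "esxi-1",
    "prod-vc-a": "esxi-2",
    "prod-vc-b": "esxi-3",
}
print(colocated_production_vcs(placement, "mgmt-vc"))  # []
```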
Further, the method also comprises: after a production VC101 fails, obtaining the configuration data of the failed production VC101 through the management VC102, and using that configuration data to quickly restore the failed production VC101.
Further, the method also comprises: when a physical server in the management cluster fails, automatically restarting the virtual machine of the production VC101 that ran on the failed physical server on another physical server with spare capacity.
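The automatic restart on a server with spare capacity can be sketched as a greedy placement. The capacity units, the names, and the "largest virtual machine first, roomiest server first" heuristic are assumptions for illustration; the embodiment does not specify how the target server is selected.

```python
from typing import Dict


def restart_plan(
    spare: Dict[str, int], vms_on_failed: Dict[str, int], failed_host: str
) -> Dict[str, str]:
    """Assign each VM from the failed server to a surviving server.

    `spare` maps server name to spare capacity, and `vms_on_failed` maps VM
    name to the capacity it needs. Greedy heuristic (an assumption): place
    the largest VM first on the server with the most remaining capacity.
    """
    remaining = {h: c for h, c in spare.items() if h != failed_host}
    plan: Dict[str, str] = {}
    for vm, need in sorted(vms_on_failed.items(), key=lambda kv: -kv[1]):
        host = max(remaining, key=remaining.get, default=None)
        if host is None or remaining[host] < need:
            raise RuntimeError(f"not enough spare capacity to restart {vm}")
        plan[vm] = host
        remaining[host] -= need
    return plan


spare = {"esxi-1": 0, "esxi-2": 4, "esxi-3": 6}
vms = {"prod-vc-a": 4, "prod-vc-b": 3}
plan = restart_plan(spare, vms, failed_host="esxi-1")
print(plan)  # {'prod-vc-a': 'esxi-3', 'prod-vc-b': 'esxi-2'}
```

The function raises an error rather than overcommitting a server, which mirrors the requirement that the target must actually have spare capacity.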
Further, the method also comprises: periodically copying, by the management VC102 using the replication engine, the configuration data of the production VC101 on the shared storage to the local hard disk of a physical server other than the one running the production VC101; and, after the production VC101 fails, obtaining, by the management VC102, the configuration data of the failed production VC101 from the physical server that stores it, and using the obtained configuration data to quickly restore the failed production VC101.
Further, the method also comprises: using the replication engine to copy the configuration data of the management VC102 to a physical server in the management cluster other than the one running the management VC102.
It should also be noted that the exemplary embodiments of the present invention describe certain methods or systems in terms of a series of steps or devices. The present invention is not, however, limited to the order of the steps above: the steps may be performed in the order given in the embodiments, in a different order, or several steps may be performed simultaneously.
The above is merely a specific embodiment of the present invention. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here. It should be understood that the scope of protection of the present invention is not limited thereto; any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and these modifications or substitutions shall fall within the scope of protection of the present invention.

Claims (10)

1. A cloud resource pool management system, characterized in that the system comprises:
a production virtual architecture management center (VC), at least one production VC being deployed in each data center, the production VC being used to manage the business virtual machines of that data center; and
a management VC, the management VC being used to uniformly manage and schedule all production VCs.
2. The system according to claim 1, characterized in that the system comprises: an infrastructure controller (PSC) deployed in each data center, wherein the PSCs of all data centers are cascaded, and the production VC of each data center interacts with the management VC through the PSC of its own data center.
3. The system according to claim 2, characterized in that at least two PSCs are deployed in each data center, any one PSC of each data center is set as the primary server, and the remaining PSCs are set as standby servers.
4. The system according to claim 1, characterized in that the management VC is further used to, after a production VC fails, obtain the configuration data of the failed production VC and use the configuration data to quickly restore the failed production VC.
5. The system according to claim 4, characterized in that the management VC and the production VC are deployed as virtual machines in a management cluster composed of multiple physical servers.
6. The system according to claim 5, characterized in that the management VC is further used to, when a physical server in the management cluster fails, automatically restart the virtual machine of the production VC from the failed physical server on another physical server with spare capacity.
7. The system according to claim 5, characterized in that the management VC periodically uses a replication engine to copy the configuration data of the production VC on the shared storage to the local hard disk of a physical server other than the one running the production VC, and, after the production VC fails, the management VC obtains the configuration data of the failed production VC from the physical server that stores it and uses the obtained configuration data to quickly restore the failed production VC.
8. The system according to claim 5, characterized in that the production VC and the management VC run on different physical servers.
9. The system according to claim 5, characterized in that the management VC uses a replication engine to copy the configuration data of the management VC to a physical server in the management cluster other than the one running the management VC.
10. A cloud resource pool management method, characterized by comprising:
managing the business virtual machines of a data center through a production VC, wherein at least one production VC is deployed in each data center; and
uniformly managing and scheduling all production VCs through a management VC.
CN201711492177.8A 2017-12-30 2017-12-30 Cloud resource pond management system and method Pending CN109995560A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711492177.8A CN109995560A (en) 2017-12-30 2017-12-30 Cloud resource pond management system and method

Publications (1)

Publication Number Publication Date
CN109995560A true CN109995560A (en) 2019-07-09

Family

ID=67110145

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711492177.8A Pending CN109995560A (en) 2017-12-30 2017-12-30 Cloud resource pond management system and method

Country Status (1)

Country Link
CN (1) CN109995560A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114285865A (en) * 2021-12-28 2022-04-05 天翼云科技有限公司 Access authority control system for sharing cloud hard disk
CN114285865B (en) * 2021-12-28 2023-08-08 天翼云科技有限公司 Access authority control system for shared cloud hard disk

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190709