CN107168830A - Disaster recovery system and method based on a virtualization platform - Google Patents

Publication number
CN107168830A
Authority
CN
China
Prior art keywords
website
circuit
storage
storage virtualization
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710354793.0A
Other languages
Chinese (zh)
Inventor
魏朝辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201710354793.0A
Publication of CN107168830A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16: Error detection or correction of the data by redundancy in hardware
    • G06F11/20: Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053: Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a disaster recovery system based on a virtualization platform. The system comprises multiple sites, each of which includes a virtual machine cluster, a storage virtualization gateway, and a storage device. The virtual machine cluster includes multiple virtual machines, and the storage virtualization gateways of the sites are interconnected to form a storage virtualization gateway cluster. When a device in a first site fails, the storage virtualization gateway cluster switches the link connecting the first site to a link connecting a second site. Data is synchronized among the storage devices. When a failure occurs, the system can promptly switch the link connecting the failed site to a link connecting another site, which maintains the continuity of the enterprise's business and avoids the losses that the failure would otherwise cause. The invention also discloses a disaster recovery method based on this system, which provides the same benefits.

Description

Disaster recovery system and method based on a virtualization platform
Technical field
The present invention relates to the field of network communication technology, and in particular to a disaster recovery system and method based on a virtualization platform.
Background
At present, as network communication technology develops and enterprises continue to grow in scale, enterprise business depends more and more on the network. However, natural and man-made disasters can interrupt that business and cause enormous property losses. Modern enterprises therefore need a complete disaster recovery system to keep their business running normally.
With the spread of virtualization technology, more and more enterprises run their business on virtualization platforms, so they need to build disaster recovery for their virtual machine clusters to keep the business running normally.
In the prior art, an enterprise typically builds its business system by combining a virtualization platform with a single storage network. Referring to Fig. 1, multiple virtual machines 100 are interconnected to form a virtual machine cluster, which is connected to a storage device 300 through a gateway 200. When a virtual machine fails, the gateway can switch the link connecting the failed virtual machine to a link connecting another virtual machine, so that the business keeps running.
In the prior art, however, a failure of the storage device affects the whole system and interrupts the business of the entire enterprise, which can cause serious property losses.
Summary of the invention
In view of this, the main object of the present invention is to provide a disaster recovery system based on a virtualization platform that effectively avoids a single point of failure in the system. Another object of the present invention is to provide a disaster recovery method based on a virtualization platform that allows the business carried by the system to continue when a failure occurs.
To solve the above problem, the invention provides a disaster recovery system based on a virtualization platform. The system comprises multiple sites, each of which includes a virtual machine cluster, a storage virtualization gateway, and a storage device;
The virtual machine cluster is connected to the storage virtualization gateway, and the storage virtualization gateway is connected to the storage device. The virtual machine cluster includes multiple virtual machines, and the multiple storage virtualization gateways are interconnected to form a storage virtualization gateway cluster. When a device in a first site fails, the storage virtualization gateway cluster switches the link connecting the first site to a link connecting a second site;
Data is kept synchronized among the multiple storage devices.
Optionally, the storage virtualization gateway cluster is specifically configured to:
When a first virtual machine fails, switch the link connecting the first virtual machine to a link connecting a second virtual machine.
Optionally, the storage virtualization gateway cluster is specifically configured to:
When a first storage device fails, switch the link connecting the first storage device to a link connecting a second storage device.
Optionally, the storage virtualization gateway cluster is specifically configured to:
When a first storage virtualization gateway fails, switch the link connecting the first storage virtualization gateway to a link connecting a second storage virtualization gateway.
Optionally, the storage virtualization gateway is further configured to:
When the fault at the first site has been repaired, switch the link connecting the second site back to the link connecting the first site.
Optionally, an arbitration node is provided among the multiple storage virtualization gateways to prevent split-brain.
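The text does not say how the arbitration node prevents split-brain. The sketch below shows one common tie-breaker pattern, offered as an assumption rather than as the patent's mechanism: a gateway that loses contact with its peers may keep serving only if the arbiter grants it ownership, and the arbiter grants ownership to one gateway at a time. The class `Arbiter` and its methods are hypothetical names.

```python
class Arbiter:
    """Hypothetical tie-breaker: lets only one gateway serve at a time."""

    def __init__(self):
        self.owner = None  # gateway currently allowed to serve, if any

    def request(self, gateway: str) -> bool:
        # The first requester becomes the owner; later requesters are refused
        # until the owner releases its grant.
        if self.owner is None:
            self.owner = gateway
        return self.owner == gateway

    def release(self, gateway: str) -> None:
        # Only the current owner can give up the grant.
        if self.owner == gateway:
            self.owner = None
```

With this arrangement, two gateways that can no longer see each other cannot both accept writes, because at most one of them holds the grant.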
The present invention also provides a disaster recovery method based on a virtualization platform, comprising:
When a device of a first site among multiple sites fails, the storage virtualization gateway cluster obtains information indicating that the device of the first site has failed; wherein each site includes a virtual machine cluster, a storage virtualization gateway, and a storage device, the virtual machine cluster includes multiple virtual machines, the multiple storage virtualization gateways are interconnected to form the storage virtualization gateway cluster, and data is kept synchronized among the multiple storage devices;
The storage virtualization gateway cluster switches the link connecting the first site to a link connecting a second site among the sites.
Optionally, the step in which the storage virtualization gateway cluster switches the link connecting the first site to the link connecting the second site among the sites includes:
When a first virtual machine fails, the storage virtualization gateway cluster switches the link connecting the first virtual machine to a link connecting a second virtual machine.
Optionally, the step in which the storage virtualization gateway cluster switches the link connecting the first site to the link connecting the second site among the sites includes:
When a first storage device fails, the storage virtualization gateway cluster switches the link connecting the first storage device to a link connecting a second storage device.
Optionally, the step in which the storage virtualization gateway cluster switches the link connecting the first site to the link connecting the second site among the sites includes:
When a first storage virtualization gateway fails, the storage virtualization gateway cluster switches the link connecting the first storage virtualization gateway to a link connecting a second storage virtualization gateway.
The system provided by the present invention has multiple sites, each of which includes a virtual machine cluster, a storage virtualization gateway, and a storage device. The data stored at the sites is essentially identical, so when a failure occurs the link connecting the failed site can be promptly switched to a link connecting another site, which maintains business continuity and avoids the losses the failure would cause the enterprise. The present invention also provides a disaster recovery method that has the same benefits, which are not repeated here.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them.
Fig. 1 shows a prior-art disaster recovery system based on a virtualization platform;
Fig. 2 is a schematic structural diagram of the first disaster recovery system provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the second disaster recovery system provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the third disaster recovery system provided by an embodiment of the present invention;
Fig. 5 is a flowchart of a disaster recovery method provided by an embodiment of the present invention;
Fig. 6 is a flowchart of a specific implementation of step 102 in Fig. 5.
Detailed description
To help those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
The present invention is a disaster recovery system based on a virtualization platform. Prior-art systems have a single point of failure: when the storage network fails, the whole system is affected and the enterprise suffers property losses. The system provided by the present invention has no single point of failure. When any device in a site fails, the link connecting the failed device can be switched to a link connecting another device of the same kind, which maintains business continuity and avoids the losses the failure would cause the enterprise.
The present invention is described in detail below with reference to the drawings.
Referring to Fig. 2, which is a schematic structural diagram of the first disaster recovery system provided by an embodiment of the present invention, the system includes:
Multiple sites 101, each of which includes a virtual machine cluster 102, a storage virtualization gateway 105, and a storage device 107.
This embodiment provides a disaster recovery system with no single point of failure. The system includes multiple sites 101, and each site includes one or more of each of the above devices: one or more virtual machine clusters 102, one or more storage virtualization gateways 105, and one or more storage devices 107. The disaster recovery system as a whole therefore includes multiple virtual machine clusters 102, multiple storage virtualization gateways 105, and multiple storage devices 107.
The virtual machine cluster 102 is connected to the storage virtualization gateway 105, and the storage virtualization gateway 105 is connected to the storage device 107. The virtual machine cluster 102 includes multiple virtual machines, such as virtual machine 103 and virtual machine 104. The multiple storage virtualization gateways 105 are interconnected to form a storage virtualization gateway cluster 106. When a device in the first site fails, the storage virtualization gateway cluster 106 switches the link connecting the first site to a link connecting the second site.
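The structure just described can be modeled in a few lines. This is a minimal sketch under simplifying assumptions, not the patent's implementation: each site holds exactly one gateway and one storage device, and the names `Site`, `GatewayCluster`, and `report_failure` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    vms: list[str]   # virtual machines forming the site's VM cluster
    gateway: str     # storage virtualization gateway of the site
    storage: str     # storage device behind the gateway
    failed: set[str] = field(default_factory=set)  # devices known to be faulty

    def healthy(self) -> bool:
        return not self.failed


@dataclass
class GatewayCluster:
    sites: list[Site]
    active: int = 0  # index of the site whose link currently carries traffic

    def report_failure(self, site_name: str, device: str) -> str:
        """Record a device failure and return the name of the site now in use."""
        for site in self.sites:
            if site.name == site_name:
                site.failed.add(device)
        if not self.sites[self.active].healthy():
            # Switch the link to the first site that has no failed device.
            # If no such site exists, the active site is left unchanged.
            for index, site in enumerate(self.sites):
                if site.healthy():
                    self.active = index
                    break
        return self.sites[self.active].name
```

For example, with two sites A and B, reporting a storage failure in A moves the active link to B.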
The devices can be connected in various ways: with network cables, with optical fiber, or with network cables between some devices and optical fiber between others. For example, because the storage virtualization gateway cluster 106 has to perform many judgment and switching steps, the storage virtualization gateways 105 can be connected to one another with optical fiber to keep the cluster fast. When optical fiber is used, optical amplifiers can also be placed on the fiber links and DWDM (dense wavelength-division multiplexing) can be used to carry a group of optical wavelengths over a single fiber. This reduces the number of fibers the system needs, which saves cost, or increases the data transmission bandwidth when the number of fibers stays the same.
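The trade-off between fiber count and bandwidth can be made concrete with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the figures in the usage note (40 wavelengths per fiber, 10 Gb/s per wavelength) are assumed values that do not come from the patent.

```python
def fibers_needed(channels: int, wavelengths_per_fiber: int) -> int:
    """Number of fibers required to carry the given number of channels."""
    # Ceiling division: a partly filled fiber still counts as one fiber.
    return -(-channels // wavelengths_per_fiber)


def aggregate_gbps(fibers: int, wavelengths_per_fiber: int, gbps_per_wavelength: int) -> int:
    """Total bandwidth when every wavelength on every fiber is in use."""
    return fibers * wavelengths_per_fiber * gbps_per_wavelength
```

Under those assumed figures, eight channels that would need eight fibers without multiplexing fit on one fiber, and keeping all eight fibers instead raises the aggregate bandwidth from 80 Gb/s to 3200 Gb/s.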
In this embodiment, the virtual machines 103 and 104 in the virtual machine cluster 102 can be created with KVM (Kernel-based Virtual Machine), the system virtualization module that has been integrated into the Linux kernel since version 2.6.20. The virtual machines 103 and 104 that form the virtual machine cluster 102 can also be created in other ways; this embodiment imposes no specific limitation. KVM is the usual choice, however, because it is open source, its core source code is small, and it is convenient to use.
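On a Linux host, KVM is exposed to user space through the `/dev/kvm` device node, so a host can be checked before virtual machines are created on it. The helper below is a minimal sketch; the function name `kvm_available` is hypothetical, and the check only confirms that the device node exists and is accessible to the current user.

```python
import os


def kvm_available(device_path: str = "/dev/kvm") -> bool:
    """Return True if the KVM device node exists and is readable and writable."""
    return os.path.exists(device_path) and os.access(device_path, os.R_OK | os.W_OK)
```

A provisioning script could call `kvm_available()` on each host and skip hosts where it returns `False`.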
In this embodiment, the first site and the second site do not refer to particular sites. For convenience of description, the site that has failed is called the first site, and the new site to which the link is switched is called the second site. If the new site is itself found to have failed after the switch, the link can be switched again to a link connecting yet another new site; the failed site is then still called the first site, and the new site switched to is called the second site.
Alternatively, the candidate second site can be evaluated before the switch. If the candidate has not failed, the link connecting the first site is switched to the link connecting it. If the candidate has failed, the next new site is evaluated, and so on until a site without a fault is found; the link connecting the first site is then switched to the link connecting that site. This evaluation is typically performed by the storage virtualization gateway cluster 106, but it can also be performed by another device or module; this embodiment imposes no specific limitation.
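The evaluate-before-switching procedure amounts to scanning the candidate sites in order and taking the first one that passes a health check. A minimal sketch follows; `pick_second_site` and the `is_healthy` callback are hypothetical names, and how health is actually assessed is left open, as it is in the text.

```python
from typing import Callable, Iterable, Optional


def pick_second_site(
    candidates: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate site that passes the health check, or None."""
    for site in candidates:
        if is_healthy(site):
            return site  # this site becomes the "second site"
    return None  # every candidate is faulty, so there is nothing to switch to
```

The caller decides what to do when `None` is returned, for example keep retrying or raise an alarm.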
When a device in the first site fails, the storage virtualization gateway cluster 106 first obtains information indicating that the device in the first site has failed. This information can be obtained in several ways. For example, a failing device can send a fault message; when the storage virtualization gateway cluster 106 receives the fault message, it switches the link connecting the failed site to a link connecting another site so that the whole system keeps running. Alternatively, if the fault is so severe that the device cannot send data to the storage virtualization gateway cluster 106, the cluster starts a timer when the data from the device is interrupted. When the interruption reaches a preset threshold, the cluster judges that the device has failed and switches the link connecting the site that contains the failed device to a link connecting a new site.
For example, when the data from a device has been interrupted for 10 seconds, the link connecting the site that contains the failed device is switched to a link connecting a new site so that the whole system keeps running normally. The preset threshold can be the duration of a single interruption, or it can be the number of times the device's data is interrupted within a period of time. For example, when the data from a device is interrupted 5 times within 60 seconds, the device is likewise judged to have failed, and the link connecting the site that contains it is switched to a link connecting a new site. Other kinds of failure may occur in practice, and the judgment is not limited to the three methods above; whichever method is used to detect a device failure, the implementation of the present invention is not affected.
Usually the three methods are combined: when the storage virtualization gateway cluster 106 receives a fault message from the failed device, or the data interruption reaches the preset duration threshold, or the number of interruptions within a period of time reaches the preset count threshold, the link connecting the site that contains the failed device is switched to a link connecting a new site so that the whole system keeps running normally. It is of course also possible to use only one of the three methods; this embodiment imposes no specific limitation.
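The combined rule can be sketched as a small detector that reports a failure as soon as any of the three conditions holds. This is an illustration under assumptions: the class and method names are hypothetical, times are passed in as plain seconds so the logic is easy to test, and the defaults (10 seconds, 5 interruptions in 60 seconds) are the example values from the text.

```python
from collections import deque


class FailureDetector:
    def __init__(self, start=0.0, outage_limit=10.0, max_interruptions=5, window=60.0):
        self.outage_limit = outage_limit            # longest tolerated silence, in seconds
        self.max_interruptions = max_interruptions  # interruption count that signals failure
        self.window = window                        # period over which interruptions are counted
        self.last_data = start                      # time data was last received
        self.interruptions = deque()                # timestamps of recent interruptions
        self.fault_reported = False

    def on_fault_message(self):
        self.fault_reported = True

    def on_data(self, now):
        self.last_data = now

    def on_interruption(self, now):
        self.interruptions.append(now)

    def is_failed(self, now):
        # Forget interruptions that fall outside the counting window.
        while self.interruptions and now - self.interruptions[0] > self.window:
            self.interruptions.popleft()
        return (
            self.fault_reported
            or now - self.last_data >= self.outage_limit
            or len(self.interruptions) >= self.max_interruptions
        )
```

The gateway cluster would poll `is_failed` for each monitored device and trigger the link switch when it returns `True`.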
Data is synchronized among the multiple storage devices 107.
While the system is running, because data is synchronized among the storage devices 107, data is in effect synchronized among the sites 101, so the data held at the sites 101 is essentially identical. The sites 101 can therefore serve as active and standby sites for one another. When a device in a site fails, or an entire site fails because of a natural or man-made disaster, the disaster recovery system provided by this embodiment can switch the link connecting the failed site to a link connecting a new site, so that the whole system keeps running normally and its work proceeds without interruption.
Because the disaster recovery system provided by this embodiment has multiple storage devices 107, these storage devices form a storage network within the system. The storage network can be present in every site and stores the data that the system produces.
In this embodiment, the storage device 107 can be a disk array or another device with a storage function. A disk array is the usual choice because it can store the same data in different places on multiple hard disks, which provides data redundancy, protects the data, and improves fault tolerance. It can also overlap input and output operations in a balanced way, which improves performance. For these reasons a disk array is normally chosen as the storage device 107, although another device with a storage function can also be used; no specific limitation is imposed here.
The data synchronization can be performed by the storage virtualization gateway cluster 106, by another device, or by another module with a data synchronization function; no specific limitation is imposed here. The synchronization can be carried out synchronously or asynchronously; either mode solves the problem that the disaster recovery system provided by this embodiment is intended to solve.
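The difference between the two synchronization modes can be shown with an in-memory sketch. It is only an illustration: `ReplicatedStore` is a hypothetical name, dictionaries stand in for storage devices, and a real system would also have to handle ordering, acknowledgements, and failures during replication.

```python
class ReplicatedStore:
    def __init__(self, replicas: int, synchronous: bool = True):
        self.replicas = [dict() for _ in range(replicas)]  # replica 0 is the primary
        self.synchronous = synchronous
        self.pending = []  # writes not yet applied to the secondary replicas

    def write(self, key, value):
        self.replicas[0][key] = value
        if self.synchronous:
            # Synchronous mode: every replica is updated before the write returns.
            for replica in self.replicas[1:]:
                replica[key] = value
        else:
            # Asynchronous mode: the write returns at once and is replicated later.
            self.pending.append((key, value))

    def flush(self):
        """Apply queued writes to the secondary replicas (asynchronous mode)."""
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()
```

In asynchronous mode the secondaries can lag behind the primary until `flush` runs, which is why the text describes the data at the sites as essentially, rather than strictly, identical.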
The disaster recovery system based on a virtualization platform provided by this embodiment has multiple sites 101, each of which includes a virtual machine cluster 102, a storage virtualization gateway 105, and a storage device 107. The data stored at the sites is essentially identical, so when a failure occurs the link connecting the failed site can be promptly switched to a link connecting another site, which maintains business continuity and avoids the losses the failure would cause the enterprise.
In some cases, when one particular device fails, it is enough to switch only the link connecting the failed device to a link connecting one of the remaining devices, which keeps the whole system running smoothly. The details are described in the following embodiment.
Refer to Fig. 3, which is a schematic structural diagram of the second disaster recovery system provided by an embodiment of the present invention. This embodiment differs from the previous one in that, when a particular device fails, the link connecting the whole site does not need to be switched. Instead, the link connecting the failed device is switched to a link connecting another device of the same kind, so that the whole system keeps running normally and smoothly when a failure occurs.
In this embodiment, the storage virtualization gateway cluster 106 is specifically:
A device that, when a first virtual machine fails, switches the link connecting the first virtual machine to a link connecting a second virtual machine.
The first virtual machine and the second virtual machine are not the names of particular virtual machines. Any virtual machine that has failed is called the first virtual machine, and any new virtual machine to which the link is switched is called the second virtual machine. If the new virtual machine is itself faulty, the link can be switched again to a link connecting another new virtual machine, until the virtual machine switched to is available. The failed virtual machine is then still called the first virtual machine, and the new virtual machine switched to is called the second virtual machine.
Alternatively, the candidate second virtual machine can be evaluated before the switch. If it has not failed, the link connecting the first virtual machine is switched to the link connecting it. If it has failed, the next new virtual machine is evaluated, and so on until a virtual machine without a fault is found; that virtual machine is called the second virtual machine, and the link connecting the first virtual machine is switched to the link connecting it. This evaluation is typically performed by the storage virtualization gateway cluster 106, but it can also be performed by another device or module; whichever device performs it, the implementation of this embodiment is not affected.
A device that, when a first storage device fails, switches the link connecting the first storage device to a link connecting a second storage device.
The first storage device and the second storage device are not the names of particular storage devices. Any storage device that has failed is called the first storage device, and any new storage device to which the link is switched is called the second storage device.
The remaining switching steps are similar to those for switching from the first virtual machine to the second virtual machine; see the description of virtual machine link switching above. They are not repeated here.
A device that, when a first storage virtualization gateway fails, switches the link connecting the first storage virtualization gateway to a link connecting a second storage virtualization gateway.
The first storage virtualization gateway and the second storage virtualization gateway are not the names of particular storage virtualization gateways. Any storage virtualization gateway that has failed is called the first storage virtualization gateway, and any new storage virtualization gateway to which the link is switched is called the second storage virtualization gateway.
The remaining switching steps are similar to those for switching from the first virtual machine to the second virtual machine; see the description of virtual machine link switching above. They are not repeated here.
When a failure occurs, the storage virtualization gateway cluster can switch the link connecting the failed device to a link connecting one of the remaining devices of the same kind so that the system as a whole keeps running. After the fault is repaired, the storage virtualization gateway cluster can also switch the link back to the original device.
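Device-level switching works the same way for virtual machines, storage devices, and gateways, so it can be sketched once for a pool of devices of a single kind. The class `DevicePool` and its method are hypothetical names, and raising an error when no healthy peer remains is an assumption; the text does not say what happens in that case.

```python
class DevicePool:
    """Devices of one kind: virtual machines, storage devices, or gateways."""

    def __init__(self, devices):
        self.devices = list(devices)
        self.failed = set()
        self.active = self.devices[0]  # device the link currently connects to

    def mark_failed(self, device):
        """Record a failure and return the device the link now connects to."""
        self.failed.add(device)
        if device != self.active:
            return self.active  # a standby failed, so no switch is needed
        for peer in self.devices:
            if peer not in self.failed:
                self.active = peer  # switch the link to a healthy peer of the same kind
                return peer
        raise RuntimeError("no healthy device of this kind is left")
```

One pool per device kind keeps a storage failure from triggering a virtual machine switch, which matches the idea of switching only the affected link.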
In embodiments of the present invention, the virtualization gateway cluster 106 is additionally operable to:
When fault restoration is completed, the circuit switching that connects second website will be switched to connecting first website Circuit.
When fault restoration is completed, the circuit switching for connecting the second website can be returned by connection first by staff manually The circuit of website or from staff repaired failure it is rear to virtualization gateway cluster 106 sent reparation The information of completion, or the instruction for the circuit that the circuit switching that connect the second website can be connected to the first website is sent, afterwards After virtualization gateway cluster 106 receives information or instructed, connection will be switched to by virtualization gateway cluster 106 described The circuit switching of second website is to the circuit for connecting first website.
Certainly, the information that above-mentioned reparation has been completed either is sent out can connect first by the circuit switching that connect the second website The circuit of website instruction can staff send or sent automatically by the equipment being repaired, herein It is not specifically limited.
Further, the storage virtualization gateway cluster 106 may itself determine automatically that the first site has been repaired. For example, after the storage virtualization gateway cluster 106 receives data sent by the failed device in the first site, it switches the link that was switched to the second site back to the link connected to the first site. To confirm that the failed device has really been fully repaired, a time threshold may also be preset: when the length of time for which the storage virtualization gateway cluster 106 has continuously received data from the failed device in the first site reaches the preset threshold, the cluster switches the link that was switched to the second site back to the link connected to the first site. For example, the cluster starts timing when it first receives data from the failed device in the first site, and once data has been received continuously for 10 seconds it switches the link back to the first site. Besides a time threshold, a capacity threshold may be set: when the amount of data sent or downloaded by the failed device in the first site reaches the preset capacity threshold, the storage virtualization gateway cluster 106 switches the link that was switched to the second site back to the link connected to the first site. Other thresholds, or other methods, may of course also be used to switch the link back; no specific limitation is imposed here.
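The automatic repair check described above can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: the class name `FailbackDetector`, its parameters, and in particular `max_gap_s` (how long a silence may last before reception no longer counts as "continuous") are assumptions introduced here. The sketch also assumes that the byte counter restarts whenever reception is interrupted, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailbackDetector:
    """Decides when a formerly failed device has shown enough healthy traffic."""
    time_threshold_s: float = 10.0   # continuous reception required (the 10 s example)
    max_gap_s: float = 1.0           # assumption: a longer silence breaks continuity
    capacity_threshold_bytes: Optional[int] = None  # optional volume criterion

    _streak_start: Optional[float] = None
    _last_seen: Optional[float] = None
    _bytes_seen: int = 0

    def on_data(self, now: float, nbytes: int) -> bool:
        """Record received data; return True when the link may be switched back."""
        if self._last_seen is None or now - self._last_seen > self.max_gap_s:
            # First data after the failure, or reception was interrupted:
            # restart both the timer and the byte counter.
            self._streak_start = now
            self._bytes_seen = 0
        self._last_seen = now
        self._bytes_seen += nbytes

        long_enough = now - self._streak_start >= self.time_threshold_s
        big_enough = (self.capacity_threshold_bytes is not None
                      and self._bytes_seen >= self.capacity_threshold_bytes)
        return long_enough or big_enough
```

A caller would feed every received packet into `on_data` and trigger the switch back to the first site the first time it returns `True`. Either criterion alone is sufficient in this sketch; a stricter policy could require both.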
In an embodiment of the present invention, if the link connected to the entire failed site was switched to a link connected to a new site when the failure occurred, then when the failure has been repaired the storage virtualization gateway cluster 106 switches the link that was switched to the second site back to the link connected to the first site. If instead only the link connected to a specific failed device was switched to a link connected to another device of the same type, then when the failure has been repaired the storage virtualization gateway cluster 106 correspondingly switches the link connected to that other device back to the link connected to the original failed device.
Specifically, when the failure of the first virtual machine has been repaired, the storage virtualization gateway cluster 106 is specifically configured to switch the link connected to the second virtual machine back to the link connected to the first virtual machine; when the failure of the first storage device has been repaired, the storage virtualization gateway cluster 106 is specifically configured to switch the link connected to the second storage device back to the link connected to the first storage device; and when the failure of the first storage virtualization gateway has been repaired, the storage virtualization gateway cluster 106 is specifically configured to switch the link connected to the second storage virtualization gateway back to the link connected to the first storage virtualization gateway.
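The rule that failback reverses exactly what failover changed, whether a whole site or a single device, can be illustrated with a small bookkeeping structure. The sketch below is hypothetical: `LinkTable`, its methods, and the link names are invented for illustration and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LinkTable:
    """Tracks which target each logical link currently points to."""
    active: Dict[str, str] = field(default_factory=dict)    # link -> current target
    original: Dict[str, str] = field(default_factory=dict)  # link -> target before failover

    def fail_over(self, link: str, standby: str) -> None:
        # Remember the original target only once, so that repeated failovers
        # still fail back to the device that was used before the first failure.
        self.original.setdefault(link, self.active[link])
        self.active[link] = standby

    def fail_back(self, link: str) -> bool:
        """Restore a link to its original target; False if it was never switched."""
        if link not in self.original:
            return False
        self.active[link] = self.original.pop(link)
        return True


table = LinkTable(active={"site": "site-1", "vm": "vm-1"})
table.fail_over("vm", "vm-2")   # device-level failover: only the VM link moves
table.fail_back("vm")           # repair complete: the VM link returns to vm-1
```

Because only the switched link is recorded in `original`, a device-level failover is undone at device level and a site-level failover at site level, matching the two cases distinguished above.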
In the second disaster tolerance system based on a virtualization platform provided by the embodiment of the present invention, when a specific device fails, the link connected to that device can be switched to a link connected to another device of the same type, without switching the link for the entire site, which avoids wasting resources in the overall system. When the failed device has been repaired, in a further embodiment of the present invention the storage virtualization gateway cluster 106 can automatically switch the link connected to the new device back to the link connected to the original failed device, so as to keep the overall system working in an orderly manner and avoid wasting resources.
In the disaster tolerance system provided by the present invention, whether each device has failed is generally judged by the storage virtualization gateway cluster 106. Because this judgment is made by a cluster, split-brain may occur: multiple sites each judge the other to have failed, so that the whole system splits into several small systems that work independently. When this happens, an arbitration node is needed. The arbitration node is described in detail in the following embodiment.
Refer to Fig. 4, which is a schematic structural diagram of the third disaster tolerance system provided by an embodiment of the present invention. Compared with the systems of the preceding embodiments, the system of this embodiment has an arbitration node 401 connected between the multiple storage virtualization gateways 105; that is, an arbitration node 401 is added to the storage virtualization gateway cluster 106 to prevent split-brain.
As noted, whether each device has failed is generally judged by the storage virtualization gateway cluster 106, and because the judgment is made by a cluster, the following situation may arise: multiple sites each judge the other to have failed, so that the whole system splits into several subsystems that work independently.
This situation is a split-brain of the system. After split-brain occurs, the different subsystems may contend for shared resources, read and write shared storage at the same time, and so on, and these problems can cause great damage to servers and stored data.
In the disaster tolerance system provided by the present invention, an arbitration node 401 is connected between the multiple storage virtualization gateways 105, that is, added to the storage virtualization gateway cluster 106, to prevent split-brain. The arbitration node 401 can arbitrate in many ways. For example, when split-brain occurs, the arbitration node 401, which has an odd-number property, may decide by voting which storage virtualization gateway is correct; subsequent actions are then carried on by the storage virtualization gateway judged to be correct, which prevents split-brain. Alternatively, when split-brain occurs, the arbitration node 401 may reacquire the fault information and judge which storage virtualization gateway is correct; subsequent actions may then be carried on by the arbitration node 401, which likewise prevents split-brain.
Besides the workflows described above, arbitration nodes 401 that work in other ways may also be used; any arbitration node will do as long as it prevents split-brain. Of course, the present invention may also be implemented without the arbitration node 401. For example, in a concrete application, when a storage virtualization gateway 105 appears to have failed, the storage virtualization gateway cluster 106 selects an odd number of storage virtualization gateways 105 to judge by voting whether that gateway has failed. Since split-brain cannot arise in that case, the arbitration node 401 need not be added to the storage virtualization gateway cluster. The specific system architecture therefore depends on the particular situation and is not specifically limited here.
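The odd-number voting idea can be sketched as a simple majority count. The function name and the voter identifiers below are hypothetical; the patent does not specify a voting protocol, so this shows only why an odd number of voters rules out a tie.

```python
from typing import Mapping


def majority_says_failed(votes: Mapping[str, bool]) -> bool:
    """Return True if a strict majority of voters judge the target to have failed.

    `votes` maps a voter name to that voter's judgment (True = "failed").
    An odd number of voters is required so that a tie cannot occur.
    """
    if len(votes) % 2 == 0:
        raise ValueError("an odd number of voters is required to rule out ties")
    failed_votes = sum(1 for judged_failed in votes.values() if judged_failed)
    return failed_votes > len(votes) // 2


# Two sites disagree about whether site 1 has failed; a third voter
# (for example an arbitration node) breaks the deadlock.
decision = majority_says_failed({"gw-site1": False, "gw-site2": True, "arbiter": True})
```

With two voters that disagree, neither side could win, which is the split-brain condition; adding one more voter, whether an arbitration node or a third gateway, guarantees a decision.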
In the third disaster tolerance system based on a virtualization platform provided by the embodiment of the present invention, when split-brain occurs, arbitration can be performed by the arbitration node 401, which resolves the split-brain and avoids the damage it would cause to the system's servers and stored data.
Refer to Fig. 5 and Fig. 6. Fig. 5 is a flowchart of a disaster tolerance method provided by an embodiment of the present invention; Fig. 6 is a flowchart of a specific implementation of step 102 in Fig. 5.
The disaster tolerance method provided by the embodiment of the present invention is applied to the disaster tolerance system based on a virtualization platform described in the above embodiments. That system has been described in detail above and is not described again here; for details, see the above embodiments.
The disaster tolerance method provided by the embodiment of the present invention specifically includes:
Step 101: When a device in the first site fails, the storage virtualization gateway cluster obtains information that the device in the first site has failed.
In this step, the failure information can be obtained in many ways. For example, a device may send a fault message when it fails; when the storage virtualization gateway cluster receives the fault message sent by the failed device, it switches the link connected to the failed site to a link connected to another site, so that the whole system keeps running. Alternatively, the failure may be so severe that the device cannot send data to the storage virtualization gateway cluster. In that case the cluster starts timing when the device's data is interrupted, and when the interruption has lasted for a preset threshold it judges that the device has failed and switches the link connected to the site containing the failed device to a link connected to a new site. For example, when a device's data has been interrupted for 10 seconds, the link connected to the site containing that device is switched to a link connected to a new site, to ensure normal operation of the whole system. The preset threshold may be the duration of a single interruption, or it may be the number of times the device's data is interrupted within a period of time. For example, when a device's data has been interrupted 5 times within 60 seconds, the device is likewise judged to have failed, and the link connected to the site containing it is switched to a link connected to a new site. In practice other kinds of failure may also be encountered, and the determination methods are not limited to the three above; whichever method is used to judge that a device has failed, the implementation of the present invention is not affected.
Normally the three methods are used in combination: when the storage virtualization gateway cluster receives a fault message from the failed device, or when a data interruption lasts for the preset duration, or when the number of data interruptions within a period of time reaches the preset count, whichever of the three occurs, the link connected to the site containing the failed device is switched to a link connected to a new site, to ensure normal operation of the whole system. Of course, just one of the three methods may be chosen; no specific limitation is imposed in the embodiments of the present invention.
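The combined use of the three criteria can be sketched as a small detector that reports a failure as soon as any one of them is met. Everything below is an illustrative assumption: the class `FailureDetector`, its method names, and the way an interruption is signalled are hypothetical, while the default values reuse the 10 second and "5 times within 60 seconds" examples from the text.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class FailureDetector:
    outage_threshold_s: float = 10.0  # duration of a single interruption
    max_interruptions: int = 5        # number of interruptions ...
    window_s: float = 60.0            # ... within this period

    _outage_start: Optional[float] = None
    _interruptions: Deque[float] = field(default_factory=deque)
    _fault_reported: bool = False

    def on_fault_message(self) -> None:
        """Criterion 1: the device itself reported a fault."""
        self._fault_reported = True

    def on_data(self, now: float) -> None:
        """Data arrived, so any ongoing interruption has ended."""
        self._outage_start = None

    def on_interruption(self, now: float) -> None:
        """The data stream stopped; count it once per interruption."""
        if self._outage_start is None:
            self._outage_start = now
            self._interruptions.append(now)

    def is_failed(self, now: float) -> bool:
        # Forget interruptions that fall outside the sliding window.
        while self._interruptions and now - self._interruptions[0] > self.window_s:
            self._interruptions.popleft()
        # Criterion 2: a single interruption has lasted too long.
        outage_too_long = (self._outage_start is not None
                           and now - self._outage_start >= self.outage_threshold_s)
        # Criterion 3: too many interruptions within the window.
        too_many = len(self._interruptions) >= self.max_interruptions
        return self._fault_reported or outage_too_long or too_many
```

A monitoring loop would call `is_failed` periodically for each device and start the link switch of step 102 when it returns `True`. Using only one of the criteria, as the text also permits, amounts to ignoring the other two terms of the final `or`.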
Step 102: The storage virtualization gateway cluster switches the link connected to the first site to the link connected to the second site.
In this step, the first site, the second site, and the switching from the first site to the second site have been described in detail in the above embodiments and are not elaborated again here.
In this step, when a device in the first site fails, the link connected to the entire site is switched to the link connected to the second site. More specifically, when a device in a site fails, only the link connected to the failed device may be switched to a link connected to another device of the same type, which saves system resources. The specific steps are as follows:
Step 201: When the first virtual machine fails, the storage virtualization gateway cluster switches the link connected to the first virtual machine to the link connected to the second virtual machine.
In this step and in steps 202 and 203 below, the first virtual machine, the second virtual machine, the first storage device, the second storage device, the first storage virtualization gateway and the second storage virtualization gateway have been described in detail in the above embodiments and are not repeated here.
Step 202: When the first storage device fails, the storage virtualization gateway cluster switches the link connected to the first storage device to the link connected to the second storage device.
Step 203: When the first storage virtualization gateway fails, the storage virtualization gateway cluster switches the link connected to the first storage virtualization gateway to the link connected to the second storage virtualization gateway.
Steps 201 to 203 are performed so that, when a device in a site fails, only the link connected to the failed device is switched to a link connected to another device of the same type, which saves system resources.
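Steps 201 to 203 share one pattern: find a healthy device of the same type as the failed one and switch the link to it. A minimal sketch of that selection follows; the names `DeviceKind` and `pick_replacement`, and the inventory and health dictionaries, are hypothetical and serve only to illustrate the pattern.

```python
from enum import Enum
from typing import Dict, List, Optional


class DeviceKind(Enum):
    VIRTUAL_MACHINE = "virtual machine"                  # step 201
    STORAGE_DEVICE = "storage device"                    # step 202
    STORAGE_GATEWAY = "storage virtualization gateway"   # step 203


def pick_replacement(failed: str, kind: DeviceKind,
                     inventory: Dict[DeviceKind, List[str]],
                     healthy: Dict[str, bool]) -> Optional[str]:
    """Return a healthy device of the same kind as `failed`, or None if none exists."""
    for candidate in inventory.get(kind, []):
        # Devices without a recorded health status are treated as unhealthy.
        if candidate != failed and healthy.get(candidate, False):
            return candidate
    return None


inventory = {DeviceKind.VIRTUAL_MACHINE: ["vm-1", "vm-2", "vm-3"]}
healthy = {"vm-1": False, "vm-2": True, "vm-3": True}
target = pick_replacement("vm-1", DeviceKind.VIRTUAL_MACHINE, inventory, healthy)
```

When no same-type replacement is available (`None`), one plausible policy, not stated in the source, is to fall back to the site-level switch of step 102.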
Further, when the failed device has been repaired, the storage virtualization gateway cluster can switch the link that was switched to the second site back to the link connected to the first site.
Through the above switching steps, when the failed device has been repaired, the storage virtualization gateway cluster can automatically switch the link connected to the new device back to the link connected to the original failed device, to keep the overall system working in an orderly manner and avoid wasting resources. The specific repair steps have been described in detail in the above embodiments and are not repeated here.
The disaster tolerance method based on a virtualization platform provided by the embodiment of the present invention is applied to the disaster tolerance system described in the above embodiments. When the system fails, the method can promptly switch the link connected to the failed device to a link connected to another device of the same type, thereby ensuring the continuity of the enterprise's operations and avoiding the losses that the failure would cause the enterprise. After the system has been repaired, the link connected to the new device can further be switched back automatically to the link connected to the original failed device, to keep the overall system working in an orderly manner and avoid wasting resources.
The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the application.

Claims (10)

1. A disaster tolerance system based on a virtualization platform, characterized in that the system comprises multiple sites, each site comprising a virtual machine cluster, a storage virtualization gateway and a storage device;
the virtual machine cluster is connected to the storage virtualization gateway, and the storage virtualization gateway is connected to the storage device; the virtual machine cluster comprises multiple virtual machines; the multiple storage virtualization gateways are connected to one another to form a storage virtualization gateway cluster; when a device in a first site fails, the storage virtualization gateway cluster switches the link connected to the first site to a link connected to a second site;
data is kept synchronized among the multiple storage devices.
2. The system according to claim 1, characterized in that the storage virtualization gateway cluster is specifically configured to:
when a first virtual machine fails, switch the link connected to the first virtual machine to a link connected to a second virtual machine.
3. The system according to claim 2, characterized in that the storage virtualization gateway cluster is specifically configured to:
when a first storage device fails, switch the link connected to the first storage device to a link connected to a second storage device.
4. The system according to claim 3, characterized in that the storage virtualization gateway cluster is specifically configured to:
when a first storage virtualization gateway fails, switch the link connected to the first storage virtualization gateway to a link connected to a second storage virtualization gateway.
5. The system according to any one of claims 1 to 4, characterized in that the storage virtualization gateway is further configured to:
when the failure of the first site has been repaired, switch the link connected to the second site to the link connected to the first site.
6. The system according to claim 1, characterized in that an arbitration node is provided between the multiple storage virtualization gateways to prevent split-brain.
7. A disaster tolerance method based on a virtualization platform, characterized by comprising:
when a device of a first site among multiple sites fails, the storage virtualization gateway cluster obtaining information that the device of the first site has failed; wherein each site comprises a virtual machine cluster, a storage virtualization gateway and a storage device, the virtual machine cluster comprises multiple virtual machines, the multiple storage virtualization gateways are connected to one another to form the storage virtualization gateway cluster, and data is kept synchronized among the multiple storage devices;
the storage virtualization gateway cluster switching the link connected to the first site to a link connected to a second site among the sites.
8. The method according to claim 7, characterized in that the storage virtualization gateway cluster switching the link connected to the first site to the link connected to the second site among the sites comprises:
when a first virtual machine fails, the storage virtualization gateway cluster switching the link connected to the first virtual machine to a link connected to a second virtual machine.
9. The method according to claim 8, characterized in that the storage virtualization gateway cluster switching the link connected to the first site to the link connected to the second site among the sites comprises:
when a first storage device fails, the storage virtualization gateway cluster switching the link connected to the first storage device to a link connected to a second storage device.
10. The method according to claim 9, characterized in that the storage virtualization gateway cluster switching the link connected to the first site to the link connected to the second site among the sites comprises:
when a first storage virtualization gateway fails, the storage virtualization gateway cluster switching the link connected to the first storage virtualization gateway to a link connected to a second storage virtualization gateway.
CN201710354793.0A 2017-05-18 2017-05-18 A kind of disaster tolerance system based on virtual platform, method Pending CN107168830A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710354793.0A CN107168830A (en) 2017-05-18 2017-05-18 A kind of disaster tolerance system based on virtual platform, method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710354793.0A CN107168830A (en) 2017-05-18 2017-05-18 A kind of disaster tolerance system based on virtual platform, method

Publications (1)

Publication Number Publication Date
CN107168830A true CN107168830A (en) 2017-09-15

Family

ID=59815498

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710354793.0A Pending CN107168830A (en) 2017-05-18 2017-05-18 A kind of disaster tolerance system based on virtual platform, method

Country Status (1)

Country Link
CN (1) CN107168830A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108880874A (en) * 2018-06-06 2018-11-23 郑州云海信息技术有限公司 A kind of pair of server virtualization platform carries out the method, apparatus and equipment of disaster tolerance
CN109104319A (en) * 2018-08-24 2018-12-28 郑州云海信息技术有限公司 A kind of data storage device and method
CN109491838A (en) * 2018-11-01 2019-03-19 郑州云海信息技术有限公司 A kind of method and device handling virtual-machine data
CN110113192A (en) * 2019-04-23 2019-08-09 深信服科技股份有限公司 Route selecting method, routing device, system, storage medium and the device of virtual desktop
CN111865632A (en) * 2019-04-28 2020-10-30 阿里巴巴集团控股有限公司 Switching method of distributed data storage cluster and switching instruction sending method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103209218A (en) * 2013-04-23 2013-07-17 深圳市京华科讯科技有限公司 Management system for disaster-tolerant all-in-one machine
CN103257908A (en) * 2013-05-24 2013-08-21 浪潮电子信息产业股份有限公司 Software and hardware cooperative multi-controller disk array designing method
CN204836244U (en) * 2015-08-25 2015-12-02 杭州九州方园科技有限公司 Use two data center of living systems
CN105812191A (en) * 2016-04-28 2016-07-27 杭州华三通信技术有限公司 Disaster recovery switching method and device
CN205644550U (en) * 2016-04-06 2016-10-12 乌鲁木齐领航科技有限公司 Hospital's information disaster recovery and backup systems
CN106034037A (en) * 2015-03-13 2016-10-19 腾讯科技(深圳)有限公司 Disaster recovery switching method and device based on virtual machine

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108880874A (en) * 2018-06-06 2018-11-23 郑州云海信息技术有限公司 A kind of pair of server virtualization platform carries out the method, apparatus and equipment of disaster tolerance
CN109104319A (en) * 2018-08-24 2018-12-28 郑州云海信息技术有限公司 A kind of data storage device and method
CN109104319B (en) * 2018-08-24 2021-12-03 郑州云海信息技术有限公司 Data storage device and method
CN109491838A (en) * 2018-11-01 2019-03-19 郑州云海信息技术有限公司 A kind of method and device handling virtual-machine data
CN110113192A (en) * 2019-04-23 2019-08-09 深信服科技股份有限公司 Route selecting method, routing device, system, storage medium and the device of virtual desktop
CN111865632A (en) * 2019-04-28 2020-10-30 阿里巴巴集团控股有限公司 Switching method of distributed data storage cluster and switching instruction sending method and device

Similar Documents

Publication Publication Date Title
CN107168830A (en) A kind of disaster tolerance system based on virtual platform, method
CN104794028B (en) A kind of disaster tolerance processing method, device, primary data center and preliminary data center
CN101741536A (en) Data level disaster-tolerant method and system and production center node
CN107147529A (en) A kind of data disaster tolerance system and method
CN103401749A (en) Failure recovery method for first node and host node in network
CN106850315B (en) Automatic disaster recovery system
CN103746841A (en) Failure recovery method and controller
JP2014241536A (en) Monitoring device, and monitoring method
JP2002510160A (en) System and method for increasing the robustness of an optical ring network
CN106888116B (en) Scheduling method of double-controller cluster shared resources
CN109391691A (en) The restoration methods and relevant apparatus that NAS is serviced under a kind of single node failure
CN103186348B (en) Storage system and data read-write method thereof
CN109597718A (en) A kind of disaster recovery platform and a kind of disaster recovery method
CN102484603B (en) Create the method and apparatus of redundancy logic connection and store automated system equipment
US8990619B1 (en) Method and systems to perform a rolling stack upgrade
CN111988169B (en) Method, system, equipment and medium for cleaning and repairing abnormal disk of cloud platform
CN101266566A (en) Multi- test scene automatic dispatch system and method
CN106027313B (en) Network link disaster tolerance system and method
CN107678891A (en) The dual control method, apparatus and readable storage medium storing program for executing of a kind of storage system
CN101714064A (en) Data access method and server
CN110708254B (en) Service processing method, control equipment and storage medium
US20030204539A1 (en) Facility protection utilizing fault tolerant storage controllers
CN112543113A (en) Method, device, equipment and medium for flexible Ethernet to respond to link failure
CN106020975A (en) Data operation method, device and system
CN106502831A (en) The method and device that a kind of image file is replicated

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170915