CN111367711A - Safety disaster recovery method based on super fusion data
- Publication number: CN111367711A
- Application number: CN201811601627.7A
- Authority: CN (China)
- Prior art keywords: super, data, fusion, disaster recovery, cloud
- Prior art date: 2018-12-26
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1464—Management of the backup or restore process for networked environments
Abstract
The invention provides a security disaster recovery method based on super-fusion (hyper-converged) data, involving a service system, a super-fusion all-in-one machine, cloud storage and computers. The method comprises the following steps: S1, deploying the service system on cloud storage; S2, establishing a resource pool through the super-fusion all-in-one machine, using virtualization, a distributed storage architecture and a computing mechanism; S3, configuring at least one computer with an equivalent operating environment as a standby; S4, connecting and securely isolating the service system and each corresponding layered module through a virtual network; S5, when the operating system or data of any layered module of the service system fails abnormally, backing the data up to the cloud via encrypted transmission while automatically switching to the standby computer so that operation continues. By combining local super-fusion with cloud-based encrypted disaster recovery, the invention builds a key-service operating environment and provides a safe, reliable and stable comprehensive solution for key-service assurance and heterogeneous off-site cloud disaster recovery of data.
Description
Technical Field
The invention relates to the technical field of hardware expansion, and in particular to a security disaster recovery method based on super-fusion data.
Background
Traditional physical-equipment-based services suffer from scattered management, poor scalability, insufficient I/O performance and an inability to adequately guarantee service reliability. Super-fusion (hyper-convergence) addresses this by integrating computing, storage, network and virtualization resources through a software-defined infrastructure. The goal of the hyper-converged infrastructure is to provide an easier way to build a data center, combining software-defined storage with server virtualization to replace traditional SAN storage. Hyper-convergence focuses on achieving data management and control on low-cost x86 servers, thereby better guaranteeing highly available, fault-free operation of key services and data.
However, current approaches on the market to securing and storing key-service data have the following disadvantages:
1) The corresponding hardware environment is aging, performance is low, and I/O performance is seriously insufficient;
2) A single physical device is insufficiently stable, and the traditional dual-computer backup mode suffers temporary service interruption during switchover;
3) Management is scattered, scalability is poor, and data safety and reliability are low;
4) As the service system's data volume and number of visitors grow, a performance bottleneck is quickly reached;
5) The daily operation and maintenance workload is large and requires professional technicians.
The prior art therefore has clear shortcomings.
Disclosure of Invention
The invention builds a key-service operating environment by combining local super-fusion (with reuse of legacy hardware) and cloud-based encrypted disaster recovery, providing a safe, reliable and stable comprehensive solution for key-service assurance and heterogeneous off-site cloud disaster recovery of data, while reusing old hardware resources, improving enterprise efficiency and saving operating costs.
To achieve this purpose, the invention provides the following technical solution:
a safety disaster recovery method based on super-fusion data comprises a service system, more than one super-fusion all-in-one machine, a cloud storage and more than two computers; the method comprises the following steps:
s1, setting the service system on cloud storage;
s2, establishing a resource pool by adopting a virtualization and distributed storage architecture and a computing mechanism through a super-fusion all-in-one machine;
s3, configuring at least one computer to set the same operation environment as the standby;
s4, connecting and safely isolating the service system and each corresponding layered module through a virtual network;
s5, when the current operating system or data of any layered module of the service system has abnormal fault, the data is stored and backed up in the cloud in an encrypted transmission mode, and meanwhile, the standby computer is automatically switched to continue to operate.
Further, steps S1 and S2 are performed in parallel.
Further, the two or more computers can be located at different sites.
Furthermore, the system also comprises a legacy storage gateway and a plurality of storage devices, the storage devices being connected to the resource pool through the legacy storage gateway.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below. It should be noted that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
Examples
1. A security disaster recovery method based on super-fusion data involves a service system, one or more super-fusion all-in-one machines, cloud storage and two or more computers; the method comprises the following steps:
S1, deploying the service system on cloud storage;
S2, establishing a resource pool through the super-fusion all-in-one machine, using virtualization, a distributed storage architecture and a computing mechanism;
S3, configuring at least one computer with the same operating environment as a standby;
S4, connecting and securely isolating the service system and each corresponding layered module through a virtual network;
S5, when the operating system or data of any layered module of the service system fails abnormally, backing the data up to the cloud via encrypted transmission while automatically switching to the standby computer so that operation continues.
Step S2 is designed to provide a stable, reliable, scalable, safe and efficient data disaster recovery storage service for key business services in the local super-fusion environment. Based on software definition, all rules are defined at the software level; distributed resource scheduling is achieved through virtualization, distributed storage and related technologies, and all resources are pooled through the distributed architecture, providing decentralized, elastic infrastructure support for upper-layer applications and eliminating single-point performance bottlenecks while allowing essentially unlimited expansion.
In step S5, when the current operating system or data fails abnormally, the service system switches over automatically and transparently, and the data is backed up for disaster recovery on the remote cloud via encrypted transmission, ensuring that service is not interrupted and data is not lost.
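As an illustration of this step, the following is a minimal sketch of an encrypted off-site backup, assuming OpenSSL for encryption and the AWS CLI for the cloud upload; the source directory, key file and bucket name are hypothetical placeholders, not values prescribed by the invention:

# Hedged sketch of step S5's encrypted cloud backup (illustrative only).
SRC_DIR=/var/lib/business-data                      # data of the affected module (hypothetical path)
ARCHIVE=/tmp/backup-$(date +%Y%m%d%H%M%S).tar.gz
tar -czf "$ARCHIVE" "$SRC_DIR"                      # 1. archive the data to be protected
openssl enc -aes-256-cbc -salt -pbkdf2 \
    -in "$ARCHIVE" -out "$ARCHIVE.enc" -pass file:/etc/backup-key.bin   # 2. encrypt before the data leaves the site
aws s3 cp "$ARCHIVE.enc" s3://example-dr-bucket/disaster-recovery/      # 3. ship to the remote cloud store
rm -f "$ARCHIVE"                                    # keep only the encrypted copy locally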
The key-service operating environment is built from local super-fusion (with reuse of legacy hardware) combined with cloud-based encrypted disaster recovery. Concretely, X super-fusion all-in-one machines are deployed (the quantity configured according to service requirements), the service system is virtualized and migrated to the cloud platform, and the service system and each corresponding layered module are connected and securely isolated through the virtualized network. Through the virtual storage component, a unified virtual storage resource pool can be built from the disks of the all-in-one machines themselves, without external storage, meeting the service system's requirements for data capacity and high I/O performance and allowing platform performance to scale linearly with business growth; existing storage devices can also be reused where circumstances allow.
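The patent does not prescribe a specific isolation mechanism for the virtualized network; as one hedged example, per-module overlay networks could look like the following sketch (Docker swarm mode is assumed, and the network, service and image names are hypothetical):

# Hypothetical illustration of per-module network isolation (requires Docker swarm mode).
docker network create --driver overlay app-tier
docker network create --driver overlay --internal data-tier   # unreachable from outside the cluster
# The application joins both tiers; the database joins only the internal one.
docker service create --name app --network app-tier --network data-tier nginx
docker service create --name db --network data-tier postgres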
The control nodes are deployed as follows: the install command deploys a control node, and the join command deploys a container node.
Deploying the master control node:
# bash -c "$(docker run --rm daocloud.io/daocloud/CJY install)"
Deploying a secondary (replica) control node:
# bash -c "$(docker run --rm daocloud.io/daocloud/CJY install --force-pull --replica --replica-controller MASTER_CONTROLLER_IP)"
Deploying a container node:
# bash -c "$(docker run --rm daocloud.io/daocloud/CJY join --force-pull MASTER_CONTROLLER_IP)"
These three commands deploy the three node types, and a command can be re-run to deploy additional nodes of the same type. Note that MASTER_CONTROLLER_IP is the IP address of the master control node.
Common faults include network interruption, power failure, server downtime and hard-disk failure; Ceph tolerates such faults and repairs itself automatically, ensuring data reliability and system availability.
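As a usage note (not part of the patent text itself), fault handling in a Ceph cluster can be observed with its standard command-line tools:

ceph -s              # overall cluster status, including degraded PG counts
ceph health detail   # expanded explanation of any HEALTH_WARN / HEALTH_ERR
ceph osd tree        # which OSDs are up or down, and their CRUSH placement
ceph osd stat        # summary of how many OSDs exist, are up, and are in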
Monitors are Ceph's housekeepers, maintaining the cluster's global state. A Monitor plays a role similar to ZooKeeper's, using quorum and the Paxos algorithm to reach consensus on the global state.
OSDs can repair themselves automatically, and repairs proceed in parallel.
When OSD A detects that OSD B is not responding, OSD A reports to the Monitors that OSD B is unreachable, and the Monitors mark OSD B as down and update the OSD Map. If OSD B still cannot be reached after M seconds, the Monitors mark OSD B as out (excluding it from data placement) and update the OSD Map.
When one OSD in the OSD set corresponding to a PG is marked down (if the Primary is marked down, one of the Replicas becomes the new Primary and handles all object read-write requests), the PG enters the active+degraded state; that is, the number of valid copies of the PG is N-1.
After M seconds, if the OSD still cannot be reached, it is marked out and Ceph recalculates the PG-to-OSD-set mapping (when a new OSD joins the cluster, all PG-to-OSD-set mappings are likewise recalculated), thereby restoring the number of valid copies of each PG to N.
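The interval M above corresponds to tunable Ceph options; as a hedged sketch, the values below are the commonly documented defaults (they may differ between Ceph releases) and are not settings mandated by the patent:

ceph config set osd osd_heartbeat_grace 20          # grace period before a peer OSD is reported down
ceph config set mon mon_osd_down_out_interval 600   # the "M seconds": how long a down OSD stays unreachable before being marked out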
The Primary of the new OSD set collects the PG logs from the old OSD set to obtain the Authoritative History (a complete, fully ordered operation sequence) and brings the other Replicas into agreement with it (that is, the Replicas reach consensus on the state of all of the PG's objects); this process is called Peering.
After Peering completes, the PG enters the active+recovering state, and the Primary migrates and synchronizes the degraded objects to all Replicas, ensuring that the number of copies of each object returns to N.
Grouping objects into PGs reduces the amount of metadata that must be tracked and processed: globally, there is no need to track the metadata and logs of every individual object, only the metadata of each PG.
Increasing the number of PGs balances the load across OSDs and improves parallelism.
PGs also separate fault domains, improving data reliability.
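A commonly cited sizing heuristic, offered here as background rather than as part of the patent, is roughly 100 PGs per OSD divided by the replica count, rounded up to a power of two; the pool name below is hypothetical:

# Heuristic: PGs ~ (OSDs * 100) / replicas, rounded up to a power of two.
# For 12 OSDs and 3 replicas: 12 * 100 / 3 = 400, rounded up to 512.
ceph osd pool create example-pool 512 512   # pg_num and pgp_num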
When the Primary receives a write request for an object, it forwards the data to the other Replicas; the Primary acknowledges the write only once the data is stored on all OSDs, which guarantees copy consistency.
Multiple copies of data: replica policies and fault-domain placement are configurable per pool, with support for strong consistency (see the sketch after this list).
No single point of failure: many failure scenarios can be tolerated, split-brain is prevented, and individual components can be upgraded and replaced online.
Detection and automatic recovery of all faults: recovery requires no human intervention, and normal data access is maintained while recovery proceeds.
Parallel recovery: the parallel recovery mechanism greatly reduces data recovery time and improves data reliability.
Self-management: the cluster is easy to expand, upgrade and service. When a component fails, its data is automatically re-replicated; when a component changes (is added or removed), data is redistributed automatically.
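The per-pool replica policy and fault-domain placement mentioned in the list above can be expressed with standard Ceph commands; the following is a hedged sketch in which the pool and rule names are hypothetical:

# Per-pool replica policy: 3 copies, writes acknowledged only while at least 2 copies are available.
ceph osd pool set example-pool size 3
ceph osd pool set example-pool min_size 2
# Fault-domain placement: spread replicas across hosts via a CRUSH rule.
ceph osd crush rule create-replicated across-hosts default host
ceph osd pool set example-pool crush_rule across-hosts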
As another variation of the above embodiment, steps S1 and S2 are performed in parallel.
As another variation of the above embodiment, the two or more computers may be located at different sites.
As another variation of the above embodiment, the system further includes a legacy storage gateway and a plurality of storage devices, the storage devices being connected to the resource pool through the legacy storage gateway.
The above-described embodiment expresses only one implementation of the present invention, and while its description is comparatively specific and detailed, it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these fall within the scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (4)
1. A security disaster recovery method based on super-fusion data, involving a service system, one or more super-fusion all-in-one machines, cloud storage and two or more computers, characterized by comprising the following steps:
S1, deploying the service system on cloud storage;
S2, establishing a resource pool through the super-fusion all-in-one machine, using virtualization, a distributed storage architecture and a computing mechanism;
S3, configuring at least one computer with the same operating environment as a standby;
S4, connecting and securely isolating the service system and each corresponding layered module through a virtual network;
S5, when the operating system or data of any layered module of the service system fails abnormally, backing the data up to the cloud via encrypted transmission while automatically switching to the standby computer so that operation continues.
2. The security disaster recovery method based on super-fusion data according to claim 1, characterized in that steps S1 and S2 are performed in parallel.
3. The security disaster recovery method based on super-fusion data according to claim 1 or 2, characterized in that the two or more computers can be located at different sites.
4. The security disaster recovery method based on super-fusion data according to claim 3, characterized by further comprising a legacy storage gateway and a plurality of storage devices, wherein the plurality of storage devices are connected to the resource pool through the legacy storage gateway.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201811601627.7A | 2018-12-26 | 2018-12-26 | Safety disaster recovery method based on super fusion data
Publications (1)
Publication Number | Publication Date
---|---
CN111367711A | 2020-07-03
Family
ID=71208480
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN201811601627.7A (Pending) | CN111367711A | 2018-12-26 | 2018-12-26
Country Status (1)
Country | Link
---|---
CN | CN111367711A (en)
Cited By (4)
Publication number | Priority date | Publication date | Title
---|---|---|---
CN112131185A | 2020-09-22 | 2020-12-25 | Method and device for high availability of service in super-fusion distributed storage node
CN112131185B | 2020-09-22 | 2022-08-02 | Method and device for high availability of service in super-fusion distributed storage node
CN112995335A | 2021-04-07 | 2021-06-18 | Position-aware container scheduling optimization system and method
CN112995335B | 2021-04-07 | 2022-09-23 | Position-aware container scheduling optimization system and method
Legal Events
Date | Code | Title | Description
---|---|---|---
2020-07-03 | PB01 | Publication |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200703