CN101651710A - Disaster-tolerant backup method based on P2P - Google Patents

Disaster-tolerant backup method based on P2P

Info

Publication number: CN101651710A
Authority: CN
Grant status: Application
Prior art keywords: disaster recovery, node, disaster, server, data
Application number: CN 200910092062
Other languages: Chinese (zh)
Inventor: 楠 姜, 健 王
Original Assignee: 北京工业大学

Abstract

The invention discloses a disaster-tolerant backup method based on P2P. The method comprises the following steps: the network in which a disaster-tolerant backup system is to be deployed is divided into a plurality of logical regions; each logical region includes at least one disaster-tolerant backup server and a plurality of nodes; the disaster-tolerant backup server allocates a unique identity to each node added to the disaster-tolerant backup system and is responsible for saving a node information list and a disaster-tolerant backup data list; the nodes back up data to one another according to the node information list and the disaster-tolerant backup data list saved on the disaster-tolerant backup server; and after a disaster happens, the disaster-tolerant backup server uses the two lists to recover the data destroyed by the disaster from the nodes that were not affected. The invention uses P2P technology and makes full use of the spare resources of each node in the disaster-tolerant backup system to back up and recover the data and services of other nodes, which reduces the cost of the disaster-tolerant backup system and improves its resource utilization.

Description

P2P-based disaster-tolerant backup method

Technical Field

The present invention belongs to the field of disaster recovery and specifically relates to a method that uses P2P (peer-to-peer) technology to provide disaster-tolerant backup of electronic information and services.

Background

With the continuous development of information technology, more and more enterprises and institutions use computer systems for fast data storage and processing and to provide applications and services inside and outside the organization. As a result, their normal operation depends increasingly on computer systems; once an unforeseen disaster paralyzes the computer system, the entire organization is paralyzed. For example, after the September 11 attacks in the United States, more than 80% of the companies in the World Trade Center went out of business because of data loss and service interruption.

Disaster-tolerant backup, or disaster recovery for short, is an effective way to avoid this. Disaster recovery means that when a disaster occurs, as little data as possible is lost and the system keeps running without interruption or returns to normal operation as soon as possible. It is generally achieved through data or hardware redundancy. The disaster recovery systems in common use share one characteristic: the backup system is physically separate from the production system. Deploying one production system requires deploying one, or even two or three, additional backup systems, and these backup systems are idle most of the time and only do work when a disaster occurs. This makes a disaster recovery system expensive to build and poorly utilized, which are its two major drawbacks.

Researchers have devised many ways to overcome these two drawbacks. One is to build a data center. A data center is essentially a centralized backup system that provides disaster recovery services to several production systems at once, which reduces the cost of building a separate disaster recovery system for each. However, this approach still suffers from low resource utilization, and it raises the new problem of disaster recovery for the data center itself.

Virtualization is another way to overcome the two drawbacks. It can split one physical disaster recovery server into several independent virtual disaster recovery servers, or combine several scattered physical servers into one large logical disaster recovery server, so that the size of the logical server can be configured flexibly to suit the application and achieve better system performance. However, once the logical disaster recovery server has been configured at initial deployment, it is not easy to change later. Moreover, when no disaster occurs, the disaster recovery system still sits idle.

The third common approach is clustering. A cluster is a group of computers, each of which is a node of the cluster. A cluster contains multiple servers, and the node servers communicate with one another over an internal local area network. When one node server fails, the applications running on it are automatically taken over by another node server. Clustering balances load automatically and improves the utilization of the disaster recovery system. However, a cluster needs its own dedicated network, which makes it costly and unsuitable for large-scale deployment.

Although these methods alleviate the two drawbacks of disaster recovery systems to some extent, none of them solves the problem fundamentally.

Summary of the Invention

To solve the high cost and low utilization of existing disaster recovery systems, the present invention proposes a method that uses P2P technology for disaster-tolerant backup of electronic information and services. The defining feature of P2P technology is that a node in the network, that is, a computer, server, or other terminal device with computing and storage capability, can obtain resources or services from other nodes while also providing resources or services itself. A new kind of disaster recovery system can therefore be built on P2P technology. Its most basic idea is that a node in the production system does its normal job while the system runs normally, and also uses its idle disk, memory, CPU, and other resources to back up the data and services of other nodes. The technical solution of the invention is as follows:

Deployment of the disaster recovery system: according to network size, geographic location, required disaster recovery strength, and so on, the network in which the system is to be deployed is divided into one or more logical regions. In each logical region, at least one node, either newly placed or chosen from the existing nodes, serves as a disaster recovery server; the other nodes are ordinary nodes;

Software installation: disaster recovery server software and database software are installed on each disaster recovery server, and disaster recovery client software is installed on each ordinary node. The database software records the data related to the disaster recovery system, chiefly in four tables: the server information table, the node information table, the node status table, and the disaster recovery data table. The server information table records the network addresses of all disaster recovery servers. The node information table records, for every ordinary node in the system, its disaster recovery system identifier IDnode, its join time, and similar information; IDnode is assigned to the node by a server. The node status table records each node's login time, logout time, idle resources, network address, and similar information. The disaster recovery data table records, for every data block that has been backed up, the IDnode of the source node, the IDnode of the receiving node, the block size, the backup time, the number of backup copies, and similar information. The disaster recovery server software manages and carries out backup and recovery of the disaster recovery servers and manages backup and recovery of the ordinary nodes. The disaster recovery client software carries out backup and recovery of the ordinary node. Data is backed up in units of data blocks;

Backup of a disaster recovery server: when the data in one disaster recovery server's database changes, its server software broadcasts the changed data to the other disaster recovery servers, using the network addresses of all servers recorded in the server information table. The other servers update their own databases on receipt, which keeps the data on all disaster recovery servers consistent; in this way the data on each disaster recovery server is itself backed up;

Recovery of a disaster recovery server: the server to be recovered sends a recovery request to a randomly selected, normally working disaster recovery server. That server sends the entire contents of its database to the server being recovered, which receives and stores them;

Login process of an ordinary node: the client software of the ordinary node sends a join request to a randomly selected disaster recovery server. On receiving the request, the server generates an identifier IDnode that is unique across the whole disaster recovery system and sends it to the node, which stores it. At the same time the server records the node's IDnode, join time, and similar information in its own node information table;

Backup of an ordinary node: an ordinary node logs in to the disaster recovery system with its own IDnode through the client software. Logging in allows the client software to collect the node's network address and idle resources and report them to a randomly selected disaster recovery server, which records the node's login time, idle resources, and network address in its node status table. The ordinary node then sends a backup request to a randomly selected disaster recovery server. The request includes at least the node identifier IDnode, the number of data blocks to back up, the block size, and the number of backup copies. On receiving this information, the server decides which data block is backed up to which node, based on whether each ordinary node recorded in the node status table is logged in, how much idle resource it has, and so on, and sends the resulting backup list to the ordinary node. The backup list is a table with at least two fields, "data block identifier" and "receiving node network identifier". Following the backup list, the client software on the node backs up every copy of every data block to the receiving nodes, that is, it sends each data block to its receiving node, where it is stored;

Recovery of an ordinary node: the client software of the node to be recovered sends a recovery request containing the node's IDnode to a randomly selected, normally working disaster recovery server. The server looks up the disaster recovery data table by IDnode and finds the information on all data blocks the node has backed up. For each data block, the server selects one normally working ordinary node that holds a backup of the block and is currently logged in, based on the login state of the receiving nodes recorded in the node status table and on practical factors such as network performance, and obtains that node's network identifier. The server assembles all data block identifiers, each with the network identifier of its selected node, into a list, the recovery list, and sends it to the node being recovered. Following the recovery list, the client software on that node fetches all data blocks from the specified nodes and reassembles them into the original data.

Advantageous Effects of the Invention

Compared with other disaster recovery methods, the invention has the following features:

1. It is based on a P2P architecture and uses the idle resources of nodes in the production system to back up the data and services of other nodes, which both lowers the cost of the disaster recovery system and raises its resource utilization.

2. The only requirement on the network is that it supports end-to-end data transmission, which almost all current networks do, so the system can be deployed widely on the Internet, local area networks, private industry networks, wired networks, wireless networks, 3G mobile networks, and other networks.

3. Data is split into data blocks that are distributed across the whole network, which improves resistance to disasters.

4. A data block can be backed up any number of times: sensitive data can be given more copies and ordinary data fewer. This makes the system more flexible, and the number of copies is no longer limited by hardware.

5. The more nodes join the system, the more idle resources are available and the greater its disaster recovery capacity. No system bottleneck forms, and the "no room left to back up" situation does not arise.

The invention is described in further detail below with reference to the drawings and specific embodiments:

Brief Description of the Drawings

FIG. 1: overall flowchart;

FIG. 2: an ordinary node joining;

FIG. 3: an ordinary node logging in;

FIG. 4: disaster-tolerant backup of an ordinary node;

FIG. 5: disaster-tolerant recovery of a disaster recovery server;

FIG. 6: disaster-tolerant recovery of an ordinary node;

FIG. 7: an ordinary node logging out.

Detailed Description

FIG. 1 is the overall flowchart of the invention. First, according to network size, geographic location, required disaster recovery strength, and so on, the network in which the system is to be deployed is divided into several logical regions. For example, a nationwide disaster recovery system could be divided into Northeast, North, East, South, Northwest, Southwest, and other logical regions.

Next, the disaster recovery servers are deployed. A disaster recovery server is a network node with relatively good hardware and software performance; it can be chosen from the nodes already in the network, or a new node can be added to the network for the purpose. Each logical region has at least one disaster recovery server. The server software and database software must be installed manually on each disaster recovery server in advance.

The database software records the data related to the disaster recovery system in four tables: the server information table, the node information table, the node status table, and the disaster recovery data table. The server information table records the network addresses of all disaster recovery servers. The node information table records, for every ordinary node in the system, its disaster recovery system identifier IDnode, its join time, and similar information; IDnode is assigned to the node by a server. The node status table records each node's login time, logout time, idle resources, network address, and similar information. The disaster recovery data table records, for every data block that has been backed up, the IDnode of the source node, the IDnode of the receiving node, the block size, the backup time, the number of backup copies, and similar information.
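The four tables described above can be pictured as plain records. The following is only a sketch under stated assumptions: the class and field names (`NodeInfo`, `idle_bytes`, and so on) are hypothetical, the patent does not prescribe a schema, and idle resources are reduced here to a single byte count for simplicity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class NodeInfo:
    """One row of the node information table."""
    node_id: str        # IDnode assigned by a disaster recovery server
    join_time: float    # time the node joined the system


@dataclass
class NodeStatus:
    """One row of the node status table."""
    node_id: str
    login_time: Optional[float]
    logout_time: Optional[float]
    idle_bytes: int     # assumption: idle resources summarized as free storage
    address: str        # network address of the node


@dataclass
class BackupRecord:
    """One row of the disaster recovery data table."""
    block_id: str
    source_id: str      # IDnode of the node that owns the block
    receiver_id: str    # IDnode of the node that stores the copy
    size: int
    backup_time: float
    copies: int         # number of backup copies requested


@dataclass
class ServerDatabase:
    """The four tables kept by each disaster recovery server."""
    servers: list[str] = field(default_factory=list)             # server information table
    node_info: dict[str, NodeInfo] = field(default_factory=dict)
    node_status: dict[str, NodeStatus] = field(default_factory=dict)
    backups: list[BackupRecord] = field(default_factory=list)
```

Keying `node_info` and `node_status` by IDnode is a design choice of this sketch; it makes the lookups used during backup and recovery direct.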

The disaster recovery server software has two main functions: first, managing and carrying out backup and recovery of the disaster recovery servers themselves; second, managing backup and recovery of the ordinary nodes.

A node must first join the disaster recovery system to obtain a valid identity in it. To back up its own data it must also log in; only then can it perform backups and, after a disaster, recovery. A node can log out at any time. After logging out it can neither back up data nor receive backup data from other nodes until it logs in again.

FIG. 2 shows how a node joins the disaster recovery system. The node downloads the installer for the client software from the network and installs it, becoming a node equipped with the client. For brevity, unless stated otherwise, "node" below means an ordinary node with the client installed. The installer carries the network addresses of all disaster recovery servers. The node picks one server at random and sends it a request to join. On receiving the request, the server generates an identifier IDnode that is unique across the whole system and sends it to the node, which stores it. At the same time the server records the node's IDnode, join time, and similar information in its own node information table.

The server also sends the node information of the newly joined node to the other disaster recovery servers in the network. The node information includes at least the node's IDnode and join time. On receiving it, each of the other servers adds a record containing the node's IDnode and join time to its own node information table.
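The join exchange can be sketched in a few lines. This is an illustrative model only: servers are in-process objects rather than networked hosts, the names `BackupServer`, `handle_join`, and `receive_node_info` are hypothetical, and a random UUID stands in for IDnode because the patent does not say how the identifier is generated.

```python
import time
import uuid


class BackupServer:
    """In-memory stand-in for a disaster recovery server (sketch only)."""

    def __init__(self, address):
        self.address = address
        self.peers = []        # the other disaster recovery servers
        self.node_info = {}    # node information table: IDnode -> join time

    def handle_join(self):
        """Assign a system-wide unique ID to a joining node and replicate it."""
        node_id = uuid.uuid4().hex          # assumption: UUID used as IDnode
        record = {"node_id": node_id, "join_time": time.time()}
        self.node_info[node_id] = record["join_time"]
        for peer in self.peers:             # forward the node info to every peer
            peer.receive_node_info(record)
        return node_id                      # sent back to the node, which stores it

    def receive_node_info(self, record):
        """Add a record for a node that joined through another server."""
        self.node_info[record["node_id"]] = record["join_time"]
```

After `handle_join` returns, every server's node information table holds the same record, which is the property the paragraph above relies on.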

FIG. 3 shows how a node logs in. When the "Log in" button in the client software is clicked, the client collects the node's network address and idle resources, assembles them into a node status, and reports it to a randomly selected disaster recovery server. The server records the node status in its node status table together with the time it was received, which is the login time, and sends it to the other disaster recovery servers in the network for backup. Only logged-in nodes are available to the disaster recovery system.

FIG. 4 shows the backup process of an ordinary node, which can be divided into five steps.

Step 1: Select the backup type.

There are two backup types, "automatic" and "manual". Clicking the "Disaster-tolerant backup" button in the node's client software opens a dialog in which "Automatic" or "Manual" is selected.

For "Automatic", the dialog asks for the backup time window, the directory containing the data to back up, the number of backup copies, the data block size, the backup interval, and so on. The time window can be a specified period or "any time"; the interval can be a fixed length of time or "whenever the data changes". Several directories can be selected for automatic backup, each with its own time window, number of copies, block size, and interval.

For "Manual", the data to back up is selected in the dialog, and the number of copies and the block size are specified. Manual backup can be performed any number of times.

Automatic and manual backup can run at the same time on one node.

Step 2: Split the data.

For automatic backup, the client software automatically splits all data under the directory being backed up into data blocks of equal size, within the backup time window and at the times set by the backup interval. "All data under the directory" means all data in the directory for the first automatic backup and, for each later backup, the data that has changed since the previous backup.

For manual backup, the client software splits the data into data blocks of equal size.
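Splitting into equal-sized blocks, and the matching reassembly used later during recovery, can be sketched as follows. The function names are hypothetical, and the sketch assumes that the final block is simply shorter when the data length is not a multiple of the block size; the patent does not say how a remainder is handled.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Split data into consecutive blocks of block_size bytes.

    Assumption: the last block may be shorter than block_size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def reassemble(blocks: list[bytes]) -> bytes:
    """Concatenate blocks, in order, back into the original data."""
    return b"".join(blocks)


blocks = split_into_blocks(b"abcdefgh", 3)
print(blocks)  # [b'abc', b'def', b'gh']
```

Because the blocks are kept in order, `reassemble(split_into_blocks(data, n))` returns `data` unchanged for any positive `n`.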

Step 3: Obtain the backup list.

The node sends a backup request to a randomly selected disaster recovery server. The request includes at least the node identifier IDnode, the number of data blocks, the block size, and the number of backup copies. On receiving this information, the server decides which data block is backed up to which node, based on whether each node in the network is logged in according to the node status table, how much idle resource it has, and so on, and forms the backup list.

The backup list is a table with at least two fields, "data block identifier" and "receiving node network identifier". For example, the record "data block 1, 172.21.13.1" means that data block 1 is backed up to the node whose IP address is 172.21.13.1. One data block may be backed up to several receiving nodes, and one receiving node may receive several data blocks.

The disaster recovery server sends the backup list to the node.
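The patent leaves the placement policy open ("whether the node is logged in, how much idle resource it has, and so on"). The sketch below fills that gap with one possible policy, a greedy choice of the logged-in nodes with the most idle space, purely as an assumption. The function name, the dictionary layout of the node status, and the address 172.21.13.2 are hypothetical; 172.21.13.1 is the example address from the text.

```python
def build_backup_list(source_id, block_count, block_size, copies, status):
    """Return a backup list of (block number, receiving node address) pairs.

    status maps IDnode -> {"online": bool, "idle_bytes": int, "address": str}.
    Assumed policy: each block goes to the `copies` online nodes with the most
    idle space, never to the source node itself.
    """
    idle = {nid: s["idle_bytes"] for nid, s in status.items()
            if s["online"] and nid != source_id}
    backup_list = []
    for block in range(1, block_count + 1):
        candidates = sorted((n for n in idle if idle[n] >= block_size),
                            key=lambda n: idle[n], reverse=True)
        if len(candidates) < copies:
            raise RuntimeError("not enough idle capacity for block %d" % block)
        for nid in candidates[:copies]:
            idle[nid] -= block_size          # reserve space on the receiver
            backup_list.append((block, status[nid]["address"]))
    return backup_list


status = {
    "A": {"online": True, "idle_bytes": 500, "address": "172.21.13.9"},
    "B": {"online": True, "idle_bytes": 100, "address": "172.21.13.1"},
    "C": {"online": True, "idle_bytes": 50, "address": "172.21.13.2"},
    "D": {"online": False, "idle_bytes": 900, "address": "172.21.13.3"},
}
print(build_backup_list("A", 2, 40, 1, status))
# [(1, '172.21.13.1'), (2, '172.21.13.1')]
```

Choosing distinct receivers for the copies of one block is what lets a block survive the loss of any single receiving node; a real server would presumably also weigh region and network performance, which this sketch ignores.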

Step 4: Back up the data.

Following the backup list, the client software on the node backs up every copy of every data block to the receiving nodes, that is, it sends each data block to its receiving node, where it is stored.

Step 5: Back up the backup list.

The disaster recovery server stores the contents of the backup list in its own disaster recovery data table and sends the list to the other disaster recovery servers in the network, each of which adds it to its own disaster recovery data table.

FIG. 5 shows the recovery process of a disaster recovery server. The server software is installed manually on the server to be recovered. After the "Disaster-tolerant recovery" button in the server software is clicked, the network address of a normally working disaster recovery server is entered by hand in the dialog that pops up, and the server software sends a recovery request to that server. The working server sends its server information table, node information table, node status table, and disaster recovery data table to the server being recovered, which receives and stores them.
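Server recovery amounts to copying the four tables from a working server. A minimal sketch, assuming in-process objects instead of a network transfer and hypothetical names (`BackupServer`, `snapshot`, `recover_from`):

```python
import copy


class BackupServer:
    """Holds the four tables of a disaster recovery server (sketch only)."""

    TABLES = ("server_info", "node_info", "node_status", "backup_data")

    def __init__(self):
        self.tables = {name: {} for name in self.TABLES}

    def snapshot(self):
        """Return a full, independent copy of all four tables."""
        return copy.deepcopy(self.tables)

    def recover_from(self, healthy):
        """Replace this server's tables with those of a working server."""
        self.tables = healthy.snapshot()


healthy = BackupServer()
healthy.tables["node_info"]["n1"] = {"join_time": 1.0}
restored = BackupServer()
restored.recover_from(healthy)
print(restored.tables["node_info"])  # {'n1': {'join_time': 1.0}}
```

The deep copy models the fact that the recovered server stores its own copy of the data; later changes on either server would then be propagated by the broadcast mechanism described earlier, not by shared state.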

FIG. 6 shows the recovery process of an ordinary node, which can be divided into three steps.

Step 1: Obtain the recovery list.

The node to be recovered downloads and installs the client software again; the software carries the network addresses of the normally working disaster recovery servers in the system. When the "Disaster-tolerant recovery" button in the client software is clicked, the client sends a recovery request containing the node's IDnode to a randomly selected working server. The server looks up its disaster recovery data table by IDnode and finds the information on all data blocks the node has backed up. For each data block, the server selects one normally working node that holds a backup of the block and is currently logged in, based on the login state of the receiving nodes recorded in the node status table and on practical factors such as network performance, and obtains that node's network identifier. The server assembles all data block identifiers, each with the network identifier of its selected node, into a list, the recovery list. The recovery list has at least two fields, "data block identifier" and "working node network identifier", meaning that the data block can be fetched from that working node.

The disaster recovery server sends the recovery list to the node being recovered.
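Building the recovery list can be sketched as a lookup followed by a per-block choice. The selection rule here, taking the first logged-in holder, is an assumption standing in for the "network performance and other practical factors" the patent mentions; the function name, record layout, and the address 172.21.13.2 are likewise hypothetical.

```python
def build_restore_list(node_id, backup_records, status):
    """Return a recovery list of (block id, working node address) pairs.

    backup_records: rows of the disaster recovery data table, each a dict with
        "block_id", "source_id", and "receiver_id".
    status: IDnode -> {"online": bool, "address": str}.
    """
    holders = {}
    for rec in backup_records:
        if rec["source_id"] == node_id:      # only blocks this node backed up
            holders.setdefault(rec["block_id"], []).append(rec["receiver_id"])

    restore_list = []
    for block_id in sorted(holders):
        online = [r for r in holders[block_id]
                  if status.get(r, {}).get("online")]
        if not online:
            raise LookupError("no logged-in holder for block %r" % (block_id,))
        # Assumed policy: take the first logged-in holder.
        restore_list.append((block_id, status[online[0]]["address"]))
    return restore_list
```

With two copies of block 1 on nodes B and C and B logged out, the list points block 1 at C, which is the case that multiple backup copies are meant to cover.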

Step 2: Obtain the data blocks.

Following the recovery list, the client software on the node being recovered fetches all data blocks from the specified working nodes in the network.

Step 3: Recover the data.

The client software on the node being recovered reassembles all data blocks into the original data.

FIG. 7 shows how a node logs out. When the "Log out" button in the client software is clicked, the client sends a logout notification containing the node's IDnode to a randomly selected disaster recovery server. On receiving it, the server sets the node's logout time in its node status table to the time the notification was received, and notifies the other disaster recovery servers in the network to set the node's logout time in their own node status tables to that same time.
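The logout update, and one way a server might derive "currently logged in" from the recorded times, can be sketched as below. Both functions are hypothetical, and the rule in `is_logged_in` (a node counts as logged in when its login time is newer than its logout time) is an assumption; the patent only says the node status table records both times.

```python
import time


def is_logged_in(status):
    """Assumed rule: logged in if a login exists and no later logout does."""
    login = status.get("login_time")
    logout = status.get("logout_time")
    return login is not None and (logout is None or logout < login)


def handle_logout(node_id, tables, received_at=None):
    """Write the same logout time into every server's node status table.

    tables: the receiving server's node status table followed by those of its
    peers, each a dict IDnode -> status dict.
    """
    if received_at is None:
        received_at = time.time()    # time the notification was received
    for table in tables:
        table.setdefault(node_id, {})["logout_time"] = received_at
    return received_at
```

Passing one timestamp to all tables mirrors the text: the peers record the time the first server received the notification, not their own clocks.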

Claims (1)

  1. A P2P-based disaster recovery backup method, characterized by comprising:

  Deployment of the disaster recovery system: according to network size, geographic location, required disaster recovery strength and the like, the network in which the disaster recovery system is to be deployed is divided into several logical regions, the number of logical regions being at least one; in each logical region at least one node, either newly placed or selected from the existing nodes, serves as a disaster recovery server, and the other nodes serve as ordinary nodes;

  Installation of software: disaster recovery server software and database software are installed on each disaster recovery server; disaster recovery client software is installed on each ordinary node; the database software records the data related to the disaster recovery system, chiefly in four data tables: a server information table, a node information table, a node status table and a disaster recovery data table; the server information table records the network addresses of all disaster recovery servers; the node information table records, for every ordinary node in the disaster recovery system, its disaster recovery system identifier IDnode, its joining time and other information, the identifier IDnode being assigned to the node by a server; the node status table records each node's login time, logout time, idle resource status, network address and other information; the disaster recovery data table records, for every data block that has been backed up, the IDnode of its source node, the IDnode of its receiving node, its size, backup time, number of backups and other information; the disaster recovery server software manages and carries out backup and recovery of the disaster recovery servers and manages backup and recovery of the ordinary nodes; the disaster recovery client software carries out backup and recovery of the ordinary nodes;

  Data is backed up in units of data blocks;

  Backup process of a disaster recovery server: when data in the database of one disaster recovery server changes, the disaster recovery server software broadcasts the changed data to the other disaster recovery servers, using the network addresses of all servers recorded in the server information table; each of the other disaster recovery servers updates its own database on receipt, so that the data of all disaster recovery servers stays consistent, that is, the data on the disaster recovery servers is itself backed up;

  Recovery process of a disaster recovery server: the disaster recovery server to be restored sends a recovery request to a randomly selected working disaster recovery server; the working disaster recovery server sends the entire contents of its database to the server to be restored, which receives and stores them;

  Login process of an ordinary node: the disaster recovery client software of the ordinary node sends a join request to a randomly selected disaster recovery server; on receiving the request, the disaster recovery server generates for the node an identifier IDnode that is unique across the whole disaster recovery system and sends it to the node for storage; at the same time it records the node's IDnode, joining time and other information in its own node information table;

  Backup process of an ordinary node: an ordinary node logs in to the disaster recovery system with its own IDnode through the disaster recovery client software, thereby allowing the client software to collect the node's network address and idle resource status and report them to a randomly selected disaster recovery server, which records the node's login time, idle resource status and network address in its own node status table; the ordinary node submits a backup request to a randomly selected disaster recovery server, the request including at least the node identifier IDnode, the number of data blocks to be backed up, the data block size and the number of backup copies; on receiving this information, the disaster recovery server decides which data block is to be backed up to which node, according to whether each ordinary node recorded in the node status table is logged in, how much idle resource it has and similar conditions, and forms a backup list that it sends to the ordinary node; the backup list is a table with at least two fields, "data block identifier" and "receiving node network identifier"; according to the backup list, the client software on the node backs up all copies of all data blocks to the respective receiving nodes, that is, sends the data blocks to the receiving nodes, where they are stored;

  Recovery process of an ordinary node: the client software of the ordinary node to be restored sends a recovery request to a randomly selected working disaster recovery server, the request containing the IDnode of the ordinary node; the disaster recovery server looks up the disaster recovery data table by IDnode and finds the records of all data blocks that the node has backed up; for each data block, the disaster recovery server, according to whether the receiving nodes recorded in the node status table are logged in and according to network performance and other actual conditions, selects one working ordinary node that holds a backup of the data block and is currently logged in, and obtains the network identifier of that working node; the disaster recovery server assembles all data block identifiers, together with the network identifier of the working node corresponding to each data block, into a list, namely the recovery list; the disaster recovery server sends the recovery list to the node to be restored; the client software on the node to be restored retrieves all data blocks from the working nodes specified in the recovery list and reassembles all data blocks into the original data.
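For illustration only, the allocation step in the ordinary-node backup process (deciding which data block goes to which receiving node) could be sketched as below. The greedy "most idle space first" policy, the rule that copies of one block go to distinct nodes, and the names `build_backup_list` and `free_space` are assumptions made here; the source only says the decision depends on login state and idle resources.

```python
from typing import Dict, List, Tuple


def build_backup_list(
    source_node: str,
    block_ids: List[str],
    block_size: int,
    copies: int,
    free_space: Dict[str, int],
) -> List[Tuple[str, str]]:
    """Return (data block identifier, receiving node) pairs.

    free_space maps each logged-in node to its idle storage in bytes;
    the node keys stand in for network identifiers in this sketch.
    """
    # The requesting node never stores its own backup copies.
    free = {n: s for n, s in free_space.items() if n != source_node}
    backup_list: List[Tuple[str, str]] = []
    for block_id in block_ids:
        # Nodes with enough room, most idle space first, ties by name.
        candidates = sorted(
            (n for n, s in free.items() if s >= block_size),
            key=lambda n: (-free[n], n),
        )
        if len(candidates) < copies:
            raise RuntimeError(f"not enough receiving nodes for {block_id}")
        for node in candidates[:copies]:
            free[node] -= block_size  # reserve the space for this copy
            backup_list.append((block_id, node))
    return backup_list
```

Re-sorting the candidates for every block spreads copies across nodes as their idle space is consumed.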
CN 200910092062 2009-09-21 2009-09-21 Disaster-tolerant backup method based on P2P CN101651710A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200910092062 CN101651710A (en) 2009-09-21 2009-09-21 Disaster-tolerant backup method based on P2P

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200910092062 CN101651710A (en) 2009-09-21 2009-09-21 Disaster-tolerant backup method based on P2P

Publications (1)

Publication Number Publication Date
CN101651710A (en) 2010-02-17

Family

ID=41673815

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200910092062 CN101651710A (en) 2009-09-21 2009-09-21 Disaster-tolerant backup method based on P2P

Country Status (1)

Country Link
CN (1) CN101651710A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101834904A (en) * 2010-05-14 2010-09-15 杭州华三通信技术有限公司 Method and equipment for database backup
WO2011150741A1 (en) * 2010-06-01 2011-12-08 中兴通讯股份有限公司 Point to point (p2p) overlay network, data resources operation method and new node join method thereof
CN101902498B (en) 2010-07-02 2013-03-27 广州鼎甲计算机科技有限公司 Network technology based storage cloud backup method
CN101902498A (en) * 2010-07-02 2010-12-01 广州鼎甲计算机科技有限公司 Network technology based storage cloud backup method
US9542280B2 (en) 2010-09-30 2017-01-10 EMC IP Holding Company LLC Optimized recovery
US9417966B2 (en) 2010-09-30 2016-08-16 Emc Corporation Post backup catalogs
CN103119551A (en) * 2010-09-30 2013-05-22 Emc 公司 Optimized recovery
CN103119551B (en) * 2010-09-30 2016-09-21 Emc 公司 Optimized recovery
US9195549B1 (en) 2010-09-30 2015-11-24 Emc Corporation Unified recovery
US9165019B2 (en) 2010-09-30 2015-10-20 Emc Corporation Self recovery
US9195685B2 (en) 2010-09-30 2015-11-24 Emc Corporation Multi-tier recovery
CN102411520A (en) * 2011-09-21 2012-04-11 电子科技大学 Data-unit-based disaster recovery method for seismic data
CN102411520B (en) 2011-09-21 2013-09-25 电子科技大学 Data-unit-based disaster recovery method for seismic data
CN102752404B (en) * 2012-07-25 2015-02-18 高旭磊 Novel backup method and system for disaster recovery
CN102752404A (en) * 2012-07-25 2012-10-24 钟祝君 Novel backup method and system for disaster recovery

Legal Events

Date Code Title Description
C06 Publication
C10 Request of examination as to substance
C02 Deemed withdrawal of patent application after publication (patent law 2001)