CN109039750B - Method for improving the multi-city, multi-park deployment disaster recovery capability of a blockchain application system

Info

Publication number: CN109039750B
Application number: CN201810917877.5A
Authority: CN
Inventors: 陈嘉俊; 臧铖
Original assignee: China Zheshang Bank Co Ltd
Current assignee: Yiqiyin Hangzhou Technology Co ltd; China Zheshang Bank Co Ltd
Priority date: 2018-08-13
Filing date: 2018-08-13
Publication date: 2021-06-15
Anticipated expiration: 2038-08-13
Also published as: CN109039750A

Abstract

The invention discloses a method for improving the multi-city, multi-park deployment disaster recovery capability of an application system built on blockchain technology. The range of the number of production nodes is determined according to the node consensus mechanism of the blockchain platform: when the number of fault-tolerant nodes is known, the number of production nodes is determined according to the fault-tolerance algorithm of the node consensus mechanism; when the number of production nodes is known, the number of fault-tolerant nodes is determined according to the same algorithm. A multi-park multi-active, multi-park mutual backup scheme is then formulated according to the number of parks, the number of production nodes, the number of fault-tolerant nodes and the number of backup nodes, together with a corresponding exception handling flow. Based on the node consensus mechanism of the blockchain platform and the deployment of the blockchain nodes, a multi-site, multi-center disaster recovery scheme is realized through configuration. In particular, under the "two places, three centers" architecture of a financial institution, local/same-city park mutual backup or same-city dual-active disaster recovery and remote park disaster recovery are realized, and production risk is reduced.

Description

Method for improving the multi-city, multi-park deployment disaster recovery capability of a blockchain application system

Technical Field

The invention belongs to the field of computer systems, and particularly relates to a method for improving the disaster recovery capability of a multi-city, multi-park deployment of an application system based on blockchain technology.

Background

Ensuring the high availability of application systems has always been a key task in building the information systems of a financial institution. The traditional application systems of commercial banks already have relatively mature disaster recovery schemes. In a commercial environment, however, a blockchain platform is mostly deployed as a consortium chain, peripheral applications all call it in ESDK mode, and the system has particular characteristics such as multi-node deployment and Byzantine fault tolerance, so the disaster recovery schemes of traditional application systems cannot be applied and a separate plan is needed. Meanwhile, the blockchain platform serves as an underlying platform connected to multiple business systems and supports multiple product applications above it; the systems are highly interdependent and the transaction frequency keeps rising. To reduce production risk, a targeted disaster recovery scheme based on the technical characteristics of blockchain needs to be developed as soon as possible.

Disclosure of Invention

The invention aims to formulate a targeted disaster recovery scheme according to the technical characteristics of blockchain, based on the "two places, three centers" disaster recovery architecture commonly adopted by financial institutions, and thereby to improve the high availability of the blockchain system.

The purpose of the invention is realized by the following technical scheme: a method for improving disaster recovery capability of multi-city and multi-park deployment of an application system based on a block chain technology comprises the following steps:

(1) determining the range of the number n of production nodes according to the node consensus mechanism of the consortium chain or private chain of the blockchain platform, wherein the minimum value of n is 3; when the number f of fault-tolerant nodes is known, determining, according to the fault-tolerance algorithm of the node consensus mechanism, that the number n of production nodes satisfies 3f + 1 ≤ n ≤ 3f + 3; when the number n of production nodes is known, determining, according to the fault-tolerance algorithm of the node consensus mechanism, that the number of fault-tolerant nodes is f = TRUNC[(n - 1)/3];

(2) formulating a multi-park multi-active, multi-park mutual backup scheme according to the number m of parks, the number n of production nodes, the number f of fault-tolerant nodes and the number b of backup nodes, the scheme being as follows:

the deployment mode is as follows: n production nodes and b backup nodes, where b = n;

a first park: deploying CEIL (n/m) production nodes;

the second through (m-1)th parks: each park deploys at least 1 and at most f production nodes; backup nodes are assigned by traversing the second through (m-1)th parks (in sequence, at random, or by any other rule), the number of backup nodes of each traversed park being the number of production nodes of the first park minus the number of production nodes of the current park, and the traversal stops when the sum of the number of production nodes deployed in the first through (m-1)th parks and the number of backup nodes equals n;

the mth park: the number of production nodes deployed is n minus the total number of production nodes in the first through (m-1)th parks, and the number of backup nodes is b minus the number of production nodes of the current park;

when the number of faulty nodes is less than or equal to f, production operation is not affected.
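The allocation rule above can be sketched in Python as follows. This is an illustrative reading of the text, not the patented implementation: the function name plan_deployment and its parameters are hypothetical, the production-node count of each middle park is supplied by the caller, the first park is assumed to hold no backup nodes (as in the three-park embodiment), and a negative backup count is clamped to zero, which the text does not specify.

```python
import math


def plan_deployment(n, m, middle_production):
    """Return one (production, backup) pair per park, first park to mth park.

    n: number of production nodes (the text gives 3 as the minimum).
    m: number of parks (this sketch assumes m >= 3).
    middle_production: production-node counts chosen for parks 2..m-1.
    """
    if n < 3:
        raise ValueError("n must be at least 3")
    if m < 3 or len(middle_production) != m - 2:
        raise ValueError("expected one production count for each of parks 2..m-1")

    f = (n - 1) // 3          # f = TRUNC[(n - 1) / 3]
    b = n                     # b = n backup nodes
    first = math.ceil(n / m)  # first park: CEIL(n / m) production nodes

    if any(p < 1 or p > f for p in middle_production):
        raise ValueError("each middle park needs between 1 and f production nodes")

    production_so_far = first + sum(middle_production)
    last_production = n - production_so_far
    if last_production < 0:
        raise ValueError("parks 1..m-1 already hold more than n production nodes")

    parks = [(first, 0)]      # assumption: no backup nodes in the first park
    backups_so_far = 0
    for p in middle_production:
        if production_so_far + backups_so_far >= n:
            backup = 0        # traversal has stopped
        else:
            backup = max(first - p, 0)  # assumption: clamp at zero
        backups_so_far += backup
        parks.append((p, backup))

    # mth park: remaining production nodes, and b minus that many backups.
    parks.append((last_production, b - last_production))
    return parks


print(plan_deployment(4, 3, [1]))
```

With n = 4, m = 3 and one production node in the second park, the sketch yields 2 production nodes in the first park, 1 production node and 1 backup node in the second, and 1 production node and 3 backup nodes in the third.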

Further, the exception handling flow of the multi-park multi-active, multi-park mutual backup scheme is as follows:

single-node failure: switch the failed node to a backup node in any park, preferring a backup node in the same park, then one in a same-city park;

multi-node failure: switch the failed nodes to backup nodes in any one or more parks, preferring backup nodes in the same park, then those in a same-city park;

park-level failure: switch the failed nodes to backup nodes in any one or more non-failed parks, preferring a same-city park;

city-level failure: when the city where the first through (m-1)th parks are located fails, switch to the backup nodes of the remote mth park; when the city where the mth park is located fails, switch to the backup nodes of the remote first through (m-1)th parks.
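The switching preference described above (same park first, then same city, then any other park) can be sketched as a ranking over candidate backup nodes. The Node type, its fields and the function pick_backup are hypothetical names introduced only for illustration; for a park-level or city-level failure the caller is assumed to pass only backup nodes from parks that are still running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    host: str  # IP address or host name
    park: str
    city: str


def pick_backup(failed, backups):
    """Choose a backup for `failed`, or None if no backup is available."""
    def rank(candidate):
        if candidate.park == failed.park:
            return 0  # same park is preferred
        if candidate.city == failed.city:
            return 1  # then a park in the same city
        return 2      # otherwise any remote park
    return min(backups, key=rank, default=None)


failed = Node("a2", "park2", "cityX")
candidates = [
    Node("b3", "park3", "cityY"),
    Node("b1", "park1", "cityX"),
    Node("b2", "park2", "cityX"),
]
print(pick_backup(failed, candidates).host)
```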

Furthermore, in the exception handling flow, the production node IP addresses/host names maintained in a database table or configuration file are adjusted manually or by an automatic monitoring mechanism: the IP address/host name of the failed node is changed to that of the corresponding backup node, so that the node is switched quickly.
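A minimal sketch of this configuration-based switch follows, assuming the production node list is held as a plain list of addresses. The function name switch_node and the IP addresses are hypothetical; the text does not specify the table or file layout, and a real system would also persist the change and notify the applications.

```python
def switch_node(production_hosts, failed_host, backup_host):
    """Return a new production host list with the failed host replaced."""
    if failed_host not in production_hosts:
        raise ValueError(f"{failed_host} is not a configured production node")
    if backup_host in production_hosts:
        raise ValueError(f"{backup_host} is already a production node")
    # Keep the order of the list so that only the failed entry changes.
    return [backup_host if h == failed_host else h for h in production_hosts]


hosts = ["10.0.1.1", "10.0.1.2", "10.0.2.1", "10.0.3.1"]  # placeholder addresses
print(switch_node(hosts, "10.0.1.2", "10.0.3.9"))
```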

Further, the consensus algorithm includes a proof-of-work mechanism or algorithm, a proof-of-stake mechanism or algorithm, a BFT algorithm, or an algorithm implemented on the basis of BFT.

Further, the backup node can synchronize data from one or more production nodes in real time or near real time, or can synchronize data through the consensus algorithm, with a parameter setting determining whether it participates in consensus.

The invention has the following beneficial effects: based on the existing disaster recovery architecture of commercial banks, a high-availability disaster recovery method different from that of traditional application systems is adopted for the blockchain platform, so that, as traditional systems and the blockchain system become ever more tightly coupled and the transaction volume keeps growing, production risk is reduced and stable operation of the blockchain application system is ensured.

Drawings

FIG. 1 is a node deployment architecture diagram of an embodiment in which four production nodes are deployed across three parks.

Detailed Description

The present invention will be described in further detail with reference to the following drawings and specific embodiments, it being understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Because the blockchain platform has characteristics such as multi-node deployment and Byzantine fault tolerance, the disaster recovery schemes of traditional application systems cannot be applied, and a separate plan is needed. The invention provides a method for improving the disaster recovery capability of an application system based on blockchain technology. According to the node consensus mechanism of the consortium chain or private chain of the blockchain platform and the deployment of the blockchain nodes, it realizes mutual backup of local/same-city parks and disaster recovery of remote parks through configuration, and reduces production risk.

In the field of commercial banking, the consensus mechanism is understood as follows: the verification and confirmation of a transaction are completed in a short time through voting by selected nodes; for a given transaction, if several nodes with unrelated interests can reach consensus, the whole network can be considered to have reached consensus. The consensus algorithm includes a proof-of-work mechanism or algorithm, a proof-of-stake mechanism or algorithm, a BFT algorithm, or an algorithm implemented on the basis of BFT, and the like. The backup node can synchronize data from one or more production nodes in real time or near real time, or can synchronize data through the consensus algorithm, with a parameter setting determining whether the backup node participates in consensus.

The invention provides a method for improving the multi-city multi-park deployment disaster recovery capability of an application system realized based on a block chain technology, which specifically comprises the following steps:

(1) determining the range of the number n of production nodes according to the node consensus mechanism of the consortium chain or private chain of the blockchain platform, wherein the minimum value of n is 3; when the number f of fault-tolerant nodes is known, determining, according to the fault-tolerance algorithm of the node consensus mechanism, that the number n of production nodes satisfies 3f + 1 ≤ n ≤ 3f + 3; when the number n of production nodes is known, determining, according to the fault-tolerance algorithm of the node consensus mechanism, that the number of fault-tolerant nodes is f = TRUNC[(n - 1)/3], where TRUNC truncates to an integer (rounds down);

(2) formulating a multi-park multi-active, multi-park mutual backup scheme according to the number m of parks (a park may also be a data center, a machine room, or a relatively independent operating environment), the number n of production nodes, the number f of fault-tolerant nodes and the number b of backup nodes, the scheme being as follows:

(a) the deployment mode is as follows: n production nodes and b backup nodes, where b = n;

a first park: deploying CEIL (n/m) production nodes;

when the number of faulty nodes is less than or equal to f, production operation is not affected;

(b) exception handling flow:

In the exception handling flow, the production node IP addresses/host names maintained in a database table or configuration file are adjusted manually or by an automatic monitoring mechanism: the IP address/host name of the failed node is changed to that of the corresponding backup node, so that the node is switched quickly.
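The two node-count relations in step (1) can be checked against each other numerically. The sketch below uses hypothetical function names; the only inputs taken from the text are f = TRUNC[(n - 1)/3], the bounds 3f + 1 ≤ n ≤ 3f + 3, and the minimum of 3 production nodes.

```python
def fault_tolerant_nodes(n: int) -> int:
    """f = TRUNC[(n - 1) / 3] for n production nodes."""
    if n < 3:
        raise ValueError("n must be at least 3")
    return (n - 1) // 3


def production_node_range(f: int) -> range:
    """All n with 3f + 1 <= n <= 3f + 3, respecting the minimum n = 3."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return range(max(3, 3 * f + 1), 3 * f + 4)


# Every n maps to an f whose range contains that n again.
for n in range(3, 11):
    f = fault_tolerant_nodes(n)
    assert n in production_node_range(f)
    print(f"n={n} -> f={f}")
```

For example, n = 4, 5 and 6 all give f = 1, and a seventh production node is needed before two faulty nodes can be tolerated.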

Example: taking the deployment of four production nodes across three parks as an example, the multi-park multi-active, multi-park mutual backup scheme and the exception handling flow are analyzed and compared comprehensively. As shown in FIG. 1, A denotes a production node and B denotes a backup node.

(a) The deployment mode is as follows: 4 production nodes and 4 backup nodes;

the first park serves as the main park: 2 production nodes are deployed;

the second park serves as the same-city park: 1 production node and 1 backup node are deployed;

the third park serves as the remote disaster recovery park: 1 production node and 3 backup nodes are deployed;

(b) exception handling flow:

city-level failure: when the city where the main park and the same-city park are located fails, switch to the backup nodes of the remote disaster recovery park; when the city where the remote disaster recovery park is located fails, switch to the backup nodes of the other parks.
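The city-level case of this embodiment can be checked with a short sketch that counts, for each city outage, the production nodes lost and the backup nodes still available elsewhere. The node counts come from the deployment listed above; the city labels, the LAYOUT table and the function city_outage are hypothetical and serve only as an illustration.

```python
# park -> (city, production nodes, backup nodes); city labels are placeholders.
LAYOUT = {
    "main park": ("city 1", 2, 0),
    "same-city park": ("city 1", 1, 1),
    "remote disaster recovery park": ("city 2", 1, 3),
}
N = 4
F = (N - 1) // 3  # f = TRUNC[(n - 1) / 3]


def city_outage(layout, city):
    """Return (production nodes lost, backup nodes left in other cities)."""
    lost = sum(prod for c, prod, _ in layout.values() if c == city)
    spare = sum(back for c, _, back in layout.values() if c != city)
    return lost, spare


for city in ("city 1", "city 2"):
    lost, spare = city_outage(LAYOUT, city)
    print(f"{city}: lost={lost}, within f={lost <= F}, backups elsewhere={spare}")
```

Under these assumptions, losing the main city takes out 3 production nodes, which exceeds f = 1 but is matched by the 3 backup nodes of the remote park; losing the remote city takes out 1 production node, which is within f and is also matched by the 1 backup node of the same-city park.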

The above-described embodiments are intended to illustrate rather than to limit the invention, and any modifications and variations of the present invention are within the spirit of the invention and the scope of the appended claims.

Claims

1. A method for improving disaster recovery capability of multi-city and multi-park deployment of an application system based on a block chain technology is characterized by comprising the following steps:

a first park: deploying CEIL (n/m) production nodes;

the second through (m-1)th parks: each park deploys at least 1 and at most f production nodes; backup nodes are assigned by traversing the second through (m-1)th parks, the number of backup nodes of each traversed park being the number of production nodes of the first park minus the number of production nodes of the current park, and the traversal stops when the sum of the number of production nodes deployed in the first through (m-1)th parks and the number of backup nodes equals n;

2. The method for improving disaster recovery capability of multi-city multi-park deployment of an application system based on blockchain technology according to claim 1, wherein the exception handling flow of the multi-park multi-active, multi-park mutual backup scheme is as follows:

3. The method for improving disaster recovery capability of multi-city multi-park deployment of an application system based on blockchain technology according to claim 2, wherein in the exception handling flow, the production node IP addresses/host names maintained in a database table or configuration file are adjusted manually or by an automatic monitoring mechanism, and the IP address/host name of the failed node is changed to that of the corresponding backup node, thereby realizing fast node switching.

4. The method of claim 1, wherein the consensus algorithm comprises a proof-of-work mechanism or algorithm, a proof-of-stake mechanism or algorithm, a BFT algorithm, or an algorithm implemented on the basis of BFT.

5. The method according to claim 1, wherein the backup node can synchronize data from one or more production nodes in real time or near real time, or can synchronize data through the consensus algorithm, with a parameter setting determining whether it participates in consensus.