WO2021052416A1

WO2021052416A1 - Disaster tolerant method and apparatus, site, and storage medium

Info

Publication number: WO2021052416A1
Application number: PCT/CN2020/115892
Authority: WO
Inventors: 张贵
Original assignee: 中兴通讯股份有限公司
Priority date: 2019-09-18
Filing date: 2020-09-17
Publication date: 2021-03-25
Also published as: CN112527552A

Abstract

A disaster tolerant method and apparatus, a site, and a storage medium, relating to the field of communications and information. The disaster tolerant method comprises: if the identity of this site is determined as a standby site, detecting whether a target site satisfying a preset condition exists (101), the preset condition comprising: the identity of the target site being a main site, and the target site being allocated to this site and having not established a disaster backup relationship with this site; if the target site satisfying the preset condition exists, establishing the disaster backup relationship with the target site (102); and backing up, for the target site, data to be backed up in the target site (103). The flexibility of service expansion can be improved.

Description

Disaster tolerance method, device, site and storage medium

Technical field

The embodiments of the present invention relate to the field of communications and information, and in particular to a disaster tolerance method, device, site, and storage medium.

Background art

As the management capabilities of telecommunication systems expand and the volume of data grows explosively, the high availability of these systems and the disaster tolerance of their data become particularly important. Distributed architecture, an effective way to expand system capability, deploys applications as microservices and containers on a Platform as a Service (PaaS) and is widely used in telecommunication systems.

At present, the disaster recovery relationship in the system disaster recovery solution is mainly one-to-one correspondence, that is, a backup site can only back up one active site. If an active site is added, a new backup site must be built to back up the data produced by the newly added active site, so business expansion is not flexible enough.

Summary of the invention

The purpose of the embodiments of the present invention is to provide a disaster recovery method, device, site, and storage medium in which the backup site actively establishes the disaster recovery relationship with the primary site. This avoids the interruption of existing services that occurs when services are expanded at runtime and the primary site has to establish the disaster recovery relationship with the backup site itself, and it also improves the flexibility of service expansion.

In order to solve the above technical problems, the embodiments of the present invention provide a disaster recovery method, including: if the identity of the local site is determined to be a backup site, detecting whether there is a target site that meets preset conditions, where the preset conditions include: the identity of the target site is a primary site, and the target site is assigned to the local site and has not yet established a disaster recovery relationship with the local site; if there is a target site that meets the preset conditions, establishing a disaster recovery relationship with the target site; and backing up, for the target site, the data to be backed up in the target site.

The embodiments of the present invention also provide a disaster recovery device, including: an integrated service module, a message middleware, and an application embedded with a data backup module. The integrated service module is configured to detect, when the identity of the local site is determined to be a backup site, whether there is a target site that meets the preset conditions, where the preset conditions include: the identity of the target site is a primary site, and the target site is assigned to the local site and has not yet established a disaster recovery relationship with the local site; it is also configured to establish a disaster recovery relationship with the target site that meets the preset conditions. The message middleware is configured to store the identity information of the local site received from the integrated service module. The data backup module in the application is configured to monitor the identity information of the local site in the message middleware and, after the disaster recovery relationship between the local site and the target site is established, to back up, for the target site, the data to be backed up in the target site according to the identity information of the local site.

The embodiments of the present invention also provide a site, including: at least one processor; and a memory communicatively connected with the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the above disaster recovery method.

The embodiment of the present invention also provides a computer-readable storage medium that stores a computer program, and the computer program is executed by a processor to implement the disaster recovery method described above.

Compared with the prior art, in the embodiments of the present invention the backup site actively detects primary sites that are assigned to the local site and have not yet established a disaster recovery relationship with it, and actively requests the establishment of that relationship. A backup site can establish disaster recovery relationships with multiple primary sites, which solves the problem that the disaster recovery architecture is rigid when the relationship is one-to-one. Moreover, the primary site does not need to care about the status of the backup site, which improves the flexibility of service expansion.

In addition, establishing a disaster recovery relationship with the primary site includes: sending a connection request to the target site, and confirming that the disaster recovery relationship is successfully established after receiving a response from the target site to allow the connection. The backup site actively initiates the connection, that is, for the master site, there is no need to know its corresponding backup site, which reduces the workload of the master site in the process of establishing a disaster recovery relationship.

In addition, detecting whether there is a target site that meets the preset conditions includes: scanning a pre-stored list of primary sites, where the list includes all the primary sites assigned to the local site; and detecting the current status of each primary site in the list, and determining each primary site that has not yet established a disaster recovery relationship with the local site as a target site that meets the preset conditions. The backup site pre-stores the list of primary sites and scans it for primary sites that have not yet established a disaster recovery relationship. This provides a concrete detection method and ensures that the backup site establishes disaster recovery relationships only with primary sites that meet the preset conditions.

In addition, backing up the data to be backed up in the target site for the target site includes: pulling the data to be backed up from a storage area of the target site that the backup site can access. The backup site actively pulls the data to be backed up from the primary site with which it has established a disaster recovery relationship, so the primary site does not need to distribute the data to the corresponding backup sites, which reduces the workload of the primary site during data backup.

In addition, the data to be backed up is bound with the identity information of the target site; backing up the data to be backed up in the target site for the target site includes: storing the data to be backed up, according to the identity information, in the storage area corresponding to the target site. The identity information reveals the source of the acquired data, and all acquired data can be distinguished by it, which makes the data easier to store and manage. In addition, the identity information is unique, which ensures that the source of the acquired data is reliable.

In addition, after the disaster recovery relationship with the target site is established, the method also includes: if it is detected that the identities of the local site and the target site have been exchanged, sending a connection permission response to the target site when a connection request from the target site is received. This provides a processing method for the case in which the primary and backup identities are exchanged.

In addition, after it is detected that the identities of the local site and the target site have been exchanged, the method also includes: placing the backup data of the target site stored in the local site into a preset storage area of the local site, so that the target site can pull it from that preset storage area. This describes how the local site works after the identity exchange.

In addition, after it is detected that the identities of the local site and the target site have been exchanged, the method also includes: clearing or dumping the data stored in the local site that was backed up for sites other than the target site. Keeping the backup data of primary sites in other disaster recovery relationships separate from that of the target site lets the target site pull its data directly after the identity exchange and makes pulling the backup data easier.

In addition, backing up the data to be backed up in the target site for the target site includes: each application of the local site backing up the data of the same application in the target site. The applications in the local site and in the target site correspond one-to-one, and each application in the local site backs up the data of its corresponding application in the target site. This provides a data backup method that makes the backup process more organized.

Description of the drawings

One or more embodiments are exemplified by the pictures in the corresponding drawings. These exemplified descriptions do not constitute a limitation on the embodiments. The elements with the same reference numerals in the drawings are denoted as similar elements. Unless otherwise stated, the figures in the attached drawings do not constitute a scale limitation.

FIG. 1 is a schematic structural diagram of a 2+2 disaster tolerance architecture in the first embodiment of the present invention;

FIG. 2 is a flowchart of a disaster recovery method in the first embodiment of the present invention;

FIG. 3 is a schematic diagram of the disaster recovery relationship of the 2+2 disaster recovery architecture in the first embodiment of the present invention;

FIG. 4 is a schematic diagram of the disaster recovery relationship of the 3+2 disaster recovery architecture in the first embodiment of the present invention;

FIG. 5 is a schematic diagram of the disaster recovery relationship of the 3+3 disaster recovery architecture in the first embodiment of the present invention;

FIG. 6 is a flowchart of a disaster recovery method in the second embodiment of the present invention;

FIG. 7 is a flowchart of a disaster recovery method in the third embodiment of the present invention;

FIG. 8 is a flowchart of a disaster recovery method in the fourth embodiment of the present invention;

FIG. 9 is a flowchart of a disaster recovery method in the fifth embodiment of the present invention;

FIG. 10 is a schematic structural diagram of a disaster recovery device in a sixth embodiment of the present invention;

FIG. 11 is a schematic diagram of the establishment of the disaster recovery relationship of a 2+2 disaster recovery architecture in the sixth embodiment of the present invention;

FIG. 12 is a schematic diagram of backup data of a 2+2 disaster recovery architecture in the sixth embodiment of the present invention;

FIG. 13 is a schematic diagram of the structure of a site in the seventh embodiment of the present invention.

Detailed description

In order to make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the various embodiments of the present invention will be described in detail below with reference to the accompanying drawings. However, a person of ordinary skill in the art can understand that in each embodiment of the present invention, many technical details are proposed in order to enable readers to better understand the present application. However, even without these technical details and various changes and modifications based on the following embodiments, the technical solution claimed in this application can be realized. The following division of the various embodiments is for convenience of description, and should not constitute any limitation on the specific implementation of the present invention, and the various embodiments may be combined with each other without contradiction.

The inventor found that building a new backup site to back up the data produced by a newly added primary site achieves the purpose of disaster recovery, but requires more resources and inevitably increases the operation and maintenance cost of the system. At the same time, if the primary site actively establishes the disaster recovery relationship with the backup site, services are interrupted during service expansion, which reduces the flexibility of service expansion. Based on this, the inventor proposes the technical solution of this application.

The first embodiment of the present invention relates to a disaster tolerance method. In this embodiment, if the identity of the local site is a backup site, the local site establishes a disaster recovery relationship with each primary site that is assigned to it and has not yet established a disaster recovery relationship with it. After the relationship is established, the local site, as the backup site, backs up the data of the primary site. The method applies to an M+N disaster recovery architecture composed of a primary domain and a backup domain, where the primary domain includes M primary sites and the backup domain includes N backup sites, and the primary and backup sites can communicate with each other. A primary site can establish disaster recovery relationships with multiple backup sites, and a backup site can also establish disaster recovery relationships with multiple primary sites. The sites can be deployed in the same area or in different areas according to actual planning requirements. M and N are both natural numbers greater than or equal to 1. Figure 1 shows the structure of a 2+2 architecture, in which M and N are both 2: the primary domain includes primary sites A and B, and the backup domain includes backup sites D and E. The implementation details of the disaster recovery method of this embodiment are described below. The following content is provided only for ease of understanding and is not necessary for implementing this solution. The specific process is shown in Figure 2 and includes:

Step 101: When it is determined that the identity of the local site is a backup site, it is detected whether there is a target site that meets a preset condition; if it exists, step 102 is entered; otherwise, the process ends.

Specifically, to detect whether there is a target site that meets the preset conditions, the local site first scans the pre-stored primary site list, which includes all the primary sites assigned to the local site. It then checks the current status of each primary site in the list and determines each primary site that has not yet established a disaster recovery relationship with the local site as a target site that meets the preset conditions.
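The detection in step 101 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the names `Role`, `PrimaryEntry`, and `find_target_sites`, and the layout of the primary site list, are assumptions made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    BACKUP = "backup"


@dataclass
class PrimaryEntry:
    """One row of the pre-stored primary site list (hypothetical layout)."""
    site_id: str
    role: Role = Role.PRIMARY
    relationship_established: bool = False


def find_target_sites(local_role, primary_list):
    """Return the entries that satisfy the preset condition of step 101."""
    if local_role is not Role.BACKUP:
        return []  # only a backup site performs the detection
    return [
        entry for entry in primary_list
        if entry.role is Role.PRIMARY and not entry.relationship_established
    ]


# Sites A and B already have a relationship with the local site; C does not.
primary_list = [
    PrimaryEntry("A", relationship_established=True),
    PrimaryEntry("B", relationship_established=True),
    PrimaryEntry("C"),
]
targets = find_target_sites(Role.BACKUP, primary_list)
print([t.site_id for t in targets])  # ['C']
```

Because every entry in the list is by definition assigned to the local site, the "assigned to the local site" part of the preset condition is satisfied by membership in the list.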

In a specific example, the identity of the site includes the primary site and the backup site. The operation and maintenance personnel can configure the site through a specific interface to achieve the purpose of setting the identity of the site.

Step 102: Establish a disaster recovery relationship with the target site.

Specifically, when the local site detects a target site that meets the preset conditions, the local site, as the backup site, sends a connection request to the target site and, after receiving a connection permission response from the target site, confirms that the disaster recovery relationship is successfully established. The connection request can include the identity information of the local site, which the target site identifies and records so that it knows the source of the connection request, that is, the site with which it is establishing the disaster recovery relationship.
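The request and response exchange of step 102 can be sketched as follows. This is an in-memory simulation under stated assumptions: the message fields (`type`, `backup_site_id`, `primary_site_id`) and the class and method names are hypothetical, and a real deployment would exchange these messages over the network.

```python
from dataclasses import dataclass, field


@dataclass
class PrimarySite:
    site_id: str
    known_backups: set = field(default_factory=set)

    def handle_connection_request(self, request):
        """Record the requesting backup site and allow the connection."""
        self.known_backups.add(request["backup_site_id"])
        return {"type": "ALLOW_CONNECTION", "primary_site_id": self.site_id}


@dataclass
class BackupSite:
    site_id: str
    relationships: set = field(default_factory=set)

    def establish_relationship(self, target):
        """Send a connection request; confirm only on an allow response."""
        # The request carries the identity of the local (backup) site.
        request = {"type": "CONNECT", "backup_site_id": self.site_id}
        response = target.handle_connection_request(request)
        if response is not None and response.get("type") == "ALLOW_CONNECTION":
            self.relationships.add(target.site_id)
            return True
        return False


site_c = PrimarySite("C")
site_d = BackupSite("D")
established = site_d.establish_relationship(site_c)
```

In this sketch the primary site never initiates anything: it only records the identity carried in the request and answers, which mirrors the division of work described above.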

Step 103: Back up the data to be backed up in the target site for the target site.

Specifically, when the identity of the local site is backup site, the local site pulls the data to be backed up from the preset storage area of the target site. This approach resolves the coupling that a many-to-many disaster recovery relationship would otherwise introduce. The primary site only needs to produce backup data and is decoupled from the recovery process of the backup site, and the primary sites in the primary domain and the backup sites in the backup domain do not affect one another. The failure of one primary site to produce data to be backed up, or the failure of one backup site to back up data, does not make the entire disaster recovery process unavailable.

Specifically, the data to be backed up is bound with the identity information of the target site, and the local site will store the data to be backed up in a storage area corresponding to the target site based on the identity information.
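A minimal sketch of the pull model with per-site storage areas, assuming for illustration that both storage areas are local directories and that the identity information is a short site identifier. The function names and directory layout are hypothetical; the embodiment does not prescribe them.

```python
import shutil
import tempfile
from pathlib import Path


def pull_backup(primary_storage: Path, backup_root: Path, primary_site_id: str) -> Path:
    """Pull files from a primary site's storage area into a per-site backup area."""
    # The destination is chosen from the identity information bound to the data,
    # so data from different primary sites never lands in the same directory.
    dest = backup_root / primary_site_id
    dest.mkdir(parents=True, exist_ok=True)
    for item in sorted(primary_storage.iterdir()):
        if item.is_file():
            shutil.copy2(item, dest / item.name)
    return dest


def demo() -> dict:
    """Back up two primary sites, A and B, on a single backup site."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for site_id in ("A", "B"):
            storage = root / f"primary_{site_id}"
            storage.mkdir()
            (storage / "alarms.db").write_text(f"data of site {site_id}")
        backup_root = root / "backup_D"
        result = {}
        for site_id in ("A", "B"):
            dest = pull_backup(root / f"primary_{site_id}", backup_root, site_id)
            result[site_id] = (dest / "alarms.db").read_text()
        return result
```

Both primary sites export a file with the same name, yet the backup site keeps them apart because each copy is stored under the identifier of its source site.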

Specifically, the data to be backed up corresponds one-to-one with the applications in the target site. When the local site backs up the data of the target site, each application of the local site backs up the data to be backed up of the corresponding application in the target site, and the applications of the local site correspond one-to-one with the applications of the target site.

In a specific example, each application in the primary site produces the data required for disaster recovery, that is, the data to be backed up. Each application determines its own backup strategy according to the particularities of its business and related parameters. Backup strategies include periodic backup, full backup, incremental backup, and so on, and are not limited here. For the key applications in the system, the backup period can be as short as possible, for example one backup every 30 seconds. For non-critical applications, one backup per hour can be chosen. Applications with a large amount of data can choose incremental synchronization; others can choose full backup. Critical and non-critical applications can be distinguished according to preset criteria, which are not limited here and can be determined according to the actual situation. Because several primary sites in the primary domain may be backed up by the same backup site, each primary site attaches its primary site identifier to the data during backup, that is, binds its identity information to the data. Correspondingly, the applications in each backup site in the backup domain pull the corresponding backup data from the storage medium of the primary site in the primary domain for data synchronization. The period, frequency, and timing of data synchronization can be determined by each application by configuring related parameters according to its own business. The applications in each backup site restore the backup data to the backup site to keep the data in the primary and backup sites consistent. Because one backup site may back up multiple primary sites, the primary site identifier carried in the backup data must be restored together with the data. In principle, the versions of the sites in the primary domain and the backup domain should be consistent. When the versions are inconsistent, for example when the primary domain contains both higher and lower versions, the sites in the backup domain should, based on the principle of backward compatibility, use the higher version so that the backup domain can also restore the lower-version data of the primary domain.
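The per-application choice of backup strategy described above can be sketched as a small Python function. The 30-second and one-hour periods come from the example in the text; the 1 GiB threshold for "a large amount of data" and the names `BackupPolicy` and `choose_policy` are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupPolicy:
    interval_seconds: int
    mode: str  # "full" or "incremental"


def choose_policy(is_critical: bool, data_size_bytes: int,
                  large_threshold_bytes: int = 1 << 30) -> BackupPolicy:
    """Pick a backup period and mode for one application."""
    # Critical applications back up every 30 seconds, others once an hour.
    interval = 30 if is_critical else 3600
    # Large data sets use incremental synchronization (threshold is assumed).
    mode = "incremental" if data_size_bytes >= large_threshold_bytes else "full"
    return BackupPolicy(interval, mode)


print(choose_policy(is_critical=True, data_size_bytes=10_000))
print(choose_policy(is_critical=False, data_size_bytes=5 << 30))
```

In practice each application would read these parameters from its own configuration rather than from a shared function; the sketch only makes the two independent decisions (period and mode) explicit.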

In a specific example, suppose that a many-to-many disaster recovery relationship is to be established in a 2+2 disaster recovery architecture, as shown in Figure 3. Primary sites A and B form the primary domain, and backup sites D and E form the backup domain. The disaster recovery relationships of this 2+2 architecture are shown in Table 1: backup site D has established disaster recovery relationships with primary sites A and B, and backup site E has established a disaster recovery relationship with primary site A. Note that these relationships are only one possible situation in actual operation and are not a necessary condition for implementing this solution.

Table 1

First, the identity of the local site is determined. If the local site is a backup site, its identity is published to the message middleware as backup site; likewise, if the local site is a primary site, its identity is published to the message middleware as primary site. The backup site then sends a heartbeat message to each primary site that meets the preset conditions and is in the activated state. For example, backup site D sends a heartbeat message to primary site A to request a connection. If the backup site receives a connection permission response, it confirms that the disaster recovery relationship is established. If no response is received, the local site marks the site to which the heartbeat message was sent as abnormal. For example, if backup site D sends a heartbeat message to primary site A, which meets the preset conditions and is activated, but receives no response, backup site D records the status of primary site A as abnormal and sends an alarm notification. Each application in a site monitors the site identity information stored in the message middleware and performs the corresponding actions based on that identity: the applications in a primary site produce the data to be backed up, and the applications in a backup site back up and restore the data to be backed up of the primary sites with which a disaster recovery relationship has been established.
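The flow above (publishing the identity, sending heartbeats, marking unresponsive sites as abnormal) can be sketched as follows. The `MessageMiddleware` class is an in-memory stand-in rather than a real message broker, and `heartbeat_round`, the status strings, and the alarm text are hypothetical names chosen for illustration.

```python
class MessageMiddleware:
    """In-memory stand-in for the message middleware (hypothetical)."""

    def __init__(self):
        self.identity = None
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish_identity(self, identity):
        self.identity = identity
        for callback in self._subscribers:
            callback(identity)


def heartbeat_round(activated_primaries, send_heartbeat):
    """Send one heartbeat per primary site; record status and alarms."""
    status, alarms = {}, []
    for site_id in activated_primaries:
        if send_heartbeat(site_id):  # True means an allow response arrived
            status[site_id] = "established"
        else:
            status[site_id] = "abnormal"
            alarms.append(f"primary site {site_id}: no heartbeat response")
    return status, alarms


middleware = MessageMiddleware()
actions = []
# An application reacts to the identity stored in the middleware.
middleware.subscribe(lambda identity: actions.append(
    "produce backup data" if identity == "primary"
    else "pull and restore backup data"))
middleware.publish_identity("backup")

# Simulated transport: site A does not answer, site B does.
status, alarms = heartbeat_round(["A", "B"], lambda site_id: site_id != "A")
```

Here `send_heartbeat` is injected as a callable so that the transport (and its timeout handling) stays outside the sketch.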

In a specific example, suppose that the disaster tolerance architecture is of the 2+2 type, that is, it has 2 primary sites and 2 backup sites. A and B denote the primary sites, and D and E denote the backup sites. The disaster recovery relationships are shown in Figure 3. Because of business expansion, a new primary site C needs to be opened in a certain area, with twofold disaster recovery redundancy, that is, two backups. Site C is first set up, and the network plane is then opened up so that the new site C can communicate with sites A, B, D, and E. The operation and maintenance personnel then set the identity of site C to primary site on the configuration interface. Next, the operation and maintenance personnel configure the environment information of primary site C on the existing backup sites D and E, that is, add primary site C to the primary site lists of backup sites D and E and set primary site C to the activated state. Note that a primary site in the inactive state cannot establish a disaster recovery relationship with any backup site. The environment information includes port information and is not limited here. Take backup site D as an example. After site D determines that it is a backup site, it detects whether there is a primary site that meets the preset conditions, that is, a primary site that is assigned to it and has not yet established a disaster recovery relationship with it. Since primary site C is newly added, the primary site list necessarily contains primary site C as a site that meets the preset conditions. When site D finds primary site C in its primary site list, it sends a connection request to primary site C. The connection request can be a heartbeat message, which is not limited here. When primary site C receives the connection request from backup site D, it stores the information of backup site D and then sends a connection permission response to backup site D, confirming that the disaster recovery relationship between primary site C and backup site D is successfully established. After the relationship is established, each application on primary site C stores its data to be backed up in the storage medium according to the backup strategy it has formulated. The applications in backup site D obtain the data to be backed up from the storage medium of primary site C according to their respective recovery strategies, using methods such as rsync or FTP (not limited here), and finally restore the data. The same applies to backup site E and is not repeated here. The disaster tolerance architecture has thus been expanded from 2+2 to 3+2, as shown in Figure 4. The disaster recovery relationships between the sites after the expansion to 3+2 are shown in Table 2:

Table 2
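The relationships stated in the prose of this example (D backs up A and B, E backs up A, and the new primary site C is then backed up by both D and E) can be tracked in a small mapping. The sketch below is only an illustration derived from that prose; the helper names `add_primary` and `redundancy` are hypothetical.

```python
def add_primary(relationships, primary, backups):
    """Record that each listed backup site now backs up the given primary site."""
    for backup in backups:
        relationships.setdefault(backup, [])
        if primary not in relationships[backup]:
            relationships[backup].append(primary)


def redundancy(relationships):
    """Count how many backup sites protect each primary site."""
    counts = {}
    for primaries in relationships.values():
        for primary in primaries:
            counts[primary] = counts.get(primary, 0) + 1
    return counts


# 2+2 architecture as described in the text: backup site -> primary sites.
relationships = {"D": ["A", "B"], "E": ["A"]}

# Expansion to 3+2: primary site C is added to the lists of D and E.
add_primary(relationships, "C", ["D", "E"])
print(relationships)              # {'D': ['A', 'B', 'C'], 'E': ['A', 'C']}
print(redundancy(relationships))  # {'A': 2, 'B': 1, 'C': 2}
```

Adding a primary site only touches the lists held by the backup sites, which reflects the point of the example: the existing primary sites A and B are not involved in the expansion.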

Further, in another specific example, suppose that the disaster tolerance architecture is of the 3+2 type described above and that, for primary sites A and C, the current twofold disaster recovery redundancy no longer meets the disaster tolerance requirements, so that another backup site needs to be added. The architecture can then be expanded from 3+2 to 3+3 by adding one backup site. Suppose the added site is F. Site F is first set up, and the network plane is then opened up so that the new site F can communicate with sites A, B, D, and E. The operation and maintenance personnel then set the identity of site F to backup site on the configuration interface and configure the environment information of primary sites A and C on site F, that is, add primary sites A and C to the primary site list of site F and set them to the activated state. Note that the operation and maintenance personnel could also configure only the environment information of primary site A on site F; configuring both A and C in this example is only one option available in actual implementation and is not a necessary condition for implementing this solution. After site F determines that it is a backup site, it detects whether there is a primary site that meets the preset conditions, that is, a primary site that is assigned to it and has not yet established a disaster recovery relationship with it. Since backup site F is newly added, primary sites A and C in its primary site list necessarily meet the preset conditions. Take primary site C as an example. When site F finds primary site C in its primary site list, it sends a connection request to primary site C. The connection request can be a heartbeat message, which is not limited here. When primary site C receives the connection request from backup site F, it stores the information of backup site F and then sends a connection permission response to backup site F, confirming that the disaster recovery relationship between primary site C and backup site F is successfully established. After the relationship is established, each application on primary site C stores its data to be backed up in the storage medium according to the backup strategy it has formulated. The applications in backup site F obtain the data to be backed up from the storage medium of primary site C according to their respective recovery strategies, using methods such as rsync or FTP (not limited here), and finally restore the data. The same applies to primary site A and is not repeated here. The disaster tolerance architecture has thus been expanded from 3+2 to 3+3, as shown in Figure 5. The disaster recovery relationships between the sites after the expansion to 3+3 are shown in Table 3:

Table 3

In this embodiment, after determining that it is a backup site, the local site actively establishes a disaster recovery relationship with each primary site that is assigned to it and has not yet established a disaster recovery relationship with it, and then backs up the data of that primary site. Because the primary site does not have to initiate the relationship, the interruption of existing services that would occur if the primary site had to establish the relationship itself during runtime service expansion is avoided, which improves the flexibility of service expansion.

The second embodiment of the present invention relates to a disaster tolerance method. In this embodiment, the identities of the local site and of a target site with which it has established a disaster recovery relationship are exchanged: the local site changes from backup site to primary site, and the target site changes from primary site to backup site. The local site then only needs to send a connection permission response to the target site when it receives a connection request from the target site. This embodiment thus covers primary/backup switchover. The implementation details of the disaster recovery method of this embodiment are described below. The following content is provided only for ease of understanding and is not necessary for implementing this solution. The specific process is shown in Figure 6 and includes:

Step 201: When it is determined that the identity of the local site is a backup site, it is detected whether there is a target site that meets a preset condition; if it exists, step 202 is entered; otherwise, the process ends.

Step 202: Establish a disaster recovery relationship with the target site.

Step 203: Back up the data to be backed up in the target site for the target site.

Steps 201-203 are respectively similar to steps 101-103 in the first embodiment, and will not be repeated here.

Step 204: Detect whether the local site and the target site have exchanged identities; if so, proceed to step 205; otherwise, the process ends.

Step 205: Receive a connection request from the target site, and send a connection permission response to the target site.

It should be noted that, in this embodiment, the detection in step 204 of whether the local site and the target site have exchanged identities is performed after the data to be backed up in the target site has been backed up in step 203. This embodiment provides only one case. In actual implementation, step 204 may be performed at the same time as step 203 or after step 203, and all of these cases fall within the scope of protection.

In a specific example, automatic primary/backup switchover carries risk and unpredictability, and the disaster recovery relationship designed by this method is many-to-many, that is, the data of one primary site may be backed up on several backup sites. Therefore, when a disaster occurs, the operation and maintenance personnel manually select, through the interface, one backup site in the backup domain to be promoted, according to the data recovery status of each backup site. Assume that the current disaster recovery architecture is 3+3, with primary sites A, B, and C and backup sites D, E, and F. The area where the primary site C is located is hit by a natural disaster, site C becomes unavailable, and services need to be restored quickly. The operation and maintenance personnel can view, on the backup sites D, E, and F respectively, the health status and current operation status of the primary site C. Suppose it is finally decided that the backup site F and the primary site C exchange identities. The operation and maintenance personnel then configure the backup site F as a primary site according to the disaster recovery relationship list, so that it takes over the work of the primary site C and provides normal service functions. Correspondingly, the operation and maintenance personnel restore site C and configure it as a backup site in place of site F, which completes the service restoration. After the identity exchange, the local site F, now a primary site, only needs to receive the connection request from the target site C and send a connection permission response to site C. Each site monitors the messages in its own message middleware and performs the corresponding actions based on its identity: the primary site produces the data to be backed up according to the backup strategy, and the backup site obtains the backup data and restores it according to the recovery strategy.

In this embodiment, which covers the case of identity exchange, the local site, now a primary site, only needs to respond to the connection request of the target site that has become a backup site. Although the identities are exchanged, the principle is unchanged: the backup site actively establishes the disaster recovery relationship with the primary site that has been assigned to it and has not yet established a disaster recovery relationship with it, and the primary site does not need to initiate the request. This avoids interrupting the existing services of the primary site when the system is expanded during operation, and improves the flexibility of service expansion.
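The effect of the identity exchange on who initiates and who answers can be sketched as follows. This is only an illustration under simplifying assumptions: the `Site` class, `exchange_identities`, and the string `"CONNECTION_ALLOWED"` are hypothetical names, and the manual configuration performed by the operation and maintenance personnel is reduced to swapping two attributes.

```python
class Site:
    def __init__(self, name, role):
        self.name = name
        self.role = role              # "primary" or "backup"
        self.known_backups = set()    # backup sites recorded by a primary site

    def on_connection_request(self, sender):
        # Only a primary site answers connection requests.
        if self.role != "primary":
            return None
        self.known_backups.add(sender.name)
        return "CONNECTION_ALLOWED"

    def connect_to(self, primary):
        # Only a backup site initiates the disaster recovery relationship.
        if self.role != "backup":
            raise RuntimeError(f"{self.name} is not a backup site")
        return primary.on_connection_request(self)


def exchange_identities(site_a, site_b):
    # Stand-in for the manual reconfiguration of both sites.
    site_a.role, site_b.role = site_b.role, site_a.role
```

Before the exchange, F calls `connect_to(C)`; after `exchange_identities(C, F)`, the direction reverses and the restored site C calls `connect_to(F)`, while F only answers.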

The third embodiment of the present invention relates to a disaster tolerance method. In this embodiment, after the identity exchange between the local site and the target site is detected, the data that the local site has backed up for the target site is additionally put into the storage area of the local site that is provided for backup sites to access, so that the target site can pull it from that storage area. This embodiment describes one way in which the local site works after the identity exchange. The implementation details of the disaster recovery method of this embodiment are described below. The following content is provided only for ease of understanding and is not necessary for implementing this solution. The specific process is shown in Figure 7, including:

Step 301: When it is determined that the identity of the local site is a backup site, it is detected whether there is a target site that meets a preset condition; if it exists, step 302 is entered; otherwise, the process ends.

Step 302: Establish a disaster recovery relationship with the target site.

Step 303: Back up the data to be backed up in the target site for the target site.

Steps 301-303 are respectively similar to steps 101-103 in the first embodiment, and will not be repeated here.

Step 304: Detect whether the local site and the target site have exchanged identities; if so, proceed to step 305; otherwise, the process ends. This step is similar to step 204 and is not repeated here.

It should be noted that, in this embodiment, the detection in step 304 of whether the local site and the target site have exchanged identities is performed after the data to be backed up in the target site has been backed up in step 303. This embodiment provides only one case. In actual implementation, step 304 may be performed at the same time as step 303 or after step 303, and all of these cases fall within the scope of protection.

Step 305: Put the data that the local site has backed up for the target site into the storage area of the local site that is provided for backup sites to access, so that the target site can pull it from that storage area.

Step 306: Receive a connection request from the target site, and send a connection permission response to the target site. Similar to step 205, it will not be repeated here.

It should be noted that this embodiment describes only one ordering of step 305 (putting the data that the local site has backed up for the target site into the storage area of the local site that is provided for backup sites to access) and step 306 (receiving the connection request from the target site and sending a connection permission response to the target site). In actual implementation, step 305 may be performed at the same time as step 306 or after step 306, and all of these cases fall within the scope of protection.

In this embodiment, after the identity exchange between the local site and the target site is detected, the data that the local site has backed up for the target site is put into the storage area of the local site that is provided for backup sites to access. The target site, whose identity has changed to backup site, can then pull the data from that storage area, and the local site, whose identity has changed to primary site, does not need to send it actively. This embodiment describes one way in which the local site works after the identity exchange.
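Step 305 can be pictured as copying one site's backed-up data into a directory that backup sites are allowed to read. The sketch below assumes a directory-per-site layout on a local file system; the function name `publish_for_pull` and the `backup_root` and `export_root` parameters are hypothetical, and the source does not prescribe any particular storage layout.

```python
import shutil
from pathlib import Path


def publish_for_pull(backup_root: Path, export_root: Path, target: str) -> Path:
    """Copy the data backed up for `target` into the area backup sites may pull from."""
    src = backup_root / target    # data this site backed up while it was a backup site
    dst = export_root / target    # area exposed to backup sites (for example via Rsync or FTP)
    dst.parent.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok (Python 3.8+) lets the call be repeated safely.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

Copying rather than moving is a design choice of this sketch: the new primary site keeps its own copy while the former primary site pulls at its own pace.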

The fourth embodiment of the present invention relates to a disaster tolerance method. Compared with the second embodiment, in this embodiment, after the identity exchange between the local site and the target site is detected, the local site also clears or dumps the data it stores that was backed up for sites other than the target site. That is, the backup data of the primary sites in the other disaster recovery relationships of the local site is stored separately from the backup data of the target site with which identities are exchanged, so that the target site can pull its data directly after the identity exchange, which makes pulling the backup data easier for the target site. The implementation details of the disaster recovery method of this embodiment are described below. The following content is provided only for ease of understanding and is not necessary for implementing this solution. The specific process is shown in Figure 8, including:

Step 401: When it is determined that the identity of the local site is a backup site, it is detected whether there is a target site that meets the preset conditions; if it exists, step 402 is entered; otherwise, the flow ends.

Step 402: Establish a disaster recovery relationship with the target site.

Step 403: Back up the data to be backed up in the target site for the target site.

Steps 401-403 are respectively similar to steps 101-103 in the first embodiment, and will not be repeated here.

Step 404: Detect whether the local site and the target site have exchanged identities; if so, proceed to step 405; otherwise, the process ends. This step is similar to step 204 and is not repeated here.

It should be noted that, in this embodiment, the detection in step 404 of whether the local site and the target site have exchanged identities is performed after the data to be backed up in the target site has been backed up in step 403. This embodiment provides only one case. In actual implementation, step 404 may be performed at the same time as step 403 or after step 403, and all of these cases fall within the scope of protection.

Step 405: Clear or dump the data stored in the local site that was backed up for sites other than the target site.

Step 406: Receive a connection request from the target site, and send a connection permission response to the target site. Similar to step 205, it will not be repeated here.

It should be noted that this embodiment describes only one ordering of step 405 (clearing or dumping the data stored in the local site that was backed up for sites other than the target site) and step 406 (receiving the connection request from the target site and sending a connection permission response to the target site). In actual implementation, step 405 may be performed at the same time as step 406 or after step 406, and all of these cases fall within the scope of protection.

In this embodiment, when the identity exchange between the local site and the target site is detected, the backup data stored in the local site other than the backup data of the target site is cleared or dumped. That is, the backup data of the primary sites in the other disaster recovery relationships of the local site is stored separately from the backup data of the target site with which identities are exchanged. This allows the target site to pull its data directly after the identity exchange and makes pulling the backup data easier for the target site.
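The clear-or-dump operation of step 405 can be sketched as below, again assuming one directory per backed-up site. The function name `clear_or_dump_others` and the `dump_root` parameter are hypothetical; passing no `dump_root` models "clear", and passing one models "dump" (moving the data elsewhere).

```python
import shutil
from pathlib import Path


def clear_or_dump_others(backup_root: Path, target: str, dump_root: Path = None):
    """Remove (or move to dump_root) every site's backup data except the target's."""
    handled = []
    for entry in sorted(backup_root.iterdir()):
        if not entry.is_dir() or entry.name == target:
            continue                      # keep the target site's data in place
        if dump_root is None:
            shutil.rmtree(entry)          # "clear": delete the other site's data
        else:
            dump_root.mkdir(parents=True, exist_ok=True)
            shutil.move(str(entry), str(dump_root / entry.name))   # "dump": relocate it
        handled.append(entry.name)
    return handled
```

After the call, `backup_root` contains only the target site's data, so the target site can pull the whole directory without filtering.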

The fifth embodiment of the present invention relates to a disaster tolerance method. In this embodiment, when the identity exchange between the local site and the target site is detected, the local site not only clears or dumps the data it stores that was backed up for sites other than the target site, that is, stores the backup data of the primary sites in its other disaster recovery relationships separately from the backup data of the target site with which identities are exchanged, but also puts the data that it has backed up for the target site into the storage area of the local site that is provided for backup sites to access, so that the target site can pull it from that storage area. The implementation details of the disaster recovery method of this embodiment are described below. The following content is provided only for ease of understanding and is not necessary for implementing this solution. The specific process is shown in Figure 9, including:

Step 501: When it is determined that the identity of the local site is a backup site, it is detected whether there is a target site that meets a preset condition; if it exists, step 502 is entered; otherwise, the process ends.

Step 502: Establish a disaster recovery relationship with the target site.

Step 503: Back up the data to be backed up in the target site for the target site.

Steps 501-503 are similar to steps 101-103 in the first embodiment, respectively, and will not be repeated here.

Step 504: Detect whether the local site and the target site have exchanged identities; if so, proceed to step 505; otherwise, the process ends. This step is similar to step 204 and is not repeated here.

It should be noted that, in this embodiment, the detection in step 504 of whether the local site and the target site have exchanged identities is performed after the data to be backed up in the target site has been backed up in step 503. This embodiment provides only one case. In actual implementation, step 504 may be performed at the same time as step 503 or after step 503, and all of these cases fall within the scope of protection.

Step 505: Put the data that the local site has backed up for the target site into the storage area of the local site that is provided for backup sites to access, so that the target site can pull it from that storage area.

Step 506: Clear or dump the data stored in the local site that was backed up for sites other than the target site. This step is similar to step 405 and is not repeated here.

Step 507: Receive a connection request from the target site, and send a connection permission response to the target site. Similar to step 205, it will not be repeated here.

It should be noted that the execution order of steps 505, 506, and 507 in this embodiment can be changed according to actual conditions, and all such variations fall within the scope of protection.

In this embodiment, when the identity exchange between the local site and the target site is detected, the local site not only clears or dumps the data it stores that was backed up for sites other than the target site, but also puts the data that it has backed up for the target site into the storage area of the local site that is provided for backup sites to access, so that the target site can pull it from that storage area. This embodiment describes one way in which the local site and the target site work after the identity exchange.

The division of the above methods into steps is only for clarity of description. In implementation, steps may be combined into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, such variants fall within the scope of protection of this patent. Adding insignificant modifications to the algorithm or process, or introducing insignificant designs, without changing the core design of the algorithm and process, also falls within the scope of protection of this patent.

The fifth embodiment of the present invention relates to a disaster tolerance device. The disaster recovery device includes: an integrated service module 601, a message middleware 602, and an application embedded with a data backup module 603. The number of applications is n, and n is a natural number greater than zero. The specific structure diagram is shown in Figure 10, including:

The integrated service module 601 is configured to detect, when the identity of the local site is determined to be a backup site, whether there is a target site that meets a preset condition. The preset condition includes: the identity of the target site is a primary site, and the target site is assigned to the local site and has not yet established a disaster recovery relationship with the local site. The integrated service module 601 is also configured to establish a disaster recovery relationship with the target site that meets the preset condition.

Specifically, the operation and maintenance personnel can set the identity of the site through simple configuration of the interface provided by the integrated service module 601, for example, set the identity of one of the sites as the primary site.

The message middleware 602 is configured to store the identity information of the local office from the integrated service module 601.

The data backup module 603 is configured to monitor the identity information of the local site in the message middleware and, after the disaster recovery relationship between the local site and the target site is established, to back up, according to the identity information of the local site, the data to be backed up in the target site for the target site.
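The cooperation of the three modules can be sketched with a small publish/subscribe model. This is an illustration only: the class and method names are hypothetical, an in-process object stands in for the message middleware 602, and the strings returned by `action` merely label the behavior described above (a primary site produces data to be backed up, a backup site pulls and restores it).

```python
class MessageMiddleware:
    """Stores the local site's identity and notifies subscribers of changes."""

    def __init__(self):
        self._identity = None
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)
        if self._identity is not None:
            callback(self._identity)      # deliver the current identity immediately

    def publish_identity(self, identity):
        self._identity = identity
        for callback in self._listeners:
            callback(identity)


class IntegratedServiceModule:
    """Entry point through which the site identity is configured."""

    def __init__(self, middleware):
        self.middleware = middleware

    def set_identity(self, identity):
        if identity not in ("primary", "backup"):
            raise ValueError(f"unknown identity: {identity}")
        self.middleware.publish_identity(identity)


class DataBackupModule:
    """Embedded in an application; acts according to the monitored identity."""

    def __init__(self, app_name, middleware):
        self.app_name = app_name
        self.identity = None
        middleware.subscribe(self._on_identity)

    def _on_identity(self, identity):
        self.identity = identity

    def action(self):
        if self.identity == "primary":
            return "produce"              # stage data according to the backup strategy
        if self.identity == "backup":
            return "pull"                 # fetch and restore according to the recovery strategy
        return "idle"                     # identity not configured yet
```

Because every data backup module subscribes to the same middleware, a single call to `set_identity` changes the behavior of all n applications on the site, which is what makes the identity exchange a configuration change rather than a redeployment.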

In a specific example, the initiator of the disaster recovery relationship is generally the integrated service module 601 of the backup site. The integrated service module 601 of the backup site sends a connection request to the target site and, after receiving a connection permission response from the target site, confirms that the disaster recovery relationship has been successfully established. The connection request can be a heartbeat message, which is not limited here. Assume that the integrated service module 601 establishes and maintains the disaster recovery relationship between the primary and backup sites through heartbeat messages. A schematic diagram of the 2+2 disaster recovery architecture when the disaster recovery relationship is established is shown in Figure 11: the integrated service modules 601 in the backup sites D and E in the backup domain each send heartbeat messages to the primary sites A and B in the primary domain.
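One way to picture how a relationship could be maintained through heartbeat messages is a tracker that records when each primary site last answered. The timeout rule below is an assumption made for this sketch, as are the names `HeartbeatTracker`, `record_reply`, and `is_alive`; the source only states that heartbeat messages establish and maintain the relationship.

```python
class HeartbeatTracker:
    """Tracks the last heartbeat reply per primary site (timeout rule is assumed)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self._last_reply = {}

    def record_reply(self, site, now):
        # Called by the backup site when a primary site answers a heartbeat.
        self._last_reply[site] = now

    def is_alive(self, site, now):
        # The relationship counts as maintained while replies keep arriving in time.
        last = self._last_reply.get(site)
        return last is not None and now - last <= self.timeout_s


tracker = HeartbeatTracker(timeout_s=30)
tracker.record_reply("A", now=100.0)
```

Timestamps are passed in explicitly so the sketch stays deterministic; a real deployment would likely use a monotonic clock.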

In a specific example, the integrated service module 601 scans a pre-stored list of primary sites, and the list of primary sites includes all primary sites assigned to the local site. Then the current status of each primary site in the primary site list is detected, and the primary site whose current state has not established a disaster recovery relationship with the local site is determined as the target site that meets the preset conditions.

In a specific example, in the process of backing up the data to be backed up in the target site for the target site, the data backup module 603 pulls the data to be backed up from the storage area of the target site for the backup site to access.

In a specific example, the data to be backed up is bound to the identity information of the target site. When backing up the data to be backed up in the target site for the target site, the data backup module 603 stores the data in the storage area corresponding to the target site according to the identity information. A schematic diagram of the 2+2 disaster recovery architecture when backing up data is shown in Figure 12: the data backup modules 603 in the backup sites D and E in the backup domain each send backup data requests to the primary sites A and B in the primary domain, and then pull the data to be backed up from the storage media of the primary sites A and B respectively.
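Storing pulled data according to the bound identity information can be sketched as routing each record to a per-site, per-application directory. The record fields (`site_id`, `app`, `filename`, `payload`) and the function name `store_backup` are hypothetical; the source only states that the data is bound to the target site's identity information and stored in the corresponding storage area.

```python
from pathlib import Path


def store_backup(backup_root, record):
    """Write one pulled record into the storage area of the site it belongs to."""
    # The identity information bound to the data selects the storage area.
    area = Path(backup_root) / record["site_id"] / record["app"]
    area.mkdir(parents=True, exist_ok=True)
    path = area / record["filename"]
    path.write_bytes(record["payload"])
    return path
```

With this layout, a backup site that serves primary sites A and B keeps their data apart, so the data of a single site can later be published, cleared, or dumped on its own.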

In a specific example, after the disaster recovery relationship between the local site and the target site is established, if the identity exchange between the local site and the target site is detected, the integrated service module 601 sends a connection permission response to the target site upon receiving the connection request from the target site.

In a specific example, after detecting the identity exchange between the local site and the target site, the data backup module 603 puts the data that the local site has backed up for the target site into the storage area of the local site that is provided for backup sites to access, so that the target site can pull it from that storage area.

In a specific example, after detecting that the identities of the local site and the target site are exchanged, the data backup module 603 clears or dumps the data stored in the local site that was backed up for sites other than the target site.

In a specific example, when backing up the data to be backed up in the target site for the target site, each application in the local site backs up the data of the same application in the target site.

It is not difficult to find that this embodiment is a system example corresponding to the first embodiment, and this embodiment can be implemented in cooperation with the first embodiment. The related technical details mentioned in the first embodiment are still valid in this embodiment, and the technical effects that can be achieved in the first embodiment can also be achieved in this embodiment. In order to reduce repetition, details are not repeated here. Correspondingly, the related technical details mentioned in this embodiment can also be applied in the first embodiment.

It is worth mentioning that the modules involved in this embodiment are all logical modules. In practical applications, a logical unit can be a physical unit, a part of a physical unit, or a combination of multiple physical units. In addition, in order to highlight the innovative part of the present invention, this embodiment does not introduce units that are not closely related to solving the technical problem addressed by the present invention, but this does not mean that no other units exist in this embodiment.

The sixth embodiment of the present invention relates to a site, as shown in FIG. 13, comprising: at least one processor 701; and a memory 702 communicatively connected with the at least one processor 701; wherein the memory 702 stores instructions executable by the at least one processor 701, and the instructions are executed by the at least one processor 701, so that the at least one processor 701 can execute the foregoing disaster recovery method.

The memory 702 and the processor 701 are connected by a bus. The bus may include any number of interconnected buses and bridges, and connects the various circuits of the one or more processors 701 and the memory 702 together. The bus can also connect various other circuits such as peripheral devices, voltage regulators, and power management circuits, which are all known in the art and are therefore not further described herein. The bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or multiple elements, such as multiple receivers and transmitters, providing a unit configured to communicate with various other devices on a transmission medium. The data processed by the processor 701 is transmitted on the wireless medium through the antenna; the antenna also receives data and passes it to the processor 701.

The processor 701 is responsible for managing the bus and general processing, and can also provide various functions, including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory 702 may be configured to store data used by the processor 701 when performing operations.

The seventh embodiment of the present invention relates to a computer-readable storage medium storing a computer program. When the computer program is executed by the processor, the above method embodiment is realized.

That is, those skilled in the art can understand that all or part of the steps in the methods of the foregoing embodiments can be implemented by a program instructing the relevant hardware. The program is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disks, optical disks, and other media that can store program code.

A person of ordinary skill in the art can understand that the above-mentioned embodiments are specific examples for realizing the present invention, and that in practical applications various changes can be made to them in form and detail without departing from the spirit and scope of the present invention.

Claims

A disaster tolerance method, including:

If it is determined that the identity of the local site is a backup site, detecting whether there is a target site that meets a preset condition; the preset condition includes: the identity of the target site is a primary site, and the target site is allocated to the local site and has not yet established a disaster recovery relationship with the local site;

If there is the target site that meets the preset conditions, establish the disaster recovery relationship with the target site;

Back up the data to be backed up in the target site for the target site.
The disaster recovery method according to claim 1, wherein the establishing the disaster recovery relationship with the target site comprises:

Send a connection request to the target site, and after receiving a connection permission response from the target site, confirm that the disaster recovery relationship is successfully established.
The disaster recovery method according to claim 1, wherein the detecting whether there is the target site meeting a preset condition comprises:

Scan the pre-stored list of primary sites; the list of primary sites includes all primary sites assigned to the local site;

Detect the current state of each primary site in the primary site list, and determine each primary site whose current state is that it has not established the disaster recovery relationship with the local site as the target site that meets the preset condition.
The disaster recovery method according to claim 1, wherein backing up the data to be backed up in the target site for the target site comprises: pulling the data to be backed up from a storage area of the target site that is provided for the backup site to access.
The disaster recovery method according to claim 1, wherein the data to be backed up is bound with the identity information of the target site;

The backing up the data to be backed up in the target site for the target site includes:

According to the identity information, the data to be backed up is stored in a storage area corresponding to the target site.
The disaster recovery method according to claim 1, wherein after the establishment of the disaster recovery relationship with the target site, the method further comprises:

If it is detected that the identities of the local site and the target site are exchanged, upon receiving the connection request of the target site, a response allowing connection is sent to the target site.
The disaster recovery method according to claim 6, wherein, after detecting that the identities of the local site and the target site are exchanged, the method further comprises:

Put the data that the local site has backed up for the target site into the storage area of the local site that is provided for backup sites to access, so that the target site can pull it from that storage area of the local site.
The disaster recovery method according to claim 6, wherein, after detecting that the identities of the local site and the target site are exchanged, the method further comprises:

Clear or dump the data stored in the local site that was backed up for sites other than the target site.
The disaster recovery method according to claim 1, wherein the backing up the data to be backed up in the target site for the target site comprises:

Each application of the local site backs up the data of the same application in the target site.
A disaster tolerance device, including: an integrated service module, a message middleware, and an application embedded with a data backup module;

The integrated service module is configured to detect, when the identity of the local site is determined to be a backup site, whether there is a target site that meets a preset condition, and is configured to establish a disaster recovery relationship with the target site that meets the preset condition; the preset condition includes: the identity of the target site is a primary site, and the target site is allocated to the local site and has not yet established a disaster recovery relationship with the local site;

The message middleware is configured to store the identity information of the local site from the integrated service module;

The data backup module in the application is configured to monitor the identity information of the local site in the message middleware, and is configured to back up, after the disaster recovery relationship between the local site and the target site is established and according to the identity information of the local site, the data to be backed up in the target site for the target site.
A type of site, including:

At least one processor; and,

A memory communicatively connected with the at least one processor; wherein,

The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the disaster tolerance method according to any one of claims 1 to 9.
A computer-readable storage medium that stores a computer program, which, when executed by a processor, implements the disaster recovery method described in any one of claims 1 to 9.