WO2021052416A1 - Disaster tolerant method and apparatus, site, and storage medium - Google Patents

Disaster tolerant method and apparatus, site, and storage medium

Info

Publication number
WO2021052416A1
Authority
WO
WIPO (PCT)
Prior art keywords
site
target site
disaster recovery
backup
data
Prior art date
Application number
PCT/CN2020/115892
Other languages
French (fr)
Chinese (zh)
Inventor
张贵
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2021052416A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1461 Backup scheduling policy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments

Definitions

  • The embodiments of the present invention relate to the field of communications and information, and in particular to a disaster tolerance method, device, site, and storage medium.
  • In existing system disaster recovery solutions, the disaster recovery relationship is mainly one-to-one, that is, a backup site can back up only one active site. If an active site is added, a new backup site must be built to back up the data produced by the newly added active site, so business expansion is not flexible enough.
  • The purpose of the embodiments of the present invention is to provide a disaster recovery method, device, site, and storage medium in which the backup site actively establishes a disaster recovery relationship with the active site. This avoids the interruption of existing services that business expansion causes at runtime when the active site has to establish the disaster recovery relationship with the backup site itself, and at the same time improves the flexibility of business expansion.
  • The embodiments of the present invention provide a disaster recovery method, including: if the identity of the local site is determined to be a backup site, detecting whether there is a target site that meets preset conditions, where the preset conditions include: the identity of the target site is a primary site, and the target site is assigned to the local site and has not yet established a disaster recovery relationship with the local site; if there is a target site that meets the preset conditions, establishing a disaster recovery relationship with the target site; and backing up, for the target site, the data to be backed up in the target site.
  • The embodiment of the present invention also provides a disaster recovery device, including: an integrated service module, a message middleware, and an application embedded with a data backup module. The integrated service module is configured to detect, when the identity of the local site is determined to be a backup site, whether there is a target site that meets the preset conditions, where the preset conditions include: the identity of the target site is a primary site, and the target site is assigned to the local site and has not yet established a disaster recovery relationship with the local site. The integrated service module is also configured to establish a disaster recovery relationship with the target site that meets the preset conditions.
  • The message middleware is configured to store the identity information of the local site received from the integrated service module. The data backup module in the application is configured to monitor the identity information of the local site in the message middleware and, after the disaster recovery relationship between the local site and the target site is established, to back up, for the target site, the data to be backed up in the target site according to the identity information of the local site.
  • The embodiment of the present invention also provides a site, including: at least one processor; and a memory communicatively connected with the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the above disaster recovery method.
  • The embodiment of the present invention also provides a computer-readable storage medium that stores a computer program which, when executed by a processor, implements the disaster recovery method described above.
  • The backup site actively detects each active site that is assigned to the local site and has not yet established a disaster recovery relationship with the local site, and actively requests the establishment of the disaster recovery relationship. A backup site can therefore establish disaster recovery relationships with multiple primary sites, which solves the problem that the disaster recovery architecture is relatively rigid when the disaster recovery relationship is one-to-one.
  • The primary site does not need to care about the status of the standby site, which improves the flexibility of business expansion.
  • Establishing a disaster recovery relationship with the target site includes: sending a connection request to the target site, and confirming that the disaster recovery relationship is successfully established after receiving a connection permission response from the target site.
  • Because the backup site actively initiates the connection, the primary site does not need to know its corresponding backup sites, which reduces the workload of the primary site in the process of establishing a disaster recovery relationship.
  • Detecting whether there is a target site that meets the preset conditions includes: scanning the pre-stored list of primary sites, where the list includes all the primary sites assigned to the local site; checking the current status of each primary site in the list; and determining each primary site that has not yet established a disaster recovery relationship with the local site to be a target site that meets the preset conditions.
  • A list of primary sites is pre-stored in the backup site, and scanning the list for primary sites that have not yet established a disaster recovery relationship provides a specific detection method. This ensures that the backup site establishes disaster recovery relationships only with primary sites that meet the preset conditions.
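
The detection step above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the `PrimaryEntry` record, its `relationship_established` flag, and the identity strings `"backup"` and `"primary"` are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class PrimaryEntry:
    """One row of the pre-stored primary site list (hypothetical layout)."""
    site_id: str
    relationship_established: bool = False


def find_target_sites(local_identity, primary_list):
    """Return the IDs of primary sites that meet the preset conditions."""
    if local_identity != "backup":
        return []  # only a backup site performs the detection
    # Every entry in the list is assigned to the local site by construction,
    # so the remaining condition is "no relationship established yet".
    return [e.site_id for e in primary_list if not e.relationship_established]


primary_list = [PrimaryEntry("A", relationship_established=True), PrimaryEntry("B")]
print(find_target_sites("backup", primary_list))
```

Because the list itself encodes the assignment, adding a primary site to a backup site's list is enough to make it a detection candidate on the next scan.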
  • Backing up, for the target site, the data to be backed up in the target site includes: pulling the data to be backed up from the storage area that the target site provides for backup sites to access.
  • The backup site actively pulls the data to be backed up from each active site with which it has established a disaster recovery relationship, so the active site does not need to distribute the data to the corresponding backup sites. This reduces the workload of the active site during data backup.
  • The data to be backed up is bound with the identity information of the target site. Backing up, for the target site, the data to be backed up in the target site includes: storing the data to be backed up in the storage area corresponding to the target site according to the identity information.
  • From the identity information, the source of the acquired data to be backed up can be known, and all the acquired data can be distinguished by identity information, which makes the acquired data easier to store and manage.
  • The identity information is unique, which ensures that the source of the acquired data to be backed up is reliable.
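
As an illustration of storing pulled data by identity information, the sketch below keeps one storage area per primary site. The `BackupStore` class and the record layout (`site_id`, `payload`) are assumptions made for this example and do not come from the patent text.

```python
from collections import defaultdict


class BackupStore:
    """Keeps a separate storage area for each primary site (hypothetical)."""

    def __init__(self):
        self._areas = defaultdict(list)  # site_id -> list of payloads

    def store(self, record):
        # The identity information bound to the data selects the storage area.
        self._areas[record["site_id"]].append(record["payload"])

    def area(self, site_id):
        # Return a copy so callers cannot modify the stored data by accident.
        return list(self._areas.get(site_id, []))


store = BackupStore()
store.store({"site_id": "A", "payload": "a-data-1"})
store.store({"site_id": "C", "payload": "c-data-1"})
print(store.area("A"), store.area("C"))
```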
  • After the disaster recovery relationship with the target site is established, the method further includes: if it is detected that the identities of the local site and the target site have been exchanged, sending a connection permission response to the target site when a connection request from the target site is received.
  • This embodiment provides a processing method after the master and backup identities are exchanged.
  • After it is detected that the identities of the local site and the target site have been exchanged, the method further includes: putting the backup data of the target site stored in the local site into the preset storage area of the local site, so that the target site can pull it from that preset storage area. This describes how the local site works after the identity exchange.
  • After it is detected that the identities of the local site and the target site have been exchanged, the method further includes: clearing or dumping the data stored in the local site that was backed up for sites other than the target site. Keeping the backup data of primary sites in other disaster recovery relationships separate from that of the target site lets the target site pull its data directly after the identity exchange and makes the backup data easier for the target site to pull.
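
The clearing or dumping step can be illustrated as a split of the per-site backup areas. This is a sketch under assumptions: the dictionary layout (site ID mapped to a list of backed-up items) and the function name `retain_only_target` are hypothetical.

```python
def retain_only_target(areas, target_id):
    """Split per-site backup areas into (kept, dumped) after an identity exchange.

    `areas` maps a primary site ID to the data backed up for that site.
    Only the target site's data is kept; everything else is returned
    separately so the caller can delete it or move it elsewhere.
    """
    kept = {target_id: areas[target_id]} if target_id in areas else {}
    dumped = {sid: data for sid, data in areas.items() if sid != target_id}
    return kept, dumped


kept, dumped = retain_only_target({"A": ["a1"], "C": ["c1", "c2"]}, "C")
print(kept, dumped)
```

Returning the dumped data instead of deleting it in place leaves the choice between clearing and dumping to the caller.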
  • Backing up, for the target site, the data to be backed up in the target site includes: each application of the local site backs up the data of the same application in the target site. The applications in the local site and the target site correspond one-to-one, and each application in the local site backs up the data of its corresponding application in the target site. This provides a data backup method that makes the backup process more organized.
  • FIG. 1 is a schematic structural diagram of a 2+2 disaster tolerance architecture in the first embodiment of the present invention
  • FIG. 2 is a flowchart of a disaster recovery method in the first embodiment of the present invention
  • FIG. 3 is a schematic diagram of the disaster recovery relationship of the 2+2 disaster recovery architecture in the first embodiment of the present invention
  • FIG. 4 is a schematic diagram of the disaster recovery relationship of the 3+2 disaster recovery architecture in the first embodiment of the present invention
  • FIG. 5 is a schematic diagram of the disaster recovery relationship of the 3+3 disaster recovery architecture in the first embodiment of the present invention.
  • FIG. 6 is a flowchart of a disaster recovery method in the second embodiment of the present invention.
  • FIG. 7 is a flowchart of a disaster recovery method in the third embodiment of the present invention.
  • FIG. 8 is a flowchart of a disaster recovery method in the fourth embodiment of the present invention.
  • FIG. 9 is a flowchart of a disaster recovery method in the fifth embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a disaster recovery device in a sixth embodiment of the present invention.
  • FIG. 11 is a schematic diagram of establishing the disaster recovery relationships of a 2+2 disaster recovery architecture in the sixth embodiment of the present invention.
  • FIG. 12 is a schematic diagram of data backup in a 2+2 disaster recovery architecture in the sixth embodiment of the present invention.
  • FIG. 13 is a schematic diagram of the structure of a site in the seventh embodiment of the present invention.
  • If the primary site actively establishes the disaster recovery relationship with the backup site, business is interrupted during business expansion, which limits the flexibility of business expansion. Based on this, the inventor proposed the technical solution of this application.
  • the first embodiment of the present invention relates to a disaster tolerance method.
  • In this embodiment, the local site establishes a disaster recovery relationship with each active site that is allocated to the local site and has not yet established a disaster recovery relationship with it, and then, as the backup site, backs up the data in that active site.
  • the entire disaster recovery method involves an M+N disaster recovery architecture, which is composed of an active domain and a standby domain, where the active domain includes M active sites, and the standby domain includes N standby sites. It can be considered that the disaster tolerance architecture consists of M primary sites and N backup sites, and the primary sites and backup sites can communicate with each other.
  • An active site can establish a disaster recovery relationship with multiple backup sites, and a backup site can also establish a disaster recovery relationship with multiple active sites.
  • Each site can be deployed in the same area or in different areas according to actual planning requirements.
  • both M and N are natural numbers greater than or equal to 1.
  • In the 2+2 disaster tolerance architecture of FIG. 1, the active domain includes active sites A and B, and the backup domain includes backup sites D and E.
  • the implementation details of the disaster recovery method of this embodiment will be described in detail below. The following content is only provided for ease of understanding and is not necessary for the implementation of this solution.
  • the specific process is shown in Figure 2, including:
  • Step 101: When it is determined that the identity of the local site is a backup site, detect whether there is a target site that meets the preset conditions; if one exists, proceed to step 102; otherwise, the process ends.
  • Specifically, the local site scans the pre-stored primary site list, which includes all the primary sites assigned to the local site. It then checks the current status of each primary site in the list and determines each primary site that has not yet established a disaster recovery relationship with the local site to be a target site that meets the preset conditions.
  • The identity of a site is either primary site or backup site.
  • The operation and maintenance personnel can set the identity of a site by configuring the site through a specific interface.
  • Step 102: Establish a disaster recovery relationship with the target site.
  • When the local site detects a target site that meets the preset conditions, the local site, acting as the backup site, sends a connection request to the target site and, after receiving a connection permission response from the target site, confirms that the disaster recovery relationship is successfully established.
  • The connection request can include the identity information of the local site, which the target site identifies and records. In this way the target site knows the source of the connection request, that is, the peer with which the disaster recovery relationship is established.
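
The request and response exchange of step 102 can be sketched as follows. This is an in-process illustration only: the `PrimarySite` and `BackupSite` classes, the dictionary message format, and the `allowed` field are hypothetical, and a real deployment would exchange these messages over the network.

```python
class PrimarySite:
    def __init__(self, site_id):
        self.site_id = site_id
        self.known_backups = set()

    def handle_connection_request(self, request):
        # Record the identity carried in the request, then allow the connection.
        self.known_backups.add(request["backup_id"])
        return {"allowed": True, "primary_id": self.site_id}


class BackupSite:
    def __init__(self, site_id):
        self.site_id = site_id
        self.established = set()

    def connect(self, primary):
        # The backup site initiates; the request carries its own identity.
        response = primary.handle_connection_request({"backup_id": self.site_id})
        if response.get("allowed"):
            self.established.add(response["primary_id"])
            return True
        return False


d, c = BackupSite("D"), PrimarySite("C")
print(d.connect(c), d.established, c.known_backups)
```

The primary site only reacts to incoming requests, which matches the idea that it does not need to know its backup sites in advance.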
  • Step 103: Back up, for the target site, the data to be backed up in the target site.
  • The data to be backed up in the target site is pulled from the preset storage area of the target site.
  • This method resolves the coupling that a many-to-many disaster recovery relationship would otherwise cause.
  • The active site only needs to produce the backup data, which is decoupled from the recovery process of the standby site, and the active sites in the active domain and the backup sites in the backup domain do not affect one another. A failure of one primary site to produce data to be backed up, or of one backup site to back up data, does not make the entire disaster recovery process unavailable.
  • the data to be backed up is bound with the identity information of the target site, and the local site will store the data to be backed up in a storage area corresponding to the target site based on the identity information.
  • The data to be backed up corresponds one-to-one to the applications in the target site. The applications of the local site back up, for the target site, the data to be backed up in each application of the target site, and the applications of the local site correspond one-to-one to the applications of the target site.
  • Each application in the primary site produces the data required for disaster recovery, which is the data to be backed up.
  • Each application independently determines its backup strategy according to the particularity of its own business and related parameters.
  • The backup strategy includes periodic backup, full backup, incremental backup, and so on, which are not limited here.
  • For critical applications, the backup period can be as short as possible; for example, a backup can be made every 30 seconds.
  • For non-critical applications, a backup can be made once an hour.
  • For applications with a large amount of data, incremental synchronization can be chosen; otherwise, full backup can be chosen. It should be noted that critical and non-critical applications can be distinguished based on preset standards.
  • The criteria for defining critical and non-critical applications are not limited here and can be determined according to the actual situation.
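
The strategy choices described above can be expressed as a small selection function. This is a sketch: the 30-second and one-hour periods are the examples given in the text, while the `BackupPolicy` record and the boolean inputs `critical` and `large_data` are hypothetical, since the text leaves the classification criteria open.

```python
from dataclasses import dataclass


@dataclass
class BackupPolicy:
    period_seconds: int
    mode: str  # "incremental" or "full"


def choose_policy(critical: bool, large_data: bool) -> BackupPolicy:
    """Pick a backup period and mode for one application."""
    period = 30 if critical else 3600  # example periods from the text
    mode = "incremental" if large_data else "full"
    return BackupPolicy(period, mode)


print(choose_policy(critical=True, large_data=False))
```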
  • In the primary domain, multiple primary sites may be backed up by the same backup site. Therefore, each primary site carries its primary site identifier in the data during backup, that is, it binds its identity information to the data.
  • the application in each backup site in the backup domain pulls the corresponding backup data from the storage medium of the active site in the active domain for data synchronization.
  • the period, frequency, and timing of data synchronization can be determined by each application by configuring related parameters according to its own business.
  • the applications in each site in the backup domain restore the backup data to the backup site to maintain the consistency of the data in the primary and backup sites.
  • In the backup domain, one backup site may back up multiple active sites. Therefore, when data is restored, the active site identifiers carried in the backup data need to be restored together with it.
  • The versions of the sites in the active domain and the backup domain should be consistent. When the versions are inconsistent, for example when the primary domain contains both higher and lower versions, the sites in the backup domain should use the higher version based on the principle of backward compatibility, so that the backup domain can compatibly restore the lower-version data in the primary domain.
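
The version rule can be illustrated with a simple comparison. This sketch assumes versions are represented as tuples of integers such as `(2, 1)` and that a higher version can always restore lower-version data; both the representation and the function names are assumptions for the example.

```python
def backup_domain_version(primary_versions):
    """Choose the version the backup domain should run: the highest one in use."""
    return max(primary_versions)


def can_restore(backup_version, data_version):
    """Assume backward compatibility: a site restores data of its own or a lower version."""
    return data_version <= backup_version


chosen = backup_domain_version([(1, 0), (2, 1)])
print(chosen, can_restore(chosen, (1, 0)))
```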
  • The local site first determines its own identity. If the local site is a standby site, it publishes its identity to the message middleware as the standby site. In the same way, if the identity of the local site is the primary site, it publishes its identity to the message middleware as the primary site. The standby site then sends a heartbeat message to each active site that meets the preset conditions and is in the activated state. For example, the backup site D sends a heartbeat message to the active site A to request the establishment of a connection. If the standby site receives a response that allows the connection, it confirms that the disaster recovery relationship is established. If no response is received, the site to which the heartbeat message was sent is set to an abnormal state in the local site.
  • For example, if the backup site D sends a heartbeat message to the active site A, which meets the preset conditions and is in the activated state, but receives no response to the heartbeat message, the backup site D records the status of the active site A as abnormal and sends an alarm notification.
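
The heartbeat handling above can be sketched with injected callbacks so that it runs without a network. The function name `send_heartbeats`, the status strings, and the convention that `send` returns a truthy value for a connection permission response and `None` when no response arrives are all assumptions for illustration.

```python
def send_heartbeats(primaries, send, alarm):
    """Send a heartbeat to each candidate primary site and record the outcome.

    `send(site_id)` is assumed to return a truthy value when the site answers
    with a connection permission response and None when nothing is received.
    `alarm(site_id)` is called for every site that does not answer.
    """
    status = {}
    for site_id in primaries:
        if send(site_id):
            status[site_id] = "established"
        else:
            status[site_id] = "abnormal"
            alarm(site_id)
    return status


alarms = []
result = send_heartbeats(["A", "B"], lambda s: True if s == "B" else None, alarms.append)
print(result, alarms)
```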
  • Each application in the site will monitor the site identity information stored in the message middleware, and then perform corresponding actions based on the site identity.
  • The primary site produces the data to be backed up, and the backup site backs up and restores the data to be backed up in each primary site with which it has established a disaster recovery relationship.
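
How applications might react to the identity stored in the message middleware can be sketched with a tiny publish and subscribe model. The `MessageMiddleware` and `App` classes and the action names are hypothetical; the patent does not name a specific middleware product or API.

```python
class MessageMiddleware:
    """Stores the site identity and notifies subscribed applications."""

    def __init__(self):
        self.identity = None
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def publish_identity(self, identity):
        self.identity = identity
        for callback in self._listeners:
            callback(identity)


class App:
    def __init__(self, name):
        self.name = name
        self.action = None

    def on_identity(self, identity):
        # A primary site produces backup data; a backup site pulls and restores it.
        self.action = "produce_backup_data" if identity == "primary" else "pull_and_restore"


middleware, app = MessageMiddleware(), App("billing")
middleware.subscribe(app.on_identity)
middleware.publish_identity("backup")
print(app.action)
```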
  • For example, the disaster tolerance architecture is of the 2+2 type, that is, 2 primary sites and 2 backup sites.
  • the specific disaster recovery relationship is shown in Figure 3.
  • Suppose a new primary site C needs to be opened in a certain area with twofold disaster recovery redundancy, that is, two backups.
  • the operation and maintenance personnel set the identity of site C as the primary site on the configuration interface.
  • The operation and maintenance personnel can configure the environment information of the active site C on the current standby sites D and E, that is, add the active site C to the active site lists of the standby sites D and E, and set the active site C to the activated state.
  • The environment information includes port information and is not limited here. Take the backup site D as an example. After the local site D determines that it is a backup site, it detects whether there is an active site that meets the preset conditions, that is, an active site that is assigned to it and has not yet established a disaster recovery relationship with it. Because the active site C in its list meets these conditions, the backup site D sends a connection request to the active site C.
  • the connection request can be a heartbeat message, which is not limited here.
  • After the primary site C receives the connection request from the backup site D, it stores the information of the backup site D and then sends a connection permission response to the backup site D, confirming that the disaster recovery relationship between the primary site C and the backup site D is successfully established.
  • Each application on the primary site C stores the data to be backed up in the storage medium according to the backup strategy formulated by that application.
  • The applications in the backup site D obtain the data to be backed up from the storage medium of the primary site C according to their respective recovery strategies.
  • The acquisition methods include rsync, FTP, and so on, which are not limited here, and the data is finally restored.
  • The same is true for the standby site E, so the details are not repeated here. At this point, the disaster tolerance architecture has been expanded from 2+2 to 3+2, as shown in Figure 4. When the disaster recovery architecture is expanded to 3+2, the disaster recovery relationship between sites is shown in Table 2:
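
The expansion step can be sketched as a configuration change on the backup sites only. This illustration tracks just the relationships the text states for the new site C (assignment to D and E); the existing entries for A and B are deliberately left out because the relationship tables are not reproduced here. The function names and the dictionary layout are hypothetical.

```python
def add_primary(backup_lists, new_primary, backups):
    """Add a new primary site to the primary site lists of the given backup sites."""
    for backup_id in backups:
        sites = backup_lists.setdefault(backup_id, [])
        if new_primary not in sites:
            sites.append(new_primary)
    return backup_lists


def relationships(backup_lists):
    """List (primary, backup) pairs implied by the primary site lists."""
    return sorted((p, b) for b, primaries in backup_lists.items() for p in primaries)


lists = add_primary({"D": [], "E": []}, "C", ["D", "E"])
print(relationships(lists))
```

No change is made on the new primary site itself: each backup site picks up the new entry on its next scan and initiates the connection.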
  • Suppose the current twofold disaster tolerance redundancy does not meet the disaster tolerance requirements, and a backup site needs to be added.
  • Assume the added site is F. Site F is set up first, and then the network plane is connected so that the new site F can communicate with sites A, B, D, and E.
  • the operation and maintenance personnel set the identity of site F as the backup site on the configuration interface.
  • The operation and maintenance personnel configure the environment information of the primary sites A and C on site F, that is, add the primary sites A and C to the primary site list of site F, and set the primary sites A and C to the activated state. It should be noted that the operation and maintenance personnel can also configure only the environment information of the primary site A on site F. That is, configuring the environment information of the primary sites A and C on site F is only one of the options available in actual implementation and is not a necessary condition for realizing this solution. After the local site F determines that it is a backup site, it detects whether there is an active site that meets the preset conditions, that is, an active site that is assigned to it and has not yet established a disaster recovery relationship with it.
  • The primary sites A and C in the primary site list of the backup site F meet the preset conditions. Take the primary site C as an example.
  • When the local site F scans the information of the primary site C in the primary site list, it sends a connection request to the primary site C.
  • the connection request can be a heartbeat message.
  • After the active site C receives the connection request from the backup site F, it stores the information of the backup site F and then sends a connection permission response to the backup site F, confirming that the disaster recovery relationship between the primary site C and the backup site F is successfully established. After the disaster recovery relationship is established, each application on the primary site C stores the data to be backed up in the storage medium according to the backup strategy formulated by that application.
  • The applications in the backup site F obtain the data to be backed up from the storage medium of the primary site C according to their respective recovery strategies.
  • The acquisition methods include rsync, FTP, and so on, which are not limited here, and the data is finally restored.
  • The same is true for the standby site E, so the details are not repeated here. At this point, the disaster tolerance architecture has been expanded from 3+2 to 3+3, as shown in Figure 5.
  • the disaster recovery relationship between the sites when the disaster recovery architecture is expanded to 3+3 is shown in Table 3:
  • In this embodiment, after determining that it is a backup site, the local site takes the initiative to establish a disaster recovery relationship with each active site that is assigned to it and has not yet established a disaster recovery relationship with it, and then backs up the data of that active site.
  • The primary site does not need to make the request itself. This avoids the interruption of existing services that business expansion causes at runtime when the primary site actively establishes the disaster recovery relationship, and improves the flexibility of business expansion.
  • the second embodiment of the present invention relates to a disaster tolerance method.
  • In this embodiment, the identities of the local site and the target site, which have established a disaster recovery relationship, are exchanged.
  • The identity of the local site changes from backup site to active site, and the identity of the target site changes from primary site to backup site. Therefore, the local site only needs to send a connection permission response to the target site when it receives a connection request from the target site.
  • This embodiment takes into account the situation of active/standby switching.
  • the implementation details of the disaster recovery method of this embodiment will be described in detail below. The following content is only provided for ease of understanding and is not necessary for the implementation of this solution.
  • the specific process is shown in Figure 6, including:
  • Step 201: When it is determined that the identity of the local site is a backup site, detect whether there is a target site that meets the preset conditions; if one exists, proceed to step 202; otherwise, the process ends.
  • Step 202: Establish a disaster recovery relationship with the target site.
  • Step 203: Back up, for the target site, the data to be backed up in the target site.
  • Steps 201-203 are respectively similar to steps 101-103 in the first embodiment, and will not be repeated here.
  • Step 204: Detect whether the identities of the local site and the target site have been exchanged; if so, proceed to step 205; otherwise, the process ends.
  • Step 205: Receive a connection request from the target site, and send a connection permission response to the target site.
  • It should be noted that in this embodiment the detection in step 204 of whether the identities of the local site and the target site have been exchanged is performed after the data to be backed up in the target site is backed up in step 203. This embodiment provides only one case. In actual implementation, step 204 can also be performed at the same time as step 203, or after step 203, and these cases all fall within the scope of protection.
  • The disaster recovery relationship designed by this method is many-to-many, that is, the data of one primary site may be backed up on several backup sites. When a disaster occurs, the operation and maintenance personnel manually select a backup site in the backup domain on the interface, according to the data recovery status of each backup site, and promote it to primary site.
  • For example, the current disaster recovery architecture is 3+3: the primary sites are A, B, and C, and the backup sites are D, E, and F. The area where the primary site C is located is hit by a natural disaster, site C is unavailable, and services need to be restored quickly.
  • the operation and maintenance personnel can view the health status and current operation status of the active site C in the standby sites D, E, and F respectively.
  • Suppose the backup site F is selected. According to the disaster recovery relationship list, the operation and maintenance personnel configure the backup site F as an active site to take over the work of the active site C, so that the identities of the backup site F and the active site C are exchanged.
  • the operation and maintenance personnel restore the active site C, and configure the active site C as a backup site to replace the backup site F to complete the service restoration work.
  • the local site F as the primary site only needs to receive the connection request from the target site C and send a connection permission response to the site C.
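
The behavior after the identity exchange can be sketched with two site objects whose roles are swapped. The `Site` class, the identity strings, and the rule that only a site currently acting as primary answers connection requests are assumptions made for this illustration.

```python
class Site:
    def __init__(self, site_id, identity):
        self.site_id = site_id
        self.identity = identity  # "primary" or "backup"
        self.backups = set()

    def handle_connection_request(self, requester_id):
        # Only a site currently acting as primary answers connection requests.
        if self.identity != "primary":
            return False
        self.backups.add(requester_id)
        return True


def swap_identities(a, b):
    a.identity, b.identity = b.identity, a.identity


f, c = Site("F", "backup"), Site("C", "primary")
swap_identities(f, c)
print(f.identity, c.identity, f.handle_connection_request(c.site_id))
```

After the swap, site C uses the same backup-side logic as any other backup site, so no separate code path is needed for a restored site.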
  • Each site monitors its own message middleware messages and executes corresponding actions based on the site's identity. That is, the primary site produces the data to be backed up according to the backup strategy, and the backup site obtains the backup data for restoration according to the recovery strategy.
  • The local site, now the primary site, only needs to respond to the connection request of the target site, which has become the backup site.
  • After the identities are exchanged, the essence is still that the backup site actively establishes a disaster recovery relationship with each active site that is assigned to it and has not yet established a disaster recovery relationship with it, without an active request from the primary site. This avoids the interruption of existing services that business expansion causes at runtime when the primary site actively establishes the disaster recovery relationship, and improves the flexibility of business expansion.
  • the third embodiment of the present invention relates to a disaster tolerance method.
  • the data that the local site backed up for the target site is also placed in the storage area of the local site that is accessible to the backup site, so that the target site can pull it from that storage area.
  • the implementation details of the disaster recovery method of this embodiment will be described in detail below. The following content is only provided for ease of understanding and is not necessary for the implementation of this solution.
  • the specific process is shown in Figure 7, including:
  • Step 301: When it is determined that the identity of the local site is a backup site, detect whether there is a target site that meets a preset condition; if so, go to step 302; otherwise, the process ends.
  • Step 302: Establish a disaster recovery relationship with the target site.
  • Step 303: Back up, for the target site, the data to be backed up in the target site.
  • Steps 301-303 are respectively similar to steps 101-103 in the first embodiment, and will not be repeated here.
  • Step 304: Detect whether the identities of the local site and the target site have been exchanged; if so, go to step 305; otherwise, the process ends. This step is similar to step 204 and will not be repeated here.
  • In this embodiment, the detection in step 304 is executed after step 303 has backed up, for the target site, the data to be backed up in the target site. This embodiment provides only one case.
  • In practice, step 304 can be performed at the same time as step 303 or after step 303, and all of these cases fall within the scope of protection.
  • Step 305: Place the data that the local site backed up for the target site in the storage area of the local site that is accessible to the backup site, so that the target site can pull it from that storage area.
  • Step 306: Receive a connection request from the target site and send a connection permission response to the target site. This step is similar to step 205 and will not be repeated here.
  • In this embodiment, step 305 is described before step 306, in which the connection request of the target site is received and the connection permission response is sent. This is only one case.
  • In practice, step 305 can be performed at the same time as step 306 or after step 306, and all of these cases fall within the scope of protection.
  • In this embodiment, the data that the local site backed up for the target site is placed in the storage area of the local site that is accessible to the backup site. The target site, whose identity has changed to backup site, pulls the data from that storage area itself, so the local site, whose identity has changed to primary site, does not need to send it actively.
  • the fourth embodiment of the present invention relates to a disaster tolerance method.
  • In this embodiment, after the identity exchange between the local site and the target site is detected, the data stored in the local site that was backed up for sites other than the target site needs to be cleared or dumped. That is, the backup data of the primary sites in the local site's other disaster recovery relationships is stored separately from the backup data of the target site involved in the identity exchange. This lets the target site pull its data directly after the exchange and makes the pull less difficult.
  • the implementation details of the disaster recovery method of this embodiment will be described in detail below. The following content is only provided for ease of understanding and is not necessary for the implementation of this solution.
  • the specific process is shown in Figure 8, including:
  • Step 401: When it is determined that the identity of the local site is a backup site, detect whether there is a target site that meets a preset condition; if so, go to step 402; otherwise, the process ends.
  • Step 402: Establish a disaster recovery relationship with the target site.
  • Step 403: Back up, for the target site, the data to be backed up in the target site.
  • Steps 401-403 are respectively similar to steps 101-103 in the first embodiment, and will not be repeated here.
  • Step 404: Detect whether the identities of the local site and the target site have been exchanged; if so, go to step 405; otherwise, the process ends. This step is similar to step 204 and will not be repeated here.
  • In this embodiment, the detection in step 404 is executed after step 403 has backed up, for the target site, the data to be backed up in the target site. This embodiment provides only one case.
  • In practice, step 404 can be performed at the same time as step 403 or after step 403, and all of these cases fall within the scope of protection.
  • Step 405: Clear or dump the data stored in the local site that was backed up for sites other than the target site.
  • Step 406: Receive a connection request from the target site and send a connection permission response to the target site. This step is similar to step 205 and will not be repeated here.
  • In this embodiment, step 405 is described before step 406, in which the connection request of the target site is received and the connection permission response is sent. This embodiment provides only one case.
  • In practice, step 405 can be performed at the same time as step 406 or after step 406, and all of these cases fall within the scope of protection.
  • In this embodiment, the backup data stored in the local site other than the backup data of the target site needs to be cleared or dumped. That is, the backup data of the primary sites in the local site's other disaster recovery relationships is stored separately from the backup data of the target site involved in the identity exchange, which lets the target site pull its data directly after the exchange and makes the pull less difficult.
  • the fifth embodiment of the present invention relates to a disaster tolerance method.
  • In this embodiment, after the identity exchange between the local site and the target site is detected, two things are required. First, the data stored in the local site that was backed up for sites other than the target site is cleared or dumped, so that the backup data of the primary sites in other disaster recovery relationships is stored separately from the backup data of the target site involved in the identity exchange. Second, the data that the local site backed up for the target site is placed in the storage area of the local site that is accessible to the backup site, so that the target site can pull it from that storage area.
  • Step 501: When it is determined that the identity of the local site is a backup site, detect whether there is a target site that meets a preset condition; if so, go to step 502; otherwise, the process ends.
  • Step 502: Establish a disaster recovery relationship with the target site.
  • Step 503: Back up, for the target site, the data to be backed up in the target site.
  • Steps 501-503 are similar to steps 101-103 in the first embodiment, respectively, and will not be repeated here.
  • Step 504: Detect whether the identities of the local site and the target site have been exchanged; if so, go to step 505; otherwise, the process ends. This step is similar to step 204 and will not be repeated here.
  • In this embodiment, the detection in step 504 is performed after step 503 has backed up, for the target site, the data to be backed up in the target site. This embodiment provides only one case.
  • In practice, step 504 can be performed at the same time as step 503 or after step 503, and all of these cases fall within the scope of protection.
  • Step 505: Place the data that the local site backed up for the target site in the storage area of the local site that is accessible to the backup site, so that the target site can pull it from that storage area.
  • Step 506: Clear or dump the data stored in the local site that was backed up for sites other than the target site. This step is similar to step 405 and will not be repeated here.
  • Step 507: Receive a connection request from the target site and send a connection permission response to the target site. This step is similar to step 205 and will not be repeated here.
  • The order of steps 505, 506, and 507 in this embodiment can be changed according to actual conditions, and all such variations fall within the scope of protection.
  • In this embodiment, when the identity exchange between the local site and the target site is detected, the local site not only clears or dumps the data that it backed up for sites other than the target site, but also places the data that it backed up for the target site in its storage area accessible to the backup site, so that the target site can pull it from there. This provides one way for the local site and the target site to work after the identity exchange.
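Steps 505 to 507 can be sketched as one handler on the site that has just become the primary. This is only an illustration under stated assumptions: the `LocalSite` class, the `pull_area` and `archive` dictionaries, and the `"reject"` reply are hypothetical names introduced here; the source does not specify how the storage areas are represented or what a site does with a request it cannot accept.

```python
class LocalSite:
    """A former backup site handling an identity exchange (illustrative only)."""

    def __init__(self, name, backups):
        self.name = name
        self.identity = "backup"
        self.backups = backups   # {primary_name: [records backed up for it]}
        self.pull_area = {}      # storage area the backup site is allowed to read
        self.archive = {}        # hypothetical destination for dumped data

    def on_identity_exchange(self, target, dump=True):
        self.identity = "primary"
        # Step 505: expose the target site's backup data so it can be pulled.
        self.pull_area[target] = list(self.backups.get(target, []))
        # Step 506: clear (or dump) data backed up for every other site.
        for other in [site for site in self.backups if site != target]:
            data = self.backups.pop(other)
            if dump:
                self.archive[other] = data

    def handle_connection_request(self, requester):
        # Step 507: as the primary site, only answer the request.
        return "allow" if self.identity == "primary" else "reject"


site_f = LocalSite("F", {"C": ["c-1", "c-2"], "B": ["b-1"]})
site_f.on_identity_exchange("C")
reply = site_f.handle_connection_request("C")
```

Passing `dump=False` corresponds to clearing the other sites' data instead of moving it elsewhere.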
  • the sixth embodiment of the present invention relates to a disaster tolerance device.
  • the disaster recovery device includes: an integrated service module 601, a message middleware 602, and an application embedded with a data backup module 603.
  • the number of applications is n, and n is a natural number greater than zero.
  • the specific structure diagram is shown in Figure 10, including:
  • the integrated service module 601 is configured to detect, when the identity of the local site is determined to be a backup site, whether there is a target site that meets a preset condition. The preset condition includes: the identity of the target site is a primary site, and the target site is assigned to the local site and has not yet established a disaster recovery relationship with the local site. The module is also configured to establish a disaster recovery relationship with a target site that meets the preset condition.
  • the operation and maintenance personnel can set the identity of a site through simple configuration on the interface provided by the integrated service module 601, for example, setting the identity of one of the sites to primary site.
  • the message middleware 602 is configured to store the identity information of the local site received from the integrated service module 601.
  • the data backup module 603 is configured to monitor the identity information of the local site in the message middleware and, after the disaster recovery relationship is established between the local site and the target site, to back up, for the target site, the data to be backed up in the target site according to the identity information of the local site.
  • the initiator of establishing the disaster recovery relationship is generally the integrated service module 601 of the backup site.
  • the backup site integrated service module 601 sends a connection request to the target site, and after receiving a response from the target site to allow the connection, confirms that the disaster recovery relationship is successfully established.
  • the connection request can be a heartbeat message, which is not limited here. It is assumed that the integrated service module 601 establishes and maintains the disaster recovery relationship between the primary and backup sites through heartbeat messages.
  • the schematic diagram of the 2+2 disaster recovery framework when the disaster recovery relationship is established is shown in Figure 11.
  • the integrated service modules 601 in backup sites D and E in the backup domain send heartbeat messages to primary sites A and B in the active domain, respectively.
  • the integrated service module 601 scans a pre-stored list of primary sites, and the list of primary sites includes all primary sites assigned to the local site. Then the current status of each primary site in the primary site list is detected, and the primary site whose current state has not established a disaster recovery relationship with the local site is determined as the target site that meets the preset conditions.
  • the data backup module 603 pulls the data to be backed up from the storage area of the target site for the backup site to access.
  • the data to be backed up is bound with the identity information of the target site.
  • the data backup module 603 stores the data to be backed up in the storage area corresponding to the target site according to the identity information.
  • the schematic diagram of the 2+2 disaster recovery architecture when backing up data is shown in Figure 12.
  • the data backup modules 603 in backup sites D and E in the backup domain send backup data requests to primary sites A and B in the active domain, respectively, and then pull the data to be backed up from the storage media of primary sites A and B.
  • the integrated service module 601 sends a connection permission response to the target site upon receiving the connection request from the target site.
  • After detecting the identity exchange between the local site and the target site, the data backup module 603 places the data that the local site backed up for the target site in the storage area of the local site that is accessible to the backup site, so that the target site can pull it from that storage area.
  • the data backup module 603 clears or dumps the data stored in the local site that is backed up by sites other than the target site.
  • in the process of backing up, for the target site, the data to be backed up in the target site, each application of the local site backs up data for the same application in the target site.
  • this embodiment is a system example corresponding to the first embodiment, and this embodiment can be implemented in cooperation with the first embodiment.
  • the related technical details mentioned in the first embodiment are still valid in this embodiment, and the technical effects that can be achieved in the first embodiment can also be achieved in this embodiment. In order to reduce repetition, details are not repeated here. Correspondingly, the related technical details mentioned in this embodiment can also be applied in the first embodiment.
  • modules involved in this embodiment are all logical modules.
  • a logical unit can be a physical unit, a part of a physical unit, or a combination of multiple physical units.
  • this embodiment does not introduce units that are not closely related to solving the technical problems proposed by the present invention, but this does not indicate that there are no other units in this embodiment.
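The cooperation between the three logical modules can be sketched as below. The class and method names (`publish_identity`, `subscribe`, `on_identity`, `set_identity`) and the two action strings are assumptions for illustration; the source describes the roles of modules 601, 602, and 603 but not their interfaces, and a real deployment would use an actual message middleware rather than this in-process stand-in.

```python
class MessageMiddleware:
    """Stand-in for message middleware 602: stores and forwards the site identity."""

    def __init__(self):
        self.identity = None
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def publish_identity(self, identity):
        self.identity = identity            # store the identity of the local site
        for callback in self._listeners:    # notify every listening module
            callback(identity)


class DataBackupModule:
    """Stand-in for data backup module 603 embedded in an application."""

    def __init__(self, middleware):
        self.action = None
        middleware.subscribe(self.on_identity)

    def on_identity(self, identity):
        # A primary site produces the data to be backed up; a backup site pulls it.
        self.action = "pull_backup_data" if identity == "backup" else "produce_backup_data"


class IntegratedServiceModule:
    """Stand-in for integrated service module 601: sets the site identity."""

    def __init__(self, middleware):
        self.middleware = middleware

    def set_identity(self, identity):
        self.middleware.publish_identity(identity)


middleware = MessageMiddleware()
backup_module = DataBackupModule(middleware)
service = IntegratedServiceModule(middleware)
service.set_identity("backup")
```

Routing the identity through the middleware keeps the applications decoupled from the integrated service module: each data backup module reacts only to the identity it observes.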
  • the seventh embodiment of the present invention relates to a site, as shown in FIG. 13, comprising: at least one processor 701; and a memory 702 communicatively connected with the at least one processor 701; wherein the memory 702 stores instructions executable by the at least one processor 701, and the instructions are executed by the at least one processor 701 so that the at least one processor 701 can execute the foregoing disaster tolerance method.
  • the memory 702 and the processor 701 are connected by a bus. The bus may include any number of interconnected buses and bridges, and it connects one or more processors 701 and the various circuits of the memory 702 together.
  • the bus can also connect various other circuits, such as peripheral devices, voltage regulators, and power management circuits, which are all known in the art and therefore will not be further described herein.
  • the bus interface provides an interface between the bus and the transceiver.
  • the transceiver may be one element or multiple elements, such as multiple receivers and transmitters, providing a unit configured to communicate with various other devices on a transmission medium.
  • the data processed by the processor 701 is transmitted on the wireless medium through the antenna, and further, the antenna also receives the data and transmits the data to the processor 701.
  • the processor 701 is responsible for managing the bus and general processing, and can also provide various functions, including timing, peripheral interfaces, voltage regulation, power management, and other control functions.
  • the memory 702 may be configured to store data used by the processor 701 when performing operations.
  • the eighth embodiment of the present invention relates to a computer-readable storage medium storing a computer program.
  • when the computer program is executed by a processor, the above method embodiments are implemented.
  • the program is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Abstract

A disaster tolerant method and apparatus, a site, and a storage medium, relating to the field of communications and information. The disaster tolerant method comprises: if the identity of this site is determined as a standby site, detecting whether a target site satisfying a preset condition exists (101), the preset condition comprising: the identity of the target site being a main site, and the target site being allocated to this site and having not established a disaster backup relationship with this site; if the target site satisfying the preset condition exists, establishing the disaster backup relationship with the target site (102); and backing up, for the target site, data to be backed up in the target site (103). The flexibility of service expansion can be improved.

Description

Disaster tolerance method, device, site and storage medium
Technical field
The embodiments of the present invention relate to the field of communications and information, and in particular to a disaster tolerance method, device, site, and storage medium.
Background
As the management capabilities of telecommunication systems expand and the volume of data grows explosively, high system availability and disaster tolerance of data become particularly important. As an effective architecture for extending system capability, distributed deployment places applications, in the form of microservices and containers, on a Platform as a Service (PaaS), and is widely used in telecommunication systems.
At present, the disaster recovery relationship in system disaster tolerance solutions is mainly one-to-one; that is, one backup site can back up only one primary site. If a primary site is added, a new backup site must be built to back up the data produced by the new primary site, so service expansion is not flexible enough.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a disaster tolerance method, device, site, and storage medium in which the backup site actively establishes the disaster recovery relationship with the primary site. This avoids the interruption of the original service that occurs when services are expanded during operation while the primary site actively establishes the disaster recovery relationship with the backup site, and it also improves the flexibility of service expansion.
To solve the above technical problem, the embodiments of the present invention provide a disaster tolerance method, including: if the identity of the local site is determined to be a backup site, detecting whether there is a target site that meets a preset condition, where the preset condition includes: the identity of the target site is a primary site, and the target site is assigned to the local site and has not yet established a disaster recovery relationship with the local site; if a target site that meets the preset condition exists, establishing a disaster recovery relationship with the target site; and backing up, for the target site, the data to be backed up in the target site.
The embodiments of the present invention also provide a disaster tolerance device, including: an integrated service module, message middleware, and an application embedded with a data backup module. The integrated service module is configured to detect, when the identity of the local site is determined to be a backup site, whether there is a target site that meets a preset condition, where the preset condition includes: the identity of the target site is a primary site, and the target site is assigned to the local site and has not yet established a disaster recovery relationship with the local site; it is also configured to establish a disaster recovery relationship with a target site that meets the preset condition. The message middleware is configured to store the identity information of the local site received from the integrated service module. The data backup module in the application is configured to monitor the identity information of the local site in the message middleware and, after the disaster recovery relationship is established between the local site and the target site, to back up, for the target site, the data to be backed up in the target site according to the identity information of the local site.
The embodiments of the present invention also provide a site, including: at least one processor; and a memory communicatively connected with the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the above disaster tolerance method.
The embodiments of the present invention also provide a computer-readable storage medium storing a computer program, where the above disaster tolerance method is implemented when the computer program is executed by a processor.
Compared with the prior art, in the embodiments of the present invention the backup site actively detects the primary sites that are assigned to the local site and have not yet established a disaster recovery relationship with it, and actively requests the establishment of the relationship. A backup site can establish disaster recovery relationships with multiple primary sites, which solves the problem that the disaster tolerance architecture is relatively rigid when disaster recovery relationships are one-to-one. In addition, the primary site does not need to care about the status of the backup site, which improves the flexibility of service expansion.
In addition, establishing a disaster recovery relationship with the target site includes: sending a connection request to the target site and, after receiving a connection permission response from the target site, confirming that the disaster recovery relationship has been established. The backup site initiates the connection; that is, the primary site does not need to know its corresponding backup site, which reduces the workload of the primary site in establishing the disaster recovery relationship.
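The request and response exchange described above can be sketched as follows. The class names, the message dictionary, and the `"allow"` value are assumptions for illustration; the source says only that the backup site sends a connection request (which may be a heartbeat message) and confirms the relationship on receiving a connection permission response.

```python
class PrimarySite:
    def __init__(self, name):
        self.name = name

    def handle_connection_request(self, backup_name):
        # The primary site only answers; it keeps no record of its backup sites.
        return {"type": "allow", "from": self.name, "to": backup_name}


class BackupSite:
    def __init__(self, name):
        self.name = name
        self.established = set()   # primaries with a confirmed relationship

    def establish(self, primary):
        """Send a connection request and confirm the relationship on 'allow'."""
        reply = primary.handle_connection_request(self.name)
        if reply["type"] == "allow":
            self.established.add(primary.name)
            return True
        return False


site_d = BackupSite("D")
ok_a = site_d.establish(PrimarySite("A"))
ok_b = site_d.establish(PrimarySite("B"))   # one backup site, several primaries
```

Because the backup site holds all of the relationship state, adding a primary site does not require any change on the existing primaries.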
In addition, detecting whether there is a target site that meets the preset condition includes: scanning a pre-stored primary site list, which includes all primary sites assigned to the local site; detecting the current status of each primary site in the list; and determining each primary site whose current status is that it has not yet established a disaster recovery relationship with the local site as a target site that meets the preset condition. The backup site pre-stores the primary site list and scans it for primary sites without an established disaster recovery relationship. This provides a specific detection method and ensures that the backup site establishes disaster recovery relationships only with primary sites that meet the preset condition.
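The scan of the pre-stored primary site list can be sketched as a simple filter. The function name and its parameters are hypothetical; the source does not say how the list or the per-site status is stored.

```python
def find_target_sites(local_identity, assigned_primaries, established):
    """Return the assigned primary sites that still lack a disaster recovery relationship.

    local_identity     -- "backup" or "primary" (identity of the local site)
    assigned_primaries -- pre-stored list of primary sites assigned to the local site
    established        -- set of primaries already in a relationship with the local site
    """
    if local_identity != "backup":
        return []   # only a backup site performs the detection
    return [site for site in assigned_primaries if site not in established]


targets = find_target_sites("backup", ["A", "B", "C"], {"A"})
```

Here the local backup site has been assigned primaries A, B, and C and already has a relationship with A, so B and C are the target sites.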
In addition, backing up, for the target site, the data to be backed up in the target site includes: pulling the data to be backed up from the storage area of the target site that is accessible to the backup site. The backup site actively pulls the data to be backed up from the primary site with which it has established a disaster recovery relationship, and the primary site does not need to distribute the data to the corresponding backup site, which reduces the workload of the primary site during data backup.
In addition, the data to be backed up is bound to the identity information of the target site; backing up, for the target site, the data to be backed up in the target site includes: storing the data to be backed up in the storage area corresponding to the target site according to the identity information. The identity information reveals the source of the acquired data, so all acquired data can be distinguished by identity information, which makes it easier to store and manage. The identity information is also unique, which ensures that the source of the acquired data is reliable.
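Storing pulled data by the identity bound to it can be sketched as below. The record layout (`site_id`, `payload`) and the `BackupStore` class are assumptions for illustration; the source states only that the data carries the target site's identity information and is stored in the area corresponding to that site.

```python
from collections import defaultdict


class BackupStore:
    """Keeps one storage area per primary site, keyed by the bound identity."""

    def __init__(self):
        self._areas = defaultdict(list)

    def store(self, record):
        # The identity bound to the record decides which area receives it.
        self._areas[record["site_id"]].append(record["payload"])

    def area(self, site_id):
        # Return a copy so callers cannot modify the stored data by accident.
        return list(self._areas.get(site_id, []))


store = BackupStore()
for record in [
    {"site_id": "A", "payload": "a-1"},
    {"site_id": "B", "payload": "b-1"},
    {"site_id": "A", "payload": "a-2"},
]:
    store.store(record)
```

Keeping the areas separate per site is also what makes the later clear-or-dump step straightforward: data for other sites can be removed without touching the target site's area.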
In addition, after the disaster recovery relationship is established with the target site, the method further includes: if it is detected that the identities of the local site and the target site have been exchanged, sending a connection permission response to the target site upon receiving a connection request from the target site. This provides one way of handling the situation after the primary and backup identities are exchanged.
In addition, after it is detected that the identities of the local site and the target site have been exchanged, the method further includes: placing the backup data of the target site stored in the local site in a preset storage area of the local site, so that the target site can pull it from that preset storage area. This provides one way for the local site to work after the identity exchange.
In addition, after it is detected that the identities of the local site and the target site have been exchanged, the method further includes: clearing or dumping the data stored in the local site that was backed up for sites other than the target site. The backup data of the primary sites in other disaster recovery relationships is stored separately from the backup data of the target site, which lets the target site pull its data directly after the identity exchange and makes the pull less difficult.
In addition, backing up, for the target site, the data to be backed up in the target site includes: each application of the local site backing up data for the same application in the target site. The applications in the local site and the target site correspond one to one, and each application of the local site backs up the data of its corresponding application in the target site. This provides a data backup method that makes the backup process more orderly.
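The application-to-application correspondence can be sketched as follows. The function name and the application names are hypothetical, and skipping an application that exists on only one side is an assumption; the source describes a one-to-one correspondence and does not cover that case.

```python
def back_up_by_application(local_apps, target_app_data):
    """Return {app_name: copied_records} for applications present on both sites."""
    backups = {}
    for app in local_apps:
        if app in target_app_data:   # only the same application is backed up
            backups[app] = list(target_app_data[app])
    return backups


result = back_up_by_application(
    ["billing", "alarm"],
    {"billing": ["r1"], "alarm": ["r2", "r3"], "topology": ["r4"]},
)
```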
Description of the drawings
One or more embodiments are illustrated by the figures in the corresponding drawings. These illustrations do not limit the embodiments. Elements with the same reference numerals in the drawings denote similar elements, and unless otherwise stated, the figures are not drawn to scale.
FIG. 1 is a schematic structural diagram of the 2+2 disaster tolerance architecture in the first embodiment of the present invention;
FIG. 2 is a flowchart of the disaster tolerance method in the first embodiment of the present invention;
FIG. 3 is a schematic diagram of the disaster recovery relationships of the 2+2 disaster tolerance architecture in the first embodiment of the present invention;
FIG. 4 is a schematic diagram of the disaster recovery relationships of the 3+2 disaster tolerance architecture in the first embodiment of the present invention;
FIG. 5 is a schematic diagram of the disaster recovery relationships of the 3+3 disaster tolerance architecture in the first embodiment of the present invention;
FIG. 6 is a flowchart of the disaster tolerance method in the second embodiment of the present invention;
FIG. 7 is a flowchart of the disaster tolerance method in the third embodiment of the present invention;
FIG. 8 is a flowchart of the disaster tolerance method in the fourth embodiment of the present invention;
FIG. 9 is a flowchart of the disaster tolerance method in the fifth embodiment of the present invention;
FIG. 10 is a schematic structural diagram of the disaster tolerance device in the sixth embodiment of the present invention;
FIG. 11 is a schematic diagram of establishing disaster recovery relationships in the 2+2 disaster tolerance architecture in the sixth embodiment of the present invention;
FIG. 12 is a schematic diagram of backing up data in the 2+2 disaster tolerance architecture in the sixth embodiment of the present invention;
FIG. 13 is a schematic structural diagram of the site in the seventh embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments of the present invention are described in detail below with reference to the accompanying drawings. A person of ordinary skill in the art will understand, however, that many technical details are set out in the embodiments so that the reader can better understand the present application. The technical solutions claimed in this application can nevertheless be realized without these technical details and without the various changes and modifications based on the following embodiments. The division into embodiments below is for convenience of description and should not limit the specific implementation of the present invention; the embodiments may be combined with and refer to one another provided they do not contradict each other.
The inventor found that building a new backup site to back up the data produced by a newly added primary site does achieve disaster tolerance, but it demands considerable resources and inevitably increases the operation and maintenance cost of the system. Moreover, if the primary site actively establishes the disaster recovery relationship with the backup site, service expansion causes service interruption and reduces the flexibility of service expansion. On this basis, the inventor proposes the technical solution of the present application.
The first embodiment of the present invention relates to a disaster tolerance method. In this embodiment, if the identity of the local site is backup site, the local site establishes a disaster recovery relationship with each primary site that is assigned to it and has not yet established a disaster recovery relationship with it. Once the relationship is established, the local site acts as a backup site and backs up the data in the primary site. The method involves an M+N disaster tolerance architecture composed of a primary domain and a backup domain, where the primary domain contains M primary sites and the backup domain contains N backup sites. The architecture can thus be regarded as consisting of M primary sites and N backup sites that can communicate with one another. One primary site may establish disaster recovery relationships with multiple backup sites, and one backup site may establish disaster recovery relationships with multiple primary sites. The sites may be deployed in the same region or in different regions according to actual planning requirements. M and N are both natural numbers greater than or equal to 1. FIG. 1 is a schematic structural diagram for the case where M and N are both 2, that is, a 2+2 disaster tolerance architecture, in which the primary domain includes primary sites A and B and the backup domain includes backup sites D and E. The implementation details of the disaster tolerance method of this embodiment are described below. They are provided only to aid understanding and are not required to implement this solution. The specific flow is shown in FIG. 2 and includes:
Step 101: when the identity of the local site is determined to be backup site, detect whether a target site meeting a preset condition exists; if so, proceed to step 102; otherwise, the flow ends.
Specifically, to detect whether a target site meeting the preset condition exists, the local site first scans a pre-stored primary site list, which includes all primary sites assigned to the local site. It then checks the current state of each primary site in the list and determines each primary site that has not yet established a disaster recovery relationship with the local site to be a target site meeting the preset condition.
In a specific example, the identity of a site is either primary site or backup site, and operation and maintenance personnel can set the identity of a site through a dedicated configuration interface.
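The detection in step 101 can be pictured as a filter over the pre-stored primary site list. The following is a minimal sketch under stated assumptions: the `PrimaryEntry` record, its `dr_established` flag, and the function name `find_target_sites` are hypothetical names chosen for illustration and do not come from the application.

```python
from dataclasses import dataclass


@dataclass
class PrimaryEntry:
    """One row of a backup site's pre-stored primary site list (hypothetical shape)."""
    name: str
    dr_established: bool = False  # True once this backup site has paired with the primary site


def find_target_sites(primary_list):
    """Return the assigned primary sites that still lack a disaster recovery relationship."""
    return [entry for entry in primary_list if not entry.dr_established]


# Assumed scenario: the backup site already backs up A and B; C was just added to its list.
primary_list = [
    PrimaryEntry("A", dr_established=True),
    PrimaryEntry("B", dr_established=True),
    PrimaryEntry("C"),
]
targets = find_target_sites(primary_list)
print([entry.name for entry in targets])
```

An empty result corresponds to the branch in which the flow ends.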
Step 102: establish a disaster recovery relationship with the target site.
Specifically, when a target site meeting the preset condition is detected, the local site, acting as a backup site, sends a connection request to the target site and, after receiving the target site's response allowing the connection, confirms that the disaster recovery relationship has been established. The connection request may contain the identity information of the local site so that the target site can identify and record it. The target site thereby knows the source of the connection request, that is, the party with which it is establishing the disaster recovery relationship.
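The request and response exchange of step 102 might look like the sketch below. It is an in-process illustration only: the class names, the dictionary message fields (`backup_site`, `allowed`), and the method names are assumptions, and a real deployment would carry these messages over a network.

```python
class PrimarySite:
    def __init__(self, name):
        self.name = name
        self.known_backups = set()  # identities of backup sites that have connected

    def handle_connection_request(self, request):
        """Record the requester's identity and answer that the connection is allowed."""
        self.known_backups.add(request["backup_site"])
        return {"allowed": True, "primary_site": self.name}


class BackupSite:
    def __init__(self, name):
        self.name = name
        self.dr_relationships = set()  # primary sites this site now backs up

    def establish(self, target):
        """Send a connection request carrying our identity; confirm on an allow response."""
        reply = target.handle_connection_request({"backup_site": self.name})
        if reply.get("allowed"):
            self.dr_relationships.add(target.name)
            return True
        return False


site_d = BackupSite("D")
site_c = PrimarySite("C")
established = site_d.establish(site_c)
```

The backup site initiates the exchange and the primary site only answers, which is the direction of initiative the method relies on.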
Step 103: back up, for the target site, the data to be backed up in the target site.
Specifically, when the local site has the backup site identity, it pulls the data to be backed up from a preset storage area of the target site. This approach removes the coupling introduced by the many-to-many disaster recovery relationship: a primary site only needs to back up its data and is decoupled from the recovery flow of the backup sites. The primary sites in the primary domain do not affect one another, nor do the backup sites in the backup domain, so the failure of one primary site to produce data to be backed up, or of one backup site to back up data, does not make the whole disaster tolerance process unavailable.
Specifically, the data to be backed up is bound to the identity information of the target site, and the local site stores the data in the storage area corresponding to the target site according to that identity information.
Specifically, the data to be backed up corresponds one to one with the applications in the target site. When the local site backs up the data for the target site, each application of the local site backs up the data to be backed up of the corresponding application of the target site, and the applications of the local site correspond one to one with the applications of the target site.
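Because one backup site may hold data for several primary sites, keying the stored data first by primary site identity and then by application keeps the copies apart. The sketch below assumes an in-memory store; `BackupStore`, its method names, and the sample application names are hypothetical.

```python
class BackupStore:
    """Per-primary-site storage areas on a backup site (illustrative, in-memory)."""

    def __init__(self):
        self._areas = {}  # primary site identity -> {application name: payload}

    def store(self, site_id, app, payload):
        """File a pulled payload under the identity bound to it and its application."""
        self._areas.setdefault(site_id, {})[app] = payload

    def area_for(self, site_id):
        """Return a copy of the storage area that corresponds to one primary site."""
        return dict(self._areas.get(site_id, {}))


store = BackupStore()
store.store("A", "billing", b"a-billing-v1")
store.store("B", "billing", b"b-billing-v1")  # same application, different primary site
store.store("A", "auth", b"a-auth-v1")
```

The two `billing` payloads do not overwrite each other because each lands in the area of its own primary site.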
In a specific example, each application in a primary site produces the data required for disaster tolerance, and this data is to be backed up. Each application decides its own backup policy from the relevant parameters according to the particularities of its service. The backup policy includes the period, full backup, incremental backup, and so on, and is not limited here. For the data of key applications in the system, the backup period can be as short as possible, for example one backup every 30 seconds. For non-key applications, one backup per hour may be chosen. Applications with a large amount of data may choose incremental synchronization; otherwise full backup may be chosen. Note that key and non-key applications can be distinguished by a preset standard; that standard is not limited here and can be decided according to the actual situation. In particular, several primary sites in the primary domain may be backed up by the same backup site, so each primary site attaches its primary site identification bit to the data it backs up, that is, binds its identity information.

Correspondingly, the applications in each backup site of the backup domain pull the corresponding backup data from the storage media of the primary sites that are in the activated state in the primary domain, in order to synchronize data. The period, frequency, and timing of synchronization can be determined by each application by configuring the relevant parameters according to its own service. The applications in each site of the backup domain restore the backup data to the backup site, keeping the data of the primary and backup sites consistent. Because one backup site may back up multiple primary sites, the primary site identification bit carried in the backup data must be restored along with the data. In principle, the versions of all sites in the primary domain and the backup domain should be the same. When versions differ, for example when the primary domain contains both higher and lower versions, the sites in the backup domain should, on the principle of backward compatibility, use the higher version, so that the backup domain can also restore the data of lower-version sites in the primary domain.
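The per-application policy choice described above can be sketched as a small function. The 30-second and one-hour periods come from the example in the text; the 1 GiB threshold for "large" data, the `BackupPolicy` type, and the function name `choose_policy` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupPolicy:
    period_seconds: int
    mode: str  # "full" or "incremental"


def choose_policy(is_key_app, data_size_bytes, large_threshold_bytes=1 << 30):
    """Pick a backup period and mode for one application.

    The threshold separating large from small data sets is an assumed default,
    not a value given in the application.
    """
    period = 30 if is_key_app else 3600
    mode = "incremental" if data_size_bytes >= large_threshold_bytes else "full"
    return BackupPolicy(period_seconds=period, mode=mode)


small_key_app = choose_policy(is_key_app=True, data_size_bytes=10 * 1024)
large_other_app = choose_policy(is_key_app=False, data_size_bytes=2 << 30)
```

In practice each application would supply its own parameters, since the text leaves both the criticality standard and the policy details open.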
In a specific example, suppose a many-to-many disaster recovery relationship is to be established with a 2+2 disaster tolerance architecture, as shown in FIG. 3. Primary sites A and B form the primary domain, and backup sites D and E form the backup domain. The disaster recovery relationships of the 2+2 architecture are shown in Table 1: backup site D has established disaster recovery relationships with primary sites A and B, and backup site E has established a disaster recovery relationship with primary site A. Note that these relationships are only one possible case in actual operation and are not a necessary condition for implementing this solution.
Figure PCTCN2020115892-appb-000001
Table 1
First, the identity of the local site is determined. If the local site is a backup site, it publishes its identity as backup site to the message middleware. Likewise, if the local site is a primary site, it publishes its identity as primary site to the message middleware. The backup site then sends a heartbeat message to each primary site that meets the preset condition and is in the activated state; for example, backup site D sends a heartbeat message to activated primary site A to request a connection. If the backup site receives a response allowing the connection, it confirms that the disaster recovery relationship is established. If no response is received, the site to which the heartbeat message was sent is marked as abnormal in the local site. For example, if backup site D sends a heartbeat message to primary site A, which meets the preset condition and is in the activated state, but receives no response, backup site D records primary site A as abnormal and sends an alarm notification. Each application in a site listens to the site identity information stored in the message middleware and acts according to the site identity: a primary site produces the data to be backed up, and a backup site backs up and restores the data to be backed up from the primary sites with which it has established disaster recovery relationships.
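The heartbeat handling described above, including the abnormal marking and the alarm, can be sketched as follows. The state dictionary layout, the `send` callable standing in for the real transport, and the alarm text are all hypothetical.

```python
def heartbeat(state, primary_name, send):
    """Send one heartbeat to a primary site and update the backup site's records.

    `send` is a stand-in for the real transport: it takes the primary site name
    and returns True when a response allowing the connection arrives.
    """
    try:
        replied = bool(send(primary_name))
    except ConnectionError:
        replied = False
    if replied:
        state["established"].add(primary_name)
        state["abnormal"].discard(primary_name)
    else:
        state["abnormal"].add(primary_name)
        state["alarms"].append(f"no heartbeat reply from primary site {primary_name}")
    return replied


# Assumed scenario: primary site A answers, primary site B does not.
state_d = {"established": set(), "abnormal": set(), "alarms": []}
heartbeat(state_d, "A", lambda name: True)
heartbeat(state_d, "B", lambda name: False)
```

Treating a transport error the same as a missing response is a design choice of this sketch; the text only specifies the no-response case.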
In a specific example, suppose the disaster tolerance architecture is of the 2+2 type, with two primary sites, A and B, and two backup sites, D and E, and with the disaster recovery relationships shown in FIG. 3. Because of service expansion, a new primary site C must be opened in a certain region, and it requires a disaster tolerance redundancy of two, that is, two backups. Site C is first built, and the network plane is then opened up so that the new site C can communicate with sites A, B, D, and E. Operation and maintenance personnel then set the identity of site C to primary site in the configuration interface. At this point they can configure the environment information of primary site C on the current backup sites D and E, that is, add primary site C to the primary site lists of backup sites D and E, and set primary site C to the activated state. Note that if primary site C is not activated, it cannot establish disaster recovery relationships with backup sites. The environment information includes port information and is not limited here.

Take backup site D as an example. After site D determines that it is a backup site, it detects whether there is a primary site meeting the preset condition, that is, a primary site that is assigned to it and has not yet established a disaster recovery relationship with it. Because primary site C is newly added, the primary site list necessarily contains primary site C, which meets the preset condition. When site D finds the information of primary site C while scanning the primary site list, it sends a connection request to primary site C; the connection request may be a heartbeat message and is not limited here. When primary site C receives the connection request from backup site D, it stores the information of backup site D, and the activated primary site C then sends backup site D a response allowing the connection, confirming that the disaster recovery relationship between primary site C and backup site D has been established. After the relationship is established, each application on primary site C stores its data to be backed up in the storage medium according to the backup policy defined by that application. The applications in backup site D obtain the data to be backed up from the storage medium of primary site C according to their respective recovery policies, using methods such as Rsync or FTP, which are not limited here, and finally restore the data. The same applies to backup site E and is not repeated here. The disaster tolerance architecture has thus been expanded from the 2+2 type to the 3+2 type, as shown in FIG. 4. The disaster recovery relationships between the sites after expansion to the 3+2 type are shown in Table 2:
Figure PCTCN2020115892-appb-000002
Table 2
Further, in another specific example, suppose that with the 3+2 architecture above, the current disaster tolerance redundancy of two no longer meets the disaster tolerance requirements of primary sites A and C, and another backup point is needed. The architecture can then be expanded from the 3+2 type to the 3+3 type by adding one backup site. Suppose the added site is F. Site F is first built, and the network plane is then opened up so that the new site F can communicate with sites A, B, D, and E. Operation and maintenance personnel then set the identity of site F to backup site in the configuration interface. They configure the environment information of primary sites A and C on site F, that is, add primary sites A and C to the primary site list of site F, and set primary sites A and C to the activated state. Note that they could also configure only the environment information of primary site A on site F; configuring both A and C in this example is only one of the options available in actual implementation and is not a necessary condition for implementing this solution.

After site F determines that it is a backup site, it detects whether there is a primary site meeting the preset condition, that is, a primary site that is assigned to it and has not yet established a disaster recovery relationship with it. Because backup site F is newly added, its primary site list necessarily contains primary sites A and C, which meet the preset condition. Take primary site C as an example. When site F finds the information of primary site C while scanning the primary site list, it sends a connection request to primary site C; the connection request may be a heartbeat message and is not limited here. When primary site C receives the connection request from backup site F, it stores the information of backup site F, and the activated primary site C then sends backup site F a response allowing the connection, confirming that the disaster recovery relationship between primary site C and backup site F has been established. After the relationship is established, each application on primary site C stores its data to be backed up in the storage medium according to the backup policy defined by that application. The applications in backup site F obtain the data to be backed up from the storage medium of primary site C according to their respective recovery policies, using methods such as Rsync or FTP, which are not limited here, and finally restore the data. The same applies to backup site E and is not repeated here. The disaster tolerance architecture has thus been expanded from the 3+2 type to the 3+3 type, as shown in FIG. 5. The disaster recovery relationships between the sites after expansion to the 3+3 type are shown in Table 3:
Figure PCTCN2020115892-appb-000003
Table 3
In this embodiment, once the local site is determined to be a backup site, it actively establishes a disaster recovery relationship with each primary site that is assigned to it and has not yet established a disaster recovery relationship with it, and then backs up data for that primary site. The primary site does not need to initiate the request when the relationship is established. This avoids the interruption of existing services that occurs when a primary site, expanding services while running, actively establishes the disaster recovery relationship, and so improves the flexibility of service expansion.
The second embodiment of the present invention relates to a disaster tolerance method. In this embodiment, the local site and the target site, which have already established a disaster recovery relationship, exchange identities: the local site changes from backup site to primary site, and the target site changes from primary site to backup site. The local site then only needs to send a response allowing the connection when it receives a connection request from the target site. This embodiment covers the case of primary/backup switchover. The implementation details of the disaster tolerance method of this embodiment are described below. They are provided only to aid understanding and are not required to implement this solution. The specific flow is shown in FIG. 6 and includes:
Step 201: when the identity of the local site is determined to be backup site, detect whether a target site meeting a preset condition exists; if so, proceed to step 202; otherwise, the flow ends.
Step 202: establish a disaster recovery relationship with the target site.
Step 203: back up, for the target site, the data to be backed up in the target site.
Steps 201 to 203 are similar to steps 101 to 103 of the first embodiment, respectively, and are not repeated here.
Step 204: detect whether the local site has exchanged identities with the target site; if so, proceed to step 205; otherwise, the flow ends.
Step 205: receive a connection request from the target site, and send the target site a response allowing the connection.
Note that in this embodiment, the detection in step 204 of whether the local site has exchanged identities with the target site is performed after the backup in step 203 of the data to be backed up in the target site; this embodiment presents only one case. In actual implementation, step 204 may be performed at the same time as step 203 or after step 203, and all of these cases fall within the scope of protection.
In a specific example, given the risk and unpredictability of automatic primary/backup switchover, and given that the disaster recovery relationship designed by this method is many-to-many, so that the data of one primary site may be backed up to several backup sites, when a disaster occurs operation and maintenance personnel manually select, in the interface, one backup site in the backup domain to be promoted to primary site, based on the data recovery statistics of each backup site. Suppose the current disaster tolerance architecture is of the 3+3 type, with primary sites A, B, and C and backup sites D, E, and F. A natural disaster strikes the region where primary site C is located, primary site C becomes unavailable, and service must be restored quickly. Operation and maintenance personnel can check, on each of backup sites D, E, and F, the health status and current operation of primary site C. Suppose it is finally decided to exchange the identities of backup site F and primary site C. According to the disaster recovery relationship list, operation and maintenance personnel configure backup site F as a primary site to take over the work of primary site C and provide normal service functions. Correspondingly, they restore primary site C and configure it as a backup site to work in place of backup site F, which completes the service recovery. After the identity exchange, the local site F, now a primary site, only needs to receive the connection request from the target site C and send site C a response allowing the connection. Each site listens to the messages of its own message middleware and acts according to its identity: a primary site produces the data to be backed up according to the backup policy, and a backup site obtains the backup data and restores it according to the recovery policy.
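The manual identity exchange in this example amounts to swapping two role entries. A minimal sketch, assuming site roles are kept in a plain dictionary; `swap_roles` and the role strings are hypothetical names.

```python
def swap_roles(roles, backup_name, primary_name):
    """Return a new role table in which the two named sites have exchanged identities."""
    if roles.get(backup_name) != "backup" or roles.get(primary_name) != "primary":
        raise ValueError("swap needs one backup site and one primary site")
    updated = dict(roles)  # leave the caller's table untouched
    updated[backup_name] = "primary"
    updated[primary_name] = "backup"
    return updated


# The 3+3 architecture of the example: F is promoted and C becomes a backup site.
roles = {"A": "primary", "B": "primary", "C": "primary",
         "D": "backup", "E": "backup", "F": "backup"}
after_swap = swap_roles(roles, backup_name="F", primary_name="C")
```

After the swap, site C in its new backup role would be the one to send the connection request, and site F would only answer it.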
In this embodiment, which covers identity exchange, the local site, now a primary site, only needs to respond to the connection request of the target site, which has become a backup site. Although the identities have been exchanged, it is in essence still the backup site that actively establishes the disaster recovery relationship with a primary site that is assigned to it and has not yet established a disaster recovery relationship with it, and the primary site does not need to initiate the request. This avoids the interruption of existing services that occurs when a primary site, expanding services while running, actively establishes the disaster recovery relationship, and improves the flexibility of service expansion.
The third embodiment of the present invention relates to a disaster tolerance method. In this embodiment, after detecting that the local site and the target site have exchanged identities, the local site also places the data it backed up for the target site into its storage area that is accessible to backup sites, so that the target site can pull the data from that storage area. This describes how the local site works after the identity exchange. The implementation details of the disaster tolerance method of this embodiment are described below. They are provided only to aid understanding and are not required to implement this solution. The specific flow is shown in FIG. 7 and includes:
Step 301: when the identity of the local site is determined to be backup site, detect whether a target site meeting a preset condition exists; if so, proceed to step 302; otherwise, the flow ends.
Step 302: establish a disaster recovery relationship with the target site.
Step 303: back up, for the target site, the data to be backed up in the target site.
Steps 301 to 303 are similar to steps 101 to 103 of the first embodiment, respectively, and are not repeated here.
Step 304: detect whether the local site has exchanged identities with the target site; if so, proceed to step 305; otherwise, the flow ends. This step is similar to step 204 and is not repeated here.
It should be noted that, in this embodiment, the identity-exchange detection of step 304 is performed after the backup of step 303. This is only one possible arrangement: in actual implementation, step 304 may be performed at the same time as step 303 or after it, and all such cases fall within the scope of protection.
Step 305: Place the data that the local site has backed up for the target site into the local site's storage area that is accessible to backup sites, so that the target site can pull it from that storage area.
Step 306: Receive a connection request from the target site and send the target site a response allowing the connection. This step is similar to step 205 and is not described again here.
It should be noted that, in this embodiment, step 305 (placing the data backed up for the target site into the storage area accessible to backup sites) is performed before step 306 (receiving the connection request from the target site and sending the response allowing the connection). This is only one possible arrangement: in actual implementation, step 305 may be performed at the same time as step 306 or after it, and all such cases fall within the scope of protection.
In this embodiment, after detecting that the local site and the target site have exchanged identities, the local site places the data it has backed up for the target site into its storage area that is accessible to backup sites. The target site, whose identity has changed to backup site, can then pull the data itself, and the local site, whose identity has changed to primary site, does not need to send it actively. This describes how the local site operates after the identity exchange.
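As a non-limiting illustration, the staging of step 305 might be sketched in Python as follows. The function names `stage_for_pull` and `pull`, and the use of dictionaries as storage areas, are assumptions made only for this sketch and are not part of the embodiments.

```python
def stage_for_pull(backups, shared_area, target_id):
    """Copy the data backed up for target_id into the area backup sites can read.

    backups:     hypothetical private store, {site_id: {key: payload}}
    shared_area: hypothetical store readable by backup sites, same shape
    """
    # Copy rather than move, so the local site keeps its own backup intact.
    shared_area[target_id] = dict(backups.get(target_id, {}))
    return shared_area


def pull(shared_area, requester_id):
    """What the target site (now a backup site) would read from the shared area."""
    return shared_area.get(requester_id, {})


# Hypothetical data: the local site holds backups for sites A and B,
# and has just exchanged identities with site A.
backups = {"A": {"cfg": "v1"}, "B": {"cfg": "v9"}}
shared = stage_for_pull(backups, {}, "A")
print(pull(shared, "A"))
```

Only the data of the site involved in the exchange is exposed in this sketch; the local site does no sending, which matches the pull-oriented behavior described above.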
The fourth embodiment of the present invention relates to a disaster tolerance method. Compared with the second embodiment, in this embodiment, after detecting that the local site and the target site have exchanged identities, the local site clears or dumps the data stored in it that was backed up for sites other than the target site. That is, the backup data of the primary sites in the local site's other disaster recovery relationships is stored separately from the backup data of the target site involved in the identity exchange, so that the target site can pull its data directly after the exchange, which simplifies the pull process. Implementation details of the disaster tolerance method of this embodiment are described below; they are provided only to aid understanding and are not required to implement the solution. The flow is shown in Figure 8 and includes:
Step 401: When the identity of the local site is determined to be a backup site, detect whether there is a target site that meets a preset condition; if so, proceed to step 402; otherwise, the flow ends.
Step 402: Establish a disaster recovery relationship with the target site.
Step 403: Back up, for the target site, the data to be backed up in the target site.
Steps 401-403 are similar to steps 101-103 of the first embodiment, respectively, and are not described again here.
Step 404: Detect whether the local site and the target site have exchanged identities; if so, proceed to step 405; otherwise, the flow ends. This step is similar to step 204 and is not described again here.
It should be noted that, in this embodiment, the identity-exchange detection of step 404 is performed after the backup of step 403. This is only one possible arrangement: in actual implementation, step 404 may be performed at the same time as step 403 or after it, and all such cases fall within the scope of protection.
Step 405: Clear or dump the data stored in the local site that was backed up for sites other than the target site.
Step 406: Receive a connection request from the target site and send the target site a response allowing the connection. This step is similar to step 205 and is not described again here.
It should be noted that, in this embodiment, step 405 (clearing or dumping the data backed up for sites other than the target site) is performed before step 406 (receiving the connection request from the target site and sending the response allowing the connection). This is only one possible arrangement: in actual implementation, step 405 may be performed at the same time as step 406 or after it, and all such cases fall within the scope of protection.
In this embodiment, after detecting that the local site and the target site have exchanged identities, the local site clears or dumps the backup data stored in it other than the backup data of the target site. That is, the backup data of the primary sites in the local site's other disaster recovery relationships is stored separately from the backup data of the target site involved in the identity exchange, so that the target site can pull its data directly after the exchange, which simplifies the pull process.
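As a non-limiting illustration, the clearing or dumping of step 405 might be sketched in Python as follows. The function name `isolate_target_backups`, the optional `archive` argument, and the dictionary layout are assumptions made only for this sketch.

```python
def isolate_target_backups(backups, target_id, archive=None):
    """Remove the backups of every site except target_id.

    If archive is given, the removed data is dumped there ("dump");
    otherwise it is simply discarded ("clear").
    """
    for site_id in list(backups):          # list() so we can pop while looping
        if site_id == target_id:
            continue
        data = backups.pop(site_id)
        if archive is not None:
            archive[site_id] = data
    return backups


# Hypothetical data: the local site backed up sites A, B and C and has
# exchanged identities with site A.
store = {"A": {"k": 1}, "B": {"k": 2}, "C": {"k": 3}}
dumped = {}
isolate_target_backups(store, "A", archive=dumped)
print(store, dumped)
```

After the call only the target site's data remains in the main store, so the target site can pull without filtering out data that belongs to other relationships.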
The fifth embodiment of the present invention relates to a disaster tolerance method. In this embodiment, after detecting that the local site and the target site have exchanged identities, the local site not only clears or dumps the data stored in it that was backed up for sites other than the target site (that is, stores the backup data of the primary sites in its other disaster recovery relationships separately from the backup data of the target site involved in the identity exchange), but also places the data it has backed up for the target site into its storage area that is accessible to backup sites, so that the target site can pull the data from that storage area. Implementation details of the disaster tolerance method of this embodiment are described below; they are provided only to aid understanding and are not required to implement the solution. The flow is shown in Figure 9 and includes:
Step 501: When the identity of the local site is determined to be a backup site, detect whether there is a target site that meets a preset condition; if so, proceed to step 502; otherwise, the flow ends.
Step 502: Establish a disaster recovery relationship with the target site.
Step 503: Back up, for the target site, the data to be backed up in the target site.
Steps 501-503 are similar to steps 101-103 of the first embodiment, respectively, and are not described again here.
Step 504: Detect whether the local site and the target site have exchanged identities; if so, proceed to step 505; otherwise, the flow ends. This step is similar to step 204 and is not described again here.
It should be noted that, in this embodiment, the identity-exchange detection of step 504 is performed after the backup of step 503. This is only one possible arrangement: in actual implementation, step 504 may be performed at the same time as step 503 or after it, and all such cases fall within the scope of protection.
Step 505: Place the data that the local site has backed up for the target site into the local site's storage area that is accessible to backup sites, so that the target site can pull it from that storage area.
Step 506: Clear or dump the data stored in the local site that was backed up for sites other than the target site. This step is similar to step 405 and is not described again here.
Step 507: Receive a connection request from the target site and send the target site a response allowing the connection. This step is similar to step 205 and is not described again here.
It should be noted that the execution order of steps 505, 506, and 507 in this embodiment may be changed according to the actual situation, and all such variations also fall within the scope of protection.
In this embodiment, after detecting that the local site and the target site have exchanged identities, the local site not only clears or dumps the data stored in it that was backed up for sites other than the target site, but also places the data it has backed up for the target site into its storage area that is accessible to backup sites, so that the target site can pull the data from that storage area. This describes how the local site and the target site operate after the identity exchange.
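As a non-limiting illustration of why the order of steps 505, 506, and 507 may vary, the following Python sketch models the three steps as independent operations on a toy state and runs them in every order. The state layout, the site names, and the function `run_swap_steps` are assumptions made only for this sketch; a real system could impose ordering constraints that this toy model does not capture.

```python
from itertools import permutations


def run_swap_steps(order):
    """Run steps 505, 506 and 507 in the given order on a fresh toy state."""
    state = {
        "backups": {"A": {"k": 1}, "B": {"k": 2}},  # data held for sites A and B
        "shared": {},        # area readable by backup sites
        "archive": {},       # where dumped data goes
        "accepted": False,   # whether the target's connection was accepted
    }
    target = "A"             # hypothetical site that exchanged identities

    def step_505():
        # Stage the target site's data for pulling.
        state["shared"][target] = dict(state["backups"][target])

    def step_506():
        # Dump the data backed up for every other site.
        for sid in [s for s in state["backups"] if s != target]:
            state["archive"][sid] = state["backups"].pop(sid)

    def step_507():
        # Answer the target site's connection request.
        state["accepted"] = True

    steps = {505: step_505, 506: step_506, 507: step_507}
    for number in order:
        steps[number]()
    return state


results = [run_swap_steps(p) for p in permutations((505, 506, 507))]
print(all(r == results[0] for r in results))
```

In this toy model each step touches a different part of the state, so all six orderings end in the same final state.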
The division of the above methods into steps is only for clarity of description. In implementation, steps may be merged into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, such variations fall within the scope of protection of this patent. Adding insignificant modifications to an algorithm or flow, or introducing insignificant design elements, without changing the core design of the algorithm or flow, also falls within the scope of protection of this patent.
The fifth embodiment of the present invention relates to a disaster tolerance device. The device includes an integrated service module 601, a message middleware 602, and applications each embedded with a data backup module 603, where the number of applications is n and n is a natural number greater than zero. The structure is shown in Figure 10 and includes:
The integrated service module 601 is configured to detect, when the local site is determined to have the backup-site identity, whether there is a target site that meets a preset condition, the preset condition being that the target site has the primary-site identity, has been assigned to the local site, and has not yet established a disaster recovery relationship with the local site; it is further configured to establish the disaster recovery relationship with the target site that meets the preset condition.
Specifically, operation and maintenance personnel can set the identity of a site through simple configuration in the interface provided by the integrated service module 601, for example by setting a given site as a primary site.
The message middleware 602 is configured to store the identity information of the local site received from the integrated service module 601.
The data backup module 603 is configured to listen for the identity information of the local site in the message middleware and, after the local site has established the disaster recovery relationship with the target site, to back up, for the target site and according to the identity information of the local site, the data to be backed up in the target site.
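As a non-limiting illustration, the interaction between the message middleware 602 and the data backup module 603 might be sketched in Python as a simple publish/subscribe arrangement. The class and method names (`MessageMiddleware`, `publish_identity`, `DataBackupModule`, `should_back_up`) and the identity strings are assumptions made only for this sketch; the embodiments do not prescribe a particular middleware or API.

```python
class MessageMiddleware:
    """Toy stand-in for message middleware 602: stores and relays the identity."""

    def __init__(self):
        self.identity = None
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def publish_identity(self, identity):
        # Called by the integrated service module when the identity is set.
        self.identity = identity
        for callback in self._listeners:
            callback(identity)


class DataBackupModule:
    """Toy stand-in for data backup module 603, embedded in one application."""

    def __init__(self, middleware):
        self.identity = None
        middleware.subscribe(self._on_identity)

    def _on_identity(self, identity):
        self.identity = identity

    def should_back_up(self, relationship_established):
        # Only a backup site with an established relationship pulls data.
        return self.identity == "backup" and relationship_established


middleware = MessageMiddleware()
modules = [DataBackupModule(middleware) for _ in range(3)]  # n = 3 applications
middleware.publish_identity("backup")
print([m.should_back_up(True) for m in modules])
```

Routing the identity through the middleware lets every application's backup module learn of an identity change without being coupled directly to the integrated service module.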
In a specific example, the initiator of the disaster recovery relationship is generally the integrated service module 601 of the backup site. The integrated service module 601 of the backup site sends a connection request to the target site and, after receiving a response from the target site allowing the connection, confirms that the disaster recovery relationship has been established. The connection request may be a heartbeat message, although this is not limiting. Assume that the integrated service module 601 establishes and maintains the disaster recovery relationship between primary and backup sites through heartbeat messages. Figure 11 is a schematic diagram of a 2+2 disaster tolerance architecture establishing disaster recovery relationships: the integrated service modules 601 in backup sites D and E of the backup domain send heartbeat messages to primary sites A and B of the primary domain, respectively.
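As a non-limiting illustration, the backup-initiated handshake might be sketched in Python as follows. The classes `PrimarySite` and `BackupSite`, the `allowed_backups` set, and the message dictionaries are assumptions made only for this sketch; in particular, the rule the primary site uses to decide whether to allow a connection is not specified in the embodiments.

```python
class PrimarySite:
    """Toy primary site: it never initiates, it only answers requests."""

    def __init__(self, name, allowed_backups):
        self.name = name
        self.allowed_backups = set(allowed_backups)  # hypothetical admission rule

    def on_heartbeat(self, sender):
        return {"from": self.name, "allow": sender in self.allowed_backups}


class BackupSite:
    """Toy backup site: it actively requests the relationship."""

    def __init__(self, name):
        self.name = name
        self.relationships = set()

    def establish(self, primary):
        reply = primary.on_heartbeat(self.name)   # connection request
        if reply["allow"]:                        # response allowing connection
            self.relationships.add(primary.name)
        return reply["allow"]


# 2+2 layout: D backs up A, E backs up B.
site_a, site_b = PrimarySite("A", ["D"]), PrimarySite("B", ["E"])
site_d, site_e = BackupSite("D"), BackupSite("E")
print(site_d.establish(site_a), site_e.establish(site_b))
```

Because the primary site only answers, adding a new backup site in this sketch requires no change to the primary site's running flow.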
In a specific example, the integrated service module 601 scans a pre-stored primary site list, which includes all primary sites assigned to the local site. It then checks the current state of each primary site in the list and determines each primary site that has not yet established a disaster recovery relationship with the local site to be a target site that meets the preset condition.
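As a non-limiting illustration, the scan of the primary site list might be sketched in Python as follows. The `Site` record, its field names, and the function `find_target_sites` are assumptions made only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    role: str                      # "primary" or "backup"
    dr_established: bool = False   # relationship with the local site exists?


def find_target_sites(local_role, assigned_sites):
    """Return assigned primary sites that still lack a disaster recovery relationship."""
    if local_role != "backup":
        return []                  # only a backup site looks for targets
    return [
        site for site in assigned_sites
        if site.role == "primary" and not site.dr_established
    ]


assigned = [
    Site("A", "primary", dr_established=True),
    Site("B", "primary"),
    Site("C", "backup"),
]
print([site.name for site in find_target_sites("backup", assigned)])
```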
In a specific example, when backing up, for the target site, the data to be backed up in the target site, the data backup module 603 pulls the data to be backed up from the target site's storage area that is accessible to backup sites.
In a specific example, the data to be backed up is bound with the identity information of the target site; when backing up the data for the target site, the data backup module 603 stores the data, according to that identity information, in the storage area corresponding to the target site. Figure 12 is a schematic diagram of a 2+2 disaster tolerance architecture backing up data: the data backup modules 603 in backup sites D and E of the backup domain send backup requests to primary sites A and B of the primary domain, respectively, and then pull the data to be backed up from the storage media of primary sites A and B, respectively.
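As a non-limiting illustration, storing pulled data according to the bound identity information might be sketched in Python as follows. The class `BackupStore`, the record fields `site_id`, `key`, and `payload`, and the in-memory dictionaries are assumptions made only for this sketch.

```python
from collections import defaultdict


class BackupStore:
    """Toy store that keeps one storage area per primary site."""

    def __init__(self):
        self._areas = defaultdict(dict)   # site_id -> {key: payload}

    def store(self, record):
        # The identity bound to the record selects the storage area.
        self._areas[record["site_id"]][record["key"]] = record["payload"]

    def area(self, site_id):
        # Return a copy so callers cannot modify the stored data.
        return dict(self._areas.get(site_id, {}))


store = BackupStore()
store.store({"site_id": "A", "key": "users", "payload": "snapshot-1"})
store.store({"site_id": "B", "key": "users", "payload": "snapshot-7"})
print(store.area("A"), store.area("B"))
```

Keeping one area per site means records with the same key from different primary sites never overwrite each other.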
In a specific example, after the local site has established the disaster recovery relationship with the target site, if it is detected that the local site and the target site have exchanged identities, the integrated service module 601 sends the target site a response allowing the connection upon receiving a connection request from the target site.
In a specific example, after it is detected that the local site and the target site have exchanged identities, the data backup module 603 places the data that the local site has backed up for the target site into the local site's storage area that is accessible to backup sites, so that the target site can pull it from that storage area.
In a specific example, after it is detected that the local site and the target site have exchanged identities, the data backup module 603 clears or dumps the data stored in the local site that was backed up for sites other than the target site.
In a specific example, when backing up, for the target site, the data to be backed up in the target site, each application of the local site backs up data for the identical application in the target site.
It is easy to see that this embodiment is the system embodiment corresponding to the first embodiment, and the two can be implemented in cooperation with each other. The technical details mentioned in the first embodiment remain valid in this embodiment, and the technical effects achievable in the first embodiment can also be achieved here; to reduce repetition, they are not described again. Correspondingly, the technical details mentioned in this embodiment can also be applied to the first embodiment.
It is worth mentioning that the modules involved in this embodiment are all logical modules. In practical applications, a logical unit may be a physical unit, part of a physical unit, or a combination of multiple physical units. In addition, to highlight the innovative part of the present invention, units that are not closely related to solving the technical problem addressed by the present invention are not introduced in this embodiment; this does not mean that no other units exist in this embodiment.
The sixth embodiment of the present invention relates to a site which, as shown in Figure 13, includes: at least one processor 701; and a memory 702 communicatively connected to the at least one processor. The memory 702 stores instructions executable by the at least one processor 701, and the instructions are executed by the at least one processor 701 so that the at least one processor 701 can perform the above disaster tolerance method.
The memory 702 and the processor 701 are connected by a bus. The bus may include any number of interconnected buses and bridges and links together the various circuits of the one or more processors 701 and the memory 702. The bus may also connect various other circuits, such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and are therefore not described further here. A bus interface provides an interface between the bus and a transceiver. The transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, and provides a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor 701 is transmitted over a wireless medium through an antenna; the antenna also receives data and passes it to the processor 701.
The processor 701 is responsible for managing the bus and general processing, and may also provide various functions, including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory 702 may be used to store data used by the processor 701 when performing operations.
The seventh embodiment of the present invention relates to a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the above method embodiments.
That is, those skilled in the art will understand that all or some of the steps of the methods of the above embodiments can be carried out by a program instructing the relevant hardware. The program is stored in a storage medium and includes a number of instructions that cause a site (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of the present application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Those of ordinary skill in the art will understand that the above embodiments are specific examples of implementing the present invention, and that in practical applications various changes in form and detail may be made to them without departing from the spirit and scope of the present invention.

Claims (12)

  1. A disaster tolerance method, comprising:
    if the identity of a local site is determined to be a backup site, detecting whether there is a target site that meets a preset condition, the preset condition comprising: the identity of the target site is a primary site, and the target site has been assigned to the local site and has not yet established a disaster recovery relationship with the local site;
    if the target site that meets the preset condition exists, establishing the disaster recovery relationship with the target site; and
    backing up, for the target site, data to be backed up in the target site.
  2. The disaster tolerance method according to claim 1, wherein establishing the disaster recovery relationship with the target site comprises:
    sending a connection request to the target site and, after receiving a response from the target site allowing the connection, confirming that the disaster recovery relationship has been established.
  3. The disaster tolerance method according to claim 1, wherein detecting whether there is the target site that meets the preset condition comprises:
    scanning a pre-stored primary site list, the primary site list comprising all primary sites assigned to the local site; and
    detecting the current state of each primary site in the primary site list, and determining each primary site whose current state is that it has not yet established the disaster recovery relationship with the local site to be the target site that meets the preset condition.
  4. The disaster tolerance method according to claim 1, wherein backing up, for the target site, the data to be backed up in the target site comprises: pulling the data to be backed up from a storage area of the target site that is accessible to backup sites.
  5. The disaster tolerance method according to claim 1, wherein the data to be backed up is bound with identity information of the target site; and
    backing up, for the target site, the data to be backed up in the target site comprises:
    storing, according to the identity information, the data to be backed up in a storage area corresponding to the target site.
  6. The disaster tolerance method according to claim 1, further comprising, after establishing the disaster recovery relationship with the target site:
    if it is detected that the local site and the target site have exchanged identities, sending the target site a response allowing connection upon receiving a connection request from the target site.
  7. The disaster tolerance method according to claim 6, further comprising, after detecting that the local site and the target site have exchanged identities:
    placing the data that the local site has backed up for the target site into a storage area of the local site that is accessible to backup sites, so that the target site can pull the data from that storage area.
  8. The disaster tolerance method according to claim 6, further comprising, after detecting that the local site and the target site have exchanged identities:
    clearing or dumping the data stored in the local site that was backed up for sites other than the target site.
  9. The disaster tolerance method according to claim 1, wherein backing up, for the target site, the data to be backed up in the target site comprises:
    an application of the local site backing up data for the identical application in the target site.
  10. A disaster tolerance device, comprising: an integrated service module, a message middleware, and an application embedded with a data backup module;
    the integrated service module being configured to detect, when a local site is determined to have a backup-site identity, whether there is a target site that meets a preset condition, and to establish a disaster recovery relationship with the target site that meets the preset condition, the preset condition comprising: the target site has a primary-site identity, and the target site has been assigned to the local site and has not yet established a disaster recovery relationship with the local site;
    the message middleware being configured to store identity information of the local site received from the integrated service module; and
    the data backup module in the application being configured to listen for the identity information of the local site in the message middleware and, after the local site has established the disaster recovery relationship with the target site, to back up, for the target site and according to the identity information of the local site, data to be backed up in the target site.
  11. A site, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the disaster tolerance method according to any one of claims 1 to 9.
  12. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the disaster tolerance method according to any one of claims 1 to 9.
PCT/CN2020/115892 2019-09-18 2020-09-17 Disaster tolerant method and apparatus, site, and storage medium WO2021052416A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910881460.2 2019-09-18
CN201910881460.2A CN112527552A (en) 2019-09-18 2019-09-18 Disaster recovery method, device, local point and storage medium

Publications (1)

Publication Number Publication Date
WO2021052416A1 true WO2021052416A1 (en) 2021-03-25

Family

ID=74883360

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/115892 WO2021052416A1 (en) 2019-09-18 2020-09-17 Disaster tolerant method and apparatus, site, and storage medium

Country Status (2)

Country Link
CN (1) CN112527552A (en)
WO (1) WO2021052416A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113568779B (en) * 2021-06-25 2022-07-26 杭州雅观科技有限公司 Community data backup system based on routing equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102439903A (en) * 2011-05-31 2012-05-02 华为技术有限公司 Method, device and system for realizing disaster-tolerant backup
CN104717083A (en) * 2013-12-13 2015-06-17 中国移动通信集团上海有限公司 Disaster tolerant switching system, method and device for A-SBC equipment
CN106357787A (en) * 2016-09-30 2017-01-25 郑州云海信息技术有限公司 Storage disaster tolerant control system
JP2017142605A (en) * 2016-02-09 2017-08-17 株式会社日立製作所 Backup restoration system and restoration method
CN107222327A (en) * 2016-03-22 2017-09-29 中兴通讯股份有限公司 A kind of method and device based on cloud platform management server
CN109117305A (en) * 2018-07-24 2019-01-01 郑州市景安网络科技股份有限公司 A kind of data back up method, device, equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN112527552A (en) 2021-03-19

Similar Documents

Publication Publication Date Title
EP3694148A1 (en) Configuration modification method for storage cluster, storage cluster and computer system
CN109951331B (en) Method, device and computing cluster for sending information
EP1550036B1 (en) Method of solving a split-brain condition in a cluster computer system
US6971095B2 (en) Automatic firmware version upgrade system
JP2001306349A (en) Backup device and backup method
CN105025084A (en) A cloud storage system based on synchronization agents and mixed storage
CN107276839B (en) Self-monitoring method and system of cloud platform
CN111666336B (en) Method, system and electronic equipment for data intercommunication among block chains
CN115562911B (en) Virtual machine data backup method, device, system, electronic equipment and storage medium
CN113535391B (en) Distributed cluster state information management method and system of cross-domain big data platform
WO2021052416A1 (en) Disaster tolerant method and apparatus, site, and storage medium
JPH0314161A (en) Processor monitoring processing system
CN104794026B (en) A kind of failover method of cluster instance multi-data source binding
EP4060514A1 (en) Distributed database system and data disaster backup drilling method
JP3407016B2 (en) Network management system
US20050197718A1 (en) High reliability system, redundant construction control method, and program
US11637789B2 (en) Orchestrating apparatus, VNFM apparatus, managing method and program
CN114598711B (en) Data migration method, device, equipment and medium
JP4645435B2 (en) Information processing apparatus, communication load distribution method, and communication load distribution program
US11327679B2 (en) Method and system for bitmap-based synchronous replication
JP6856574B2 (en) Service continuation system and service continuation method
CN108279850B (en) Data resource storage method
CN112948177A (en) Disaster recovery backup method and device, electronic equipment and storage medium
JP6289214B2 (en) Information processing system and method
CN115238005B (en) Data synchronization method and system based on message middleware cluster

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20865069

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20865069

Country of ref document: EP

Kind code of ref document: A1