CN111209140B

CN111209140B - Method and device for recovering crash of main and standby dual-node databases

Info

Publication number: CN111209140B
Application number: CN201911391020.5A
Authority: CN
Inventors: 潘景基
Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
Current assignee: Suzhou Inspur Intelligent Technology Co Ltd
Priority date: 2019-12-30
Filing date: 2019-12-30
Publication date: 2023-01-06
Anticipated expiration: 2039-12-30
Also published as: CN111209140A

Abstract

The embodiment of the invention discloses a method and a device for recovering from a crash of a main and standby dual-node database. The method comprises: preprocessing the backup path configuration parameters of the database; separating the ics-manager service from the main/standby switching; and performing a database recovery operation according to the state of the database service mariadb and the condition of database integrity backup. Because the recovery operation is chosen separately for each combination of database service state and backup condition, data can be recovered while the system and the data disk are normal, which keeps the virtualization system continuously available to its users, ensures the integrity of the data of the HCI virtualization system, and improves the maintainability of the running system.

Description

Method and device for recovering crash of main and standby dual-node databases

Technical Field

The invention relates to the technical field of virtualization, in particular to a method and a device for recovering crash of a main and standby dual-node database.

Background

Cloud computing is a further innovation of the information age, following the internet and the computer. It is highly scalable and in strong demand, and can provide users with a brand new experience.

Virtualization is currently one of the fastest-developing parts of cloud computing. In response to this opportunity, Inspur has launched a hyper-converged all-in-one machine that deploys the InCloud Rail virtualization system, namely the HCI system. It is an enterprise-level server virtualization solution that converts a static and complex IT environment into a more dynamic and easily managed virtual data center by pooling, allocating, and managing the underlying physical resources. This improves the agility and flexibility of resource delivery and the efficiency of resource use, helps enterprises build a high-performance, scalable, manageable, and flexible server virtualization infrastructure, and provides high-quality virtual data center services.

For the Inspur InCloud Rail hyper-converged architecture system, namely the HCI system, a system exception may be triggered when users do not operate the system according to the instruction manual, or by a sudden abnormal condition, and the environment then crashes. The case addressed here is data recovery for the iCenter in an HCI active/standby dual-node environment while the system and the data disk are normal, that is, the case in which the database or the database files have been damaged for some special reason and need to be recovered.

Disclosure of Invention

The embodiment of the invention provides a method and a device for recovering from a crash of a main and standby dual-node database, which are used for solving the prior-art problem of recovering data when an HCI system in a main and standby dual-node environment crashes abnormally.

In order to solve the technical problem, the embodiment of the invention discloses the following technical scheme:

The first aspect of the present invention provides a method for recovering from a crash of a primary/standby dual-node database, the method including the following steps:

preprocessing the configuration parameters of the backup path of the database;

separating the ics-manager service from the main/standby switching;

and according to the state of the database service mariadb and the condition of database integrity backup, performing recovery operation on the database.

Further, the recovering operation of the database according to the state of the database service mariadb and the condition of the database integrity backup specifically comprises:

judging whether the state of the database service mariadb is normal or not;

if yes, judging whether the database has an integrity backup; if it has, directly starting the data recovery operation, and if it has not, performing the data recovery operation according to the current environment;

if not, the service has crashed, and the crashed service is investigated and analyzed.
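
The text does not state how the presence of an integrity backup is judged. As a minimal sketch only, one could assume that a backup counts as complete when it is a .sql.gz file that decompresses to its end without error and yields at least one byte. This criterion, and the helper names is_readable_gzip and find_complete_backups, are assumptions made for illustration and are not taken from the invention.

```python
import gzip
import zlib
from pathlib import Path
from typing import List


def is_readable_gzip(path: Path) -> bool:
    """Return True if the archive decompresses fully and is not empty.

    Assumed criterion for an "integrity backup"; the source gives none.
    """
    total = 0
    try:
        with gzip.open(path, "rb") as fh:
            while True:
                chunk = fh.read(1 << 20)  # read 1 MiB at a time
                if not chunk:
                    break
                total += len(chunk)
    except (OSError, EOFError, zlib.error):
        # Covers a missing file, a non-gzip file, a truncated stream, bad data.
        return False
    return total > 0


def find_complete_backups(backup_dir: Path) -> List[Path]:
    """List the .sql.gz files in backup_dir that pass the check above."""
    return sorted(p for p in backup_dir.glob("*.sql.gz") if is_readable_gzip(p))
```

An empty result from find_complete_backups would correspond to the branch in which data recovery is performed according to the current environment.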

Further, the directly enabling data recovery operation specifically includes:

deleting the database;

entering a database backup catalog, and decompressing backup database files;

importing the backup data into a database, and restarting the ics-manager service after the database is recovered to be normal;

adding the ics-manager service to the heartbeat cluster.

Further, deleting the database includes deleting the database name, deleting the database neutron, and deleting mysql.

Further, the specific process of performing troubleshooting analysis on the crash service is as follows:

backing up a database data directory and a database log;

acquiring configuration information through a database configuration file, and checking a service log;

and calling a problem solution library according to the service log, and performing problem pairing and recovery.
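
The problem solution library is not specified further. The following is a minimal sketch of one possible reading, in which the library is a list of log patterns paired with remedies and the newest matching log line selects the remedy. The two entries, the remedy wording, and the names KnownProblem, SOLUTION_LIBRARY, and match_problem are illustrative assumptions.

```python
import re
from typing import List, NamedTuple, Optional


class KnownProblem(NamedTuple):
    pattern: re.Pattern  # compiled regular expression searched in each log line
    remedy: str          # free-text description of the recovery action


# Illustrative entries only; a real library would be built from past incidents.
SOLUTION_LIBRARY: List[KnownProblem] = [
    KnownProblem(re.compile(r"No space left on device"),
                 "free disk space, then restart the service"),
    KnownProblem(re.compile(r"marked as crashed and should be repaired"),
                 "repair the affected table"),
]


def match_problem(log_text: str,
                  library: List[KnownProblem] = SOLUTION_LIBRARY
                  ) -> Optional[KnownProblem]:
    """Pair the service log with a known problem, newest log line first."""
    for line in reversed(log_text.splitlines()):
        for problem in library:
            if problem.pattern.search(line):
                return problem
    return None  # no known problem matched; manual analysis is needed
```

Scanning from the end of the log is a design choice of this sketch: the most recent error is taken to be the most likely cause of the crash.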

A second aspect of the present invention provides a device for recovering a crash of a primary/standby dual-node database, where the device includes:

the data preprocessing module is used for preprocessing the configuration parameters of the backup path of the database;

the service separation module is used for separating the ics-manager service from the main/standby switching;

and the data recovery module is used for performing recovery operation on the database according to the state of the database service mariadb and the condition of database integrity backup.

Further, the data recovery module comprises:

the state judgment unit is used for judging whether the state of the database service mariadb is normal or not;

the backup integrity judging unit is used for judging whether the database has an integrity backup;

the first data recovery unit is used for performing data recovery operation when the service state of the database is normal and the backup is complete;

the second data recovery unit is used for performing data recovery operation according to the current environment when the service state of the database is normal and the backup is incomplete;

and the analysis and investigation unit is used for carrying out investigation and analysis on the crash service when the service state of the database is abnormal.

Further, the analysis and investigation unit includes:

the data backup subunit is used for backing up the database data catalog and the database log;

the information acquisition subunit acquires the configuration information through the database configuration file and checks the service log;

and the data recovery subunit calls the problem solution library according to the service log, and performs problem pairing and recovery.

Further, the first data recovery unit includes:

the first data processing subunit is used for deleting the database;

the second data processing subunit enters a database backup catalog and decompresses backup database files;

the service recovery subunit is used for importing the backup data into the database and restarting the ics-manager service after the database is recovered to be normal;

and the service configuration subunit adds the ics-manager service to the heartbeat cluster.

The effects described in this summary are only the effects of the embodiments, not all of the effects of the invention. One of the above technical solutions has the following advantages or beneficial effects:

The recovery operation on the database is chosen according to the state of the database service and the condition of the database integrity backup. Data recovery is thus achieved while the system and the data disk are normal, so that the virtualization system remains continuously available to its users. The integrity of the data of the HCI virtualization system is ensured, and the maintainability of the running system is enhanced.

Drawings

In order to illustrate more clearly the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that those skilled in the art can obtain other drawings from these drawings without creative effort.

FIG. 1 is a schematic flow diagram of the process of the present invention;

FIG. 2 is a schematic flow diagram of an embodiment of the method of the present invention;

FIG. 3 is a schematic diagram of the structure of the device of the present invention.

Detailed Description

In order to clearly explain the technical features of the present invention, the following detailed description of the present invention is provided with reference to the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different features of the invention. To simplify the disclosure of the present invention, the components and arrangements of specific examples are described below. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and procedures are omitted so as to not unnecessarily limit the invention.

As shown in FIG. 1, the method for recovering crash of the active/standby dual-node database of the present invention includes the following steps:

S1, preprocessing configuration parameters of a database backup path;

S2, separating the ics-manager service from the main/standby switching;

and S3, performing recovery operation on the database according to the state of the database service mariadb and the condition of database integrity backup.

In step S1, the backup path configuration parameters of the iCenter system database are reserved; the path is /var/backup/up.
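
The text does not detail what the preprocessing consists of. The sketch below reads it as validating and normalizing the configured backup path before it is used, which is an interpretation rather than something the text states. The function name preprocess_backup_path and the rejection rules are hypothetical.

```python
from pathlib import PurePosixPath


def preprocess_backup_path(raw: str) -> PurePosixPath:
    """Normalize a configured backup path and reject unusable values.

    Assumed rules (not from the source): the path must be non-empty,
    absolute, and free of ".." components.
    """
    cleaned = raw.strip()
    if not cleaned:
        raise ValueError("backup path is empty")
    path = PurePosixPath(cleaned)  # also drops a trailing slash
    if not path.is_absolute():
        raise ValueError(f"backup path must be absolute: {cleaned!r}")
    if ".." in path.parts:
        raise ValueError(f"backup path must not contain '..': {cleaned!r}")
    return path
```

PurePosixPath is used so that the check behaves the same on any platform and never touches the file system.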

In step S2, the command heartbeat-disable ics-manager is executed, which separates the ics-manager service from the main/standby switching.
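
Separating the service before recovery and adding it back afterwards can be sketched as a context manager. The command spellings heartbeat-disable and heartbeat-enable are read from the text and are assumptions about the exact command names; the run callable and the name detached_from_failover are hypothetical. In this sketch the service is added back only when the recovery body finishes without an exception, on the assumption that a failed recovery should leave the service detached for the operator.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List

Runner = Callable[[List[str]], None]  # executes one command given as argv


@contextmanager
def detached_from_failover(run: Runner,
                           service: str = "ics-manager") -> Iterator[None]:
    """Detach the service from main/standby switching around a recovery."""
    run(["heartbeat-disable", service])  # assumed command spelling
    yield  # an exception raised in the body propagates from here
    # Reached only if the body succeeded; a failure leaves the service detached.
    run(["heartbeat-enable", service])   # assumed command spelling
```

In practice run could wrap subprocess.run with check=True; passing it in keeps the sketch free of side effects.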

As shown in FIG. 2, step S3 is implemented as follows. The command systemctl status mariadb is executed, and whether the state of the database service mariadb is normal is judged from the result: active (running) indicates that the mariadb service is normal; anything else indicates that it is abnormal. If the service state is normal, the /var/backup directory is checked to judge whether the database has an integrity backup; if so, the data recovery operation is started directly, and if not, the data recovery operation is performed according to the current environment. If the service state is abnormal, the service has crashed, and the crashed service is investigated and analyzed.
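
The branching of step S3 can be sketched as a pure function over the status output and the backup check result. The enum and function names are hypothetical; the only rule taken from the text is that active (running) in the status output means the service is normal.

```python
from enum import Enum


class RecoveryAction(Enum):
    RESTORE_FROM_BACKUP = 1       # service normal, integrity backup present
    RECOVER_FROM_ENVIRONMENT = 2  # service normal, no integrity backup
    TROUBLESHOOT_SERVICE = 3      # service abnormal (crashed)


def service_is_normal(status_output: str) -> bool:
    """Judge the service state from the text printed by the status command."""
    return "active (running)" in status_output


def choose_action(status_output: str, has_integrity_backup: bool) -> RecoveryAction:
    """Select the branch of step S3 for the observed state."""
    if not service_is_normal(status_output):
        return RecoveryAction.TROUBLESHOOT_SERVICE
    if has_integrity_backup:
        return RecoveryAction.RESTORE_FROM_BACKUP
    return RecoveryAction.RECOVER_FROM_ENVIRONMENT
```

Note that the backup condition is not consulted when the service is abnormal, which matches the order of the two judgments in the text.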

The data recovery operation is started directly as follows. The databases are deleted by executing, respectively, the deletion of the database name (drop database), the deletion of the database neutron (drop database neutron), and the deletion of mysql. The /var/backup database backup directory is entered, and the backup database file is decompressed with the gunzip command gunzip xxx.sql.gz. The backup data is imported into the database by executing mysql -uroot -ppassword mysql < xxx.sql; after the database has returned to normal, the ics-manager service is restarted by executing systemctl restart ics-manager. Finally, the ics-manager service is added back to the heartbeat cluster by executing heartbeat-enable ics-manager.
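
The sequence above can be written down as an ordered plan of commands before anything is executed, which makes it easy to review. This is a sketch under several assumptions: the user is root, credentials come from the client option file rather than from a password on the command line, the import omits a default database on the assumption that the dump creates and selects its own databases, and the heartbeat command spelling is as read from the text. The names Step and build_restore_plan are hypothetical.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Step:
    argv: List[str]                   # command and its arguments
    stdin_file: Optional[str] = None  # file to redirect to stdin, if any


def build_restore_plan(backup_gz: str, databases: Sequence[str]) -> List[Step]:
    """Return the ordered commands for a restore from a gzip-compressed dump."""
    archive = PurePosixPath(backup_gz)
    if archive.suffix != ".gz":
        raise ValueError("expected a .gz backup archive")
    dump = archive.with_suffix("")  # gunzip replaces xxx.sql.gz with xxx.sql
    steps: List[Step] = []
    for name in databases:
        if "`" in name:
            raise ValueError(f"unsafe database name: {name!r}")
        # 1. Delete each database that the dump will recreate.
        steps.append(Step(["mysql", "-uroot", "-e", f"DROP DATABASE `{name}`"]))
    # 2. Decompress the backup in place.
    steps.append(Step(["gunzip", str(archive)]))
    # 3. Import the dump (equivalent to: mysql -uroot < xxx.sql).
    steps.append(Step(["mysql", "-uroot"], stdin_file=str(dump)))
    # 4. Restart the management service once the database is back.
    steps.append(Step(["systemctl", "restart", "ics-manager"]))
    # 5. Add the service back to the heartbeat cluster (assumed spelling).
    steps.append(Step(["heartbeat-enable", "ics-manager"]))
    return steps
```

A caller would execute the steps in order and stop at the first failure, so that the service is not restarted against a half-imported database.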

The crashed service is investigated and analyzed as follows: the database data directory (datadir=/var/mysql), the binary log (log-bin=/var/mysql/xxx.log), and the database logs are backed up; configuration information is obtained from the database configuration file, and the service log /var/log/mariadb/mariadb.log is viewed; and a problem solution library is called according to the service log, and problem pairing and recovery are performed.
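
Obtaining datadir and log-bin from the database configuration file can be sketched with the standard configparser module. This assumes the settings sit in a [mysqld] section of an INI-style file; the function name read_mysqld_settings is hypothetical, and configparser is only an approximation of the server's own option-file parser.

```python
import configparser
from typing import Dict


def read_mysqld_settings(cnf_text: str) -> Dict[str, str]:
    """Extract the paths that must be backed up before troubleshooting."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # option files contain bare flags without a value
        interpolation=None,   # keep any '%' characters literal
        strict=False,         # tolerate repeated sections or options
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return {}
    section = parser["mysqld"]
    wanted = ("datadir", "log-bin")
    return {key: section[key] for key in wanted if section.get(key) is not None}
```

The returned paths identify what to copy aside before any repair is attempted, so that the original state can still be inspected later.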

As shown in FIG. 3, the recovery apparatus for a crash of a primary and standby dual-node database of the present invention includes a data preprocessing module 1, a service separation module 2, and a data recovery module 3. The data preprocessing module 1 preprocesses the configuration parameters of the database backup path; the service separation module 2 separates the ics-manager service from the main/standby switching; and the data recovery module 3 performs the database recovery operation according to the state of the database service mariadb and the condition of database integrity backup.
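
The three-module structure can be sketched as a small composition in which each module is a callable and the device invokes them in the order of steps S1 to S3. The class name RecoveryDevice and the field names are hypothetical; the sketch shows only the ordering, not the behaviour of any module.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RecoveryDevice:
    preprocess: Callable[[], None]      # data preprocessing module 1
    detach_service: Callable[[], None]  # service separation module 2
    recover: Callable[[], None]         # data recovery module 3

    def run(self) -> None:
        """Invoke the modules in the order of steps S1, S2, S3."""
        self.preprocess()
        self.detach_service()
        self.recover()
```

Keeping the modules as injected callables lets each one be replaced or tested on its own.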

The data recovery module 3 includes a state judgment unit 31, a backup integrity judging unit 32, a first data recovery unit 33, a second data recovery unit 34, and an analysis and investigation unit 35. The state judgment unit 31 is used for judging whether the state of the database service mariadb is normal; the backup integrity judging unit 32 is used for judging whether the database has an integrity backup; the first data recovery unit 33 is configured to perform a data recovery operation when the database service state is normal and the backup is complete; the second data recovery unit 34 is configured to perform a data recovery operation according to the current environment when the database service state is normal and the backup is incomplete; the analysis and investigation unit 35 is configured to investigate and analyze the crashed service when the database service state is abnormal.

The analysis and investigation unit 35 includes a data backup subunit 351, an information acquisition subunit 352, and a data recovery subunit 353. The data backup subunit 351 backs up the database data directory and the database log; the information acquisition subunit 352 obtains the configuration information through the database configuration file, and checks the service log; the data recovery subunit 353 calls the problem solution library according to the service log, and performs problem pairing and recovery.

The first data recovery unit 33 includes a first data processing subunit 331, a second data processing subunit 332, a service recovery subunit 333, and a service configuration subunit 334. The first data processing subunit 331 is configured to delete the database; the second data processing subunit 332 enters the database backup directory and decompresses the backup database file; the service recovery subunit 333 imports the backup data into the database, and restarts the ics-manager service after the database is recovered to be normal; the service configuration subunit 334 adds the ics-manager service to the heartbeat cluster.

The foregoing is only a preferred embodiment of the present invention. It will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principle of the present invention, and these are also intended to be included within the scope of the present invention.

Claims

1. A recovery method for crash of a main and standby dual-node database is characterized by comprising the following steps:

preprocessing the configuration parameters of the backup path of the database;

separating the ics-manager service from the main/standby switching;

according to the state of the database service mariadb and the condition of database integrity backup, performing recovery operation on the database;

the specific operation of restoring the database according to the state of the database service mariadb and the condition of database integrity backup is as follows:

judging whether the state of the database service mariadb is normal or not;

if not, the service is crashed, and the crash service is checked and analyzed.

2. The method according to claim 1, wherein the specific process of directly starting the data recovery operation is as follows:

deleting the database;

entering a database backup catalog, and decompressing backup database files;

adding the ics-manager service to the heartbeat cluster.

3. The method for recovering from crash of active/standby dual-node database according to claim 2, wherein said deleting database includes deleting database name, deleting database neutron, and deleting mysql.

4. The method for recovering crash of active/standby dual-node database according to claim 1, wherein the specific process of performing troubleshooting analysis on crash service is as follows:

backing up a database data directory and a database log;

5. A device for recovering crash of a main and standby dual-node database is characterized by comprising:

the data recovery module is used for performing recovery operation on the database according to the state of the database service mariadb and the condition of database integrity backup;

the data recovery module comprises:

6. The apparatus according to claim 5, wherein the analysis and investigation unit comprises:

7. The apparatus for recovering from a crash of an active/standby dual-node database according to claim 5, wherein said first data recovery unit comprises:

the first data processing subunit is used for deleting the database;