CN109471759A - Database failover method and apparatus based on a SAS dual-controller device
- Publication number: CN109471759A
- Application number: CN201811394323.8A
- Authority: CN (China)
- Prior art keywords: database, controller, sas, dual control, control equipment
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2089—Redundant storage control functionality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1415—Saving, restoring, recovering or retrying at system level
- G06F11/1438—Restarting or rejuvenating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/80—Database-specific techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a database failover method and apparatus based on a SAS dual-controller device, and relates to the field of high availability for distributed databases. The method comprises the following steps: heartbeat monitoring is performed on the databases on the SAS dual-controller device; when the controller or database link on one side is interrupted while the controller and database on the other side work normally, the database on the interrupted side is judged to be the failed database. If the failed side's controller has not gone down and only the database instance is down, an attempt is made to start the database instance on that controller. If the controller has gone down, failover is performed directly: the failed database instance is started on the normally working controller at the other end, and the storage pool holding the failed database is switched to the normally working controller. When a database on a SAS dual-controller device fails, the invention can switch the failed database quickly while avoiding data synchronization and saving storage and network resources.
Description
Technical field
The invention belongs to the field of high availability for Internet back-end distributed databases, and relates to a failover method for a database cluster on a SAS dual-controller device.
Background art
At present, in the field of high availability for Internet back-end distributed databases, failover is usually carried out with database mirror pairs. This approach wastes a large amount of storage space, and data synchronization takes a certain amount of time.
Summary of the invention
In view of the above problems, the object of the present invention is to provide a database failover method and apparatus based on a SAS (Serial Attached SCSI, i.e. serial attached small computer system interface) dual-controller device. When an Internet back-end distributed database cluster fails, the method can quickly locate the failed database node and quickly perform the failover operation. Failover under the present invention saves storage space and data synchronization time.
To achieve the above object, the present invention adopts the following technical solution: in the field of high availability for Internet back-end distributed databases, a SAS dual-controller device is used for database configuration management, so that when a database on the SAS dual-controller device fails, the failed database can be switched quickly while data synchronization is avoided and storage and network resources are saved.
The distributed database cluster to which the method applies includes at least one master database server (master) and at least one SAS dual-controller device containing two data nodes (segment hosts); each controller has its own independent CPU, memory and storage.
The method comprises the following steps:
Step 1: generate the corresponding configuration file, which is used to manage database instance failover.
Step 2: monitor all data nodes in the database cluster; when a data node is found to be inaccessible, determine whether the controller hosting the current database instance is working normally.
Step 3: if the controller hosting the current database instance is working normally, the current database state is judged to be "controller normal, database instance down", and the database instance is restarted on that controller.
Step 4: if the controller hosting the current database instance is working normally but a certain number of attempts to restart the database instance on that controller have failed, the current database state is judged to be "controller down, database instance down"; the storage pool of the failed controller is then switched to the normally working controller, and the database instance is started on the normally working controller.
Step 5: if the controller hosting the current database instance has failed, the current database state is judged to be "controller down, database instance down"; the storage pool of the failed controller is then switched to the normally working controller, and the database instance is started on the normally working controller.
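The five steps above amount to a small decision procedure. The following Python sketch is an illustration only, not the patented implementation: the names `State`, `classify` and `plan_actions`, and the action strings, are hypothetical and do not appear in the source.

```python
from enum import Enum
from typing import List


class State(Enum):
    HEALTHY = "controller normal, instance running"
    INSTANCE_DOWN = "controller normal, database instance down"
    CONTROLLER_DOWN = "controller down, database instance down"


def classify(node_reachable: bool, controller_ok: bool) -> State:
    """Steps 2, 3 and 5: map the two health checks to a database state."""
    if node_reachable:
        return State.HEALTHY
    return State.INSTANCE_DOWN if controller_ok else State.CONTROLLER_DOWN


def plan_actions(state: State, restart_succeeded: bool = False) -> List[str]:
    """Return the ordered actions for a state (hypothetical action names)."""
    failover = ["switch_storage_pool_to_peer", "start_instance_on_peer"]
    if state is State.HEALTHY:
        return []
    if state is State.INSTANCE_DOWN:
        # Step 3 is tried first; step 4 escalates to failover
        # when the local restart attempts have all failed.
        actions = ["restart_instance_locally"]
        if not restart_succeeded:
            actions += failover
        return actions
    # Step 5: the controller itself is down, so fail over directly.
    return list(failover)
```

Keeping classification separate from action planning mirrors the split between the judgment steps (2, 3, 5) and the switching actions they trigger.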
By adopting the above technical solution, the invention has the following advantages:
1. Data replication between the two controllers during normal operation is reduced, saving storage and network resources.
2. The resynchronization of the formerly failed equipment during recovery is eliminated, shortening recovery time.
Brief description of the drawings
Fig. 1 is a system architecture diagram for failover of a distributed database cluster based on a SAS dual-controller device;
Fig. 2 is a flowchart of failover of a distributed database cluster based on a SAS dual-controller device.
Detailed description of the embodiments
The present invention is described in detail below with reference to the accompanying drawings and embodiments.
As shown in Fig. 1, the structure of the database cluster on a SAS dual-controller device that implements the above method comprises a database cluster monitoring module and a database instance management module; the database instance management module comprises a database state judgment module, a controller state judgment module, a storage pool monitoring module and a database instance startup module.
The database cluster monitoring module monitors all segment data nodes of the database cluster; when it detects that a segment data node has failed, it sends the information of the failed segment data node to the database state judgment module.
The database state judgment module, after receiving the information of the failed segment data node, determines whether the current segment data node can be connected. If it can, the module notifies the monitoring module to continue monitoring; otherwise it judges the current segment data node to be in the state "controller normal, database instance down" and sends controller state information to the controller state judgment module.
The controller state judgment module, on receiving the controller state information, determines the state of the controller hosting the failed database instance. If the controller is normal and the database is down, it sends the database instance startup module a message to restart the database instance; if the controller is down, it sends the storage pool monitoring module a signal to monitor the storage pool state.
The storage pool monitoring module, on receiving the signal to monitor the storage pool state, determines whether the storage pool of the failed controller has been switched to the normally working controller. After the storage pool has been switched successfully, it switches the data directory of the failed database instance to the backup working directory, which is a soft link (symbolic link) to the original working directory. It then sends the database instance startup module the information required to start the database instance.
The database instance startup module, on receiving the parameters required to start a database instance, performs the operation of starting the database instance.
As shown in Fig. 1, the distributed database cluster in this embodiment of the invention includes at least one master database server and at least one SAS dual-controller device server containing two segment data nodes; in normal operation, each controller has its own independent CPU, memory and storage.
As shown in Fig. 2, the detailed flow of the method in this embodiment is as follows:
S0: according to the configuration, generate the corresponding configuration file; when a database instance fails over, the failover process follows the requirements of the configuration file.
The configuration file generated in S0 includes the ports and port offsets of the database instances on the two controllers of the same SAS dual-controller device, as well as the switching target of the storage pool for a dual-controller switch.
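To make the contents of such a configuration file concrete, here is a minimal Python sketch that builds and serializes one. The JSON layout, the key names, the controller labels "A" and "B", the pool names and all port values are assumptions for illustration; the source states only that the file holds the instance ports, the port offsets and the storage pool switching target.

```python
import json
from typing import Dict, List


def build_failover_config(ports_a: List[int], ports_b: List[int],
                          port_offset: int,
                          pool_targets: Dict[str, str]) -> dict:
    """Assemble a failover configuration (hypothetical layout)."""
    return {
        "controllers": {
            "A": {"instance_ports": list(ports_a)},
            "B": {"instance_ports": list(ports_b)},
        },
        # Added to an instance's port when it is started on the peer controller.
        "port_offset": port_offset,
        # For each storage pool, the controller it moves to on failover.
        "storage_pool_switch_target": dict(pool_targets),
    }


example_config = build_failover_config(
    ports_a=[40000, 40001],
    ports_b=[40000, 40001],
    port_offset=1000,
    pool_targets={"pool_a": "B", "pool_b": "A"},
)
example_text = json.dumps(example_config, indent=2, sort_keys=True)
```

Because both controllers may use the same instance ports in normal operation, the offset is what keeps a relocated instance from colliding with the instances already running on the surviving controller.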
S1: monitor all segment data nodes in the database cluster; when a segment data node is found to be inaccessible, go to S2.
The database monitoring in S1 works as follows: heartbeat detection is performed on all segment database instances using multiple threads; when a segment database instance cannot be connected, that database instance is judged to have failed. If neither the database instance nor the controller has failed, S1 is executed again.
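A multithreaded heartbeat check of this kind could be sketched as follows. This is an assumption-laden illustration: the source does not say how the heartbeat is implemented, so a plain TCP connection attempt stands in for it, and the names `is_connectable` and `find_unreachable` are hypothetical.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Segment = Tuple[str, int]  # (host, port) of one segment database instance


def is_connectable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Heartbeat probe: True if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_unreachable(segments: List[Segment],
                     probe: Callable[[str, int], bool] = is_connectable,
                     max_workers: int = 8) -> List[Segment]:
    """Probe all segments in parallel and return those that failed."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda s: probe(s[0], s[1]), segments))
    return [seg for seg, ok in zip(segments, results) if not ok]
```

Passing the probe in as a parameter keeps the threading logic testable without a live cluster; a real deployment would more likely issue a database-level query than a bare socket connection.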
S2: determine whether the controller hosting the current database instance is working normally. If so, the current database state is "controller normal, database instance down"; go to S3. If the controller hosting the current database instance has failed, the current database state is "controller down, database instance down"; go to S4.
S3: the controller hosting the failed database instance is working normally, so restart the database instance on that controller. If the start succeeds, go to S1; if the start still fails after a certain number of attempts, the state is judged to be "controller down, database instance down"; go to S4.
In S3, the number of attempts to restart the database instance is set by the configuration file, which avoids judging a database instance as failed merely because network communication is congested.
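The bounded restart in S3 can be sketched in Python as below. The function name `restart_with_retries`, the `start_instance` callback and the optional delay between attempts are hypothetical; the source specifies only that the number of attempts comes from the configuration file.

```python
import time
from typing import Callable


def restart_with_retries(start_instance: Callable[[], bool],
                         max_attempts: int,
                         delay_seconds: float = 0.0,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try to start the instance up to max_attempts times.

    Returns True on the first successful start, or False once every
    attempt has failed, at which point the caller proceeds to S4.
    """
    for attempt in range(1, max_attempts + 1):
        if start_instance():
            return True
        if attempt < max_attempts:
            sleep(delay_seconds)  # give a transient network stall time to clear
    return False
```

A return value of False corresponds to the escalation in S3: the state is treated as "controller down, database instance down" and the flow moves to the storage pool switch.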
S4: switch the storage pool of the failed controller to the normally working controller; go to S5.
In S4, the directory of the database instance must also be switched to the designated backup directory, so that the database instance can find its data through the backup directory and confusion with the data directories on the normal controller is avoided.
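Since the backup working directory is described as a soft link to the original working directory, the directory switch might look like the following Python sketch. The function name `link_backup_directory` and the policy of replacing a stale link are assumptions; the source does not describe how the link is created.

```python
import os


def link_backup_directory(original_dir: str, backup_dir: str) -> str:
    """Make backup_dir a symbolic link that points at original_dir."""
    if os.path.islink(backup_dir):
        os.unlink(backup_dir)  # replace a stale link from an earlier failover
    elif os.path.exists(backup_dir):
        raise FileExistsError(f"{backup_dir} exists and is not a symbolic link")
    os.symlink(original_dir, backup_dir, target_is_directory=True)
    return backup_dir
```

The relocated instance is then started against the backup path, so its files stay distinct from the data directories of the instances that already live on the surviving controller.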
S5: start the database instance on the normally working controller; the flow ends.
In S5, to avoid conflicts with the port numbers of the database instances on the normal controller, a standby port number for the failed database instance must be configured in the configuration file, and the database instance is started on that standby port in S5.
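One plausible way to derive the standby port is to add the configured port offset to the instance's original port, as in the Python sketch below. This relationship is an assumption: the source says the configuration holds ports, port offsets and a standby port number, but not how they relate. The name `standby_port` is hypothetical.

```python
from typing import Set


def standby_port(primary_port: int, port_offset: int,
                 ports_in_use: Set[int]) -> int:
    """Return primary_port + port_offset, checking range and collisions."""
    candidate = primary_port + port_offset
    if not 1 <= candidate <= 65535:
        raise ValueError(f"standby port {candidate} is outside 1..65535")
    if candidate in ports_in_use:
        raise ValueError(f"standby port {candidate} is already in use")
    return candidate
```

For example, with an original port of 40000, an offset of 1000 and ports 40000 and 40001 already taken on the surviving controller, the relocated instance would start on port 41000.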
If the normally working controller fails while the storage pool monitoring module and the database instance startup module are working, the work ends immediately.
The present invention is not limited to the above embodiments. Those skilled in the art may make several improvements and modifications without departing from the principle of the invention, and these improvements and modifications are also regarded as falling within the scope of protection of the invention. Content not described in detail in this specification belongs to the prior art well known to those skilled in the art.
Claims (10)
1. A database failover method and apparatus based on a SAS dual-controller device, the method applying to a distributed database cluster with at least one database server and at least one SAS dual-controller device containing two data nodes, each controller having its own independent CPU, memory and storage;
characterized in that the database failover method based on a SAS dual-controller device comprises the following steps:
Step 1: generate the corresponding configuration file, which is used to manage database instance failover;
Step 2: monitor all data nodes in the database cluster; when a data node is found to be inaccessible, determine whether the controller hosting the current database instance is working normally;
Step 3: if the controller hosting the current database instance is working normally, the current database state is judged to be "controller normal, database instance down", and the database instance is restarted on that controller;
Step 4: if the controller hosting the current database instance is working normally but a certain number of attempts to restart the database instance on that controller have failed, the current database state is judged to be "controller down, database instance down"; the storage pool of the failed controller is then switched to the normally working controller, and the database instance is started on the normally working controller;
Step 5: if the controller hosting the current database instance has failed, the current database state is judged to be "controller down, database instance down"; the storage pool of the failed controller is then switched to the normally working controller, and the database instance is started on the normally working controller.
2. The database failover method based on a SAS dual-controller device according to claim 1, characterized in that: in step 5, the directory of the database instance must also be switched to the designated backup directory, so that the database instance can find its data through the backup directory and confusion with the data directories on the normal controller is avoided.
3. The database failover method based on a SAS dual-controller device according to claim 1, characterized in that: in step 5, to avoid conflicts with the port numbers of the database instances on the normal controller, a standby port number for the failed database instance must be configured in the configuration file, and the database instance is started on the standby port.
4. The database failover method based on a SAS dual-controller device according to claim 1, characterized in that: in step 4, the number of attempts to restart the database instance is set by the configuration file, which avoids judging a database instance as failed merely because network communication is congested.
5. The database failover method based on a SAS dual-controller device according to claim 1, characterized in that: the database monitoring in step 2 works as follows: heartbeat detection is performed on all database instances; when a database instance cannot be connected, that database instance is judged to have failed.
6. The database failover method based on a SAS dual-controller device according to claim 1, characterized in that: the configuration file generated in step 1 includes the ports and port offsets of the database instances on the two controllers of the same SAS dual-controller device, as well as the switching target of the storage pool for a dual-controller switch.
7. The database failover method based on a SAS dual-controller device according to any one of claims 1 to 5, characterized in that: all nodes of the distributed database cluster are distributed across SAS dual-controller devices, each SAS dual-controller device forms one group, and failover can only take place within the same SAS dual-controller device;
on this basis, step 5 specifically includes the following step: updating the data directory information and port number information of the failed database instance on the master data node.
8. A database failover apparatus based on a SAS dual-controller device, characterized in that: the database cluster configuration of the apparatus comprises a database cluster monitoring module and a database instance management module, and the database instance management module comprises a database state judgment module, a controller state judgment module, a storage pool monitoring module and a database instance startup module;
the database cluster monitoring module monitors all data nodes of the database cluster and, when it detects that a data node has failed, sends the information of the failed data node to the database state judgment module;
the database state judgment module, after receiving the information of the failed data node, determines whether the current data node can be connected; if it can, the module notifies the monitoring module to continue monitoring; otherwise it judges the current data node to be in the state "controller normal, database instance down" and sends controller state information to the controller state judgment module;
the controller state judgment module, on receiving the controller state information, determines the state of the controller hosting the failed database instance; if the controller is normal and the database is down, it sends the database instance startup module a message to restart the database instance; if the controller is down, it sends the storage pool monitoring module a signal to monitor the storage pool state;
the storage pool monitoring module, on receiving the signal to monitor the storage pool state, determines whether the storage pool of the failed controller has been switched to the normally working controller; after the storage pool has been switched successfully, it switches the data directory of the failed database instance to the backup working directory, which is a soft link (symbolic link) to the original working directory, and sends the database instance startup module the information required to start the database instance;
the database instance startup module, on receiving the parameters required to start a database instance, performs the operation of starting the database instance.
9. The database failover apparatus based on a SAS dual-controller device according to claim 8, characterized in that: if the normally working controller fails while the storage pool monitoring module and the database instance startup module are working, the work ends immediately.
10. The database failover apparatus based on a SAS dual-controller device according to claim 8, characterized in that: when the database state judgment module judges that the database instance is working normally, it notifies the database cluster monitoring module to continue monitoring all database nodes in the database cluster.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811394323.8A CN109471759B (en) | 2018-11-21 | 2018-11-21 | A kind of database failure switching method and equipment based on SAS dual control equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109471759A true CN109471759A (en) | 2019-03-15 |
CN109471759B CN109471759B (en) | 2019-08-02 |
Family
ID=65673152
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811394323.8A Active CN109471759B (en) | 2018-11-21 | 2018-11-21 | A kind of database failure switching method and equipment based on SAS dual control equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109471759B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101643074A (en) * | 2009-06-30 | 2010-02-10 | 卡斯柯信号有限公司 | Hot-standby system for primary and standby control center |
CN101739406A (en) * | 2008-11-13 | 2010-06-16 | 英业达股份有限公司 | Method for synchronizing file service operations on double-controller |
CN102081579A (en) * | 2009-11-30 | 2011-06-01 | 英业达股份有限公司 | Cache image system and method for storage equipment with dual controllers |
US20140006357A1 (en) * | 2011-11-14 | 2014-01-02 | Panzura, Inc. | Restoring an archived file in a distributed filesystem |
CN103685250A (en) * | 2013-12-04 | 2014-03-26 | 蓝盾信息安全技术股份有限公司 | Virtual machine security policy migration system and method based on SDN |
CN104793896A (en) * | 2015-02-04 | 2015-07-22 | 北京神州云科数据技术有限公司 | Single-control and double-control switching method and device of double-control equipment |
CN106953744A (en) * | 2017-02-27 | 2017-07-14 | 浙江工商大学 | A kind of SDN cluster controllers High Availabitity architecture design method |
CN107733684A (en) * | 2017-08-31 | 2018-02-23 | 北京宇航系统工程研究所 | A kind of multi-controller computing redundancy cluster based on Loongson processor |
- 2018-11-21: CN application CN201811394323.8A filed, patent CN109471759B, status Active
Non-Patent Citations (1)
Title |
---|
刘丹等: "基于Openstack私有云平台的高可用性研究", 《长春理工大学学报 (自然科学版)》 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110086668A (en) * | 2019-04-28 | 2019-08-02 | 杭州迪普科技股份有限公司 | A kind of configuration file switching method and system |
CN110086668B (en) * | 2019-04-28 | 2022-07-29 | 杭州迪普科技股份有限公司 | Configuration file switching method and system |
CN110162428A (en) * | 2019-05-17 | 2019-08-23 | 中国铁道科学研究院集团有限公司 | Method of data synchronization and device, electronic equipment and computer readable storage medium |
CN111400111A (en) * | 2020-03-12 | 2020-07-10 | 北京交大思诺科技股份有限公司 | Safe computer platform with standby machine out-of-step state |
CN111400111B (en) * | 2020-03-12 | 2024-02-27 | 北京交大思诺科技股份有限公司 | Safe computer platform with standby machine out-of-step state |
CN111488245A (en) * | 2020-04-14 | 2020-08-04 | 深圳市小微学苑科技有限公司 | Advanced management method and system for distributed storage |
CN112527595A (en) * | 2020-12-02 | 2021-03-19 | 平安医疗健康管理股份有限公司 | Monitoring method and device for Greenplus cluster database and computer equipment |
CN113687784A (en) * | 2021-08-20 | 2021-11-23 | 浙江大华技术股份有限公司 | Double-control switching storage method and device and electronic equipment |
CN113886490A (en) * | 2021-09-14 | 2022-01-04 | 北京东方金信科技股份有限公司 | Method and system for realizing high availability of stateless computing instances in distributed database |
CN115933565A (en) * | 2022-12-23 | 2023-04-07 | 广东职业技术学院 | AGV task exchange method, device, system and medium |
CN115933565B (en) * | 2022-12-23 | 2023-10-20 | 广东职业技术学院 | AGV task exchange method, device, system and medium |
Also Published As
Publication number | Publication date |
---|---|
CN109471759B (en) | 2019-08-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP01 | Change in the name or title of a patent holder | Address after: 100193 2nd Floor 201, Block B, Building 12, East 10 Wangdong Road, Northwest Haidian District, Beijing; Patentee after: Beijing Gushu Polytron Technologies Inc; Address before: 100193 2nd Floor 201, Block B, Building 12, East 10 Wangdong Road, Northwest Haidian District, Beijing; Patentee before: BEIJING GUSHU TECHNOLOGY Co.,Ltd. |