CN109471759A - A kind of database failure switching method and equipment based on SAS dual control equipment - Google Patents

Info

Publication number
CN109471759A
Authority
CN
China
Prior art keywords
database
controller
sas
dual control
control equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811394323.8A
Other languages
Chinese (zh)
Other versions
CN109471759B (en)
Inventor
杨刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gushu Polytron Technologies Inc
Original Assignee
Beijing Gushu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gushu Technology Co Ltd
Priority to CN201811394323.8A
Publication of CN109471759A
Application granted
Publication of CN109471759B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2089 Redundant storage control functionality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1438 Restarting or rejuvenating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/80 Database-specific techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a database failover method and configuration based on SAS dual-controller equipment, relating to the field of high availability for distributed databases. The method comprises the following steps: heartbeat monitoring is performed on the databases in the SAS dual-controller equipment; when the connection to the controller or database on one side is interrupted while the controller and database on the other side work normally, the database on the interrupted side is judged to be the faulty database. If the faulty controller has not gone down but the database instance has, an attempt is made to start the database instance on that controller. If the faulty controller is down, failover is performed directly: the faulty database instance is started on the controller at the other end, which is working normally, and the storage pool holding the faulty database is switched to that controller. When a database in the SAS dual-controller equipment fails, the invention can switch the faulty database quickly while avoiding data synchronization and saving storage and network resources.

Description

A database failover method and device based on SAS dual-controller equipment
Technical field
The invention belongs to the field of high availability for Internet back-end distributed databases and relates to a failover method for database clusters running on SAS dual-controller equipment.
Background
At present, in the field of high availability for Internet back-end distributed databases, failover is usually carried out by means of database mirror pairs. This approach wastes a large amount of storage space, and data synchronization takes a certain amount of time.
Summary of the invention
In view of the above problems, the object of the invention is to provide a database failover method and device based on SAS (Serial Attached SCSI, i.e. serial attached small computer system interface) dual-controller equipment. When an Internet back-end distributed database cluster fails, the method makes it possible to locate the faulty database node quickly and to perform the failover operation quickly. Failover according to the invention saves storage space and data synchronization time.
To achieve the above object, the invention adopts the following technical solution: in the field of high availability for Internet back-end distributed databases, SAS dual-controller equipment is used to configure and manage the databases, so that when a database in the SAS dual-controller equipment fails, the faulty database can be switched quickly while data synchronization is avoided and storage and network resources are saved.
The method involves a distributed database cluster comprising at least one master database server (master) and at least one SAS dual-controller device containing two data nodes (segment hosts); each controller has its own independent CPU, memory and storage.
The method comprises the following steps:
Step 1: generate the corresponding configuration file, which is used to manage database instance failover;
Step 2: monitor all data nodes in the database cluster; when a data node is found to be inaccessible, judge whether the controller hosting the current database instance is working normally;
Step 3: if the controller hosting the current database instance is working normally, judge the current database state to be "controller normal, database instance down", and restart the database instance on that controller;
Step 4: if the controller hosting the current database instance is working normally but a certain number of attempts to restart the database instance on that controller have failed, judge the current database state to be "controller down, database instance down"; then switch the storage pool of the faulty controller to the normally working controller and start the database instance on the normally working controller;
Step 5: if the controller hosting the current database instance has failed, judge the current database state to be "controller down, database instance down"; then switch the storage pool of the faulty controller to the normally working controller and start the database instance on the normally working controller.
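The decision logic of steps 2 to 5 can be sketched as a small pure function. This is a minimal illustration only: the names `Action` and `choose_action` and the way the restart counter is passed in are assumptions made for the example, not part of the patent text.

```python
from enum import Enum


class Action(Enum):
    CONTINUE_MONITORING = "continue_monitoring"
    RESTART_IN_PLACE = "restart_in_place"
    FAIL_OVER = "fail_over"


def choose_action(node_reachable: bool, controller_ok: bool,
                  failed_restarts: int, max_restarts: int) -> Action:
    """Map the observed state to the action described in steps 2 to 5."""
    if node_reachable:
        # Nothing is wrong: keep monitoring (step 2).
        return Action.CONTINUE_MONITORING
    if not controller_ok:
        # Step 5: the controller itself is down, so fail over directly.
        return Action.FAIL_OVER
    if failed_restarts >= max_restarts:
        # Step 4: the controller looked healthy, but restarts keep failing,
        # so the state is treated as "controller down".
        return Action.FAIL_OVER
    # Step 3: controller normal, instance down: restart on the same controller.
    return Action.RESTART_IN_PLACE
```

Keeping the decision free of side effects makes each of the three states easy to test separately from the code that actually restarts instances or switches storage pools.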
By adopting the above technical solution, the invention has the following advantages:
1. Data replication between the two controllers during normal operation is reduced, saving storage and network resources.
2. The resynchronization process of the formerly faulty device during recovery is avoided, shortening recovery time.
Brief description of the drawings
Fig. 1 is a system architecture diagram of distributed database cluster failover based on SAS dual-controller equipment;
Fig. 2 is a flowchart of distributed database cluster failover based on SAS dual-controller equipment.
Detailed description of the embodiments
The invention is described in detail below with reference to the drawings and embodiments.
As shown in Fig. 1, the structure used by the invention to implement the above method on a database cluster in SAS dual-controller equipment comprises a database cluster monitoring module and a database instance management module. The database instance management module comprises a database state judgment module, a controller state judgment module, a storage pool monitoring module and a database instance startup module.
The database cluster monitoring module monitors all segment data nodes of the database cluster; when it detects that a segment data node has failed, it sends the information of the failed segment data node to the database state judgment module.
The database state judgment module, after receiving the information of the failed segment data node, judges whether the current segment data node can be connected. If it can, the module notifies the monitoring module to continue monitoring; otherwise it judges the state of the current segment data node to be "controller normal, database instance down" and sends controller state information to the controller state judgment module.
The controller state judgment module, on receiving controller state information, determines the state of the controller hosting the faulty database instance. If the controller is normal and the database is down, it sends a message to the database instance startup module to restart the database instance; if the controller is down, it sends a signal to the storage pool monitoring module to monitor the storage pool state.
The storage pool monitoring module, on receiving the signal to monitor the storage pool state, judges whether the storage pool of the faulty controller has been switched to the normally working controller. After the storage pool has been switched successfully, it switches the data directory of the faulty database instance to a backup working directory, which is a symbolic link (soft link) to the original working directory, and sends the information required to start the database instance to the database instance startup module.
The database instance startup module, on receiving the parameters required to start the database instance, performs the operation of starting the database instance.
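The directory switch performed by the storage pool monitoring module, in which the backup working directory is a soft link to the original working directory, might look roughly like the sketch below. The function name `link_backup_directory` and its arguments are hypothetical, and the sketch assumes a POSIX-style file system where the storage pool has already been mounted on the surviving controller.

```python
import os


def link_backup_directory(original_dir: str, backup_dir: str) -> str:
    """Make backup_dir a symbolic link to original_dir.

    Returns the path the restarted instance should use as its data directory.
    """
    if os.path.islink(backup_dir):
        if os.path.realpath(backup_dir) == os.path.realpath(original_dir):
            return backup_dir  # link already in place, nothing to do
        os.unlink(backup_dir)  # stale link pointing somewhere else
    elif os.path.exists(backup_dir):
        # Refuse to overwrite a real directory or file.
        raise FileExistsError(f"{backup_dir} exists and is not a symlink")
    os.symlink(original_dir, backup_dir)
    return backup_dir
```

Because the link only points at the existing data, no files are copied, which is consistent with the stated goal of avoiding data synchronization.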
Referring to Fig. 1, the distributed database cluster in this embodiment comprises at least one master database server and at least one SAS dual-controller device server containing two segment data nodes. During normal operation, each controller has its own independent CPU, memory and storage.
As shown in Fig. 2, the detailed flow of the method in this embodiment is as follows:
S0: generate the corresponding configuration file according to the configuration; when a database instance fails over, the failover process follows the requirements of the configuration file.
The configuration file generated in S0 contains the ports and port offsets of the database instances on the two controllers of the same SAS dual-controller device, as well as the switching target of the storage pool when the controllers are switched.
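As an illustration of what such a configuration might hold, the sketch below models one entry per database instance. All field names are hypothetical, and the rule that the standby port equals the port plus the port offset is an assumption made for the example; the source only states that ports, port offsets and the storage pool switching target are recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceConfig:
    name: str            # hypothetical instance label
    controller: str      # controller that normally hosts the instance
    port: int            # port used during normal operation
    port_offset: int     # offset applied after failover (assumed semantics)
    pool: str            # storage pool holding the instance data
    switch_target: str   # controller that takes over the pool
    max_restart_attempts: int = 3

    @property
    def standby_port(self) -> int:
        # Assumption: standby port = normal port + configured offset.
        return self.port + self.port_offset


def validate(configs):
    """Reject entries whose standby port would collide on the target controller."""
    used = {(c.controller, c.port) for c in configs}
    for c in configs:
        if c.switch_target == c.controller:
            raise ValueError(f"{c.name}: switch target must be the peer controller")
        if (c.switch_target, c.standby_port) in used:
            raise ValueError(
                f"{c.name}: standby port {c.standby_port} "
                f"is already used on {c.switch_target}")
```

Validating the file when it is generated, rather than at failover time, means a port collision is found before it can block a takeover.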
S1: monitor all segment data nodes in the database cluster; when a segment data node is found to be inaccessible, go to S2;
Database monitoring in S1 works as follows: heartbeat detection is performed on all segment database instances using multiple threads; when a segment database instance cannot be connected, that database instance is judged to have failed. If neither the database instances nor the controllers have failed, S1 is executed again.
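A multithreaded heartbeat check of this kind could be sketched with Python's standard thread pool. The function names are hypothetical, and a plain TCP connection attempt stands in for the heartbeat here; a real deployment would more likely run a lightweight query against each instance.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_unreachable(instances, probe=is_reachable, workers: int = 8):
    """Probe all (host, port) pairs concurrently and return the failed ones."""
    instances = list(instances)
    if not instances:
        return []
    with ThreadPoolExecutor(max_workers=min(workers, len(instances))) as executor:
        results = list(executor.map(lambda hp: probe(*hp), instances))
    return [inst for inst, ok in zip(instances, results) if not ok]
```

Probing in parallel keeps one slow or hung node from delaying detection on the others; the `probe` parameter allows the connection test to be replaced, for example in unit tests.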
S2: judge whether the controller hosting the current database instance is working normally. If so, the current database state is "controller normal, database instance down"; go to S3. If the controller hosting the current database instance has failed, the current database state is "controller down, database instance down"; go to S4;
S3: the controller hosting the faulty database instance is working normally, so restart the database instance on that controller. If the start succeeds, go to S1; if the start still fails after a certain number of attempts, judge the state to be "controller down, database instance down" and go to S4;
In S3, the number of attempts to restart the database instance is determined by the configuration file, which prevents a database instance from being wrongly judged as failed because of network congestion.
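The bounded restart in S3 amounts to a retry loop whose limit comes from the configuration. The following sketch assumes a hypothetical callable `start_instance` that returns `True` on success; the pause between attempts is also an assumption, added so that a brief network stall has time to clear.

```python
import time


def restart_with_retries(start_instance, max_attempts: int,
                         delay_seconds: float = 1.0, sleep=time.sleep) -> bool:
    """Try to start the instance up to max_attempts times.

    Returns True on the first success, False if every attempt fails
    (the caller then proceeds to S4).
    """
    for attempt in range(1, max_attempts + 1):
        if start_instance():
            return True
        if attempt < max_attempts:
            sleep(delay_seconds)  # wait before the next attempt
    return False
```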
S4: switch the storage pool of the faulty controller to the normally working controller, then go to S5;
In S4, the directory of the database instance also needs to be switched to the specified backup directory, so that the database instance can find its data through the backup directory and confusion with the data directories on the normal controller is avoided;
S5: start the database instance on the normally working controller; the flow ends.
In S5, to avoid conflicts with the port numbers of the database instances on the normal controller, a standby port number for the faulty database instance must be configured in the configuration file, and the database instance is started on that standby port.
If the normally working controller fails while the storage pool monitoring module and the database instance startup module are working, the work is terminated directly.
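The S4 to S5 sequence, including the rule that work stops if the surviving controller itself fails midway, can be sketched as follows. The callables passed in (`controller_ok`, `switch_pool`, `switch_directory`, `start_instance`) are hypothetical stand-ins for the modules described above, and the returned strings are only illustrative status values.

```python
def fail_over(peer, controller_ok, switch_pool, switch_directory, start_instance):
    """Run S4 and S5 on the peer controller, aborting if the peer fails."""
    steps = [
        ("switch storage pool", switch_pool),        # S4
        ("switch data directory", switch_directory),  # S4, backup directory
        ("start instance", start_instance),           # S5, on the standby port
    ]
    for name, step in steps:
        # Re-check the surviving controller before every step; if it has
        # failed as well, there is nowhere left to fail over to.
        if not controller_ok(peer):
            return f"aborted before: {name}"
        if not step(peer):
            return f"failed: {name}"
    return "failed over"
```

Checking the peer before each step rather than once at the start reflects that the healthy controller may fail at any point during the switch.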
The invention is not limited to the above embodiments. Those skilled in the art can make several improvements and modifications without departing from the principle of the invention, and such improvements and modifications are also regarded as falling within the scope of protection of the invention. Content not described in detail in this specification belongs to the prior art known to those skilled in the art.

Claims (10)

1. A database failover method and device based on SAS dual-controller equipment, the method involving a distributed database cluster with at least one database server and at least one SAS dual-controller device containing two data nodes, each controller having its own independent CPU, memory and storage;
characterized in that the database failover method based on SAS dual-controller equipment comprises the following steps:
Step 1: generate the corresponding configuration file, which is used to manage database instance failover;
Step 2: monitor all data nodes in the database cluster; when a data node is found to be inaccessible, judge whether the controller hosting the current database instance is working normally;
Step 3: if the controller hosting the current database instance is working normally, judge the current database state to be "controller normal, database instance down", and restart the database instance on that controller;
Step 4: if the controller hosting the current database instance is working normally but a certain number of attempts to restart the database instance on that controller have failed, judge the current database state to be "controller down, database instance down"; then switch the storage pool of the faulty controller to the normally working controller and start the database instance on the normally working controller;
Step 5: if the controller hosting the current database instance has failed, judge the current database state to be "controller down, database instance down"; then switch the storage pool of the faulty controller to the normally working controller and start the database instance on the normally working controller.
2. The database failover method based on SAS dual-controller equipment according to claim 1, characterized in that: in step 5, the directory of the database instance also needs to be switched to the specified backup directory, so that the database instance can find its data through the backup directory and confusion with the data directories on the normal controller is avoided.
3. The database failover method based on SAS dual-controller equipment according to claim 1, characterized in that: in step 5, to avoid conflicts with the port numbers of the database instances on the normal controller, a standby port number for the faulty database instance must be configured in the configuration file, and the database instance is started on the standby port.
4. The database failover method based on SAS dual-controller equipment according to claim 1, characterized in that: in step 4, the number of attempts to restart the database instance is determined by the configuration file, which prevents a database instance from being wrongly judged as failed because of network congestion.
5. The database failover method based on SAS dual-controller equipment according to claim 1, characterized in that: database monitoring in step 2 works as follows: heartbeat detection is performed on all database instances; when a database instance cannot be connected, that database instance is judged to have failed.
6. The database failover method based on SAS dual-controller equipment according to claim 1, characterized in that: the configuration file generated in step 1 contains the ports and port offsets of the database instances on the two controllers of the same SAS dual-controller device, as well as the switching target of the storage pool when the controllers are switched.
7. The database failover method based on SAS dual-controller equipment according to any one of claims 1 to 5, characterized in that: all nodes in the distributed database cluster are distributed across SAS dual-controller devices, each SAS dual-controller device forms one group, and failover can take place only within the same SAS dual-controller device;
on this basis, step 5 specifically comprises the following step: updating the data directory information and port number information of the faulty database instance in the master data node.
8. A database failover device based on SAS dual-controller equipment, characterized in that: the database cluster configuration of the device comprises a database cluster monitoring module and a database instance management module, the database instance management module comprising a database state judgment module, a controller state judgment module, a storage pool monitoring module and a database instance startup module;
the database cluster monitoring module monitors all data nodes of the database cluster and, when it detects that a data node has failed, sends the information of the failed data node to the database state judgment module;
the database state judgment module, after receiving the information of the failed data node, judges whether the current data node can be connected; if it can, the module notifies the monitoring module to continue monitoring; otherwise it judges the state of the current data node to be "controller normal, database instance down" and sends controller state information to the controller state judgment module;
the controller state judgment module, on receiving controller state information, determines the state of the controller hosting the faulty database instance; if the controller is normal and the database is down, it sends a message to the database instance startup module to restart the database instance; if the controller is down, it sends a signal to the storage pool monitoring module to monitor the storage pool state;
the storage pool monitoring module, on receiving the signal to monitor the storage pool state, judges whether the storage pool of the faulty controller has been switched to the normally working controller; after the storage pool has been switched successfully, it switches the data directory of the faulty database instance to a backup working directory, which is a symbolic link (soft link) to the original working directory, and sends the information required to start the database instance to the database instance startup module;
the database instance startup module, on receiving the parameters required to start the database instance, performs the operation of starting the database instance.
9. The database failover device based on SAS dual-controller equipment according to claim 8, characterized in that: if the normally working controller fails while the storage pool monitoring module and the database instance startup module are working, the work is terminated directly.
10. The database failover device based on SAS dual-controller equipment according to claim 8, characterized in that: when the database state judgment module judges that the database instance is working normally, it notifies the database cluster monitoring module to continue monitoring all database nodes in the database cluster.
CN201811394323.8A 2018-11-21 2018-11-21 A kind of database failure switching method and equipment based on SAS dual control equipment Active CN109471759B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811394323.8A CN109471759B (en) 2018-11-21 2018-11-21 A kind of database failure switching method and equipment based on SAS dual control equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811394323.8A CN109471759B (en) 2018-11-21 2018-11-21 A kind of database failure switching method and equipment based on SAS dual control equipment

Publications (2)

Publication Number Publication Date
CN109471759A true CN109471759A (en) 2019-03-15
CN109471759B CN109471759B (en) 2019-08-02

Family

ID=65673152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811394323.8A Active CN109471759B (en) 2018-11-21 2018-11-21 A kind of database failure switching method and equipment based on SAS dual control equipment

Country Status (1)

Country Link
CN (1) CN109471759B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101643074A (en) * 2009-06-30 2010-02-10 卡斯柯信号有限公司 Hot-standby system for primary and standby control center
CN101739406A (en) * 2008-11-13 2010-06-16 英业达股份有限公司 Method for synchronizing file service operations on double-controller
CN102081579A (en) * 2009-11-30 2011-06-01 英业达股份有限公司 Cache image system and method for storage equipment with dual controllers
US20140006357A1 (en) * 2011-11-14 2014-01-02 Panzura, Inc. Restoring an archived file in a distributed filesystem
CN103685250A (en) * 2013-12-04 2014-03-26 蓝盾信息安全技术股份有限公司 Virtual machine security policy migration system and method based on SDN
CN104793896A (en) * 2015-02-04 2015-07-22 北京神州云科数据技术有限公司 Single-control and double-control switching method and device of double-control equipment
CN106953744A (en) * 2017-02-27 2017-07-14 浙江工商大学 A kind of SDN cluster controllers High Availabitity architecture design method
CN107733684A (en) * 2017-08-31 2018-02-23 北京宇航系统工程研究所 A kind of multi-controller computing redundancy cluster based on Loongson processor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘丹等: "基于Openstack私有云平台的高可用性研究", 《长春理工大学学报 (自然科学版)》 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110086668A (en) * 2019-04-28 2019-08-02 杭州迪普科技股份有限公司 A kind of configuration file switching method and system
CN110086668B (en) * 2019-04-28 2022-07-29 杭州迪普科技股份有限公司 Configuration file switching method and system
CN110162428A (en) * 2019-05-17 2019-08-23 中国铁道科学研究院集团有限公司 Method of data synchronization and device, electronic equipment and computer readable storage medium
CN111400111A (en) * 2020-03-12 2020-07-10 北京交大思诺科技股份有限公司 Safe computer platform with standby machine out-of-step state
CN111400111B (en) * 2020-03-12 2024-02-27 北京交大思诺科技股份有限公司 Safe computer platform with standby machine out-of-step state
CN111488245A (en) * 2020-04-14 2020-08-04 深圳市小微学苑科技有限公司 Advanced management method and system for distributed storage
CN112527595A (en) * 2020-12-02 2021-03-19 平安医疗健康管理股份有限公司 Monitoring method and device for Greenplus cluster database and computer equipment
CN113687784A (en) * 2021-08-20 2021-11-23 浙江大华技术股份有限公司 Double-control switching storage method and device and electronic equipment
CN113886490A (en) * 2021-09-14 2022-01-04 北京东方金信科技股份有限公司 Method and system for realizing high availability of stateless computing instances in distributed database
CN115933565A (en) * 2022-12-23 2023-04-07 广东职业技术学院 AGV task exchange method, device, system and medium
CN115933565B (en) * 2022-12-23 2023-10-20 广东职业技术学院 AGV task exchange method, device, system and medium

Also Published As

Publication number Publication date
CN109471759B (en) 2019-08-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100193 2nd Floor 201, Block B, Building 12, East 10 Wangdong Road, Northwest Haidian District, Beijing

Patentee after: Beijing Gushu Polytron Technologies Inc

Address before: 100193 2nd Floor 201, Block B, Building 12, East 10 Wangdong Road, Northwest Haidian District, Beijing

Patentee before: BEIJING GUSHU TECHNOLOGY Co.,Ltd.