CN105528259A - Application-level disaster recovery automatic switching control design method - Google Patents

Info

Publication number
CN105528259A
Authority
CN
China
Prior art keywords
disaster tolerance
agent
disaster
management
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610114127.5A
Other languages
Chinese (zh)
Other versions
CN105528259B (en)
Inventor
李井鹏
武丽萍
张玉海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Communication Information System Co Ltd
Original Assignee
Inspur Communication Information System Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Communication Information System Co Ltd
Priority to CN201610114127.5A
Publication of CN105528259A
Application granted
Publication of CN105528259B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668 Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/875 Monitoring of systems including the internet

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)

Abstract

The invention relates to an application-level disaster recovery automatic switching control design method. In this method, the disaster recovery management software comprises WEB management service software serving as the foreground and a BM management server serving as the background. The method can analyze the state of every node of the system in advance. When a disaster occurs, the disaster recovery switching control program analyzes and judges the system state, selects the required switching commands, and completes the switchover in a shorter time, which shortens service recovery time and reduces the economic loss caused by prolonged service outages. Because the whole switching process is completed automatically by computer, the method also lowers the technical skill required of maintenance personnel, reduces maintenance cost, and improves working efficiency.

Description

An application-level disaster recovery automatic switching control design method
Technical field
The present invention relates to the technical field of disaster recovery control and management, and in particular to an application-level disaster recovery automatic switching control design method.
Background art
In the information age, computer information systems are increasingly important to daily life, and important information systems are deployed centrally in data centers. After years of continuous operation, these systems have accumulated large amounts of valuable data. Natural disasters and human error can both paralyze an information system and cause heavy losses. Since system disasters cannot be avoided entirely, actively building a disaster recovery system has become the inevitable choice for important information systems.
When a disaster strikes the production system, what matters most is that the disaster recovery system completes the switchover accurately and quickly, takes over from the original production system, and continues to provide service externally, reducing the impact and loss caused by the disaster. To guard against such damage, disaster recovery systems have been built for some key business systems. When a disaster occurs and the production system becomes unusable, the disaster recovery system replaces it in providing information services.
Switching from the production system to the disaster recovery system involves many technical issues, such as network address switching and data consistency. There are many operation steps, the decision conditions are complex, and many specialized commands are required. Engineers who enter the switching commands one by one are prone to error and take a long time, which delays the moment the disaster recovery system comes into service.
To address these shortcomings of existing disaster recovery systems, the present invention provides an application-level disaster recovery automatic switching control design method. When a system disaster occurs, a one-click automatic switching program replaces manual command entry, so that the switchover to the disaster recovery system completes automatically and efficiently.
Summary of the invention
To remedy the defects of the prior art, the present invention provides an application-level disaster recovery automatic switching control design method based on an expandable container.
The present invention is achieved through the following technical solution:
An application-level disaster recovery automatic switching control design method, characterized in that: the disaster recovery management software comprises two parts, WEB management service software serving as the foreground and a BM management server serving as the background. The WEB management service software provides the display interface and operating functions. The BM management server contains the background main control program (the disaster recovery management Server) and an Agent on each controlled host, and is used to implement the communication service between the disaster recovery management host and the controlled hosts and to transmit switching commands. An Agent is deployed on each server node to communicate with the disaster recovery management Server and to receive instructions from it.
The method comprises the following steps:
(1) When switching starts, open the WEB management service software and invoke the disaster recovery management Server process from the WEB management service software page; when switching is stopped or completed, the disaster recovery management Server process can be terminated from the same page;
When switching starts, the WEB management service software page sends the BM management server an instruction, prepared in advance, to start switching on the servers of the corresponding Agents; the BM management server starts an AgentJob on each corresponding Agent, which runs according to the instruction until switching is complete;
(2) The initialization routine checks the data state in the disaster recovery management database and reads in the initialization data; as the switching steps progress, the data state in the database is updated in real time, keeping the foreground page consistent with the background database;
(3) The WEB management service software page displays the switching state in real time according to the data in the disaster recovery management database; if a problem occurs during switching and the displayed state is wrong, the status data in the database is modified manually;
(4) During switching, the disaster recovery management Server acts as the bridge between the disaster recovery management database and the Agent clients: it obtains the instruction for the Agent's next operation from the disaster recovery management database and sends it to the Agent, then obtains the execution result and state value of the instruction and writes them to the data tables of the disaster recovery management database;
(5) After the AgentJob on an Agent client starts, it interacts with the disaster recovery management Server, sending it the current state or the result of the previous operation and obtaining the next operation instruction; after execution finishes, when the disaster recovery management Server process stops, the Job stops as well.
In step (1), the WEB management service software page is written in JAVA and controls the defined disaster recovery management Server process through a start/stop button; instructions initiated from the WEB management service software page are distributed as tasks by the BM management server and then delivered to the Agent side, which executes the corresponding switching script. In step (2), the disaster recovery management database stores the state of the switching flow, and the progress state of the flow is updated in real time, keeping the foreground consistent with the background database.
In step (3), each component in the automatic switching control flow, including the production system database, production system middleware, production system WEB, disaster recovery system database, disaster recovery system middleware and disaster recovery system WEB, has its state represented in the database.
In step (4), the disaster recovery management Server process is created to serve as the data bridge between the disaster recovery management database and the Agent client proxies, and promptly transfers Agent state into the disaster recovery management database. In step (5), the AgentJob process on the Agent client and the disaster recovery management Server process both start when a task starts, stop when the task ends, and release their resources.
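The bridging role described in steps (4) and (5) can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the `switch_step` table, its columns, and the `send_to_agent` callback are hypothetical names, and an in-memory SQLite database stands in for the disaster recovery management database, whose actual schema and transport the patent does not specify.

```python
import sqlite3


def make_db(steps):
    """Create an in-memory stand-in for the disaster recovery management database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE switch_step ("
        " seq INTEGER PRIMARY KEY, host TEXT, command TEXT,"
        " status TEXT DEFAULT 'pending', result TEXT)"
    )
    db.executemany(
        "INSERT INTO switch_step (seq, host, command) VALUES (?, ?, ?)", steps
    )
    db.commit()
    return db


def run_switch(db, send_to_agent):
    """Relay each pending step to its Agent and record the outcome.

    send_to_agent(host, command) -> (ok, output) is a hypothetical transport;
    the source does not describe the wire protocol.
    """
    while True:
        row = db.execute(
            "SELECT seq, host, command FROM switch_step"
            " WHERE status = 'pending' ORDER BY seq LIMIT 1"
        ).fetchone()
        if row is None:
            return True  # no pending steps remain
        seq, host, command = row
        ok, output = send_to_agent(host, command)
        # Write the execution result and state value back to the database.
        db.execute(
            "UPDATE switch_step SET status = ?, result = ? WHERE seq = ?",
            ("done" if ok else "failed", output, seq),
        )
        db.commit()
        if not ok:
            return False  # stop here so an operator can intervene


def fake_agent(host, command):
    """Stand-in Agent for demonstration: fails only for the command 'fail'."""
    return (command != "fail", f"{host}: {command}")
```

Stopping at the first failed step is a design choice of this sketch; it mirrors the manual intervention path in step (3), where an operator corrects the status data before switching continues.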
The beneficial effects of the invention are as follows: the application-level disaster recovery automatic switching control design method can analyze the state of every node of the system in advance. When a disaster occurs, the disaster recovery switching control program analyzes and judges the system state, selects the required switching commands, and completes the switchover in a shorter time, which shortens business recovery time and reduces the economic loss caused by prolonged business outages. Because the whole switching process is handed to the computer and completed automatically, it is accurate and efficient, lowers the technical skill required of maintenance personnel, reduces maintenance cost, and improves work efficiency.
Brief description of the drawings
Figure 1 is a schematic flow diagram of automatic disaster recovery system switching according to the present invention.
Figure 2 is a schematic diagram of the logical structure of the disaster recovery management software according to the present invention.
Figure 3 is a schematic flow diagram of disaster recovery automatic switching command delivery according to the present invention.
Figure 4 is a schematic flow diagram of adding and deleting disaster recovery systems in the disaster recovery management software according to the present invention.
Detailed description
To make the technical problem to be solved, the technical solution and the beneficial effects of the present invention clearer, the invention is described in detail below with reference to the drawings and embodiments. It should be noted that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
In this application-level disaster recovery automatic switching control design method, the disaster recovery management software comprises two parts, WEB management service software serving as the foreground and a BM management server serving as the background. The WEB management service software provides the display interface and operating functions. The BM management server contains the background main control program (the disaster recovery management Server) and an Agent on each controlled host, and is used to implement the communication service between the disaster recovery management host and the controlled hosts and to transmit switching commands. An Agent is deployed on each server node to communicate with the disaster recovery management Server and to receive instructions from it.
The method comprises the following steps:
(1) When switching starts, open the WEB management service software and invoke the disaster recovery management Server process from the WEB management service software page; when switching is stopped or completed, the disaster recovery management Server process can be terminated from the same page;
When switching starts, the WEB management service software page sends the BM management server an instruction, prepared in advance, to start switching on the servers of the corresponding Agents; the BM management server starts an AgentJob on each corresponding Agent, which runs according to the instruction until switching is complete;
(2) The initialization routine checks the data state in the disaster recovery management database and reads in the initialization data; as the switching steps progress, the data state in the database is updated in real time, keeping the foreground page consistent with the background database;
(3) The WEB management service software page displays the switching state in real time according to the data in the disaster recovery management database; if a problem occurs during switching and the displayed state is wrong, the status data in the database is modified manually;
(4) During switching, the disaster recovery management Server acts as the bridge between the disaster recovery management database and the Agent clients: it obtains the instruction for the Agent's next operation from the disaster recovery management database and sends it to the Agent, then obtains the execution result and state value of the instruction and writes them to the data tables of the disaster recovery management database;
(5) After the AgentJob on an Agent client starts, it interacts with the disaster recovery management Server, sending it the current state or the result of the previous operation and obtaining the next operation instruction; after execution finishes, when the disaster recovery management Server process stops, the Job stops as well.
In step (1), the WEB management service software page is written in JAVA and controls the defined disaster recovery management Server process through a start/stop button; instructions initiated from the WEB management service software page are distributed as tasks by the BM management server and then delivered to the Agent side, which executes the corresponding switching script. In step (2), the disaster recovery management database stores the state of the switching flow, and the progress state of the flow is updated in real time, keeping the foreground consistent with the background database.
In step (3), each component in the automatic switching control flow, including the production system database, production system middleware, production system WEB, disaster recovery system database, disaster recovery system middleware and disaster recovery system WEB, has its state represented in the database.
In step (4), the disaster recovery management Server process is created to serve as the data bridge between the disaster recovery management database and the Agent client proxies, and promptly transfers Agent state into the disaster recovery management database. In step (5), the AgentJob process on the Agent client and the disaster recovery management Server process both start when a task starts, stop when the task ends, and release their resources.
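As an illustration of the per-component state tracking in step (3), the following Python sketch models the six components and a manual correction of a wrongly displayed state. The `State` values, the component identifiers and the function names are assumptions introduced here; the patent says only that each component's state is represented in the database and may be modified manually.

```python
from enum import Enum


class State(Enum):
    # Hypothetical state values; the source does not enumerate them.
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    ERROR = "error"


# The six components named in step (3).
COMPONENTS = [
    "production_database", "production_middleware", "production_web",
    "dr_database", "dr_middleware", "dr_web",
]


def new_status():
    """Initial status record: every component is waiting to be switched."""
    return {name: State.PENDING for name in COMPONENTS}


def overall(status):
    """Summarize component states into the single state a page could display."""
    values = set(status.values())
    if State.ERROR in values:
        return State.ERROR
    if values == {State.DONE}:
        return State.DONE
    if values == {State.PENDING}:
        return State.PENDING
    return State.RUNNING


def manual_override(status, component, state):
    """Operator correction of one component's recorded state."""
    if component not in status:
        raise KeyError(component)
    status[component] = state
```

Treating any single `ERROR` as the overall state is a choice made for this sketch, so that a fault on one host is never hidden by progress on the others.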
The WEB management service software (foreground) provides the display interface and operating functions. When a disaster strikes the production system and a disaster recovery switchover is needed, operations staff log in to the WEB management service software interface and click Switch on the page to start the switching flow. Once the flow has started, the page shows the switching progress; the displayed host status, the state updates in the disaster recovery system database, and the page are synchronized automatically. The disaster recovery system administrator can monitor the switchover in real time, checking its progress and whether the whole flow has completed. If a fault occurs during switching and manual intervention is required, the page provides an interactive entry point for manual control, together with reference instructions. When switching completes or needs to be stopped early, the switching flow can be terminated.
The BM management server (background) contains the background main control program (the disaster recovery management Server) and an Agent on each controlled host, and implements the communication service between the disaster recovery management host and the controlled hosts as well as the transmission of switching commands.
The business logic is sorted out, and startup and shutdown scripts are written for the services carried by the application servers, the disaster recovery system database servers and the interface servers.
An Agent is deployed on each server node to communicate with the disaster recovery management Server and to receive instructions from it.
The switching command delivery flow is designed as follows. An execution command is initiated from the WEB management service software interface and passed to the background service process of the BM management server. The background service process connects to the Agent service on the host node and passes the instruction to it. The Agent starts a Job task that calls the executable script, and after the script finishes, the execution status is returned to the BM management server.
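The Agent side of this delivery flow, where a Job runs an executable script and reports its status, might look like the Python sketch below. The instruction names, the `SCRIPTS` mapping and the result dictionary are hypothetical; in a real deployment the entries would point at the startup and shutdown scripts written for each service, and the result would travel back to the BM management server over whatever channel the Agent uses.

```python
import subprocess
import sys


def run_job(argv, timeout=60):
    """Run one switching script as a Job and report its execution status."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "exit_code": None, "output": ""}
    except OSError as exc:  # for example, the script file does not exist
        return {"status": "failed", "exit_code": None, "output": str(exc)}
    return {
        "status": "success" if proc.returncode == 0 else "failed",
        "exit_code": proc.returncode,
        "output": proc.stdout.strip(),
    }


def handle_instruction(instruction, scripts):
    """Agent-side handler: look up the script for an instruction and run it."""
    argv = scripts.get(instruction)
    if argv is None:
        return {"status": "unknown_instruction", "exit_code": None, "output": ""}
    return run_job(argv)


# Demonstration scripts only: tiny Python one-liners stand in for real
# service startup and shutdown scripts.
SCRIPTS = {
    "start_middleware": [sys.executable, "-c", "print('middleware started')"],
    "broken_step": [sys.executable, "-c", "import sys; sys.exit(3)"],
}
```

Mapping instructions to a fixed table of scripts, rather than executing arbitrary command strings sent over the network, is an assumption of this sketch that keeps the Agent from running anything it was not configured for.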
The disaster recovery management software is designed to manage multiple disaster recovery systems at the same time. To add or delete a business system in the disaster recovery management software, Excel is used as the visual editing tool: a perl script imports the content written in the Excel file, which follows a set format, into database tables, changing the data in the background disaster recovery system database. The create.vbs script can then be executed to read the data from the background database and generate the create.js script file. The create.js file is called when the WEB management service software interface is displayed, so that the foreground web page shows the changed business content written in the Excel file.
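The import-then-generate pipeline can be sketched in Python as follows. This is a stand-in, not the patent's tooling: CSV text replaces the Excel file, SQLite replaces the background database, and the `system_info` table, the `action`, `system_id` and `name` columns, and the `var systems = ...;` output format are all assumptions made for illustration. The patent itself uses a perl script for the import and create.vbs to produce create.js.

```python
import csv
import io
import json
import sqlite3


def import_systems(db, csv_text):
    """Apply add/delete rows from the authoring sheet to the database table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS system_info ("
        " system_id TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    count = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["action"] == "delete":
            db.execute(
                "DELETE FROM system_info WHERE system_id = ?", (row["system_id"],)
            )
        else:
            # Adding an existing id replaces the old row instead of failing.
            db.execute(
                "INSERT OR REPLACE INTO system_info VALUES (?, ?)",
                (row["system_id"], row["name"]),
            )
        count += 1
    db.commit()
    return count


def render_js(db):
    """Generate script text that a web page could load to list the systems."""
    rows = db.execute(
        "SELECT system_id, name FROM system_info ORDER BY system_id"
    ).fetchall()
    data = [{"id": sid, "name": name} for sid, name in rows]
    return "var systems = " + json.dumps(data) + ";"
```

Regenerating the script file from the database, rather than from the sheet directly, keeps the database as the single source of truth for what the page displays.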
The disaster recovery system database can use any common relational database. It defines a system basic information table, a host status table, a host basic information table, a status update table and an operation log table. The tables are associated through primary keys.
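One possible layout for these five tables is sketched below in Python with SQLite. Only the table purposes come from the text; every table name, column name and key relationship here is an assumption, since the patent does not list the fields.

```python
import sqlite3

# Hypothetical schema: five tables linked through their primary keys.
SCHEMA = """
CREATE TABLE system_info (
    system_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE host_info (
    host_id INTEGER PRIMARY KEY,
    system_id INTEGER NOT NULL REFERENCES system_info(system_id),
    hostname TEXT NOT NULL,
    role TEXT NOT NULL
);
CREATE TABLE host_status (
    host_id INTEGER PRIMARY KEY REFERENCES host_info(host_id),
    state TEXT NOT NULL
);
CREATE TABLE status_update (
    update_id INTEGER PRIMARY KEY,
    host_id INTEGER NOT NULL REFERENCES host_info(host_id),
    new_state TEXT NOT NULL,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE operation_log (
    log_id INTEGER PRIMARY KEY,
    host_id INTEGER REFERENCES host_info(host_id),
    message TEXT NOT NULL,
    logged_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def create_db():
    """Create an in-memory database with foreign key enforcement enabled."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    db.executescript(SCHEMA)
    return db
```

Keeping the current state in `host_status` and the history in `status_update` is a choice of this sketch: the page can read one row per host for display while the full sequence of changes remains available for audit.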

Claims (5)

1. An application-level disaster recovery automatic switching control design method, characterized in that: the disaster recovery management software comprises two parts, WEB management service software serving as the foreground and a BM management server serving as the background; the WEB management service software provides the display interface and operating functions; the BM management server contains the background main control program (the disaster recovery management Server) and an Agent on each controlled host, and is used to implement the communication service between the disaster recovery management host and the controlled hosts and to transmit switching commands; an Agent is deployed on each server node to communicate with the disaster recovery management Server and to receive instructions from it.
2. The application-level disaster recovery automatic switching control design method according to claim 1, characterized by comprising the following steps:
(1) When switching starts, open the WEB management service software and invoke the disaster recovery management Server process from the WEB management service software page; when switching is stopped or completed, the disaster recovery management Server process can be terminated from the same page;
When switching starts, the WEB management service software page sends the BM management server an instruction, prepared in advance, to start switching on the servers of the corresponding Agents; the BM management server starts an AgentJob on each corresponding Agent, which runs according to the instruction until switching is complete;
(2) The initialization routine checks the data state in the disaster recovery management database and reads in the initialization data; as the switching steps progress, the data state in the database is updated in real time, keeping the foreground page consistent with the background database;
(3) The WEB management service software page displays the switching state in real time according to the data in the disaster recovery management database; if a problem occurs during switching and the displayed state is wrong, the status data in the database is modified manually;
(4) During switching, the disaster recovery management Server acts as the bridge between the disaster recovery management database and the Agent clients: it obtains the instruction for the Agent's next operation from the disaster recovery management database and sends it to the Agent, then obtains the execution result and state value of the instruction and writes them to the data tables of the disaster recovery management database;
(5) After the AgentJob on an Agent client starts, it interacts with the disaster recovery management Server, sending it the current state or the result of the previous operation and obtaining the next operation instruction; after execution finishes, when the disaster recovery management Server process stops, the Job stops as well.
3. The application-level disaster recovery automatic switching control design method according to claim 2, characterized in that: in step (1), the WEB management service software page is written in JAVA and controls the defined disaster recovery management Server process through a start/stop button; instructions initiated from the WEB management service software page are distributed as tasks by the BM management server and then delivered to the Agent side, which executes the corresponding switching script; in step (2), the disaster recovery management database stores the state of the switching flow, and the progress state of the flow is updated in real time, keeping the foreground consistent with the background database.
4. The application-level disaster recovery automatic switching control design method according to claim 2, characterized in that: in step (3), each component in the automatic switching control flow, including the production system database, production system middleware, production system WEB, disaster recovery system database, disaster recovery system middleware and disaster recovery system WEB, has its state represented in the database.
5. The application-level disaster recovery automatic switching control design method according to claim 2, characterized in that: in step (4), the disaster recovery management Server process is created to serve as the data bridge between the disaster recovery management database and the Agent client proxies, and promptly transfers Agent state into the disaster recovery management database; in step (5), the AgentJob process on the Agent client and the disaster recovery management Server process both start when a task starts, stop when the task ends, and release their resources.
CN201610114127.5A 2016-03-01 2016-03-01 Application-level disaster recovery automatic switching control design method Active CN105528259B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610114127.5A CN105528259B (en) 2016-03-01 2016-03-01 Application-level disaster recovery automatic switching control design method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610114127.5A CN105528259B (en) 2016-03-01 2016-03-01 Application-level disaster recovery automatic switching control design method

Publications (2)

Publication Number Publication Date
CN105528259A (en) 2016-04-27
CN105528259B (en) 2018-08-21

Family

ID=55770503

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610114127.5A Active CN105528259B (en) 2016-03-01 2016-03-01 Application-level disaster recovery automatic switching control design method

Country Status (1)

Country Link
CN (1) CN105528259B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294028A (en) * 2016-10-12 2017-01-04 北京智网科技股份有限公司 A kind of key emergency set and method based on physical button
WO2021129008A1 (en) * 2019-12-23 2021-07-01 中国银联股份有限公司 Service invocation method, apparatus and device, and medium
CN113722159A (en) * 2021-09-09 2021-11-30 中国电信集团系统集成有限责任公司 Disaster recovery switching system based on ansable

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101316184A (en) * 2007-06-01 2008-12-03 华为技术有限公司 Disaster tolerance switching method, system and device
CN102291262A (en) * 2011-09-01 2011-12-21 中兴通讯股份有限公司 Disaster recovery method, device and system
CN103812675A (en) * 2012-11-08 2014-05-21 中兴通讯股份有限公司 Method and system for realizing allopatric disaster recovery switching of service delivery platform

Also Published As

Publication number Publication date
CN105528259B (en) 2018-08-21

Similar Documents

Publication Publication Date Title
CN107291565B (en) Operation and maintenance visual automatic operation platform and implementation method
CN107203617B (en) The online migratory system of mysql and method based on MHA
CN106301876B (en) Physical machine upgrade method, business migration method and device
CN105260376B (en) Method, apparatus and system for clustered node reducing and expansion
CN101645010B (en) System and method for automatically generating code
CN105700888A (en) Visualization rapid developing platform based on jbpm workflow engine
CN105528259A (en) Application-level disaster recovery automatic switching control design method
CN111966465B (en) Method, system, equipment and medium for modifying host configuration parameters in real time
CN112199157A (en) Cloud environment management method
CN106453541A (en) Data synchronization method, server and data synchronization system
CN115658166A (en) System, method and medium for centralized management and easy-to-use application configuration
CN103778026A (en) Object calling method and device
CN105553746A (en) Automatic configuration migration system and method based on SDN (Software Defined Network)
CN109508223A (en) A kind of virtual machine batch creation method, system and equipment
CN1988477A (en) Network managing system with high usability property
Bwalya et al. An SDN approach to mitigating network management challenges in traditional networks
CN102487332B (en) Fault processing method, apparatus thereof and system thereof
CN105281943A (en) Webpage-based remote equipment management method and device
CN112541746A (en) Full-stack automatic arrangement method and system
CN104598250A (en) System management structure and management implementation method for same
CN107733717A (en) A kind of network collocating method of cloud platform movable type O&M
CN109033483A (en) A kind of method, apparatus and system defining data relationship in YANG model
CN107819598A (en) A kind of method and device for managing network function node
CN105204975A (en) Performance monitoring system and method based on JavaEE system structure
CN109218096A (en) A kind of SCADA real-time database access system based on master-slave redundancy

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: No. 1036, Shandong high tech Zone wave road, Ji'nan, Shandong

Applicant after: Tianyuan Communication Information System Co., Ltd.

Address before: No. 1036, Shandong high tech Zone wave road, Ji'nan, Shandong

Applicant before: Langchao Communication Information System Co., Ltd.

GR01 Patent grant
CP03 Change of name, title or address

Address after: 250100 S06 tower, 1036, Chao Lu Road, hi tech Zone, Ji'nan, Shandong.

Patentee after: INSPUR COMMUNICATION AND INFORMATION SYSTEM Co.,Ltd.

Address before: No. 1036, Shandong high tech Zone wave road, Ji'nan, Shandong

Patentee before: INSPUR TIANYUAN COMMUNICATION INFORMATION SYSTEM Co.,Ltd.
