CN104965770B - A kind of central server disaster-tolerant backup method - Google Patents
- Publication number
- CN104965770B (application number CN201510330091.XA)
- Authority
- CN
- China
- Prior art keywords
- central server
- equipment
- performance
- monitoring program
- list
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The object of the invention is to protect the continuity of the business system by providing a reliable disaster-recovery method for a central server. The method targets a central server located in a good network environment: devices able to take over the central server's work are added to an alternate list as alternate devices, and each alternate device is connected to the central server. When the central server goes down, the network and hardware performance of each alternate device is calculated, and the best-performing device in the alternate list takes over from the central server. This preserves central-server performance, reduces unplanned downtime, and allows system operation to be restored promptly after the central server goes down.
Description
Technical field
The present invention relates to a disaster-recovery backup method for a central server.
Background art
Disaster recovery is an important measure for reducing the losses caused by disasters and for keeping computer systems running continuously; it is the core of disaster prevention and mitigation. Its purpose is to protect the continuity of the business system after a disaster occurs and to reduce unplanned downtime as far as possible. Different businesses differ in their tolerance for data loss and in the service recovery time they require. A bank information system, for example, requires very little or even zero data loss when a disaster occurs, and requires that operation of the information system be restored promptly afterwards. From a disaster-recovery perspective, therefore, once the central server running the business system goes down, a reasonable mechanism for selecting a replacement server can preserve central-server performance while reducing unplanned downtime, so that system operation is restored promptly. In addition, the hardware and network performance of each server over a given period is calculated from dynamically changing performance indicators, and the optimal device is then selected by a weighting method, so that the system always runs normally under the best currently available configuration.
Summary of the invention
The object of the invention is to protect the continuity of the business system by providing a reliable disaster-recovery method for a central server. The method targets a central server located in a good network environment: devices able to take over the central server's work are added to an alternate list as alternate devices, and each alternate device is connected to the central server. When the central server goes down, the network and hardware performance of each alternate device is calculated, and the best-performing device in the alternate list takes over from the central server. This preserves central-server performance, reduces unplanned downtime, and allows system operation to be restored promptly after the central server goes down.
The invention is realized as follows:
The server that runs the business system and hosts the central database is called the central server. In a hierarchical server network, the central server is the first-level node, and the devices connected directly to it are second-level nodes. Each second-level node has a local database and runs a monitoring program. The monitoring program periodically detects the network condition and hardware information of its own node, and records the information it receives from other devices in the local database. It also sends the detected information about its own node to the central server and to the other devices.
The modules involved in the method are as follows:
Configuration module: second-level nodes able to take over the central server's work are added to the alternate list as alternate devices. Each entry in the alternate list has two fields, a sequence number and a device IP address, and every device in the list has the monitoring program deployed. Once the entries have been added, the list is saved to the central database.
Storage module: each alternate device stores a copy of the alternate list in its own database.
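As an illustration only, the configuration and storage modules could be sketched as follows. The patent does not name a database product or a schema, so the use of SQLite, the table name `alternate_list`, the column names, and the example IP addresses are all assumptions made for this sketch.

```python
import sqlite3


def save_alternate_list(conn, entries):
    """Store (sequence number, device IP) pairs, replacing any previous list."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS alternate_list ("
        "seq INTEGER PRIMARY KEY, device_ip TEXT NOT NULL UNIQUE)"
    )
    conn.execute("DELETE FROM alternate_list")
    conn.executemany(
        "INSERT INTO alternate_list (seq, device_ip) VALUES (?, ?)", entries
    )
    conn.commit()


def load_alternate_list(conn):
    """Return the stored list ordered by sequence number."""
    rows = conn.execute("SELECT seq, device_ip FROM alternate_list ORDER BY seq")
    return [(seq, ip) for seq, ip in rows]


def remove_device(conn, device_ip):
    """Delete one device's entry, as done for the optimal device after selection."""
    conn.execute("DELETE FROM alternate_list WHERE device_ip = ?", (device_ip,))
    conn.commit()


# Configuration module: write the list to the (hypothetical) central database.
entries = [(1, "192.0.2.11"), (2, "192.0.2.12"), (3, "192.0.2.13")]
central_db = sqlite3.connect(":memory:")
save_alternate_list(central_db, entries)

# Storage module: each alternate device keeps its own copy in its local database.
local_db = sqlite3.connect(":memory:")
save_alternate_list(local_db, load_alternate_list(central_db))
```

In-memory databases are used here so the sketch runs without any files; a real deployment would point each connection at the device's actual database.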
Detection module: each alternate device periodically checks whether it can still connect to the central server. If the connection succeeds, it continues the periodic checks; if not, it concludes that the central server has gone down.
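The periodic connectivity check can be sketched as below. The patent does not say how connectivity is tested, so probing with a TCP connection, the port number, the timeout, and the polling interval are assumptions; the `probe` and `sleep` parameters are hypothetical hooks added so the loop can be exercised without a network.

```python
import socket
import time


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_down(host, port, interval=5.0, probe=can_connect, sleep=time.sleep):
    """Poll the central server until one probe fails.

    Returns the number of successful probes seen before the failure,
    at which point the caller treats the central server as down.
    """
    successes = 0
    while probe(host, port):
        successes += 1
        sleep(interval)
    return successes
```

As described in the text, a single failed check is enough to conclude that the central server is down. A deployment that wants to tolerate brief network glitches might instead require several consecutive failures, but that would be a departure from the method as written.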
Performance calculation module: the monitoring program running on each alternate device collects the following data and calculates the performance of the current device:
Weights: network performance has weight X and hardware performance has weight Y;
Latency test: the current device runs a connection test against m second-level nodes, n rounds in total; in the i-th round, the network delays measured to nodes P_1, P_2, ..., P_m are D_{i1}, D_{i2}, ..., D_{im};
Hardware statistics: hardware performance is sampled once every time interval T (a sampling interval, distinct from the average delay T in formula (1)), n times in total, and the average utilization of the CPU, memory and hard disk is calculated for each;
In the calculation of hardware performance, average CPU utilization has weight A, average memory utilization has weight B, and average hard disk utilization has weight C (A+B+C=1);
After the collected performance data have been processed into the required form, performance is calculated according to the following formulas:
(1) Average delay: $T = \frac{1}{n \times m} \sum_{i=1}^{n} \sum_{j=1}^{m} D_{ij}$;
(2) Hardware performance: HP = CPU utilization × A + memory utilization × B + hard disk utilization × C;
(3) Composite performance: TP = (average delay T × network weight X + hardware performance HP × hardware weight Y) × 100%;
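The three formulas can be worked through with a short sketch. It takes the average delay T to be the plain mean of all n × m measurements, which is an assumption based on the surrounding definitions. The delay unit (seconds here), the sample values, and the weights are also made up for illustration; the patent does not say how delay and utilization are brought onto a comparable scale.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def average_delay(delays):
    """delays[i][j] is the delay to node P_j in round i; returns T."""
    m = len(delays[0])
    if any(len(row) != m for row in delays):
        raise ValueError("every round must test the same m nodes")
    return sum(sum(row) for row in delays) / (len(delays) * m)


def hardware_performance(cpu, mem, disk, a, b, c):
    """HP = CPU utilization * A + memory utilization * B + disk utilization * C."""
    if abs(a + b + c - 1.0) > 1e-9:
        raise ValueError("A + B + C must equal 1")
    return cpu * a + mem * b + disk * c


def composite_performance(t, hp, x, y):
    """TP = (T * X + HP * Y) * 100; a smaller value is better."""
    return (t * x + hp * y) * 100


# n = 2 rounds against m = 2 nodes, delays in seconds (assumed unit).
t = average_delay([[0.020, 0.030], [0.040, 0.030]])
# Two utilization samples per resource, expressed as fractions of 1.
hp = hardware_performance(
    cpu=mean([0.5, 0.7]), mem=mean([0.3, 0.5]), disk=mean([0.2, 0.2]),
    a=0.5, b=0.3, c=0.2,
)
tp = composite_performance(t, hp, x=0.4, y=0.6)
```

With these inputs T is 0.03, HP is 0.46, and TP is about 28.8. Because both terms grow as the device gets slower or busier, the device with the smallest TP is the preferred one.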
Optimal selection module: when the calculation is complete, the monitoring program sends its own device IP and composite performance TP to the other alternate devices. Once the monitoring program on each alternate device has confirmed receipt of the information from all the other alternate devices, it selects the device with the smallest composite performance index TP as the optimal device. If another device has the same performance index as this device, the one with the smaller sequence number in the alternate list is the optimal device. The optimal device's entry is then deleted from the alternate list in the local database.
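The selection rule is simple enough to state as code. In the sketch below the `Report` type and its field names are hypothetical; the rule itself (smallest TP wins, ties go to the smaller sequence number) follows the text.

```python
from typing import NamedTuple


class Report(NamedTuple):
    seq: int         # sequence number in the alternate list
    device_ip: str   # device IP from the alternate list
    tp: float        # composite performance reported by that device


def select_optimal(reports):
    """Pick the report with the smallest TP; break ties by sequence number."""
    if not reports:
        raise ValueError("no alternate devices reported")
    return min(reports, key=lambda r: (r.tp, r.seq))


reports = [
    Report(1, "192.0.2.11", 30.0),
    Report(2, "192.0.2.12", 28.8),
    Report(3, "192.0.2.13", 28.8),
]
optimal = select_optimal(reports)
```

Because every alternate device applies the same deterministic rule to the same set of reports, each one reaches the same answer without a coordinator. This relies on all devices actually holding identical reports, which is why the text has each device wait until it has received information from all the others.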
Notification module: if a device is the optimal device, the monitoring program running on it starts the business system already deployed there and connects to the central database, and the device becomes the central server. It also deletes the optimal device's entry from the alternate list in the central database and notifies all second-level nodes, that is, the devices that access the central server directly, that the central server has changed.
The specific method steps are:
(1) Second-level nodes able to take over the central server's work are added to the alternate list as alternate devices. Each entry in the alternate list has two fields, a sequence number and a device IP address, and every device in the list has the monitoring program deployed. Once the entries have been added, the list is saved to the central database, and each alternate device stores a copy of the alternate list in its own database;
(2) Each alternate device periodically checks whether it can connect to the central server: if it can, repeat step (2); if it cannot, conclude that the central server has gone down and perform step (3);
(3) Calculate the average delay: $T = \frac{1}{n \times m} \sum_{i=1}^{n} \sum_{j=1}^{m} D_{ij}$;
(4) Calculate the hardware performance: HP = CPU utilization × A + memory utilization × B + hard disk utilization × C;
(5) Calculate the composite performance: TP = (average delay T × network weight X + hardware performance HP × hardware weight Y) × 100%;
(6) The monitoring program sends this device's IP and composite performance TP to the other alternate devices;
(7) Once the monitoring program on each alternate device has confirmed receipt of the information from all the other alternate devices, it selects the device with the smallest composite performance index TP as the optimal device; if another device has the same performance index as this device, the one with the smaller sequence number in the alternate list is the optimal device; the optimal device's entry is then deleted from the alternate list;
(8) If this device is the optimal device, its monitoring program starts the deployed business system and connects to the central database, and the device becomes the central server; perform step (9). Otherwise, perform step (10);
(9) Delete the optimal device's entry from the alternate list in the central database and notify all second-level nodes, that is, the devices that access the central server directly, of the change of central server; perform step (11);
(10) Wait for the central-server change message sent by the optimal device, and reconnect to the central server once it is received;
(11) Replacement of the central server is complete.
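Steps (7) to (10) can be traced for every device with a small simulation. This is only a sketch: the function name, the role labels `"central"` and `"reconnect"`, and the example values are hypothetical, and message exchange is replaced by a shared dictionary of already-received TP values.

```python
def steps_7_to_10(my_ip, alternate_list, tp_by_ip):
    """Return (role, central_ip, updated_list) as seen by device `my_ip`.

    alternate_list: list of (sequence number, device IP) pairs.
    tp_by_ip: composite performance TP of every alternate device,
              i.e. what this device holds once step (6) has finished.
    """
    # Step (7): smallest TP wins; the smaller sequence number breaks ties.
    _, central_ip = min(alternate_list, key=lambda e: (tp_by_ip[e[1]], e[0]))
    updated_list = [e for e in alternate_list if e[1] != central_ip]
    # Step (8): the optimal device takes over.
    # Step (10): every other device waits for the notice and reconnects.
    role = "central" if my_ip == central_ip else "reconnect"
    return role, central_ip, updated_list


alternates = [(1, "192.0.2.11"), (2, "192.0.2.12"), (3, "192.0.2.13")]
tp_values = {"192.0.2.11": 31.0, "192.0.2.12": 28.8, "192.0.2.13": 28.8}
views = {ip: steps_7_to_10(ip, alternates, tp_values) for _, ip in alternates}
```

In this run exactly one device ends up with the role `"central"`, and all three agree on which one it is and on the shortened alternate list.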
The key to the invention is how to select the optimal device. This requires calculating each device's network performance: the network delay to multiple nodes is measured over multiple rounds and averaged to obtain the final average delay. The hardware performance is then calculated by assigning weights to CPU, memory and hard disk utilization. Next, the weights of the two are assigned according to the operating environment the system requires, and the composite performance TP is calculated. The device with the smallest TP becomes the central server; if TP values are equal, the device with the smaller sequence number in the alternate list becomes the central server. After the central server goes down unexpectedly, the method reduces unplanned downtime, restores system operation promptly, preserves the performance of the replacement central server, improves business continuity, reduces the losses caused by server outages, and saves labor and material resources.
The novel aspects of the method are:
1. From a disaster-recovery perspective, an alternate device list is configured for the central server. When the central server goes down, the network and hardware performance of the alternate devices is calculated and the best-performing device in the alternate list takes over from the central server. This preserves central-server performance, reduces unplanned downtime, allows business operation to be restored promptly after an unexpected outage, and so ensures business continuity.
2. The optimal device is selected dynamically according to network and hardware performance. After both are calculated from dynamically changing network and hardware information, the weights of network performance and hardware performance can be assigned according to the server performance the system needs for normal operation, and different weights can likewise be assigned to the individual hardware indicators within hardware performance. The optimal device is selected accordingly, so that the system always runs normally under the best available configuration.
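A small worked example shows how the choice of weights changes which device is selected. The two devices, their measurements, and the weight pairs below are invented for illustration. The sketch also assumes that the average delay has been scaled to the same 0 to 1 range as the utilization figures and that X + Y = 1, neither of which the text states.

```python
def composite(avg_delay, hp, x, y):
    """TP = (T * X + HP * Y) * 100, as in the composite performance formula."""
    return (avg_delay * x + hp * y) * 100


# Hypothetical measurements, both on a 0..1 scale (an assumption).
devices = {
    "192.0.2.11": {"avg_delay": 0.10, "hp": 0.80},  # fast network, busy hardware
    "192.0.2.12": {"avg_delay": 0.60, "hp": 0.30},  # slow network, idle hardware
}


def best(x, y):
    """Return the IP of the device with the smallest TP for weights X and Y."""
    return min(
        devices,
        key=lambda ip: composite(devices[ip]["avg_delay"], devices[ip]["hp"], x, y),
    )


network_heavy = best(x=0.8, y=0.2)   # TP: 24 vs 54
hardware_heavy = best(x=0.2, y=0.8)  # TP: 66 vs 36
```

With the network-heavy weights the low-latency device wins, and with the hardware-heavy weights the lightly loaded device wins, which is the behaviour the second point describes.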
Embodiment
The present invention is described in more detail below with reference to the accompanying drawings:
A specific implementation proceeds as follows:
With the network in good condition, the server on which the business system is deployed and started is called the central server; it connects to the central database and is the first-level node, and the devices connected directly to it are second-level nodes. First, once the configuration module has successfully configured the alternate list, the storage module stores the list in each device's local database. The detection module then checks whether the device's local database contains an alternate list and, if it does, starts periodically checking the running state of the central server. If it detects that the central server has gone down, the performance calculation module calculates this device's performance using the weighting method. After the calculation, the optimal selection module selects the optimal device to serve as the central server and updates the alternate list in the local database. The notification module then starts the business system deployed on that device and connects the device to the central database, so that it becomes the central server; it also deletes the optimal device's entry from the alternate list in the central database and notifies all second-level nodes, that is, the devices that access the central server directly, of the change of central server.
Claims (1)
- 1. A disaster-recovery backup method for a central server, characterized in that the invention is realized as follows:
The server that runs the business system and hosts the central database is called the central server. In a hierarchical server network, the central server is the first-level node, and the devices connected directly to it are second-level nodes. Each second-level node has a local database and runs a monitoring program; the monitoring program periodically detects the network condition and hardware information of its own node, and records the information it receives from other devices in the local database; it also sends the detected information about its own node to the central server and to the other devices.
The modules involved in the method are as follows:
Configuration module: second-level nodes able to take over the central server's work are added to the alternate list as alternate devices; each entry in the alternate list has two fields, a sequence number and a device IP address; every device in the list has the monitoring program deployed; once the entries have been added, the list is saved to the central database;
Storage module: each alternate device stores a copy of the alternate list in its own database;
Detection module: each alternate device periodically checks whether it can connect to the central server: if it can, it continues the periodic checks; if it cannot, it concludes that the central server has gone down;
Performance calculation module: the monitoring program running on each alternate device collects the following data and calculates the performance of the current device: network performance has weight X and hardware performance has weight Y; latency test: the current device runs a connection test against m second-level nodes, n rounds in total, and in the i-th round the network delays measured to nodes P_1, P_2, ..., P_m are D_{i1}, D_{i2}, ..., D_{im}; hardware performance is sampled once every time interval T, n times in total, and the average utilization of the CPU, memory and hard disk is calculated for each; in the calculation of hardware performance, average CPU utilization has weight A, average memory utilization has weight B, and average hard disk utilization has weight C, where A+B+C=1; after the collected performance data have been processed into the required form, performance is calculated according to the following formulas:
(1) Average delay: $T = \frac{1}{n \times m} \sum_{i=1}^{n} \sum_{j=1}^{m} D_{ij}$;
(2) Hardware performance: HP = CPU utilization × A + memory utilization × B + hard disk utilization × C;
(3) Composite performance: TP = (average delay T × network weight X + hardware performance HP × hardware weight Y) × 100%;
Optimal selection module: when the calculation is complete, the monitoring program sends this device's IP and composite performance TP to the other alternate devices; once the monitoring program on each alternate device has confirmed receipt of the information from all the other alternate devices, it selects the device with the smallest composite performance index TP as the optimal device; if another device has the same performance index as this device, the one with the smaller sequence number in the alternate list is the optimal device; the optimal device's entry is deleted from the alternate list in the local database;
Notification module: if a device is the optimal device, the monitoring program running on it starts the deployed business system and connects to the central database, and the device becomes the central server; it also deletes the optimal device's entry from the alternate list in the central database and notifies all second-level nodes, that is, the devices that access the central server directly, of the change of central server.
The specific method steps are:
(1) Second-level nodes able to take over the central server's work are added to the alternate list as alternate devices; each entry in the alternate list has two fields, a sequence number and a device IP address; every device in the list has the monitoring program deployed; once the entries have been added, the list is saved to the central database, and each alternate device stores a copy of the alternate list in its own database;
(2) Each alternate device periodically checks whether it can connect to the central server: if it can, repeat step (2); if it cannot, conclude that the central server has gone down and perform step (3);
(3) Calculate the average delay: $T = \frac{1}{n \times m} \sum_{i=1}^{n} \sum_{j=1}^{m} D_{ij}$;
(4) Calculate the hardware performance: HP = CPU utilization × A + memory utilization × B + hard disk utilization × C;
(5) Calculate the composite performance: TP = (average delay T × network weight X + hardware performance HP × hardware weight Y) × 100%;
(6) The monitoring program sends this device's IP and composite performance TP to the other alternate devices;
(7) Once the monitoring program on each alternate device has confirmed receipt of the information from all the other alternate devices, it selects the device with the smallest composite performance index TP as the optimal device; if another device has the same performance index as this device, the one with the smaller sequence number in the alternate list is the optimal device; the optimal device's entry is deleted from the alternate list;
(8) If this device is the optimal device, its monitoring program starts the deployed business system and connects to the central database, and the device becomes the central server; perform step (9); otherwise, perform step (10);
(9) Delete the optimal device's entry from the alternate list in the central database and notify all second-level nodes, that is, the devices that access the central server directly, of the change of central server; perform step (11);
(10) Wait for the central-server change message sent by the optimal device, and reconnect to the central server once it is received;
(11) Replacement of the central server is complete.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510330091.XA CN104965770B (en) | 2015-06-15 | 2015-06-15 | A kind of central server disaster-tolerant backup method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510330091.XA CN104965770B (en) | 2015-06-15 | 2015-06-15 | A kind of central server disaster-tolerant backup method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104965770A (en) | 2015-10-07 |
CN104965770B (en) | 2018-02-02 |
Family
ID=54219805
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510330091.XA Active CN104965770B (en) | 2015-06-15 | 2015-06-15 | A kind of central server disaster-tolerant backup method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104965770B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105763386A (en) * | 2016-05-13 | 2016-07-13 | 中国工商银行股份有限公司 | Service processing system and method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101309167A (en) * | 2008-06-27 | 2008-11-19 | 华中科技大学 | Disaster allowable system and method based on cluster backup |
CN201571075U (en) * | 2009-07-22 | 2010-09-01 | 马涛 | Intelligent disaster recovery system |
CN102117231A (en) * | 2009-12-30 | 2011-07-06 | 上海文广互动电视有限公司 | Distributed data backup and disaster tolerance system and method |
CN103853634A (en) * | 2014-02-26 | 2014-06-11 | 北京优炫软件股份有限公司 | Disaster recovery system and disaster recovery method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102012789B (en) * | 2009-09-07 | 2014-03-12 | 云端容灾有限公司 | Centralized management type backup and disaster recovery system |
Non-Patent Citations (2)
Title |
---|
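Yao Wenbin (姚文斌) et al., "Key Issues of System-Level Management Technology in Cloud Disaster Recovery" (云灾备中系统级管理技术的关键问题), ZTE Technology Journal (中兴通讯技术), 2012-12-31, vol. 18, no. 6, pp. 22-25 * |
Xin Yang (辛阳), "Discussion of Business Continuity Assurance Technology for Data Centers" (数据中心业务连续性保障技术的探讨), 2014-10-31, pp. 22-23 * |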
Also Published As
Publication number | Publication date |
---|---|
CN104965770A (en) | 2015-10-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106470133B (en) | System pressure testing method and device | |
CN102694868B (en) | A kind of group system realizes and task dynamic allocation method | |
CN108809757B (en) | System alarm method, storage medium and server | |
RU2017111477A (en) | Methods and systems for determining non-standard user activity | |
CN106878473A (en) | A kind of message treatment method, server cluster and system | |
CN103067297B (en) | A kind of dynamic load balancing method based on resource consumption prediction and device | |
US10740198B2 (en) | Parallel partial repair of storage | |
CN106407052B (en) | A kind of method and device detecting disk | |
CN101707632A (en) | Method for dynamically monitoring performance of server cluster and alarming real-timely | |
CN104778111A (en) | Alarm method and alarm device | |
CN107769943A (en) | A kind of method and apparatus of active and standby cluster switching | |
Zhou et al. | FTCloudSim: a simulation tool for cloud service reliability enhancement mechanisms | |
CN106656682A (en) | Method, system and device for detecting cluster heartbeat | |
WO2018125628A1 (en) | A network monitor and method for event based prediction of radio network outages and their root cause | |
CN106686099A (en) | Method of realizing active-active mode across machine rooms of OracleRAC database based on infiniband network | |
CN103634167B (en) | Security configuration check method and system for target hosts in cloud environment | |
US9639445B2 (en) | System and method for comprehensive performance and availability tracking using passive monitoring and intelligent synthetic activity generation for monitoring a system | |
CN104965770B (en) | A kind of central server disaster-tolerant backup method | |
CN103428249A (en) | Collecting method and processing method for HTTP request packet, system and server | |
CN107818106B (en) | Big data offline calculation data quality verification method and device | |
CN106909436A (en) | Produce the method and system of the dependency relation of virtual machine message queue application program | |
CN106993027B (en) | Remote data storage location verification method | |
Schörgenhumer et al. | Can We Predict Performance Events with Time Series Data from Monitoring Multiple Systems? | |
Yu et al. | Design and architecture of dell acceleration appliances for database (DAAD): A practical approach with high availability guaranteed | |
Liao et al. | Partial replication of metadata to achieve high metadata availability in parallel file systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |