CN101686261A

CN101686261A - RAC-based redundant server system

Info

Publication number: CN101686261A
Application number: CN200910194974A
Authority: CN
Inventors: 周庭梁; 张立鹏
Original assignee: Casco Signal Ltd
Current assignee: Casco Signal Ltd
Priority date: 2009-09-01
Filing date: 2009-09-01
Publication date: 2010-03-31

Abstract

The invention relates to a RAC-based redundant server system comprising N nodes, wherein each node runs an application server, a database server and a watchdog system; each node is connected to a shared disk array; one end of each node is connected to a common private network and the other end to a common public network; each node has a virtual IP on the public network; the public-network IP address accessed by users is the main virtual IP address; the server holding the main virtual IP is called the host, and the other servers are standbys; and switching between the host and the standbys is performed by the watchdog system. Compared with the prior art, the RAC-based redundant server system offers advantages such as low cost, good scalability, good reliability and short switching time.

Description

A RAC-based redundant server system

Technical field

The present invention relates to redundant server systems, and in particular to a redundant server system based on RAC (Real Application Clusters).

Background technology

As the level of informatization in China rises, the requirements on the online availability of application systems keep increasing. Systems are generally required to provide uninterrupted 7×24 service, and system redundancy is an effective means of achieving this.

A typical application system comprises two parts, an application server and a data server; accordingly, system redundancy should cover both application-server redundancy and data-server redundancy.

For application redundancy, the main implementations are clustering and load balancing (the latter further divided into software load balancing and hardware load balancing); for database redundancy, the main implementation is database clustering.

Before Oracle 10g was released, database redundancy also depended on operating-system clustering; since the release of Oracle 10g, the database itself has included a cluster suite (RAC).

When the application server and the data server share one hardware platform, using an operating-system cluster or a third-party cluster to make the application redundant increases both the software procurement cost and the burden on system maintenance personnel. How to make full use of Oracle RAC to achieve system redundancy has therefore become a problem worth studying.

Summary of the invention

The purpose of the present invention is to overcome the defects of the prior art described above by providing a low-cost, highly scalable RAC-based redundant server system.

The purpose of the present invention can be achieved through the following technical solution:

A RAC-based redundant server system comprising N nodes, wherein each node runs an application server, a database server and a watchdog system; each node is connected to a shared disk array; one end of each node is connected to a common private network and the other end to a common public network; each node has a virtual IP on the public network; the public-network IP address accessed by users is the main virtual IP address; the server holding the main virtual IP is called the host, and the other servers are standbys; and host/standby switching is performed by the watchdog system.

The host/standby switching comprises the following flow:

(1) Because the whole system is a RAC cluster, the main virtual IP is used as the host identifier. After system startup, a node that has obtained the main virtual IP becomes the host; otherwise it is a standby. While the host and the standbys are running, the watchdog system monitors whether the application is normal;

(2) If the application on the host becomes abnormal, the watchdog system releases the host identifier (by stopping the RAC service), making that node a standby; if a standby obtains the host identifier, that standby becomes the host;

(3) If a standby becomes abnormal, the watchdog system tries to restart the application; if the application returns to normal after the restart, the node returns to the standby state; otherwise the node is shut down and thereby excluded from the cluster;

(4) If the host shuts down or restarts, it releases the host identifier automatically: RAC automatically releases all virtual IPs bound to that machine, including the main virtual IP, and at the same time transfers all of the node's virtual IPs to other nodes; a standby then obtains the host identifier and becomes the host;

(5) If a standby shuts down or restarts, RAC on that standby automatically releases all virtual IPs bound to the machine and transfers them to other nodes, without affecting the operation of the host.
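The role changes in steps (1) to (5) can be read as a small state machine. The following is a minimal sketch of that reading, not the patented implementation; the `Role` enum, the event names and the `restart_ok` parameter are hypothetical names introduced for illustration.

```python
from enum import Enum


class Role(Enum):
    HOST = "host"
    STANDBY = "standby"
    DOWN = "down"


def role_at_startup(holds_main_vip: bool) -> Role:
    """Step (1): a node that holds the main virtual IP is the host."""
    return Role.HOST if holds_main_vip else Role.STANDBY


def next_role(role: Role, event: str, restart_ok: bool = True) -> Role:
    """Return the node's role after one event.

    Events (hypothetical names):
      "app_fault"    - the watchdog detected an application anomaly
      "acquired_vip" - the node obtained the main virtual IP
      "shutdown"     - the node shut down or restarted
    """
    if role is Role.DOWN:
        # A node that comes back up re-enters through role_at_startup().
        return Role.DOWN
    if event == "shutdown":
        return Role.DOWN  # steps (4)/(5): RAC relocates this node's VIPs
    if event == "acquired_vip":
        return Role.HOST  # steps (2)/(4): a standby is promoted
    if event == "app_fault":
        if role is Role.HOST:
            return Role.STANDBY  # step (2): host identifier is released
        # step (3): a standby stays in the cluster only if the restart works
        return Role.STANDBY if restart_ok else Role.DOWN
    raise ValueError(f"unknown event: {event}")
```

For example, `next_role(Role.HOST, "app_fault")` yields `Role.STANDBY`, and a standby whose restart fails ends up in `Role.DOWN`.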

The workflow of the watchdog system is as follows:

1) After system startup, query whether the application is normal;

2) If it is normal, sleep for one second and then query again;

3) If the application is abnormal (for example, an application thread has stopped responding), the watchdog system determines whether the current node is the host node, that is, whether the current node holds the main virtual IP;

4) If the current node is the host node, release the host identifier;

5) Restart the application;

6) After the application has been restarted, the watchdog system queries again whether the application is normal; if so, it continues polling the application state;

7) If the application state is still abnormal, shut down the node.
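The watchdog workflow above can be sketched as a polling loop. This is an illustrative sketch only: the five callbacks are hypothetical hooks that a real deployment would have to supply (the source does not specify how the health check, the release of the host identifier or the shutdown are carried out), and `max_polls` exists only so the loop can be exercised in a test.

```python
import time
from typing import Callable, Optional


def run_watchdog(
    app_is_healthy: Callable[[], bool],
    holds_main_vip: Callable[[], bool],
    release_host_identifier: Callable[[], None],
    restart_app: Callable[[], None],
    shut_down: Callable[[], None],
    poll_interval: float = 1.0,
    max_polls: Optional[int] = None,
) -> None:
    """Poll the application and react to anomalies (steps 1 to 7)."""
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        if app_is_healthy():            # step 1: query the application
            time.sleep(poll_interval)   # step 2: sleep, then query again
            continue
        if holds_main_vip():            # step 3: is this node the host?
            release_host_identifier()   # step 4: give up the host role
        restart_app()                   # step 5
        if not app_is_healthy():        # step 6: re-check after restart
            shut_down()                 # step 7: leave the cluster
            return
```

Passing the actions in as callables keeps the loop itself free of any cluster-specific commands, so the same control flow can be tested without a running cluster.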

Compared with prior art, the present invention has the following advantages:

(1) Low cost: the system makes full use of the RAC suite included in the database itself, saving the cost of purchasing operating-system cluster software while keeping the servers online 24 hours a day.

(2) Good scalability: as the business expands, servers can be added to the system freely without changing the existing application configuration.

(3) Good reliability: the reliability of this architecture depends mainly on the reliability of RAC.

(4) Short switching time: the time for a redundant machine to switch over to normal service is less than or equal to 30 s.

Description of drawings

Fig. 1 is a structural schematic diagram of the RAC-based redundant server system of the present invention;

Fig. 2 is a flow chart of host/standby switching in the RAC-based redundant server system of the present invention;

Fig. 3 is a workflow diagram of the watchdog system in the RAC-based redundant server system of the present invention.

Embodiment

The present invention will be further described below in conjunction with specific embodiment.

Embodiment 1

As shown in Fig. 1, Fig. 2 and Fig. 3, a RAC-based redundant server system comprises N nodes 1, wherein each node 1 runs an application server, a database server and a watchdog system; each node is connected to a shared disk array 2; one end of each node is connected to a common private network 3 and the other end to a common public network 4; each node has a virtual IP on the public network; the public-network IP address accessed by users is the main virtual IP address; the server holding the main virtual IP is called the host, and the other servers are standbys; and host/standby switching is performed by the watchdog system.

1) Because the whole system is a RAC cluster, the main virtual IP is used as the host identifier. After system startup, a node that has obtained the main virtual IP becomes the host; otherwise it is a standby. While the host and the standbys are running, the watchdog system monitors whether the application is normal.

2) If the application on the host becomes abnormal, the watchdog system releases the host identifier (by stopping the RAC service), making that node a standby; if a standby obtains the host identifier, that standby becomes the host.

3) If a standby becomes abnormal, the watchdog system tries to restart the application; if the application returns to normal after the restart, the node returns to the standby state; otherwise the node is shut down and thereby excluded from the cluster.

4) If the host shuts down or restarts, it releases the host identifier automatically: RAC automatically releases all virtual IPs bound to that machine, including the main virtual IP, and at the same time transfers all of the node's virtual IPs to other nodes; a standby then obtains the host identifier and becomes the host.

5) If a standby shuts down or restarts, RAC on that standby automatically releases all virtual IPs bound to the machine and transfers them to other nodes, without affecting the operation of the host.
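The effect of steps 4) and 5) on virtual IP ownership can be illustrated with a small simulation. This sketch rests on two assumptions that are not stated in the source: the main virtual IP is modeled as a separate entry labeled `main-vip` alongside one virtual IP per node, and relocated virtual IPs are spread round-robin over the surviving nodes. In a real cluster, placement is decided by RAC itself.

```python
from typing import Dict, List

MAIN_VIP = "main-vip"  # hypothetical label for the main virtual IP


def relocate_vips(
    vip_owner: Dict[str, str], failed_node: str, live_nodes: List[str]
) -> Dict[str, str]:
    """Return a new VIP-to-node map after `failed_node` goes down."""
    survivors = [n for n in live_nodes if n != failed_node]
    if not survivors:
        raise RuntimeError("no surviving node can take over the virtual IPs")
    result = dict(vip_owner)
    moved = 0
    for vip in sorted(vip_owner):
        if vip_owner[vip] == failed_node:
            # Assumed policy: round-robin over the surviving nodes.
            result[vip] = survivors[moved % len(survivors)]
            moved += 1
    return result


def current_host(vip_owner: Dict[str, str]) -> str:
    """The host is whichever node holds the main virtual IP."""
    return vip_owner[MAIN_VIP]
```

With three nodes and `node1` holding the main virtual IP, a shutdown of `node1` moves the main virtual IP to a surviving node, which thereby becomes the host; a shutdown of a standby leaves `current_host` unchanged.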

The workflow of the watchdog system is as follows:

Step 301: after system startup, query whether the application is normal.

Step 302: if it is normal, sleep for one second and then query again.

Step 303: if the application is abnormal (for example, an application thread has stopped responding), the watchdog system determines whether the current node is the host node, that is, whether the current node holds the main virtual IP.

Step 304: if the current node is the host node, release the host identifier.

Step 305: restart the application.

Step 306: after the application has been restarted, the watchdog system queries again whether the application is normal; if so, it continues polling the application state.

Step 307: if the application state is still abnormal, shut down the node.

Embodiment 2

The invention was applied to the transportation production scheduling system of a factory and mining enterprise (hereinafter the "transportation production scheduling system"):

The transportation production scheduling system comprises two servers in a redundant configuration, deployed in the company's central machine room. The two servers share one disk array, and all system data is stored on this disk array, which effectively keeps both data and applications online 24 hours a day.

The system's applications comprise two parts: an ERP system based on a B/S (browser/server) architecture and an interface service program based on TCP/IP. The applications on both servers are kept in hot-standby state and serve users through a single main VIP.

A watchdog service is deployed on the central servers; it performs automatic switchover when the application system fails, effectively keeping the applications online 24 hours a day.

In addition, this method makes scheduled maintenance of a single machine possible: for example, a time can be chosen for regular maintenance of the equipment, which further improves the availability of the system.

Field practice has shown that the RAC-based redundant server architecture can effectively reduce maintenance costs and improve the reliability and availability of the system.

Claims

1. A RAC-based redundant server system, characterized in that the system comprises N nodes, wherein each node runs an application server, a database server and a watchdog system; each node is connected to a shared disk array; one end of each node is connected to a common private network and the other end to a common public network; each node has a virtual IP on the public network; the public-network IP address accessed by users is the main virtual IP address; the server holding the main virtual IP is called the host, and the other servers are standbys; and host/standby switching is performed by the watchdog system.

2. The RAC-based redundant server system according to claim 1, characterized in that the host/standby switching comprises the following flow:

(5) If a standby shuts down or restarts, RAC on that standby automatically releases all virtual IPs bound to the machine and transfers them to other nodes, without affecting the operation of the host.

3. The RAC-based redundant server system according to claim 1, characterized in that the workflow of the watchdog system is as follows:

1) After system startup, query whether the application is normal;

2) If it is normal, sleep for one second and then query again;

4) If the current node is the host node, release the host identifier;

5) Restart the application;

7) If the application state is still abnormal, shut down the node.