WO2009081657A1 - Node system, server switching method, server apparatus, and data takeover method - Google Patents
- Publication number
- WO2009081657A1 (PCT/JP2008/069589)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- server
- active
- data
- stage
- active server
- Prior art date
Classifications
- G06F11/20 — Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2023 — Failover techniques
- G06F11/2028 — Failover techniques eliminating a faulty processor or activating a spare
- G06F11/2038 — Active fault-masking where processing functionality is redundant, with a single idle spare processing component
- G06F11/2041 — Active fault-masking where processing functionality is redundant, with more than one idle spare processing component
- G06F11/2097 — Active fault-masking maintaining the standby controller/processing unit updated
Definitions
- the present invention relates to a technology for realizing redundancy with a plurality of servers.
- a redundant configuration is realized by combining a plurality of servers in order to improve reliability (see Japanese Patent Laid-Open No. 2001-43105).
- a node of a communication system is configured by a plurality of servers.
- typical redundant configurations include, for example, duplexing, N-plexing, and N+1 redundancy.
- a method called hot standby allows the standby server to take over the service when a failure occurs in the active server, by constantly synchronizing data between the active server and the standby server.
- the standby server 902 corresponds to the active server 901 on a one-to-one basis. Then, during normal operation in which no failure occurs in the active server, the active server 901 and the standby server 902 synchronize data necessary for service continuation. Arrows in the figure indicate data synchronization. In FIG. 1, data 903 of the active server 901 and data 904 of the standby server 902 are synchronized. As a result, the standby server 902 is able to continue the service of the active server 901. Therefore, even if a failure occurs in the active server 901, the standby server 902 can continue the service.
- all servers are active servers.
- data necessary for continuing services of each active server is distributed to other active servers to synchronize data among a plurality of active servers. Thereby, when any active server fails, service can be continued by the other active server.
- one standby server 923 is allocated to the plurality of active servers 921 and 922. If any active server fails, the standby server 923 starts service on behalf of that active server.
- one hot standby server 934 is allocated to the plurality of active servers 931 to 933.
- the configuration of FIG. 4 is the same as the configuration of FIG.
- data is synchronized between the active server and the standby server during normal operation. Thereby, the service can be continued by the standby server 934 when any of the active servers 931 to 933 fails.
- the cost is reduced as compared with the configuration of FIG. 1, and the process of dividing the communication path as shown in FIG. 2 is unnecessary. Further, in the configuration of FIG. 3, the process of synchronizing data is also unnecessary. However, since data is not synchronized between the active server and the standby server, when the standby server starts operating on behalf of the failed active server, the service provided by that active server cannot be continued.
- An object of the present invention is to provide a technology that enables service continuation at low cost in a redundant configuration of a plurality of servers, without requiring complicated processing, such as division of a communication path, at switching.
- a node system comprising a plurality of active servers cascade-connected such that data synchronized with the data of the preceding server is stored in the succeeding server;
- and a standby server storing data synchronized with the data of the last stage of the cascade connection of the plurality of active servers;
- data synchronized with the data of the preceding active server is stored in the succeeding active server so that the plurality of active servers are cascade-connected,
- and data synchronized with the data of the last stage of the cascade connection is stored in the standby server;
- when a failure occurs in any active server, each server from the stage following the failed active server to the standby server takes over, using the data synchronized with its respective preceding server, the service that the preceding server had been performing.
- a server apparatus comprising storage means for storing data synchronized with the data of the preceding active server apparatus, in a node system in which a plurality of active server apparatuses are cascade-connected such that data synchronized with the data of the preceding active server apparatus is stored in the succeeding active server apparatus, and data synchronized with the data of the last-stage active server apparatus is stored in a standby server apparatus;
- and processing means for, after causing the succeeding server apparatus to take over the service that the own server apparatus had been performing,
- taking over and performing the service that the preceding active server apparatus had been performing, using the data synchronized with the data of the preceding active server apparatus stored in the storage means.
- a procedure for storing data synchronized with the data of the preceding active server device, in a node system in which a plurality of active server devices are cascade-connected such that data synchronized with the data of the preceding active server device is stored in the succeeding active server device, and data synchronized with the data of the last stage is stored in a standby server device;
- a procedure for causing the succeeding server device to take over the service that the own server device had been performing;
- and a procedure for taking over the service that the preceding active server device had been performing, using the data synchronized with the data of the preceding active server device stored in the storage means; the invention is a program for causing a computer to execute these procedures.
- FIG. 5 is a block diagram showing the configuration of a node according to the first embodiment.
- the node of this embodiment has active servers 11-1 and 11-2 and a standby server 12.
- the active servers 11-1 and 11-2 and the standby server 12 are connected to the communication path 13.
- the active servers 11-1 and 11-2 provide services using their own data D1-1 and D1-2, and synchronize their own data with other servers. Thereby, the state in which the services of the active servers 11-1 and 11-2 can be continued by other servers is maintained.
- Another server is either another active server or a standby server.
- the relationship among the active servers 11-1 and 11-2 is a cascade connection.
- the last active server 11-2 in the cascade connection further synchronizes its own data D1-2, as data D1-2', with the standby server 12 cascade-connected at the next stage.
- the server following a failed active server continues the service using the data synchronized with that failed active server.
- the service being performed by an active server that provides its service using data synchronized with the preceding active server is continued by the server at the next stage.
- the standby server 12, when the preceding active server 11-2 fails, or when the active server 11-2 starts the service on behalf of the still earlier active server 11-1, continues the service using the data D1-2' synchronized with the active server 11-2.
- when the active server 11-1 fails, the active server 11-2 continues the service using the data D1-1' synchronized with the active server 11-1.
- the service that the active server 11-2 had been performing is continued by the standby server 12.
- a plurality of active servers 11-1 and 11-2 and one standby server 12 are cascade-connected such that the data of each preceding active server is synchronized with the next active server and the data of the last active server is synchronized with the standby server.
- when a failure occurs in any of the active servers, each subsequent server continues the service of its preceding server using the data synchronized from that preceding server.
- both the active servers and the standby server are utilized for data synchronization.
- the resources required for the standby server do not depend on the number of active servers, so the cost is lower than assigning standby servers to active servers one-to-one, and no division of the communication path is required.
- the servers can therefore be switched to continue the service without requiring complicated processing.
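The cascade arrangement summarized above can be sketched in code. This is a hypothetical in-memory model, not code from the patent: the `Server` and `NodeSystem` names, and the representation of synchronization as copied fields, are illustrative assumptions.

```python
# Hypothetical model of the cascade: active servers in order, with the
# standby server as the last stage; each server keeps a copy of its
# predecessor's service data ("synchronized data").

class Server:
    def __init__(self, name, service=None):
        self.name = name
        self.service = service        # service this server currently provides
        self.synced_from_prev = None  # copy of the preceding server's data

class NodeSystem:
    def __init__(self, active_names, standby_name):
        # Active servers cascaded in order; the standby provides no service.
        self.chain = [Server(n, service=f"svc-{n}") for n in active_names]
        self.chain.append(Server(standby_name))

    def synchronize(self):
        # Each server stores a copy of its predecessor's service data.
        for prev, nxt in zip(self.chain, self.chain[1:]):
            nxt.synced_from_prev = prev.service

    def fail(self, name):
        # Every server after the failed one takes over its predecessor's
        # service, using the data it already holds from synchronization.
        idx = [s.name for s in self.chain].index(name)
        for server in self.chain[idx + 1:]:
            server.service = server.synced_from_prev
        del self.chain[idx]
        self.synchronize()  # re-establish the cascade after switching

node = NodeSystem(["11-1", "11-2"], "12")
node.synchronize()
node.fail("11-1")
print([(s.name, s.service) for s in node.chain])
# → [('11-2', 'svc-11-1'), ('12', 'svc-11-2')]
```

After the failure of 11-1, server 11-2 runs 11-1's former service and the standby 12 runs 11-2's former service, which matches the shifting behavior described above.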
- the active server 11 includes a processor 14, a storage device 15, and a communication interface 16.
- the processor 14 operates by executing a software program, and provides a service using data stored in the storage device 15.
- the processor 14 synchronizes its own data with a subsequent server while providing a service using its own data. Further, if there is the active server 11 at the front stage of the own server, the processor 14 stores data synchronized with the active server 11 at the front stage in the storage device 15.
- when the preceding active server 11 fails, or when a request is received from it, the processor 14 continues the service using the data D1' synchronized with the preceding active server 11.
- the storage device 15 holds the data necessary for the service of its own server. In addition, if there is an active server 11 at the preceding stage, the storage device 15 also holds the data D1' synchronized from that preceding server.
- the communication interface 16 is connected to the communication path 13 and communicates between servers. Between servers, synchronization data is transferred between active servers or between active servers and standby servers.
- the standby server 12 includes a processor 17, a storage device 18, and a communication interface 19.
- the processor 17 operates by executing a software program; when the preceding active server 11-2 fails, or when the active server 11-2 starts the service on behalf of the still earlier active server 11-1, the processor 17 continues the service using the data D1-2' stored in the storage device 18 and synchronized with the active server 11-2.
- the storage device 18 holds the data D1-2' synchronized with the preceding active server 11-2.
- the communication interface 19 is connected to the communication path 13 and communicates with the preceding active server 11-2. In this communication, synchronization data is transferred between the active server 11-2 and the standby server 12.
- FIG. 6 is a flowchart showing an operation of a server in which a failure has occurred in a previous active server in the server of the first embodiment.
- the operations of the active server 11 and the standby server 12 are made common.
- the server detects a failure of the upstream active server 11 and starts a server system switching sequence (step 101).
- the server system switching sequence is a series of processing sequences for switching services among a plurality of servers constituting redundancy.
- the server determines whether there is an active server 11 or a standby server downstream of itself (step 102). This is processing to determine whether the server itself is the active server 11 or the standby server 12. When the operations of the active server 11 and the standby server 12 are not shared, this process is unnecessary. Having a server at the latter stage means that it is an active server, and having no server at the latter stage means that it is a standby server.
- the server transmits a server system switching request to the server at the latter stage (step 103).
- the server system switching request is a message for requesting the start of the server system switching sequence.
- the server stops its own operation (step 105).
- the server system switching completion is a message for notifying that the server system switching sequence is completed.
- the server takes over the service provided by the previous server by using the data synchronized with the previous server (step 106).
- if it is determined in step 102 that there is no subsequent server, the server proceeds to step 106 and takes over the service that the preceding server had been performing.
- FIG. 7 is a flowchart showing the operation of the server that has received the server system switching request from the active server in the previous stage in the server of the first embodiment.
- the operations of the active server 11 and the standby server 12 are made common.
- the server receives a server system switching request from the server at the previous stage, and starts a server system switching sequence (step 201).
- the server system switching sequence shown in steps 202 to 206 is the same as steps 102 to 106 shown in FIG.
- the server transmits server system switching completion to the server of the previous stage, and ends the processing.
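The switching sequence of FIGS. 6 and 7 can be sketched as a recursive handoff: each server first asks its downstream neighbor to switch, waits for completion, stops its own service, and then takes over the upstream service. This is an illustrative in-process simulation, assumed names and dict fields included; real servers would exchange these messages over the communication path.

```python
# Illustrative sketch of the common switching sequence of FIGS. 6 and 7.
# The recursive call models "send switching request, wait for completion".

def handle_switchover(chain, idx):
    """Run the server system switching sequence on chain[idx].

    chain is an ordered list of dicts; the last element is the standby.
    Mirrors steps 102-106: if a subsequent server exists, ask it to
    switch first (step 103) and wait for its completion (modeled by the
    recursive return), stop the own service (step 105), then take over
    the preceding service using synchronized data (step 106).
    """
    me = chain[idx]
    if idx + 1 < len(chain):               # step 102: subsequent server?
        handle_switchover(chain, idx + 1)  # step 103: switching request
        me["service"] = None               # step 105: stop own service
    me["service"] = me["synced"]           # step 106: take over upstream

def on_failure_detected(chain):
    # Step 101: the server just after the failed one starts the sequence.
    handle_switchover(chain, 0)

chain = [
    {"name": "11-2", "service": "svc-2", "synced": "svc-1"},  # after failed 11-1
    {"name": "12",   "service": None,    "synced": "svc-2"},  # standby
]
on_failure_detected(chain)
print([(s["name"], s["service"]) for s in chain])
# → [('11-2', 'svc-1'), ('12', 'svc-2')]
```

The standby takes over first and only then does the upstream server stop and shift, so at no point is a service abandoned before its successor is ready.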
- consider the operation of the node when the active server 11-1 fails, starting from the state of normal operation in which the active servers 11-1 and 11-2 provide services. If the active server 11-1 fails, the node switches servers and resumes operation with the active server 11-2 and the standby server 12 providing the services.
- when the active server 11-1 fails, the active server 11-2 detects the failure and starts the server system switching sequence. The active server 11-2 confirms that it is not a standby server, and requests server system switching of the subsequent server (the standby server 12) so that its own service is continued.
- the standby server 12 receives the server system switching request and, using the data D1-2' synchronized with the preceding active server 11-2, starts the service that the active server 11-2 had been performing. Then, the standby server 12 notifies the preceding active server 11-2 of server system switching completion.
- upon receiving the server system switching completion from the standby server 12, the active server 11-2 stops the service that it had been performing. Next, the active server 11-2 starts the service that the active server 11-1 had been performing, using the data D1-1' synchronized with the preceding active server 11-1.
- the data amount of the server system switching request and the server system switching completion transmitted and received between the servers is sufficiently smaller than the amount of synchronization data transferred to synchronize the data used for the service. Therefore, the time taken for communication between servers is short, and server system switching completes immediately. Thus, even when the active server 11-1 fails, the service can be continued in the node as a whole.
- in the above description, the subsequent server detects the failure of the preceding server.
- however, the present invention is not limited to this, and the failure monitoring may be performed with any configuration or method.
- one standby server is assigned to one cascaded active server.
- the present invention is not limited thereto.
- the second embodiment exemplifies a configuration in which one standby server is assigned to two cascaded active servers.
- since each active server also functions as a backup of one other active server, it has a storage capacity sufficient to store the data of two active servers, including its own data. If servers of the same performance are used for both the active and standby roles, the standby server likewise has a storage capacity sufficient to store the data of two active servers.
- a plurality of active servers are divided into two systems, data are synchronized by cascade connection for each system, and one standby server is shared in the last stage of the two systems. This makes it possible to limit server system switching to only the system to which the active server belongs when a failure occurs in any of the active servers.
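The two-system layout can be sketched as follows. This is an illustrative assumption of how the grouping might be represented (the `SYSTEMS` table and function name are not from the patent); it shows that a failure involves only the failed server's own system plus the shared standby.

```python
# Hypothetical sketch of the second embodiment: two cascaded systems of
# active servers share the single standby server, and switching after a
# failure is confined to the system the failed server belongs to.

SYSTEMS = {
    "A": ["21-1", "21-2"],  # cascade: 21-1 -> 21-2 -> standby 22
    "B": ["21-4", "21-3"],  # cascade: 21-4 -> 21-3 -> standby 22
}
STANDBY = "22"

def switchover_path(failed):
    # Servers that take part in switching: the failed server's downstream
    # neighbors within its own system, plus the shared standby server.
    for chain in SYSTEMS.values():
        if failed in chain:
            return chain[chain.index(failed) + 1:] + [STANDBY]
    raise ValueError(f"unknown server {failed}")

print(switchover_path("21-1"))  # → ['21-2', '22'] — system B is untouched
```

A failure of 21-4 likewise involves only 21-3 and the standby 22, so each failure disturbs at most half of the active servers.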
- FIG. 8 is a block diagram showing the configuration of a node according to the second embodiment.
- the node of this embodiment has active servers 21-1 to 21-4 and a standby server 22.
- the active servers 21-1 to 21-4 and the standby server 22 are connected to the communication path 23.
- the active servers 21-1 to 21-4 are divided, for data synchronization by cascade connection, into two systems: the system of the active servers 21-1 and 21-2 and the system of the active servers 21-4 and 21-3.
- during normal operation, the active servers 21-1 to 21-4 provide services using their own data D2-1 to D2-4, and synchronize their own data D2-1 to D2-4 with the server at the subsequent stage of the cascade connection.
- the server following a failed active server continues the service using the data synchronized with that failed active server.
- the service being performed by an active server that provides its service using data synchronized with the preceding active server is continued by the server at the next stage.
- the standby server 22, when the preceding active server 21 fails or hands over its service,
- continues the service using the data D2' synchronized with that active server 21.
- server system switching is confined to the system to which the failed active server 21 belongs.
- when the active server 21-1 fails, the active server 21-2 continues the service using the data D2-1' synchronized with the active server 21-1.
- the service that the active server 21-2 had been performing is continued by the standby server 22.
- when the active server 21-4 fails, the active server 21-3 continues the service using the data D2-4' synchronized with the active server 21-4.
- the service that the active server 21-3 had been performing is continued by the standby server 22.
- the active server 21 includes a processor 24, a storage device 25, and a communication interface 26.
- the configuration and operation of the processor 24, the storage device 25 and the communication interface 26 are similar to the processor 14, the storage device 15 and the communication interface 16 of the active server 11 according to the first embodiment shown in FIG.
- the standby server 22 includes a processor 27, a storage device 28, and a communication interface 29.
- the standby server 22 differs from the standby server 12 of the first embodiment shown in FIG. 5 in that it is shared by two systems. However, for each system, the standby server 22 operates in the same manner as the standby server 12 of the first embodiment.
- the operations for each system of the processor 27, the storage device 28, and the communication interface 29 are also similar to the processor 17, the storage device 18, and the communication interface 19 according to the first embodiment.
- the overall operation of the node when the active server 21-1 fails is as follows.
- consider the operation of the node when the active server 21-1 fails, starting from the state of normal operation in which the active servers 21-1 to 21-4 provide services. If the active server 21-1 fails, the node switches servers and resumes operation with the active server 21-2 and the standby server 22 providing the services.
- when the active server 21-1 fails, the active server 21-2 detects the failure and starts the server system switching sequence. The active server 21-2 confirms that it is not a standby server, and requests server system switching of the subsequent server (the standby server 22) so that its own service is continued.
- the standby server 22 receives the server system switching request and, using the data D2-2' synchronized with the preceding active server 21-2, starts the service that the active server 21-2 had been performing. Then, the standby server 22 notifies the active server 21-2 of server system switching completion.
- upon receiving the server system switching completion from the standby server 22, the active server 21-2 stops the service that it had been performing. Next, the active server 21-2 starts the service that the active server 21-1 had been performing, using the data D2-1' synchronized with the preceding active server 21-1.
- the data amount of the server system switching request and the server system switching completion transmitted and received between the servers is sufficiently smaller than the amount of synchronization data transferred to synchronize the data used for the service. Therefore, the time taken for communication between servers is short, and server system switching completes immediately. Thus, even when the active server 21-1 fails, the service can be continued in the node as a whole.
- in the first and second embodiments, each active server belongs to only one system, and data synchronization among the plurality of active servers is in one direction only.
- the present invention is not limited to this.
- the third embodiment exemplifies a configuration in which data synchronization of a plurality of active servers is cascaded in two directions.
- the active server that is the first tier in one direction is the last tier in the other direction.
- the plurality of active servers are connected in a row such that adjacent active servers synchronize data with each other in two directions, and two active servers at both ends also synchronize their data with the standby server.
- each active server belongs to two systems. Therefore, when a failure occurs in the active server, switching can be performed by selecting an appropriate one of the two systems.
- server system switching can be limited to only one of the directions.
- if N active servers are all cascaded in one system, the number of communications between servers at switching is at most N. If they are instead divided into two groups of (N/2) servers each, and each group is cascade-connected in both directions so that there are four systems in total, the number of communications is at most about (N/4). As a result, the time taken for server system switching in the node as a whole can be shortened.
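The communication-count estimate can be checked with a small calculation, under the assumption that a switchover touches every server between the failed one and the standby, and that a bidirectional cascade lets switching run toward the nearer end of the failed server's group; the function names are illustrative.

```python
# Back-of-the-envelope check of the N versus N/4 estimate.

def max_hops_single_chain(n):
    # Single one-directional cascade of n active servers: the worst
    # failure is at the head, so up to n servers take part in switching.
    return n

def max_hops_four_systems(n):
    # Two groups of n//2 active servers, each cascaded in both directions
    # (four systems in total): the worst-case distance to the nearer end
    # of a group is about n//4.
    group = n // 2
    return max(min(i + 1, group - i) for i in range(group))

print(max_hops_single_chain(8), max_hops_four_systems(8))  # → 8 2
```

For N = 8 the single chain needs up to 8 switching steps while the four-system layout needs at most 2, matching the roughly N/4 figure stated above.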
- FIG. 9 is a block diagram showing the configuration of a node according to the third embodiment.
- the node of this embodiment has active servers 31-1 to 31-6 and a standby server 32. The active servers 31-1 to 31-6 and the standby server 32 are connected to the communication path 33.
- the active servers 31-1 to 31-6 are divided, for data synchronization by cascade connection, into two groups: the group of the active servers 31-1, 31-2, and 31-3 and the group of the active servers 31-3, 31-4, 31-5, and 31-6.
- the plurality of active servers 31 belonging to the same group are connected in a row such that adjacent active servers 31 synchronize data with each other in both directions, and the two active servers 31 at both ends further synchronize their own data with the standby server 32.
- the active server 31-1 and the active server 31-2 synchronize data with each other in both directions.
- the active server 31-2 and the active server 31-3 synchronize data with each other in both directions.
- the active servers 31-1 and 31-3 at the two ends also synchronize their own data with the standby server 32.
- two systems of cascade connection are thus established by the group of active servers 31-1, 31-2, and 31-3.
- the active server 31-3 belongs to both groups and is positioned to connect to the standby server 32 at the last stage of a cascade system in each group. With this configuration, one set of data in the active server 31-3 can serve as the synchronized data of two systems for the standby server 32.
- the active server 31 includes a processor 34, a storage device 35, and a communication interface 36.
- the configuration and operation of the processor 34, the storage device 35, and the communication interface 36 are similar to the processor 14, the storage device 15, and the communication interface 16 of the active server 11 according to the first embodiment shown in FIG.
- one active server 31 belongs to a plurality of systems.
- the processor 34 may have a function of selecting the system in which server system switching is performed when a failure occurs in any of the active servers 31. For example, the system may be selected according to the position of the failed active server 31. More specifically, information associating each active server 31 that may fail with the system requiring the smallest number of server system switching stages for that failure may be set in advance in each server.
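The pre-set selection information described above can be sketched as a simple lookup. The stage counts below follow the FIG. 9 failure example for server 31-4 described later in the text; the table and function names are illustrative assumptions, not structures defined by the patent.

```python
# Sketch of system selection: when a server in the middle of a
# bidirectional cascade fails, switching proceeds through the neighbor
# whose path to the standby server has the fewer stages.

# stages to the standby server through each neighbor of failed 31-4
STAGES_TO_STANDBY = {
    "31-3": 1,  # 31-3 -> standby 32
    "31-5": 2,  # 31-5 -> 31-6 -> standby 32
}

def choose_switching_system(stages):
    # pre-set information: pick the neighbor whose path has fewer stages
    return min(stages, key=stages.get)

print(choose_switching_system(STAGES_TO_STANDBY))  # → 31-3
```

Here switching runs through 31-3, so only one server besides the standby is involved rather than two.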
- the standby server 32 includes a processor 37, a storage device 38, and a communication interface 39.
- the standby server 32 is shared by a plurality of systems similar to the standby server 22 of the second embodiment shown in FIG.
- the standby server 32 operates similarly to the standby server 12 of the first embodiment for each system.
- the operation of each processor 37, storage 38 and communication interface 39 is similar to that of the processor 17, storage 18 and communication interface 19 according to the first embodiment.
- the overall operation of the node when the active server 31-4 fails is as follows.
- consider the operation of the node when the active server 31-4 fails, starting from the state of normal operation in which the active servers 31-1 to 31-6 provide services.
- when the active server 31-4 fails, the active server 31-3 and the active server 31-5 detect the failure.
- on the path through the active server 31-3, only one server (the active server 31-3 itself) lies on the way to the standby server 32.
- on the path through the active server 31-5, two servers (the active servers 31-5 and 31-6) lie on the way to the standby server 32.
- the active server 31-3 therefore starts the server system switching sequence. The active server 31-3 confirms that it is not the standby server 32, and requests server system switching of the subsequent server (the standby server 32) so that its own service is continued.
- the standby server 32 receives the server system switching request and, using the data D3-3''' synchronized with the preceding active server 31-3, starts the service that the active server 31-3 had been performing. Furthermore, the standby server 32 notifies the active server 31-3 of server system switching completion.
- upon receiving the server system switching completion from the standby server 32, the active server 31-3 stops the service that it had been performing. Next, the active server 31-3 starts the service that the active server 31-4 had been performing, using the data D3-4'' synchronized with the preceding active server 31-4.
- the data amount of the server system switching request and the server system switching completion transmitted and received between the servers is sufficiently smaller than the amount of synchronization data transferred to synchronize the data used for the service. Therefore, the time taken for communication between servers is short, and server system switching completes immediately. Thus, even when the active server 31-4 fails, the service can be continued in the node as a whole.
Description
a plurality of active servers cascade-connected such that data synchronized with the data of the preceding server is stored in the succeeding server;
and a standby server storing data synchronized with the data of the last stage of the cascade connection of the plurality of active servers;
when a failure occurs in any active server, each server from the stage following the failed active server to the standby server takes over and performs, using the data synchronized with its respective preceding server, the service that the preceding server had been performing.
when a failure occurs in any active server, each server from the stage following the failed active server to the standby server takes over, using the data synchronized with its respective preceding server, the service that the preceding server had been performing.
and processing means for, when a failure occurs in the preceding active server apparatus or when a request is received from the preceding active server apparatus, after causing the succeeding server apparatus to take over the service that the own server apparatus had been performing, taking over and performing the service that the preceding active server apparatus had been performing, using the data synchronized with the data of the preceding active server apparatus stored in the storage means.
a procedure for, when a failure occurs in the preceding active server apparatus or when a request is received from the preceding active server apparatus, causing the succeeding server apparatus to take over the service that the own server apparatus had been performing;
and a procedure for taking over the service that the preceding active server apparatus had been performing, using the data synchronized with the data of the preceding active server apparatus stored in the storage means; this is a program for causing a computer to execute these procedures.
FIG. 5 is a block diagram showing the configuration of the node of the first embodiment. The node of this embodiment has active servers 11-1 and 11-2 and a standby server 12. The active servers 11-1 and 11-2 and the standby server 12 are connected to the communication path 13.
In the first embodiment, one standby server is assigned to one system of cascade-connected active servers. However, the present invention is not limited to this. As a second embodiment, a configuration is illustrated in which one standby server is assigned to two systems of cascade-connected active servers.
In the first and second embodiments, each active server belongs to one of the systems, and data synchronization among the plurality of active servers is in one direction only. However, the present invention is not limited to this. The third embodiment illustrates a configuration in which data synchronization of a plurality of active servers is cascade-connected in two directions.
Claims (21)
- A node system comprising: a plurality of active servers cascade-connected such that data synchronized with the data of a preceding-stage server is stored in a succeeding-stage server; and a standby server that stores data synchronized with the data of the last stage of the cascade connection of the plurality of active servers, wherein, when a failure occurs in any one of the active servers, each server from the stage following the failed active server through the standby server takes over and performs the service previously performed by its preceding-stage server, using the data synchronized with that preceding-stage server.
- The node system according to claim 1, wherein there are a plurality of chains of cascade connection of the plurality of active servers, and data synchronized with the data of the last-stage active server of each of the plurality of chains is recorded in the same standby server.
- The node system according to claim 2, wherein at least one active server belongs to a plurality of chains of cascade connection, and when a failure occurs in said active server, the switchover is executed in the chain, among the plurality of chains to which said active server belongs, in which the number of server stages whose services are switched is smaller.
- The node system according to claim 2 or 3, wherein a plurality of active servers belonging to the same chain are connected in a row such that adjacent active servers synchronize data with each other bidirectionally, and data synchronized with the data of the two active servers at both ends of the chain is stored in the standby server.
- The node system according to any one of claims 2 to 4, wherein there is an active server that belongs to a plurality of chains of cascade connection and is positioned at the last stage in every one of the plurality of chains, and data synchronized with the data of said active server positioned at the last stage of the plurality of chains is recorded in the standby server.
- The node system according to claim 2, wherein there are two chains of cascade connection of a plurality of active servers, each of the active servers belongs to one of the chains, and data synchronized with the data of the last-stage active servers of the two chains is recorded in one standby server.
- A server switching method comprising: storing, in a succeeding-stage active server, data synchronized with the data of a preceding-stage active server so that a plurality of active servers are cascade-connected, and storing, in a standby server, data synchronized with the data of the last stage of the cascade connection of the plurality of active servers; and when a failure occurs in any one of the active servers, having each server from the stage following the failed active server through the standby server take over the service previously performed by its preceding-stage server, using the data synchronized with that preceding-stage server.
- The server switching method according to claim 7, wherein there are a plurality of chains of cascade connection of the plurality of active servers, and data synchronized with the data of the last-stage active server of each of the plurality of chains is recorded in the same standby server.
- The server switching method according to claim 8, wherein at least one active server belongs to a plurality of chains of cascade connection, and when a failure occurs in said active server, the switchover is executed in the chain, among the plurality of chains to which said active server belongs, in which the number of server stages whose services are switched is smaller.
- The server switching method according to claim 8 or 9, wherein a plurality of active servers belonging to the same chain are connected in a row such that adjacent active servers synchronize data with each other bidirectionally, and data synchronized with the data of the two active servers at both ends of the chain is stored in the standby server.
- The server switching method according to any one of claims 8 to 10, wherein there is an active server that belongs to a plurality of chains of cascade connection and is positioned at the last stage in every one of the plurality of chains, and data synchronized with the data of said active server positioned at the last stage of the plurality of chains is recorded in the standby server.
- The server switching method according to claim 8, wherein there are two chains of cascade connection of a plurality of active servers, each of the active servers belongs to one of the chains, and data synchronized with the data of the last-stage active servers of the two chains is recorded in one standby server.
- A server apparatus comprising: storage means for storing data synchronized with the data of a preceding-stage active server apparatus in a node system in which a plurality of active server apparatuses are cascade-connected such that data synchronized with the data of a preceding-stage active server apparatus is stored in a succeeding-stage active server apparatus, and data synchronized with the data of the last-stage active server apparatus is stored in a standby server apparatus; and processing means for, when a failure occurs in the preceding-stage active server apparatus or when a request is received from the preceding-stage active server apparatus, causing a succeeding-stage server apparatus to take over the service that the own server apparatus has been performing, and thereafter taking over and performing the service previously performed by the preceding-stage active server apparatus, using the data, stored in the storage means, that is synchronized with the data of the preceding-stage active server apparatus.
- The server apparatus according to claim 13, wherein, when a failure occurs in the preceding-stage active server apparatus or when a request is received from the preceding-stage active server apparatus, the processing means: if there is no succeeding-stage server apparatus, omits the processing of causing a succeeding-stage server apparatus to take over the service that the own server apparatus has been performing, and takes over the service previously performed by the preceding-stage active server apparatus; and if there is a succeeding-stage server apparatus, causes the succeeding-stage server apparatus to take over the service that the own server apparatus has been performing, and thereafter takes over the service previously performed by the preceding-stage active server apparatus.
- The server apparatus according to claim 13, wherein, when a failure occurs in an active server apparatus belonging to a plurality of chains of cascade connection, the processing means executes the switchover in the chain, among the plurality of chains to which the active server belongs, in which the number of server stages whose services are switched is smaller.
- A data takeover method comprising: storing data synchronized with the data of a preceding-stage active server apparatus in a node system in which a plurality of active server apparatuses are cascade-connected such that data synchronized with the data of a preceding-stage active server apparatus is stored in a succeeding-stage active server apparatus, and data synchronized with the data of the last-stage active server apparatus is stored in a standby server apparatus; when a failure occurs in the preceding-stage active server apparatus or when a request is received from the preceding-stage active server apparatus, causing a succeeding-stage server apparatus to take over the service that the own server apparatus has been performing; and taking over the service previously performed by the preceding-stage active server apparatus, using the data synchronized with the data of the preceding-stage active server apparatus.
- The data takeover method according to claim 16, wherein, when a failure occurs in the preceding-stage active server apparatus or when a request is received from the preceding-stage active server apparatus: if there is no succeeding-stage server apparatus, the processing of causing a succeeding-stage server apparatus to take over the service that the own server apparatus has been performing is omitted, and the service previously performed by the preceding-stage active server apparatus is taken over; and if there is a succeeding-stage server apparatus, the succeeding-stage server apparatus is caused to take over the service that the own server apparatus has been performing, and thereafter the service previously performed by the preceding-stage active server apparatus is taken over.
- The data takeover method according to claim 16, wherein, when a failure occurs in an active server apparatus belonging to a plurality of chains of cascade connection, the switchover is executed in the chain, among the plurality of chains to which the active server belongs, in which the number of server stages whose services are switched is smaller.
- A program for causing a computer to execute: a procedure of storing data synchronized with the data of a preceding-stage active server apparatus in a node system in which a plurality of active server apparatuses are cascade-connected such that data synchronized with the data of a preceding-stage active server apparatus is stored in a succeeding-stage active server apparatus, and data synchronized with the data of the last-stage active server apparatus is stored in a standby server apparatus; a procedure of, when a failure occurs in the preceding-stage active server apparatus or when a request is received from the preceding-stage active server apparatus, causing a succeeding-stage server apparatus to take over the service that the own server apparatus has been performing; and a procedure of taking over the service previously performed by the preceding-stage active server apparatus, using the data synchronized with the data of the preceding-stage active server apparatus.
- The program according to claim 19, wherein, when a failure occurs in the preceding-stage active server apparatus or when a request is received from the preceding-stage active server apparatus: if there is no succeeding-stage server apparatus, the procedure of causing a succeeding-stage server apparatus to take over the service that the own server apparatus has been performing is omitted, and the service previously performed by the preceding-stage active server apparatus is taken over; and if there is a succeeding-stage server apparatus, the succeeding-stage server apparatus is caused to take over the service that the own server apparatus has been performing, and thereafter the service previously performed by the preceding-stage active server apparatus is taken over.
- The program according to claim 19, wherein, when a failure occurs in an active server apparatus belonging to a plurality of chains of cascade connection, the switchover is executed in the chain, among the plurality of chains to which the active server belongs, in which the number of server stages whose services are switched is smaller.
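The chain-selection rule of claims 3, 15, 18, and 21 (switch over in the chain where the fewest server stages must change) can be illustrated as follows. The chain layouts and function names are invented for the example and do not come from the patent.

```python
# Illustration of selecting the cascade chain that minimizes the number of
# stages that must switch when a shared active server fails. In a chain,
# every server downstream of the failed one (including the standby at the
# end) must take over a service, so the cost is the failed server's
# distance from the end of the chain.

def stages_to_switch(chain, failed):
    """Number of servers that must take over if `failed` fails in `chain`."""
    return len(chain) - chain.index(failed) - 1

def pick_chain(chains, failed):
    """Among the chains containing `failed`, choose the cheapest switchover."""
    candidates = [c for c in chains if failed in c]
    return min(candidates, key=lambda c: stages_to_switch(c, failed))

chain_1 = ["A1", "A2", "A3", "standby"]  # failing A2 here switches 2 stages
chain_2 = ["B1", "A2", "standby"]        # failing A2 here switches 1 stage
best = pick_chain([chain_1, chain_2], "A2")  # -> chain_2
```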
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP08864266A EP2224341B1 (en) | 2007-12-21 | 2008-10-29 | Node system, server switching method, server device, and data transfer method |
US12/746,591 US20100268687A1 (en) | 2007-12-21 | 2008-10-29 | Node system, server switching method, server apparatus, and data takeover method |
CN200880121845.9A CN101903864B (zh) | 2007-12-21 | 2008-10-29 | 节点系统、服务器切换方法、服务器装置和数据接管方法 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2007-330060 | 2007-12-21 | ||
JP2007330060A JP4479930B2 (ja) | 2007-12-21 | 2007-12-21 | ノードシステム、サーバ切換え方法、サーバ装置、データ引き継ぎ方法、およびプログラム |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009081657A1 true WO2009081657A1 (ja) | 2009-07-02 |
Family
ID=40800973
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2008/069589 WO2009081657A1 (ja) | 2007-12-21 | 2008-10-29 | ノードシステム、サーバ切換え方法、サーバ装置、およびデータ引き継ぎ方法 |
Country Status (7)
Country | Link |
---|---|
US (1) | US20100268687A1 (ja) |
EP (1) | EP2224341B1 (ja) |
JP (1) | JP4479930B2 (ja) |
KR (1) | KR20100099319A (ja) |
CN (1) | CN101903864B (ja) |
TW (1) | TWI410810B (ja) |
WO (1) | WO2009081657A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103699461A (zh) * | 2013-11-27 | 2014-04-02 | 北京机械设备研究所 | 一种双主机相互冗余热备份方法 |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8578202B2 (en) * | 2010-07-29 | 2013-11-05 | Ca, Inc. | System and method for providing high availability for distributed application |
DE102011116866A1 (de) * | 2011-10-25 | 2013-04-25 | Fujitsu Technology Solutions Intellectual Property Gmbh | Clustersystem und Verfahren zum Ausführen einer Mehrzahl von virtuellen Maschinen |
CN102541693A (zh) * | 2011-12-31 | 2012-07-04 | 曙光信息产业股份有限公司 | 数据的多副本存储管理方法和系统 |
US9021166B2 (en) | 2012-07-17 | 2015-04-28 | Lsi Corporation | Server direct attached storage shared through physical SAS expanders |
JP6056408B2 (ja) * | 2012-11-21 | 2017-01-11 | 日本電気株式会社 | フォールトトレラントシステム |
JP5976589B2 (ja) * | 2013-03-29 | 2016-08-23 | シスメックス株式会社 | 検体分析方法及び検体分析システム |
EP3252607A4 (en) | 2015-01-27 | 2018-08-29 | Nec Corporation | Network function virtualization management and orchestration device, system, management method, and program |
EP3252608B1 (en) | 2015-01-30 | 2021-03-31 | NEC Corporation | Node system, server device, scaling control method, and program |
CN111352878B (zh) * | 2018-12-21 | 2021-08-27 | 达发科技(苏州)有限公司 | 数字信号处理系统及方法 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0981409A (ja) * | 1995-09-14 | 1997-03-28 | Nec Corp | 相互ホットスタンバイシステム待機系選択方式 |
JP2001043105A (ja) | 1999-07-30 | 2001-02-16 | Toshiba Corp | 高可用性計算機システム及び同システムにおけるデータバックアップ方法 |
JP2005250840A (ja) * | 2004-03-04 | 2005-09-15 | Nomura Research Institute Ltd | 耐障害システムのための情報処理装置 |
JP2007011888A (ja) * | 2005-07-01 | 2007-01-18 | Nippon Telegr & Teleph Corp <Ntt> | ノード間情報共有システム |
Family Cites Families (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6119162A (en) * | 1998-09-25 | 2000-09-12 | Actiontec Electronics, Inc. | Methods and apparatus for dynamic internet server selection |
US6397307B2 (en) * | 1999-02-23 | 2002-05-28 | Legato Systems, Inc. | Method and system for mirroring and archiving mass storage |
US6567376B1 (en) * | 1999-02-25 | 2003-05-20 | Telefonaktiebolaget Lm Ericsson (Publ) | Using system frame number to implement timers in telecommunications system having redundancy |
US6886004B2 (en) * | 2000-08-24 | 2005-04-26 | Red Hat, Inc. | Method and apparatus for atomic file look-up |
US20030005350A1 (en) * | 2001-06-29 | 2003-01-02 | Maarten Koning | Failover management system |
US6922791B2 (en) * | 2001-08-09 | 2005-07-26 | Dell Products L.P. | Failover system and method for cluster environment |
US6978398B2 (en) * | 2001-08-15 | 2005-12-20 | International Business Machines Corporation | Method and system for proactively reducing the outage time of a computer system |
US6978396B2 (en) * | 2002-05-30 | 2005-12-20 | Solid Information Technology Oy | Method and system for processing replicated transactions parallel in secondary server |
US7793060B2 (en) * | 2003-07-15 | 2010-09-07 | International Business Machines Corporation | System method and circuit for differential mirroring of data |
JP4415610B2 (ja) * | 2003-08-26 | 2010-02-17 | 株式会社日立製作所 | 系切替方法、レプリカ作成方法、及びディスク装置 |
TWI257226B (en) * | 2004-12-29 | 2006-06-21 | Inventec Corp | Remote control system of blade server and remote switching control method thereof |
JP2006285448A (ja) * | 2005-03-31 | 2006-10-19 | Oki Electric Ind Co Ltd | 冗長システム |
CN101022451B (zh) * | 2006-02-14 | 2014-07-23 | 杭州华三通信技术有限公司 | 数据通信中连接状态的同步方法及其应用的通信节点 |
US7873702B2 (en) * | 2006-03-31 | 2011-01-18 | Masstech Group Inc. | Distributed redundant adaptive cluster |
US7752404B2 (en) * | 2006-12-29 | 2010-07-06 | Emc Corporation | Toggling between concurrent and cascaded triangular asynchronous replication |
US7899917B2 (en) * | 2007-02-01 | 2011-03-01 | Microsoft Corporation | Synchronization framework for occasionally connected applications |
TW200849001A (en) * | 2007-06-01 | 2008-12-16 | Unisvr Global Information Technology Corp | Multi-server hot-backup system and fault tolerant method |
JP4561800B2 (ja) * | 2007-09-25 | 2010-10-13 | 沖電気工業株式会社 | データ同期システム及び方法 |
US7870095B2 (en) * | 2007-12-03 | 2011-01-11 | International Business Machines Corporation | Apparatus, system, and method for replication of data management information |
2007
- 2007-12-21 JP JP2007330060A patent/JP4479930B2/ja active Active
2008
- 2008-10-29 WO PCT/JP2008/069589 patent/WO2009081657A1/ja active Application Filing
- 2008-10-29 CN CN200880121845.9A patent/CN101903864B/zh not_active Expired - Fee Related
- 2008-10-29 KR KR1020107016362A patent/KR20100099319A/ko not_active Application Discontinuation
- 2008-10-29 EP EP08864266A patent/EP2224341B1/en not_active Not-in-force
- 2008-10-29 US US12/746,591 patent/US20100268687A1/en not_active Abandoned
- 2008-11-27 TW TW097145987A patent/TWI410810B/zh not_active IP Right Cessation
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0981409A (ja) * | 1995-09-14 | 1997-03-28 | Nec Corp | 相互ホットスタンバイシステム待機系選択方式 |
JP2001043105A (ja) | 1999-07-30 | 2001-02-16 | Toshiba Corp | 高可用性計算機システム及び同システムにおけるデータバックアップ方法 |
JP2005250840A (ja) * | 2004-03-04 | 2005-09-15 | Nomura Research Institute Ltd | 耐障害システムのための情報処理装置 |
JP2007011888A (ja) * | 2005-07-01 | 2007-01-18 | Nippon Telegr & Teleph Corp <Ntt> | ノード間情報共有システム |
Non-Patent Citations (1)
Title |
---|
See also references of EP2224341A4 |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103699461A (zh) * | 2013-11-27 | 2014-04-02 | 北京机械设备研究所 | 一种双主机相互冗余热备份方法 |
Also Published As
Publication number | Publication date |
---|---|
TWI410810B (zh) | 2013-10-01 |
CN101903864A (zh) | 2010-12-01 |
TW200935244A (en) | 2009-08-16 |
JP2009151629A (ja) | 2009-07-09 |
EP2224341A1 (en) | 2010-09-01 |
EP2224341A4 (en) | 2012-03-07 |
US20100268687A1 (en) | 2010-10-21 |
KR20100099319A (ko) | 2010-09-10 |
JP4479930B2 (ja) | 2010-06-09 |
CN101903864B (zh) | 2016-04-20 |
EP2224341B1 (en) | 2013-03-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2009081657A1 (ja) | ノードシステム、サーバ切換え方法、サーバ装置、およびデータ引き継ぎ方法 | |
JP6382454B2 (ja) | 分散ストレージ及びレプリケーションシステム、並びに方法 | |
WO2019085875A1 (zh) | 存储集群的配置修改方法、存储集群及计算机系统 | |
CN102404390B (zh) | 高速实时数据库的智能化动态负载均衡方法 | |
RU2635263C2 (ru) | Способ резервирования для сетей связи | |
US20150019491A1 (en) | Replication of Data Between Mirrored Data Sites | |
JP3798661B2 (ja) | クラスタ化コンピュータ・システム内のグループのメンバによって受信されたマージ要求を処理する方法 | |
WO2007048319A1 (fr) | Systeme et procede de recuperation sur sinistre de dispositif de commande de service dans un reseau intelligent | |
US20130219224A1 (en) | Job continuation management apparatus, job continuation management method and job continuation management program | |
CN113612614B (zh) | 基于区块链网络的共识容灾方法、装置、设备和存储介质 | |
CN105760519A (zh) | 一种集群文件系统及其文件锁分配方法 | |
WO2014177085A1 (zh) | 分布式多副本数据存储方法及装置 | |
CN108512753B (zh) | 一种集群文件系统中消息传输的方法及装置 | |
CN106294031B (zh) | 一种业务管理方法和存储控制器 | |
JP5201134B2 (ja) | 二重化システム、切替プログラムおよび切替方法 | |
CN114244859A (zh) | 数据处理方法及装置和电子设备 | |
CN101145955A (zh) | 网管软件热备份的方法、网管及网管系统 | |
CN110351122A (zh) | 容灾方法、装置、系统与电子设备 | |
CN114598593B (zh) | 消息处理方法、系统、计算设备及计算机存储介质 | |
JP5716460B2 (ja) | クラスタシステムおよびその制御方法 | |
JP5798056B2 (ja) | 呼処理情報の冗長化制御システムおよびこれに利用する予備保守サーバ | |
JP2009217765A (ja) | 複数宛先への同期送信方法、その実施システム及び処理プログラム | |
JP2010086227A (ja) | 計算機間相互結合網における通信経路の冗長化と切り替え方法、この方法を実現するサーバ装置、そのサーバモジュール、および、そのプログラム | |
KR102041793B1 (ko) | 이중화를 이용하여 장애를 처리하는 ptt 서비스 관리 시스템 및 그 방법 | |
CN117555688A (zh) | 基于双活中心的数据处理方法、系统、设备及存储介质 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 200880121845.9 Country of ref document: CN |
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 08864266 Country of ref document: EP Kind code of ref document: A1 |
WWE | Wipo information: entry into national phase |
Ref document number: 12746591 Country of ref document: US |
WWE | Wipo information: entry into national phase |
Ref document number: 2008864266 Country of ref document: EP |
NENP | Non-entry into the national phase |
Ref country code: DE |
ENP | Entry into the national phase |
Ref document number: 20107016362 Country of ref document: KR Kind code of ref document: A |