WO2012101785A1 - Management device, management method, and management program - Google Patents
Management device, management method, and management program
- Publication number
- WO2012101785A1 PCT/JP2011/051517
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- node
- management
- backup
- key
- domain
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0654—Management of faults, events, alarms or notifications using network fault recovery
- H04L41/0668—Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4004—Coupling between buses
- G06F13/4022—Coupling between buses using switching circuits, e.g. switching matrix, connection or expansion network
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1658—Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2023—Failover techniques
- G06F11/2028—Failover techniques eliminating a faulty processor or activating a spare
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2048—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant where the redundant components share neither address space nor persistent storage
Definitions
- the present invention relates to a management device, a management method, and a management program.
- a technique is known in which an overlay network, which constructs another network on top of an existing network, is used, and a routing table is created and changed based on information about network failures.
- when the operation management manager is hierarchized to manage the operation of a network such as a large-scale data center, processing may be delayed because load concentrates on the manager. Preparing for this load concentration by using a high-performance server increases cost. Moreover, in a configuration in which managers are hierarchized, the manager becomes a SPOF (Single Point of Failure), and fault tolerance decreases.
- the disclosed technology has been made in view of the above, and aims to improve load distribution, scalability, and reliability in network system management.
- the management device, management method, and management program disclosed in the present application select a backup node of a management node from the nodes of a network constructed according to a predetermined rule for the network to be managed, based on a plurality of indices including at least one of the management range to which a node belongs, the data capacity, and the operation time.
- the disclosed apparatus, method, and program replicate management information to a backup node, and switch the backup node to the management node when the management node stops.
- according to the management device, management method, and management program disclosed in the present application, it is possible to distribute the load of managing the network system and to improve scalability and reliability.
- FIG. 1 is an explanatory diagram of the management system according to the present embodiment.
- FIG. 2 is an explanatory diagram of the network according to the present embodiment.
- FIG. 3 is a configuration diagram of the management apparatus according to the present embodiment.
- FIG. 4 is an explanatory diagram of implementation by the management program.
- FIG. 5 is an explanatory diagram of hierarchical management.
- FIG. 6 is an explanatory diagram of the relationship between the server hardware and the management program.
- FIG. 7 is an explanatory diagram of an overlay network.
- FIG. 8 is an explanatory diagram of a specific example of the definition of the hash table.
- FIG. 9 is a diagram showing a specific example of the self-node table t2 shown in FIG. 3.
- FIG. 10 is a diagram showing a specific example of the domain table t3 shown in FIG. 3.
- FIG. 11 is a diagram showing a specific example of the node management table t4 shown in FIG. 3.
- FIG. 12 is a diagram showing a specific example of the routing table t5 shown in FIG. 3.
- FIG. 13 is a flowchart for explaining the processing operation of the backup processing unit m40.
- FIG. 1 is an explanatory diagram of the management system according to the present embodiment.
- the node N1 illustrated in FIG. 1 is a management node (manager) that manages the overlay network including the nodes N2 to N4, and includes a node selection unit m41, a data replication unit m42, and a switching processing unit m43.
- the nodes N2 to N4 also have a node selection unit m41, a data replication unit m42, and a switching processing unit m43, similarly to the node N1.
- the node selection unit m41 acquires the management range, data capacity, and operation time to which the node belongs from the nodes N2 to N4, and selects the backup node of the management node using these as indices.
- the data replication unit m42 replicates the management information to the backup node selected by the node selection unit m41, and the switching processing unit m43 switches the backup node to the management node when the management node stops.
- FIG. 2 is an explanatory diagram of the network according to the present embodiment.
- FIG. 3 is a configuration diagram of the management apparatus according to the present embodiment.
- the management target devices n1 to n4 are connected via a network. This network is the network to be monitored.
- the management device m1 is connected to the management target device n1, the management device m2 is connected to the management target device n2, the management device m3 is connected to the management target device n3, and the management device m4 is connected to the management target device n4.
- the management devices m1 to m4 use the network interface of the management target devices n1 to n4 to construct an overlay network for the network to which the management target devices n1 to n4 belong.
- the management devices m1 to m4 function as nodes of this overlay network and can communicate with each other.
- the management device m1 includes a node selection unit m41, a data replication unit m42, and a switching processing unit m43.
- the management apparatus m1 includes an overlay network construction unit m11, a management object search unit m12, a management information creation unit m13, a life / death monitoring unit m30, and a backup processing unit m40.
- the backup processing unit m40 includes a node selection unit m41, a data replication unit m42, and a switching processing unit m43.
- the management device m1 is connected to a SAN (Storage Area Network) and causes the SAN to hold various types of information described later.
- the overlay network construction unit m11 is a processing unit that constructs an overlay network for a management target network, and includes a communication processing unit m21, a hash processing unit m22, an information acquisition unit m23, and a notification unit m24.
- the communication processing unit m21 performs processing to communicate with other nodes on the network in which the management target device n1 participates as a node.
- the hash processing unit m22 obtains a hash value from information acquired by the communication processing unit m21 from another node or information on the management target device, and uses the obtained hash value as a key of the overlay network.
- the information acquisition unit m23 is a processing unit that acquires information from other nodes of the overlay network via the communication processing unit m21.
- the notification unit m24 is a processing unit that notifies information to other nodes in the overlay network via the communication processing unit m21.
- the management target search unit m12 performs a process of searching the overlay network constructed by the overlay network construction unit m11 for nodes that belong to the same management range as the own node, that is, the management target device to which the management device m1 is directly connected.
- the management information creation unit m13 creates management information with the node obtained by the search by the management target search unit m12 as the management target node.
- the life / death monitoring unit m30 is a processing unit that monitors the life / death of the node designated as the monitoring target.
- the backup processing unit m40 includes a node selection unit m41, a data replication unit m42, and a switching processing unit m43, and performs switching based on the backup node selection, data replication, and the monitoring result of the life / death monitoring unit m30.
- the management apparatus m1 is preferably implemented as a management program that operates on a computer that is a management target apparatus.
- domain A and domain B each include three servers, and communication between domain A and domain B is possible.
- on the server 11, a VM (Virtual Machine) host program 21 that virtually realizes the operating environment of another computer system is operating.
- VM guest programs 41 to 44 are running on the VM host program 21.
- an operation management program 31 further operates on the VM host program 21.
- the operation management program 31 operating on the VM host program 21 causes the server 11 to function as a management device.
- the management target devices of the operation management program 31 are the server 11 itself, the VM host program 21 operating on the server 11, and the VM guest programs 41 to 44.
- on the server 12, an OS (Operating System) 23 is operating, and an operation management program 32 is operating on the OS 23.
- a switch 51 and a router 53 are connected to the server 12.
- the operation management program 32 operating on the server OS 23 causes the server 12 to function as a management apparatus.
- the management target devices of the operation management program 32 are the server 12 itself and the switch 51 and router 53 connected to the server 12.
- on the server 13, an OS 24 is operating, and an operation management program 33 is operating on the OS 24.
- a storage 55 is connected to the server 13.
- the operation management program 33 operating on the OS 24 of the server 13 causes the server 13 to function as a management device.
- the management target device of the operation management program 33 is the server 13 itself and the storage 55 connected to the server 13.
- the operation management programs 34 to 36 operate on the VM host program 22 and the OSs 25 and 26 on the servers 14 to 16, respectively.
- the servers 14 to 16, the various programs operating on each server (VM host 22, OSs 25 and 26, VM guests 45 to 48), and the hardware connected to each server (switch 52, router 54, storage 56) are managed by the operation management program running on the corresponding server.
- the operation management programs 31 to 36 on the servers 11 to 16 communicate with each other to construct an overlay network.
- the operation management programs 31 to 36 can collect information about other nodes in the domain to which the operation management program belongs and create management information.
- the management information created by the operation management programs 31 to 36 can be acquired from the terminal 1, which is accessible from both the domain A and the domain B.
- FIG. 5 is a comparative example of FIG. 4 and is an explanatory diagram of hierarchical management.
- a sub-manager 3 that manages the domain A and a sub-manager 4 that manages the domain B are provided, and the two sub-managers 3 and 4 are managed by the integrated manager 2.
- the sub-managers 3 and 4 perform state-monitoring polling, using SNMP or the like, on the devices belonging to the domains they are in charge of. Each sub-manager also receives events such as SNMP traps from the devices belonging to its domain and collects information.
- the domain A includes servers 11 and 12, a switch 51, a router 53, and a storage 55.
- a VM host program 21 operates on the server 11, and VM guest programs 41 to 44 operate on the VM host program 21.
- the domain B includes servers 14 and 15, a switch 52, a router 54, and a storage 56.
- a VM host program 22 operates on the server 14, and VM guest programs 45 to 48 operate on the VM host program 22.
- the management programs 31 to 36 shown in FIG. 4 are the same program distributed to each server, and there is no distinction between an integrated manager and a sub-manager; the management program operates on every management target without distinguishing an integrated-manager computer from a sub-manager computer. For this reason, by preparing a backup for the manager and having the backup take over management when the manager stops, the load of managing the network system can be distributed, and the scalability and reliability of the system can be improved.
- FIG. 6 is an explanatory diagram of the relationship between the server hardware and the management program.
- the management program pg10 is stored in an HDD (Hard disk drive) p13 inside the server.
- the management program pg10 includes an overlay network construction process pg11 in which the operation as the overlay network construction unit is described, a management target search process pg12 in which the operation as the management target search unit is described, a management information creation process pg13 in which the operation as the management information creation unit is described, a life/death monitoring process pg14, and a backup processing process pg15.
- the management program pg10 is read from the HDD p13 and expanded in the memory p12. Then, the CPU (Central Processing Unit) p11 sequentially executes the programs expanded in the memory, thereby causing the server to function as a management device. At this time, the communication interface p14 of the server is used as the overlay network interface in the management apparatus.
- FIG. 7 is an explanatory diagram of the overlay network.
- when the management device or the management program is activated, it forms an overlay network.
- when the overlay network construction unit m11 uses, for example, Chord as the DHT (distributed hash table) algorithm, a circular overlay network as shown in FIG. 7 is formed.
- a pair of key and value is held in a distributed manner at each node participating in the overlay network.
- a value hashed with SHA-1 (Secure Hash Algorithm 1) is used as the key.
- the key of vmhost2 is 1, the key of domain1 is 5, the key of server1 is 15, the key of server2 is 20, the key of group1 is 32, the key of user1 is 40, and the key of vmguest11 is 55.
- the key of server3 is 66, the key of vmguest12 is 70, the key of vmhost3 is 75, the key of vmguest13 is 85, and the key of vmguest14 is 90.
- the key of vmhost1 is 100, the key of switch1 is 110, the key of storage1 is 115, and the key of vmguest21 is 120.
- vmhost1 to vmhost3 and server1 to server3 belong to domain1 and are the nodes on which the management program is executed; they are indicated by black circles in FIG. 7. The vmguests, storage, switches, and the like belonging to domain1 are indicated by double circles in FIG. 7. The nodes belonging to domain2 (the nodes with keys 4, 33, and 36) are indicated by shaded circles in FIG. 7.
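- as a minimal sketch of this name-to-key mapping (assuming a 2^7 key space, since FIG. 7 shows keys between 0 and 127; the illustrative keys above are chosen for readability and will not be reproduced by a real SHA-1 hash):

```python
import hashlib

KEY_BITS = 7  # assumption: FIG. 7 shows keys in 0..127, i.e. a 2^7 key space

def name_to_key(name, key_bits=KEY_BITS):
    # Hash the name with SHA-1 and fold the digest into the key space.
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** key_bits)

# The printed keys will not match the illustrative values of FIG. 7
# (vmhost2 -> 1, domain1 -> 5, ...), which are chosen for readability.
for name in ("vmhost2", "domain1", "server1"):
    print(name, "->", name_to_key(name))
```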
- each node holds, as routing information, its immediately preceding node, its immediately succeeding node, and the nodes responsible for the keys (own node key + 2^(x-1)) mod 2^k, where x is a natural number from 1 to k and k is the number of bits of the key. That is, each node has information on discrete nodes at offsets such as 1, 2, 4, 8, 16, 32, 64, and 128.
- the value for a key is held at the first node whose key is equal to or greater than that key, and the value corresponding to a key can likewise be obtained from that node.
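- a sketch of this routing information and of the successor rule, under the same 7-bit assumption as above; note that the concrete Destination Keys of FIG. 12 do not follow from this arithmetic alone, so only the arithmetic is illustrated:

```python
KEY_BITS = 7                # assumption: 7-bit key space, as above
RING = 2 ** KEY_BITS

def finger_targets(own_key, k=KEY_BITS):
    # Keys at (own_key + 2^(x-1)) mod 2^k for x = 1..k: offsets 1, 2, 4, ...
    return [(own_key + 2 ** (x - 1)) % RING for x in range(1, k + 1)]

def successor(key, node_keys):
    # The node responsible for `key`: the first node whose key is equal to
    # or greater than `key`, wrapping around the ring if necessary.
    candidates = sorted(node_keys)
    for n in candidates:
        if n >= key:
            return n
    return candidates[0]

# Node keys taken from FIG. 7 (domain1 and domain2 combined).
nodes = [1, 4, 5, 15, 20, 32, 33, 36, 40, 55, 66, 70, 75, 85, 90,
         100, 110, 115, 120]
print(finger_targets(100))    # [101, 102, 104, 108, 116, 4, 36]
print(successor(101, nodes))  # 110: first key at or after 101
```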
- FIG. 8 is an explanatory diagram of a specific example of the definition of the DHT (distributed hash table). This DHT corresponds to the hash table t1 in the SAN of FIG. 3.
- FIG. 8 shows a key hashed with SHA-1 and a value associated with the key.
- for a server, the server name is hashed with SHA-1 and used as the Key. The Value includes a tag “server” indicating a server, the server name, the key obtained from the server name, a list of the server's IP addresses (IP list), a list of the server's WWNs (WWN list), a manager-flag indicating whether the node is registered as a management node, a secondary-manager flag indicating that the node is registered as a backup node, and the domain to which the server belongs together with a list of domain keys.
- for a VM host, the VM host name is hashed with SHA-1 and used as the Key. The Value includes a tag “vmhost” indicating a VM host, the VM host name, the key obtained from the VM host name, the IP list of the VM host, the domain to which the VM host belongs together with a list of domain keys, and a list of the VM guests operating on the VM host.
- for a VM guest, the VM guest name is hashed with SHA-1 and used as the Key. The Value includes a tag “vmguest” indicating a VM guest, the VM guest name, the key obtained from the VM guest name, the IP list of the VM guest, and the name and key of the VM host on which the VM guest is operating.
- for a switch, the switch name is hashed with SHA-1 and used as the Key. The Value includes a tag “switch” indicating a switch, the switch name, the key obtained from the switch name, the IP list of the switch, and the domain to which the switch belongs together with a list of domain keys.
- for storage, the storage name is hashed with SHA-1 and used as the Key. The Value includes a tag “storage” indicating storage, the storage name, the key obtained from the storage name, the IP list of the storage, the WWN list of the storage, and the domain to which the storage belongs together with a list of domain keys.
- for a user, the user name is hashed with SHA-1 and used as the Key. The Value includes a tag “user” indicating a user, the user name, the key obtained from the user name, and the group to which the user belongs together with a list of group keys.
- for a group, the group name is hashed with SHA-1 and used as the Key. The Value includes a tag “group” indicating a group, the group name, the key obtained from the group name, and a list of the names and keys of the users belonging to the group.
- for a domain, the domain name is hashed with SHA-1 and used as the Key. The Value includes a tag “domain” indicating a domain, the domain name, the key obtained from the domain name, and a list of the keys of the management devices of the domain.
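- as an illustration of these definitions, a sketch of building the (Key, Value) pair for a server entry; the Python field names and argument values are assumptions mirroring the labels of FIG. 8, not an encoding prescribed by the text:

```python
import hashlib

def name_to_key(name, key_bits=7):
    # Same SHA-1 derivation as above; the 7-bit key space is an assumption.
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % (2 ** key_bits)

def server_entry(name, ip_list, wwn_list, domain_name):
    # (Key, Value) pair for a server, mirroring the FIG. 8 definition.
    key = name_to_key(name)
    value = {
        "tag": "server",
        "name": name,
        "key": key,
        "ip_list": ip_list,
        "wwn_list": wwn_list,
        "manager-flag": False,       # True once registered as a management node
        "secondary-manager": False,  # True once registered as a backup node
        "domain": [(domain_name, name_to_key(domain_name))],
    }
    return key, value

# Illustrative values only; the IP and WWN formats follow FIG. 9.
key, value = server_entry("server1.domain1.company.com",
                          ["10.20.30.50"], ["10:00:00:60:69:00:23:74"],
                          "domain1")
```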
- FIG. 9 is a specific example of the self-node table t2 shown in FIG. 3.
- the self node table is a table in which information on nodes on the server on which the management program operates, that is, information on the server itself, a VM host operating on the server, a VM guest, and the like is registered.
- FIG. 9 shows the self-node table created by the management program operating on vmhost1, which runs the VM guests vmguest11 to vmguest14.
- the self-node table has the items type, node name, key, IP address, and WWN. The table of FIG. 9 registers, for example, the following entries (entries for vmguest12 to vmguest14 are registered in the same way):

type | node name | key | IP address | WWN |
---|---|---|---|---|
vmhost | vmhost1.domain1.company.com | 100 | 10.20.30.40 | 10:00:00:60:69:00:23:74 |
vmguest | vmguest11.domain1.company.com | 55 | 10.20.30.41 | NULL |
- FIG. 10 is a specific example of the domain table t3 shown in FIG. 3.
- each management device or management program obtains a key by hashing, with SHA-1, the domain name of the domain to which its own node belongs, and registers the key in the domain table t3. In addition to the domain name and the domain key, the domain table t3 registers the manager keys of the managers that manage the domain. Any node on which the management program runs can manage other nodes as a manager, and a plurality of managers may exist in a domain.
- FIG. 11 is a specific example of the node management table t4 shown in FIG. 3.
- the node management table t4 is management information created by a management apparatus or a management program that operates as a manager that manages the nodes in the domain, and is information on all nodes belonging to the same domain as the own node.
- the node management table t4 in FIG. 11 is a table created and held by a manager (Key100, vmhost1) that manages domain1 in the overlay network shown in FIG.
- the node management table t4 shown in FIG. 11 has the items (columns) type, node name, Key, Domain Key, Manager Flag, Managed Flag, secondary-manager Key, life/death monitoring flag, and life/death monitoring notification destination.
- Manager Flag takes a value of true if the node is a manager and false if it is not a manager.
- Managed Flag takes a value of true if the node is managed and false if it is not managed.
- secondary-manager Key indicates the key of the backup node for that node.
- the life/death monitoring flag takes the value true for a node that is monitored, false for a node that is not monitored, and NULL for a node that is not a monitoring target.
- the life / death monitoring notification destination item indicates a notification destination key to which the monitoring result of the node is to be notified when the node operates as the monitoring node.
- the node management table t4 of FIG. 11 has the following entries (blank cells indicate blank items):

type | node name | Key | Domain Key | Manager Flag | Managed Flag | secondary-manager Key | life/death monitoring flag | life/death monitoring notification destination |
---|---|---|---|---|---|---|---|---|
vmhost | vmhost2.domain1.company.com | 1 | 5 | false | true | | true | |
server | server1.domain1.company.com | 15 | 5 | true | true | | false | |
server | server2.domain1.company.com | 20 | 5 | false | true | | false | |
vmguest | vmguest11.domain1.company.com | 55 | 5 | false | true | | NULL | |
server | server3.domain1.company.com | 66 | 5 | false | true | | false | |
vmguest | vmguest12.domain1.company.com | 70 | 5 | false | true | | NULL | |
vmhost | vmhost3.domain1.company.com | 75 | 5 | false | true | | false | |
vmguest | vmguest13.domain1.company.com | 85 | 5 | false | true | | NULL | |
vmguest | vmguest14.domain1.company.com | 90 | 5 | false | true | | NULL | |
vmhost | vmhost1.domain1.company.com | 100 | 5 | true | true | 1 | NULL | |
switch | switch1.domain1.company.com | 110 | 5 | false | true | | NULL | |
storage | storage1.domain1.company.com | 115 | 5 | false | true | | NULL | |
vmguest | vmguest21.domain1.company.com | 120 | 5 | false | true | | NULL | |
- in this example, Key 1 (vmhost2) is monitored, and the backup node of Key 100 (vmhost1) is set to Key 1 (vmhost2). Therefore, when Key 100 (vmhost1) stops, management is taken over from Key 100 (vmhost1) by Key 1 (vmhost2). If Key 1 (vmhost2) stops, Key 100 (vmhost1) selects a new backup node.
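- a sketch of how the secondary-manager Key column drives this takeover; NodeEntry and backup_of are hypothetical names, not from the text:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NodeEntry:                      # one row of the node management table t4
    type: str
    name: str
    key: int
    domain_key: int
    manager_flag: bool
    managed_flag: bool
    secondary_manager_key: Optional[int] = None
    monitoring_flag: Optional[bool] = None  # None stands for NULL

def backup_of(manager_key, table):
    # Key of the backup node registered for the given manager, if any.
    for entry in table:
        if entry.key == manager_key and entry.manager_flag:
            return entry.secondary_manager_key
    return None

t4 = [
    NodeEntry("vmhost", "vmhost2.domain1.company.com", 1, 5, False, True,
              monitoring_flag=True),
    NodeEntry("vmhost", "vmhost1.domain1.company.com", 100, 5, True, True,
              secondary_manager_key=1),
]
print(backup_of(100, t4))  # -> 1: vmhost2 takes over when vmhost1 stops
```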
- FIG. 12 is a specific example of the routing table t5 shown in FIG. 3.
- the routing table t5 is a table used by each management device and management program for routing in the overlay network.
- the routing table t5 has the items distance, which indicates the destination key that is the final destination; the destination node name; a Destination Key, which indicates the routing destination used when communicating with that destination; and a Destination IP, which is the IP address of the routing destination.
- FIG. 12 is a specific example of a routing table used by the key 100 node.
- the routing table t5 of FIG. 12 has the following entries:

distance | node name | Destination Key | Destination IP |
---|---|---|---|
1 | vmhost1.domain1.company.com | 1 | a1.b1.c1.d1 |
2 | vmhost2.domain1.company.com | 1 | a1.b1.c1.d1 |
3 | vmhost2.domain1.company.com | 1 | a1.b1.c1.d1 |
5 | vmhost2.domain1.company.com | 1 | a1.b1.c1.d1 |
9 | vmhost2.domain1.company.com | 1 | a1.b1.c1.d1 |
17 | vmhost2.domain1.company.com | 1 | a1.b1.c1.d1 |
33 | node1.domain2.company.com | 4 | a4.b4.c4.d4 |
65 | node3.domain2.company.com | 36 | a36.b36.c36.d36 |
- that is, the routing table t5 specifies routing to Key 1 (IP address: a1.b1.c1.d1) when a node of domain1 at distance 1, 2, 3, 5, 9, or 17 is the destination, routing to Key 4 (IP address: a4.b4.c4.d4) for the entry at distance 33, and routing to Key 36 (IP address: a36.b36.c36.d36) for the entry at distance 65; the keys 4 and 36 belong to nodes of domain2.
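- a sketch of how such a table might be consulted, assuming greedy Chord-style forwarding (choose the entry with the largest distance not exceeding the clockwise distance to the destination); this rule is an assumption consistent with the entries above, not spelled out in the text:

```python
RING = 128  # assumption: 7-bit key space, as in FIG. 7

def route(own_key, dest_key, routing_table):
    # Clockwise distance from the own node to the destination key.
    distance = (dest_key - own_key) % RING
    best = None
    # Entries are (distance, node name, Destination Key, Destination IP).
    for entry in routing_table:
        if entry[0] <= distance and (best is None or entry[0] > best[0]):
            best = entry
    return best

# Routing table of the Key 100 node, transcribed from FIG. 12.
t5 = [
    (1, "vmhost1.domain1.company.com", 1, "a1.b1.c1.d1"),
    (2, "vmhost2.domain1.company.com", 1, "a1.b1.c1.d1"),
    (3, "vmhost2.domain1.company.com", 1, "a1.b1.c1.d1"),
    (5, "vmhost2.domain1.company.com", 1, "a1.b1.c1.d1"),
    (9, "vmhost2.domain1.company.com", 1, "a1.b1.c1.d1"),
    (17, "vmhost2.domain1.company.com", 1, "a1.b1.c1.d1"),
    (33, "node1.domain2.company.com", 4, "a4.b4.c4.d4"),
    (65, "node3.domain2.company.com", 36, "a36.b36.c36.d36"),
]
print(route(100, 5, t5))  # distance 33 -> forwards toward Key 4 (a4.b4.c4.d4)
```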
- FIG. 13 is a flowchart for explaining the processing operation of the backup processing unit m40.
- the node selection unit m41 selects one node from the overlay network (S101), and determines whether or not it is in the same domain as the manager (S102).
- the node selection unit m41 determines whether there is sufficient capacity in the data area of the selected node (S103).
- when there is sufficient capacity in the data area of the selected node (S103, Yes), the node selection unit m41 determines whether the selected node has been operating continuously for a time equal to or greater than a threshold (S104).
- the node selection unit m41 sets the selected node as a backup node (S105).
- if the selected node is not in the same domain as the manager (S102, No), if the data area of the selected node does not have sufficient capacity (S103, No), or if the operation time does not reach the threshold (S104, No), the node selection unit m41 returns to step S101 and reselects a node. Specifically, it searches the nodes in key order, such as Key 1, 15, 20.
- the node selection unit m41 updates the hash table t1 (S106), and copies the node management table t4, which is management information, to the backup node (S107).
- the life/death monitoring unit m30 starts life/death monitoring of the backup node (S108); when the backup node goes down (S109, Yes), the process returns to step S101 and a new backup node is selected.
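- a minimal sketch of this S101 to S109 flow; the threshold value and the field names are assumptions, since the text fixes neither:

```python
UPTIME_THRESHOLD = 7 * 24 * 3600  # assumption: e.g. one week, in seconds

def select_backup(manager, candidates):
    # S101: examine candidate nodes in key order, e.g. Keys 1, 15, 20, ...
    for node in sorted(candidates, key=lambda n: n["key"]):
        if node["domain_key"] != manager["domain_key"]:   # S102: same domain?
            continue
        if node["free_capacity"] < manager["data_size"]:  # S103: enough capacity?
            continue
        if node["uptime"] < UPTIME_THRESHOLD:             # S104: uptime at threshold?
            continue
        return node                                       # S105: backup node chosen
    return None  # fall back to the longest-uptime rule described below

manager = {"key": 100, "domain_key": 5, "data_size": 10}
candidates = [
    {"key": 1, "domain_key": 5, "free_capacity": 100, "uptime": 10 ** 7},
    {"key": 15, "domain_key": 5, "free_capacity": 5, "uptime": 10 ** 7},
]
print(select_backup(manager, candidates)["key"])  # -> 1
# After S105 the manager updates the hash table t1 (S106), replicates the
# node management table t4 to the backup node (S107), and starts life/death
# monitoring of the backup node (S108); if it goes down (S109), reselect.
```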
- when the management node stops, the switching processing unit m43 of the backup node automatically takes over the management processing.
- when the original management node recovers, the management task can be recalled from the backup node and returned to that management node.
- when no node satisfies the operation-time threshold, a node whose operation time is longer than that of the other nodes is set as the backup node; for example, the top two nodes by operation time may be used as backup nodes.
- when the management node goes down and the management work is taken over by the backup node, the backup node further selects a backup node for itself. If the original management node does not recover within a predetermined time after the backup node takes over the management work, the backup node is promoted to the management node, and the backup node's own backup node is promoted to the backup node. After the predetermined time elapses, the node that was the backup node therefore operates as the management node regardless of whether the original management node has recovered.
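- a sketch of this switching rule, with a hypothetical is_alive() liveness probe and an assumed grace period (the text does not give a concrete period); select_backup is reused from the sketch above:

```python
import time

GRACE_PERIOD = 600.0  # assumption: seconds to wait for the original manager

def act_as_manager(backup, original_manager, is_alive, select_backup_fn):
    # The backup serves as manager immediately and picks a backup of its own.
    backup["role"] = "manager"
    backup["secondary"] = select_backup_fn(backup, backup["known_nodes"])
    deadline = time.monotonic() + GRACE_PERIOD
    while time.monotonic() < deadline:
        if is_alive(original_manager):
            backup["role"] = "backup"  # original recovered in time: hand back
            return "returned"
        time.sleep(5)
    # Grace period elapsed: the promotion is permanent, and the backup's own
    # backup node becomes the new backup, regardless of later recovery.
    return "promoted"
```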
- as described above, the management device, management method, and management program select a backup node of the management node from the nodes of the overlay network, using the management range to which a node belongs, the data capacity, and the operation time as indices, replicate the management information to the backup node, and switch the backup node to the management node when the management node stops. The load of managing the network system can therefore be distributed, and scalability and reliability can be improved.
Description
m1 management device
m11 overlay network construction unit
m12 management target search unit
m13 management information creation unit
m21 communication processing unit
m22 hash processing unit
m23 information acquisition unit
m24 notification unit
m30 life/death monitoring unit
m31 subscription application unit
m32 monitoring request unit
m33 monitoring unit
m34 determination unit
m40 backup processing unit
m41 node selection unit
m42 data replication unit
m43 switching processing unit
t1 hash table
t2 self-node table
t3 domain table
t4 node management table
t5 routing table
p11 CPU
p12 memory
p13 HDD
p14 communication interface
pg10 management program
pg11 overlay network construction process
pg12 management target search process
pg13 management information creation process
pg14 life/death monitoring process
pg15 backup processing process
Claims (6)
- 1. A management device comprising: a node selection unit that selects a backup node of a management node from nodes of a network constructed according to a predetermined rule for a network to be managed, based on a plurality of indices including at least one of a management range to which a node belongs, a data capacity, and an operation time; a replication unit that replicates management information to the backup node; and a switching processing unit that switches the backup node to the management node when the management node stops.
- 2. The management device according to claim 1, wherein, when there is no node whose operation time satisfies a threshold, the node selection unit selects a plurality of nodes whose operation times are longer than those of the other nodes as the backup nodes.
- 3. The management device according to claim 1, wherein a node switched from the backup node to the management node by the switching processing unit returns management to the original management node when the original management node recovers within a predetermined time after the switching, and, after the predetermined time has elapsed, operates as the management node regardless of whether the original management node has recovered.
- 4. The management device according to claim 2, wherein a node switched from the backup node to the management node by the switching processing unit selects a backup node for its own node.
- 5. A management method comprising: selecting a backup node of a management node from nodes of a network constructed according to a predetermined rule for a network to be managed, based on a plurality of indices including at least one of a management range to which a node belongs, a data capacity, and an operation time; replicating management information to the backup node; and switching the backup node to the management node when the management node stops.
- 6. A management program causing a computer to execute: a procedure of selecting a backup node of a management node from nodes of a network constructed according to a predetermined rule for a network to be managed, based on a plurality of indices including at least one of a management range to which a node belongs, a data capacity, and an operation time; a procedure of replicating management information to the backup node; and a procedure of switching the backup node to the management node when the management node stops.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2011/051517 WO2012101785A1 (ja) | 2011-01-26 | 2011-01-26 | Management device, management method, and management program |
EP11856713.0A EP2669808A4 (en) | 2011-01-26 | 2011-01-26 | ADMINISTRATIVE APPROACH, ADMINISTRATIVE PROCEDURES AND MANAGEMENT PROGRAM |
JP2012554572A JP5741595B2 (ja) | 2011-01-26 | 2011-01-26 | Management device, management method, and management program |
US13/951,526 US20130308442A1 (en) | 2011-01-26 | 2013-07-26 | Management device and management method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2011/051517 WO2012101785A1 (ja) | 2011-01-26 | 2011-01-26 | Management device, management method, and management program |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/951,526 Continuation US20130308442A1 (en) | 2011-01-26 | 2013-07-26 | Management device and management method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2012101785A1 true WO2012101785A1 (ja) | 2012-08-02 |
Family
ID=46580390
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/051517 WO2012101785A1 (ja) | 2011-01-26 | 2011-01-26 | Management device, management method, and management program |
Country Status (4)
Country | Link |
---|---|
US (1) | US20130308442A1 (ja) |
EP (1) | EP2669808A4 (ja) |
JP (1) | JP5741595B2 (ja) |
WO (1) | WO2012101785A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7462550B2 (ja) | 2020-12-24 | 2024-04-05 | 株式会社日立製作所 | Communication monitoring handling device, communication monitoring handling method, and communication monitoring handling system |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8864216B2 (en) | 2013-01-18 | 2014-10-21 | Sabic Global Technologies B.V. | Reinforced body in white and method of making and using the same |
US9965363B2 (en) * | 2013-12-14 | 2018-05-08 | Netapp, Inc. | Techniques for LIF placement in SAN storage cluster synchronous disaster recovery |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006323526A (ja) * | 2005-05-17 | 2006-11-30 | Fujitsu Ltd | Cluster management program, recording medium recording the program, cluster management method, node, and cluster |
JP2008210164A (ja) * | 2007-02-27 | 2008-09-11 | Fujitsu Ltd | Job management device, cluster system, and job management program |
JP2009086741A (ja) * | 2007-09-27 | 2009-04-23 | Hitachi Ltd | Distributed processing control method in a distributed environment with mixed heterogeneous nodes, and system and program therefor |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7370223B2 (en) * | 2000-09-08 | 2008-05-06 | Goahead Software, Inc. | System and method for managing clusters containing multiple nodes |
JP2002215474A (ja) * | 2001-01-15 | 2002-08-02 | Fujitsu Ten Ltd | Network data backup system |
JP2003060715A (ja) * | 2001-08-09 | 2003-02-28 | Fujitsu Ltd | OSI tunnel routing method and device |
JP2004118689A (ja) * | 2002-09-27 | 2004-04-15 | Ricoh Co Ltd | Monitoring system, monitoring method, and program |
US7636038B1 (en) * | 2003-02-25 | 2009-12-22 | Purdue Research Foundation | Fault-tolerant timeout communication protocol with sensor integration |
US7630299B2 (en) * | 2004-12-30 | 2009-12-08 | Alcatel Lucent | Retention of a stack address during primary master failover |
US7535828B2 (en) * | 2005-03-18 | 2009-05-19 | Cisco Technology, Inc. | Algorithm for backup PE selection |
CN101860559B (zh) * | 2009-04-08 | 2014-11-05 | 中兴通讯股份有限公司 | Resource information backup operation method based on a peer-to-peer network, and peer-to-peer network |
US8817638B2 (en) * | 2009-07-24 | 2014-08-26 | Broadcom Corporation | Method and system for network communications utilizing shared scalable resources |
US8848513B2 (en) * | 2009-09-02 | 2014-09-30 | Qualcomm Incorporated | Seamless overlay connectivity using multi-homed overlay neighborhoods |
-
2011
- 2011-01-26 JP JP2012554572A patent/JP5741595B2/ja not_active Expired - Fee Related
- 2011-01-26 WO PCT/JP2011/051517 patent/WO2012101785A1/ja active Application Filing
- 2011-01-26 EP EP11856713.0A patent/EP2669808A4/en not_active Withdrawn
-
2013
- 2013-07-26 US US13/951,526 patent/US20130308442A1/en not_active Abandoned
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006323526A (ja) * | 2005-05-17 | 2006-11-30 | Fujitsu Ltd | Cluster management program, recording medium recording the program, cluster management method, node, and cluster |
JP2008210164A (ja) * | 2007-02-27 | 2008-09-11 | Fujitsu Ltd | Job management device, cluster system, and job management program |
JP2009086741A (ja) * | 2007-09-27 | 2009-04-23 | Hitachi Ltd | Distributed processing control method in a distributed environment with mixed heterogeneous nodes, and system and program therefor |
Non-Patent Citations (1)
Title |
---|
See also references of EP2669808A4 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7462550B2 (ja) | 2020-12-24 | 2024-04-05 | 株式会社日立製作所 | Communication monitoring handling device, communication monitoring handling method, and communication monitoring handling system |
Also Published As
Publication number | Publication date |
---|---|
US20130308442A1 (en) | 2013-11-21 |
JP5741595B2 (ja) | 2015-07-01 |
JPWO2012101785A1 (ja) | 2014-06-30 |
EP2669808A1 (en) | 2013-12-04 |
EP2669808A4 (en) | 2016-07-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI813743B (zh) | Independent data storage in a network routing environment | |
US10713134B2 (en) | Distributed storage and replication system and method | |
US11228483B2 (en) | Data center resource tracking | |
US8676951B2 (en) | Traffic reduction method for distributed key-value store | |
CN109819004B (zh) | Method and system for deploying a multi-active data center | |
CN106059791B (zh) | Link switching method for a service in a storage system, and storage device | |
JP4715920B2 (ja) | Setting method and management device | |
JP5664662B2 (ja) | Management system, management device, management method, and management program | |
JP5741595B2 (ja) | Management device, management method, and management program | |
WO2012004872A1 (ja) | Management device, management program, and management method | |
JP5408359B2 (ja) | Management device, management program, and management method | |
Fakhouri et al. | Gulfstream-a system for dynamic topology management in multi-domain server farms | |
Wang et al. | Resource allocation for reliable communication between controllers and switches in SDN | |
CN109462642B (zh) | Data processing method and device | |
Bouget et al. | Polystyrene: the decentralized data shape that never dies | |
TWI839379B (zh) | Single-node and multi-node data storage architectures in a network routing environment | |
Gattermayer et al. | Using bootstraping principles of contemporary P2P file-sharing protocols in large-scale grid computing systems | |
Sattar et al. | Network Resiliency and big data applications | |
Cojocar | Replication location decisions | |
WO2016122495A1 (en) | Network switching node with state machine replication | |
CN105490903A (zh) | A cluster architecture based on a bus mode |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 11856713 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2012554572 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
REEP | Request for entry into the european phase |
Ref document number: 2011856713 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2011856713 Country of ref document: EP |