WO2012125167A1 - Self-organization of a satellite grid - Google Patents


Publication number
WO2012125167A1
Authority
WO
WIPO (PCT)
Prior art keywords
satellite
particular node
score
managed
rules
Prior art date
Application number
PCT/US2011/028840
Other languages
French (fr)
Inventor
Carsten Schlipf
Joern Schimmelpfeng
Stefan Bergstein
Original Assignee
Hewlett-Packard Development Company, L.P.
Priority date
Filing date
Publication date
Application filed by Hewlett-Packard Development Company, L.P.
Priority to US14/002,274 (US9438476B2)
Priority to PCT/US2011/028840 (WO2012125167A1)
Priority to CN2011800693461A (CN103444256A)
Priority to EP11861207.6A (EP2687061A4)
Publication of WO2012125167A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0876 Aspects of the degree of configuration automation
    • H04L41/0886 Fully automatic configuration
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/0813 Configuration setting characterised by the conditions triggering a change of settings
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/30 Decision processes by autonomous network management units using voting and bidding

Definitions

  • Figure 1 illustrates an example of a system for self-organizing a satellite grid according to the present disclosure.
  • Figure 2 illustrates a block diagram illustrating an example of a method for self-organizing a satellite grid according to the present disclosure.
  • Figure 3 illustrates a block diagram illustrating an example of a set of instructions to self-organize a satellite grid according to the present disclosure.
  • Figure 4 illustrates a block diagram of an example of a machine readable medium in communication with processor resources according to the present disclosure.
  • the present disclosure provides methods, machine readable media, and systems for self-organizing a satellite grid.
  • a list of a plurality of managed nodes and rules can be received by a first satellite.
  • a first claim score for a particular one of the plurality of managed nodes and the first satellite can be calculated according to the rules.
  • the first claim score can be compared with a second claim score for the particular node and a second satellite; and management of the particular node can be claimed based on the comparison.
  • a novel technique for self-organizing a satellite grid that employs satellites claiming management of nodes based on the first and second claim score calculated according to rules (e.g., node claim rules (NCRs)) can provide a management architecture that can adapt to changes in the satellite grid (e.g., an increase or decrease in the number of nodes, an increase or decrease in data events associated with nodes, an increase or decrease in the number of satellites, and failed satellites).
  • Although NCRs are just one example of rules, for ease of reference, "rules" will be referred to generically herein as NCRs.
  • the satellites can negotiate their configuration with each other and balance their load (e.g., management of nodes and data events sent, received, and forwarded by the nodes), minimizing communication overhead with no required user interaction.
  • FIG. 1 illustrates an example of a system 100 for self-organizing a satellite grid 102 according to the present disclosure.
  • the system can include a satellite grid 102.
  • the satellite grid 102 can be a network that includes a plurality of satellites 108-1, 108-2, 108-3, 108-4, 108-5, 108-6, and 108-7, referred to generally herein as satellites 108, as indicated by the triangles, and a plurality of managed nodes 110-1, 110-2, 110-3, 110-4, 110-5, 110-6, 110-7, 110-8, 110-9, referred to generally herein as managed nodes 110, as indicated by the small circles.
  • the satellite grid can include any number of satellites even though 14 satellites are shown in Figure 1.
  • the satellite grid can include any number of nodes even though 22 nodes are shown in Figure 1.
  • the plurality of satellites 108 in the satellite grid 102 can have equal rights and responsibilities of a certain domain. For example, if there is a failing satellite 108-6 in the plurality of satellites 108, a number of other satellites can take over the responsibilities of the failing satellite 108-6.
  • the plurality of satellites 108 can be connected to each other to allow for communication of information, for example, data events, associated with the plurality of managed nodes 110 that the plurality of satellites 108 manage. Data events include any information associated with services that the nodes provide.
  • the plurality of satellites 108 can be nodes that monitor the plurality of managed nodes 110 and provide management functionality over the nodes 110 that provide the service.
  • a satellite 108 can manage any number of nodes 110 and is not limited to managing 2 nodes, even though 2 nodes are shown in Figure 1.
  • the plurality of satellites 108 can monitor the plurality of managed nodes 110 by using agents deployed on the plurality of managed nodes 110.
  • Agents are software programs that can prepare information and exchange information on behalf of the plurality of satellites 108 and
  • When agents are used for monitoring the plurality of managed nodes 110, the agents can be installed on the plurality of managed nodes 110 and monitored by the plurality of satellites 108 such that a one-to-one relationship exists between the agent and a node and also a one-to-one relationship exists between the agent and a satellite.
  • the plurality of satellites 108 can also monitor the plurality of managed nodes 110 by using agent-less monitoring.
  • Agent-less monitoring allows the plurality of managed nodes 110 to be monitored without installing an agent on each of the plurality of managed nodes 110 or the plurality of satellites 108.
  • Agent-less monitoring can be performed by the plurality of satellites 108 using a broadcast ping.
  • a ping is a program used to test whether a particular node is online, by sending an Internet control message protocol (ICMP) echo request and waiting for a response.
  • Agent-less monitoring technology such as, for example, Hewlett Packard's HP SiteScope can be used to monitor the plurality of managed nodes 110.
  • Agent-less monitoring can also be performed through a port scan.
  • the port scan can send a request to the plurality of nodes 110 to test whether the plurality of nodes 110 are online based on a response received.
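
The port-based check described above can be sketched roughly as follows. This is a minimal illustration, assuming a plain TCP connection attempt counts as the "request" and a successful connection counts as the "response"; the function names, the default timeout, and the injectable `connect` parameter are hypothetical and are not taken from the disclosure.

```python
import socket


def is_online(host, port, timeout=1.0, connect=socket.create_connection):
    """Return True if a TCP connection to (host, port) succeeds.

    `connect` is injectable so the probe can be exercised without a network;
    by default it is the standard library's socket.create_connection.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Refused, unreachable, or timed out: treat the node as offline.
        return False
    conn.close()
    return True


def scan_nodes(hosts, port, **kwargs):
    """Map each node address to its online status for one port."""
    return {host: is_online(host, port, **kwargs) for host in hosts}
```

A satellite could call `scan_nodes` periodically over its list of managed node addresses; which port to probe is a deployment choice that the source does not specify.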
  • the plurality of satellites 108 can be grouped in a number of subnetworks 104.
  • the plurality of satellites 108 can be grouped by network topology (e.g., 104); however, the present disclosure is not limited to grouping satellites by a particular network topology.
  • the number of subnetworks 104 are a number of subdivisions of the network formed by the plurality of satellites 108 in the satellite grid 102.
  • a particular subnetwork 104 in the satellite grid 102 can include a domain manager 106 (e.g., satellite controller).
  • the domain manager 106 is a program which provides information (e.g., a list of the plurality of managed nodes and node claim rules) through the satellite grid 102 by using, for example, infrastructure management software such as HP Operations Manager i.
  • the domain manager 106 can provide a bridge between a user and the satellite grid 102 in the form of a user interface or web services.
  • a user can configure the plurality of satellites 108 and access and consolidate data associated with the plurality of satellites 108.
  • Although the domain manager 106 is shown connected to the plurality of satellites 108, it is not necessary for the domain manager 106 to be connected to the plurality of satellites 108 for the satellite grid 102 to function.
  • Each of the subnetworks 104 can further include a plurality of managed nodes 110.
  • the plurality of managed nodes 110 can be active devices attached to the network in the satellite grid 102 that can be managed by the plurality of satellites 108.
  • Examples of managed nodes include 110-1, 110-2, 110-3, 110-4, 110-5, 110-6, 110-7, 110-8, 110-9.
  • a virtual data center can be a network of virtual machines in a virtualized environment, where the virtual machines are software implementations of a physical machine (e.g., a computer) that execute programs like a physical machine.
  • the virtualized environment can be provided on a virtualization server.
  • the virtualization server can be a physical server or a computer that has a hypervisor running on it.
  • the hypervisor is software that provides a virtualized environment in which software, including operating systems, can run with the appearance of full access to underlying system hardware, but in fact such access is under control of the hypervisor.
  • a hypervisor such as VMware ESX can be used to provide the virtualized environment.
  • servers are configured to manage nodes and data events associated with the nodes. These techniques can use, for example, several servers, configured in a manager of managers structure.
  • a manager of managers structure can consist of, for example, several servers, where each server is configured with a different hierarchy. For example, a first, second, and third server may exist, where the third server forwards data events collected by the third server to the second server and the second server forwards data events collected by the second server from the third server to the first server. In this situation, each server instance must be configured and maintained separately.
  • Examples of the present disclosure can provide a scalable and flexible management system that does not require user interaction to configure management software for adapting to changes in the nodes and the data events associated with the nodes.
  • the plurality of satellites 108 can negotiate their configuration and balance their load with each other.
  • various examples of the present disclosure can provide a management architecture that enables self-configuration and balancing of loads of the plurality of satellites 108 with no need for user interaction.
  • the plurality of satellites 108 can include a first satellite 108-1 that can configure itself automatically after startup and connection to the satellite grid (e.g., the first satellite 108-1 has acquired an internet protocol (IP) address from a dynamic host configuration protocol (DHCP)) by using a discovery process, which can involve sending a user datagram protocol (UDP) broadcast request containing the type of the first satellite 108-1 and the IP address of the first satellite 108-1.
  • the discovery process can include using IP unicast or multicast to deliver a message or information (e.g., type of satellite and IP address) to a group of destination objects, in this case, the plurality of satellites 108.
  • the discovery process can be provided through services such as, for example, Jini, JGroup, or ZeroConf.
  • the domain manager 106 or one of the plurality of satellites 108 already connected to the satellite grid 102 will respond to this request with a connection attempt to the first satellite 108-1.
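
A discovery request of the kind described above might look like the following sketch. The source only says that the request carries the satellite's type and IP address over a UDP broadcast; the JSON encoding, the field names, the port number, and the function names here are all assumptions made for illustration.

```python
import json
import socket

DISCOVERY_PORT = 50000  # hypothetical port; the source does not name one


def build_discovery_request(satellite_type, ip_address):
    """Encode the satellite type and IP address as a UDP payload (assumed JSON)."""
    payload = {"type": satellite_type, "ip": ip_address}
    return json.dumps(payload).encode("utf-8")


def parse_discovery_request(data):
    """Decode a payload produced by build_discovery_request."""
    payload = json.loads(data.decode("utf-8"))
    return payload["type"], payload["ip"]


def broadcast_discovery(satellite_type, ip_address, port=DISCOVERY_PORT):
    """Send the request to the local broadcast address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # SO_BROADCAST must be enabled before sending to a broadcast address.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_discovery_request(satellite_type, ip_address),
                    ("255.255.255.255", port))
```

A domain manager or an already connected satellite listening on the same port would parse the payload and then open a connection back to the advertised IP address.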
  • the first satellite 108-1 can receive a list of the plurality of managed nodes 110 from the domain manager 106.
  • the first satellite 108-1 can also receive NCRs from the domain manager 106.
  • the first satellite 108-1 can be manually configured with the IP address of the domain manager 106 or one of the plurality of satellites 108 already connected to the satellite grid 102 so that the first satellite can establish contact without using a broadcast or multicast.
  • Manual configuration can be performed, for example, if the subnetwork 104 of the domain manager 106 or one of the plurality of satellites 108 cannot be contacted through the UDP broadcast. Manual configuration can also be performed, for example, for a virtualized environment where the domain manager 106 is always available with a static IP address and the plurality of satellites 108 are hosted on virtualized resources with a DHCP, which are started on demand only.
  • the system can assign a group of nodes 110-2, ..., 110-7 to a group of satellites 108-3, ..., 108-5 and apply the NCRs only to the group of nodes 110-2, ..., 110-7.
  • the IP address of each node in the group of nodes 110-2, ..., 110-7 can be assigned to each satellite in the group of satellites 108-3, ..., 108-5.
  • Each satellite in the group of satellites 108-3, ..., 108-5 will then apply the NCRs only to the group of nodes 110-2, ..., 110-7 whose IP addresses have been assigned to each satellite in the group of satellites 108-3, ..., 108-5.
  • the plurality of satellites 108 can run NCRs on the plurality of managed nodes 110 and not just the group of nodes 110-2, ..., 110-7.
  • the first satellite 108-1 can calculate a first claim score for a particular one of the plurality of managed nodes 110 and the first satellite 108-1 by executing a number of NCRs and compare the first claim score with a second claim score for the particular node 110-1 and a second satellite 108-2 that is managing the particular node 110-1.
  • a claim score includes a numerical value (e.g., -2, -1, 0, 2, 4, although examples are not so limited) that is calculated by a satellite (e.g., the first satellite 108-1 and/or second satellite 108-2), for a node (e.g., the particular node 110-1), according to the NCRs.
  • the claim score is calculated by summing individual numerical values that are returned upon execution of the NCRs.
  • the first and second claim score can be used to determine what satellite should claim management of the node. In an example, if the first claim score between the first satellite 108-1 and the particular node 110-1 is 4 and the second claim score between the second satellite 108-2 and the particular node 110-1 is 1, the first satellite 108-1 can claim management of the particular node 110-1 because the first claim score is greater than the second claim score.
  • NCRs include rules that take into account a number of factors. Upon execution, each NCR returns an individual numerical value (e.g., +2, +1, +0, -1, or -2, although these examples are not so limited) that is particular to a condition that can occur under each factor. Factors taken into account by the NCRs can include what subnetwork 104 range the particular node 110-1 is in, whether the particular node 110-1 is already managed by the plurality of satellites 108, and the average number of managed nodes managed by the plurality of satellites 108, although these examples are not so limited.
  • Each individual numerical value that is generated upon execution of the NCRs can contribute to the claim score. These individual numerical values can be summed to calculate the first and/or second claim score. In an example calculation of the first claim score between the first satellite 108-1 and the particular node 110-1, if the particular node 110-1 is not managed, an individual numerical value of +2 can be assigned to the particular node 110-1 by the NCRs. In this example, if the only factor considered was whether the particular node 110-1 is managed, the claim score would be 2. However, other factors can also be considered, such as what subnetwork 104 range the particular node 110-1 is in and the average number of managed nodes managed by the plurality of satellites 108.
  • an individual numerical value of +2 may be assigned to the particular node 110-1 based on its subnetwork range.
  • the first claim score can then be calculated by adding the individual numerical values, +2 and +2, which results in a claim score of 4. Further examples of factors that are accounted for by the NCRs in calculating the first and second claim scores are discussed herein.
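
The summation described above can be sketched as follows. This is only an illustration: the dictionary layout, the rule function names, and the 0 returned in the non-matching branches are assumptions; the +2 values for an unmanaged node and for a matching subnetwork range follow the worked example in the text.

```python
def unmanaged_rule(satellite, node):
    """+2 if no satellite manages the node (value from the example above)."""
    # The 0 for an already managed node is a placeholder assumption.
    return 2 if node.get("managed_by") is None else 0


def same_subnet_rule(satellite, node):
    """+2 if node and satellite share a subnetwork range (assumed condition)."""
    return 2 if node["subnet"] == satellite["subnet"] else 0


def claim_score(rules, satellite, node):
    """Sum the individual values returned by each node claim rule (NCR)."""
    return sum(rule(satellite, node) for rule in rules)


first_satellite = {"id": "108-1", "subnet": "192.168.1.0/24"}
particular_node = {"id": "110-1", "subnet": "192.168.1.0/24", "managed_by": None}
score = claim_score([unmanaged_rule, same_subnet_rule], first_satellite, particular_node)
print(score)  # 2 + 2 = 4
```

Modelling each NCR as an independent function keeps the rule set extensible: adding a factor means appending one more callable to the list.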
  • a factor that can contribute to the first and second claim scores can be the subnetwork 104 range of the particular node 110.
  • calculating the first claim score can include calculating the first claim score based on whether the particular node 110-1 is in the same subnetwork 104 range as the first satellite 108-1 in response to one of the NCRs.
  • the second claim score can be calculated based on the subnetwork 104 range of the particular node 110.
  • the second claim score can be calculated based on whether the particular node 110-1 is in the same subnetwork 104 range as the second satellite 108-2.
  • the subnetwork 104 range can be determined by a classful network, which is an addressing architecture that divides the address space for Internet Protocol Version 4 into five address classes.
  • the first three classes, classes A, B, and C which are used in examples herein, can define a network size of the subnetworks 104.
  • classes are identified herein as class A, class B, and class C and examples of three classes are given, examples are not so limited.
  • Individual numerical values can be assigned to each class (e.g., a value of -1 for class A, a value of +0 for class B, and a value of +2 for class C, although examples are not so limited), which can contribute to the first and second claim score. In an example, if a particular node 110 is in class C, an individual numerical value of +2 can be contributed to the first and/or second claim score.
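
As a rough sketch of the class-based scoring above, the class of an IPv4 address can be read from its first octet (0 to 127 for class A, 128 to 191 for class B, 192 to 223 for class C, with classes D and E above that). The values -1, +0, and +2 come from the example in the text; the function names and the 0 used for classes D and E are assumptions.

```python
import ipaddress

# Example values from the text; classes D and E are not scored there.
CLASS_VALUES = {"A": -1, "B": 0, "C": 2}


def address_class(ip):
    """Return the classful network class ('A' to 'E') of an IPv4 address."""
    first_octet = ipaddress.IPv4Address(ip).packed[0]
    if first_octet < 128:
        return "A"
    if first_octet < 192:
        return "B"
    if first_octet < 224:
        return "C"
    if first_octet < 240:
        return "D"
    return "E"


def class_value(ip):
    """Individual numerical value contributed by the node's address class."""
    # Falling back to 0 for classes D and E is an assumption.
    return CLASS_VALUES.get(address_class(ip), 0)
```

For instance, a node at 192.168.1.5 falls in class C and would contribute +2 under these example values.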
  • calculating the first claim score can include calculating the first claim score based on whether the particular node 110-1 is managed by the second satellite 108-2 in response to one of the NCRs. In a further example, if the particular node 110-1 is not managed by the second satellite 108-2, an individual numerical value of +2 can be contributed to the first claim score. Calculating the first claim score can further include calculating the first claim score based on the subnetwork 104 range of the second satellite 108-2 if the second satellite is managing the particular node 110-1 in response to one of the NCRs.
  • an individual numerical value of +0 can be contributed to the first claim score.
  • an individual numerical value of -1 can be contributed to the first claim score.
  • the second claim score can also be calculated based on whether the particular node is already managed by a satellite.
  • calculating the first claim score can include calculating the first claim score based on the average number of managed nodes managed by the plurality of satellites 108 in a same subnetwork 104 range as the first satellite 108-1 in response to one of the NCRs.
  • the number of managed nodes managed by the plurality of satellites 108 in the same subnetwork 104 range as the first satellite 108-1 can be calculated.
  • the average number of managed nodes managed by the plurality of satellites 108 in the same subnetwork 104 range as the first satellite 108-1 can then be determined.
  • an individual numerical value of +1 can be assigned to the node, thereby contributing an individual numerical value of +1 to the first claim score.
  • the second claim score can also be calculated based on the average number of managed nodes managed by the plurality of satellites 108.
  • the second claim score can be calculated based on an average number of managed nodes managed by the plurality of satellites 108 in a same subnetwork 104 range as the second satellite 108-2.
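
The average used by this factor can be computed as in the sketch below. The text above does not state the exact condition under which +1 is contributed, so the comparison in `load_rule` (the satellite manages fewer nodes than the average) is an assumption, as are the function names and the dictionary layout.

```python
def average_managed(managed_counts):
    """Average number of nodes managed per satellite in one subnetwork range.

    managed_counts maps a satellite id to the number of nodes it manages.
    """
    if not managed_counts:
        return 0.0
    return sum(managed_counts.values()) / len(managed_counts)


def load_rule(satellite_id, managed_counts):
    """+1 when the satellite is below the average load (assumed condition)."""
    below_average = managed_counts[satellite_id] < average_managed(managed_counts)
    return 1 if below_average else 0
```

With counts of 0, 3, and 3 nodes across three satellites, the average is 2.0, so only the idle satellite would receive the +1 under this assumed condition.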
  • the system 100 can resolve a conflict that arises when the first claim score (between the particular node 110-1 and the first satellite 108-1) is the same as the second claim score (between the particular node 110-1 and the second satellite 108-2). This can be done by determining a response time between the particular node 110-1 and the first satellite 108-1 and between the particular node 110-1 and the second satellite 108-2. For example, if the second satellite 108-2 that is managing the particular node 110-1 has a lower response time than the first satellite, then an individual numerical value of +1 can be added to the second claim score.
  • After the first satellite 108-1 has run the NCRs on all nodes managed by the satellite grid 102, it has calculated a claim score for each node that is individual to the first satellite 108-1.
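
The tie-break on response time can be sketched as follows. The text only gives the case where the second satellite responds faster; treating the first satellite symmetrically, and leaving both scores unchanged when the response times are also equal, are assumptions of this illustration.

```python
def resolve_tie(first_score, second_score, first_rtt, second_rtt):
    """Break a claim score tie by adding +1 for the lower response time.

    Scores that already differ are returned unchanged.
    """
    if first_score == second_score:
        if second_rtt < first_rtt:
            second_score += 1  # case described in the text
        elif first_rtt < second_rtt:
            first_score += 1   # symmetric case, assumed
    return first_score, second_score
```

For example, `resolve_tie(3, 3, 0.020, 0.005)` yields `(3, 4)`, so the second satellite keeps managing the node.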
  • the first satellite 108-1 now has the information to start claiming nodes.
  • the first satellite 108-1 can claim as many nodes as the average number of managed nodes managed by the plurality of satellites 108 in the same subnetwork range as the first satellite 108-1.
  • a comparison of the claim score with a claim score for the particular node 110-1 and a second satellite 108-2 can be performed. Based on the comparison of the claim score, the first satellite 108-1 can claim management of the particular node 110-1. As discussed herein, claiming management means that a satellite 108 will take
  • the first satellite can communicate a data event from the particular node 110-1 to the second satellite 108-2 when the data event will have an impact on the second satellite 108-2.
  • the system 100 can claim the particular node 110-1 to be managed by the first satellite 108-1 by instructing the second satellite 108-2 managing the particular node 110-1 to release the particular node 110-1.
  • the first satellite 108-1 can then claim responsibility of the particular node 110-1.
  • the first satellite 108-1 can reconfigure the agents to redirect messages to the first satellite 108-1.
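
The release-then-claim handoff described above might be modelled as in this sketch. The `Satellite` class, its method names, and the in-memory set of managed nodes are hypothetical; agent reconfiguration is only noted in a comment because the source does not describe how it is done.

```python
class Satellite:
    """Toy model of a satellite that tracks which nodes it manages."""

    def __init__(self, name):
        self.name = name
        self.managed = set()

    def release(self, node):
        """Stop managing a node when instructed by a claiming satellite."""
        self.managed.discard(node)

    def claim(self, node, my_score, owner=None, owner_score=None):
        """Claim a node if it is unmanaged or this satellite's score is greater."""
        if owner is not None and my_score <= owner_score:
            return False
        if owner is not None:
            owner.release(node)  # instruct the current manager to release
        self.managed.add(node)
        # A real satellite would now reconfigure the node's agent so that
        # messages are redirected to it (not modelled here).
        return True
```

Using the scores from the earlier example, a first satellite with score 4 takes the node from a second satellite with score 1, while a claim with an equal or lower score is refused.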
  • a problem with self-organization of a satellite grid 102 can exist when one of the plurality of the satellites 108 fails.
  • a failing satellite is represented by satellite 108-6 in Figure 1.
  • a plurality of managed nodes 110-8 and 110-9 managed by the failing satellite 108-6 can be marked as unmanaged by the plurality of satellites 108.
  • the plurality of satellites 108 in the satellite grid 102 can have failover capability and can poll between each other to detect the failing satellite 108-6.
  • the first satellite 108-1 can detect a failing satellite 108-6 based on the polling of the failing satellite 108-6.
  • Failover capability can allow the first satellite 108-1 to switch over for the failing satellite 108-6 automatically when a failing satellite is detected.
  • When the plurality of satellites 108 in the satellite grid 102 poll between each other, the plurality of satellites 108 can periodically check for a heartbeat on each other. As long as a heartbeat is detected, the plurality of satellites 108 will not initiate systems to take over for a particular one of the plurality of satellites 108. If an alteration in the heartbeat is detected in the failing satellite 108, at least one of the plurality of satellites 108 can immediately take over for the failing satellite 108.
  • the detection of the failing satellite can result in marking the plurality of managed nodes 110-8 and 110-9 managed by the failing satellite 108-6 as not managed.
  • the first satellite 108-1 can then run the NCRs on the first satellite 108-1 when the plurality of managed nodes 110-8 and 110-9 managed by the failing satellite 108-6 are marked as not managed on the first satellite.
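
The heartbeat check and the marking of orphaned nodes can be sketched as follows. Treating a missing heartbeat for longer than a fixed timeout as the "alteration" is an assumption, as are the function names and data layout; the source does not define how an altered heartbeat is recognized.

```python
def find_failed(last_heartbeat, now, timeout):
    """Return satellites whose last heartbeat is older than `timeout` seconds.

    last_heartbeat maps a satellite id to the time its heartbeat was last seen.
    """
    return sorted(s for s, seen in last_heartbeat.items() if now - seen > timeout)


def mark_unmanaged(assignments, failed):
    """Return a copy of node -> satellite assignments with failed owners cleared."""
    failed = set(failed)
    return {node: (None if owner in failed else owner)
            for node, owner in assignments.items()}
```

Once nodes such as 110-8 and 110-9 are marked as unmanaged (`None` here), the surviving satellites can rerun the NCRs, and an unmanaged-node rule would raise their claim scores for those nodes.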
  • the plurality of satellites 108 can also poll between each other to detect whether a data event from the particular node 110-1 managed by the second satellite 108-2 will have an impact on the first satellite 108-1.
  • the second satellite 108-2 can determine if a data event from the particular node 110-1 managed by the second satellite 108-2 will have an impact on the first satellite 108-1.
  • a data event from a particular node 110-1 managed by the second satellite 108-2 can have an impact on the first satellite 108-1 when data events sent, received, and forwarded by the particular node 110-1 managed by the second satellite 108-2 are related to data events that are managed by the first satellite 108-1.
  • the first satellite 108-1 may be interested in Oracle-related events from a particular node 110-1 managed by the second satellite 108-2. If the second satellite 108-2 determines that a data event from the particular node 110-1 managed by the second satellite 108-2 will have an impact on the first satellite 108-1, the second satellite 108-2 communicates the event to the first satellite 108-1. In an example, the first satellite 108-1 can then claim management of the particular node 110-1.
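
One simple way to decide whether an event "has an impact" on another satellite is a topic match, sketched below. Representing interest as a set of topic strings per satellite, and the names `satellites_to_notify`, `interests`, and `topic`, are assumptions; the source does not say how relatedness between events is determined.

```python
def satellites_to_notify(event, owner, interests):
    """Return the satellites, other than the owner, interested in the event.

    interests maps a satellite id to the set of event topics it cares about.
    """
    return sorted(
        satellite
        for satellite, topics in interests.items()
        if satellite != owner and event["topic"] in topics
    )
```

For an Oracle-related event on a node managed by satellite 108-2, a satellite 108-1 that has registered interest in the "oracle" topic would be returned and could then receive the forwarded event.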
  • a problem with self-organization of a satellite grid 102 can exist when the number of data events increases to a point such that a satellite managing nodes associated with the data events becomes overloaded and cannot manage the data events.
  • This problem can be offset by cloning the second satellite 108-2, wherein a new satellite 108-7 is created, when a plurality of data events from the particular node 110-1 managed by the second satellite 108-2 exceed a predefined threshold.
  • the second satellite can be cloned when the satellite (e.g., second satellite 108-2) is running in a virtualized environment.
  • the second satellite 108-2 can instruct a virtualization server to clone the second satellite 108-2 to form the new satellite 108-7 and start the new satellite 108-7 on the same virtualization server or on a different virtualization server.
  • the second satellite 108-2 and the new satellite 108-7 can distribute the data events by running the NCRs on the second satellite 108-2 and the new satellite 108-7.
  • the data events can be redistributed by running the NCRs on the plurality of satellites 108, including the new satellite 108-7.
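
The threshold-triggered cloning can be sketched as below. The `clone_fn` callback stands in for whatever call the virtualization server actually exposes; it, the function name, and the strict greater-than comparison are assumptions for illustration only.

```python
def maybe_clone(satellite_id, event_count, threshold, clone_fn):
    """Clone the satellite when its data events exceed the predefined threshold.

    clone_fn is a hypothetical callable that asks a virtualization server to
    clone and start the satellite, returning the new satellite's id.
    """
    if event_count > threshold:
        return clone_fn(satellite_id)
    return None
```

After a clone is created, the original and the new satellite would rerun the NCRs so that nodes and their data events are divided between them.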
  • Figure 2 illustrates a block diagram illustrating an example of a method for self-organizing a satellite grid according to the present disclosure.
  • the method includes receiving 260, with a first satellite, a list of the plurality of managed nodes.
  • the method includes receiving 262, with the first satellite, NCRs.
  • the method includes calculating 264 a claim score for a particular one of the plurality of managed nodes and the first satellite according to the NCRs.
  • the method includes comparing 266 the claim score with a claim score for the particular node and a second satellite that is managing the particular node.
  • the method also includes claiming 268 management of the particular node based on the comparison.
  • Figure 3 illustrates a block diagram illustrating an example of instructions to self-organize a satellite grid according to the present disclosure.
  • the medium can receive 370, with a first satellite, a list of a plurality of managed nodes.
  • the medium can receive 372, with the first satellite, node claim rules (NCRs).
  • the medium can calculate 374 a claim score for a particular one of the plurality of managed nodes and the first satellite according to the NCRs.
  • the medium can compare 376 the claim score with a claim score for the particular node and a second satellite that is managing the particular node.
  • the medium can also claim 378 management of the particular node based on the comparison, wherein the first satellite instructs the second satellite to release the particular node.
  • FIG. 4 illustrates a block diagram of an example of a machine readable medium 480 in communication with processor resources 484 according to the present disclosure.
  • a machine e.g., a computing device
  • processor 484 resources can include one or a plurality of processors such as in a parallel processing system.
  • the machine readable medium 480 can include volatile and/or non-volatile memory such as random access memory (RAM), magnetic memory such as a hard disk, floppy disk, and/or tape memory, a solid state drive (SSD), flash memory, phase change memory, etc.
  • the MRM 480 can be in communication with the processor 484 resources via a communication path 482.
  • the communication path 482 can be local or remote to a machine associated with the processor 484 resources. Examples of a local communication path 482 can include an electronic bus internal to a machine such as a computer where the MRM 480 is one of volatile, non-volatile, fixed, and/or removable storage medium in
  • Examples of such electronic buses can include Industry Standard Architecture (ISA), Peripheral Component Interconnect (PCI), Advanced Technology Attachment (ATA), Small Computer System Interface (SCSI), Universal Serial Bus (USB), among other types of electronic buses and variants thereof.
  • the communication path 482 can be such that the MRM 480 is remote from the processor 484 resources such as in the example of a network connection between the MRM 480 and the processor 484 resources (e.g., the communication path 482 can be a network
  • the MRM 480 may be associated with a first machine (e.g., a server) and the processor 484 resources may be associated with a second machine (e.g., a computing device).
  • the first and second machines can be in communication via a networked communication path 482.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Automation & Control Theory (AREA)
  • Radio Relay Systems (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

Methods, machine-readable media, and systems are provided for self-organization of a satellite grid (102). One method for self-organization of a satellite grid (102) includes receiving, with a first satellite (108-1), a list of a plurality of managed nodes (110-1, 110-2, 110-3, 110-4, 110-5, 110-6, 110-7, 110-8, 110-9), receiving, with the first satellite (108-1), rules, calculating a first claim score for a particular one of the plurality of managed nodes (110-1) and the first satellite (108-1) according to the rules, comparing the first claim score with a second claim score for the particular node (110-1) and a second satellite (108-2) that is managing the particular node (110-1), and claiming management of the particular node (110-1) based on the comparison.

Description

SELF-ORGANIZATION OF A SATELLITE GRID
Background
[0001] Today, businesses are starting to transform their information technology (IT) environments to cloud computing, endeavoring to provide for flexibility and scalability of services. Rather than hosting all services on dedicated servers, virtualized machines can be used to host services. Hosting these services on virtualized machines, however, can pose challenges related to management of information associated with the services because the amount of information associated with the services can increase or decrease rapidly.
Brief Description of the Drawings
[0002] Figure 1 illustrates an example of a system for self-organizing a satellite grid according to the present disclosure.
[0003] Figure 2 illustrates a block diagram illustrating an example of a method for self-organizing a satellite grid according to the present disclosure.
[0004] Figure 3 illustrates a block diagram illustrating an example of a set of instructions to self-organize a satellite grid according to the present disclosure.
[0005] Figure 4 illustrates a block diagram of an example of a machine readable medium in communication with processor resources according to the present disclosure.
Detailed Description
[0006] The present disclosure provides methods, machine readable media, and systems for self-organizing a satellite grid. A list of a plurality of managed nodes and rules can be received by a first satellite. A first claim score for a particular one of the plurality of managed nodes and the first satellite can be calculated according to the rules. The first claim score can be compared with a second claim score for the particular node and a second satellite; and management of the particular node can be claimed based on the comparison.
[0007] A novel technique for self-organizing a satellite grid that employs satellites claiming management of nodes based on the first and second claim score calculated according to rules (e.g., node claim rules (NCRs)) can provide a management architecture that can adapt to changes in the satellite grid (e.g., an increase or decrease in the number of nodes, an increase or decrease in data events associated with nodes, an increase or decrease in the number of satellites, and failed satellites). Although NCRs are just one example of rules, for ease of reference, "rules" will be referred to generically herein as NCRs. The satellites can negotiate their configuration with each other and balance their load (e.g., management of nodes and data events sent, received, and forwarded by the nodes), minimizing communication overhead with no required user interaction.
[0008] In the following detailed description of the present disclosure, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration how examples of the disclosure may be practiced. These examples are described in sufficient detail to enable those of ordinary skill in the art to practice the embodiments of this disclosure, and it is to be understood that other examples may be utilized and that process, electrical, and/or structural changes may be made without departing from the scope of the present disclosure.
[0009] The figures herein follow a numbering convention in which the first digit or digits correspond to the drawing figure number and the remaining digits identify an element or component in the drawing. Elements shown in the various figures herein can be added, exchanged, and/or eliminated so as to provide a number of additional examples of the present disclosure. In addition, the proportion and the relative scale of the elements provided in the figures are intended to illustrate the examples of the present disclosure, and should not be taken in a limiting sense.
[0010] Figure 1 illustrates an example of a system 100 for self-organizing a satellite grid 102 according to the present disclosure. The system can include a satellite grid 102. The satellite grid 102 can be a network that includes a plurality of satellites 108-1, 108-2, 108-3, 108-4, 108-5, 108-6, and 108-7, referred to generally herein as satellites 108, as indicated by the triangles, and a plurality of managed nodes 110-1, 110-2, 110-3, 110-4, 110-5, 110-6, 110-7, 110-8, 110-9, referred to generally herein as managed nodes 110, as indicated by the small circles. The satellite grid can include any number of satellites even though 14 satellites are shown in Figure 1. Further, the satellite grid can include any number of nodes even though 22 nodes are shown in Figure 1. The plurality of satellites 108 in the satellite grid 102 can have equal rights and responsibilities of a certain domain. For example, if there is a failing satellite 108-6, in the plurality of satellites 108, a number of other satellites can take over the responsibilities of the failing satellite 108-6.
[0011] The plurality of satellites 108 can be connected to each other to allow for communication of information, for example, data events, associated with the plurality of managed nodes 110 that the plurality of satellites 108 manage. Data events include any information associated with services that the nodes provide. The plurality of satellites 108 can be nodes that monitor the plurality of managed nodes 110 and provide management functionality over the nodes 110 that provide the service. A satellite 108 can manage any number of nodes 110 and is not limited to managing 2 nodes, even though 2 nodes are shown in Figure 1.
[0012] The plurality of satellites 108 can monitor the plurality of managed nodes 110 by using agents deployed on the plurality of managed nodes 110. Agents are software programs that can prepare information and exchange information on behalf of the plurality of satellites 108 and communicate with other agents to collectively perform these tasks. When agents are used for monitoring the plurality of managed nodes 110, the agents can be installed on the plurality of managed nodes 110 and monitored by the plurality of satellites 108 such that a one to one relationship exists between the agent and a node and also a one to one relationship exists between the agent and a satellite.
[0013] The plurality of satellites 108 can also monitor the plurality of managed nodes 110 by using agent-less monitoring. Agent-less monitoring allows the plurality of managed nodes 110 to be monitored without installing an agent on each of the plurality of managed nodes 110 or the plurality of satellites 108. Agent-less monitoring can be performed by the plurality of satellites 108 using a broadcast ping. A ping is a program used to test whether a particular node is online, by sending an Internet control message protocol (ICMP) echo request and waiting for a response. Agent-less monitoring technology, such as, for example, Hewlett Packard's HP SiteScope can be used to monitor the plurality of managed nodes 110. Agent-less monitoring can also be performed through a port scan. The port scan can send a request to the plurality of nodes 110 to test whether the plurality of nodes 110 are online based on a response received.
[0014] The plurality of satellites 108 can be grouped in a number of subnetworks 104. The plurality of satellites 108 can be grouped by network topology (e.g., 104); however, the present disclosure is not limited to grouping satellites by a particular network topology. The number of subnetworks 104 are a number of subdivisions of the network formed by the plurality of satellites 108 in the satellite grid 102. A particular subnetwork 104 in the satellite grid 102 can include a domain manager 106 (e.g., satellite controller). The domain manager 106 is a program which provides information (e.g., a list of the plurality of managed nodes and node claim rules) through the satellite grid 102 by using, for example, infrastructure management software such as HP Operations Manager i. The domain manager 106 can provide a bridge between a user and the satellite grid 102 in the form of a user interface or web services. By using the domain manager 106, a user can configure the plurality of satellites 108 and access and consolidate data associated with the plurality of satellites 108. Although the domain manager 106 is shown connected to the plurality of satellites 108, it is not necessary for the domain manager 106 to be connected to the plurality of satellites 108 for the satellite grid 102 to function. Each of the subnetworks 104 can further include a plurality of managed nodes 110. The plurality of managed nodes 110 can be active devices attached to the network in the satellite grid 102 that can be managed by the plurality of satellites 108. Examples of managed nodes include 110-1, 110-2, 110-3, 110-4, 110-5, 110-6, 110-7, 110-8, 110-9.
[0015] Previous management techniques are manually configured and slow to address the changing needs of virtual data centers. A virtual data center can be a network of virtual machines in a virtualized environment, where the virtual machines are software implementations of a physical machine (e.g., a computer) that execute programs like a physical machine. The virtualized environment can be provided on a virtualization server. In an example, the virtualization server can be a physical server or a computer that has a hypervisor running on it. The hypervisor is software that provides a virtualized environment in which software, including operating systems, can run with the appearance of full access to underlying system hardware, but in fact such access is under control of the hypervisor. In an example, a hypervisor such as VMware ESX can be used to provide the virtualized environment.
[0016] Several techniques exist where servers are configured to manage nodes and data events associated with the nodes. These techniques can use, for example, several servers, configured in a manager of managers structure. A manager of managers structure can consist of, for example, several servers, where each server is configured with a different hierarchy. For example, a first, second, and third server may exist, where the third server forwards data events collected by the third server to the second server and the second server forwards data events collected by the second server from the third server to the first server. In this situation, each server instance must be configured and maintained separately.
[0017] This technique does not address the requirements associated with the changing needs of virtual data centers (e.g., an increase of services brought online, which allows for an increase in data events associated with the services). Therefore, challenges arise in managing nodes and the data events associated with the nodes because of the changing needs of virtual data centers.
[0018] Examples of the present disclosure can provide a scalable and flexible management system that does not require user interaction to configure management software for adapting to changes in the nodes and the data events associated with the nodes. The plurality of satellites 108 can negotiate their configuration and balance their load with each other. Thus, various examples of the present disclosure can provide a management architecture that enables self-configuration and balancing of loads of the plurality of satellites 108 with no need for user interaction.
[0019] The plurality of satellites 108 can include a first satellite 108-1 that can configure itself automatically after startup and connection to the satellite grid (e.g., the first satellite 108-1 has acquired an internet protocol (IP) address from a dynamic host configuration protocol (DHCP)) by using a discovery process, which can involve sending a user datagram protocol (UDP) broadcast request containing the type of the first satellite 108-1 and the IP address of the first satellite 108-1. The discovery process can include using IP unicast or multicast to deliver a message or information (e.g., type of satellite and IP address) to a group of destination objects, in this case, the plurality of satellites 108. The discovery process can be provided through services such as, for example, Jini, JGroup, or ZeroConf. The domain manager 106 or one of the plurality of satellites 108 already connected to the satellite grid 102 will respond to this request with a connection attempt to the first satellite 108-1. Upon connection, the first satellite 108-1 can receive a list of the plurality of managed nodes 110 from the domain manager 106. The first satellite 108-1 can also receive NCRs from the domain manager 106.
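The discovery request described in paragraph [0019] can be sketched as follows. This is only an illustration under stated assumptions: the disclosure does not specify a wire format or port, so the JSON encoding, the field names `type` and `ip`, the port number, and all function names here are hypothetical.

```python
import json
import socket

DISCOVERY_PORT = 50000  # hypothetical port; not specified in the disclosure


def build_discovery_request(satellite_type: str, ip_address: str) -> bytes:
    """Encode the satellite type and IP address (JSON is an assumed format)."""
    return json.dumps({"type": satellite_type, "ip": ip_address}).encode("utf-8")


def parse_discovery_request(payload: bytes) -> dict:
    """Decode a request on the receiving side (domain manager or satellite)."""
    return json.loads(payload.decode("utf-8"))


def broadcast_discovery(payload: bytes, port: int = DISCOVERY_PORT) -> None:
    """Send the request as a UDP broadcast on the local network."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(payload, ("255.255.255.255", port))
```

A receiver that parses such a request would then attempt a connection to the announced IP address, after which the list of managed nodes and the NCRs can be delivered.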
[0020] In an example, the first satellite 108-1 can be manually configured with the IP address of the domain manager 106 or one of the plurality of satellites 108 already connected to the satellite grid 102 so that the first satellite can establish contact without using a broadcast or multicast. Manual configuration can be performed, for example, if the subnetwork 104 of the domain manager 106 or one of the plurality of satellites 108 cannot be contacted through the UDP broadcast. Manual configuration can also be performed, for example, for a virtualized environment where the domain manager 106 is always available with a static IP address and the plurality of satellites 108 are hosted on virtualized resources with a DHCP, which are started on demand only.
[0021] If manual configuration is performed, the system can assign a group of nodes 110-2, ..., 110-7 to a group of satellites 108-3, ..., 108-5 and apply the NCRs only to the group of nodes 110-2, ..., 110-7. In this case, the IP address of each node in the group of nodes 110-2, ..., 110-7 can be assigned to each satellite in the group of satellites 108-3, ..., 108-5. Each satellite in the group of satellites 108-3, ..., 108-5 will then apply the NCRs only to the group of nodes 110-2, ..., 110-7 whose IP addresses have been assigned to each satellite in the group of satellites 108-3, ..., 108-5. If manual configuration is not performed, by default, the plurality of satellites 108 can run NCRs on the plurality of managed nodes 110 and not just the group of nodes 110-2, ..., 110-7.
[0022] The first satellite 108-1 can calculate a first claim score for a particular one of the plurality of managed nodes 110 and the first satellite 108-1 by executing a number of NCRs and compare the first claim score with a second claim score for the particular node 110-1 and a second satellite 108-2 that is managing the particular node 110-1. A claim score, as used herein, includes a numerical value (e.g., -2, -1, 0, 2, 4, although examples are not so limited) that is calculated by a satellite (e.g., the first satellite 108-1 and/or second satellite 108-2) for a node (e.g., the particular node 110-1) according to the NCRs. As discussed herein, the claim score is calculated by summing individual numerical values that are returned upon execution of the NCRs.
[0023] The first and second claim scores can be used to determine which satellite should claim management of the node. In an example, if the first claim score between the first satellite 108-1 and the particular node 110-1 is 4 and the second claim score between the second satellite 108-2 and the particular node 110-1 is 1, the first satellite 108-1 can claim management of the particular node 110-1 because the first claim score is greater than the second claim score.
[0024] NCRs, as used herein, include rules that take into account a number of factors. Upon execution, each NCR returns an individual numerical value (e.g., +2, +1, +0, -1, or -2, although these examples are not so limited) that is particular to a condition that can occur under each factor. Factors taken into account by the NCRs can include what subnetwork 104 range the particular node 110-1 is in, whether the particular node 110-1 is already managed by the plurality of satellites 108, and the average number of managed nodes managed by the plurality of satellites 108, although these examples are not so limited.
[0025] Each individual numerical value that is generated upon execution of the NCRs can contribute to the claim score. These individual numerical values can be summed to calculate the first and/or second claim score. In an example calculation of the first claim score between the first satellite 108-1 and the particular node 110-1, if the particular node 110-1 is not managed, an individual numerical value of +2 can be assigned to the particular node 110-1 by the NCRs. In this example, if the only factor considered was if the particular node 110-1 is managed, the claim score would be 2. However, other factors can also be considered, such as what subnetwork 104 range the particular node 110-1 is in and the average number of managed nodes managed by the plurality of satellites 108. For example, an individual numerical value of +2 may be assigned to the particular node 110-1 based on its subnetwork range. The first claim score can then be calculated by adding the individual numerical values, +2 and +2, which results in a claim score of 4. Further examples of factors that are accounted for by the NCRs in calculating the first and second claim scores are discussed herein.
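The summation in paragraph [0025] can be sketched as a list of rule functions whose return values are added together. The dictionary layout, the rule names, and the condition under which the subnetwork rule returns +2 are assumptions made for illustration; the disclosure only states that each NCR returns an individual numerical value and that the values are summed.

```python
from typing import Callable, Iterable

# A rule takes (satellite, node) and returns an individual numerical value.
Rule = Callable[[dict, dict], int]


def unmanaged_rule(satellite: dict, node: dict) -> int:
    """+2 when the node is not managed by any satellite (value from [0025])."""
    return 2 if node.get("managed_by") is None else 0


def subnetwork_rule(satellite: dict, node: dict) -> int:
    """+2 based on subnetwork range; the same-subnetwork test is an assumption."""
    return 2 if node["subnet"] == satellite["subnet"] else 0


def claim_score(satellite: dict, node: dict, rules: Iterable[Rule]) -> int:
    """Sum the individual numerical values returned by each rule."""
    return sum(rule(satellite, node) for rule in rules)


first_satellite = {"name": "108-1", "subnet": "192.168.1"}
particular_node = {"name": "110-1", "subnet": "192.168.1", "managed_by": None}
score = claim_score(first_satellite, particular_node,
                    [unmanaged_rule, subnetwork_rule])
print(score)  # 2 + 2 = 4, matching the worked example in [0025]
```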
[0026] As discussed herein, a factor that can contribute to the first and second claim scores can be the subnetwork 104 range of the particular node 110. For example, calculating the first claim score can include calculating the first claim score based on whether the particular node 110-1 is in the same subnetwork 104 range as the first satellite 108-1 in response to one of the NCRs. In this example, the second claim score can be calculated based on the subnetwork 104 range of the particular node 110. For example, the second claim score can be calculated based on whether the particular node 110-1 is in the same subnetwork 104 range as the second satellite 108-2.
[0027] The subnetwork 104 range can be determined by a classful network, which is an addressing architecture that divides the address space for Internet Protocol Version 4 into five address classes. The first three classes, classes A, B, and C, which are used in examples herein, can define a network size of the subnetworks 104. Although the classes are identified herein as class A, class B, and class C and examples of three classes are given, examples are not so limited. Individual numerical values can be assigned to each class (e.g., a value of -1 for class A, a value of +0 for class B, and a value of +2 for class C, although examples are not so limited), which can contribute to the first and second claim score. In an example, if a particular node 110 is in class C, an individual numerical value of +2 can be contributed to the first and/or second claim score.
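As an illustration of paragraph [0027], the address class of an IPv4 address can be derived from its first octet under classful addressing, and the example values (-1, +0, +2) can then be looked up. The function names are hypothetical, and returning 0 for classes D and E is an assumption, since the disclosure assigns values only to classes A, B, and C.

```python
def address_class(ip: str) -> str:
    """Return the classful IPv4 address class (A to E) from the first octet."""
    first_octet = int(ip.split(".")[0])
    if first_octet < 128:
        return "A"
    if first_octet < 192:
        return "B"
    if first_octet < 224:
        return "C"
    if first_octet < 240:
        return "D"
    return "E"


# Example values from [0027]; other value assignments are possible.
CLASS_VALUES = {"A": -1, "B": 0, "C": 2}


def class_value(ip: str) -> int:
    """Individual numerical value contributed by the node's address class."""
    return CLASS_VALUES.get(address_class(ip), 0)  # 0 for D/E is an assumption
```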
[0028] As discussed herein, another factor that can contribute to the first and second claim scores can be whether the particular node 110-1 is already managed by a satellite 108. For example, calculating the first claim score can include calculating the first claim score based on whether the particular node 110-1 is managed by the second satellite 108-2 in response to one of the NCRs. In a further example, if the particular node 110-1 is not managed by the second satellite 108-2, an individual numerical value of +2 can be contributed to the first claim score. Calculating the first claim score can further include calculating the first claim score based on the subnetwork 104 range of the second satellite 108-2 if the second satellite is managing the particular node 110-1 in response to one of the NCRs. If the particular node 110-1 is managed by the second satellite 108-2 and the second satellite 108-2 is in the same subnetwork 104 as the first satellite 108-1, an individual numerical value of +0 can be contributed to the first claim score. In a further example, if the particular node 110-1 is managed by the second satellite 108-2 and the second satellite 108-2 is in a different subnetwork 104 than the first satellite 108-1, an individual numerical value of -1 can be contributed to the first claim score. The second claim score can also be calculated based on whether the particular node is already managed by a satellite.
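The three cases in paragraph [0028] can be sketched as a single rule function. The function name and the use of `None` to represent an unmanaged node are assumptions for illustration; the returned values (+2, +0, -1) are the example values given above.

```python
from typing import Optional


def managed_state_value(first_satellite_subnet: str,
                        manager_subnet: Optional[str]) -> int:
    """Value contributed to the first claim score by the node's managed state.

    manager_subnet is the subnetwork of the satellite currently managing the
    node, or None when the node is not managed (an assumed representation).
    """
    if manager_subnet is None:
        return 2   # node is not managed
    if manager_subnet == first_satellite_subnet:
        return 0   # managed by a satellite in the same subnetwork
    return -1      # managed by a satellite in a different subnetwork
```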
[0029] As discussed herein, another factor that can contribute to the first and second claim scores can consider an average number of managed nodes managed by the plurality of satellites 108. For example, calculating the first claim score can include calculating the first claim score based on the average number of managed nodes managed by the plurality of satellites 108 in a same subnetwork 104 range as the first satellite 108-1 in response to one of the NCRs. In determining the first claim score, the number of managed nodes managed by the plurality of satellites 108 in the same subnetwork 104 range as the first satellite 108-1 can be calculated. The average number of managed nodes managed by the plurality of satellites 108 in the same subnetwork 104 range as the first satellite 108-1 can then be determined. For a node that is managed by a satellite that is managing a number of nodes exceeding the average number of nodes managed by the plurality of satellites 108 in the same subnetwork 104 range as the first satellite 108-1, an individual numerical value of +1, for example, can be assigned to the node; thereby contributing an individual numerical value of +1 to the first claim score.
[0030] The second claim score can also be calculated based on the average number of managed nodes managed by the plurality of satellites 108. For example, the second claim score can be calculated based on an average number of managed nodes managed by the plurality of satellites 108 in a same subnetwork 104 range as the second satellite 108-2.
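The average-load factor of paragraphs [0029] and [0030] can be sketched as follows. The mapping from satellite name to node count, the function names, and the value 0 for a satellite at or below the average are assumptions; the disclosure gives only the +1 example for a satellite exceeding the average.

```python
def average_managed(node_counts: dict) -> float:
    """Average number of nodes managed per satellite in one subnetwork range."""
    if not node_counts:
        return 0.0
    return sum(node_counts.values()) / len(node_counts)


def load_value(manager: str, node_counts: dict) -> int:
    """+1 when the node's current manager exceeds the average load, else 0."""
    return 1 if node_counts[manager] > average_managed(node_counts) else 0


# Hypothetical node counts for satellites in the same subnetwork range.
counts = {"108-1": 1, "108-2": 5, "108-3": 3}
```

With these counts the average is 3, so a node managed by satellite 108-2 would contribute +1 to the first claim score, while a node managed by satellite 108-3 would contribute 0 under the assumed default.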
[0031] The system 100 can resolve a conflict that is created when calculating the first and second claim scores results in a same first and second score between the particular node 110-1 and the first satellite 108-1 and the particular node 110-1 and the second satellite 108-2. This can be done by determining a response time between the particular node 110-1 and the first satellite 108-1 and the particular node 110-1 and the second satellite 108-2. For example, if the second satellite 108-2 that is managing the particular node 110-1 has a lower response time than the first satellite, then an individual numerical value of +1 can be added to the second claim score.
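A minimal sketch of the tie-break in paragraph [0031] is shown below. The disclosure describes only the case where the second satellite has the lower response time; treating the opposite case symmetrically (adding +1 to the first claim score) is an assumption, as are the function name and the millisecond units.

```python
from typing import Tuple


def resolve_tie(first_score: int, second_score: int,
                first_response_ms: float,
                second_response_ms: float) -> Tuple[int, int]:
    """Break a tie between equal claim scores using response times."""
    if first_score != second_score:
        return first_score, second_score       # no conflict to resolve
    if second_response_ms < first_response_ms:
        return first_score, second_score + 1   # case described in [0031]
    if first_response_ms < second_response_ms:
        return first_score + 1, second_score   # symmetric case (assumption)
    return first_score, second_score           # still tied; left unresolved
```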
[0032] After the first satellite 108-1 has run the NCRs on all nodes managed by the satellite grid 102, it has calculated a claim score for each node that is individual to the first satellite 108-1. The first satellite 108-1 now has the information to start claiming nodes. In an example, the first satellite 108-1 can claim as many nodes as the average number of managed nodes managed by the plurality of satellites 108 in the same subnetwork range as the first satellite 108-1.
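One way to picture the claiming step in paragraph [0032] is to select candidate nodes up to a quota equal to the average load. The disclosure does not say in which order nodes are claimed; picking the highest claim scores first and breaking ties by node name are assumptions for this sketch, as is the function name.

```python
def select_claims(scores: dict, quota: int) -> list:
    """Pick up to `quota` nodes, highest claim score first (assumed ordering)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [node for node, _ in ranked[:max(quota, 0)]]


# Hypothetical per-node claim scores calculated by the first satellite.
node_scores = {"110-1": 4, "110-2": 1, "110-3": 3}
claims = select_claims(node_scores, quota=2)
print(claims)  # ['110-1', '110-3']
```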
[0033] As discussed herein, a comparison of the claim score with a claim score for the particular node 110-1 and a second satellite 108-2 can be performed. Based on the comparison of the claim score, the first satellite 108-1 can claim management of the particular node 110-1. As discussed herein, claiming management means that a satellite 108 will take responsibility for management of a node 110 and data events sent, received, and forwarded by the node 110. In an example, when the first satellite claims management of the particular node 110-1, the first satellite can communicate a data event from the particular node 110-1 to the second satellite 108-2 when the data event will have an impact on the second satellite 108-2.
[0034] The system 100 can claim the particular node 110-1 to be managed by the first satellite 108-1 by instructing the second satellite 108-2 managing the particular node 110-1 to release the particular node 110-1. The first satellite 108-1 can then claim responsibility of the particular node 110-1. In a case where agents are used, the first satellite 108-1 can reconfigure the agents to redirect messages to the first satellite 108-1.
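The compare, release, and claim sequence of paragraphs [0033] and [0034] can be sketched with a small in-memory model. The `Satellite` class and its method names are hypothetical; in a real grid the release instruction would be a network message, and agent reconfiguration is omitted here.

```python
class Satellite:
    """Hypothetical in-memory stand-in for a satellite and its managed nodes."""

    def __init__(self, name):
        self.name = name
        self.managed = set()

    def release(self, node):
        """Give up management of a node when instructed by another satellite."""
        self.managed.discard(node)

    def claim(self, node, current_manager, own_score, manager_score):
        """Claim the node only if this satellite's claim score is greater."""
        if own_score <= manager_score:
            return False
        if current_manager is not None:
            current_manager.release(node)  # instruct the manager to release
        self.managed.add(node)
        return True


first = Satellite("108-1")
second = Satellite("108-2")
second.managed.add("110-1")
claimed = first.claim("110-1", second, own_score=4, manager_score=1)
```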
[0035] A problem with self-organization of a satellite grid 102 can exist when one of the plurality of the satellites 108 fails. A failing satellite is represented by satellite 108-6 in Figure 1. As a result of the failing satellite 108-6, a plurality of managed nodes 110-8 and 110-9 managed by the failing satellite 108-6 can be marked as unmanaged by the plurality of satellites 108. To counter this problem, the plurality of satellites 108 in the satellite grid 102 can have failover capability and can poll between each other to detect the failing satellite 108-6. In an example, the first satellite 108-1 can detect a failing satellite 108-6 based on the polling of the failing satellite 108-6. Failover capability can allow the first satellite 108-1 to switch over for the failing satellite 108-6 automatically when a failing satellite is detected. When the plurality of satellites 108 in the satellite grid 102 poll between each other, the plurality of satellites 108 can periodically check for a heartbeat on each other. As long as a heartbeat is detected, the plurality of satellites 108 will not initiate systems to take over for a particular one of the plurality of satellites 108. If an alteration in the heartbeat is detected in the failing satellite 108-6, at least one of the plurality of satellites 108 can immediately take over for the failing satellite 108-6. The detection of the failing satellite can result in marking the plurality of managed nodes 110-8 and 110-9 managed by the failing satellite 108-6 as not managed. In an example, the first satellite 108-1 can then run the NCRs on the first satellite 108-1 when the plurality of managed nodes 110-8 and 110-9 managed by the failing satellite 108-6 are marked as not managed on the first satellite.
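The heartbeat check in paragraph [0035] can be illustrated with timestamps and a timeout. Representing an "alteration in the heartbeat" as a missed timeout is an assumption, as are the function names and the 10-second value; the disclosure does not specify how the heartbeat is evaluated.

```python
def detect_failed(last_heartbeat: dict, now: float, timeout: float) -> set:
    """Return satellites whose last heartbeat is older than the timeout."""
    return {sat for sat, seen in last_heartbeat.items() if now - seen > timeout}


def mark_unmanaged(assignments: dict, failed: set) -> dict:
    """Mark nodes owned by failed satellites as not managed (None)."""
    return {node: (None if owner in failed else owner)
            for node, owner in assignments.items()}


# Hypothetical state: satellite 108-6 stopped sending heartbeats at t=60.
heartbeats = {"108-1": 99.0, "108-2": 98.0, "108-6": 60.0}
owners = {"110-1": "108-2", "110-8": "108-6", "110-9": "108-6"}
failed = detect_failed(heartbeats, now=100.0, timeout=10.0)
updated = mark_unmanaged(owners, failed)
```

Once nodes 110-8 and 110-9 are marked as not managed, re-running the NCRs would let the remaining satellites score and claim them.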
[0036] The plurality of satellites 108 can also poll between each other to detect whether a data event from the particular node 110-1 managed by the second satellite 108-2 will have an impact on the first satellite 108-1. In an example, the second satellite 108-2 can determine if a data event from the particular node 110-1 managed by the second satellite 108-2 will have an impact on the first satellite 108-1. A data event from a particular node 110-1 managed by the second satellite 108-2 can have an impact on the first satellite 108-1 when data events sent, received, and forwarded by the particular node 110-1 managed by the second satellite 108-2 are related to data events that are managed by the first satellite 108-1. For example, if the first satellite 108-1 were managing a SAP system, the first satellite 108-1 may be interested in Oracle related events from a particular node 110-1 managed by the second satellite 108-2. If the second satellite 108-2 determines that a data event from the particular node 110-1 managed by the second satellite 108-2 will have an impact on the first satellite 108-1, the second satellite 108-2 communicates the event to the first satellite 108-1. In an example, the first satellite 108-1 can then claim management of the particular node 110-1.
[0037] A problem with self-organization of a satellite grid 102 can exist when the number of data events increases to a point such that a satellite managing nodes associated with the data events becomes overloaded and cannot manage the data events. This problem can be offset by cloning the second satellite 108-2, wherein a new satellite 108-7 is created, when a plurality of data events from the particular node 110-1 managed by the second satellite 108-2 exceed a predefined threshold. The second satellite can be cloned when the satellite (e.g., second satellite 108-2) is running in a virtualized environment. In an example, the second satellite 108-2 can instruct a virtualization server to clone the second satellite 108-2 to form the new satellite 108-7 and start the new satellite 108-7 on the same virtualization server or on a different virtualization server.
[0038] Once the second satellite 108-2 has been cloned to form the new satellite 108-7, the second satellite 108-2 and the new satellite 108-7 can distribute the data events by running the NCRs on the second satellite 108-2 and the new satellite 108-7. In an example, the data events can be redistributed by running the NCRs on the plurality of satellites 108, including the new satellite 108-7.
[0039] Figure 2 illustrates a block diagram illustrating an example of a method for self-organizing a satellite grid according to the present disclosure. The method includes receiving 260, with a first satellite, a list of the plurality of managed nodes. The method includes receiving 262, with the first satellite, NCRs. The method includes calculating 264 a claim score for a particular one of the plurality of managed nodes and the first satellite according to the NCRs. The method includes comparing 266 the claim score with a claim score for the particular node and a second satellite that is managing the particular node. The method also includes claiming 268 management of the particular node based on the comparison.
[0040] Figure 3 illustrates a block diagram illustrating an example of instructions to self-organize a satellite grid according to the present disclosure. The medium can receive 370, with a first satellite, a list of a plurality of managed nodes. The medium can receive 372, with the first satellite, node claim rules (NCRs). The medium can calculate 374 a claim score for a particular one of the plurality of managed nodes and the first satellite according to the NCRs. The medium can compare 376 the claim score with a claim score for the particular node and a second satellite that is managing the particular node. The medium can also claim 378 management of the particular node based on the comparison, wherein the first satellite instructs the second satellite to release the particular node.
[0041] Figure 4 illustrates a block diagram of an example of a machine readable medium 480 in communication with processor resources 484 according to the present disclosure. A machine (e.g., a computing device) can include and/or receive a tangible non-transitory machine readable medium (MRM) 480 storing a set of machine readable instructions (MRI) (e.g., software) 486 for self-organizing a satellite grid, as described herein. As used herein, processor 484 resources can include one or a plurality of processors such as in a parallel processing system. The machine readable medium 480 can include volatile and/or non-volatile memory such as random access memory (RAM), magnetic memory such as a hard disk, floppy disk, and/or tape memory, a solid state drive (SSD), flash memory, phase change memory, etc.
[0042] The MRM 480 can be in communication with the processor 484 resources via a communication path 482. The communication path 482 can be local or remote to a machine associated with the processor 484 resources. Examples of a local communication path 482 can include an electronic bus internal to a machine such as a computer where the MRM 480 is one of volatile, non-volatile, fixed, and/or removable storage medium in communication with the processor 484 resources via the electronic bus. Examples of such electronic buses can include Industry Standard Architecture (ISA), Peripheral Component Interconnect (PCI), Advanced Technology Attachment (ATA), Small Computer System Interface (SCSI), Universal Serial Bus (USB), among other types of electronic buses and variants thereof.
[0043] In other examples, the communication path 482 can be such that the MRM 480 is remote from the processor 484 resources such as in the example of a network connection between the MRM 480 and the processor 484 resources (e.g., the communication path 482 can be a network connection). Examples of such a network connection can include a local area network (LAN), a wide area network (WAN), a personal area network (PAN), the Internet, among other examples of networks. In such examples, the MRM 480 may be associated with a first machine (e.g., a server) and the processor 484 resources may be associated with a second machine (e.g., a computing device). The first and second machines can be in communication via a networked communication path 482.
[0044] It is to be understood that the above description has been made in an illustrative fashion, and not a restrictive one. Although specific examples have been illustrated and described herein, other component arrangements, instructions, and/or device logic can be substituted for the specific examples shown.

Claims

What is claimed:
1. A method for self-organizing a satellite grid, comprising:
receiving, with a first satellite, a list of a plurality of managed nodes;
receiving, with the first satellite, rules;
calculating a first claim score for a particular one of the plurality of managed nodes and the first satellite according to the rules;
comparing the first claim score with a second claim score for the particular node and a second satellite that is managing the particular node; and
claiming management of the particular node based on the comparison.
2. The method of claim 1, wherein calculating the first claim score includes calculating the first claim score based on whether the particular node is in a same subnetwork range as the first satellite in response to one of the rules.
3. The method of claim 1, wherein calculating the first claim score includes calculating the first claim score based on whether the particular node is managed by the second satellite in response to one of the rules.
4. The method of claim 3, wherein calculating the first claim score includes calculating the first claim score based on a subnetwork range of the second satellite managing the particular node in response to one of the rules.
5. The method of claim 1, wherein calculating the first claim score includes calculating the first claim score based on an average number of managed nodes managed by a plurality of satellites in a same subnetwork range as the first satellite in response to one of the rules.
6. The method of claim 1, wherein the method includes:
resolving a conflict that is created when calculating the first and second claim score results in a same first and second claim score between the particular node and the first satellite and the particular node and the second satellite;
wherein the conflict is resolved by determining a response time between the particular node and the first satellite and the particular node and the second satellite.
7. The method of claim 1, wherein claiming the particular node to be managed by the first satellite includes the first satellite instructing the second satellite managing the particular node to release the particular node.
8. The method of claim , further comprising:
the first satellite detecting a failing satellite based on a polling of the failing satellite; and
marking a plurality of managed nodes managed by the failing satellite as not managed.
9. The method of claim 8, further comprising:
running the rules on the first satellite when the plurality of managed nodes managed by the failing satellite are marked as not managed.
10. A machine-readable non-transitory medium storing instructions for self-organizing a satellite grid executable by the computer to cause a computer to:
receive, with a first satellite, a list of a plurality of managed nodes;
receive, with the first satellite, rules;
calculate a first claim score for a particular one of the plurality of managed nodes and the first satellite according to the rules;
compare the first claim score with a second claim score for the particular node and a second satellite that is managing the particular node; and
claim management of the particular node based on the comparison, wherein the first satellite instructs the second satellite to release the particular node.
11. The machine-readable non-transitory medium of claim 10 wherein the instructions include instructions executable by the computer to cause the computer to determine, with the second satellite, if a data event from the particular node managed by the second satellite will have an impact on the first satellite.
12. The machine readable non-transitory medium of claim 1 wherein the instructions include instructions executable by the computer to cause the computer to communicate, with the second satellite, the data event from the particular node managed by the second satellite to the first satellite, when the particular node managed by the second satellite will have an impact on the first satellite.
13. The machine readable non-transitory medium of claim 12 wherein the instructions include instructions executable by the computer to cause the computer to:
clone the second satellite, wherein a new satellite is created, when a plurality of data events from the particular node managed by the second satellite exceed a predefined threshold; and
distribute, with the second satellite and the new satellite, the data events by running the rules on the second satellite and the new satellite.
14. A system for self-organizing a satellite grid, comprising:
a machine including processor resources; and
memory resources associated with the machine, the memory resources storing machine readable instructions that, when executed by the processor resources, cause the processor resources to:
calculate a first claim score for a particular one of a plurality of managed nodes and a first satellite by executing a number of node claim rules (NCRs);
compare the first claim score with a second claim score for the particular node and a second satellite that is managing the particular node; and
claim management of the particular node based on the comparison.
15. The system of claim 14 wherein execution of a number of NCRs returns a number of individual numerical values, wherein the number of individual numerical values are summed to calculate the claim score.
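
The claim-score comparison recited in claims 1, 6, 7 and 15 can be illustrated with a minimal sketch. It is not the claimed implementation. The class names, rule functions, score weights (10 and 5) and the `try_claim` helper are hypothetical, and two behaviors are assumptions the claims do not state: that the higher claim score wins, and that a tie goes to the satellite with the shorter response time.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Callable, List, Optional


@dataclass
class Satellite:
    name: str
    subnet: str              # subnetwork range, e.g. "10.0.1.0/24"
    managed_count: int = 0   # number of nodes currently managed


@dataclass
class Node:
    address: str
    manager: Optional[Satellite] = None


# A node claim rule (NCR) returns one numerical value for a satellite/node pair.
Rule = Callable[[Satellite, Node], int]


def same_subnet_rule(sat: Satellite, node: Node) -> int:
    """Hypothetical rule: reward a node inside the satellite's subnetwork range."""
    return 10 if ip_address(node.address) in ip_network(sat.subnet) else 0


def unmanaged_rule(sat: Satellite, node: Node) -> int:
    """Hypothetical rule: reward a node that no satellite manages yet."""
    return 5 if node.manager is None else 0


def claim_score(sat: Satellite, node: Node, rules: List[Rule]) -> int:
    """Sum the individual rule values into one claim score (as in claim 15)."""
    return sum(rule(sat, node) for rule in rules)


def try_claim(first: Satellite, node: Node, rules: List[Rule],
              response_time: Callable[[Satellite, Node], float]) -> bool:
    """Let `first` claim `node` if its score beats the current manager's score.

    Assumptions: the higher score wins; on a tie, the strictly shorter
    response time wins.
    """
    second = node.manager
    if second is first:
        return False
    first_score = claim_score(first, node, rules)
    if second is not None:
        second_score = claim_score(second, node, rules)
        if first_score < second_score:
            return False
        if first_score == second_score and (
                response_time(first, node) >= response_time(second, node)):
            return False
        # Stand-in for the first satellite instructing the second to release.
        second.managed_count -= 1
    node.manager = first
    first.managed_count += 1
    return True


if __name__ == "__main__":
    sat_a = Satellite("a", "10.0.1.0/24")
    sat_b = Satellite("b", "10.0.2.0/24", managed_count=1)
    node = Node("10.0.1.7", manager=sat_b)
    rules = [same_subnet_rule, unmanaged_rule]
    # sat_a shares the node's subnetwork range, so it outscores sat_b (10 vs 0).
    print(try_claim(sat_a, node, rules, lambda s, n: 1.0))  # prints True
```

Passing the rules as a list of plain functions mirrors the claims, in which the first satellite receives the rules rather than having them built in. A different rule set changes the resulting grid layout without changing the comparison logic.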

Priority Applications (4)

Application Number Priority Date Filing Date Title
US14/002,274 US9438476B2 (en) 2011-03-17 2011-03-17 Self-organization of a satellite grid
PCT/US2011/028840 WO2012125167A1 (en) 2011-03-17 2011-03-17 Self-organization of a satellite grid
CN2011800693461A CN103444256A (en) 2011-03-17 2011-03-17 Self-organization of a satellite grid
EP11861207.6A EP2687061A4 (en) 2011-03-17 2011-03-17 Self-organization of a satellite grid

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2011/028840 WO2012125167A1 (en) 2011-03-17 2011-03-17 Self-organization of a satellite grid

Publications (1)

Publication Number Publication Date
WO2012125167A1 (en) 2012-09-20

Family

ID=46831031

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2011/028840 WO2012125167A1 (en) 2011-03-17 2011-03-17 Self-organization of a satellite grid

Country Status (4)

Country Link
US (1) US9438476B2 (en)
EP (1) EP2687061A4 (en)
CN (1) CN103444256A (en)
WO (1) WO2012125167A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3529916A4 (en) * 2016-10-19 2020-11-18 Lockheed Martin Corporation Virtualization-enabled satellite platforms

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9917741B2 (en) * 2009-08-27 2018-03-13 Entit Software Llc Method and system for processing network activity data
US9722692B1 (en) 2016-10-19 2017-08-01 Vector Launch Inc. Statefulness among clustered satellite platforms
US10805001B2 (en) 2016-10-19 2020-10-13 Lockheed Martin Corporation State transfer among spaceborne and airborne devices
US10530468B2 (en) * 2016-10-19 2020-01-07 Vector Launch Inc. State transfer among virtualized nodes in spaceborne or airborne systems
US9819742B1 (en) 2017-07-19 2017-11-14 Vector Launch Inc. Bandwidth aware state transfer among satellite devices
US9960837B1 (en) 2017-07-19 2018-05-01 Vector Launch Inc. Pseudo-geosynchronous configurations in satellite platforms
US10069935B1 (en) 2017-07-19 2018-09-04 Vector Launch Inc. Role-specialization in clustered satellite platforms
US9998207B1 (en) 2017-07-19 2018-06-12 Vector Launch Inc. Orbital network layering in satellite platforms
US10757027B2 (en) 2017-07-19 2020-08-25 Lockheed Martin Corporation Quality of service management in a satellite platform
US10491710B2 (en) 2017-07-19 2019-11-26 Vector Launch Inc. Role-specialization in spaceborne and airborne computing platforms
US10630378B2 (en) 2018-02-09 2020-04-21 Lockheed Martin Corporation Bandwidth optimizing range adjustments among satellites
US10749959B2 (en) 2018-02-09 2020-08-18 Lockheed Martin Corporation Distributed storage management in a spaceborne or airborne environment
WO2019177631A1 (en) * 2018-03-16 2019-09-19 Vector Launch Inc. Quality of service level selection for peer satellite communications
CN109818669B (en) * 2019-01-18 2021-04-27 中国科学院空间应用工程与技术中心 Virtualization-based satellite service processing method, system and storage medium

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002057609A (en) * 2000-08-10 2002-02-22 Honda Motor Co Ltd Mobile satellite communication system
US6950855B2 (en) 2002-01-18 2005-09-27 International Business Machines Corporation Master node selection in clustered node configurations
US20050197993A1 (en) * 2003-09-12 2005-09-08 Lucent Technologies Inc. Network global expectation model for multi-tier networks
US7639646B2 (en) * 2004-03-17 2009-12-29 Qualcomm Incorporated Satellite diversity system, apparatus and method
US7526268B2 (en) * 2004-09-22 2009-04-28 Delphi Technologies, Inc. Method and system for selectively processing traffic incident information
US7593376B2 (en) * 2005-12-07 2009-09-22 Motorola, Inc. Method and apparatus for broadcast in an ad hoc network using elected broadcast relay nodes
JP4961146B2 (en) 2006-02-20 2012-06-27 株式会社日立製作所 Load balancing method and system
US7876706B2 (en) * 2006-02-28 2011-01-25 Motorola, Inc. Method and apparatus for root node selection in an ad hoc network
US7970903B2 (en) 2007-08-20 2011-06-28 Hitachi, Ltd. Storage and server provisioning for virtualized and geographically dispersed data centers
US8315237B2 (en) * 2008-10-29 2012-11-20 Google Inc. Managing and monitoring emergency services sector resources
JP2010039661A (en) 2008-08-04 2010-02-18 Fujitsu Ltd Server load distribution device, method, and program
US20100036903A1 (en) 2008-08-11 2010-02-11 Microsoft Corporation Distributed load balancer
CN101394350B (en) 2008-09-04 2011-05-18 广州杰赛科技股份有限公司 Service load balancing method for wireless mesh network
US8207890B2 (en) * 2008-10-08 2012-06-26 Qualcomm Atheros, Inc. Providing ephemeris data and clock corrections to a satellite navigation system receiver
US7738504B1 (en) * 2008-12-22 2010-06-15 The United States Of America As Represented By The Director National Security Agency Method of establishing and updating master node in computer network
US7996525B2 (en) 2008-12-31 2011-08-09 Sap Ag Systems and methods for dynamically provisioning cloud computing resources
WO2010132884A1 (en) 2009-05-15 2010-11-18 Ciso Technology, Inc. System and method for a self-organizing network
US8916122B2 (en) * 2012-01-17 2014-12-23 Mayaterials, Inc. Method of producing alkoxysilanes and precipitated silicas from biogenic silicas

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANSARI, N. ET AL.: "The performance evaluation of a new neural network based traffic management scheme for a satellite communication network", IEEE GLOBAL TELECOMMUNICATIONS CONFERENCE, 1991. GLOBECOM '91. 'COUNTDOWN TO THE NEW MILLENIUM. FEATURING A MINI-THEME ON: PERSONAL COMMUNICATIONS SERVICES, vol. 1, 2 December 1991 (1991-12-02) - 5 December 1991 (1991-12-05), pages 110 - 114, XP010042742 *
SCHNEIDER, J. ET AL.: "Distributed workflow management for large-scale grid environments", INTERNATIONAL SYMPOSIUM ON APPLICATIONS AND THE INTERNET, 23 January 2006 (2006-01-23) - 27 January 2006 (2006-01-27), XP010890143 *
See also references of EP2687061A4 *

Also Published As

Publication number Publication date
CN103444256A (en) 2013-12-11
US20130336168A1 (en) 2013-12-19
EP2687061A1 (en) 2014-01-22
EP2687061A4 (en) 2015-03-25
US9438476B2 (en) 2016-09-06

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 11861207; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 14002274; Country of ref document: US)
NENP Non-entry into the national phase (Ref country code: DE)