WO2017108119A1 - Rack awareness - Google Patents

Rack awareness

Info

Publication number
WO2017108119A1
WO2017108119A1 (PCT/EP2015/081101)
Authority
WO
WIPO (PCT)
Prior art keywords
nodes
sdn
sddc
node
network
Prior art date
Application number
PCT/EP2015/081101
Other languages
French (fr)
Inventor
Gal SAGIE
Eshed GAL-OR
Original Assignee
Huawei Technologies Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. filed Critical Huawei Technologies Co., Ltd.
Priority to PCT/EP2015/081101 priority Critical patent/WO2017108119A1/en
Priority to CN201580085446.1A priority patent/CN108475210B/en
Publication of WO2017108119A1 publication Critical patent/WO2017108119A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12 Discovery or management of network topologies
    • H04L41/122 Discovery or management of network topologies of virtualised topologies, e.g. software-defined networks [SDN] or network function virtualisation [NFV]

Definitions

  • the present invention generally relates to the field of software-defined data centre, SDDC, and specifically relates to a method and apparatus for supporting rack awareness in an SDDC and for example for supporting rack awareness for hypervisors.
  • the present invention relates further to a method for discovering network-level proximity between nodes of an SDDC, to a software-defined networking, SDN, controller for discovering network-level proximity between nodes of an SDDC, and to an SDDC.
  • the present invention relates to a computer program having a program code for performing such a method.
  • Such an environment relates to the physical world in the form of, for example, servers, racks, and switches, as well as to the virtual world in the form of the VMs.
  • the correlation of the physical world and the virtual world is essential for ensuring optimal placement, i.e. scheduling, and alignment between the computation workload and the underlying physical resources, wherein the physical resources are for example network, computation and storage resources.
  • the prior art comprises some existing methods for rack-awareness that utilize the Link Layer Discovery Protocol, LLDP.
  • Such methods have however some disadvantages. Indeed, for such a method to run, it is necessary that the Simple Network Management Protocol, SNMP, and LLDP are enabled across the entire network. However, for security reasons, these two protocols are usually turned off. Further on, a cloud controller needs to be aware of management information bases, MIBs, and of the top-of-rack, ToR, switch configuration.
  • solutions based on LLDP are problematic because LLDP is usually not enabled for security reasons, because they also need SNMP, and because the controller needs to be familiar with the MIB identifiers of the switches. Solutions based on a discovery carried out by hand are problematic because they are error prone and they cannot be integrated into an automatic process.
  • the present invention aims to improve the state of the art.
  • the object of the present invention is to provide a method and an apparatus in the form of e.g. a software-defined networking, SDN, controller for an improved discovery.
  • the present invention particularly intends to improve the discovery of network-level proximity between nodes of a software-defined data centre, SDDC.
  • the invention also intends to improve the rack awareness.
  • the use of the present invention allows for a better scheduling and placement of networking functions and tenant application VMs. Thereby, big data and data replication solutions may be improved.
  • the above-mentioned object of the present invention is achieved by the solution provided in the enclosed independent claims.
  • Advantageous implementations of the present invention are further defined in the respective dependent claims.
  • a first aspect of the present invention provides a method for discovering network-level proximity between nodes of a software-defined data centre.
  • the SDDC comprises hosts, and each host is connected to one of the nodes.
  • the method comprises discovering the network-level proximity between the nodes based on a software-defined networking, SDN, control plane protocol.
  • a cloud management system may allocate virtual machines adaptively depending on the network-level proximity.
  • the invention may be used in conjunction with a cloud management system, like e.g. Openstack, to improve the placement and/or scheduling of virtual machines, i.e. to improve the selection of the physical host running a virtual machine.
  • the cloud management system allocates the virtual machines on the hosts of the SDDC by:
  • the SDDC comprises an SDN controller.
  • Discovering the network-level proximity between the nodes comprises:
  • the SDN controller selects at least one node, and the SDN controller builds for each selected node a proximity group comprising the nodes that are within a given network-level proximity from the selected node. Thereby, it is possible to carry out the mapping automatically by building proximity groups reflecting the network-level proximity.
  • the SDN controller selects at least one node and builds for each selected node a proximity group iteratively until all nodes of the SDDC are part of at least one proximity group. Thereby, all nodes may be mapped.
  • each node connects to the SDN controller.
  • the SDN controller assigns a unique identifier to each selected node.
  • the SDN controller injects each selected node with a broadcast message comprising the unique identifier of the selected node and a time to live, TTL, value.
  • each node that receives the broadcast message sends an SDN control message to the SDN controller, said SDN control message comprising the unique identifier of the received broadcast message, decrements the TTL value of the received broadcast message and, if the decremented TTL value is above a discarding threshold, sends the broadcast message.
  • the network-level proximity may be precisely obtained.
  • the step in which each node connects to the SDN controller comprises: each node receives an SDN control plane protocol pipeline defining a match rule and an action rule, the match rule relating to messages having a given header information and the action rule consisting in sending an SDN control message to the SDN controller.
  • the network-level proximity may be obtained automatically.
  • the broadcast message injected by the SDN controller comprises said given header information.
  • the step in which each node that receives the broadcast message sends an SDN control message to the SDN controller comprises: each node that receives the broadcast message checks whether the match rule is verified by checking whether the received broadcast message comprises said given header information, and, if the match rule is verified, performs the action rule.
  • the hosts of the SDDC are comprised in a plurality of racks, and each rack comprises a top-of-rack switch. At least one rack comprises a plurality of hosts running respectively a plurality of nodes in form of virtual switches, said virtual switches supporting the SDN control plane protocol. In a tenth implementation form of the method according to the first aspect as such or the first to ninth implementation form of the first aspect, the hosts of the SDDC are comprised in a plurality of racks. Each rack comprises a top-of-rack switch. At least one rack comprises a node in form of a top-of-rack switch supporting the SDN control plane protocol.
  • the network-level proximity between two nodes reflects the number of nodes between the two nodes.
  • the SDN control plane protocol may be OpenFlow.
  • the SDN control plane protocol may also be OpFlex.
  • the proposed method may also be achieved in similar manner using another SDN control plane protocol.
  • a second aspect of the present invention provides a computer program having a program code for performing the method according to the first aspect as such or the first to eleventh implementation form of the first aspect when the computer program runs on a computing device.
  • a third aspect of the present invention provides a software-defined networking, SDN, controller for discovering network-level proximity between nodes of an SDDC.
  • the SDDC comprises hosts, each host being connected to one of the nodes.
  • the SDN controller is adapted to discover the network-level proximity between the nodes based on an SDN control plane protocol.
  • SDN controller may be adapted to perform the functionality of the method according to the first aspect of the invention and its different implementation forms.
  • a fourth aspect of the present invention provides an SDDC.
  • the SDDC comprises nodes.
  • the SDDC comprises hosts, each host being connected to one of the nodes.
  • the SDDC comprises a software-defined networking, SDN, controller adapted to discover network-level proximity between the nodes.
  • the SDN controller is adapted to discover the network-level proximity between the nodes based on an SDN control plane protocol.
  • SDDC according to the fourth aspect of the invention may be adapted to perform the functionality of the method according to the first aspect of the invention and its different implementation forms.
  • the present invention may be used in the context of SDDCs.
  • the hardware abstraction layer of such SDDCs may cause sub-optimal performance.
  • the present invention for discovering network-level proximity between nodes, i.e. between virtual machines, in an SDDC is advantageous in that it allows for dynamic and physical-aware optimizations.
  • the present invention uses software-defined networking capabilities to create a logical topology of VMs that are connected on the same physical rack, i.e. on the same top-of-rack switch.
  • the invention is advantageous in that an improved placement or scheduling of virtual machines may be achieved in a cloud environment.
  • when used for affinity scheduling, i.e. for ensuring that two VMs are on the same rack, the invention may improve the performance based on the proximity knowledge.
  • when used for anti-affinity scheduling, i.e. for ensuring that two VMs are not on the same rack, the invention may improve the resilience based on the proximity knowledge.
  • when used for rescheduling and/or migrating stray VMs to the same rack to shut down low-utilization infrastructure at off-peak hours, the invention may improve the power management.
  • the invention is advantageous in that it provides a new data stream and API for analytic software that optimizes infrastructure utilization in the data centre.
  • the invention is advantageous in that it optimizes performance of highly-distributed software applications, like e.g. Big Data solutions, n-tier applications, Network function virtualization, NFV.
  • the invention is advantageous in that it may be leveraged by Data Replication software solutions which currently have no access to this data.
  • Fig. 1 shows a software-defined data centre according to an embodiment of the present invention.
  • Fig. 2 shows a software-defined data centre according to a further embodiment of the present invention.
  • Fig. 3 shows a method for discovering network-level proximity between nodes according to an embodiment of the present invention.
  • Fig. 4 shows an application of the method for discovering network-level proximity between nodes according to an embodiment of the present invention.
  • Fig. 5 shows a method for discovering network-level proximity between nodes according to an embodiment of the present invention.
  • Fig. 6 shows a method for discovering network-level proximity between nodes according to a further embodiment of the present invention.
  • Fig. 7 shows a method for discovering network-level proximity between nodes according to a further embodiment of the present invention.
  • Fig. 1 shows a software-defined data centre, SDDC, 100 according to an embodiment of the present invention.
  • the SDDC 100 comprises nodes, or servers, 141, 143.
  • the SDDC 100 further comprises hosts 111-118, 121-128, each host being connected to one of the nodes 141, 143.
  • the SDDC 100 comprises a software-defined networking, SDN, controller 102 adapted to discover network-level proximity between the nodes 141, 143.
  • the SDN controller 102 is adapted to discover the network-level proximity between the nodes based on an SDN control plane protocol.
  • the hosts 111-118, 121-128 may be any computer that is attached to the SDDC 100 and offers information resources, services, and/or applications to users or other hosts or nodes.
  • the hosts 111-118, 121-128 are comprised in a plurality of racks 103, 104. Each rack 103, 104 comprises a respective top-of-rack, ToR, switch 105, 106.
  • Each host 111-118, 121-128 is connected with, and may be in communication with, an attached ToR switch 105, 106.
  • the ToR switches 105, 106 are routing components.
  • the ToR switches 105, 106 may be interconnected within the SDDC 100.
  • At least one rack 103, 104 comprises a plurality of hosts 111-118, 121-128 running respectively a plurality of nodes in form of virtual switches 141, 143, wherein said virtual switches support the SDN control plane protocol.
  • Each host 111-118, 121-128 may comprise a virtual switch 141, 143 and one or more virtual machines, VMs, 107, 108, 109, 110 associated with the virtual switch 141, 143 by a virtual and/or logical data connection.
  • the virtual switch 141, 143 may be a VXLAN Tunnel End Point, VTEP, in the VXLAN network.
  • the rack 103 comprises a plurality of hosts 111-118. Each of these hosts 111-118 runs one node in form of a virtual switch 141 supporting the SDN control plane protocol.
  • the rack 104 comprises a plurality of hosts 121-128. Each of these hosts 121-128 runs one node in form of a virtual switch 143 supporting the SDN control plane protocol.
  • a hypervisor 140 is located in each of the hosts 111-118 of the rack 103, and a hypervisor 142 is located in each of the hosts 121-128 of the rack 104. Each hypervisor 140, 142 is adapted to run one or more VMs on the host 111-118, 121-128.
  • a hypervisor 140, 142 may be a piece of computer software, firmware or hardware that creates and runs the VMs.
  • Fig. 1 shows that in the rack 103 the hypervisor of host 111 runs a virtual machine 107 labelled "VM1", while the hypervisor of host 118 runs a virtual machine 108 labelled "VM2". Likewise, in the rack 104 the respective hypervisors of hosts 121, 128 run a respective virtual machine 109, 110 labelled "VM3" and "VM4".
  • the embodiment of Fig. 1 relates to a physical-to-virtual mapping of the nodes 111-118, 121-128 in the form of a host-based mapping.
  • the SDN controller 102 is connected to all virtual switches 141, 143 using an SDN control plane protocol.
  • the connection between the SDN controller 102 and the virtual switches 141, 143 is performed in the embodiment of Fig. 1 by means of OpenFlow. Alternatively, the connection may be performed by another SDN control plane protocol, like for example OpFlex.
  • Fig. 1 shows, by way of example, the connection between the SDN controller 102 and the virtual switches 141, 143 of hypervisors running on the hosts 111, 112, 113, 118, 121, 122, 123, 128 by means of dashed lines 131, 132, 133, 134, 135, 136, 137, 138.
  • all virtual switches 141, 143 of all hypervisors 140, 142 running on all hosts in the SDDC 100 are connected to the SDN controller 102.
  • Fig. 2 shows an SDDC 200 according to a further embodiment of the present invention.
  • the SDDC 200 comprises nodes, or servers, 205, 206.
  • the SDDC 200 further comprises hosts 211-218, 221-228, each host being connected to one of the nodes 205, 206.
  • the SDDC 200 comprises an SDN controller 202 adapted to discover network-level proximity between the nodes 205, 206.
  • the SDN controller 202 is adapted to discover the network-level proximity between the nodes 205, 206 based on an SDN control plane protocol.
  • the hosts 211-218, 221-228 of the SDDC 200 are comprised in a plurality of racks 203, 204, and each rack 203, 204 comprises a ToR switch 205, 206. At least one rack 203, 204 comprises a node in form of a ToR switch 205, 206 supporting the SDN control plane protocol.
  • the structure of the SDDC 200 according to the embodiment of Fig. 2 is similar to that of the SDDC 100 according to the embodiment of Fig. 1.
  • the hosts 211-218, 221-228 are comprised in a plurality of racks 203, 204.
  • each host 211-218, 221-228 is connected with, and may be in communication with, the attached ToR switch 205, 206.
  • the ToR switches 205, 206 are routing components.
  • the ToR switches 205, 206 may be interconnected within the SDDC 200.
  • each host 211-218, 221-228 may comprise one or more VMs 207, 208, 209, 210 that may for example be associated with a virtual switch of the respective host.
  • the embodiment of Fig. 2 relates to a physical-to-virtual mapping of the nodes 211-218, 221- 228 in the form of a switch-based mapping.
  • the SDN controller 202 is connected to all ToR switches 205, 206 using an SDN control plane protocol.
  • the connection between the SDN controller 202 and the ToR switches 205, 206 is performed in the embodiment of Fig. 2 by means of OpenFlow. Alternatively, the connection may be performed by another SDN control plane protocol, like for example OpFlex.
  • while in the embodiment of Fig. 1 the nodes supporting the SDN control plane protocol are the virtual switches 141, 143, the SDDC 200 of Fig. 2 provides an embodiment where the nodes supporting the SDN control plane protocol are physical switches in form of the ToR switches 205, 206.
  • the latter embodiment assumes that the ToR switches 205, 206 support an SDN control plane protocol like e.g. OpenFlow.
  • the method for discovering network-level proximity between nodes of a software-defined data centre according to the present invention will be described in the following with regard to Figs. 3 to 7.
  • Fig. 5 shows a method 500 for discovering network-level proximity between nodes of an SDDC according to an embodiment of the present invention.
  • the method 500 is adapted to discover network-level proximity between nodes 141, 143, 205, 206 of an SDDC 100, 200.
  • the SDDC 100, 200 comprises hosts 111-118, 121-128, 211-218, 221-228, each host being connected to one of the nodes 141, 143, 205, 206.
  • the method comprises discovering 501 the network-level proximity between the nodes based on a software-defined networking, SDN, control plane protocol.
  • a cloud management system may allocate virtual machines 107, 108, 109, 110 on hosts 111-118, 121-128, 211-218, 221-228 of the SDDC 100, 200 depending on the discovered network-level proximity. Further on, the cloud management system may allocate the virtual machines 107, 108, 109, 110 on the hosts 111-118, 121-128, 211-218, 221-228 of the SDDC 100, 200 by:
  • Fig. 6 shows a method 600 for discovering network-level proximity between nodes of an SDDC according to a further embodiment of the present invention.
  • the SDDC 100, 200 comprises an SDN controller 102, 202.
  • Discovering 501 the network-level proximity between the nodes comprises:
  • the SDN controller 102, 202 selects 601 at least one node
  • the SDN controller 102, 202 builds 602 for each selected node a proximity group comprising the nodes that are within a given network-level proximity from the selected node. Particularly, the SDN controller 102, 202 selects at least one node and builds for each selected node a proximity group iteratively until all nodes 141, 143, 205, 206 of the SDDC 100, 200 are part of at least one proximity group.
  • Fig. 7 shows a method 700 for discovering network-level proximity between nodes of an SDDC according to a further embodiment of the present invention.
  • each node connects 701 to the SDN controller 102.
  • the SDN controller 102 assigns 702 a unique identifier id to each selected node.
  • the SDN controller 102 injects 703 each selected node with a broadcast message comprising the unique identifier id of the selected node and a time to live, TTL, value Θ.
  • each node that receives the broadcast message sends an SDN control message to the SDN controller 102, said SDN control message comprising the unique identifier id of the received broadcast message, decrements the TTL value Θ of the received broadcast message and, if the decremented TTL value Θ is above a discarding threshold, sends the broadcast message.
  • Fig. 3 shows a method 300 for discovering network-level proximity between nodes of an SDDC according to a further embodiment of the present invention.
  • each node 141, 143, 205, 206 connects to the SDN controller 102, 202. Also, each node receives an SDN control plane protocol pipeline defining a match rule and an action rule, the match rule relating to messages having a given header information and the action rule consisting in sending an SDN control message to the SDN controller 102.
  • the pipeline may be referred to as a trap pipeline because the pipeline is adapted to implement a trap for the messages having the given header information that should be captured.
  • the SDN control plane protocol pipeline may be received from the SDN controller 102.
  • the nodes to be mapped - being either the virtual switches 141, 143 or the ToR switches 205, 206 - connect to the SDN controller 102 using the SDN control plane protocol, for example OpenFlow.
  • the embodiments of Figs. 1 and 2 show that the SDDC 100, 200 comprises a specialized SDN-based application, e.g. Proximity, 101, 201.
  • the specialized SDN-based application 101, 201 connects to the SDN controller 102, 202 i.e. to the controller's API.
  • the specialized SDN-based application 101, 201 uses the SDN controller to install, on each node that is connected to the SDN controller via the SDN control plane protocol, a pipeline to trap a special packet that is proposed to be used to measure the proximity of the nodes.
  • the pipeline or trap pipeline may be an OpenFlow pipeline.
  • the trap behaves like a custom filter that only catches messages or packets of a specific kind.
  • messages according to the match rule defined by the pipeline may be caught by the node on which the pipeline is installed.
  • an OpenFlow control plane message may be sent by the node to the SDN controller 102, 202.
  • a node may send an OpenFlow control plane message, or more generally an SDN control message, to the SDN controller 102, 202.
  • the result of step 301 is a list of connected nodes 141, 143, 205, 206 - virtual switches 141, 143 or ToR switches 205, 206 - with an unmapped relationship and a received and installed pipeline.
  • nodes with an unmapped relationship, or unmapped nodes, may refer to nodes connected to the SDN controller for which no topology mapping is available, i.e. for which a network-level proximity is not known.
  • the SDN controller 102, 202 selects at least one node from the unmapped nodes. Particularly, the SDN controller 102, 202 selects a sub-set of all the unmapped nodes 141, 143, 205, 206.
  • the sub-set may be a small set of nodes with regard to the set of unmapped nodes 141, 143, 205, 206.
  • the result of step 303 is a sub-set or small set of selected unmapped nodes 141, 143, 205, 206.
  • step 305 the SDN controller 102 assigns a unique identifier id to each selected node, i.e. it assigns a unique id to each node in the sub-set.
  • the result of step 305 is a sub-set of unmapped nodes, each with a unique id.
  • the SDN controller injects each selected node in the sub-set with a broadcast message, or broadcast packet, with the unique id of the respective node and a time to live, TTL, value Θ.
  • the controller 102, 202 injects a broadcast message that fits the trap pipeline installed on the nodes, said broadcast message being injected to each of the previously selected nodes.
  • the broadcast bit of the broadcast message or packet is set to "1" so as to allow the message to be flooded on all the connected nodes, i.e. on all the ports of the connected virtual switches or ToR switches.
  • all nodes that are connected to the broadcasting nodes within the preconfigured TTL receive the broadcast message.
  • the selected nodes send a broadcast message that is received by the nodes that are within the TTL.
  • the broadcast message is intercepted in the trap pipeline, i.e. in the installed SDN control plane protocol pipeline.
  • the trap pipeline, i.e. the SDN control plane protocol pipeline, is triggered by the broadcast message and runs its action rule consisting in sending an SDN control message to the SDN controller.
  • the SDN control message comprises the unique identifier id of the received or intercepted broadcast message.
  • the SDN control message may be referred to as OpenFlow control message.
  • the node that receives the broadcast message sends an SDN control message via the SDN control plane protocol pipeline, wherein the SDN control message comprises the id of the selected node that has sent the broadcast message.
  • the SDN control message also comprises the value of the TTL.
  • step 311 the SDN control message is received by the SDN controller.
  • step 312 the SDN controller matches the id comprised in the SDN control message to the selected node.
  • the SDN controller is able to build for each selected node a proximity group comprising the nodes that are within a given network-level proximity from the selected node.
  • the SDN controller receives a corresponding SDN control message comprising the unique identifier id of the broadcast message, the reported TTL value, and a reference to the receiving node.
  • the SDN controller is adapted to identify all received broadcast messages that have an identical id and an identical TTL value. By reading the receiver reference of all identified broadcast messages, it is possible to build a proximity group comprising all nodes that have received the broadcast message within said TTL value. In other words, the SDN controller designates all nodes with the same unique id and a reported TTL of (Θ-1) as a proximity group.
  • step 313 the SDN controller checks whether there are still unmapped nodes, i.e. whether all nodes are part of a proximity group. In case a node is still unmapped, the method continues with step 303, in which the SDN controller again selects at least one node from the unmapped nodes. The steps 303 to 313 are carried out recursively, until all nodes are part of a proximity group.
  • step 303 it is possible to select only one node, so that the steps 303 to 313 possibly have to be carried out recursively several times until all the unmapped nodes are exhausted.
  • alternatively, it is possible to select several nodes at step 303, i.e. it is possible to work in batches.
  • the latter solution provides a quicker mapping. Once all nodes are part of a proximity group, the discovery of the network-level proximity between the nodes is achieved.
  • Fig. 4 shows an application of the method of the present invention.
  • an SDDC comprising n nodes.
  • the SDDC comprises eleven nodes numbered from “11" to "21”.
  • the SDN controller randomly selects k nodes.
  • nodes "12" and “17” receive the broadcast message sent by selected node "15". The reception is shown by the arrows 411, 412.
  • nodes "12” and “17” then generate via the trap pipeline SDN control messages with the unique identification id of node “15” and TTL decreased by 1 per hop.
  • nodes "16" and “21” receive 414, 415 the broadcast message sent by selected node "19".
  • step 404 the mapped nodes "12" and “17” are placed within a proximity group 416 with the selected node “15”, and taken out of the set of unmapped nodes.
  • the mapped nodes "16" and “21” are placed within a proximity group 417 with the selected node “19”, and taken out of the set of unmapped nodes.
  • the proximity group 416 comprises the selected node “15” and all nodes that are within a distance of one hop from the selected node "15".
  • Steps 405 and 406 show how proximity groups are recursively built until all nodes are part of a proximity group, i.e. until all nodes are mapped into proximity groups.
  • step 405 node "11" is selected and node "13" receives 418 a corresponding broadcast message injected to this node "11". Also node "18" is selected and nodes "14" and "20" receive 419, 420 a corresponding broadcast message injected to this node "18".
  • Fig. 4 shows that the selected node "18” also receives the broadcast message sent by the selected node "11".
  • the SDN controller then receives a corresponding SDN control message from node "18".
  • the SDN controller interprets this situation in that the selected nodes "11” and "18" are within each other's proximity range, and belong to the same proximity group.
  • both selected nodes "11" and "18" as well as the further nodes "13", "14" and "20" are grouped into a single proximity group 422, as illustrated in the sketch following this list.
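The grouping carried out in steps 403 to 406 can be illustrated with a short, self-contained example. The sketch below is not part of the application; the node labels and reply pairs are taken from the Fig. 4 walkthrough above, and build_proximity_groups is a name assumed here purely for illustration.

```python
from collections import defaultdict

def build_proximity_groups(replies):
    """Group nodes by the unique id carried in the SDN control messages they
    report, as in steps 403-404 of Fig. 4. Here the probe id is simply the
    label of the selected node that injected the broadcast.

    replies: iterable of (reporting_node, broadcast_id) tuples.
    """
    groups = defaultdict(set)
    for reporting_node, broadcast_id in replies:
        groups[broadcast_id].add(broadcast_id)   # the selected node itself
        groups[broadcast_id].add(reporting_node) # every node that reported it
    return dict(groups)

# First iteration of Fig. 4: nodes "15" and "19" are selected; nodes "12" and
# "17" report id "15", nodes "16" and "21" report id "19".
replies = [("12", "15"), ("17", "15"), ("16", "19"), ("21", "19")]
print(build_proximity_groups(replies))
# e.g. {'15': {'12', '15', '17'}, '19': {'16', '19', '21'}} (set order may vary)
```

The merge situation of step 405, where selected node "18" itself reports the id of selected node "11", would additionally require uniting the two resulting groups; this is omitted here for brevity.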

Abstract

The present invention relates to a method for discovering network-level proximity between nodes (141, 143, 205, 206) of a software-defined data centre, SDDC, (100, 200), wherein the SDDC (100, 200) comprises hosts (111-118, 121-128, 211-218, 221-228), each host being connected to one of the nodes (141, 143, 205, 206), the method comprising discovering the network-level proximity between the nodes based on a software-defined networking, SDN, control plane protocol.

Description

RACK AWARENESS
TECHNICAL FIELD
The present invention generally relates to the field of software-defined data centre, SDDC, and specifically relates to a method and apparatus for supporting rack awareness in an SDDC and for example for supporting rack awareness for hypervisors. The present invention relates further to a method for discovering network-level proximity between nodes of an SDDC, to a software-defined networking, SDN, controller for discovering network-level proximity between nodes of an SDDC, and to an SDDC. Finally, the present invention relates to a computer program having a program code for performing such a method.
BACKGROUND
In the world of SDDC and highly virtualized computer environments, the placement or scheduling of virtual machines, VMs, is of paramount importance for performance and scale.
Such an environment relates to the physical world in the form of, for example, servers, racks, and switches, as well as to the virtual world in the form of the VMs. The correlation of the physical world and the virtual world is essential for ensuring optimal placement, i.e. scheduling, and alignment between the computation workload and the underlying physical resources, wherein the physical resources are for example network, computation and storage resources.
Existing approaches for gaining a physical-to-virtual mapping are based on traditional network concepts and are therefore crude, inefficient, slow and very complicated to manage and maintain.
There is therefore a need to have a self-discovery capability, which is efficient and timely, as well as capable of accommodating the load on large cloud deployments and the scale/complexity of modern cloud data centres.
The prior art comprises some existing methods for rack-awareness that utilize the Link Layer Discovery Protocol, LLDP. Such methods have however some disadvantages. Indeed, for such a method to run, it is necessary that the Simple Network Management Protocol, SNMP, and LLDP are enabled across the entire network. However, for security reasons, these two protocols are usually turned off. Further on, a cloud controller needs to be aware of management information bases, MIBs, and of the top-of-rack, ToR, switch configuration.
Other known methods are done as a point-solution integration with the data centre inventory management system, with however some disadvantages. For example, such a method does not rely on any standard. It is also subject to manual configuration, and is thus error prone. A further disadvantage is that this known solution is not able to carry out an auto-discovery. Also, the discovery is dependent on the inventory system that is not necessarily up-to-date, so that it cannot be guaranteed that the result of the discovery is correct. Further on, such an inventory system is usually provided with inadequate service availability.
To summarize, the prior art solutions are disadvantageous. E.g. solutions based on LLDP are problematic because LLDP is usually not enabled for security reasons, because they also need SNMP, and because the controller needs to be familiar with the MIB identifiers of the switches. Solutions based on a discovery carried out by hand are problematic because they are error prone and they cannot be integrated into an automatic process.
SUMMARY
Having recognized the above-mentioned disadvantages and problems, the present invention aims to improve the state of the art. In particular, the object of the present invention is to provide a method and an apparatus in the form of e.g. a software-defined networking, SDN, controller for an improved discovery.
The present invention particularly intends to improve the discovery of network-level proximity between nodes of a software-defined data centre, SDDC. The invention also intends to improve the rack awareness. Also, the use of the present invention allows for a better scheduling and placement of networking functions and tenant application VMs. Thereby, big data and data replication solutions may be improved. The above-mentioned object of the present invention is achieved by the solution provided in the enclosed independent claims. Advantageous implementations of the present invention are further defined in the respective dependent claims.
A first aspect of the present invention provides a method for discovering network-level proximity between nodes of a software-defined data centre. The SDDC comprises hosts, and each host is connected to one of the nodes. The method comprises discovering the network-level proximity between the nodes based on a software-defined networking, SDN, control plane protocol.
Thereby, the discovery of network-level proximity between nodes of an SDDC is improved since it may be done automatically. The rack awareness may be correspondingly improved. Consequently, a cloud management system may allocate virtual machines adaptively depending on the network-level proximity.
In a first implementation form of the method according to the first aspect, a cloud management system allocates virtual machines on hosts of the SDDC depending on the discovered network-level proximity. Thereby, the needs of users of cloud management systems may be better taken into consideration. The invention may be used in conjunction with a cloud management system, like e.g. Openstack, to improve the placement and/or scheduling of virtual machines, i.e. to improve the selection of the physical host running a virtual machine.
In a second implementation form of the method according to the first implementation form of the first aspect, the cloud management system allocates the virtual machines on the hosts of the SDDC by:
- identifying nodes in such a way that the network-level proximity between the identified nodes corresponds to a desired network-level proximity, and
- allocating the virtual machines on hosts connected to the identified nodes. Thereby, the performance of virtual machines may be adapted to desired user scenarios. For example, if a user scenario requires two virtual machines to be as close as possible to reduce the transmission duration between the virtual machines, then hosts connected to nodes with a low network-level proximity may be chosen. Thereby, it is also possible to guarantee VM-VM affinity rules in a cloud management system. In a third implementation form of the method according to the first aspect as such or the first or second implementation form of the first aspect, the SDDC comprises an SDN controller. Discovering the network-level proximity between the nodes comprises:
- the SDN controller selects at least one node,
- the SDN controller builds for each selected node a proximity group comprising the nodes that are within a given network-level proximity from the selected node.
Thereby, it is possible to carry out the mapping automatically by building proximity groups reflecting the network-level proximity.
In a fourth implementation form of the method according to the third implementation form of the first aspect, the SDN controller selects at least one node and builds for each selected node a proximity group iteratively until all nodes of the SDDC are part of at least one proximity group. Thereby, all nodes may be mapped.
In a fifth implementation form of the method according to the third or fourth implementation form of the first aspect, each node connects to the SDN controller. The SDN controller assigns a unique identifier to each selected node. The SDN controller injects each selected node with a broadcast message comprising the unique identifier of the selected node and a time to live, TTL, value. Recursively, each node that receives the broadcast message sends an SDN control message to the SDN controller, said SDN control message comprising the unique identifier of the received broadcast message, decrements the TTL value of the received broadcast message and, if the decremented TTL value is above a discarding threshold, sends the broadcast message. Thereby, the network-level proximity may be precisely obtained.
In a sixth implementation form of the method according to the fifth implementation form of the first aspect, the step in which each node connects to the SDN controller comprises: each node receives an SDN control plane protocol pipeline defining a match rule and an action rule, the match rule relating to messages having a given header information and the action rule consisting in sending an SDN control message to the SDN controller. Thereby, the network-level proximity may be obtained automatically. In a seventh implementation form of the method according to the sixth implementation form of the first aspect, the broadcast message injected by the SDN controller comprises said given header information.
In an eighth implementation form of the method according to the seventh implementation form of the first aspect, the step in which each node that receives the broadcast message sends an SDN control message to the SDN controller comprises:
- each node that receives the broadcast message checks whether the match rule is verified by checking whether the received broadcast message comprises said given header information, and, if the match rule is verified, performs the action rule. In a ninth implementation form of the method according to the first aspect as such or the first to eighth implementation form of the first aspect, the hosts of the SDDC are comprised in a plurality of racks, and each rack comprises a top-of-rack switch. At least one rack comprises a plurality of hosts running respectively a plurality of nodes in form of virtual switches, said virtual switches supporting the SDN control plane protocol. In a tenth implementation form of the method according to the first aspect as such or the first to ninth implementation form of the first aspect, the hosts of the SDDC are comprised in a plurality of racks. Each rack comprises a top-of-rack switch. At least one rack comprises a node in form of a top-of-rack switch supporting the SDN control plane protocol.
In an eleventh implementation form of the method according to the first aspect as such or the first to tenth implementation form of the first aspect, the network-level proximity between two nodes reflects the number of nodes between the two nodes.
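As a concrete illustration of this metric (not part of the claim language), the following sketch counts the nodes strictly between two nodes on a shortest path in a small, assumed topology; the adjacency data and the function name are illustrative assumptions only.

```python
from collections import deque

def intermediate_node_count(adjacency, src, dst):
    """Breadth-first search returning the number of nodes strictly between
    src and dst on a shortest path, i.e. the network-level proximity of the
    eleventh implementation form (0 means directly connected)."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == dst:
            return max(hops - 1, 0)
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    raise ValueError("no path between the two nodes")

# Two virtual switches attached to the same ToR switch: one node in between.
adjacency = {"vswitch-a": ["tor-1"], "vswitch-b": ["tor-1"],
             "tor-1": ["vswitch-a", "vswitch-b"]}
print(intermediate_node_count(adjacency, "vswitch-a", "vswitch-b"))  # 1
```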
Particularly, the SDN control plane protocol may be OpenFlow. The SDN control plane protocol may also be OpFlex. Alternatively, the proposed method may also be achieved in similar manner using another SDN control plane protocol. A second aspect of the present invention provides a computer program having a program code for performing the method according to the first aspect as such or the first to eleventh implementation form of the first aspect when the computer program runs on a computing device. A third aspect of the present invention provides a software-defined networking, SDN, controller for discovering network-level proximity between nodes of an SDDC. The SDDC comprises hosts, each host being connected to one of the nodes. The SDN controller is adapted to discover the network-level proximity between the nodes based on an SDN control plane protocol.
Further features or implementations of the SDN controller according to the third aspect of the invention may be adapted to perform the functionality of the method according to the first aspect of the invention and its different implementation forms.
A fourth aspect of the present invention provides an SDDC. The SDDC comprises nodes. The SDDC comprises hosts, each host being connected to one of the nodes. The SDDC comprises a software-defined networking, SDN, controller adapted to discover network-level proximity between the nodes. The SDN controller is adapted to discover the network-level proximity between the nodes based on an SDN control plane protocol.
Further features or implementations of the SDDC according to the fourth aspect of the invention may be adapted to perform the functionality of the method according to the first aspect of the invention and its different implementation forms.
To summarize, the present invention may be used in the context of SDDCs. The hardware abstraction layer of such SDDCs may cause sub-optimal performance. The present invention for discovering network-level proximity between nodes, i.e. between virtual machines, in an SDDC is advantageous in that it allows for dynamic and physical-aware optimizations. The present invention uses software-defined networking capabilities to create a logical topology of VMs that are connected on the same physical rack, i.e. on the same top-of-rack switch.
The invention is advantageous in that an improved placement or scheduling of virtual machines may be achieved in a cloud environment. When used for affinity scheduling, i.e. for ensuring that two VMs are on the same rack, the invention may improve the performance based on the proximity knowledge. When used for anti-affinity scheduling, i.e. for ensuring that two VMs are not on the same rack, the invention may improve the resilience based on the proximity knowledge. When used for rescheduling and/or migrating stray VMs to the same rack to shut down low-utilization infrastructure at off-peak hours, the invention may improve the power management. The invention is advantageous in that it provides a new data stream and API for analytic software that optimizes infrastructure utilization in the data centre.
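By way of illustration only, a scheduler could consume the discovered proximity groups for such affinity and anti-affinity decisions roughly as follows. This is a minimal sketch under the assumption that each proximity group is available as a set of node identifiers and that each candidate host is known to attach to exactly one node; all names are hypothetical and do not correspond to any existing cloud management API.

```python
def pick_hosts(proximity_groups, host_to_node, count, affinity=True):
    """Pick `count` hosts for a set of VMs.

    affinity=True  -> all hosts attach to nodes of one proximity group
                      (e.g. the same rack / ToR switch).
    affinity=False -> every host attaches to a node of a different group.
    """
    if affinity:
        for group in proximity_groups:
            hosts = [h for h, node in host_to_node.items() if node in group]
            if len(hosts) >= count:
                return hosts[:count]
    else:
        chosen = []
        for group in proximity_groups:
            for host, node in host_to_node.items():
                if node in group and host not in chosen:
                    chosen.append(host)
                    break
            if len(chosen) == count:
                return chosen
    raise RuntimeError("placement constraint cannot be satisfied")

groups = [{"vs-111", "vs-118"}, {"vs-121", "vs-128"}]        # from discovery
host_to_node = {"host-111": "vs-111", "host-118": "vs-118",
                "host-121": "vs-121", "host-128": "vs-128"}
print(pick_hosts(groups, host_to_node, 2, affinity=True))    # same rack
print(pick_hosts(groups, host_to_node, 2, affinity=False))   # different racks
```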
The invention is advantageous in that it optimizes performance of highly-distributed software applications, like e.g. Big Data solutions, n-tier applications, Network function virtualization, NFV.
The invention is advantageous in that it may be leveraged by Data Replication software solutions which currently have no access to this data.
It has to be noted that all devices, elements, units and means described in the present application could be implemented in the software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application as well as the functionalities described to be performed by the various entities are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof.
BRIEF DESCRIPTION OF DRAWINGS
The above aspects and implementation forms of the present invention will be explained in the following description of specific embodiments in relation to the enclosed drawings, in which
Fig. 1 shows a software-defined data centre according to an embodiment of the present invention.
Fig. 2 shows a software-defined data centre according to a further embodiment of the present invention.
Fig. 3 shows a method for discovering network-level proximity between nodes according to an embodiment of the present invention.
Fig. 4 shows an application of the method for discovering network-level proximity between nodes according to an embodiment of the present invention.
Fig. 5 shows a method for discovering network-level proximity between nodes according to an embodiment of the present invention.
Fig. 6 shows a method for discovering network-level proximity between nodes according to a further embodiment of the present invention.
Fig. 7 shows a method for discovering network-level proximity between nodes according to a further embodiment of the present invention.
DETAILED DESCRIPTION OF EMBODIMENTS
Fig. 1 shows a software-defined data centre, SDDC, 100 according to an embodiment of the present invention.
The SDDC 100 comprises nodes, or servers, 141, 143. The SDDC 100 further comprises hosts 111-118, 121-128, each host being connected to one of the nodes 141, 143. Further, the SDDC 100 comprises a software-defined networking, SDN, controller 102 adapted to discover network-level proximity between the nodes 141, 143.
The SDN controller 102 is adapted to discover the network-level proximity between the nodes based on an SDN control plane protocol.
In the embodiment of Fig. 1, the hosts 111-118, 121-128 may be any computer that is attached to the SDDC 100 and offers information resources, services, and/or applications to users or other hosts or nodes. The hosts 111-118, 121-128 are comprised in a plurality of racks 103, 104. Each rack 103, 104 comprises a respective top-of-rack, ToR, switch 105, 106. Each host 111-118, 121-128 is connected with, and may be in communication with, an attached ToR switch 105, 106. The ToR switches 105, 106 are routing components. The ToR switches 105, 106 may be interconnected within the SDDC 100.
According to the present invention, at least one rack 103, 104 comprises a plurality of hosts 111-118, 121-128 running respectively a plurality of nodes in form of virtual switches 141, 143, wherein said virtual switches support the SDN control plane protocol. Each host 111-118, 121-128 may comprise a virtual switch 141, 143 and one or more virtual machines, VMs, 107, 108, 109, 110 associated with the virtual switch 141, 143 by a virtual and/or logical data connection. In the case of a Virtual Extensible LAN, VXLAN, the virtual switch 141, 143 may be a VXLAN Tunnel End Point, VTEP, in the VXLAN network. Particularly, the rack 103 comprises a plurality of hosts 111-118. Each of these hosts 111-118 runs one node in form of a virtual switch 141 supporting the SDN control plane protocol. Likewise, the rack 104 comprises a plurality of hosts 121-128. Each of these hosts 121-128 runs one node in form of a virtual switch 143 supporting the SDN control plane protocol.
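To make this host-based mapping of Fig. 1 concrete, the following data-model sketch (an assumption of this write-up, not a prescribed implementation) captures the relationship between racks, ToR switches, hosts, hypervisor-run virtual switches and VMs; all identifiers are illustrative only.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class VirtualMachine:
    name: str                        # e.g. "VM1"

@dataclass
class Host:
    name: str                        # e.g. "host-111"
    vswitch: str                     # the SDN node run by this host's hypervisor
    vms: List[VirtualMachine] = field(default_factory=list)

@dataclass
class Rack:
    name: str                        # e.g. "rack-103"
    tor_switch: str                  # e.g. "tor-105"
    hosts: List[Host] = field(default_factory=list)

rack_103 = Rack("rack-103", "tor-105", hosts=[
    Host("host-111", "vswitch-141", [VirtualMachine("VM1")]),
    Host("host-118", "vswitch-141", [VirtualMachine("VM2")]),
])
```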
A hypervisor 140 is located in each of the hosts 111-118 of the rack 103, and a hypervisor 142 is located in each of the hosts 121-128 of the rack 104. Each hypervisor 140, 142 is adapted to run one or more VMs on the host 111-118, 121-128. A hypervisor 140, 142 may be a piece of computer software, firmware or hardware that creates and runs the VMs.
Fig. 1 shows that in the rack 103 the hypervisor of host 111 runs a virtual machine 107 labelled "VM1", while the hypervisor of host 118 runs a virtual machine 108 labelled "VM2". Likewise, in the rack 104 the respective hypervisors of hosts 121, 128 run a respective virtual machine 109, 110 labelled "VM3" and "VM4".
The embodiment of Fig. 1 relates to a physical-to-virtual mapping of the nodes 111-118, 121-128 in the form of a host-based mapping. Correspondingly, the SDN controller 102 is connected to all virtual switches 141, 143 using an SDN control plane protocol. The connection between the SDN controller 102 and the virtual switches 141, 143 is performed in the embodiment of Fig. 1 by means of OpenFlow. Alternatively, the connection may be performed by another SDN control plane protocol, like for example OpFlex.
Fig. 1 shows, by way of example, the connection between the SDN controller 102 and the virtual switches 141, 143 of hypervisors running on the hosts 111, 112, 113, 118, 121, 122, 123, 128 by means of dashed lines 131, 132, 133, 134, 135, 136, 137, 138. Preferably, all virtual switches 141, 143 of all hypervisors 140, 142 running on all hosts in the SDDC 100 are connected to the SDN controller 102.
Fig. 2 shows an SDDC 200 according to a further embodiment of the present invention. The SDDC 200 comprises nodes, or servers, 205, 206. The SDDC 200 further comprises hosts 211-218, 221-228, each host being connected to one of the nodes 205, 206. Further, the SDDC 200 comprises an SDN controller 202 adapted to discover network-level proximity between the nodes 205, 206. The SDN controller 202 is adapted to discover the network-level proximity between the nodes 205, 206 based on an SDN control plane protocol.
Preferably, the hosts 211-218, 221-228 of the SDDC 200 are comprised in a plurality of racks 203, 204, and each rack 203, 204 comprises a ToR switch 205, 206. At least one rack 203, 204 comprises a node in form of a ToR switch 205, 206 supporting the SDN control plane protocol.
The structure of the SDDC 200 according to the embodiment of Fig. 2 is similar to that of the SDDC 100 according to the embodiment of Fig. 1. Particularly, the hosts 211-218, 221-228 are comprised in a plurality of racks 203, 204. Within a rack 203, 204, each host 211-218, 221-228 is connected with, and may be in communication with, the attached ToR switch 205, 206. The ToR switches 205, 206 are routing components. The ToR switches 205, 206 may be interconnected within the SDDC 200. Also, each host 211-218, 221-228 may comprise one or more VMs 207, 208, 209, 210 that may for example be associated with a virtual switch of the respective host.
The embodiment of Fig. 2 relates to a physical-to-virtual mapping of the nodes 211-218, 221-228 in the form of a switch-based mapping. Correspondingly, the SDN controller 202 is connected to all ToR switches 205, 206 using an SDN control plane protocol. The connection between the SDN controller 202 and the ToR switches 205, 206 is performed in the embodiment of Fig. 2 by means of OpenFlow. Alternatively, the connection may be performed by another SDN control plane protocol, like for example OpFlex. While in the embodiment of Fig. 1 the nodes supporting the SDN control plane protocol are the virtual switches 141, 143, the SDDC 200 of Fig. 2 provides an embodiment where the nodes supporting the SDN control plane protocol are physical switches in form of the ToR switches 205, 206. The latter embodiment assumes that the ToR switches 205, 206 support an SDN control plane protocol like e.g. OpenFlow. The method for discovering network-level proximity between nodes of a software-defined data centre according to the present invention will be described in the following with regard to Figs. 3 to 7.
Fig. 5 shows a method 500 for discovering network-level proximity between nodes of an SDDC according to an embodiment of the present invention.
The method 500 is adapted to discover network-level proximity between nodes 141, 143, 205, 206 of an SDDC 100, 200. The SDDC 100, 200 comprises hosts 111-118, 121-128, 211-218, 221-228, each host being connected to one of the nodes 141, 143, 205, 206.
The method comprises discovering 501 the network-level proximity between the nodes based on a software-defined networking, SDN, control plane protocol.
Particularly, a cloud management system may allocate virtual machines 107, 108, 109, 110 on hosts 111-118, 121-128, 211-218, 221-228 of the SDDC 100, 200 depending on the discovered network-level proximity. Further on, the cloud management system may allocate the virtual machines 107, 108, 109, 110 on the hosts 111-118, 121-128, 211-218, 221-228 of the SDDC 100, 200 by:
- identifying nodes in such a way that the network-level proximity between the identified nodes corresponds to a desired network-level proximity, and
- allocating the virtual machines on hosts connected to the identified nodes.
Fig. 6 shows a method 600 for discovering network-level proximity between nodes of an SDDC according to a further embodiment of the present invention. Specifically, the SDDC 100, 200 comprises an SDN controller 102, 202.
Discovering 501 the network-level proximity between the nodes comprises:
- the SDN controller 102, 202 selects 601 at least one node, and
- the SDN controller 102, 202 builds 602 for each selected node a proximity group comprising the nodes that are within a given network-level proximity from the selected node. Particularly, the SDN controller 102, 202 selects at least one node and builds for each selected node a proximity group iteratively until all nodes 141, 143, 205, 206 of the SDDC 100, 200 are part of at least one proximity group, as sketched below.
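The following compressed Python sketch summarises this iterative group building. It is an illustration of steps 601/602 only, not the claimed method itself: inject_probe and collect_replies are hypothetical placeholders for the SDN control plane interactions detailed with Figs. 3 and 7 below, and the batch size and TTL default are assumptions.

```python
import uuid

def discover_proximity(nodes, inject_probe, collect_replies, batch_size=2, ttl=1):
    """Select unmapped nodes, probe them, and build proximity groups until
    every node of the SDDC belongs to at least one group."""
    unmapped = set(nodes)
    groups = []
    while unmapped:
        selected = list(unmapped)[:batch_size]            # select nodes (601)
        probes = {uuid.uuid4().hex: node for node in selected}
        for probe_id, node in probes.items():
            inject_probe(node, probe_id, ttl)             # inject broadcast probes
        group_of = {pid: {node} for pid, node in probes.items()}
        for reporter, probe_id in collect_replies():      # nodes report trapped probes
            if probe_id in group_of:
                group_of[probe_id].add(reporter)
        for group in group_of.values():                   # build groups (602)
            groups.append(group)
            unmapped -= group                             # iterate until all mapped
    return groups
```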
Fig. 7 shows a method 700 for discovering network-level proximity between nodes of an SDDC according to a further embodiment of the present invention.
In a first step, each node connects 701 to the SDN controller 102.
Then, it is proposed that the SDN controller 102 assigns 702 a unique identifier id to each selected node.
The SDN controller 102 injects 703 each selected node with a broadcast message comprising the unique identifier id of the selected node and a time to live, TTL, value Θ.
In a further step 704, recursively, each node that receives the broadcast message sends an SDN control message to the SDN controller 102, said SDN control message comprising the unique identifier id of the received broadcast message, decrements the TTL value Θ of the received broadcast message and, if the decremented TTL value Θ is above a discarding threshold, sends the broadcast message.
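The per-node behaviour of step 704 can be sketched as follows; send_control_message and rebroadcast are placeholders for the SDN control plane interaction and the data-plane flooding, and a discarding threshold of 0 is an assumption of this sketch.

```python
def on_probe_received(node, probe_id, ttl, send_control_message, rebroadcast,
                      discard_threshold=0):
    """Step 704: report the probe to the SDN controller, then decide whether
    to keep flooding it, based on the decremented TTL value."""
    send_control_message(node, probe_id, ttl)   # report id (and TTL) upstream
    ttl -= 1                                    # decrement the TTL value
    if ttl > discard_threshold:
        rebroadcast(node, probe_id, ttl)        # forward only while TTL allows
```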
Fig. 3 shows a method 300 for discovering network-level proximity between nodes of an SDDC according to a further embodiment of the present invention.
In a first step 301, each node 141, 143, 205, 206 connects to the SDN controller 102, 202. Also, each node receives an SDN control plane protocol pipeline defining a match rule and an action rule, the match rule relating to messages having a given header information and the action rule consisting in sending an SDN control message to the SDN controller 102. The pipeline may be referred to as a trap pipeline because the pipeline is adapted to implement a trap for the messages having the given header information that should be captured. The SDN control plane protocol pipeline may be received from the SDN controller 102. Particularly, the nodes to be mapped - being either the virtual switches 141, 143 or the ToR switches 205, 206 - connect to the SDN controller 102 using the SDN control plane protocol, for example OpenFlow. In this regard, the embodiments of Figs. 1 and 2 show that the SDDC 100, 200 comprises a specialized SDN-based application, e.g. Proximity, 101, 201. The specialized SDN-based application 101, 201 connects to the SDN controller 102, 202 i.e. to the controller's API. The specialized SDN-based application 101, 201 uses the SDN controller to install, on each node that is connected to the SDN controller via the SDN control plane protocol, a pipeline to trap a special packet that is proposed to be used to measure the proximity of the nodes. The pipeline or trap pipeline may be an OpenFlow pipeline.
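By way of illustration only, the trap pipeline of step 301 could be realised as an OpenFlow 1.3 flow entry whose match rule keys on a reserved header field and whose action rule punts matching packets to the controller, which then accumulates the reports into proximity groups. The sketch below uses the Ryu controller framework and the experimental EtherType 0x88B5 as the "given header information"; both choices, the payload layout and all names are assumptions of this sketch, not requirements of the method.

```python
import struct
from collections import defaultdict

from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, MAIN_DISPATCHER, set_ev_cls
from ryu.lib.packet import ethernet, packet
from ryu.ofproto import ofproto_v1_3

PROBE_ETHERTYPE = 0x88B5   # assumed marker used as the "given header information"

class ProximityApp(app_manager.RyuApp):
    """Illustrative only: installs the trap pipeline (step 301) on every node
    that connects, and collects the trapped probes into proximity groups."""
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.proximity_groups = defaultdict(set)   # probe id -> reporting nodes

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def install_trap(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser
        # Match rule: packets carrying the probe EtherType.
        match = parser.OFPMatch(eth_type=PROBE_ETHERTYPE)
        # Action rule: punt matching packets to the SDN controller.
        actions = [parser.OFPActionOutput(ofp.OFPP_CONTROLLER, ofp.OFPCML_NO_BUFFER)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100,
                                      match=match, instructions=inst))

    @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
    def collect_report(self, ev):
        msg = ev.msg
        eth = packet.Packet(msg.data).get_protocol(ethernet.ethernet)
        if eth is None or eth.ethertype != PROBE_ETHERTYPE:
            return                                  # not a proximity probe
        # Assumed payload layout: 32-bit probe id followed by an 8-bit TTL.
        probe_id, ttl = struct.unpack("!IB", msg.data[14:19])
        reporter = msg.datapath.id                  # node that trapped the probe
        self.proximity_groups[probe_id].add(reporter)
```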
The trap behaves like a custom filter that only catches messages or packets of a specific kind. Particularly, messages according to the match rule defined by the pipeline may be caught by the node on which the pipeline is installed. Thereby, whenever such a message is trapped at a node, an OpenFlow control plane message may be sent by the node to the SDN controller 102, 202. In other words, if a node receives a message with a header information that corresponds to the match rule of the pipeline, then said node may send an OpenFlow control plane message, or more generally an SDN control message, to the SDN controller 102, 202. As illustrated in block 302, the result of step 301 is a list of connected nodes 141, 143, 205, 206 - virtual switches 141, 143 or ToR switches 205, 206 - with an unmapped relationship and a received and installed pipeline. In the context of the invention, nodes with an unmapped relationship or unmapped nodes may refer to nodes connected to the SDN controller for which no topology mapping is available, i.e. for which a network-level proximity is not known.
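One possible realisation of such a trap pipeline, sketched with the Ryu controller framework and OpenFlow 1.3, is shown below. The EtherType 0x88b5 used as the "given header information", the flow priority and the application name are assumptions chosen for the example only; the disclosure does not prescribe a particular header or controller framework.

# Sketch (Ryu, OpenFlow 1.3): install a trap rule on every node that connects.
# Match rule: packets carrying the assumed proximity EtherType.
# Action rule: punt the packet to the controller as an SDN control message.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

PROXIMITY_ETH_TYPE = 0x88b5  # assumption: experimental EtherType marking probe packets


class ProximityTrapInstaller(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def install_trap(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser
        match = parser.OFPMatch(eth_type=PROXIMITY_ETH_TYPE)
        actions = [parser.OFPActionOutput(ofp.OFPP_CONTROLLER, ofp.OFPCML_NO_BUFFER)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100,
                                      match=match, instructions=inst))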
In step 303, the SDN controller 102, 202 selects at least one node from the unmapped nodes. Particularly, the SDN controller 102, 202 selects a sub-set of all the unmapped nodes 141, 143, 205, 206. The sub-set may be a small set of nodes with regard to the set of unmapped nodes 141, 143, 205, 206. As illustrated in block 304, the result of step 303 is a sub-set or small set of selected unmapped nodes 141, 143, 205, 206.
In step 305, the SDN controller 102 assigns a unique identifier id to each selected node, i.e. it assigns a unique id to each node in the sub-set.
As illustrated in block 306, the result of step 305 is a sub-set of unmapped nodes, each with a unique id. In step 307, the SDN controller injects each selected node in the sub-set with a broadcast message, or broadcast packet, with the unique id of the respective node and a time to live, TTL, value Θ. The TTL value may be preconfigured to a given value, for example to the value θ=1. Particularly, the controller 102, 202 injects a broadcast message that fits the trap pipeline installed on the nodes, said broadcast message being injected to each of the previously selected nodes.
Particularly, the broadcast bit of the broadcast message or packet is set to "1" so as to allow the message to be flooded on all the connected nodes, i.e. on all the ports of the connected virtual switches or ToR switches. In step 308, all nodes that are connected to the broadcasting nodes within the preconfigured TTL receive the broadcast message. In other words, the selected nodes send a broadcast message that is received by the nodes that are within the TTL.
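The injection of such a broadcast probe can be sketched as follows, again with Ryu and OpenFlow 1.3. The frame layout, the source MAC address and the one-byte encoding of the unique id and of the TTL value are assumptions made for the example.

import struct

PROXIMITY_ETH_TYPE = 0x88b5  # same assumed EtherType as in the trap-pipeline sketch


def build_probe_frame(node_id, ttl):
    # Broadcast Ethernet frame so the probe is flooded on all ports of the node;
    # the payload carries the unique id and the preconfigured TTL value (one byte each).
    dst = b"\xff" * 6                       # broadcast destination
    src = b"\x02\x00\x00\x00\x00\x01"       # assumed locally administered source MAC
    header = dst + src + struct.pack("!H", PROXIMITY_ETH_TYPE)
    return header + struct.pack("!BB", node_id, ttl)


def inject_probe(dp, node_id, ttl=1):
    # Packet-out the probe on the selected node (Ryu datapath `dp`) and flood it.
    ofp, parser = dp.ofproto, dp.ofproto_parser
    out = parser.OFPPacketOut(datapath=dp, buffer_id=ofp.OFP_NO_BUFFER,
                              in_port=ofp.OFPP_CONTROLLER,
                              actions=[parser.OFPActionOutput(ofp.OFPP_FLOOD)],
                              data=build_probe_frame(node_id, ttl))
    dp.send_msg(out)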
In step 309, the broadcast message is intercepted in the trap pipeline, i.e. in the installed SDN control plane protocol pipeline. In step 310, the trap pipeline, i.e. the SDN control plane protocol pipeline, is triggered by the broadcast message and runs its action rule consisting in sending an SDN control message to the SDN controller. The SDN control message comprises the unique identifier id of the received or intercepted broadcast message. In the case of the SDN control plane protocol being OpenFlow, the SDN control message may be referred to as an OpenFlow control message. Particularly, the node that receives the broadcast message sends an SDN control message via the SDN control plane protocol pipeline, wherein the SDN control message comprises the id of the selected node that has sent the broadcast message. The SDN control message also comprises the value of the TTL.
In step 311, the SDN control message is received by the SDN controller. In step 312, the SDN controller matches the id comprised in the SDN control message to the selected node. By using the TTL information, the SDN controller is able to build for each selected node a proximity group comprising the nodes that are within a given network-level proximity from the selected node. Particularly, each time a broadcast message is intercepted in the SDN control plane protocol pipeline, the SDN controller receives a corresponding SDN control message comprising:
- the id of the selected node to which the broadcast message has been injected,
- the value of the TTL at reception of the broadcast message, and
- a receiver reference to the node that has received the broadcast message.
The SDN controller is adapted to identify all received broadcast messages that have an identical id, and an identical TTL value. By reading the receiver reference of all identified broadcast messages, it is possible to build a proximity group comprising all nodes that have received the broadcast message within said TTL value. In other words, the SDN controller designates all nodes of the same unique id and reported TTL of (θ-1) as a proximity group.
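On the controller side, the grouping just described may be sketched as follows. Each record is assumed to have already been decoded from a received SDN control message into the injected node's unique id, the TTL at reception, and a reference to the receiving node; the function name and the record layout are assumptions for the example.

# Sketch of the controller-side grouping: only reports carrying the expected
# decremented TTL value (theta - 1) contribute to the proximity group of the
# selected node whose unique id they carry.
def build_proximity_groups(records, selected_ids, theta=1):
    # records: iterable of (probe_id, ttl_at_reception, receiver) tuples.
    groups = {node_id: {node_id} for node_id in selected_ids}
    for probe_id, ttl, receiver in records:
        if probe_id in groups and ttl == theta - 1:
            groups[probe_id].add(receiver)
    return groups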
In step 313, the SDN controller checks whether there are still unmapped nodes, i.e. whether all nodes are part of a proximity group. In case a node is still unmapped, the method continues with step 303, in which the SDN controller again selects at least one node from the unmapped nodes. The steps 303 to 313 are carried out recursively, until all nodes are part of a proximity group.
At step 303, it is possible to select only one node, so that the steps 303 to 313 possibly have to be carried out recursively several times until all the unmapped nodes are exhausted.
Alternatively, it is possible to select several nodes at step 303, i.e. it is possible to work in batches. The latter solution provides a quicker mapping. Once all nodes are part of a proximity group, the discovery of the network-level proximity between the nodes is achieved.
Fig. 4 shows an application of the method of the present invention.
In step 401, an SDDC comprising n nodes is provided. In the embodiment of Fig. 4, the SDDC comprises eleven nodes numbered from "11" to "21". In step 402, the SDN controller randomly selects k nodes. In Fig. 4, the number of selected nodes is k=2, the first selected node being node "15" and the second selected node being node "19". Each selected node is injected with a broadcast message comprising the unique identifier of the selected node, and the preconfigured TTL value Θ, e.g. θ=1. In step 403, nodes "12" and "17" receive the broadcast message sent by selected node "15". The reception is shown by the arrows 411, 412. These two nodes "12" and "17" then generate via the trap pipeline SDN control messages with the unique identifier id of node "15" and the TTL decreased by 1 per hop. Likewise, nodes "16" and "21" receive 414, 415 the broadcast message sent by selected node "19".
In step 404, the mapped nodes "12" and "17" are placed within a proximity group 416 with the selected node "15", and taken out of the set of unmapped nodes. Likewise, the mapped nodes "16" and "21" are placed within a proximity group 417 with the selected node "19", and taken out of the set of unmapped nodes. The proximity group 416 comprises the selected node "15" and all nodes that are within a distance of one hop from the selected node "15". For building this proximity group 416, the SDN controller correspondingly only looks at the SDN control messages comprising the id of the selected node "15" and comprising a TTL=(θ-1).
Steps 405 and 406 show how proximity groups are recursively built until all nodes are part of a proximity group, i.e. until all nodes are mapped into proximity groups. In step 405, node "11" is selected and node "13" receives 418 a corresponding broadcast message injected to this node "11". Also node "18" is selected and nodes "14" and "20" receive 419, 420 a corresponding broadcast message injected to this node "18".
Such a situation would result in two new proximity groups comprising on the one hand the nodes "11" and "13", and on the other hand the nodes "18", "14" and "20". However, Fig. 4 shows that the selected node "18" also receives the broadcast message sent by the selected node "11". The SDN controller then receives a corresponding SDN control message from node "18". The SDN controller interprets this situation as meaning that the selected nodes "11" and "18" are within each other's proximity range, and belong to the same proximity group.
Correspondingly, both selected nodes "11" and "18" as well as the further nodes "13", "14" and "20" are grouped into a single proximity group 422.
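This merging behaviour of Fig. 4 can be sketched as a post-processing step over the same decoded records used above; a single merge pass is shown, which suffices for the illustrated case of nodes "11" and "18". The function name and the record layout are, again, assumptions for the example.

# Sketch: if a selected node reports a probe carrying another selected node's
# unique id, the two nodes are within each other's proximity range and their
# groups are folded into one (single merge pass, as in the Fig. 4 example).
def merge_overlapping_groups(groups, records):
    for probe_id, _ttl, receiver in records:
        if probe_id in groups and receiver in groups and probe_id != receiver:
            union = groups[probe_id] | groups[receiver]
            groups[probe_id] = groups[receiver] = union
    merged = []
    for members in groups.values():
        if members not in merged:        # deduplicate keys pointing at the same group
            merged.append(members)
    return merged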
So, in case a first randomly selected node returns an SDN control message with the unique identifier id of another, second selected node, the two proximity groups built for the first and second selected nodes shall be grouped into one single proximity group.

The present invention has been described in conjunction with various embodiments as examples as well as implementations. However, other variations can be understood and effected by those persons skilled in the art and practicing the claimed invention, from studies of the drawings, this disclosure and the independent claims. In the claims as well as in the description the word "comprising" does not exclude other elements or steps and the indefinite article "a" or "an" does not exclude a plurality. A single element or other unit may fulfil the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.

Claims

1. Method for discovering network-level proximity between nodes (141, 143, 205, 206) of a software-defined data centre, SDDC, (100, 200),
wherein the SDDC (100, 200) comprises hosts (111-118, 121-128, 211-218, 221-228), each host being connected to one of the nodes (141, 143, 205, 206),
the method comprising:
discovering the network-level proximity between the nodes based on a software- defined networking, SDN, control plane protocol.
2. Method according to claim 1,
wherein a cloud management system allocates virtual machines (107, 108, 109, 110) on hosts (111-118, 121-128, 211-218, 221-228) of the SDDC (100, 200) depending on the discovered network-level proximity.
3. Method according to claim 2,
wherein the cloud management system allocates the virtual machines (107, 108, 109, 110) on the hosts (111-118, 121-128, 211-218, 221-228) of the SDDC (100, 200) by:
- identifying nodes in such a way that the network-level proximity between the identified nodes corresponds to a desired network-level proximity, and
- allocating the virtual machines on hosts connected to the identified nodes.
4. Method according to any of the preceding claims,
wherein the SDDC (100, 200) comprises an SDN controller (102, 202), and
wherein discovering the network-level proximity between the nodes comprises:
- the SDN controller (102, 202) selects (303) at least one node, and
- the SDN controller (102, 202) builds for each selected node a proximity group comprising the nodes that are within a given network-level proximity from the selected node.
5. Method according to claim 4,
wherein the SDN controller (102) selects (303) at least one node and builds for each selected node a proximity group iteratively until all nodes (141, 143, 205, 206) of the SDDC (100, 200) are part of at least one proximity group.
6. Method according to claim 4 or 5,
wherein
- each node connects (301) to the SDN controller (102),
- the SDN controller (102) assigns (305) a unique identifier (id) to each selected node,
- the SDN controller (102) injects (307) each selected node with a broadcast message comprising the unique identifier (id) of the selected node and a time to live, TTL, value (Θ),
- recursively, each node that receives the broadcast message sends an SDN control message to the SDN controller (102), said SDN control message comprising the unique identifier (id) of the received broadcast message, decrements the TTL value (Θ) of the received broadcast message and, if the decremented TTL value (Θ) is above a discarding threshold, sends the broadcast message.
7. Method according to claim 6,
wherein each node connects (301) to the SDN controller (102) comprises:
- each node receives an SDN control plane protocol pipeline defining a match rule and an action rule, the match rule relating to messages having a given header information and the action rule consisting in sending an SDN control message to the SDN controller (102).
8. Method according to claim 7,
wherein the broadcast message injected by the SDN controller (102) comprises said given header information.
9. Method according to claim 8,
wherein each node that receives the broadcast message sends an SDN control message to the SDN controller (102) comprises:
- each node that receives the broadcast message checks whether the match rule is verified by checking whether the received broadcast message comprises said given header information, and, if the match rule is verified, performs the action rule.
10. Method according to any of the preceding claims,
wherein the hosts of the SDDC (100, 200) are comprised in a plurality of racks (103, 104, 203, 204), and each rack (103, 104, 203, 204) comprises a top-of-rack switch (105, 106, 205, 206), wherein at least one rack (103, 104) comprises a plurality of hosts (111-118, 121-128) running respectively a plurality of nodes in form of virtual switches (141, 143), said virtual switches supporting the SDN control plane protocol.
11. Method according to any of the preceding claims,
wherein the hosts of the SDDC (100, 200) are comprised in a plurality of racks (103, 104, 203, 204), and each rack (103, 104, 203, 204) comprises a top-of-rack switch (105, 106, 205, 206),
wherein at least one rack (203, 204) comprises a node in form of a top-of-rack switch (205, 206) supporting the SDN control plane protocol.
12. Method according to any of the preceding claims,
wherein the network-level proximity between two nodes reflects the number of nodes between the two nodes.
13. Computer program having a program code for performing the method according to any of the preceding claims, when the computer program runs on a computing device.
14. Software-defined networking, SDN, controller (102, 202) for discovering network-level proximity between nodes (141, 143, 205, 206) of a software-defined data centre, SDDC, (100, 200),
wherein the SDDC (100, 200) comprises hosts (111-118, 121-128, 211-218, 221-228), each host being connected to one of the nodes (141, 143, 205, 206),
wherein the SDN controller (102, 202) is adapted to discover the network-level proximity between the nodes based on an SDN control plane protocol.
15. Software-defined data centre, SDDC, (100, 200), comprising:
- nodes (141, 143, 205, 206),
- hosts (111-118, 121-128, 211-218, 221-228), each host being connected to one of the nodes (141, 143, 205, 206), and
- a software-defined networking, SDN, controller (102, 202) adapted to discover network-level proximity between the nodes (141, 143, 205, 206),
wherein the SDN controller (102, 202) is adapted to discover the network-level proximity between the nodes based on an SDN control plane protocol.
PCT/EP2015/081101 2015-12-23 2015-12-23 Rack awareness WO2017108119A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/EP2015/081101 WO2017108119A1 (en) 2015-12-23 2015-12-23 Rack awareness
CN201580085446.1A CN108475210B (en) 2015-12-23 2015-12-23 Rack sensing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2015/081101 WO2017108119A1 (en) 2015-12-23 2015-12-23 Rack awareness

Publications (1)

Publication Number Publication Date
WO2017108119A1 true WO2017108119A1 (en) 2017-06-29

Family

ID=54979702

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2015/081101 WO2017108119A1 (en) 2015-12-23 2015-12-23 Rack awareness

Country Status (2)

Country Link
CN (1) CN108475210B (en)
WO (1) WO2017108119A1 (en)

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7574499B1 (en) * 2000-07-19 2009-08-11 Akamai Technologies, Inc. Global traffic management system using IP anycast routing and dynamic load-balancing
US8111674B2 (en) * 2006-12-07 2012-02-07 Cisco Technology, Inc. Maintaining network availability for wireless clients in a wireless local area network
CN101299726B (en) * 2008-06-30 2011-04-20 中兴通讯股份有限公司 Method for calculating forwarding shortest path
US9137209B1 (en) * 2008-12-10 2015-09-15 Amazon Technologies, Inc. Providing local secure network access to remote services
US20140004839A1 (en) * 2012-06-29 2014-01-02 Frederick P. Block Proximity based transfer
EP2879049A4 (en) * 2013-10-23 2015-07-15 Huawei Tech Co Ltd Method and device for creating virtual machine
CN104506435B (en) * 2014-12-12 2018-05-18 杭州华为数字技术有限公司 Shortest path in SDN controllers and SDN determines method
CN104618475B (en) * 2015-01-28 2018-10-30 清华大学 Horizontal direction communication means and SDN systems for isomery SDN network
CN104734954B (en) * 2015-03-27 2019-05-10 华为技术有限公司 A kind of route determining methods and device for software defined network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120236761A1 (en) * 2011-03-15 2012-09-20 Futurewei Technologies, Inc. Systems and Methods for Automatic Rack Detection
US20140052973A1 (en) * 2012-08-14 2014-02-20 Alcatel-Lucent India Limited Method And Apparatus For Providing Traffic Re-Aware Slot Placement
US20140223122A1 (en) * 2013-02-06 2014-08-07 International Business Machines Corporation Managing virtual machine placement in a virtualized computing environment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XIAOQIAO MENG ET AL: "Improving the Scalability of Data Center Networks with Traffic-aware Virtual Machine Placement", INFOCOM, 2010 PROCEEDINGS IEEE, IEEE, PISCATAWAY, NJ, USA, 14 March 2010 (2010-03-14), pages 1 - 9, XP031674817, ISBN: 978-1-4244-5836-3 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11469958B1 (en) * 2021-02-25 2022-10-11 Juniper Networks, Inc. Network controller deployment

Also Published As

Publication number Publication date
CN108475210A (en) 2018-08-31
CN108475210B (en) 2021-05-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15813883

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15813883

Country of ref document: EP

Kind code of ref document: A1