WO2019017842A1

WO2019017842A1 - Network virtualisation method, computer-readable medium, and virtualisation network

Info

Publication number: WO2019017842A1
Application number: PCT/SG2018/050352
Authority: WO
Inventors: Pravein GOVINDAN KANNAN; Ahmad SOLTANI; Mun Choon Chan; Ee Chien Chang
Original assignee: National University Of Singapore
Priority date: 2017-07-18
Filing date: 2018-07-17
Publication date: 2019-01-24

Abstract

Network virtualisation method, computer-readable medium, and virtualisation network 5 Disclosed is a network virtualisation method, comprising: receiving a virtual flow entry in respect of a virtual link (vL3); and providing a physical flow entry relating to the virtual flow entry in respect of a physical loopback link (loopLink1) mapped to the virtual link (vL3) for causing a packet flow matching the physical flow entry to be forwarded via the physical loopback link (loopLink1) with an 0 identifier of the virtual link (vL3). The physical loopback link (loopLink1) corresponds to a pair of physical ports of a physical switch (pS1).

Description

Network virtualisation method, computer-readable medium, and virtualisation network

Technical field

The present disclosure relates to a network virtualisation method, a computer- readable medium and virtualisation network.

Background

Network emulation is an integral part of network research and development, and is a technique used by, for example, data centres and network operators to estimate actual performance of network topologies. Many tools and techniques for network emulation exist, some of which are discussed below.

Mininet and MaxiNet are such emulation tools used to create network topologies by way of container-based virtualisation on at least one host machine. Network topologies are thus created, emulated and tested on host machines. Performance of the network topologies is dependent on performance of the host machines (e.g., CPU performance). While these tools may be useful for functional testing, they cannot achieve a peak line-rate for performance load testing due to, for example, encapsulation overheads and TCAM unavailability. Emulation results obtained using these tools thus lack fidelity and repeatability.

CloudLab and DeterLab, on the other hand, are network testbeds or staging platforms used to establish a pre-production environment for network emulation. However, networks thus created cannot be programmed because they are considered to be an infrastructure and are managed using VLANs. Moreover, available testbed topologies (or a subset thereof) are often limited to generic Clos topologies or miniaturised versions of the production networks. Due to such a limitation, in the case where a modification is made to the topology, the underlying physical network needs to be correspondingly modified. This often entails the procurement and configuration of switches, which can be resource consuming. These testbeds thus lack flexibility. CoVisor is a network hypervisor that can be used to implement multiple arbitrarily connected virtual switches with a single physical switch flow table. However, the technique of CoVisor is unable to emulate delay and queueing behaviours of interconnected network switches. Shown in Figure 1 is a virtual network topology consisting of first and second switches S1 , S2 and first to fourth hosts H1 -H4 connected to the switches S1 , S2. The switches S1 , S2 are implemented on a single physical switch (not shown). In an experiment, the first host H1 is configured to ping the fourth host H4 for 90 seconds. The second and third hosts H2, H3 are configured to start and end an iperf TCP session at the 30^th second and the 60^th second, respectively. It can be observed that the ping RTT increases significantly from 0.18 ms to fall within the range from 3 ms to 4 ms. This increase in ping RTT occurs due to the TCP packets quickly filling the buffer queues of the switches S1 , S2.

Figure 2 shows a measurement of ping RTT obtained using an instance of CoVisor emulation of the topology of Figure 1 (labelled as "CoVisor") and a measurement of ping RTT obtained using an actual physical implementation of the same topology (labelled as "Multi-Switch"). It can be appreciated that the CoVisor emulation fails to emulate any delay and queueing behaviours of the switches S1 , S2 during the iperf TCP session. This occurs because, in the case of CoVisor emulation, the associated TCP packets traverse only the backplane of the switches S1 , S2, which leads to no contention at all. In addition to the above drawbacks, some of the existing tools and techniques are unable to provide flowspace and dataplane isolation for better data privacy and network performance.

It is desirable to provide a network virtualisation method, a computer-readable medium, and a virtualisation network, which address at least one of the drawbacks of the prior art and/or to provide the public with a useful choice. Summary

According to one aspect, there is provided a network virtualisation method, comprising: receiving a virtual flow entry in respect of a virtual link; and providing a physical flow entry relating to the virtual flow entry in respect of a physical loopback link mapped to the virtual link for causing a packet flow matching the physical flow entry to be forwarded via the physical loopback link with an identifier of the virtual link, wherein the physical loopback link corresponds to a pair of physical ports of a physical switch. The described embodiments are particularly advantageous. When mapped to a virtual inter-switch link, a physical loopback link does not result in additional overheads. Furthermore, because a packet flow traversing a physical loopback link of a switch actually leaves and returns to the switch, delay and queuing characteristics can be observed without incurring any additional overheads. More specifically, a matching packet traversing a virtual inter-switch link is forwarded through the physical loopback link once (i.e., one forwarding operation). In contrast, a less preferred link arrangement known as the "bounce link" requires a matching packet to leave from a physical switch to be received and bounced back by another physical switch, which involves two forwarding operations, with the second forwarding operation resulting in additional overheads.

Preferably, the matching packet flow is caused to be forwarded via the physical loopback link further based on a parameter of the virtual link affecting performance of the physical loopback link. In at least one described embodiment, the parameter serves to affect performance of the physical loopback link in a manner that the physical loopback link is reflective of the virtual link in terms of performance. The parameter may represent at least one of a rate limit and a burst size limit. With the parameter, the physical loopback link may be caused to observe the parameter to faithfully emulate or exhibit the performance of the mapped virtual link. Preferably, the virtual flow entry identifies an outbound port of the virtual link and a destination IP address. The virtual flow entry may further identifies an inbound port of an adjacent virtual link adjacent to the virtual link. The term "adjacent" means that, where two virtual links are connected to a virtual switch, one virtual link immediately precedes or follows the other in a path of packet flow.

The method may further comprise: receiving another virtual flow entry in respect of another virtual link adjacent to the virtual link; and providing another physical flow entry relating to the another virtual flow entry in respect of a physical link mapped to the another virtual link for causing the packet flow further matching the another physical flow entry to be further forwarded via the physical link.

Preferably, the packet flow is caused to be forwarded via the physical link with the identifier removed. Prior to the matching packet flow reaching the intended recipient host, the identifier of the virtual link may be removed from the packet flow. Ideally, any other such identifiers of any other virtual links should also be removed altogether. In such a manner, the physical link, the physical loopback link and any other physical links traversed by the packet flow are transparent to the recipient host. Alternatively, the recipient host may be configured to ignore or disregard any such identifiers associated with the packet flow if they are not removed.

The packet flow may be caused to be forwarded via the physical link with the another virtual link's identifier. In the case where the another virtual link is another inter-switch link, the packet flow can be forwarded with the another virtual link's identifier, which indicates a correspondence relationship between the packet flow and the another virtual link, allowing the packet flow to be distinguished from any other packet flows in the context of the physical link.

Preferably, the method further comprises: receiving a further virtual flow entry in respect of a further virtual link; and providing a further physical flow entry relating to the further virtual flow entry in respect of the physical loopback link mapped to the further virtual link. The virtual link and the further virtual link may preferably belong to respective network topologies. By mapping multiple virtual links to a physical loopback link, a larger network topology or more network topologies can be mapped to a common physical network infrastructure.

According to another aspect, there is provided a network virtualisation method, comprising mapping a virtual inter-switch link to a pair of interconnected ports of a physical switch. Advantages of a lookback link thus formed are as discussed above.

According to another aspect, there is provided a computer-readable medium for network virtualisation, comprising instructions for causing a processor to perform any of the above methods.

According to another aspect, there is provided a virtualisation network, comprising: a plurality of virtual links adapted for configuration based on virtual flow entries; and a physical loopback link mapped to the virtual links and adapted for configuration based on respective physical flow entries relating to the virtual flow entries so as to forward a packet flow matching one of the physical flow entries with an identifier of the corresponding virtual link, wherein the physical loopback link corresponds to a pair of physical ports of a physical switch. The network may further comprise: a plurality of network topologies each including a plurality of virtual switches, each virtual link interconnecting two of the virtual switches of one of the network topologies; a physical network infrastructure including the physical loopback link; a mapping module operable to map the physical loopback link to the virtual links; and a flow translation module operable to provide the physical flow entries based on the virtual flow entries. It is envisaged that using a parameter of the virtual link to determine how to forward a matching packet is an important aspect, and this forms another aspect in which there is provided a network virtualisation method, comprising: receiving a first virtual flow entry in respect of a virtual link; and providing a physical flow entry relating to the virtual flow entry in respect of a physical link mapped to the virtual link for causing a packet flow matching the physical flow entry to be forwarded via the physical link with an identifier of the virtual link; wherein the matching packet flow is caused to be forwarded via the physical link further based on a parameter of the virtual link affecting performance of the physical link. The physical link may be a loopback link.

It is envisaged that features relating to one aspect may be applicable to the other aspects. Brief Description of the drawings

Example embodiments will now be described hereinafter with reference to the accompanying drawings, wherein like parts are denoted by like reference numerals. Among the drawings:

Figure 1 is schematic diagram of an example network topology;

Figure 2 is a line chart of Ping RTTs obtained using an actual network implementation, an embodiment of a network virtualisation system of the present invention, and a known network virtualisation tool, in respect of the network topology of Figure 1 ;

Figure 3 shows a diagram of the network virtualisation system of Figure 2 mapping multiple virtual network topologies to a physical network infrastructure;

Figure 4 shows a mapping relationship between one of the virtual network topologies and the physical network infrastructure of Figure 3;

Figure 5 shows a flowchart of a method performed by the network virtualisation system of Figure 2;

Figure 6 shows another mapping relationship between said one of the virtual network topologies and the physical network infrastructure of Figure 3; Figure 7 shows an example of hardware implementation of the network virtualisation system of Figure 2 with static loopback links;

Figure 8 shows another example of hardware implementation of the network virtualisation system of Figure 2 with dynamic loopback links;

Figure 9A illustrates a simple model used for the purpose of demonstrating an achievable range of gain;

Figure 9B illustrates two diagrams showing gain ranges achievable with the use of loopback links;

Figure 10 shows an example virtual network topology mapped to a physical network infrastructure of three switches, indicating bandwidth information of each switch with respect to each connected host;

Figure 1 1 shows a table of API commands for the creation of a virtual network by the network virtualisation system of Figure 2 based on a given virtual network topology;

Figure 12 is a schematic diagram showing a network testbed used for testing the network virtualisation system of Figure 2;

Figure 13 shows different topology configurations for evaluating performance fidelity;

Figure 14 is a line chart plotting CDFs of shuffle read times obtained for the topology configurations of Figure 13 using the testbed of Figure 12;

Figure 15 shows a FatTree mapped to a physical network infrastructure;

Figure 16 is a bar graph showing averaged shuffle read times of different topologies;

Figure 17 is a line chart showing CDFs of shuffle read times of two of the topologies of Figure 16;

Figure 18 shows experiment results obtained for three different applications associated with respective topologies;

Figure 19 is a line chart showing throughput isolation of a TCP transmission from a UDP burst;

Figure 20 shows a line chart of throughput, demonstrating a metering effect;

Figure 21 shows a line chart of throughput obtained in an experiment; Figure 22 shows an example network topology for testing network embedding efficiency;

Figure 23 shows line chart of percentage core bandwidth utilisation for an experiment involving a FatTree topology;

Figure 24 shows line charts of percentage core bandwidth utilisation for different random network topologies;

Figure 25 shows a line chart of percentage core bandwidth utilisation for an experiment involving the Internet Topology Zoo;

Figure 26 is a diagram of a topology mapped to two physical switches;

Figure 27 is a diagram of the topology of Figure 26 mapped to a single physical switch using a single loopback link; and

Figure 28 is a table of notations used in the operation of a mapping module according to one embodiment of the present invention. Detailed Description

Figure 3 shows a diagram illustrating a bare-metal network virtualisation (BNV) system 100 in an example scenario, according to an embodiment of the present invention. The system 100 is shown to associate a plurality of tenants 200 with a physical network infrastructure 300. The physical network infrastructure 300 includes first to third physical switches pS1 -pS3. Each tenant 200 corresponds to a controller node 210 and an associated network topology 220, and is operable to configure the associated network topology 220 via the corresponding controller node 210 using an SDN protocol. Each of the network topologies 220 can be configured independently without affecting the operations or configurations of the other network topologies 220.

The system 100 includes a mapping module 1 10, a topology abstraction module 120, and a flow translation module 130. The mapping module 1 10 is configured to map the topology of each tenant 200 to the physical network infrastructure 300, the abstraction module 120 is configured to determine meter characteristics (including rate and burst limits) of mapped virtual links, and the flow translation module 130 is configured to convert or translate each flow entry received by the flow translation module 130 for configuration of the physical network infrastructure.

Operation of the flow translation module 130 will now be described. Operations of the mapping module 1 10 and the abstraction module 120 will be described after the description of the flow translation module 130.

Shown in the upper left-hand portion of Figure 4 is the virtual network topology 220 of one of the tenants 200, including: first to third virtual switches vS1 -vS3; three hosts H; and first to third virtual links vL1 -vL3. Each host H is connected to a corresponding one of the virtual switches vS1 -vS3 via a respective hostlink (not labelled). Each virtual link vL1 -vL3 of 1 Gbps interconnects a corresponding pair of the virtual switches vS1 -vS3. Referring to the upper right-hand portion of Figure 4, the virtual network topology 220 is mapped by the mapping module 1 10 to the first and second physical switches pS1 , pS2 to form a mapped virtual network topology 220'. Each of the first and second physical switches pS1 , pS2 provides first to fifth physical ports. The hosts H of the first and second virtual switches vS1 , vS2 are connected to the first and second physical ports of the first physical switch pS1 and are referred to below as the hosts H1 , H2, respectively. The host of the third virtual switch vS3 is connected to the first port of the second physical switch pS2 and is referred to below as the host H5. The third and fourth physical ports of the first physical switch pS1 and the second to fourth physical ports of the second physical switch pS2 are connected to respective hosts unrelated to the mapped virtual network topology 220'. The fifth physical port of the first physical switch pS1 is connected to the fifth physical port of the second physical switch pS2 via a physical link CLinkl of 10 Gbps. In the mapped virtual network topology 220', each of the first to third virtual switches vS1 -vS3, which is represented by a corresponding unshaded box, includes first to third virtual ports, which are represented by respective smaller shaded boxes. Each of the hosts H1 , H2, H5 corresponds to the first virtual port of the respective virtual switch vS1 -vS3. The hosts H1 , H2, H5 are assigned the IP addresses "y y.y.y", "z.z.z.z" and "x.x.x.x", respectively. The first virtual link vL1 interconnects the second virtual port of the first virtual switch vS1 and the second virtual port of the third virtual switch vS3. The second virtual link vL2 interconnects the third virtual port of the second virtual switch vS2 and the third virtual port of the third virtual switch vS3. The third virtual link vL3 interconnects the third virtual port of the first virtual switch vS1 and the second virtual port of the second virtual switch vS2. In this embodiment, the first and second virtual links vl_1 , vL2 are mapped to the physical link CLinkl

The flow translation module 1 10 is operable to configure the physical network infrastructure 300 with respect to mapped virtual network topology 220' based on virtual flow entries received by the system 100 from, for example, the controller node 210 of the mapped virtual network topology 220'.

In one example scenario, the flow translation module 1 10 receives flow entries from the controller node 210 of the mapped virtual network topology 220' to establish a flow between the hosts H1 , H5 via the first and third virtual switches vS1 , vS3. Figure 5 shows an example embodiment of a method 400 of the present invention, performed by the flow translation module 1 10 with respect to in response to these virtual flow entries.

At step 410, the flow translation module 1 10 receives a first virtual flow entry in respect of the first virtual link vL1 . The first virtual flow entry is received for configuration of a flow table of the first virtual switch vS1 . For example, the first virtual flow entry may be represented as follows: vS1 : {in_port: 1 , ip_dest: x.x.x.x; actions= output: 2}. The first virtual flow entry is intended to cause each packet of a packet flow at the first virtual switch vS1 , whose header identifies the destination IP address "x.x.x.x" (i.e., the host H5) and which is received via the first virtual port of the first virtual switch vS1 ("in_port: 1 "), to be forwarded or outputted via the second virtual port of the first virtual switch vS1 ("action = output: 2"). A packet flow as used herein means, from the perspective of a switch, whether virtual or physical, a sequence of packets that matches a specific entry in a flow table of the switch, where the entry specifies a condition (e.g., "in_port: 1 " and "ip dest: x.x.x.x") for an action (e.g., "action = output: 2").

At step 420, the flow translation module 1 10 provides a first physical flow entry relating to the first virtual flow entry in respect of the first physical link CLinkl mapped to the first virtual link vL1 for causing a packet flow matching the first physical flow entry to be forwarded via the first physical link CLinkl with an identifier of the first virtual link vL1 and based on a parameter of the first virtual link vL1 . The first physical flow entry is provided for configuration of a flow table of the first physical switch pS1 to reflect the first virtual flow entry. For example, the first physical flow entry may be represented as follows: pS1 : {in_port: 1 , ip_dest: x.x.x.x; actions = meter: 1 , LTag: 1 , output: 5}.

The first physical flow entry is intended to cause each packet of the packet flow at the first physical switch pS1 , whose header identifies the destination IP address "x.x.x.x" and which is received by the first physical port of the first physical switch pS1 , to be forwarded via the fifth physical port of the first physical switch pS1 with the identifier of the first virtual link vL1 by the first physical switch pS1 in a manner reflective of the parameter of the first virtual link vL1 . In this embodiment, the identifier of the first virtual link vL1 includes a link number of 1 ("LTag: 1 ") identifying the first virtual link vL1 , and the parameter of the first virtual link vL1 includes a meter value of 1 ("actions = meter: 1 ") affecting performance of the first physical link CLinkl in respect of the first virtual link vL1 . Other arbitrary forms of identifier can also be used, as long as they uniquely identify corresponding virtual links within the flow table of a physical switch. In this embodiment, the identifier is associated with each packet of the matching packet flow by way of packet tagging. The parameter represents at least one of a rate limit and a burst size limit in this embodiment, and may represent other metering parameters in other embodiments. The rate limit is dependent on the virtual topology. If a virtual link is specified to have a transmission speed of 1 Gbps, then the rate limit is set to ". That is, the unit of the rate limit is Gbps. On the other hand, the burst size limit is switch specific, and is identified or determined by the abstraction module 120 based on a buffer size of the switch. It should also be noted that packets of a matching packet flow are tagged with the LTag value of the identifier but not the meter value of the parameter. Instead, the meter value serves as a basis for affecting the operation of the first physical switch pS1 in respect of the matching packet flow. The abstraction module 120 is configured with information pertaining to switch port buffer size, and is configured to partition or slice the buffer size by allocating the burst size of each traffic based on the respective rate limit. The buffer size may be calculated by the abstraction module 120 in any suitable way. For example, a UDP traffic of burst N size may be sent at the nth time instance while a TCP traffic is being sent at the line-rate. Assuming the UDP traffic suffers a loss of K bytes and the TCP traffic suffers a loss of L bytes, the buffer size B may be calculated to be B = N - (K + L).

At step 430, the flow translation module 1 10 receives a second virtual flow entry in respect of a virtual link of the third virtual switch vS3 adjacent to the first virtual link vL1 and connecting to the host H5. For example, the second virtual flow entry may be represented as follows: vS3: {in_port: 2, ip_dest: x.x.x.x; actions= output: 1 }.

The second virtual flow entry is received for configuration of a flow table of the third virtual switch vS3. The second virtual flow entry is intended to cause each packet of the matching packet flow at the second virtual switch vS2, whose header identifies the destination IP address "x.x.x.x" (i.e., the host H5) and which is received via the second virtual port of the third virtual switch vS3 ("in_port: 2), to be forwarded or outputted via the first virtual port of the third virtual switch vS3 ("actions= output: 1 ").

At step 440, the flow translation module 1 10 provides a second physical flow entry relating to the second virtual flow entry in respect of a physical link mapped to the virtual link of the host H5 and interconnecting the first physical port of the second physical switch pS2 and the host H5. The second physical flow entry is provided for configuration of the second physical switch pS2. For example, the second physical flow entry may be represented as follows: pS2: {in_port: 5, ip_dest: x.x.x.x; actions= RemoveLTag, output: 1 }.

The second physical flow entry is intended to cause each packet of the packet flow at the second physical switch pS2, whose header identifies the destination IP address "x.x.x.x" and which is received via the fifth physical port of the second physical switch pS2, to be forwarded via the first physical port of the second physical switch pS2 with the identifier of the first virtual link vL1 removed.

By providing the first and second physical flow entries, based on the first and second virtual flow entries, for configuration of the flow tables of the first and second physical switches pS1 , pS2, the flow between the hosts H1 , H5 via the first and third virtual switches vS1 , vS3 is established.

Shown in the lower portion of Figure 6 is an alternative arrangement of the first and second physical switches pS1 , pS2. In contrast with the arrangement shown in Figure 4, in the alternative arrangement, the third and fourth physical ports of the first physical switch pS1 are interconnected by a cable to form a physical link of 1 Gbps, referred to herein as the first loopback link and marked in Figure 6 as "loopLinkT. Similarly, in the alternative arrangement, the third and fourth physical ports of the second physical switch pS2 are interconnected by another cable to form another physical link of 1 Gbps, referred to herein as the second loopback link and marked in Figure 6 as "loopLink2".

Further referring to the upper right-hand portion of Figure 6, the virtual network topology 220 is mapped in this scenario to the first and second physical switches pS1 , pS2 to form a mapped virtual network topology 220" different from the mapped virtual network topology 220' of Figure 4. In particular, in this arrangement, the third virtual link vL3 is mapped to the first loopback link.

In the scenario of Figure 6, the flow translation module 1 10 performs the method 400 with respect to flow entries received from the controller node 210 of the mapped virtual network topology 220" to establish a flow between the hosts H1 , H2 via the first and second virtual switches vS1 , vS2.

At step 410 in respect of Figure 6, the flow translation module 1 10 receives a first virtual flow entry in respect of the third virtual link vL3 for configuration of the flow table of the first virtual switch vS1 . For example, the first virtual flow entry may be represented as follows:

The first virtual flow entry is intended to cause each packet of the matching packet flow at the first virtual switch vS1 , whose header identifies the destination IP address "z.z.z.z" (i.e., the host H2) and which is received via the first virtual port of the first virtual switch vS1 ("in_port: 1 ), to be forwarded or outputted via the third virtual port of the first virtual switch vS1 ("actions= output: 3").

At step 420 in respect of Figure 6, the flow translation module 1 10 provides a first physical flow entry relating to the first virtual flow entry in respect of the first loopback link (loopLinkl ) mapped to the third virtual link vL3 for causing a packet flow matching the first physical flow entry to be forwarded via the first loopback link with an identifier of the third virtual link vL3 and based on a parameter of the third virtual link vL3. The first physical flow entry is provided for configuration of a flow table of the first physical switch pS1 to reflect the first virtual flow entry. For example, the first physical flow entry may be represented as follows: pS1 : {in_port: 1 , ip_dest: z.z.z.z; actions= meter: 1 , LTag: 3, output: 3}.

The first physical flow entry is intended to cause each packet of the packet flow at the first physical switch pS1 , whose header identifies the destination IP address "z.z.z.z" and which is received via the first physical port of the first physical switch pS1 , to be forwarded via the third physical port of the first physical switch pS1 with the identifier of the third virtual link vL3 and in a manner reflective of the parameter of the third virtual link vL3. In this embodiment, the identifier of the third virtual link vL3 includes a link number of 3 ("LTag: 3") identifying the third virtual link vL3, and the parameter includes a meter value of 1 ("actions = meter: 1 ") affecting performance of the first loopback link in respect of the third virtual link vL3. The parameter represents at least one of a rate limit and a burst size limit.

At step 430 in respect of Figure 6, the flow translation module 1 10 receives a second virtual flow entry in respect of a virtual link of the second virtual switch vS2 adjacent to the third virtual link vL3 and connecting to the host H2. The second virtual flow entry is received for configuration of a flow table of the second virtual switch vS2. For example, the second virtual flow entry may be represented as follows: vS2: {in_port: 2, ip_dest: z.z.z.z; actions= output: 1 }. The second virtual flow entry is intended to cause each packet of the matching packet flow at the second virtual switch vS2, whose header identifies the destination IP address "z.z.z.z" (i.e., the host H2) and which is received via the second virtual port of the second virtual switch vS2 ("in_port: 2), to be forwarded or outputted via the first virtual port of the second virtual switch vS2 ("actions= output: 1 ").

At step 440 in respect of Figure 6, the flow translation module 1 10 provides a second physical flow entry relating to the second virtual flow entry in respect of a physical link mapped to the virtual link of the host H2 and interconnecting the second physical port of the first physical switch pS1 and the host H2. The second physical flow entry is provided for configuration of the first physical switch pS1 . For example, the second physical flow entry may be represented as follows: pS1 : {in_port: 4, ip_dest: z.z.z.z, LTag: 3; actions= RemoveLTag, output: 2}.

The second physical flow entry is intended to cause each packet of the packet flow at the first physical switch pS1 , whose header identifies the destination IP address "z.z.z.z" and which is received via the fourth physical port of the first physical switch pS1 , to be forwarded via the second physical port of the first physical switch pS1 with the identifier of the third virtual link vL3 removed. By providing the first and second physical flow entries based on the first and second virtual flow entries for configuration of the flow table of the first physical switch pS1 , the flow between the hosts H1 , H2 via the first and second virtual switches vS1 , vS2 is established. It is worth noting that, contrary to the arrangement of Figure 4 where the third virtual link vL3 is mapped in the form of a bounce link to the fifth physical port of the first physical switch pS1 with respect to the fifth physical port of the second physical switch pS2, the third virtual link vL3 of Figure 6 is instead mapped to the first loopback link (loopLinkl ). The loopback link requires the corresponding physical ports to be linked by a cable, and does not suffer from disadvantages of the bounce link, such as additional traffic, queueing and lookup overheads arising from the need for a traversing packet flow to leave a physical switch to be received and bounced back by another physical switch, which involves two forwarding operations. A virtual inter-switch link mapped to a loopback link is thus comparatively resource efficient and achieves a higher emulation fidelity since a packet traversing a virtual inter-switch link involves only one forwarding operation. Furthermore, multiple virtual inter-switch links belonging to the same topology or different topologies can be mapped to a single loopback link. The loopback link can be used in place or in conjunction with, and is preferred over, the bounce link. Furthermore, because a packet traversing a virtual inter-switch link mapped to a loopback link actually leaves from and returns to the same physical switch, delay and queueing behaviour of the physical switch can be faithfully emulated. Referring again to Figure 2, this is evident in that one of the dashed lines (labelled as "BNV"), representing a measurement of Ping RTT obtained using the BNV system 100, closely follows the other dashed line (labelled as "Multi- Switch"), representing a measurement of Ping RTT obtained using an actual network, for the same network topology.

As can be appreciated from a comparison of the scenarios of Figures 4 and 6, each physical switch, apart from connecting to hosts and/or other switches, may have one or more loopback links. A loopback link connects two physical ports belonging to a same switch. Figure 7 shows a static configuration of loopback links while Figure 8 shows a dynamic configuration of loopback links. In the case of the dynamic configuration, each L2 switch is arranged between a plurality of hosts and a plurality of SDN Switchs (ToR). With the dynamic configuration, the ports can be dynamically allocated to the hosts and the loopback links during runtime through VLAN or circuit reconfiguration. It can be understood from the examples of Figures 4 and 6 that a virtual flow entry for a virtual switch includes only an outbound port of the virtual switch and a destination IP address. It should be noted that each physical link (connected to two physical switches) and each loopback link (connected to a single physical switch) can be mapped to multiple virtual links (each connecting one virtual switch to another).

Figure 26 shows a topology mapped to two physical switches. The topology includes four hosts, five links and six ports. Two switches are needed because of the requirement of the bounce link interconnecting port 3 of the switch on the left and port 4 of the switch on the right. Shown in Figure 27 is the same topology mapped to a single switch implementing a loopback link between ports 3 and 4. The loopback link, as would be appreciated by a skilled person, provides numerous advantages. Firstly, higher flexibility and scalability in mapping of arbitrary network topologies can be achieved without incurring additional overhead (contrary to bounceback links) on the core backbone links, which may otherwise be over-subscribed. Consistency and fidelity in emulation results can be improved where loopback links are used. Where a loopback link is associated with a sole tenant, traffic isolation can be achieved since a loopback link in such a configuration can be a dedicated link not shared by other tenants.

As a person skilled in the art would appreciate, multiple virtual inter-switch links may be mapped to a single physical loopback link or a single physical inter- switch link. Referring again to Figure 4, assuming a flow is to be established from the host H2 to the host H1 via the second, third and first virtual switches vS2, vS3, vS1 in the given order, the following virtual flow entries may be received by the flow translation module 1 10: vS2: {in_port: 1 , ip_dest: y.y.y.y; actions= output: 3},

vS3: {in_port: 3, ip_dest: y.y.y.y; actions= output: 2}, and vS1 : {in_port: 2, ip_dest: y.y.y.y; actions= output: 1 }.

The flow translation module 1 10 is operable to perform steps similar to those of the method 400 in relation to the above virtual flow entries so as to provide the following physical flow entries, respectively: pS1 : {in_port: 2, ip_dest: y.y.y.y; actions= meter: 1 , LTag: 2, output: 5}, pS2: {in_port: 5, ip_dest: y.y.y.y, LTag: 2; actions= meter: 1 , LTag: 1 , output: 5}, and

pS1 : {in_port: 5, ip_dest: y.y.y.y, LTag: 1 ; actions= RemoveLTag, output:

1 }.

Further assuming that another virtual network topology is to be mapped to the physical network infrastructure 300, a virtual inter-switch link of that other virtual network topology can be mapped, for example, to the first physical link CLinkl and its flow or traffic can be distinguished by a corresponding identifier. That is to say, a flow or traffic of each virtual link mapped to a same physical link (e.g., an inter-switch or loopback) can be distinguished by the identifier of the respective virtual link. Consider the following example (not shown), where two identical virtual flow entries for respective virtual switches vS10, vS20 of different topologies result in two corresponding physical flow entries for a physical switch pS5: vS10: {in_port: 1 ; ip_dest: w.w.w.w; output: 2}, and

vS20: {in_port: 1 ; ip_dest: w.w.w.w; output: 2}; and pS5: {in_port: 1 , ip_dest: w.w.w.w, LTag: 5; actions= meter: 1 , LTag: 1 1 , output: 5}, and

pS5: {in_port: 1 , ip_dest: w.w.w.w, LTag: 6; actions= meter: 1 , LTag: 21 , output: 5}. It can be understood that while identical virtual flow entries are received for the virtual switches vS10, vS20 of different topologies, respective matching packet flows in the physical switch pS5 distinguishable by the different identifiers (link numbers, LTags, of 5 and 6) can be forwarded accordingly with new or replacement LTag numbers of 1 1 and 21 , respectively. That is, the LTag numbers of 5 and 6 identify the virtual links in respect of which the packet flows are received by the physical switch pS5, and the LTag numbers of 1 1 and 21 identify the virtual links in respect of which the packet flows are to be forwarded by the physical switch pS5, respectively. In this particular examples, the respective matching packet flows correspond to the same parameters (meter values of 1 ).

Therefore, it can be appreciated that the conversion or translation of a virtual flow entry into a physical flow entry with an identifier associated with a parameter of a corresponding virtual link provides flowspace isolation and dataplane isolation, which is useful for virtualising multiple links to support arbitrary topologies.

The topology abstraction module is configured to implement dataplane isolation. Where multiple virtual links share a single physical link, those virtual links share the same queues. To reduce interference, each virtual link is prevented from exhausting the port buffers. This is achieved by partitioning the maximum burst size of the corresponding physical link for allocation to the virtual links.

Furthermore, where a single physical switch is mapped to (e.g., sliced into) multiple virtual switches, it is important to partition the buffer of the physical switch for allocation to the virtual switches such that each virtual switch can use only the allocated buffer portion.

Shown in Figure 9A is a simple model to illustrate a gain (e.g., a capacity gain) achievable with the use of loopback links. The model includes N switches. Each switch has H ports. The ports can be connected to hosts or arranged to form loopback links. Each switch is also connected to a backbone network with a link of capacity L (uplink + downlink). For simplicity, assume that each backbone link, with a capacity of L, can support up to L inter-switch (switch-to-switch) links.

Referring to the diagram on the left of Figure 9B, where no loopback link is implemented and NH hosts need to be provisioned using the virtual network, up to inter-switch links can be supported in the best case. However, where all switch pairs are mapped with connecting links on the same physical switches, up to Y inter-switch links can be supported in the worst case. Where fewer hosts are needed in the virtual network, 4 (host) ports (2 host ports on each switch) can be used to support one switch-to-switch link.

Referring to the diagram on the right of Figure 9B, in the case where loopback links are implemented and NH hosts need to be provisioned using the virtual network, up to inter-switch links can be supported in any case. Where fewer

hosts are connected (i.e., additional physical ports are available), one inter- switch link can be added for each pair of available physical ports of the same physical switch. The maximum number of inter-switch links can be expressed as

The solidly shaded area in Figure 9B shows the number of inter-switch links always feasible or supported. The gain of using loopback links over no loopback link is In terms of the total feasible area, the gain is

In response to a virtual network topology received from a user, the mapping module 1 10 functioning as a network embedder calculates and returns an optimal mapping of resources of the physical network infrastructure 300 for the user, taking into account future expandability. The user becomes a tenant once the topology is mapped. The mapping module 1 10 uses integer linear programming, ILP, to increase the fidelity of the embedding of the virtual topology and to decrease the overheads. Scalability of mapping may also be improved. The virtual network topology received from the user consists of a set of: hosts V, virtual switches S, and virtual links L. Each virtual link is bi-directional and has a bandwidth b. A link can be an inter-switch link (or a corelink) L_c connecting one switch to another, or a hostlink L_h connecting a switch to a host. The mapping module 1 10 converts the received virtual network topology into a form where the switches are broken down into core links, each core link interconnecting two switch ports, and hostlinks, each hostlink connecting a switch port to a host. In this arrangement, each virtual switch S can be represented as a set of switch- links or hostlinks. For each core link, a TCAM size t(s) can be specified. Figure 28 shows a table of notation used in an optimisation model of the mapping module 1 10.

Mapped to the virtual network topology, a physical topology consists of a set of: physical server machines H, physical switches R and physical links Q. Similar to the virtual network topology, the physical links Q can be categorized into corelinks Q_c and hostlinks Q_h. Each link i has a bandwidth bj. Each corelink Qc can be either a loopback link of a single physical switch or a backbone link between two physical switches.

A binary decision variable x_iv has a value set to 1 if the virtual corelink i is mapped to a physical corelink v and set to 0 if otherwise. Similarly, y_jw indicates a virtual hostlink j mapped to a physical hostlink w. The mapping of a hostlink means the mapping of a virtual host to a physical server. Equation (1 ) represents an objective function:

The objective function has two sigma terms. The first one represents the amount of resources to support the mapping of the corelinks (x_ivbi) and the second one represents the resources needed to map a single virtual switch over multiple physical switches Equation (1 ) is intended to minimise the

usage of physical or substrate backbone (core) links. These links may otherwise be under provisioned relative to the hostlinks. This maximises the usage of loopback links, providing a higher emulation fidelity. The constraints are:

With constraint (2), each virtual corelink is to be mapped to only one substrate corelink. With constraint (3), each virtual hostlink is to be mapped to only one physical hostlink. With constraint (4), each physical hostlink is to be provisioned within its capacity. With constraint (5), each physical host is not allocated beyond its core capacity. The notation uses hostlink instead of host. However, in the model, hostlink is synonymous to host. Further, constraints (4) and (5) are needed only for VM-based provisioning, such as OpenStack, and are not needed for bare-metal provisioning because each virtual host is allocated an entire server blade. Constraint (6) ensures that the TCAM capacity bounds of the switch are not exceeded. The TCAM specified for each virtual switch is preferably split equally among its corelinks and hostlinks for simplicity of allocation.

Constraints (7) to (9) are used to model the resources needed to support

intra-switch traffic (between physical switch p,q) for a virtual switch m, mapped to multiple substrate or physical switches. To ensure the provision of sufficient bandwidth, the number of hosts/links mapped on each physical switch needs to be taken into account, as well as the amount of traffic that can be sent and received.

Constraint (7) defines a variable z_{m n} which represents the total bandwidth of virtual links belonging to virtual switch m and mapped to physical switch n. Constraint (8) creates a R x R matrix for each virtual switch which indicates the amount of intra-switch bandwidth between any two physical switches corresponding to a single virtual switch. It allocates the minimum (host/core) link bandwidth provided by each physical switch for a given pair of physical switches mapped to a same big virtual switch. Figure 10 illustrates an example configuration where a virtual switch with six hosts are embedded onto three physical switches, with one switch mapped to three hosts, another switch mapped to one host, and the remaining switch mapped to two hosts. Each virtual switch is shown to be associated with a table containing bandwdith information for each host. Constraint (9) ensures that the physical corelinks are provisioned within their capacity accounting both inter-switch links and intra- switch link utilization.

In relation to the above objective function and its constraints, buffer partitioning can be readily incorporated in the form of metering by way of rate and burst size limits (parameter information), and implemented using the abstraction module 120.

The BNV Mapper (or the mapping module 1 10) is further implemented to allow evolution or conversion of one topology to another with a reduced or minimal overall migration. The BNV mapper is configured to perform: snapshot mapping and remapping. Snapshot mapping is where a snapshot of the current mapping of host/core links is taken for the tenant, with the mapping recorded as x_i' _v (from

Remapping follows snapshot mapping, and is where the

ILP is erformed for the tenant with a modified objective as follows:

The modified objective function maximises or increases mapping similarity between the original virtual topology and the modified virtual topology by attempting to use the same switch ports and hosts where possible, thereby minimizing potential migration.

A BNV system, such as that shown in Figure 3, can be implemented using Java over OpenVirtex with OpenFlow 1 .3. The BNV system can be integrated with a bare-metal provisioning system, such as DeterLab. Gurobi can be used as an ILP solver for the purpose of the mapping module 1 10. The system receives virtual network topologies from users in the form of NS files. The SDN switches are specified by defining the switch-type as "ofswitch" (i.e., OpenFlow Switch). Controller nodes are created so that the users can use their own SDN controllers which are connected to the BNV system using VLAN provisioning via an out-of-band network. Figure 1 1 shows a table of API commands for virtual network creation by the BNV system. "createNetwork" initialises a tenant, and identifies a controller IP address and port. "createVSwitch" creates a virtual switch. It performs two types of mapping: multiple-to-one abstraction, which takes several physical switches and abstracts them into a single virtual switch; and one-to-multiple abstraction, which slices a physical switch into multiple virtual switches for the same tenant. This API command can be called multiple times to create multiple virtual switches. "createVSwitchPort" takes as an input a virtual switchID returned by "createVSwitch", a max outgoing rate and burst (implementing the metering function), a physical switchID and a port number to be assigned to the virtual switch. "createVSwitchPort" returns a unique vPort for the vSwitch. "createCoreLink" creates a virtual link between two virtual switch ports. "createHostLink" creates a virtual link between a host and a virtual switch port.

The BNV system virtualises the physical network by maintaining a mapping table {pSwitch, pSwitchPort} -> {Switch}. Any incoming packet, is virtualized to a particular virtual network by identifying the switch port and the physical switch from which the packet is received and the LTag value (i.e., the identifier) of the packet. In order to implement the virtual link-tags (LTag), MAC address encoding of OpenVirtex can be used. An incoming packet of a virtual network undergoes a MAC translation to encode the LTag (32-bit) and, if needed, metering for rate limit. This isolates multiple tenants and also multiple virtual links within the same tenant, allowing a single physical link to be mapped to multiple virtual links for the same tenant or different tenants, thus achieving dataplane isolation and controlplane isolation.

Through experiments conducted using a production testbed of the National Cybersecurity Lab, it has been shown that the functionalities of OpenFlow 1 .3 (meters, groups, etc.) and custom network applications (BGP, congestion control, etc.) can be used with no compatibility issues. Figure 12 shows a schematic diagram of the network testbed, including four HP3800 SDN switches, each connecting to a cluster of 24 Lenovo X3550 servers. The SDN switches are connected to one another using a core switch which is used only for L2 connectivity. The server blades are also connected to a control switch, which uses out-of-band management (IPMI, PXE booting). Each SDN switch has 12 loopback links and a 10G uplink connecting to the core switch. Each loopback link is formed by connecting two ports using a short Ethernet cable. Software- configurable loopback is configured for one cluster in the staging platform.

Performance of the BNV system is discussed below, focusing on the ability of the BNV system to provide a network that faithfully emulates queueing and delay behaviours (i.e., performance fidelity), to flexibly embed complex topologies, and to provide performance isolation for multiple tenants.

Performance fidelity of the BNV system is evaluated with the use of the one-to- multiple and multiple-to-one abstractions. Figure 13 shows different topology configurations used for evaluating the performance fidelity. Figure 1 1 (a) shows a star topology physically implemented with 1 switch and 16 servers. Figure 1 1 (b) shows a physical topology (i.e., a topology configuration of a physical network infrastructure) with 4 switches and 16 hosts virtualising the star topology of Figure 1 1 (a) using the BNV system, where the core switch is employed solely for L2 connectivity. Figure 1 1 (c) shows a clos topology physically implemented with 4 switches and 16 hosts (based on the configuration shown in Figure 12 with rewiring to bypass the core switch). Figure 1 1 (d) shows a physical topology with 1 switches and 16 hosts virtualising the clos topology of Figure 1 1 (c) using the BNV system, where the core switch is employed solely for L2 connectivity. The switch of Figure 1 1 (d) is configured with loopback links.

For the purpose of the evaluation, an instance of Apache Spark application for a wordcount of a 50GB file is run. Figure 14 shows CDFs of shuffle read times thus obtained. As can be understood from Figure 14, the CDFs of the physically implemented and emulated networks are very similar. Through a comparison of Figures 13(a) and 13(b), it can be observed that the behaviour of the application is substantially unaffected by the longer physical hops and the use of multiple physical switches to support a single virtual switch. Through a comparison of Figures 1 1 (c) and 1 1 (d), it can be observed that the behaviour of the application is substantially unaffected by the use of loopback links.

Flexibility refers to the ability of the BNV system to map arbitrary network topologies and to allow fine-grain topology optimisation for a particular workload. The BNV system can quickly (typically within a few seconds) perform network virtualisation to support multiple datacentre topologies, including FatTree, Clos, Hyper- X, JellyFish, Mesh, Ring, HyperCube, and Xpander. Figure 15 shows a FatTree mapped to a substrate topology (i.e., a physical topology).

For the purpose of demonstrating flexibility, Big-data application workloads are used. Four experiments of different topologies are created, including: a single rooted tree topology (16 hosts, 15 switches), a star topology (16 hosts, 1 switch), a FatTree topology with a degree of 4 (16 hosts, 20 switches), and a JellyFish topology (16 hosts, 20 switches). An instance of Apache Spark application for a wordcount of a 50GB file is run and custom partitioning is used in order to increase inter-rack level traffics. For each topology, the application is run 10 times with the averaged results shown in Figure 16.

As can be observed from Figure 16, the binary tree topology has the longest average shuffle read time due to its very limited bisection bandwidth. The star topology has the shortest average shuffle read time because it has full bisection bandwidth. The FatTree and JellyFish topologies have shuffle average shuffle read times falling between those of the binary tree and star topologies. For each of the FatTree and JellyFish topologies, the traffic is equally split using ECMP based on link utilisation calculated at the controller. The obtained results demonstrate the flexibility of the BNV system to implement different characteristics of virtual topologies virtualised on the same substrate network. Further referring to Figure 17, whilst similar trends of shuffle read times are observed in respect of the FatTree and Jellyfish topologies, the shuffle read times of the JellyFish topology are, on average, about 7-8% shorter than those of the FatTree topology. This difference is attributed to higher high inter-rack traffics of the FatTree topology. Data placement in Spark is varied for a higher intra-pod locality. The JellyFish topology performs significantly better than the FatTree topology where substantial inter-rack network traffics are present, and that the FatTree topology performs better than the JellyFish topology where there are substantial intrarack exchanges. As one skilled in the art would appreciate, the BNV system is able to emulate multiple network topologies with a high fidelity and a high repeatability by maintaining the key characteristics of each topology.

Isolation refers to the ability of the BNV system to isolate traffics. Figure 18 shows experiment results obtained for three applications associated with respective topologies. The first application (Expt-1 ) involves a FatTree4 topology including 16 hosts and 20 switches, running a wordcount of 1 GB file using Spark continuously with a heavy inter-rack traffic for average shuffle time measurement. The second application (Expt-2) involves a JellyFish (random topology) topology including 16 hosts and 20 switches, running iperf among all pairs for aggregate throughput measurement. The third application (Expt- 3) involves a random topology with the 8 remaining switch ports and the 8 remaining hosts, running ping between each pair of the nodes with the highest hop count for ping measurement.

The experiment is designed as follows:

1 ) For the first 60 minutes, the applications run in succession for 20 minutes each;

2) For the next 20 minutes (i.e., from the 60^th minute to the 80^th minute, marked by the shaded region), the applications run in parallel; and

3) For the next 60 minutes (i.e., from the 80^th minute to the 140^th minute) the applications run in succession for 20 minutes each. As can be appreciated from Figure 18, each of the applications experiences little or no performance drop when run in parallel, achieving a performance similar to that achieved by the respective application running individually.

Buffer partitioning refers to the ability of the BNV system to prevent exhaustion of buffer resources by traffics of a mapped virtual network topology. When multiple network topologies share a physical link (e.g., a backbone link) without buffer partitioning, a burst of traffic for one topology can quickly fill up the buffer and transmission queue, thus impacting performances of the other network topologies. Buffer partitioning can be implemented by allocating the maximum burst for a particular virtual link or set of ports (using OpenFlow meters, as discussed above). This requires the abstraction module of the BNV system to take into consideration the switch port buffer size as a mapping constraint.

Figures 19 and 20 show throughput percentage measurements obtained from two experiments conducted to show the effect of metering on throughput maintenance. For the experiment of Figure 19, two virtual network topologies are allocated different slices of a same switch, with one topology configured with a burst of UDP traffic (sending at the maximum rate) from 1 1 hosts to 1 host, and with the other topology configured with a TCP traffic transmission at line-rate between two hosts. Figure 19 shows the throughput of the TCP transmission, with the shaded region marking the period of the UDP burst. Clearly, as a result of buffer isolation, the UDP traffic has little or no impact on the TCP traffic. It should be noted that, although some switch architectures perform buffer isolation inherently, the effect of such inherent buffer isolation is suppressed or reduced in the experiment by allocating a burst rate to each port of each virtual switch and by applying the same meter to all flows belonging to the ports of the virtual switch. The obtained results of Figures 19 and 20 are thus generic and applicable to most switches. In this way, the amount of buffer that can be accessed by a single flow or a set of flows can be allocated. Isolation, fidelity and repeatability can thus be achieved. For the experiment of Figure 20, a backbone link is shared by two virtual links. Whether the virtual links belong to a same topology or respective topologies is unimportant in the context of Figure 20. One tenth of the bandwidth of a physical link is allocated to a first virtual link while the remaining nine tenth of the bandwidth is allocated to a second first virtual link. A TCP traffic is arranged to traverse the first virtual link while 10 UDP traffics for saturating the physical link are arranged to traverse the second virtual link in order to affect the TCP traffic. The TCP traffic occurs continuously throughout the experiment while the UDP traffics occur from the 60^th second to the 120^th second and after the 180^th second. Metering is enabled for the first 180 seconds and is disabled thereafter. As can be understood from Figure 20, during the period where metering is enabled (marked by the shaded region on the left), the TCP traffic of the first virtual link experiences little or no impact on the throughput. In contrast, during the period where metering is disabled (marked by the shaded region on the right), the TCP traffic experiences a significant impact on the throughput. Metering in the context of Figures 19 and 20 mean burst and rate limiting. Thus, the BNV system can achieve isolation, fidelity and repeatability even in the presence of multiple virtual network topologies by virtue of controlplane isolation (using packet tagging) and dataplane isolation (rate limiting, and buffer slicing using burst-rate limiting).

As discussed above, users may submit their topologies as NS files. Specified in the NS files are the types of switch (by default, software switch, and ofswitch for BNV). The NS files are parsed for compatibility with the mapping module of the BNV system, with switches converted to switch-ports to the. For each tenant that has an SDN switch (ofswitch), at least one controller node is created through which a user of the tenant can use their own SDN controllers via VLAN based provisioning. The BNV system is implemented to include a switch identification module, which virtualizes the physical switch based on the port, and the identifier (LTag). To evaluate the traffic isolation performance, an instance of the BNV system containing one programmable HP3800 SDN switch is used. The HP3800 has 48 ports, of which 24 are connected to hosts, and the other 24 ports are used to make 12 loop-back links. Spanning tree is disabled so that the loopback links are not disabled by the Spanning-Tree module. A large number of tenants of different topologies (multi-switch, hence using loopback links) are created so that no tenant can be further added. An iperf² traffic is created to so that the achievable percentage bandwidth of each workload can be observed. Initially, an iperf3 traffic is generated by one tenant for the first 30 minutes. Thereafter, iperf³ traffics are then generated by all of the other tenants for 20 minutes (i.e., from the 30^th minute to the 50^th minute). Figure 21 shows an observed line-rate performance. This result is within expectation since the packet does not go through any software layers and since the switching is bare-metal. No changes in throughput is caused by the other tenants' traffic (marked by the dashed line).

Shown in Figure 22 is a network topology used for testing the network embedding efficiency of the BNV system. The network topology consists of five SDN switches connected to a Core L2 switch using respective links of 40 Gbps. Each of SDN switches (HP3800) consist of 48 downlink ports (1 G each), with 24 ports connected to respective physical servers, and the other 24 ports configured to form loopback links, resulting in 12 loopback links per switch.

In one experiment, a FatTree topology of an increasing degree is fitted to a restrictive physical topology. All links in the FatTree are links of 1 Gbps. For simplicity, only one tenant is considered in this evaluation. We increase the degree of FatTree, and try to map two variants of the topology: 1 ) Generic Topology (no loops) and, 2) Loop-back topology until the mapping becomes infeasible. Figure 23 shows the percentage of backbone bandwidth utilisation. Each data point is averaged over 10 runs and a successful (optimal) mapping is observed. Generally, the network size is increased as long as the core bandwidth is available. We observe that without loopback links, core bandwidth utilization increases immediately due to overhead of bounce-back links. Without loopback links, a FatTree with degree 6 (212 links (1 Gbps each) cannot be mapped. In contrast, with loopback links, a FatTree with degree 6 can be mapped with a better utilization performance of the core backbone's bandwidth. In another experiment, random topologies of an increasing number of switches and links are fitted to a restrictive physical topology. Two sets of random topologies with a fixed number of n switches are considered. The number of links for one set is 2n. The number of links for the other set is nVn. Network mapping is performed with 10 set of topologies for each variant of random topology. Figure 24, left, shows a measured percentage of backbone link bandwidth utilization against the topology size (proportional to the number of switches). For the random topologies with 2n links, the gain is clearly evident. Larger topologies (up to 130 nodes and 260 links) can be accommodated with loopback links compared to only 60 nodes (and 120 links) without loops. This translates to a gain of approximately 2. Similarly, referring to Figure 24, right, for topologies with nVn links, larger topologies of 40 switches (252 links) can be accommodated with loopback links. In contrast, only 25 switches (125 links) can be accommodated without loopback links. This translates to a gain of approximately 2. We observe running times for the network mapper algorithm to be about 10 ms to (100-250) ms for larger topologies.

Referring to Figure 25, in another experiment (internet topology zoo), the entire set of topologies from the Internet Topology Zoo is taken and embedded onto the physical topology. The physical topologies are sorted according to the number of links in the graph and the backbone link usage percentage with an increasing size of the internet zoo topologies is plotted on the graph. Figure 25 shows that about 96.5% (252 of 261 ) of the topologies can be mapped using the given physical topology with loopback links. Overall, 50% more topologies can be accommodated with loopback links in the network. The mapping module of the BNV system is therefore able to embed a large collection of topology. The addition of loopback links further increases the efficiency significantly. The BNV system can emulate topologies of up to 130 switches, and up to 300 links with just 5 top-of-rack (ToR) switches.

In summary, the BNV system allows network experimentation to be performed with high fidelity at both the dataplane and controlplane, using programmable switches. The BNV system can support mapping of arbitrary network topologies with high efficiency to switches with loopback links in order to provide high fidelity. The ILP based formulation facilitates efficient embedding of complex topologies to substrate networks. The BNV system is able to accurately emulate the characteristics of the topologies implemented over the substrate network while isolating operations of the topologies.

Claims

1 . A network virtualisation method, comprising:

receiving a first virtual flow entry in respect of a virtual link; and

providing a physical flow entry relating to the virtual flow entry in respect of a physical loopback link mapped to the virtual link for causing a packet flow matching the physical flow entry to be forwarded via the physical loopback link with an identifier of the virtual link;

wherein the physical loopback link corresponds to a pair of physical ports of a physical switch.

2. The method of claim 1 , wherein the matching packet flow is caused to be forwarded via the physical loopback link further based on a parameter of the virtual link affecting performance of the physical loopback link.

3. The method of claim 2, wherein the parameter represents at least one of a rate limit and a burst size limit.

4. The method of any one of the preceding claims, wherein the virtual flow entry identifies an outbound port of the virtual link and a destination IP address.

5. The method of claim 4, wherein the virtual flow entry further identifies an inbound port of an adjacent virtual link adjacent to the virtual link.

6. The method of any one of the preceding claims, comprising:

receiving another virtual flow entry in respect of another virtual link subsequent to the virtual link; and

providing another physical flow entry relating to the another virtual flow entry in respect of a physical link mapped to the another virtual link for causing the packet flow further matching the physical flow entry to be further forwarded via the physical link.

7. The method of claim 6, wherein the packet flow is caused to be forwarded via the physical link with the identifier removed.

8. The method of claim 6 or 7, wherein the packet flow is caused to be forwarded via the physical link with the another virtual link's identifier.

9. The method of any one of the preceding claims, further comprising:

receiving a further virtual flow entry in respect of a further virtual link; and providing a further physical flow entry relating to the further virtual flow entry in respect of the physical loopback link mapped to the further virtual link.

10. The method of claim 9, wherein the virtual link and the further virtual link belong to respective network topologies.

1 1 . A network virtualisation method comprising mapping a virtual inter-switch link to a pair of interconnected ports of a physical switch.

12. A computer-readable medium comprising instructions for causing a processor to perform the method of any one of claims 1 to 1 1 .

13. A virtualisation network, comprising:

a plurality of virtual links adapted for configuration based on virtual flow entries; and

a physical loopback link mapped to the virtual links and adapted for configuration based on respective physical flow entries relating to the virtual flow entries so as to forward a packet flow matching one of the physical flow entries with an identifier of the corresponding virtual link;

14. The virtualisation network of claim 13, further comprising: a plurality of network topologies each including a plurality of virtual switches, each virtual link interconnecting two of the virtual switches of one of the network topologies;

a physical network infrastructure including the physical loopback link;

a mapping module operable to map the physical loopback link to the virtual links; and

a flow translation module operable to provide the physical flow entries based on the virtual flow entries.