AU2013251205B2 - Interface monitoring for link aggregation - Google Patents


Info

Publication number
AU2013251205B2
Authority
AU
Australia
Prior art keywords
heartbeats
network interfaces
network
aggregated
link aggregation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
AU2013251205A
Other versions
AU2013251205A1 (en)
Inventor
Zhengrong Ji
Yuguang Wu
Junlan Zhou
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US12/364,634
Priority to AU2010210714A (patent AU2010210714C1)
Application filed by Google LLC
Priority to AU2013251205A (patent AU2013251205B2)
Publication of AU2013251205A1
Application granted
Publication of AU2013251205B2
Assigned to GOOGLE LLC (Request to Amend Deed and Register; Assignor: GOOGLE, INC.)
Application status is Active
Anticipated expiration


Abstract

The present invention provides network interface monitoring and management that may be employed with link aggregation technologies. Multiple network interfaces may be aggregated into a single bond and data may be transferred to and from a backbone network via this aggregated bond. A link aggregation monitor employs a heartbeat generator, sniffer and data store to keep track of health and availability of network interfaces. The heartbeat generator sends heartbeats to the network interfaces, which pass the heartbeats around in a token ring configuration. If a network interface fails or otherwise goes offline, detection of this condition causes the monitor and heartbeat generator to prepare new or modified heartbeats so that data may be efficiently and accurately routed around the token ring and health of all remaining alive interfaces can be monitored properly. If a network interface re-enters or is added to the aggregate bond, new/modified heartbeats are then employed.

Description

INTERFACE MONITORING FOR LINK AGGREGATION

[0001] The present application is a divisional application from Australian Patent Application No. 2010210714, the entire disclosure of which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

Field of the Invention

[0001a] The present invention relates generally to computer networks and, more particularly, to link aggregation interface monitoring.

Description of Related Art

[0002] On a networked machine, it is possible to increase the communication bandwidth or the availability of network connectivity by using multiple interfaces concurrently. This is known as "link aggregation." Link aggregation technologies enable a machine deployed with multiple network interfaces, called "slaves," to aggregate the bandwidth of multiple interfaces and to maintain network connectivity despite interface failures. In particular, existing link aggregation technologies support two modes of operation: load balancing and active/standby. The load balancing mode requires the router(s)/switch(es) connected to the networked machine to support link aggregation technologies as well. In load balancing mode, the networked machine uses multiple network interfaces concurrently. The active/standby mode does not require link aggregation support on the router(s)/switch(es) connected to the networked machine. In this mode, the networked machine uses only one network interface; the other network interfaces operate as "standby slaves" and do not transmit/receive. If the active slave fails, the networked machine switches to one of the standby slaves and uses the new active slave for transmitting and receiving packets.

[0003] It is possible for one or more links to go down or otherwise fail. A link failure may degrade or prevent communication among devices on the network. This can be a serious problem in network communication. In the past, monitoring techniques such as ARP monitoring and MII monitoring have been used to evaluate aggregated links.
[0004] In ARP monitoring, ARP requests are sent to designated peers in the network, and the health of slave interfaces is determined based on any received ARP replies. One limitation of this technique is that it relies on the "liveness" of designated peers. Another limitation is that it may not be used in an active/standby link aggregation mode. In this mode, only one active slave is allowed to transmit and receive, and the health of standby slave interfaces (not allowed to send/receive packets) cannot be determined.

[0005] MII monitoring checks only the carrier state of aggregated interfaces. It does not detect interface failure when the carrier state is up but the link is down due to bad cables or other issues.

[0006] A reference herein to a patent document or other matter which is given as prior art is not to be taken as an admission that that document or matter was known or that the information it contains was part of the common general knowledge as at the priority date of any of the claims.

BRIEF SUMMARY OF THE INVENTION

[0007] In accordance with aspects of the present invention, robust interface monitoring and management is provided for link aggregation technologies.
[0007a] According to a first aspect the present invention provides a link aggregation system in a computer network, comprising: a plurality of network interfaces including a first network interface and a second network interface aggregated into a single bond and providing an aggregated communication link to a network; and a link aggregation monitor coupled to the aggregated first and second network interfaces, the link aggregation monitor configured to provide heartbeats to and receive heartbeats from the aggregated network interfaces to determine a health status of each aggregated network interface; wherein each heartbeat is configured by the link aggregation monitor as a frame having a destination address, source address and heartbeat identifier, and wherein the frame further includes a heartbeat sequence identifier and timestamp to record a system time when a given heartbeat is generated; and wherein the link aggregation monitor maintains a variable hardware address for each of the aggregated network interfaces for transmitting heartbeats to the aggregated network interfaces, and the variable hardware address is reset to a predetermined address upon a fail-over and changed to a newly assigned address when a heartbeat is received from the network interfaces.
[0007b] According to a second aspect the present invention provides a link aggregation monitoring apparatus, comprising: a heartbeat generator configured to prepare heartbeats for a plurality of network interfaces arranged in an aggregated communication link to a network; a data store configured to maintain source and destination addresses of the heartbeats for use by each of the network interfaces and to maintain transmit and receive counters for the plurality of network interfaces, the data store being further configured to provide network interface data to the heartbeat generator; and a sniffer configured to observe the heartbeats received by the plurality of network interfaces, the sniffer being further configured to send information associated with the observed heartbeats to the data store, the information including sequence numbers of heartbeats; wherein the link aggregation monitoring apparatus is configured to correlate the transmit and receive counters with the sequence numbers; and wherein the heartbeat generator is further configured to analyze a health status of each network interface in the aggregated communication link based on the network interface data received from the data store, and to prepare new heartbeats to account for any nonfunctional network interfaces and any added network interfaces.
[0007c] According to a third aspect the present invention provides a link aggregation method for use in a computer network, the method comprising: configuring a plurality of network interfaces including a first network interface and a second network interface into an aggregated single bond for providing an aggregated communication link to a network; providing heartbeats to the aggregated network interfaces for circulation among each of the network interfaces in the aggregated single bond; receiving the heartbeats from the aggregated network interfaces; determining a health status of each aggregated network interface based upon the received heartbeats; and maintaining a variable hardware address for each of the aggregated network interfaces for transmitting the heartbeats to the aggregated network interfaces, the variable hardware address being reset to a predetermined address upon a fail-over and changed to a newly assigned address when a heartbeat is received from the network interfaces; wherein each heartbeat is configured as a frame having a destination address, source address and heartbeat identifier, and wherein the frame further includes a heartbeat sequence identifier and timestamp to record a system time when a given heartbeat is generated. 
[0007d] According to a fourth aspect the present invention provides a link aggregation monitoring method, comprising: preparing heartbeats for a plurality of network interfaces arranged in an aggregated communication link to a network; maintaining in a data store source and destination addresses of the heartbeats for use by each of the network interfaces, the data store being configured to provide network interface data to a heartbeat generator; maintaining in the data store transmit and receive counters for the plurality of network interfaces; observing the heartbeats received by the plurality of network interfaces with a sniffer, the sniffer being configured to send information associated with the observed heartbeats to the data store, the information including sequence numbers of heartbeats; correlating the transmit and receive counters with the sequence numbers; analyzing a health status of each network interface in the aggregated communication link based on the network interface data received from the data store; and preparing new heartbeats with the heartbeat generator to account for any nonfunctional network interfaces and any added network interfaces.

[0008] In accordance with one embodiment of the present invention, a link aggregation system in a computer network is provided. The link aggregation system comprises a plurality of network interfaces and a link aggregation monitor. The plurality of network interfaces includes a first network interface and a second network interface aggregated into a single bond. They provide an aggregated communication link to a network. The link aggregation monitor is coupled to the aggregated first and second network interfaces. The link aggregation monitor is configured to provide heartbeats to and receive heartbeats from the aggregated network interfaces to determine a health status of each aggregated network interface.
Each heartbeat is configured by the link aggregation monitor as an Ethernet frame having a destination address, source address and heartbeat identifier.

[0009] In one alternative, the Ethernet frame further includes a heartbeat sequence identifier and timestamp to record a system time when a given heartbeat is generated. In another alternative, if a given one of the plurality of network interfaces fails, the given network interface is removed from the aggregated bond and the link aggregation monitor prepares new heartbeats configured to omit the failed network interface. In a further example, if an additional network interface becomes available, the additional network interface is added to the aggregated bond and the link aggregation monitor prepares new heartbeats configured to include the added network interface. In yet another example, the aggregated network interfaces are arranged in a token ring configuration and the heartbeats are routed around the token ring.

[0010] In accordance with another embodiment of the present invention, a link aggregation monitoring apparatus is provided. The apparatus comprises a heartbeat generator, a data store and a sniffer device. The heartbeat generator is configured to prepare heartbeats for a plurality of network interfaces arranged in an aggregated communication link to a network. The data store is configured to maintain source and destination addresses of the heartbeats for use by each of the network interfaces. The data store is further configured to provide network interface data to the heartbeat generator. The sniffer device is configured to observe the heartbeats received by the plurality of network interfaces. The sniffer device is further configured to send information associated with the observed heartbeats to the data store.
The heartbeat generator is further configured to analyze a health status of each network interface in the aggregated communication link based on the network interface data received from the data store, and to prepare new heartbeats to account for any nonfunctional network interfaces and any added network interfaces.

[0011] In one example, the data store is further configured to maintain transmission and reception statistics of selected heartbeats based on the information sent by the sniffer device. In this case, the transmission and reception statistics may include at least one of timestamps, counters and sequence numbers.

[0012] In another example, the link aggregation monitoring apparatus further comprises a user interface configured to display heartbeat statistics associated with the health status of selected network interfaces.

[0013] In an alternative example, each network interface arranged in the aggregated communication link is assigned a slave ID associated with a unique address, and each network interface uses the unique address associated with the slave ID of its heartbeat's intended receiver as a destination address. In this case, assignment of the slave IDs to respective network interfaces may depend on a link aggregation mode. Optionally, the heartbeat generator is further configured to detect a fail-over condition when a signal reports that a given one of the network interfaces arranged in the aggregated communication link has switched from a standby mode to a primary mode.

[0014] In accordance with yet another embodiment of the present invention, a link aggregation method for use in a computer network comprises: configuring a plurality of network interfaces including a first network interface and a second network interface into an aggregated single bond for providing an aggregated communication link to a
network; providing heartbeats to the aggregated network interfaces for circulation among each of the network interfaces in the aggregated single bond; receiving the heartbeats from the aggregated network interfaces; and determining a health status of each aggregated network interface based upon the received heartbeats; wherein each heartbeat is configured as an Ethernet frame having a destination address, source address and heartbeat identifier.

[0015] In one example, the Ethernet frame further includes a heartbeat sequence identifier and timestamp to record a system time when a given heartbeat is generated. In another example, if a given one of the plurality of network interfaces fails, the method further comprises: removing the failed network interface from the aggregated bond; and preparing new heartbeats configured to omit the failed network interface.

[0016] In a further example, if an additional network interface becomes available, the method further comprises: adding the additional network interface to the aggregated bond; and preparing new heartbeats configured to include the added network interface. And in yet another example, the aggregated network interfaces are arranged in a token ring configuration and the heartbeats are routed around the token ring.

[0017] In accordance with another embodiment of the present invention, a
link aggregation monitoring method comprises preparing heartbeats for a plurality of network interfaces arranged in an aggregated communication link to a network; maintaining in a data store source and destination addresses of the heartbeats for use by each of the network interfaces, the data store being configured to provide network interface data to a heartbeat generator; observing the heartbeats received by the plurality of network interfaces with a sniffer device, the sniffer device being configured to send information associated with the observed heartbeats to the data store; analyzing a health status of each network interface in the aggregated communication link based on the network interface data received from the data store; and preparing new heartbeats with the heartbeat generator to account for any nonfunctional network interfaces and any added network interfaces.

[0018] In one example, the data store is further configured to maintain transmission and reception statistics of selected heartbeats based on the information sent by the sniffer device. Here, the transmission and reception statistics may include at least one of timestamps, counters and sequence numbers.

[0019] In another example, the link aggregation monitoring method further comprises displaying heartbeat statistics associated with the health status of selected network interfaces to a user.

[0020] In a further example, the link aggregation monitoring method also comprises assigning a slave ID to each network interface arranged in the aggregated communication link; and associating each slave ID with a unique address; wherein each network interface uses the unique address associated with the slave ID of its heartbeat's intended receiver as a destination address; and wherein each network interface uses the unique address associated with its assigned slave ID as the source address of its outgoing heartbeats.
In this case, assignment of the slave IDs to respective network interfaces may depend on a link aggregation mode. And in another alternative, the link aggregation monitoring method may further comprise detecting a fail-over condition when a signal reports that a given one of the network interfaces arranged in the aggregated communication link has switched from a standby mode to a primary mode.

DESCRIPTION OF THE DRAWINGS

[0021] FIG. 1 illustrates a link aggregation monitoring system in accordance with aspects of the present invention.

[0022] FIG. 2 illustrates a link monitoring apparatus in accordance with aspects of the present invention.

[0023] FIGS. 3A-C illustrate heartbeat distribution in accordance with aspects of the present invention.

[0024] FIG. 4 illustrates a heartbeat configuration in accordance with aspects of the present invention.

DETAILED DESCRIPTION

[0025] The aspects, features and advantages of the present invention will be appreciated when considered with reference to the following description of preferred embodiments and accompanying figures. The following description does not limit the present invention; rather, the scope of the invention is defined by the appended claims and equivalents.

[0026] FIG. 1 illustrates a system 100 that implements link aggregation monitoring in accordance with aspects of the present invention. The system 100 includes a host device 102, a backbone network 104, and internetworking devices 106 such as routers, hubs, bridges or switches. The host device 102 may comprise a server, PC, network switch, etc. In one example, the host device 102 may be a switch used in a network datacenter. As shown, the host device 102 is coupled to the backbone network 104 via the internetworking devices 106.

[0027] The host device 102 includes a link aggregation monitor 108 and a pair of network interfaces 110 identified as "Eth0" and "Eth1." While only two network interfaces 110 are illustrated, more than two such interfaces may be employed with the embodiments of the invention presented herein. The network interfaces 110 are desirably aggregated into a single bond 112. As shown by arrows 114, the link aggregation monitor 108 enables the host device 102 to exchange "heartbeats" among the network interfaces 110 and monitor the transmit ("Tx") and receive ("Rx") health of the links to the backbone network 104. As used herein, the term "heartbeat" includes messages of a unique type that may be configured as data frames (e.g., Ethernet frames) for transmission among multiple network elements. Details and examples of various heartbeat formats are provided below.

[0028] The link aggregation monitor 108 desirably includes a heartbeat generator (not shown), which constructs heartbeats to be sent and forwards the heartbeats to the desired network interfaces/device drivers.
The network interfaces/device drivers will, in turn, transmit heartbeats out on a physical layer/medium ("PHY") to the backbone network 104. In addition, the heartbeat generator receives PHY down/up events of network interfaces, and removes/adds interfaces in a token ring or loop-type configuration.

[0029] As will be discussed in more detail below, the heartbeats are exchanged among network interfaces in a token ring pattern. The heartbeats flow in one of two directions through the network interfaces 110, the internetworking devices 106 and the backbone network 104. For instance, as shown in FIG. 1, a first heartbeat 116a may pass in a clockwise direction first through network interface Eth1, through a first internetworking device, the backbone network, a second internetworking device and then through network interface Eth0. And a second heartbeat 116b may pass in a counterclockwise direction first through network interface Eth0, through a first internetworking device, the backbone network, a second internetworking device and then through network interface Eth1.

[0030] FIG. 2 is a block diagram 200 illustrating one example of the link aggregation monitor 108 of FIG. 1. The link aggregation monitor 108 may include or otherwise be logically associated with certain devices. As noted above, a heartbeat generator 202 is desirably part of the link aggregation monitor 108. Also shown in the block diagram 200 are a data store 204, a sniffer 206, a command line interface 208 and a web interface 210. Each of these elements may be part of the link aggregation monitor 108. Alternatively, some or all of these elements may be separate components and/or programs used by the link aggregation monitor. These elements are discussed in more detail below.

[0031] As noted above, the heartbeat generator 202 generates outgoing heartbeats. Source and destination MAC addresses of heartbeats are desirably provided by the data store 204.
When a heartbeat is sent, the heartbeat generator 202 may send a transmit/Tx event to the data store 204, which increments a Tx counter maintained for all network interfaces 110 being aggregated (e.g., Eth0, Eth1).

[0032] A primary function of the data store 204 is to store, update, and output various information concerning the network interfaces 110. For each interface, the data store 204 may maintain the source/destination MAC addresses of heartbeats to be transmitted from a given interface. The data store 204 may also maintain transmission and reception statistics of heartbeats. Data such as timestamps (e.g., primary and standby slaves' Tx and Rx times), counters (e.g., primary and standby slaves' Tx and Rx counters) and sequence numbers (e.g., primary and standby slaves' Tx and Rx sequence numbers) may be received and maintained by the data store 204. It also desirably receives PHY up/down events and Tx/Rx events of heartbeats for network interfaces from other components, and updates its database accordingly. "PHY up" indicates that a driver or other device detects a carrier state on the network interface. The above information concerning network interfaces is provided to other components, such as the heartbeat generator 202, upon request.

[0033] The sniffer 206 intercepts or otherwise observes heartbeats received on the network interfaces 110. The sniffer desirably forwards information concerning received heartbeats (e.g., the incoming network interface, source/destination MAC address, sequence number of received heartbeats, etc.) to the data store 204. By correlating the Tx/Rx counters and sequence numbers of primary and standby interfaces, the link aggregation monitor may determine any losses of heartbeats.

[0034] The command line interface 208 provides a user interface in which users may query Tx/Rx statistics regarding the network interfaces 110. The web interface 210 provides a web page displaying Rx/Tx statistics of the network interfaces 110.
The command line and web interfaces may provide additional functionality such as enabling a user to manage operation of the link aggregation monitor 108 and/or the aggregation bond 112 of the network interfaces 110.

[0035] In one embodiment, the link aggregation monitor 108 is implemented as a module run as a single process. The module desirably runs a loop that multiplexes I/O events from at least the heartbeat generator 202 and the sniffer 206. In one example, the heartbeat generator 202 sends heartbeats periodically from Eth0 and Eth1. During its initialization, the heartbeat generator 202 opens two raw sockets, one on Eth0 and the other on Eth1. The heartbeat generator 202 then adds a periodic alarm/indicator to send heartbeats from Eth0 and from Eth1 in an interleaved manner. This may be done at a fixed interval, which may be set or otherwise configured by a heartbeat interval flag.

[0036] In this example, the heartbeats from Eth0 are destined to Eth1, and vice versa, to monitor the Tx and Rx health of a standby link. The sniffer 206 desirably opens a raw socket on the aggregate bond to intercept all heartbeats received on Eth0 and Eth1. The above two threads will read and write into the data store 204, which holds transmission/reception statistics of heartbeats as well as other global control information including the heartbeat format. The data store 204 may also serve link health requests triggered by user inputs. And the data store 204 may report link health data via the web interface 210.

[0037] FIG. 3A illustrates an exemplary token-ring arrangement for pairing multiple network interfaces 110 and exchanging heartbeats among them in accordance with aspects of the present invention. As noted above, more than two network interfaces may be employed. In the example of FIG. 3A, eight network interfaces 110 (Eth0-Eth7) are used.
It should be understood that any number of network interfaces greater than two may be used in an aggregated bond 112 in accordance with aspects of the present invention.

[0038] In the example of FIG. 3A, it is assumed that all network interfaces are PHY up and can transmit data. The nth network interface, eth(n), exchanges heartbeats with eth(n+1). In the present example, eth(n) is considered to be Tx healthy if eth(n+1) receives all heartbeats transmitted by eth(n). eth(n) is considered to be Rx healthy if it receives all heartbeats from eth(n-1).

[0039] In accordance with an aspect of the present invention, when one interface eth(i) fails, it is removed from the token ring. The remaining interfaces maintain an aggregate bond. In this case, eth(i-1) now sends heartbeats to eth(i+1). eth(i-1) is considered to be Tx healthy if eth(i+1) receives all heartbeats from eth(i-1). Similarly, eth(i+1) is considered to be Rx healthy if it receives all heartbeats from eth(i-1).

[0040] An example of how heartbeat flow changes as interfaces go up and down is shown in FIGS. 3B-C. In FIG. 3B, when eth1 is PHY down, eth0 starts to send heartbeats directly to eth2. In this case, eth1 is removed from the token ring. Information regarding the modified token ring may be stored in data store 204. New/modified heartbeats are generated by the heartbeat generator 202 using such information.
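The ring maintenance described in paragraphs [0038] to [0040] can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `ring_peers` is hypothetical, and the wrap-around from the highest-numbered alive interface back to the lowest is an assumption drawn from the token ring arrangement of FIG. 3A.

```python
from typing import Dict, Iterable


def ring_peers(alive: Iterable[int]) -> Dict[int, int]:
    """Map each alive interface index to the index that receives its heartbeats.

    Hypothetical helper: interfaces are ordered by index and the last one
    wraps around to the first (an assumption about the ring layout).
    """
    ordered = sorted(alive)
    if len(ordered) < 2:
        return {}  # a ring needs at least two members to exchange heartbeats
    return {
        src: ordered[(k + 1) % len(ordered)]
        for k, src in enumerate(ordered)
    }


# All eight interfaces PHY up, as in FIG. 3A.
full_ring = ring_peers(range(8))

# eth1 goes PHY down, as in FIG. 3B: eth0 now sends directly to eth2.
degraded_ring = ring_peers(i for i in range(8) if i != 1)

# eth1 comes back up, as in FIG. 3C: the original pairing is restored.
restored_ring = ring_peers(range(8))
```

Recomputing the whole mapping on every PHY up/down event keeps the sketch simple; a data store holding the alive set, as in the description, would feed this function.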

[0041] At some point, the disabled/inactive interface eth1 may become fully operational. When this occurs, the interface eth1 may be incorporated into the token ring. As shown in FIG. 3C, when eth1 transitions from PHY down to up, eth1 is added to the token ring. In this case, the heartbeats are reconfigured so that eth0 sends heartbeats to eth1 and eth1 sends heartbeats to eth2.

[0042] The heartbeats in accordance with aspects of the present invention include the format shown in FIG. 4. Heartbeats are desirably configured as Ethernet frames with a new ethertype, ETH_P_HEARTBEAT, to distinguish them from other types of Ethernet frames, such as ARP, IP, etc. Each heartbeat carries a sequence number (e.g., 32 bits) generated by the sending network interface, a timestamp field (e.g., 64 bits) recording the system time when the heartbeat is generated, and optionally a padding field if needed to satisfy the minimum length of an Ethernet frame. The length of the padding field may be reduced to accommodate new fields in the heartbeat frame. Note that heartbeats are sent from and received by the same server host. Therefore, byte ordering of fields in the heartbeats is not a problem.

[0043] To identify the sending network interface of a given heartbeat, each network interface is assigned a slave ID. Each slave ID is desirably assigned a unique MAC address. A network interface may always use the MAC address assigned to its slave ID as the source MAC of its heartbeats. The network interface uses the MAC address assigned to the slave ID of its heartbeats' intended receiver (such as in the token ring of FIGS. 3A-C) as the destination MAC address. As shown in FIG. 4, network interface eth(i) is desirably assigned slave ID i, and its heartbeats are sent to slave ID (i+1).

[0044] In accordance with another aspect of the present invention, assignment of MAC addresses to slave IDs, and assignment of slave IDs to network interfaces, depends on the mode of link aggregation.
These include load balancing mode and active/standby mode. In active/standby mode, only one network interface, referred to as the primary slave, is actively transmitting and receiving. When the primary slave fails, one standby slave is selected to become the new primary slave and to transmit/receive traffic. The active/standby mode is used when a switch or other device connected to the slave interfaces does not support link aggregation, and can associate a MAC address with only one interface at any time. The interface associated with a specific MAC address can change over time. In one example, each switch is deployed with at least two network interfaces aggregated in active/standby mode.

[0045] By correlating heartbeat Tx/Rx counts of a standby network interface with heartbeat Tx/Rx counts of a primary network interface, the Tx/Rx quality of a standby link may be determined by the link aggregation monitor 108.

[0046] In load balancing mode, traffic originated from (the application of) the server host is spread across all network interfaces that are PHY up. In both modes, application traffic sent by the host (from different network interfaces) carries the same MAC, referred to as the primary MAC, which is typically eth0's permanent MAC address stored in its EEPROM.

[0047] In load balancing mode, each network interface is assigned a unique slave ID ranging from 0 to n-1, where n is the total number of available network interfaces. Each slave needs a unique MAC address which is different from the primary MAC address. Among the total n different MACs required, n-1 can be selected from the permanent MAC addresses of network interfaces being aggregated, whose MACs are not chosen as the primary MAC. The nth MAC address used for this scheme is new. In one example, the assignment of MAC addresses to slave IDs, and the assignment of slave IDs to network interfaces, never changes.
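The heartbeat layout of paragraph [0042] and the slave-ID addressing of paragraphs [0043] and [0047] can be sketched together. This is a hedged illustration under stated assumptions: the source names the ethertype ETH_P_HEARTBEAT but gives no value, so `0x88B5` (an IEEE local experimental EtherType) is a placeholder; the field order, network byte order, the 60-byte minimum (frame without FCS), and the example MAC addresses are likewise assumptions, and `build_heartbeat` and `parse_heartbeat` are hypothetical helpers.

```python
import struct
import time

ETH_P_HEARTBEAT = 0x88B5  # assumption: placeholder value, not given in the source
ETH_HEADER = struct.Struct("!6s6sH")  # destination MAC, source MAC, ethertype
HEARTBEAT_BODY = struct.Struct("!IQ")  # 32-bit sequence number, 64-bit timestamp
MIN_FRAME_LEN = 60  # assumed minimum Ethernet frame length, excluding the FCS


def build_heartbeat(slave_macs, sender_id, seq, now_ns=None):
    """Build a heartbeat from slave `sender_id` to slave (sender_id + 1) mod n."""
    n = len(slave_macs)
    src = slave_macs[sender_id]
    dst = slave_macs[(sender_id + 1) % n]  # intended receiver in the ring
    timestamp = time.time_ns() if now_ns is None else now_ns
    frame = ETH_HEADER.pack(dst, src, ETH_P_HEARTBEAT)
    frame += HEARTBEAT_BODY.pack(seq & 0xFFFFFFFF, timestamp)
    return frame.ljust(MIN_FRAME_LEN, b"\x00")  # zero padding to minimum length


def parse_heartbeat(frame):
    """Return the heartbeat fields, or None if the frame is not a heartbeat."""
    dst, src, ethertype = ETH_HEADER.unpack_from(frame, 0)
    if ethertype != ETH_P_HEARTBEAT:
        return None
    seq, timestamp = HEARTBEAT_BODY.unpack_from(frame, ETH_HEADER.size)
    return {"dst": dst, "src": src, "seq": seq, "timestamp": timestamp}


# Illustrative, locally administered MACs for slave IDs 0..2 (made up).
slave_macs = [bytes([0x02, 0, 0, 0, 0, i]) for i in range(3)]
frame = build_heartbeat(slave_macs, sender_id=2, seq=7, now_ns=123456789)
fields = parse_heartbeat(frame)
```

A sniffer could use the `src` field to look up the sending slave ID and the `seq` field to detect gaps, which is the correlation the description attributes to the data store.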

[0048] In active/standby mode, the assignment of a MAC address to a slave ID desirably never changes. Thus, in an example, slave ID 0 is assigned the primary MAC. Slave IDs 1 to n-1 are each assigned a unique MAC from the other (n-1) permanent MAC addresses of network interfaces which are not used as the primary MAC. The assignment of slave IDs to network interfaces changes as interfaces go up and down. The active interface may always be assigned to slave ID 0, and standby interfaces may be assigned to slave IDs 1 to n-1.

[0049] When a standby network interface is PHY down, existing "alive" network interfaces desirably do not change their slave IDs. When the active network interface "dies", the new active network interface may switch to slave ID 0. When a network interface becomes alive again, if it is not the active interface, it may employ a slave ID between 1 and n-1 that is not being used by any other alive interface. If a network interface replaces the existing active interface (to be the new active interface), then in this example the replacement network interface uses slave ID 0. The previous active interface would then change to a slave ID ranging from 1 to n-1 that is not used by any other alive interface.

[0050] In the example of FIG. 3A, if a heartbeat is lost, either the sender (A) or the receiver (B) of that heartbeat may be the cause of the failure. To further identify which interface is faulty, in accordance with another aspect of the present invention, the sender of the heartbeat then sends broadcast heartbeats (with a broadcast MAC address as the destination). If at least one interface in the token ring receives the broadcast heartbeats, the receiver (B) is determined to be faulty. The receiver is then removed from the token ring as illustrated in FIG. 3B. If no interfaces in the token ring receive the broadcast heartbeats, then the sender (A) is considered faulty. The sender is then removed from the token ring.
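By way of illustration only, one round of the broadcast test described in paragraph [0050] might be modeled as follows. This is a minimal Python sketch rather than the implementation of the invention. The function name `isolate_fault`, its arguments, and the decision to ignore any reception reported by the sender itself are assumptions made for the example.

```python
def isolate_fault(ring, sender, receiver, broadcast_receivers):
    """Decide which end of a lost heartbeat is faulty.

    ring: ordered list of interface names currently in the token ring.
    sender, receiver: the two interfaces involved in the lost heartbeat.
    broadcast_receivers: interfaces that observed the sender's broadcast
        heartbeats (for example, as reported by the sniffer).

    Returns (faulty_interface, new_ring).
    """
    # Assumption: the sender's own reception does not count as evidence.
    others = set(ring) - {sender}
    heard = others & set(broadcast_receivers)
    # If anyone in the ring heard the broadcast, the sender can transmit,
    # so the receiver is taken to be faulty; otherwise the sender is.
    faulty = receiver if heard else sender
    new_ring = [iface for iface in ring if iface != faulty]
    return faulty, new_ring
```

For a ring of eth0, eth1 and eth2 in which a heartbeat from eth0 to eth1 is lost, a broadcast heard by eth2 marks eth1 as faulty, while a broadcast heard by no interface marks eth0 as faulty.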
This process continues until all faulty interfaces are identified and removed from the token ring.

[0051] To minimize bandwidth usage, heartbeats may be configured as unicast frames whenever possible. Heartbeats may have broadcast destination addresses in the active/standby mode of link aggregation, when the switch has incorrect knowledge of a port attached to a MAC address. This situation occurs after a fail-over, when a formerly standby slave becomes the new active slave and is assigned a different slave ID and thus a new MAC address. The switch desirably learns the new port attached to a MAC address through the heartbeats initiated from this MAC address. After fail-over, if a network interface is assigned a different MAC address, the destination address of heartbeats sent to this network interface should first be set to a broadcast MAC address, and then set to its newly assigned MAC address after the sniffer has received heartbeats from this network interface using the newly assigned MAC address. To keep track of the appropriate destination MAC address to use for outgoing heartbeats, the link aggregation monitor desirably maintains a variable heartbeat destination MAC for each network interface. Upon a fail-over, for network interfaces assigned different MAC addresses, their heartbeat destination MAC addresses are desirably reset to a predetermined address, such as ff:ff:ff:ff:ff:ff. Once the sniffer has received a heartbeat from these interfaces, the heartbeat destination MAC addresses of these interfaces are changed to their newly assigned MAC addresses. Fail-over is detected by the heartbeat generator when a signal on a bonding driver reports that a network interface switches from "standby" to "primary".

[0052] The sniffer 206 may monitor such operations/events and direct the data store 204 to modify its database accordingly. As faulty network interfaces are identified and removed from the token ring, the data store database is updated, and the heartbeat generator 202 may configure new or modified heartbeats accordingly. Such information may be provided to the command line and/or web interfaces.

[0053] While certain steps and configurations have been described and illustrated in a particular order, it should be understood that such actions may occur in a different order or concurrently. By way of example, the token ring configuration of FIG. 3A illustrates an aggregate bond including all network interfaces eth0 through eth7. However, different token ring topologies may be employed. For instance, one or more aggregate bonds may be configured. In this case, a first aggregate bond may include a subset of network interfaces arranged in a token ring, such as interfaces eth0, eth2, eth4 and eth6. A second aggregate bond may include a second subset of network interfaces arranged in another token ring, such as interfaces eth1, eth3, eth5 and eth7. Furthermore, the token ring configurations disclosed herein may be used on machines in any network topology. Each machine on the network may employ its own set of heartbeats in its own token ring configuration.

INDUSTRIAL APPLICABILITY

[0054] The present invention enjoys wide industrial applicability including, but not limited to, computer network topologies and interface monitoring and management of link aggregation technologies.

Claims (22)

1. A link aggregation system in a computer network, comprising: a plurality of network interfaces including a first network interface and a second network interface aggregated into a single bond and providing an aggregated communication link to a network; and a link aggregation monitor coupled to the aggregated first and second network interfaces, the link aggregation monitor configured to provide heartbeats to and receive heartbeats from the aggregated network interfaces to determine a health status of each aggregated network interface; wherein each heartbeat is configured by the link aggregation monitor as a frame having a destination address, source address and heartbeat identifier, and wherein the frame further includes a heartbeat sequence identifier and timestamp to record a system time when a given heartbeat is generated; and wherein the link aggregation monitor maintains a variable hardware address for each of the aggregated network interfaces for transmitting heartbeats to the aggregated network interfaces, and the variable hardware address is reset to a predetermined address upon a fail-over and changed to a newly assigned address when a heartbeat is received from the network interfaces.
2. The link aggregation system of claim 1, wherein if a given one of the plurality of network interfaces fails, the given network interface is removed from the aggregated bond and the link aggregation monitor prepares new heartbeats configured to omit the failed network interface.
3. The link aggregation system of claim 1, wherein if an additional network interface becomes available, the additional network interface is added to the aggregated bond and the link aggregation monitor prepares new heartbeats configured to include the added network interface.
4. The link aggregation system of claim 1, wherein the aggregated network interfaces are arranged in a token ring configuration and the heartbeats are routed around the token ring.
5. A link aggregation monitoring apparatus, comprising: a heartbeat generator configured to prepare heartbeats for a plurality of network interfaces arranged in an aggregated communication link to a network; a data store configured to maintain source and destination addresses of the heartbeats for use by each of the network interfaces and to maintain transmit and receive counters for the plurality of network interfaces, the data store being further configured to provide network interface data to the heartbeat generator; and a sniffer configured to observe the heartbeats received by the plurality of network interfaces, the sniffer being further configured to send information associated with the observed heartbeats to the data store, the information including sequence numbers of heartbeats; wherein the link aggregation monitoring apparatus is configured to correlate the transmit and receive counters with the sequence numbers; and wherein the heartbeat generator is further configured to analyze a health status of each network interface in the aggregated communication link based on the network interface data received from the data store, and to prepare new heartbeats to account for any nonfunctional network interfaces and any added network interfaces.
6. The link aggregation monitoring apparatus of claim 5, wherein the data store is further configured to maintain transmission and reception statistics of selected heartbeats based on the information sent by the sniffer.
7. The link aggregation monitoring apparatus of claim 6, wherein the transmission and reception statistics include at least one of timestamps, counters and sequence numbers.
8. The link aggregation monitoring apparatus of claim 5, further comprising a user interface configured to display heartbeat statistics associated with the health status of selected network interfaces.
9. The link aggregation monitoring apparatus of claim 5, wherein each network interface arranged in the aggregated communication link is assigned a slave ID associated with a unique address, and each network interface uses the unique address associated with the slave ID of its heartbeat's intended receiver as a destination address.
10. The link aggregation monitoring apparatus of claim 9, wherein assignment of the slave IDs to respective network interfaces depends on a link aggregation mode.
11. The link aggregation monitoring apparatus of claim 9, wherein the heartbeat generator is further configured to detect a fail-over condition when a signal reports that a given one of the network interfaces arranged in the aggregated communication link has switched from a standby mode to a primary mode.
12. A link aggregation method for use in a computer network, the method comprising: configuring a plurality of network interfaces including a first network interface and a second network interface into an aggregated single bond for providing an aggregated communication link to a network; providing heartbeats to the aggregated network interfaces for circulation among each of the network interfaces in the aggregated single bond; receiving the heartbeats from the aggregated network interfaces; determining a health status of each aggregated network interface based upon the received heartbeats; and maintaining a variable hardware address for each of the aggregated network interfaces for transmitting the heartbeats to the aggregated network interfaces, the variable hardware address being reset to a predetermined address upon a fail-over and changed to a newly assigned address when a heartbeat is received from the network interfaces; wherein each heartbeat is configured as a frame having a destination address, source address and heartbeat identifier, and wherein the frame further includes a heartbeat sequence identifier and timestamp to record a system time when a given heartbeat is generated.
13. The link aggregation method of claim 12, wherein if a given one of the plurality of network interfaces fails, the method further comprises: removing the failed network interface from the aggregated bond; and preparing new heartbeats configured to omit the failed network interface.
14. The link aggregation method of claim 12, wherein if an additional network interface becomes available, the method further comprises: adding the additional network interface to the aggregated bond; and preparing new heartbeats configured to include the added network interface.
15. The link aggregation method of claim 12, wherein the aggregated network interfaces are arranged in a token ring configuration and the heartbeats are routed around the token ring.
16. A link aggregation monitoring method, comprising: preparing heartbeats for a plurality of network interfaces arranged in an aggregated communication link to a network; maintaining in a data store source and destination addresses of the heartbeats for use by each of the network interfaces, the data store being configured to provide network interface data to a heartbeat generator; maintaining in the data store transmit and receive counters for the plurality of network interfaces; observing the heartbeats received by the plurality of network interfaces with a sniffer, the sniffer being configured to send information associated with the observed heartbeats to the data store, the information including sequence numbers of heartbeats; correlating the transmit and receive counters with the sequence numbers; analyzing a health status of each network interface in the aggregated communication link based on the network interface data received from the data store; and preparing new heartbeats with the heartbeat generator to account for any nonfunctional network interfaces and any added network interfaces.
17. The link aggregation monitoring method of claim 16, wherein the data store is further configured to maintain transmission and reception statistics of selected heartbeats based on the information sent by the sniffer.
18. The link aggregation monitoring method of claim 17, wherein the transmission and reception statistics include at least one of timestamps, counters and sequence numbers.
19. The link aggregation monitoring method of claim 16, further comprising displaying heartbeat statistics associated with the health status of selected network interfaces to a user.
20. The link aggregation monitoring method of claim 16, further comprising: assigning a slave ID to each network interface arranged in the aggregated communication link; and associating each slave ID with a unique address; wherein each network interface uses the unique address associated with the slave ID of its heartbeat's intended receiver as a destination address; and wherein each network interface uses the unique address associated with its assigned slave ID as the source address of its outgoing heartbeats.
21. The link aggregation monitoring method of claim 20, wherein assignment of the slave IDs to respective network interfaces depends on a link aggregation mode.
22. The link aggregation monitoring method of claim 20, further comprising detecting a fail-over condition when a signal reports that a given one of the network interfaces arranged in the aggregated communication link has switched from a standby mode to a primary mode.
AU2013251205A 2009-02-03 2013-10-30 Interface monitoring for link aggregation Active AU2013251205B2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US12/364,634 2009-02-03
AU2010210714A AU2010210714C1 (en) 2009-02-03 2010-02-03 Interface monitoring for link aggregation
AU2013251205A AU2013251205B2 (en) 2009-02-03 2013-10-30 Interface monitoring for link aggregation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2013251205A AU2013251205B2 (en) 2009-02-03 2013-10-30 Interface monitoring for link aggregation

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU2010210714A Division AU2010210714C1 (en) 2009-02-03 2010-02-03 Interface monitoring for link aggregation

Publications (2)

Publication Number Publication Date
AU2013251205A1 AU2013251205A1 (en) 2013-11-21
AU2013251205B2 AU2013251205B2 (en) 2015-08-13

Family

ID=49584798

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2013251205A Active AU2013251205B2 (en) 2009-02-03 2013-10-30 Interface monitoring for link aggregation

Country Status (1)

Country Link
AU (1) AU2013251205B2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6229538B1 (en) * 1998-09-11 2001-05-08 Compaq Computer Corporation Port-centric graphic representations of network controllers
US20080016402A1 (en) * 2006-07-11 2008-01-17 Corrigent Systems Ltd. Connectivity fault management (CFM) in networks with link aggregation group connections

Also Published As

Publication number Publication date
AU2013251205A1 (en) 2013-11-21

Similar Documents

Publication Publication Date Title
US8400929B2 (en) Ethernet performance monitoring
US7043541B1 (en) Method and system for providing operations, administration, and maintenance capabilities in packet over optics networks
US7065040B2 (en) Ring switching method and node apparatus using the same
JP4212476B2 (en) Method for supporting SDH / SONET Automatic Protection Switching over Ethernet
US7924725B2 (en) Ethernet OAM performance management
JP4598462B2 (en) Provider network providing an L2-VPN service and edge router
US7768928B2 (en) Connectivity fault management (CFM) in networks with link aggregation group connections
US20050099949A1 (en) Ethernet OAM domains and ethernet OAM frame format
JP4257509B2 (en) Network system, node device, redundancy construction method, and redundancy construction program
US20030223376A1 (en) Testing network communications
JP2533998B2 (en) Automatic fault recovery in the packet network
CN1246994C (en) Method and system for implementing a fast recovery process in a local area network
US20050099951A1 (en) Ethernet OAM fault detection and verification
US20040208129A1 (en) Testing network communications
US7639605B2 (en) System and method for detecting and recovering from virtual switch link failures
US20070140126A1 (en) Method and system for originating connectivity fault management (CFM) frames on non-CFM aware switches
KR101706006B1 (en) A method and system for updating distributed resilient network interconnect states
US7630301B2 (en) Method and apparatus for line and path selection within SONET/SDH based networks
EP2144400B1 (en) Distributed ethernet system and method for detecting fault based thereon
US20060092856A1 (en) Node device
CA2651861A1 (en) Method and system for protecting a sub-domain within a broadcast domain
US9148297B2 (en) Information processor and control network system
WO2004112327A1 (en) Router and network connecting method
WO2011144495A1 (en) Methods and apparatus for use in an openflow network
US7869376B2 (en) Communicating an operational state of a transport service

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)
HB Alteration of name in register

Owner name: GOOGLE LLC

Free format text: FORMER NAME(S): GOOGLE, INC.