WO2022077972A1 - MLAG link failover method and apparatus - Google Patents

MLAG link failover method and apparatus

Info

Publication number
WO2022077972A1
Authority
WO
WIPO (PCT)
Prior art keywords
mlag
forwarding database
member interface
link
switch
Prior art date
Application number
PCT/CN2021/105665
Other languages
English (en)
French (fr)
Inventor
刘树名
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP21879020.2A priority Critical patent/EP4207646A4/en
Priority to JP2023521837A priority patent/JP2023544870A/ja
Publication of WO2022077972A1 publication Critical patent/WO2022077972A1/zh
Priority to US18/298,958 priority patent/US20230246949A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/28 Routing or path finding of packets in data switching networks using route fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/24 Multipath
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/24 Multipath
    • H04L45/245 Link aggregation, e.g. trunking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/44 Distributed routing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Definitions

  • the present application relates to a communication technology, and in particular, to a method and apparatus for MLAG link failover.
  • A multi-chassis link aggregation group (MLAG) aggregates links across two or more switches into an active-active system to improve link reliability.
  • Port 1 of switch 1 is connected to port 3 of switch 2 (the link between port 1 and port 3 is called a peer-to-peer link, also known as a peer-link) to form an MLAG system.
  • the server is connected to the MLAG system, that is, the server is simultaneously connected to switch 1 and switch 2 through two independent physical ports.
  • the server can use the active-active configuration to send packets to two switches at the same time, or configure the active-standby mode to send packets to only one switch (in case of failure, send packets to the other switch).
  • When switch 1 receives a packet sent from the network side, it looks up the local forwarding database (FDB), such as the MAC address table, according to the destination address of the packet to find the physical outgoing port corresponding to the server, such as port 2, and forwards the traffic to the server through port 2.
  • In one method, switch 1 updates the forwarding database, changing the outgoing port of every entry that corresponds to port 2 to port 1 (the peer-link). In another method, switch 1 deletes all entries corresponding to port 2, or deletes all entries in the forwarding database, and then re-learns MAC addresses to obtain the mapping between the server's MAC address and port 1, which completes the MLAG link switching.
  • both of these two methods have certain defects.
  • In the former method, refreshing the port values in the forwarding database must be processed by the CPU of switch 1.
  • During the refresh, the forwarding database must be frozen, so it cannot be used for packet forwarding, and forwarding on switch 1 is interrupted for a period of time.
  • The latter method causes switch 1 to broadcast packets for a certain period of time during the re-learning process.
  • the re-learning also takes a certain period of time.
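To make the cost of the first conventional method concrete, the following minimal Python sketch rewrites every entry that points at the failed port. The dictionary layout, the function name `rewrite_out_port`, and the table size are illustrative assumptions, not taken from the application.

```python
def rewrite_out_port(mac_table, failed_port, peer_link_port):
    """Conventional failover: rewrite every entry that uses the failed port.

    mac_table is a hypothetical MAC table: MAC address -> outgoing port.
    Returns the number of entries that had to be rewritten.
    """
    touched = 0
    for mac, port in list(mac_table.items()):
        if port == failed_port:
            mac_table[mac] = peer_link_port
            touched += 1
    return touched


# 1000 hosts reachable through port 2, one host on an unrelated port.
mac_table = {f"MAC{i}": "port2" for i in range(1000)}
mac_table["MAC-X"] = "port7"
touched = rewrite_out_port(mac_table, failed_port="port2", peer_link_port="port1")
print(touched)
```

The work grows with the number of learned MAC addresses behind the failed port, which is the overhead the application sets out to avoid.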
  • the present application provides a method and device for MLAG link failover, which are used to reduce the time overhead of link switchover and improve the efficiency of link failover.
  • The present application provides a network device for MLAG link failover. The network device includes a chip, which is used to generate a forwarding database and forward received packets according to the forwarding database; the forwarding database includes a first forwarding database, a second forwarding database, and a third forwarding database.
  • The first forwarding database is used to store a mapping from at least one multi-chassis link aggregation group (MLAG) member interface to at least one active/standby switchover flag;
  • the second forwarding database is used to store a mapping from at least one physical port to the at least one MLAG member interface;
  • the third forwarding database is used to store a mapping from a media access control (MAC) address to a first MLAG member interface, where the first MLAG member interface is one of the at least one MLAG member interface.
  • The network device uses the MLAG member interface to decouple the MAC address from the physical port, so that when the network link changes (for example, when a link fails), the faulty link can be switched over without modifying the MAC address and physical port information, which improves switchover efficiency.
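The three tables can be pictured as plain key-value maps. The sketch below is illustrative only: the class name `ForwardingDatabases`, its field names, and the string flag values are assumptions, not structures defined by the application; the port and interface names follow the switch 1 example used later in the description.

```python
from dataclasses import dataclass, field

PRIMARY, BACKUP = "primary", "backup"  # assumed flag encodings


@dataclass
class ForwardingDatabases:
    """Hypothetical in-memory model of the three forwarding databases."""
    # First forwarding database: MLAG member interface -> switchover flag.
    switch_flag: dict = field(default_factory=dict)
    # Second forwarding database: physical port -> MLAG member interface.
    port_to_member: dict = field(default_factory=dict)
    # Third forwarding database: MAC address -> MLAG member interface.
    mac_to_member: dict = field(default_factory=dict)

    def add_member(self, member: str, port: str) -> None:
        """Register a member interface on a physical port, flagged primary."""
        self.switch_flag[member] = PRIMARY
        self.port_to_member[port] = member


fdb = ForwardingDatabases()
fdb.add_member("m-interface1", "port1")
fdb.add_member("m-interface2", "port2")
fdb.mac_to_member["MAC1"] = "m-interface1"
```

Because MAC entries refer to a member interface rather than to a port, a change in link state never needs to touch the third table.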
  • The chip is further configured to store, in the third forwarding database, a mapping from the source MAC address of the packet to a second MLAG member interface, where the second MLAG member interface is the MLAG member interface corresponding to a first physical port according to the second forwarding database, and the first physical port is the physical port that receives the packet.
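The learning step just described can be sketched as follows. This is a simplified illustration, not the chip's actual logic: the dictionary layouts and the helper name `learn_source` are assumptions, and packets arriving on ports that are not MLAG ports are simply ignored here.

```python
# Second forwarding database (assumed layout): physical port -> member interface.
port_to_member = {"port1": "m-interface1", "port2": "m-interface2"}
# Third forwarding database: MAC address -> member interface.
mac_to_member = {}


def learn_source(src_mac, ingress_port):
    """Record src_mac against the member interface of the ingress port.

    Returns the member interface, or None if the port is not an MLAG port.
    """
    member = port_to_member.get(ingress_port)
    if member is not None:
        mac_to_member[src_mac] = member
    return member


learned = learn_source("MAC1", "port1")   # packet from server 1 on port 1
ignored = learn_source("MAC9", "port9")   # port9 is not an MLAG port
```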
  • When an MLAG link fails, the chip updates the value of the active/standby switchover flag corresponding to a third MLAG member interface in the first forwarding database to the standby flag. The standby flag instructs the chip to send packets directed to the third MLAG member interface through the peer-link, where the third MLAG member interface is the MLAG member interface corresponding to a second physical port according to the second forwarding database, and the second physical port is the physical port of the MLAG link.
  • The network device completes the switchover of the faulty link merely by modifying the value of the active/standby switchover flag, which improves the efficiency of link switchover.
  • When the MLAG link recovers, the chip updates the value of the active/standby switchover flag corresponding to the third MLAG member interface to the primary flag, which instructs the chip to forward packets according to the second forwarding database. The network device completes the recovery of the faulty link merely by modifying the value of the active/standby switchover flag, which improves the efficiency of link recovery.
  • The chip is configured to: determine, according to the third forwarding database, that the MLAG member interface corresponding to the destination MAC address of the packet is the first MLAG member interface; and, when the value of the active/standby switchover flag corresponding to the first MLAG member interface in the first forwarding database is the primary flag, forward the packet through a third physical port, where the third physical port is the physical port corresponding to the first MLAG member interface in the second forwarding database.
  • When the value of the active/standby switchover flag corresponding to the first MLAG member interface is the standby flag, the chip forwards the packet through the peer-link.
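The egress decision can be sketched as a three-step lookup. The table contents, the constant `PEER_LINK_PORT`, and the function name `egress_port` are illustrative assumptions; a real chip would perform these lookups in hardware tables.

```python
PEER_LINK_PORT = "port5"  # assumed peer-link port of this switch

# First forwarding database: member interface -> switchover flag.
switch_flag = {"m-interface1": "primary", "m-interface2": "backup"}
# Second forwarding database, viewed as member interface -> physical port.
member_to_port = {"m-interface1": "port1", "m-interface2": "port2"}
# Third forwarding database: destination MAC -> member interface.
mac_to_member = {"MAC1": "m-interface1", "MAC2": "m-interface2"}


def egress_port(dst_mac):
    """Resolve the outgoing port for a destination MAC address."""
    member = mac_to_member[dst_mac]          # third forwarding database
    if switch_flag[member] == "primary":     # first forwarding database
        return member_to_port[member]        # second forwarding database
    return PEER_LINK_PORT                    # flag is backup: use the peer-link


print(egress_port("MAC1"), egress_port("MAC2"))
```

In this example MAC1 resolves to the MLAG port because its member interface is flagged primary, while MAC2 is diverted to the peer-link.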
  • the network device and another network device form an MLAG system
  • the network device is configured to synchronize the third forwarding database to the other network device.
  • the synchronization between the MLAG systems through the third forwarding database ensures that after a link fails, when a new external device accesses the MLAG system, the network device can obtain the information of the new access device.
  • the present application discloses a method for switching an MLAG link.
  • The method includes: when an MLAG link of a network device fails, updating the value of the active/standby switchover flag corresponding to a first MLAG member interface in the first forwarding database to a standby flag, where the standby flag instructs the network device to send packets directed to the first MLAG member interface through the peer-link, and the first MLAG member interface is the MLAG member interface corresponding to a first physical port according to the second forwarding database.
  • When the MLAG link recovers, the value of the active/standby switchover flag corresponding to the first MLAG member interface is updated to the primary flag, which instructs the network device to send packets according to the second forwarding database.
  • the network device further includes a third forwarding database
  • the method further includes: storing the mapping of the source MAC address of the packet to the second MLAG member interface in the third forwarding database
  • the second MLAG member interface is an MLAG member interface corresponding to a second physical port according to the second forwarding database
  • the second physical port is a physical port that receives the message.
  • The method further includes: determining, according to the third forwarding database, that the MLAG member interface corresponding to the destination MAC address of the packet is the first MLAG member interface, where the third forwarding database also includes the mapping from the destination MAC address to the first MLAG member interface; and, when the value of the active/standby switchover flag corresponding to the first MLAG member interface in the first forwarding database is the primary flag, forwarding the packet through the first physical port.
  • When the value of the active/standby switchover flag is the standby flag, the packet is forwarded through the peer-link.
  • the third forwarding database is synchronized to another network device, wherein the another network device and the network device form an MLAG system.
  • Fig. 1 is the schematic diagram of MLAG
  • FIG. 2 is a schematic diagram of a MLAG networking mode
  • FIG. 3 is a schematic diagram of another MLAG networking mode
  • FIG. 4 is a schematic diagram of another MLAG networking mode
  • FIG. 5 is a schematic diagram of an apparatus for MLAG link failover provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an L2FDB of a third forwarding database provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an L3FDB of a third forwarding database provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a first forwarding database provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a second forwarding database provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of generating a third forwarding database entry according to an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a MLAG link failover provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a newly added access device provided by an embodiment of the present application.
  • Multiple switches implement link aggregation between multiple devices through the MLAG mechanism. These devices form an active-active system, also known as the MLAG system.
  • A server or customer edge (CE) device accesses an ordinary Ethernet network, a Transparent Interconnection of Lots of Links (TRILL) network, a Virtual Extensible Local Area Network (VXLAN), or the Internet through the MLAG system.
  • Access through the MLAG system provides traffic load balancing on the one hand and backup protection on the other.
  • There are many networking modes for an MLAG system, such as server access, switch access, and multi-level MLAG.
  • Server access: as shown in Figure 2, the server accesses the network by connecting to switch 1 and switch 2 (switch 1 and switch 2 form an MLAG system).
  • Switch access: as shown in Figure 3, the customer edge device (switch 3) connects to switch 1 and switch 2 (switch 1 and switch 2 form an MLAG system), and the server accesses the network through switch 3.
  • Multi-level MLAG: as shown in Figure 4, switch 1 and switch 2 form one MLAG system (say mlag1), switch 3 and switch 4 form another MLAG system (say mlag2), and switch 3 and switch 4 are connected to the mlag1 system.
  • The MLAG link switching device disclosed in this application can be any switch in the MLAG systems shown above, such as switch 1 or switch 2 shown in FIG. 2 and FIG. 3, or switch 1, switch 2, switch 3, or switch 4 shown in FIG. 4.
  • this application takes the switch 1 in the server access scenario shown in FIG. 2 as an example to describe the MLAG link switching apparatus.
  • the MLAG link switching device is shown as a switch 500 in FIG. 5 , and the switch 500 is composed of a chip, a memory 507 and a port.
  • the chips include chip 502 .
  • Port 506 is used to forward packets.
  • the port 506 can be connected with an external device to form an MLAG link or a peer-link link.
  • For switch 1, for example, both port 1 and port 2 belong to the ports 506.
  • the chip 502 may be an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), a network processor (NP), or the like.
  • The memory 507 can be random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), or content addressable memory (CAM). It can also be non-volatile memory such as read-only memory (ROM) or a solid state disk (SSD), or a combination of multiple types of memory.
  • the memory 507 may be composed of a plurality of different types or the same type of memory, and the plurality of memories may be deployed in different modules of the switch 500 .
  • the memory 507 may be located in the chip 502 , but the memory 507 may also be located outside the chip 502 .
  • The memory 507 may store instructions 508 or store entries of the first forwarding database 503, the second forwarding database 504, and the third forwarding database 505.
  • The instructions 508 and the entries of the first forwarding database 503, the second forwarding database 504, and the third forwarding database 505 may be stored in one memory or stored separately in different memories, which is not limited in this application.
  • The chip 502 can also include a central processing unit (CPU) 501. The CPU 501 can be used for forwarding control and for maintaining software tables (such as a software routing table and a software ARP (address resolution protocol) table).
  • The chip 502 can look up the software routing table or the software ARP table through the CPU 501 to support routed forwarding.
  • Switch 500 generates the first forwarding database 503, the second forwarding database 504, and the third forwarding database 505 by the CPU 501 or the chip 502 calling the instructions 508.
  • the first forwarding database 503, as shown in FIG. 8, includes the mapping from the MLAG member interface (interface) to the active/standby switch-flag (switch-flag).
  • The MLAG member interface is a logical identifier of the physical port (also known as the MLAG port) that connects switch 1 to an external device, and it indicates the specific external device connected to switch 1. The external device can be a server or a switch, such as server 1 in Figure 2 or switch 3 in Figure 3. The link connected to the external device through the MLAG port (including the physical ports at both ends of the link) is called the MLAG link, and the MLAG link is also the primary link through which the external device communicates with switch 1.
  • For switch 1, "m-interface1" indicates that switch 1 is connected to server 1 through port 1, and "m-interface2" indicates that switch 1 is connected to server 2 through port 2.
  • The primary link between server 1 and switch 1 is the MLAG link connected through port 1, and the backup link is the peer-link between switch 1 and switch 2 through port 5.
  • The primary link between server 2 and switch 1 is the MLAG link connected through port 2, and the backup link is the peer-link between switch 1 and switch 2 through port 5.
  • For switch 2, "m-interface1" indicates that switch 2 is connected to server 1 through port 3, and "m-interface2" indicates that switch 2 is connected to server 2 through port 4.
  • The primary link for communication between server 1 and switch 2 is the MLAG link connected through port 3, and the backup link is the peer-link between switch 2 and switch 1 through port 6.
  • The primary link for communication between server 2 and switch 2 is the MLAG link connected through port 4, and the backup link is the peer-link between switch 2 and switch 1 through port 6.
  • The active/standby switchover flag indicates the current state of the MLAG link. As shown in Figure 8, entry 801 indicates that the MLAG link (port 1) between switch 1 and server 1 in Figure 2 is normal, and switch 1 forwards packets to server 1 through port 1.
  • When the MLAG link (port 1) fails, the value of the active/standby switchover flag of entry 801 is switched from "primary" to "backup", indicating that switch 1 needs to switch the outgoing port used to send packets to server 1 from port 1 to port 5.
  • Character strings such as "m-interface1~2" are used to identify MLAG member interfaces.
  • MLAG member interfaces can be marked with an identifier composed of arbitrary characters. Likewise, in addition to "primary/backup", the active/standby switchover flag may be marked with "active/standby", which is not limited in this application.
  • The second forwarding database 504 includes the mapping of MLAG member interfaces to physical ports connected to external devices. FIG. 9 shows the mapping between the MLAG member interfaces of switch 1 in FIG. 2 and the physical ports of switch 1 connected to server 1 and server 2. Entry 901 indicates that switch 1 is connected to server 1 through port 1, and entry 902 indicates that switch 1 is connected to server 2 through port 2.
  • the third forwarding database 505 includes a layer 2 forwarding database (layer 2 forwarding database, L2FDB), and the L2FDB is mainly used for message forwarding at the data link layer (layer 2).
  • the third forwarding database 505 further includes a Layer 3 forwarding database (layer 3 forwarding database, L3FDB), and the L3FDB is used for packet forwarding at the network layer (layer 3).
  • the L2FDB includes information such as MAC address and port mapping, and the port information can be a physical port or an MLAG member interface.
  • the L2FDB can also include a virtual local area network (VLAN) identifier (VID).
  • the L2FDB is shown in Figure 6.
  • L3FDB includes IP address, VID, MAC address and port information.
  • In an implementation where the L3FDB does not include port information, the switch 500 obtains the port information by searching the L2FDB according to the VLAN and MAC address.
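When the L3FDB carries no port information, the lookup becomes a two-stage resolution. The sketch below is a minimal illustration under assumed table layouts: the L3FDB maps an IP address to a (VID, MAC) pair, and the L2FDB is keyed by that pair. The function name `resolve_port` is hypothetical.

```python
# L2FDB (assumed layout): (VID, MAC) -> physical port or MLAG member interface.
l2fdb = {
    (100, "MAC1"): "m-interface1",
    (200, "MAC2"): "m-interface2",
}
# L3FDB variant without port information: IP address -> (VID, MAC).
l3fdb = {
    "1.1.1.2": (100, "MAC1"),
    "2.1.1.2": (200, "MAC2"),
}


def resolve_port(dst_ip):
    """Find the port information for dst_ip via the L3FDB, then the L2FDB."""
    key = l3fdb.get(dst_ip)
    if key is None:
        return None  # L3FDB miss; the description hands such packets to the CPU
    return l2fdb.get(key)


print(resolve_port("2.1.1.2"))
```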
  • the contents of the first forwarding database 503 and the second forwarding database 504 may be generated when the switch 500 receives the packet and forwards the packet, but is usually generated when the network administrator configures the MLAG.
  • For example, when switch 1 and switch 2 are configured to form an MLAG system and server 1 accesses the MLAG system, the related entries of the first forwarding database 503 shown in FIG. 8 and the second forwarding database 504 shown in FIG. 9 can be generated automatically by switch 1 (through link discovery, etc.) or configured manually.
  • The entries of the third forwarding database 505 may be generated by manual configuration or by dynamic learning (MAC learning). This application takes dynamic learning as an example and extends FIG. 2 to describe how entries of the third forwarding database 505 are created.
  • the configurations of switch 1, switch 2, server 1, server 2 and server 3 are as follows:
  • Device    IP       MAC    Gateway  VLAN
    switch 1  IP-S     MAC-S
    switch 2  IP-S     MAC-S
    server 1  1.1.1.2  MAC1   1.1.1.1  VLAN 100
    server 2  2.1.1.2  MAC2   2.1.1.1  VLAN 200
    server 3  1.1.1.3  MAC3   1.1.1.1  VLAN 100
  • Because switch 1 and switch 2 form an MLAG system, the same IP address and the same MAC address are set for switch 1 and switch 2 (in one implementation, the MAC addresses of the switches forming the MLAG system may differ).
  • The interface IP address of VLAN 100 (also known as the Layer 3 interface IP) is set to 1.1.1.1, and the interface IP address of VLAN 200 is set to 2.1.1.1.
  • the MAC address (MAC-S) of the switch 1 may be the physical address of the switch 1 or a virtual address, and the MAC address is used for the exchange of Layer 3 packets.
  • Both server 1 and server 3 belong to the same VLAN (VLAN 100), and the default gateway is 1.1.1.1.
  • Server 2 belongs to another VLAN (VLAN 200), and its default gateway is 2.1.1.1.
  • When server 1 (a physical host or virtual machine) communicates with server 2, assume that a packet is sent from server 1 to server 2. The process is as follows (assuming that the L2FDB of the third forwarding database of switch 1 contains the VLAN ID and the L3FDB contains port information):
  • Server 1 determines that the destination IP address 2.1.1.2 (server 2) does not belong to the same VLAN as itself, so it sends an ARP request for the MAC address corresponding to gateway 1.1.1.1.
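A host typically makes this decision by comparing subnets. The sketch below illustrates that check with Python's standard `ipaddress` module; the /24 prefix length and the function name `arp_target` are assumptions, since the application does not state a subnet mask.

```python
import ipaddress


def arp_target(src_ip, dst_ip, gateway_ip, prefix_len=24):
    """Return the IP address a host should send its ARP request for.

    prefix_len is an assumed subnet size; the source gives no mask.
    """
    network = ipaddress.ip_network(f"{src_ip}/{prefix_len}", strict=False)
    if ipaddress.ip_address(dst_ip) in network:
        return dst_ip       # same subnet: resolve the destination directly
    return gateway_ip       # different subnet: resolve the default gateway


# Server 1 (1.1.1.2) sending to server 2 and to server 3.
print(arp_target("1.1.1.2", "2.1.1.2", "1.1.1.1"))
print(arp_target("1.1.1.2", "1.1.1.3", "1.1.1.1"))
```

Under the assumed /24 mask, traffic to server 2 resolves the gateway's MAC address, while traffic to server 3 resolves server 3's MAC address directly.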
  • After the chip of switch 1 receives the ARP request from server 1, it finds that the requested IP address is its own Layer 3 interface IP address, so it sends an ARP reply that includes its own MAC address (MAC-S). In addition, since switch 1 receives the ARP request through port 1, switch 1 can search the second forwarding database by the identifier of port 1 (port1), such as through entry 801 shown in FIG.
  • the source MAC address and source IP address of the ARP request are the addresses of server 1.
  • the source MAC address can also be the MAC address of switch 3.
  • After receiving the ARP reply from switch 1, server 1 assembles a packet (packet A) and sends it to switch 1.
  • After the chip of switch 1 receives packet A, it looks up the L2FDB according to the destination MAC address + VID of packet A and finds an entry that matches the MAC address of its own Layer 3 interface (entry 601 shown in Figure 6; this entry is added automatically when VLAN 100 is configured on switch 1, and its Layer 3 forwarding flag, which is not marked in Figure 6, is set to indicate that Layer 3 forwarding is required when a packet's destination address matches the entry), so it continues by searching the L3FDB of the third forwarding database.
  • the chip of switch 1 searches the L3FDB according to the destination address (2.1.1.2) of the packet. Since no entry has been created before, the search fails, so the packet is sent to the CPU of switch 1 for software processing.
  • The CPU of switch 1 searches its software routing table according to the destination IP (2.1.1.2) of the packet and finds that it matches the interface IP address of VLAN 200, so it continues by searching its software ARP table, but that search also fails. Switch 1 then sends an ARP request for the MAC address corresponding to address 2.1.1.2 to all ports in VLAN 200.
  • After switch 1 receives the ARP reply from server 2, it searches the second forwarding database according to the receiving port (port 2) of the reply and finds the corresponding MLAG member interface identifier according to entry 903 as shown in Figure 9.
  • the switch 1 After confirming the destination MAC address (MAC2) corresponding to the destination IP address 2.1.1.2, the switch 1 sends the packet B to the server 2.
  • The difference between packet B and packet A is that the destination MAC address of packet B is MAC2 and its source MAC address is MAC-S.
  • the server 2 After receiving the message B, the server 2 sends a response message to the server 1.
  • The forwarding process of the response packet is similar to the previous steps, except that, because the L3FDB of switch 1 already contains the entry for server 1, the response packet does not need to be processed by the CPU of switch 1 again. Instead, the chip of switch 1 obtains the MLAG member interface identifier "m-interface1" from the L3FDB (entry 701 in Figure 7), looks it up in the first forwarding database, and confirms from entry 801 shown in Figure 8 that the current MLAG link is in the "primary" (normal) state, so switch 1 needs to forward the packet through the MLAG link. The chip of switch 1 then searches the second forwarding database and sends the packet to server 1 through port 1 according to entry 901 shown in FIG. 9.
  • Switch 1 has now completed learning the entries of the third forwarding database. For subsequent packets between server 1 and server 2, the chip of switch 1 can directly search the first forwarding database, the second forwarding database, and the third forwarding database and forward in hardware, instead of routing and forwarding through the CPU of the switch, which improves the efficiency of packet forwarding.
  • Server 1 and server 3 belong to the same VLAN (VLAN 100). Since there is no MAC address information of server 3 in the ARP table of initial server 1, server 1 broadcasts an ARP request requesting the MAC address of server 3. The destination IP address of the ARP request is 1.1.1.3.
  • After the chip of switch 1 receives the ARP request from server 1, it finds the corresponding MLAG member interface identifier "m-interface1" in the second forwarding database according to the ingress port (port 1) of the ARP request.
  • Switch 1 identifies the destination MAC address of the packet as a broadcast address, and then broadcasts and sends the packet in VLAN 100.
  • After receiving the broadcast message, server 3 adds the information of server 1 (source MAC address, source IP address) to its own ARP table. Because the destination IP address of the broadcast message is server 3's own IP address, server 3 sends an ARP reply to server 1 that includes its own MAC address (MAC3);
  • the source MAC address of the ARP reply message (here, the MAC address of server 3, MAC3)
  • the VLAN ID corresponding to port 7 (VLAN 100)
  • the server 1 After the server 1 receives the ARP response, it adds the MAC address of the server 3 to its own ARP table, and then the server 1 can send the message C with the destination MAC address of MAC3 to the server 3.
  • After the chip of switch 1 receives message C, it finds the port identifier "port7" of the matching entry (entry 602 shown in Figure 6) in the L2FDB according to the destination MAC address and VID in message C. "port7" indicates a physical port, so switch 1 does not need to search the first forwarding database and the second forwarding database, and directly sends message C to server 3 through port 7.
  • After switch 1 receives the response to message C, the process is similar to step 4: by looking up the third forwarding database (L2FDB), the first forwarding database, and the second forwarding database, switch 1 finds the physical outgoing port, port 1, and sends the response to server 1 through port 1.
  • Switch 1 has now completed learning the entries of the third forwarding database. For subsequent packet forwarding between server 1 and server 3, switch 1 only needs to search the first forwarding database, the second forwarding database, and the third forwarding database through the chip to forward in hardware.
  • server 1 is connected to the MLAG system composed of switch 1 and switch 2.
  • The chip or CPU of switch 1 detects that the MLAG link (port 1) is faulty.
  • The chip of switch 1 determines from the second forwarding database, for example entry 901 shown in FIG. 9, that the MLAG member interface corresponding to the faulty MLAG link is "m-interface1". The chip then looks up the corresponding active/standby switchover flag in the first forwarding database according to "m-interface1", for example entry 1101, and changes the flag from "primary" to "backup" (entry 1102) to instruct switch 1 to communicate with server 1 through the peer-link (port 5).
  • When the MLAG link recovers, the chip searches the second forwarding database and the first forwarding database and changes the active/standby switchover flag corresponding to the MLAG link from "backup" to "primary" to instruct switch 1 to communicate with server 1 through the MLAG link (port 1).
  • the switch 1 only needs to modify the corresponding active/standby switchover flag in the first forwarding database according to the event, so as to realize the link switchover.
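The event handling described above reduces to a single write per link event. The following sketch assumes a hypothetical callback `on_link_event` and dictionary-based tables; it illustrates the idea and is not the device's firmware.

```python
# First forwarding database: member interface -> switchover flag.
switch_flag = {"m-interface1": "primary", "m-interface2": "primary"}
# Second forwarding database: physical port -> member interface.
port_to_member = {"port1": "m-interface1", "port2": "m-interface2"}
# Third forwarding database: many MAC addresses behind m-interface1.
mac_to_member = {f"MAC{i}": "m-interface1" for i in range(1000)}


def on_link_event(port, link_up):
    """Flip the switchover flag of the member interface bound to port."""
    member = port_to_member[port]
    switch_flag[member] = "primary" if link_up else "backup"
    return member


before = dict(mac_to_member)
on_link_event("port1", link_up=False)       # MLAG link on port 1 fails
flag_after_failure = switch_flag["m-interface1"]
on_link_event("port1", link_up=True)        # link recovers
flag_after_recovery = switch_flag["m-interface1"]
mac_table_unchanged = mac_to_member == before
```

However many MAC addresses sit behind the member interface, the third table is never modified; only one flag changes on failure and again on recovery.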
  • switch 3 is connected to the MLAG system composed of switch 1 and switch 2, and server 1 communicates with server 2 connected to switch 1 through switch 3.
  • server 1 communicates with server 2 via switch 3 , link 1201 , switch 1 and link 1204 .
  • When link 1201 fails, switch 1 switches from the MLAG link (link 1201) to the peer-link (link 1203). Since the third forwarding database of switch 1 already has the information of server 1 and server 2, communication between server 1 and server 2 can continue. At this time, a new server 4 is connected to switch 3.
  • For switch 2, the MLAG link (link 1202) to switch 3 is normal, so the MAC information of server 4 can be learned normally and filled into the third forwarding database of switch 2.
  • For switch 1, link 1201 has failed, so the MAC information of server 4 cannot be filled into the third forwarding database of switch 1 (the MLAG member interface identifier cannot be determined).
  • switch 2 synchronizes the information of the third forwarding database (including information of server 4) saved by itself to switch 1 through a peer-link link.
  • the switch 1 synchronizes the information of the server 4 to its own third forwarding database according to the information of the third forwarding database on the switch 2 .
  • entries conflict between the third forwarding databases on switch 1 and switch 2; for example, an entry with the MAC address MAC10 exists on both switch 1 and switch 2
  • the last update time of the entry on switch 1 is t1
  • the last update time of the entry on switch 2 is t2 (t1 < t2, indicating that the entry on switch 2 was updated later than the entry on switch 1)
  • switch 1 and switch 2 have the same MLAG member interface identifier corresponding to switch 3 (assume the identifier is "m-interface4"), so after link 1201 fails, server 4 can, like server 1, communicate with server 2 through the peer-link link. After link 1201 returns to normal, the chip of switch 1 modifies the active/standby switchover flag corresponding to "m-interface4" in the first forwarding database, and server 4 can communicate with server 2 through link 1201.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

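This application discloses an MLAG link failure switching apparatus and method. The apparatus includes a chip, which is configured to generate a forwarding database and to forward packets according to the forwarding database. The forwarding database includes a first forwarding database, a second forwarding database, and a third forwarding database. The apparatus switches over a faulty link by modifying the value of the active/standby switchover flag that corresponds to the faulty MLAG link in the first forwarding database, which reduces link switchover time and improves link switchover efficiency.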

Description

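MLAG link failure switching method and apparatus
This application claims priority to Chinese Patent Application No. 202011086369.0, filed on October 12, 2020 and entitled "MLAG link failure switching method and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to communication technologies, and in particular, to an MLAG link failure switching method and apparatus.
Background
In a multi-chassis link aggregation group (MLAG), two or more switches aggregate links across devices to form an active-active system, which improves link reliability.
Referring to FIG. 1, switch 1 is connected through port 1 to port 3 of switch 2 (the link between port 1 and port 3 is called the peer link, or peer-link link), forming an MLAG system. A server accesses the MLAG system; that is, the server is connected to both switch 1 and switch 2 through two independent physical ports. The server can be configured in active-active mode to send packets to both switches at the same time, or in active-standby mode to send packets to only one switch (and to the other switch when a fault occurs). Under normal conditions, when switch 1 receives a packet from the network side, it looks up its local forwarding database (FDB), such as a MAC address table, according to the destination address of the packet, finds the physical egress port connected to the server, such as port 2, and forwards the traffic to the server through port 2.
When the link between the server and switch 1 (port 2) fails, one approach is for switch 1 to update the forwarding database, changing the egress port value of every entry that corresponds to port 2 to port 1 (the peer-link link). Another approach is for switch 1 to delete all entries corresponding to port 2, or all entries in the forwarding database, and relearn MAC addresses so that it learns the mapping between the server's MAC address and port 1, completing the MLAG link switchover.
Both approaches have drawbacks. In the first, refreshing the port values in the forwarding database has to be handled by the CPU of switch 1. While the CPU is processing, the forwarding database must be frozen and cannot be used for packet forwarding, so traffic through switch 1 is interrupted for a period of time. In the second, switch 1 broadcasts packets for a period of time during relearning, and relearning itself also takes time. Either way, the MLAG link switchover on switch 1 takes a relatively long time.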
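To make the cost of the first conventional approach concrete, the following minimal Python sketch rewrites a flat MAC address table after a port failure. The table layout, the function name `rewrite_failed_port`, and the 1000-entry workload are illustrative assumptions and are not taken from this application.

```python
def rewrite_failed_port(mac_table, failed_port, peer_link_port):
    """Point every entry that uses failed_port at peer_link_port.

    Returns the number of entries that had to be rewritten.
    """
    rewritten = 0
    for mac in list(mac_table):
        if mac_table[mac] == failed_port:
            mac_table[mac] = peer_link_port
            rewritten += 1
    return rewritten


# Hypothetical workload: 1000 MAC addresses reachable through port 2,
# plus one unrelated address behind port 7.
mac_table = {f"MAC{i}": "port2" for i in range(1000)}
mac_table["MAC-other"] = "port7"

rewritten = rewrite_failed_port(mac_table, "port2", "port1")
```

In this sketch the number of rewritten entries grows with the number of MAC addresses behind the failed port, which is the per-entry overhead described above.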
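Summary
This application provides an MLAG link failure switching method and apparatus, to reduce the time overhead of link switchover and improve the efficiency of link failure switchover.
According to a first aspect, this application provides a network device for MLAG link failure switching. The network device includes a chip, which is configured to generate a forwarding database and to forward received packets according to the forwarding database. The forwarding database includes a first forwarding database, a second forwarding database, and a third forwarding database. Dividing the forwarding database into three parts reduces the number of entries that must be modified when a link fails, which lowers the time overhead and improves link switchover efficiency.
In an optional implementation, the first forwarding database stores a mapping from at least one multi-chassis link aggregation group (MLAG) member interface to at least one active/standby switchover flag; the second forwarding database stores a mapping from at least one first physical port to the at least one MLAG member interface; and the third forwarding database stores a mapping from a media access control (MAC) address to a first MLAG member interface, where the first MLAG member interface is one of the at least one MLAG member interface. By using MLAG member interfaces, the network device decouples MAC addresses from physical ports. When a network link changes (for example, a link fails), the network device can switch over the faulty link without modifying the MAC address and physical port information, which improves switchover efficiency.
In an optional implementation, the chip is further configured to store, in the third forwarding database, a mapping from the source MAC address of the packet to a second MLAG member interface, where the second MLAG member interface is the MLAG member interface that corresponds to a first physical port according to the second forwarding database, and the first physical port is the physical port on which the packet is received.
In an optional implementation, when an MLAG link fails, the chip updates the value of the active/standby switchover flag corresponding to a third MLAG member interface in the first forwarding database to a backup flag. The backup flag instructs the chip to send packets destined for the third MLAG member interface over the peer link. The third MLAG member interface is the MLAG member interface that corresponds to a second physical port according to the second forwarding database, and the second physical port is the physical port of the MLAG link. The network device completes the switchover of the faulty link merely by modifying the value of the active/standby switchover flag, which improves link switchover efficiency.
In an optional implementation, when the MLAG link recovers, the chip updates the value of the active/standby switchover flag corresponding to the third MLAG member interface to a primary flag. The primary flag instructs the chip to forward the packet according to the second forwarding database. The network device completes the recovery of the faulty link merely by modifying the value of the active/standby switchover flag, which improves link recovery efficiency.
In an optional implementation, the chip is configured to: determine, according to the third forwarding database, that the MLAG member interface corresponding to the destination MAC address of the packet is the first MLAG member interface; and when the value of the active/standby switchover flag corresponding to the first MLAG member interface in the first forwarding database is the primary flag, forward the packet through a third physical port, where the third physical port is the physical port corresponding to the first MLAG member interface in the second forwarding database.
In an optional implementation, when the value of the active/standby switchover flag corresponding to the first MLAG member interface is the backup flag, the chip forwards the packet over the peer link.
In an optional implementation, the network device and another network device form an MLAG system, and the network device is configured to synchronize the third forwarding database to the other network device. Synchronizing the third forwarding database within the MLAG system ensures that, after a link fails, the network device can still obtain the information of a new external device that accesses the MLAG system.
According to a second aspect, this application discloses an MLAG link switching method. The method includes: when an MLAG link of a network device fails, updating the value of the active/standby switchover flag corresponding to a first MLAG member interface in a first forwarding database to a backup flag, where the backup flag instructs the network device to send packets destined for the first MLAG member interface over a peer link, the first MLAG member interface is the MLAG member interface that corresponds to a first physical port according to a second forwarding database, and the first physical port is the physical port of the MLAG link. The network device includes the first forwarding database and the second forwarding database; the first forwarding database stores a mapping from at least one MLAG member interface to at least one active/standby switchover flag, and the second forwarding database stores a mapping from at least one physical port to the at least one MLAG member interface.
In an optional implementation, when the MLAG link recovers, the value of the active/standby switchover flag corresponding to the first MLAG member interface is updated to a primary flag, where the primary flag instructs the network device to send the packet according to the second forwarding database.
In an optional implementation, the network device further includes a third forwarding database, and the method further includes: storing, in the third forwarding database, a mapping from the source MAC address of the packet to a second MLAG member interface, where the second MLAG member interface is the MLAG member interface that corresponds to a second physical port according to the second forwarding database, and the second physical port is the physical port on which the packet is received.
In an optional implementation, the method further includes: determining, according to the third forwarding database, that the MLAG member interface corresponding to the destination MAC address of the packet is the first MLAG member interface, where the third forwarding database further includes a mapping from the destination MAC address to the first MLAG member interface; and when the value of the active/standby switchover flag corresponding to the first MLAG member interface in the first forwarding database is the primary flag, forwarding the packet through the first physical port.
In an optional implementation, when the value of the active/standby switchover flag corresponding to the first MLAG member interface is the backup flag, the packet is forwarded over the peer link.
In an optional implementation, the third forwarding database is synchronized to another network device, where the other network device and the network device form an MLAG system.
For the beneficial effects of the second aspect, refer to the first aspect and its implementations.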
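Brief Description of Drawings
FIG. 1 is a schematic diagram of MLAG;
FIG. 2 is a schematic diagram of one MLAG networking mode;
FIG. 3 is a schematic diagram of another MLAG networking mode;
FIG. 4 is a schematic diagram of another MLAG networking mode;
FIG. 5 is a schematic diagram of an MLAG link failure switching apparatus according to an embodiment of this application;
FIG. 6 is a schematic diagram of the L2FDB of the third forwarding database according to an embodiment of this application;
FIG. 7 is a schematic diagram of the L3FDB of the third forwarding database according to an embodiment of this application;
FIG. 8 is a schematic diagram of the first forwarding database according to an embodiment of this application;
FIG. 9 is a schematic diagram of the second forwarding database according to an embodiment of this application;
FIG. 10 is a schematic diagram of generating entries of the third forwarding database according to an embodiment of this application;
FIG. 11 is a schematic diagram of MLAG link failure switching according to an embodiment of this application;
FIG. 12 is a schematic diagram of adding a new access device according to an embodiment of this application.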
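Detailed Description
To help a person skilled in the art better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly below with reference to the accompanying drawings.
Multiple switches use the MLAG mechanism to aggregate links across devices. These devices form an active-active system, also called an MLAG system. A server or a customer edge (CE) device accesses a network such as an ordinary Ethernet network, a transparent interconnection of lots of links (TRILL) network, a Virtual Extensible Local Area Network (VXLAN), or the Internet through this MLAG system. Access through an MLAG system provides traffic load sharing on the one hand and backup protection on the other.
An MLAG system can be networked in several ways, such as server access, switch access, and multi-level MLAG. Server access is shown in FIG. 2: the server accesses the network by connecting to switch 1 and switch 2 (which form an MLAG system). Switch access is shown in FIG. 3: a customer edge device (switch 3) accesses the network by connecting to switch 1 and switch 2 (which form an MLAG system), and the server accesses the network through switch 3. Multi-level MLAG is shown in FIG. 4: switch 1 and switch 2 form one MLAG system (assume mlag1), switch 3 and switch 4 form another MLAG system (assume mlag2), and switch 3 and switch 4 in turn access the mlag1 system. Other networking modes are also possible, and this application imposes no limitation. The MLAG link switching apparatus disclosed in this application may be a switch in any of the MLAG systems above, such as switch 1 or switch 2 in FIG. 2 and FIG. 3, or switch 1, switch 2, switch 3, or switch 4 in FIG. 4. For ease of description, this application uses switch 1 in the server access scenario of FIG. 2 as an example of the MLAG link switching apparatus.
The MLAG link switching apparatus is shown as switch 500 in FIG. 5. Switch 500 consists of a chip (chip 502), a memory 507, and ports 506. The ports 506 are used to forward packets. A port 506 can also be connected to an external device to form an MLAG link or a peer-link link; as shown in FIG. 1, for switch 1, port 1 and port 2 are both ports 506. Chip 502 may be an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), a network processor (NP), or the like. Memory 507 may be a volatile memory such as a random access memory (RAM), a dynamic random access memory (DRAM), a static random access memory (SRAM), or a content addressable memory (CAM); a non-volatile memory such as a read-only memory (ROM) or a solid state disk (SSD); or a combination of several types of memory. In an actual deployment, memory 507 may consist of multiple memories of different or identical types, and these memories may be deployed in different modules of switch 500. For example, as shown in FIG. 5, memory 507 may be located inside chip 502, but it may also be located outside chip 502. Memory 507 may store instructions 508 or the entries of the first forwarding database 503, the second forwarding database 504, and the third forwarding database 505. The instructions 508 and the entries of the first forwarding database 503, the second forwarding database 504, and the third forwarding database 505 may be stored in one memory or separately in different memories; this application imposes no limitation.
Chip 502 may also include a central processing unit (CPU) 501. CPU 501 can be used for forwarding control and for maintaining software tables (for example, a software routing table and a software address resolution protocol (ARP) table). In addition, chip 502 can support routed forwarding by having CPU 501 look up the software routing table or the software ARP table. Switch 500 generates the first forwarding database 503, the second forwarding database 504, and the third forwarding database 505 by having CPU 501 or chip 502 invoke the instructions 508.
The first forwarding database 503 is shown in FIG. 8 and contains mappings from MLAG member interfaces (interface) to active/standby switchover flags (switch-flag). An MLAG member interface is the logical identifier of a physical port (also called an MLAG port) through which switch 1 connects to an external device, and it indicates the specific external device connected to switch 1. The link through which an MLAG port connects to an external device (the external device may be a server or a switch, such as server 1 in FIG. 2 or switch 3 in FIG. 3), including the physical ports at both ends, is called an MLAG link. The MLAG link is also the primary link over which the external device communicates with switch 1.
As shown in FIG. 2, for switch 1, "m-interface1" indicates that switch 1 is connected to server 1 through port 1, and "m-interface2" indicates that switch 1 is connected to server 2 through port 2. The primary link between server 1 and switch 1 is the MLAG link connected through port 1, and the backup link is the peer-link link through which switch 1 connects to switch 2 via port 5. The primary link between server 2 and switch 1 is the MLAG link connected through port 2, and the backup link is the peer-link link to switch 2 via port 5. For switch 2, "m-interface1" indicates that switch 2 is connected to server 1 through port 3, and "m-interface2" indicates that switch 2 is connected to server 2 through port 4. The primary link between server 1 and switch 2 is the MLAG link connected through port 3, and the backup link is the peer-link link through which switch 2 connects to switch 1 via port 6. The primary link between server 2 and switch 2 is the MLAG link connected through port 4, and the backup link is the peer-link link through which switch 2 connects to switch 1 via port 6.
The active/standby switchover flag indicates the current state of the MLAG link. As shown in FIG. 8, entry 801 indicates that the MLAG link (port 1) between switch 1 and server 1 in FIG. 2 is normal, and switch 1 forwards packets to server 1 through port 1. When this MLAG link (port 1) fails, the value of the active/standby switchover flag in entry 801 changes from "primary" to "backup", indicating that switch 1 needs to switch the egress port used to send packets to server 1 from port 1 to port 5. This application uses character strings such as "m-interface1" and "m-interface2" to identify MLAG member interfaces; in an actual deployment, an MLAG member interface can be identified by any string of characters. Likewise, the active/standby switchover flag may be marked with identifiers other than "primary/backup", such as "active/standby"; this application imposes no limitation.
The second forwarding database 504 contains mappings from MLAG member interfaces to the physical ports connected to external devices. FIG. 9 shows the mappings between the MLAG member interfaces (interface) of switch 1 in FIG. 2 and the physical ports (port) through which switch 1 connects to server 1 and server 2. Entry 901 indicates that switch 1 is connected to server 1 through port 1, and entry 902 indicates that switch 1 is connected to server 2 through port 2.
The third forwarding database 505 includes a layer 2 forwarding database (L2FDB), which is mainly used for packet forwarding at the data link layer (layer 2). When switch 500 has a layer-3 switching function, the third forwarding database 505 also includes a layer 3 forwarding database (L3FDB), which is used for packet forwarding at the network layer (layer 3). The L2FDB contains information such as MAC address to port mappings, where the port information may be a physical port or an MLAG member interface. In addition to the MAC address and port information, the L2FDB may also contain a virtual local area network (VLAN) identifier (VID); the L2FDB is shown in FIG. 6. The L3FDB is shown in FIG. 7 and contains the IP address, VID, MAC address, and port information. In one implementation, the L3FDB does not contain port information; during packet forwarding, switch 500 obtains the port information by looking up the L2FDB according to the VLAN and MAC address.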
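The three tables described above can be pictured as three small lookup structures. The following Python sketch is only an illustration: the class name `ForwardingDatabase` and its field names are assumptions, while the sample entries follow the values the description gives for switch 1 (FIG. 6, FIG. 8, and FIG. 9).

```python
from dataclasses import dataclass, field


@dataclass
class ForwardingDatabase:
    # First forwarding database: MLAG member interface -> switchover flag.
    switch_flag: dict = field(default_factory=dict)
    # Second forwarding database: MLAG member interface -> physical port.
    member_port: dict = field(default_factory=dict)
    # Third forwarding database (L2FDB part): (VID, MAC) -> MLAG member
    # interface, or a physical port for devices outside the MLAG.
    l2fdb: dict = field(default_factory=dict)


# Sample state of switch 1 after learning, as described for FIG. 6, 8 and 9.
fdb = ForwardingDatabase(
    switch_flag={"m-interface1": "primary", "m-interface2": "primary"},
    member_port={"m-interface1": "port1", "m-interface2": "port2"},
    l2fdb={
        (100, "MAC1"): "m-interface1",
        (200, "MAC2"): "m-interface2",
        (100, "MAC3"): "port7",
    },
)
```

The point of the split is that the per-MAC table refers only to a logical interface name, so a link state change touches the small first table rather than the large third one.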
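The contents of the first forwarding database 503 and the second forwarding database 504 may be generated when switch 500 receives and forwards packets, but they are usually generated when the network administrator configures MLAG. As shown in FIG. 2, when switch 1 and switch 2 are configured to form an MLAG system and server 1 accesses the MLAG system, switch 1 can generate the entries of the first forwarding database 503 shown in FIG. 8 and of the second forwarding database 504 shown in FIG. 9 either automatically (for example, through link discovery) or through manual configuration. The entries of the third forwarding database 505 can be generated through manual configuration or dynamic learning (MAC learning). This application takes dynamic learning as an example and extends FIG. 2 to describe how the entries of the third forwarding database 505 are created. As shown in FIG. 10, switch 1, switch 2, server 1, server 2, and server 3 are configured as follows:
Device     IP        MAC     Gateway   VLAN
Switch 1   IP-S      MAC-S   -         -
Switch 2   IP-S      MAC-S   -         -
Server 1   1.1.1.2   MAC1    1.1.1.1   VLAN 100
Server 2   2.1.1.2   MAC2    2.1.1.1   VLAN 200
Server 3   1.1.1.3   MAC3    1.1.1.1   VLAN 100
Because switch 1 and switch 2 form an MLAG system, they are configured with the same IP address and the same MAC address (in one implementation, the MAC addresses of the switches forming the MLAG system may differ). Because switch 1 and switch 2 are both connected to VLAN 100 and VLAN 200, the interface IP address of VLAN 100 (also called the layer-3 interface IP) is set to 1.1.1.1, and the interface IP address of VLAN 200 is set to 2.1.1.1. The MAC address of switch 1 (MAC-S) may be the physical address of switch 1 or a virtual address; this MAC address is used for layer-3 packet switching. Server 1 and server 3 belong to the same VLAN (VLAN 100), with default gateway 1.1.1.1. Server 2 belongs to another VLAN (VLAN 200), with default gateway 2.1.1.1.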
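When server 1 (a physical host or a virtual machine) communicates with server 2, assume that server 1 sends a packet to server 2. The process is as follows (assuming that the L2FDB of the third forwarding database of switch 1 contains the VLAN ID and that the L3FDB contains port information):
1. Server 1 determines that the destination IP address 2.1.1.2 (server 2) does not belong to its own VLAN, so it sends an ARP request for the MAC address corresponding to the gateway 1.1.1.1.
2. After receiving the ARP request from server 1, the chip of switch 1 finds that the requested IP address is its own layer-3 interface IP address, so it sends an ARP reply containing its own MAC address (MAC-S). In addition, because switch 1 received the ARP request through port 1, it uses the identifier of port 1 (port1) to look up the second forwarding database and, for example through entry 901 shown in FIG. 9, finds the corresponding MLAG member interface identifier "m-interface1". Switch 1 saves the correspondence among the source IP address and source MAC address in the ARP request, the VLAN identifier of port 1, and the MLAG member interface (1.1.1.2<=>MAC1<=>100<=>m-interface1) to the L3FDB of the third forwarding database, as entry 701 shown in FIG. 7. Switch 1 also saves the correspondence among the source MAC address in the ARP request, the VLAN identifier, and the MLAG member interface (MAC1<=>100<=>m-interface1) to the L2FDB, as entry 603 shown in FIG. 6. In this example, the source MAC address and source IP address of the ARP request are those of server 1; when the MLAG networking mode is switch access as shown in FIG. 3, the source MAC address may instead be the MAC address of switch 3.
3. After receiving the ARP reply from switch 1, server 1 assembles a packet (packet A) and sends it to switch 1. In the packet, destination MAC = MAC-S, source MAC = MAC1, source IP = 1.1.1.2, and destination IP = 2.1.1.2.
4. After receiving packet A, the chip of switch 1 looks up the L2FDB according to the destination MAC address and VID of packet A and finds that the packet matches the entry for its own layer-3 interface MAC address (entry 601 shown in FIG. 6; this entry is added automatically when VLAN 100 is configured on switch 1, and its layer-3 forwarding flag is set; the flag is not shown in FIG. 6 and indicates that layer-3 forwarding is required when the destination address of a packet matches this entry). The chip therefore continues by looking up the L3FDB of the third forwarding database.
5. The chip of switch 1 looks up the L3FDB according to the destination address of the packet (2.1.1.2). Because no entry has been created for it yet, the lookup fails, and the packet is sent to the CPU of switch 1 for software processing.
6. The CPU of switch 1 looks up its software routing table according to the destination IP of the packet (2.1.1.2) and finds that it matches the interface IP address of VLAN 200. It then looks up its software ARP table, and this lookup also fails. Switch 1 then sends an ARP request for the MAC address corresponding to 2.1.1.2 to all ports in VLAN 200.
7. After receiving the ARP request from switch 1, server 2 finds that the requested IP address is its own, so it sends an ARP reply containing its own MAC address (MAC2). It also records the correspondence between the IP address and MAC address of switch 1 (2.1.1.1<=>MAC-S) in its own ARP table.
8. After receiving the ARP reply from server 2, switch 1 looks up the second forwarding database according to the receiving port of the packet (port 2) and, according to entry 902 shown in FIG. 9, finds the corresponding MLAG member interface identifier "m-interface2". Switch 1 records the correspondence among the source IP address and source MAC address in the ARP reply, the VLAN identifier, and the MLAG member interface (2.1.1.2<=>MAC2<=>200<=>m-interface2) in the L3FDB of the third forwarding database, as entry 702 shown in FIG. 7. It also saves the correspondence among the source MAC address in the ARP reply, the VLAN identifier, and the MLAG member interface (MAC2<=>200<=>m-interface2) to the L2FDB, as entry 604 shown in FIG. 6. After confirming the destination MAC address (MAC2) corresponding to the destination IP address 2.1.1.2, switch 1 sends packet B to server 2. Packet B differs from packet A in that the destination MAC address of packet B is MAC2 and its source MAC address is MAC-S.
9. After receiving packet B, server 2 sends a response packet to server 1. The forwarding process for the response packet is similar to the preceding steps. However, because the L3FDB of switch 1 already contains the entry for server 1, the response packet does not need to be processed again by the CPU of switch 1. Instead, the chip of switch 1 uses the L3FDB information (entry 701 in FIG. 7) to obtain the MLAG member interface identifier "m-interface1" and confirms from entry 801 shown in FIG. 8 in the first forwarding database that the MLAG link is currently in the "primary" (normal) state, so switch 1 needs to forward the packet over the MLAG link. The chip of switch 1 then looks up the second forwarding database and, according to entry 901 shown in FIG. 9, sends the packet to server 1 through port 1.
Through the preceding steps, switch 1 completes learning of the entries that belong to the third forwarding database. Subsequent packets between server 1 and server 2 can be forwarded in hardware by the chip of switch 1, which looks up the first, second, and third forwarding databases directly. Routed forwarding through the CPU of the switch is no longer needed, which improves packet forwarding efficiency.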
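The hardware lookup chain that the steps above end with (third database, then first, then second) can be sketched as follows. This is a simplified illustration that models only the layer-2 part of the lookup; the function name `egress_port`, the table literals, and the use of `None` for an unknown destination are assumptions made for the example, and the port numbers follow FIG. 2.

```python
PEER_LINK_PORT = "port5"  # peer-link port of switch 1 in FIG. 2


def egress_port(l2fdb, switch_flag, member_port, vid, dst_mac):
    """Resolve the physical egress port for a (VID, destination MAC) pair."""
    target = l2fdb.get((vid, dst_mac))
    if target is None:
        return None  # unknown destination: flooding / CPU handling not modelled
    if target not in switch_flag:
        return target  # entry already names a physical port (non-MLAG device)
    if switch_flag[target] == "primary":
        return member_port[target]  # MLAG link is up: use its physical port
    return PEER_LINK_PORT  # MLAG link is down: detour over the peer link


l2fdb = {(100, "MAC1"): "m-interface1", (100, "MAC3"): "port7"}
switch_flag = {"m-interface1": "primary"}
member_port = {"m-interface1": "port1"}

normal = egress_port(l2fdb, switch_flag, member_port, 100, "MAC1")
switch_flag["m-interface1"] = "backup"
failed = egress_port(l2fdb, switch_flag, member_port, 100, "MAC1")
direct = egress_port(l2fdb, switch_flag, member_port, 100, "MAC3")
```

Only the flag value changes between the two calls for MAC1; the per-MAC entry and the interface-to-port mapping are left as they were.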
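When switch 1 is a layer-2 switch and performs only layer-2 forwarding, the entries that belong to the third forwarding database are learned as follows, as shown in FIG. 10:
1. Server 1 and server 3 belong to the same VLAN (VLAN 100). Because the ARP table of server 1 initially has no MAC address information for server 3, server 1 broadcasts an ARP request for the MAC address of server 3; the target IP address of the ARP request is 1.1.1.3.
2. After receiving the ARP request from server 1, the chip of switch 1 looks up the second forwarding database according to the ingress port of the ARP request (port 1) and finds the corresponding MLAG member interface identifier "m-interface1". Switch 1 records the correspondence among the source MAC address of the ARP request, the VLAN identifier of port 1 (VLAN 100), and the MLAG member interface identifier (100<=>MAC1<=>m-interface1) in the L2FDB, as entry 603 in FIG. 6. Switch 1 identifies the destination MAC address of the packet as the broadcast address and then broadcasts the packet within VLAN 100.
3. After receiving the broadcast packet, server 3 updates its own ARP table with the information of server 1 (source MAC address and source IP address). Because the target IP address of the broadcast packet is the IP address of server 3 itself, server 3 sends an ARP reply to server 1 containing its own MAC address (MAC3).
4. After receiving the ARP reply, the chip of switch 1 looks up the second forwarding database for the MLAG member interface identifier corresponding to the ingress port of the ARP reply (port 7). Because server 3 is connected only to switch 1 and not to switch 2, no MLAG member interface identifier corresponding to port 7 is found in the second forwarding database. The chip of switch 1 therefore adds the correspondence among the source MAC address of the ARP reply (here MAC3, the MAC address of server 3), the VLAN identifier of port 7 (VLAN 100), and the physical port (port 7) (MAC3<=>VLAN 100<=>port7) to its own L2FDB, as entry 602 shown in FIG. 6. Then, according to the destination MAC address (MAC1) and VID (VLAN 100) of the ARP reply, it finds the corresponding MLAG member interface identifier "m-interface1" in the L2FDB through entry 603 shown in FIG. 6. By looking up the first forwarding database (entry 801 shown in FIG. 8) and the second forwarding database (entry 901 shown in FIG. 9), it confirms that the physical port corresponding to "m-interface1" is port 1 and that the current MLAG link state is "primary", and finally sends the ARP reply to server 1 through port 1.
5. After receiving the ARP reply, server 1 adds the MAC address of server 3 to its own ARP table. Server 1 can then send packet C, whose destination MAC address is MAC3, to server 3.
6. After receiving packet C, the chip of switch 1 uses the destination MAC address and VID in packet C to find the entry in the L2FDB (entry 602 shown in FIG. 6) and its port identifier "port7". Because "port7" denotes a physical port, switch 1 does not need to look up the first and second forwarding databases and sends packet C directly to server 3 through port 7.
7. After receiving packet C, server 3 sends a response packet to packet C.
8. After switch 1 receives the response packet to packet C, the process is similar to step 4: by looking up the third forwarding database (L2FDB), the first forwarding database, and the second forwarding database, switch 1 finally finds physical egress port 1 and sends the packet to server 1 through port 1.
Through the preceding steps, switch 1 completes learning of the entries that belong to the third forwarding database. For subsequent packet forwarding between server 1 and server 3, switch 1 only needs to look up the first, second, and third forwarding databases through the chip to perform hardware forwarding.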
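The learning rule used in steps 2 and 4 above can be sketched in a few lines of Python: the source MAC address is recorded against the MLAG member interface when the ingress port belongs to an MLAG link, and against the physical port otherwise. The function name `learn` and the `port_to_member` dictionary (the second forwarding database indexed by port) are assumptions made for this illustration.

```python
def learn(l2fdb, port_to_member, vid, src_mac, ingress_port):
    """Record where src_mac was seen and return the stored target."""
    # Fall back to the physical port when the ingress port has no
    # MLAG member interface (for example, a singly attached server).
    target = port_to_member.get(ingress_port, ingress_port)
    l2fdb[(vid, src_mac)] = target
    return target


port_to_member = {"port1": "m-interface1", "port2": "m-interface2"}
l2fdb = {}

# ARP request from server 1 arrives on port 1 (an MLAG port).
via_mlag = learn(l2fdb, port_to_member, 100, "MAC1", "port1")
# ARP reply from server 3 arrives on port 7 (not an MLAG port).
via_port = learn(l2fdb, port_to_member, 100, "MAC3", "port7")
```

After these two calls the table holds the same two layer-2 entries that the description attributes to entries 603 and 602 of FIG. 6.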
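As shown in FIG. 11, server 1 accesses the MLAG system formed by switch 1 and switch 2. When the chip or CPU of switch 1 detects that an MLAG link (port 1) has failed, the chip of switch 1 looks up the second forwarding database, for example entry 901 shown in FIG. 9, and determines that the MLAG member interface corresponding to the faulty MLAG link is "m-interface1". The chip then looks up the corresponding active/standby switchover flag in the first forwarding database according to "m-interface1", such as entry 1101, and changes the flag from "primary" to "backup" (entry 1102) to instruct switch 1 to communicate with server 1 through the peer-link link (port 5). When the MLAG link (port 1) returns to normal, the chip looks up the second forwarding database and the first forwarding database and changes the active/standby switchover flag corresponding to the MLAG link from "backup" to "primary" to instruct switch 1 to communicate with server 1 through the MLAG link (port 1). When an MLAG link fails or recovers, switch 1 only needs to modify the corresponding active/standby switchover flag in the first forwarding database according to the event to switch the link. Compared with updating multiple entries of the MAC address table (one faulty port corresponds to multiple MAC addresses), or deleting the mapping records of the faulty port and then relearning (learning MAC addresses involves packet broadcasts and responses), the faulty link switchover method disclosed in this application takes less time and can greatly improve link switchover efficiency.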
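The switchover described above amounts to one flag write per link event. The Python sketch below illustrates this under stated assumptions: the function name `on_link_event`, the `port_to_member` view of the second forwarding database, and the 1000 hypothetical MAC entries are all invented for the example.

```python
def on_link_event(switch_flag, port_to_member, port, link_up):
    """Flip the switchover flag of the MLAG member interface behind port."""
    member = port_to_member.get(port)
    if member is None:
        return None  # the port is not an MLAG port; nothing to switch
    switch_flag[member] = "primary" if link_up else "backup"
    return member


port_to_member = {"port1": "m-interface1"}
switch_flag = {"m-interface1": "primary"}
# Hypothetical third forwarding database with many hosts behind one interface.
l2fdb = {(100, f"MAC{i}"): "m-interface1" for i in range(1000)}
snapshot = dict(l2fdb)

on_link_event(switch_flag, port_to_member, "port1", link_up=False)
flag_after_failure = switch_flag["m-interface1"]
on_link_event(switch_flag, port_to_member, "port1", link_up=True)
flag_after_recovery = switch_flag["m-interface1"]
```

However many MAC addresses sit behind the interface, the per-MAC table is never written during failure or recovery in this sketch.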
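As shown in FIG. 12, switch 3 accesses the MLAG system formed by switch 1 and switch 2, and server 1 communicates through switch 3 with server 2, which is connected to switch 1. Under normal conditions, server 1 communicates with server 2 via switch 3, link 1201, switch 1, and link 1204. After link 1201 fails, switch 1 switches from the MLAG link (link 1201) to the peer-link link (link 1203). Because the third forwarding database of switch 1 already contains the information of server 1 and server 2, communication between server 1 and server 2 can continue.
Suppose a new server 4 is now connected to switch 3. For switch 2, the MLAG link corresponding to switch 3 (link 1202) is normal, so switch 2 can learn the MAC information of server 4 normally and fill it into its third forwarding database. For switch 1, however, packets sent by server 4 arrive over the peer-link link (link 1203) and do not carry the information of switch 3. Switch 1 therefore cannot determine that the MLAG link corresponding to server 4 is link 1201, and cannot fill the MAC information of server 4 into its third forwarding database (the MLAG member interface identifier cannot be determined). In one implementation, before switch 1 receives packets sent by server 4, switch 2 synchronizes the information of its own third forwarding database (including the information of server 4) to switch 1 over the peer-link link. Switch 1 then synchronizes the information of server 4 into its own third forwarding database according to the information of the third forwarding database on switch 2.
In one implementation, entries may conflict between the third forwarding databases on switch 1 and switch 2. For example, an entry with the MAC address MAC10 exists on both switch 1 and switch 2; the last update time of the entry on switch 1 is t1, and the last update time of the entry on switch 2 is t2 (t1 < t2, meaning that the entry on switch 2 was updated later than the entry on switch 1). When the data on switch 2 is synchronized to switch 1, the entry on switch 1 can be overwritten directly because t1 < t2.
Because switch 1 and switch 2 use the same MLAG member interface identifier for switch 3 (assume the identifier is "m-interface4"), after link 1201 fails, server 4 can, like server 1, communicate with server 2 through the peer-link link. After link 1201 returns to normal, the chip of switch 1 modifies the active/standby switchover flag corresponding to "m-interface4" in the first forwarding database, and server 4 can communicate with server 2 through link 1201.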
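The synchronization and the timestamp-based conflict rule described above can be sketched as a simple merge. In this Python illustration each entry carries its last update time; the function name `merge_from_peer`, the tuple layout, the MAC name "MAC4" for server 4, the stale interface name "m-interface9", and the numeric times are all assumptions made for the example.

```python
def merge_from_peer(local, peer):
    """Merge the peer's third forwarding database into the local one.

    Both tables map MAC -> (MLAG member interface, last update time).
    A peer entry wins when the MAC is unknown locally or the peer's
    copy was updated later.
    """
    for mac, (member, updated) in peer.items():
        if mac not in local or local[mac][1] < updated:
            local[mac] = (member, updated)


# Switch 1 holds an older MAC10 entry (t1 = 10) and has not learned server 4.
switch1 = {"MAC10": ("m-interface9", 10), "MAC1": ("m-interface4", 5)}
# Switch 2 holds a newer MAC10 entry (t2 = 20) and has learned server 4.
switch2 = {"MAC10": ("m-interface4", 20), "MAC4": ("m-interface4", 18)}

merge_from_peer(switch1, switch2)
```

Because both switches use the same member interface identifier for switch 3, the merged entry for server 4 is immediately usable on switch 1 without any translation.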
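It should be noted that the embodiments provided in this application are merely illustrative. A person skilled in the art can clearly understand that, for convenience and brevity of description, the descriptions of the embodiments above each have their own emphasis; for a part that is not described in detail in one embodiment, refer to the related descriptions in other embodiments. The features disclosed in the embodiments, claims, and accompanying drawings of this application may exist independently or in combination, and no limitation is imposed here.
The foregoing descriptions are merely specific implementations of the present invention, and the protection scope of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.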

Claims (14)

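  1. A network device, comprising a chip, wherein
    the chip is configured to generate a forwarding database and to forward a received packet according to the forwarding database; and
    the forwarding database comprises a first forwarding database, a second forwarding database, and a third forwarding database.
  2. The network device according to claim 1, wherein
    the first forwarding database is used to store a mapping from at least one multi-chassis link aggregation group (MLAG) member interface to at least one active/standby switchover flag;
    the second forwarding database is used to store a mapping from at least one physical port to the at least one MLAG member interface; and
    the third forwarding database is used to store a mapping from a media access control (MAC) address to a first MLAG member interface, wherein the first MLAG member interface is one of the at least one MLAG member interface.
  3. The network device according to claim 2, wherein the chip is further configured to:
    store, in the third forwarding database, a mapping from a source MAC address of the packet to a second MLAG member interface, wherein the second MLAG member interface is the MLAG member interface that corresponds to a first physical port according to the second forwarding database, and the first physical port is the physical port on which the packet is received.
  4. The network device according to claim 2 or 3, wherein the chip is configured to: when an MLAG link fails, update the value of the active/standby switchover flag corresponding to a third MLAG member interface in the first forwarding database to a backup flag, wherein the backup flag is used to instruct the chip to send, over a peer link, a packet destined for the third MLAG member interface, the third MLAG member interface is the MLAG member interface that corresponds to a second physical port according to the second forwarding database, and the second physical port is the physical port of the MLAG link.
  5. The network device according to claim 4, wherein when the MLAG link recovers, the chip updates the value of the active/standby switchover flag corresponding to the third MLAG member interface to a primary flag, wherein the primary flag is used to instruct the chip to send the packet according to the second forwarding database.
  6. The network device according to any one of claims 2 to 5, wherein the chip is configured to:
    determine, according to the third forwarding database, that the MLAG member interface corresponding to a destination MAC address of the packet is the first MLAG member interface; and
    when the value of the active/standby switchover flag corresponding to the first MLAG member interface in the first forwarding database is a primary flag, forward the packet through a third physical port, wherein the third physical port is the physical port corresponding to the first MLAG member interface in the second forwarding database.
  7. The network device according to claim 6, wherein the chip is further configured to:
    when the value of the active/standby switchover flag corresponding to the first MLAG member interface is a backup flag, forward the packet over a peer link.
  8. The network device according to any one of claims 1 to 7, wherein the network device and another network device form an MLAG system, and
    the network device is configured to synchronize the third forwarding database to the other network device.
  9. An MLAG link switching method, comprising:
    when an MLAG link of a network device fails, updating the value of the active/standby switchover flag corresponding to a first MLAG member interface in a first forwarding database to a backup flag, wherein the backup flag is used to instruct the network device to send, over a peer link, a packet destined for the first MLAG member interface, the first MLAG member interface is the MLAG member interface that corresponds to a first physical port according to a second forwarding database, and the first physical port is the physical port of the MLAG link, wherein the network device comprises the first forwarding database and the second forwarding database, the first forwarding database is used to store a mapping from at least one multi-chassis link aggregation group (MLAG) member interface to at least one active/standby switchover flag, and the second forwarding database is used to store a mapping from at least one physical port to the at least one MLAG member interface.
  10. The method according to claim 9, comprising:
    when the MLAG link recovers, updating the value of the active/standby switchover flag corresponding to the first MLAG member interface to a primary flag, wherein the primary flag is used to instruct the network device to send the packet according to the second forwarding database.
  11. The method according to claim 9 or 10, wherein the network device further comprises a third forwarding database, and the method further comprises:
    storing, in the third forwarding database, a mapping from a source MAC address of the packet to a second MLAG member interface, wherein the second MLAG member interface is the MLAG member interface that corresponds to a second physical port according to the second forwarding database, and the second physical port is the physical port on which the packet is received.
  12. The method according to claim 11, further comprising:
    determining, according to the third forwarding database, that the MLAG member interface corresponding to a destination MAC address of the packet is the first MLAG member interface, wherein the third forwarding database further comprises a mapping from the destination MAC address to the first MLAG member interface; and
    when the value of the active/standby switchover flag corresponding to the first MLAG member interface in the first forwarding database is a primary flag, forwarding the packet through the first physical port.
  13. The method according to claim 12, further comprising:
    when the value of the active/standby switchover flag corresponding to the first MLAG member interface is a backup flag, forwarding the packet over the peer link.
  14. The method according to any one of claims 11 to 13, further comprising:
    synchronizing the third forwarding database to another network device, wherein the other network device and the network device form an MLAG system.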
PCT/CN2021/105665 2020-10-12 2021-07-12 MLAG link failure switching method and apparatus WO2022077972A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP21879020.2A EP4207646A4 (en) 2020-10-12 2021-07-12 MLAG CONNECTION FAULT SWITCHING METHOD AND APPARATUS
JP2023521837A JP2023544870A (ja) 2020-10-12 2021-07-12 MLAG link failure switching method and apparatus
US18/298,958 US20230246949A1 (en) 2020-10-12 2023-04-11 Mlag link failure switching method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011086369.0 2020-10-12
CN202011086369.0A CN114338512A (zh) 2020-10-12 2020-10-12 MLAG link failure switching method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/298,958 Continuation US20230246949A1 (en) 2020-10-12 2023-04-11 Mlag link failure switching method and apparatus

Publications (1)

Publication Number Publication Date
WO2022077972A1 true WO2022077972A1 (zh) 2022-04-21

Family

ID=81032113

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/105665 WO2022077972A1 (zh) 2020-10-12 2021-07-12 MLAG link failure switching method and apparatus

Country Status (5)

Country Link
US (1) US20230246949A1 (zh)
EP (1) EP4207646A4 (zh)
JP (1) JP2023544870A (zh)
CN (1) CN114338512A (zh)
WO (1) WO2022077972A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024001315A1 (zh) * 2022-06-29 2024-01-04 ZTE Corporation Network element switching method and apparatus, cross-device link aggregation group, and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115225468B (zh) * 2022-07-26 2024-06-14 Suzhou Centec Communications Co., Ltd. Fast traffic switching method and system, and computer-readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160036728A1 (en) * 2014-07-31 2016-02-04 Arista Networks, Inc Method and system for vtep redundancy in a multichassis link aggregation domain
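CN106656789A (zh) * 2016-12-30 2017-05-10 Centec Networks (Suzhou) Co., Ltd. Chip implementation method for MLAG broadcast and multicast
CN108390821A (zh) * 2018-02-27 2018-08-10 Centec Networks (Suzhou) Co., Ltd. Method and system for implementing active-active operation on an OpenFlow switch
CN109088819A (zh) * 2018-07-25 2018-12-25 New H3C Technologies Co., Ltd. Hefei Branch Packet forwarding method, switch, and computer-readable storage medium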

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8787149B1 (en) * 2012-02-01 2014-07-22 Juniper Networks, Inc. MAC address synchronization for multi-homing with multichassis link aggregation
US20140204760A1 (en) * 2013-01-22 2014-07-24 Brocade Communications Systems, Inc. Optimizing traffic flows via mac synchronization when using server virtualization with dynamic routing

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160036728A1 (en) * 2014-07-31 2016-02-04 Arista Networks, Inc Method and system for vtep redundancy in a multichassis link aggregation domain
CN106656789A (zh) * 2016-12-30 2017-05-10 盛科网络(苏州)有限公司 Mlag广播和组播的芯片实现方法
CN108390821A (zh) * 2018-02-27 2018-08-10 盛科网络(苏州)有限公司 一种openflow交换机实现双活的方法及系统
CN109088819A (zh) * 2018-07-25 2018-12-25 新华三技术有限公司合肥分公司 一种报文转发方法、交换机及计算机可读存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4207646A4

Also Published As

Publication number Publication date
CN114338512A (zh) 2022-04-12
JP2023544870A (ja) 2023-10-25
US20230246949A1 (en) 2023-08-03
EP4207646A1 (en) 2023-07-05
EP4207646A4 (en) 2024-02-21

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021879020

Country of ref document: EP

Effective date: 20230331

ENP Entry into the national phase

Ref document number: 2023521837

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21879020

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE