CN111030926B - Method and device for improving high availability of network - Google Patents

Info

Publication number
CN111030926B
Authority
CN
China
Prior art keywords
top-of-rack (TOR) switch
server
network
switch
network card
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911330774.XA
Other languages
Chinese (zh)
Other versions
CN111030926A (en)
Inventor
任长雷
李德新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN201911330774.XA
Publication of CN111030926A
Application granted
Publication of CN111030926B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00: Routing or path finding of packets in data switching networks
    • H04L45/24: Multipath
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0654: Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663: Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00: Network arrangements, protocols or services for addressing or naming
    • H04L61/09: Mapping addresses
    • H04L61/10: Mapping addresses of different types
    • H04L61/103: Mapping addresses of different types across network layers, e.g. resolution of network layer into physical layer addresses or address resolution protocol [ARP]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00: Packet switching elements
    • H04L49/70: Virtual switches

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a method and a device for improving network high availability. The method comprises the following steps: a server periodically sends ARP request messages to two top-of-rack (TOR) switches through two network cards configured with equal-cost routes, each network card sending to the TOR switch it is connected to; each of the two TOR switches generates a host route to the server from the ARP request message and reports the host route to a spine switch; the spine switch routes to the server according to the host routes and transmits uplink and/or downlink traffic through the two TOR switches and the two network cards; in response to a failure of one of the two network cards, the server transmits uplink traffic only through the other network card and its corresponding TOR switch; and in response to one of the two TOR switches not receiving the ARP request message within a threshold time, the spine switch is notified to transmit downlink traffic only through the other TOR switch and its corresponding network card. Embodiments of the invention are not bound by the special requirements of stacking or M-LAG in existing schemes, so no new equipment has to be purchased, which improves equipment utilization and network reliability.

Description

Method and device for improving high availability of network
Technical Field
The invention relates to the technical field of cloud data. The invention further relates to a method and a device for improving the high availability of the network.
Background
With the rapid development of cloud computing in recent years, more and more users deploy some of their services or applications, including key ones, to public or private clouds. As the number of users and services grows rapidly, cloud service providers have come to recognize how important the availability of cloud products is for acquiring and retaining users. Whether a user-facing product meets its SLA (Service Level Agreement) depends on every link of the cloud service, and the underlying network is the environment on which all products depend, so improving network high availability more effectively plays an important role in improving product SLAs as a whole.
To achieve high availability of networks, especially leaf-spine networks, four server-switch connection schemes are currently mainstream (see FIG. 1):
1. each node under a TOR (top-of-rack) switch is a single point of failure;
2. the two network cards of the server are bonded with LACP (Link Aggregation Control Protocol), which removes the single point of failure at the server, but the TOR switch remains a single point of failure;
3. dual network cards with stacked switches: the server is connected to 2 TOR switches by LACP bonding, the 2 TOR switches must be switches of the same manufacturer and the same series configured as a stack, and all network nodes are redundant;
4. dual network cards with unstacked switches: the two network cards of the server are bonded with LACP and connected to 2 TOR switches, the 2 TOR switches must be switches of the same manufacturer configured to run M-LAG (Multichassis Link Aggregation Group), and all network nodes are redundant.
Of the four schemes, scheme 4 (dual network cards with unstacked switches) offers relatively high availability, but it is constrained by the special requirements of stacking or M-LAG: matching equipment has to be added, which raises cost and correspondingly lowers equipment utilization.
A scheme that improves network high availability more effectively is therefore needed. In particular, for dual network cards with unstacked switches, the scheme should remove the restriction, present in both the stacked-switch scheme and the M-LAG scheme among the current mainstream server-switch connection schemes, that equipment from the same manufacturer must be used, so that equipment from different manufacturers can be combined heterogeneously and both the utilization of existing equipment and the reliability of the network are improved.
Disclosure of Invention
In one aspect, based on the above object, the present invention provides a method for improving network high availability, wherein the method comprises the following steps:
the server periodically sends ARP request messages to two TOR switches through two network cards with equal-cost routes, each network card sending to its own TOR switch;
the two TOR switches each generate a host route to the server from the ARP request message and report the host route to the spine switch;
the spine switch routes to the server according to the host routes and transmits uplink and/or downlink traffic through the two TOR switches and the two network cards;
in response to a failure of one of the two network cards, the server transmits uplink traffic only through the other network card and its corresponding TOR switch;
and in response to one of the two TOR switches not receiving the ARP request message within a threshold time, the spine switch is notified to transmit downlink traffic only through the other TOR switch and its corresponding network card.
According to an embodiment of the method for improving network high availability of the present invention, the step in which the server periodically sends ARP request messages to the two TOR switches through the two network cards with equal-cost routes further comprises:
configuring, for the two network cards, equal-cost routes with the same IP address and the same mask;
and the server sends, according to the equal-cost routes, an ARP request message through the first network card to a first TOR switch directly connected to the first network card, and an ARP request message through the second network card to a second TOR switch directly connected to the second network card.
According to an embodiment of the method for improving network high availability of the present invention, the step in which the two TOR switches each generate a host route to the server from the ARP request message and report it to the spine switch further comprises:
the two TOR switches receive and parse the ARP request messages, and generate a host route to the server and the corresponding ARP table entries for the two network cards from the parsed information related to the equal-cost routes;
the two TOR switches report the host route to the spine switch via the External Border Gateway Protocol (EBGP).
According to an embodiment of the method for improving network high availability of the present invention, the step in which the spine switch routes to the server according to the host routes and transmits uplink and/or downlink traffic through the two TOR switches and the two network cards further comprises:
the spine switch routes to the server according to the host routes;
two links are established between the spine switch and the server, one through the first TOR switch and the first network card and one through the second TOR switch and the second network card, wherein the two links have equal-cost routes;
and the spine switch and the server transmit uplink and/or downlink traffic over the two links according to the host routes.
According to an embodiment of the method for improving network high availability of the present invention, the step in which, in response to a failure of one of the two network cards, the server transmits uplink traffic only through the other network card and its corresponding TOR switch further comprises:
configuring and running an Ethernet link detection daemon on the server;
and determining whether either of the two network cards has failed from the physical connection state of the two network cards as monitored by the Ethernet link detection daemon.
According to an embodiment of the method for improving network high availability of the present invention, the step in which, in response to a failure of one of the two network cards, the server transmits uplink traffic only through the other network card and its corresponding TOR switch further comprises:
in response to a failure of one of the two network cards, deleting the routing information of the failed network card from the server;
and the server transmits uplink traffic to the spine switch through the other network card and the TOR switch directly connected to it, according to the routing information of the other network card.
According to an embodiment of the method for improving network high availability of the present invention, the step in which, in response to one of the two TOR switches not receiving the ARP request message within the threshold time, the spine switch is notified to transmit downlink traffic only through the other TOR switch and its corresponding network card further comprises:
in response to one of the two TOR switches not receiving, within the threshold time, an ARP request message from the network card directly connected to it, determining that the corresponding link is abnormal;
deleting, from the TOR switch on the abnormal link, the host route and the ARP table entry of the network card directly connected to that switch;
the TOR switch on the abnormal link notifies the spine switch of the abnormality via the External Border Gateway Protocol (EBGP).
According to an embodiment of the method for improving network high availability of the present invention, the step in which, in response to one of the two TOR switches not receiving the ARP request message within the threshold time, the spine switch is notified to transmit downlink traffic only through the other TOR switch and its corresponding network card further comprises:
the spine switch deletes the abnormal link between itself and the server according to the received abnormality information and transmits downlink traffic to the server over the other link.
According to an embodiment of the method for improving network high availability of the present invention, the method further comprises:
in response to the failed network card recovering from the failure, the server again periodically sends ARP request messages to the two TOR switches through the two network cards, respectively.
In another aspect, the present invention further provides an apparatus for improving network high availability, where the apparatus includes:
at least one processor; and
a memory storing processor-executable program instructions that, when executed by the processor, perform the steps of the method of any of the preceding embodiments.
By adopting the above technical solution, the invention has at least the following beneficial effects: the two network cards of the server are connected to two switches; the switches already deployed in the existing network are reused by enabling their ARP-to-host-route conversion and fast ARP aging functions; the two network cards of the server are configured with equal-cost routes that point to 2 independent TOR (top-of-rack) switches, so that two actual data links associated with the same host route are established; and the links carrying uplink and downlink traffic are switched according to the state of the network cards and the signals received by the TOR switches, so that redundancy of the network nodes is achieved.
The present invention provides aspects of embodiments, which should not be used to limit the scope of the present invention. Other embodiments are contemplated in accordance with the techniques described herein, as will be apparent to one of ordinary skill in the art upon study of the following figures and detailed description, and are intended to be included within the scope of the present application.
Embodiments of the invention are explained and described in more detail below with reference to the drawings, but they should not be construed as limiting the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are required to be used in the description of the prior art and the embodiments will be briefly described below, parts in the drawings are not necessarily drawn to scale, and related elements may be omitted, or in some cases the scale may have been exaggerated in order to emphasize and clearly show the novel features described herein. In addition, the structural order may be arranged differently, as is known in the art.
FIG. 1 is a schematic diagram of the mainstream server-switch connection schemes conventionally used to achieve network high availability;
FIG. 2 shows a schematic block diagram of an embodiment of a method of improving network high availability according to the present invention;
FIG. 3 shows a schematic diagram of a server-switch connection according to the invention.
Detailed Description
While the present invention may be embodied in various forms, there is shown in the drawings and will hereinafter be described some exemplary and non-limiting embodiments, with the understanding that the present disclosure is to be considered an exemplification of the invention and is not intended to limit the invention to the specific embodiments illustrated.
It should be noted that the steps mentioned in the following description of the embodiments of the present invention are only numbered for convenience and clarity of indicating the steps without specific description, and do not limit the sequence of the steps.
Fig. 2 shows a schematic block diagram of an embodiment of a method of improving network high availability according to the present invention. In the embodiment shown in fig. 2, the method comprises at least the following steps:
S1: the server periodically sends ARP request messages to two TOR switches through two network cards with equal-cost routes, each network card sending to its own TOR switch;
S2: the two TOR switches each generate a host route to the server from the ARP request message and report the host route to the spine switch;
S3: the spine switch routes to the server according to the host routes and transmits uplink and/or downlink traffic through the two TOR switches and the two network cards;
S4: in response to a failure of one of the two network cards, the server transmits uplink traffic only through the other network card and its corresponding TOR switch;
S5: in response to one of the two TOR switches not receiving the ARP request message within a threshold time, the spine switch is notified to transmit downlink traffic only through the other TOR switch and its corresponding network card.
First, to carry out the method of the embodiments within the scope of the present invention, the relevant configuration must be deployed on the switch side and on the server side, respectively:
Switch-side configuration:
1. configure IP addresses, which may be the same or different, on the two TOR switches;
2. enable the ARP proxy function;
3. enable the ARP-to-host-route conversion function;
4. configure the ARP timeout threshold so that the host route is withdrawn quickly when the server stops responding to ARP, for example a timeout threshold of 5 seconds;
5. establish EBGP (External Border Gateway Protocol) neighbor relationships between the 2 TOR (top-of-rack) switches and the upper-layer spine switch.
Server-side configuration:
1. the operating system of the server requires a kernel version that supports ECMP (Equal-Cost Multi-Path) routing with L4 hashing;
2. the two network cards of the server are configured with the same IP address with a 32-bit mask, which ensures that the routing paths to the server are fully equal-cost from the point of view of the upper-layer node;
3. configure a default equal-cost route that points to both network cards simultaneously;
4. configure the arping command to run at startup, so that once the network cards are up, each of the two network cards sends ARP messages and the switches learn the ARP table entries, which speeds up convergence.
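The server-side steps above can be sketched as a small Python helper that only assembles the Linux commands as strings and does not run them. This is a minimal sketch under stated assumptions: the patent does not name concrete commands, so the use of `sysctl`, iproute2 (`ip`) and iputils `arping`, as well as the interface names, addresses and the function name `server_setup_commands`, are illustrative choices rather than the inventors' configuration.

```python
def server_setup_commands(ip, nics):
    """Return shell commands for the server-side steps listed above.

    `nics` maps an interface name to the gateway address on its directly
    connected TOR switch. All names and addresses are hypothetical.
    """
    # Ask the kernel to hash on L4 fields when choosing an ECMP next hop.
    cmds = ["sysctl -w net.ipv4.fib_multipath_hash_policy=1"]
    for dev in nics:
        # Same address with a 32-bit mask on both network cards.
        cmds.append(f"ip addr add {ip}/32 dev {dev}")
    # One default route with one next hop per network card.
    hops = " ".join(f"nexthop via {gw} dev {dev} onlink" for dev, gw in nics.items())
    cmds.append(f"ip route add default {hops}")
    for dev in nics:
        # Unsolicited ARP so each TOR switch learns the entry right after link-up.
        cmds.append(f"arping -U -c 1 -I {dev} {ip}")
    return cmds


if __name__ == "__main__":
    example = {"eth0": "10.0.1.1", "eth1": "10.0.2.1"}
    for line in server_setup_commands("10.0.0.10", example):
        print(line)
```

Generating the commands instead of executing them keeps the sketch safe to run anywhere; on a real server they would typically be placed in a boot-time script.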
On this basis, and with reference to the server-switch connection shown in FIG. 3, an embodiment of the method according to the invention may be implemented as follows. In step S1, the server 10 periodically sends ARP request messages to the two TOR switches 22 and 24 through the two network cards 12 and 14, which have equal-cost routes. In step S2, the two TOR switches 22 and 24 each generate a host route to the server 10 from the ARP request messages and report it to the spine switch 30. Because the host route is a route to the server 10 itself, the spine switch 30 can route directly to the server without any handling specific to the two TOR switches 22 and 24 or the two network cards 12 and 14. In other words, the lower-layer nodes (the two TOR switches 22 and 24 and the two network cards 12 and 14) are transparent to the upper-layer node (the spine switch 30): the spine switch 30 only needs to know which server 10 it is communicating with, not which path is taken. In step S3, the spine switch 30 therefore routes to the server 10 according to the host routes and transmits uplink and/or downlink traffic through the two TOR switches 22, 24 and the two network cards 12, 14. Uplink traffic is data traffic sent from the server 10 to the spine switch 30; downlink traffic is data traffic sent from the spine switch 30 to the server 10. While the nodes at every level of the network operate normally, data is transmitted between the server and the switches as described above.
When a node fails, a corresponding redundancy protection mechanism is needed to keep the network highly available. When one of the two network cards 12 or 14 of the server 10 fails, the server 10 in step S4 transmits uplink traffic only through the other network card 14 or 12 and the corresponding TOR switch 24 or 22. That is, the server no longer transmits uplink traffic through both TOR switches 22, 24 and both network cards 12, 14; it actively stops using the failed network card 12 or 14 and uses only the other network card 14 or 12 and the corresponding TOR switch 24 or 22. On the other hand, when one of the two TOR switches 22 or 24 has not received the ARP request message within the threshold time, the corresponding data link has failed and cannot carry data normally, so in step S5 the TOR switch 22 or 24 that detected the failure notifies the spine switch 30 to transmit downlink traffic only through the other TOR switch 24 or 22 and the corresponding network card 14 or 12. The two TOR switches 22, 24 and the two network cards 12, 14 thus back each other up and keep the network highly available, without being bound by the special requirements of switch stacking or M-LAG. In this way the dual-network-card, dual-uplink connection between the server and the two TOR switches is achieved without modifying the Linux kernel of the server, and the server side can access the network in Layer 2 mode or in BGP (Border Gateway Protocol) routing mode. The two TOR switches are completely independent, heterogeneous switches avoid the risk of depending on a single manufacturer, and the stability of the network side is improved.
In some embodiments of the method for improving network high availability of the present invention, step S1, in which the server periodically sends ARP request messages to the two TOR switches through the two network cards with equal-cost routes, further includes:
S11: configuring, for the two network cards, equal-cost routes with the same IP address and the same mask;
S12: the server sends, according to the equal-cost routes, an ARP request message through the first network card to a first TOR switch directly connected to the first network card, and an ARP request message through the second network card to a second TOR switch directly connected to the second network card.
Specifically, step S11 configures equal-cost routes with the same IP address and the same mask, for example the same IP address with a 32-bit mask, for the two network cards 12 and 14 of the server 10, so that the routing paths to the server 10 are fully equal-cost from the point of view of the upper-layer node, i.e. the spine switch 30. As for the cabling, the first network card 12 is directly connected to the first TOR switch 22 and the second network card 14 is directly connected to the second TOR switch 24. After the network cards 12 and 14 of the server 10 come up, the server 10 sends ARP request messages according to the configured equal-cost routes, at the same time, to the TOR switches 22 and 24 directly connected to the two network cards 12 and 14. That is, in step S12 the server 10 sends an ARP request message through the first network card 12 to the first TOR switch 22 directly connected to it, and an ARP request message through the second network card 14 to the second TOR switch 24 directly connected to it.
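To make step S12 concrete, the following Python sketch builds the bytes of an ARP request frame for each network card. It follows the standard Ethernet/ARP layout (EtherType 0x0806, opcode 1); the MAC addresses, IP addresses and function names are hypothetical, and actually transmitting the frames (for example through a raw socket, which needs elevated privileges) is deliberately left out.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP request (opcode 1)."""
    ethernet = BROADCAST_MAC + src_mac + struct.pack("!H", 0x0806)  # EtherType ARP
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6, 4,                         # MAC length, IPv4 address length
        1,                            # opcode: request
        src_mac, socket.inet_aton(src_ip),
        b"\x00" * 6, socket.inet_aton(target_ip),
    )
    return ethernet + arp


def frames_for_nics(server_ip, nics):
    """`nics` maps interface name -> (MAC bytes, IP of the directly connected TOR)."""
    return {
        name: build_arp_request(mac, server_ip, tor_ip)
        for name, (mac, tor_ip) in nics.items()
    }


if __name__ == "__main__":
    nics = {
        "eth0": (bytes.fromhex("020000000012"), "10.0.1.1"),
        "eth1": (bytes.fromhex("020000000014"), "10.0.2.1"),
    }
    for name, frame in frames_for_nics("10.0.0.10", nics).items():
        print(name, len(frame), frame.hex())
```

Both frames carry the same sender IP address, which is what lets each TOR switch derive a host route to the same server.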
In some embodiments of the method for improving network high availability of the present invention, step S2, in which the two TOR switches each generate a host route to the server from the ARP request message and report it to the spine switch, further includes:
S21: the two TOR switches receive and parse the ARP request messages, and generate a host route to the server and the corresponding ARP table entries for the two network cards from the parsed information related to the equal-cost routes;
S22: the two TOR switches report the host route to the spine switch via the External Border Gateway Protocol (EBGP).
After the TOR switches 22 and 24 connected to the server 10 receive the ARP request messages sent by the server 10, they perform ARP learning. That is, in step S21 the two TOR switches 22 and 24 receive and parse the ARP request messages and, from the parsed information related to the equal-cost routes, generate a host route to the server 10 together with the corresponding ARP table entries for the two network cards 12 and 14. Subsequently, in step S22, the two TOR switches 22 and 24 report the host route to the spine switch 30 via EBGP, that is, they advertise the host route to the server 10 generated in step S21 to the upper-layer spine switch 30.
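Steps S21 and S22 can be modelled in a few lines of Python. This is a simplified in-memory sketch, not switch firmware: the class `TorSwitch`, its fields and the dictionary returned as an "announcement" are hypothetical stand-ins for the ARP table, the routing table and an EBGP update.

```python
from dataclasses import dataclass, field


@dataclass
class TorSwitch:
    name: str
    arp_table: dict = field(default_factory=dict)   # ip -> (mac, port)
    host_routes: set = field(default_factory=set)   # e.g. {"10.0.0.10/32"}

    def on_arp_request(self, sender_ip, sender_mac, port):
        """Learn the ARP entry and derive a /32 host route from it (S21).

        Returns the route to announce to the spine switch (S22), or None
        when the route is already known and nothing new is advertised.
        """
        self.arp_table[sender_ip] = (sender_mac, port)
        prefix = f"{sender_ip}/32"
        if prefix in self.host_routes:
            return None
        self.host_routes.add(prefix)
        return {"announce": prefix, "next_hop": self.name}


if __name__ == "__main__":
    tor22, tor24 = TorSwitch("tor22"), TorSwitch("tor24")
    print(tor22.on_arp_request("10.0.0.10", "02:00:00:00:00:12", "port1"))
    print(tor24.on_arp_request("10.0.0.10", "02:00:00:00:00:14", "port1"))
```

Both switches announce the same /32 prefix with themselves as next hop, which is what gives the spine switch two equal-cost paths to the server.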
In some embodiments of the method for improving network high availability of the present invention, step S3, in which the spine switch routes to the server according to the host routes and transmits uplink and/or downlink traffic through the two TOR switches and the two network cards, further includes:
S31: the spine switch routes to the server according to the host routes;
S32: two links are established between the spine switch and the server, one through the first TOR switch and the first network card and one through the second TOR switch and the second network card, wherein the two links have equal-cost routes;
S33: the spine switch and the server transmit uplink and/or downlink traffic over the two links according to the host routes.
Specifically, when there is data traffic to transmit, the spine switch 30 routes to the server 10 according to the host routes in step S31. In step S32, two links are then established automatically between the spine switch 30 and the server 10, one through the first TOR switch 22 and the first network card 12 and one through the second TOR switch 24 and the second network card 14. The two links have equal-cost routes, that is, the upper-layer spine switch locally generates two equal-cost (ECMP) routes. On this basis, in step S33 the spine switch 30 and the server 10 transmit uplink and/or downlink traffic over the two links according to the host routes: traffic from the spine switch 30 to the server 10 travels over the two ECMP links, through the two TOR switches 22 and 24, and enters the server 10 through the two network cards 12 and 14, respectively.
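How a node spreads flows over two equal-cost links can be illustrated with a hash over the flow's L4 5-tuple. The sketch below is only an illustration of the general ECMP idea: real switches and kernels use their own hash functions, and the choice of CRC32, the tuple layout and the name `pick_next_hop` are assumptions made for this example.

```python
import zlib


def pick_next_hop(flow, next_hops):
    """Choose one equal-cost next hop from a hash of the flow's 5-tuple.

    `flow` is (src_ip, dst_ip, protocol, src_port, dst_port). Packets of the
    same flow always map to the same next hop, so they stay in order.
    """
    if not next_hops:
        raise ValueError("no route to host")
    key = "|".join(str(part) for part in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


if __name__ == "__main__":
    hops = ["tor22", "tor24"]
    for port in range(40000, 40004):
        flow = ("192.0.2.7", "10.0.0.10", "tcp", port, 443)
        print(port, pick_next_hop(flow, hops))
```

If one next hop is removed from the list, every flow maps to the remaining one, which corresponds to the single-link operation described in the failure cases.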
In some embodiments of the method for improving network high availability of the present invention, step S4, in which, in response to a failure of one of the two network cards, the server transmits uplink traffic only through the other network card and its corresponding TOR switch, further includes:
S41: configuring and running an Ethernet link detection daemon on the server;
S42: determining whether either of the two network cards has failed from the physical connection state of the two network cards as monitored by the Ethernet link detection daemon.
In these embodiments, the server side additionally configures and runs an Ethernet link detection daemon, preferably ifplugd, to detect the state of the network ports and monitor the physical connections. When a physical connection changes between UP and DOWN, the corresponding arping and route modification operations can be executed promptly. That is, the Ethernet link detection daemon on the server 10 runs in the background and monitors the link state of the two network cards 12 and 14 in real time, and whether either of the two network cards 12 and 14 has failed is determined from the physical connection state reported by the daemon.
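The link check performed by such a daemon can be approximated in Python by reading the carrier flag that Linux exposes under `/sys/class/net/<interface>/carrier`. This is a minimal sketch and not a replacement for the daemon named above: the function names are hypothetical, and the `sysfs_root` parameter exists only so the logic can be exercised against a temporary directory.

```python
from pathlib import Path


def link_is_up(iface, sysfs_root="/sys/class/net"):
    """Return True when the kernel reports a carrier on `iface`."""
    try:
        text = (Path(sysfs_root) / iface / "carrier").read_text()
    except OSError:
        # Missing interface, or carrier unreadable because the link is
        # administratively down: treat both as "not up".
        return False
    return text.strip() == "1"


def detect_changes(previous, current):
    """Compare two {iface: bool} snapshots and return the UP/DOWN transitions."""
    return {
        iface: ("UP" if up else "DOWN")
        for iface, up in current.items()
        if previous.get(iface) != up
    }


if __name__ == "__main__":
    snapshot = {iface: link_is_up(iface) for iface in ("eth0", "eth1")}
    print(snapshot)
```

A monitoring loop would take a snapshot at a short interval and pass each transition returned by `detect_changes` to the route handling described in the following steps.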
In some embodiments of the method for improving network high availability of the present invention, step S4, in which, in response to a failure of one of the two network cards, the server transmits uplink traffic only through the other network card and its corresponding TOR switch, further includes:
S43: in response to a failure of one of the two network cards, deleting the routing information of the failed network card from the server;
S44: the server transmits uplink traffic to the spine switch through the other network card and the TOR switch directly connected to it, according to the routing information of the other network card.
In these embodiments, when one of the two network cards 12 or 14 of the server 10 fails, the server 10 actively stops using the failed network card 12 or 14, that is, step S43 deletes the routing information of the failed network card 12 or 14 from the server 10. The server 10 then uses the other network card 14 or 12 and the corresponding TOR switch 24 or 22 for uplink data traffic, that is, in step S44 the server 10 transmits uplink traffic to the spine switch 30 through the other network card 14 or 12 and the TOR switch 24 or 22 directly connected to it, according to the routing information of the other network card 14 or 12.
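One way to picture steps S43 and S44 is as a rewrite of the server's default route so that it lists only the healthy network cards. The Python sketch below only builds the iproute2 command as a string; the use of `ip route replace`, the interface names, the gateway addresses and the function name are assumptions for illustration, since the patent does not specify how the routing information is deleted.

```python
def default_route_command(nics, failed=()):
    """Build the `ip route` command that keeps only healthy next hops.

    `nics` maps interface name -> gateway on the directly connected TOR
    switch; `failed` lists interfaces whose link is down. Values are
    hypothetical.
    """
    healthy = {dev: gw for dev, gw in nics.items() if dev not in failed}
    if not healthy:
        # No usable uplink is left, so remove the default route entirely.
        return "ip route del default"
    hops = " ".join(f"nexthop via {gw} dev {dev} onlink" for dev, gw in healthy.items())
    return f"ip route replace default {hops}"


if __name__ == "__main__":
    nics = {"eth0": "10.0.1.1", "eth1": "10.0.2.1"}
    print(default_route_command(nics))                   # both links healthy
    print(default_route_command(nics, failed=["eth0"]))  # eth0 has failed
```

Calling the function again with an empty `failed` list after the network card recovers restores the two-next-hop route.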
In some embodiments of the method for improving network high availability of the present invention, step S5, in which, in response to one of the two counter top switches not receiving the ARP request message for longer than a threshold time, the spine switch is notified to transmit the downstream traffic only through the other counter top switch and the corresponding network card, further includes:
S51: in response to one of the two counter top switches not receiving, for longer than the threshold time, an ARP request message from the network card directly connected with that counter top switch, judging that the corresponding link is abnormal;
S52: deleting, from the counter top switch of the abnormal link, the host route and the ARP table entry of the network card directly connected with that counter top switch;
S53: the counter top switch of the abnormal link notifying the spine switch of the abnormality information through an external border gateway protocol.
From the perspective of the counter top switches 22 and 24, when one of them does not receive an ARP request message for longer than the threshold time (for example, 5 seconds), step S51 determines that the corresponding data link is abnormal, for example because the corresponding network card 12 or 14 has failed and data cannot be transmitted normally. Step S52 therefore deletes, from the counter top switch 22 or 24 of the abnormal link, the host route and the ARP entry of the network card 12 or 14 directly connected to that switch, i.e. the abnormal link is taken out of service. In step S53, the counter top switch 22 or 24 of the abnormal link then notifies the spine switch 30 of the abnormality through the external border gateway protocol (EBGP), so that the spine switch 30 transmits downlink traffic only through the other link, i.e. the other counter top switch 24 or 22 and the corresponding network card 14 or 12.
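The timeout logic of steps S51 to S53 can be sketched as a table of last-seen timestamps on the counter top switch. This is an illustrative model only: the class name `TorSwitch`, its method names, the port label and the IP address are assumptions, and the 5-second default merely mirrors the example threshold given in the text. Sending the actual EBGP withdrawal is left to the caller.

```python
from typing import Dict, List


class TorSwitch:
    """Tracks when each directly connected host last sent an ARP request."""

    def __init__(self, threshold_s: float = 5.0) -> None:
        self.threshold_s = threshold_s
        self.last_arp: Dict[str, float] = {}    # host IP -> time of last ARP request
        self.host_routes: Dict[str, str] = {}   # host IP -> local port

    def on_arp_request(self, host_ip: str, port: str, now: float) -> None:
        """Refresh the ARP entry and (re)install the host route."""
        self.last_arp[host_ip] = now
        self.host_routes[host_ip] = port

    def age_out(self, now: float) -> List[str]:
        """S51 to S53: expire silent hosts and return the routes to withdraw."""
        expired = [ip for ip, seen in self.last_arp.items()
                   if now - seen > self.threshold_s]
        for ip in expired:
            del self.last_arp[ip]       # S52: delete the ARP entry
            del self.host_routes[ip]    # S52: delete the host route
        return expired                  # S53: caller withdraws these via EBGP


tor = TorSwitch()
tor.on_arp_request("10.0.0.5", "port1", now=0.0)
print(tor.age_out(now=4.0))   # prints []
print(tor.age_out(now=6.0))   # prints ['10.0.0.5']
```

Passing the current time in as a parameter, rather than reading a clock inside the class, keeps the aging behaviour deterministic and easy to verify.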
In some embodiments of the method for improving network high availability of the present invention, step S5, in which, in response to one of the two counter top switches not receiving the ARP request message for longer than a threshold time, the spine switch is notified to transmit the downstream traffic only through the other counter top switch and the corresponding network card, further includes:
S54: the spine switch deleting the abnormal link between the spine switch and the server according to the received abnormality information, and transmitting downlink traffic to the server through the other link.
From the perspective of the spine switch 30, after it receives the abnormality information from the counter top switch 22 or 24 of the abnormal link, it deletes the abnormal link between the spine switch 30 and the server 10 accordingly. Specifically, of the two ECMP equivalent (equal-cost) routes generated locally on the upper-layer spine switch 30 in step S32, the one corresponding to the abnormal link is deleted, and downlink traffic is transmitted to the server 10 through the other link; that is, all downlink traffic is switched to the other link.
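The spine-side behaviour of step S54 can be sketched as a routing table that maps a host route to the set of counter top switches that currently announce it. The sketch below rests on several assumptions: the class name `SpineSwitch`, the labels `tor22` and `tor24`, the `/32` prefix notation for the host route, and the CRC32-based flow hashing are all hypothetical, and the EBGP session itself is not modelled.

```python
import zlib
from typing import Dict, List, Optional


class SpineSwitch:
    """Keeps, per host route, the counter top switches that announce it."""

    def __init__(self) -> None:
        self.routes: Dict[str, List[str]] = {}

    def announce(self, prefix: str, tor: str) -> None:
        """A counter top switch reports a host route (one ECMP path each)."""
        hops = self.routes.setdefault(prefix, [])
        if tor not in hops:
            hops.append(tor)

    def withdraw(self, prefix: str, tor: str) -> None:
        """S54: delete the ECMP path that goes through the abnormal link."""
        hops = self.routes.get(prefix, [])
        if tor in hops:
            hops.remove(tor)
        if not hops:
            self.routes.pop(prefix, None)

    def next_hop(self, prefix: str, flow_id: str) -> Optional[str]:
        """Pick a path for a downlink flow among the remaining equal-cost paths."""
        hops = self.routes.get(prefix)
        if not hops:
            return None
        return hops[zlib.crc32(flow_id.encode()) % len(hops)]


spine = SpineSwitch()
spine.announce("10.0.0.5/32", "tor22")
spine.announce("10.0.0.5/32", "tor24")
spine.withdraw("10.0.0.5/32", "tor22")
print(spine.next_hop("10.0.0.5/32", "flow-1"))
# prints tor24
```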
In some embodiments of the method of increasing network high availability of the present invention, the method further comprises:
S6: in response to one of the two network cards recovering from the failure, periodically sending corresponding ARP request messages from the server to the two counter top switches through the two network cards, respectively.
After one of the two network cards 12 or 14 of the server 10 has failed and has then recovered, either by itself or through manual maintenance, step S6 again periodically sends the corresponding ARP request messages from the server 10 through the two network cards 12 and 14 to the two counter top switches 22 and 24, respectively. In other words, step S1 is executed again, followed by steps S2 and S3, and the highly available network with dual network cards and de-stacked switches is re-established.
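The periodic announcement that resumes in step S6 can be sketched by building the ARP request payload for every network card whose link is currently up. This is a hedged illustration: the function names `build_arp_request` and `arp_cycle`, the MAC and IP addresses, and the choice of a gateway address as the ARP target are assumptions, as is the premise that both cards announce the same host IP. The payload layout follows the standard 28-byte ARP format for Ethernet and IPv4; actually transmitting the frame would require a raw socket and elevated privileges and is not shown.

```python
import socket
import struct
from typing import Dict


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Build the 28-byte ARP request payload (Ethernet/IPv4, opcode 1)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                          # hardware type: Ethernet
        0x0800,                     # protocol type: IPv4
        6, 4,                       # MAC and IPv4 address lengths
        1,                          # opcode 1 = request
        src_mac,
        socket.inet_aton(src_ip),
        b"\x00" * 6,                # target MAC is unknown in a request
        socket.inet_aton(target_ip),
    )


def arp_cycle(link_up: Dict[str, bool], macs: Dict[str, bytes],
              host_ip: str, target_ip: str) -> Dict[str, bytes]:
    """One periodic cycle: prepare an ARP request for every card whose link is up."""
    return {nic: build_arp_request(macs[nic], host_ip, target_ip)
            for nic, up in link_up.items() if up}


# Hypothetical addresses for the two network cards.
macs = {"nic12": b"\x02\x00\x00\x00\x00\x0c", "nic14": b"\x02\x00\x00\x00\x00\x0e"}
during_failure = arp_cycle({"nic12": False, "nic14": True}, macs, "10.0.0.5", "10.0.0.1")
after_recovery = arp_cycle({"nic12": True, "nic14": True}, macs, "10.0.0.5", "10.0.0.1")
print(sorted(during_failure))   # prints ['nic14']
print(sorted(after_recovery))   # prints ['nic12', 'nic14']
```

Because the recovered card starts sending ARP requests again, its counter top switch can regenerate the host route and report it to the spine switch, which restores the second equal-cost path.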
In another aspect, the present invention further provides an apparatus for improving network high availability, the apparatus including: at least one processor; and a memory storing program instructions executable by the processor, the program instructions, when executed by the processor, performing the steps of the method of any of the preceding embodiments.
The devices and apparatuses disclosed in the embodiments of the present invention may be various electronic terminal apparatuses, such as a mobile phone, a Personal Digital Assistant (PDA), a tablet computer (PAD), a smart television, and the like, or may be a large terminal apparatus, such as a server, and therefore the scope of protection disclosed in the embodiments of the present invention should not be limited to a specific type of device and apparatus. The client disclosed in the embodiment of the present invention may be applied to any one of the above electronic terminal devices in the form of electronic hardware, computer software, or a combination of both.
The computer-readable storage media (e.g., memory) described herein may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. By way of example, and not limitation, nonvolatile memory can include Read Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM), which can act as external cache memory. By way of example and not limitation, RAM is available in a variety of forms such as Synchronous RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to comprise, without being limited to, these and other suitable types of memory.
By adopting the above technical solution, the invention has at least the following beneficial effects: the two network cards of the server are connected to two switches; the ARP-to-route conversion and fast ARP aging functions of the switches already deployed in the existing network are enabled; equivalent routes are configured on the two network cards of the server and point to two independent top-of-rack (TOR) switches, i.e. the counter top switches; two physical data links associated with the same host route are thereby established; and the links carrying uplink and downlink traffic are switched according to the network card state and the signals received by the TOR switches, so that redundancy of network nodes is achieved.
It is to be understood that the features listed above for the different embodiments may be combined with each other to form further embodiments within the scope of the invention, where technically feasible. Furthermore, the specific examples and embodiments described herein are non-limiting, and various modifications of the structure, steps and sequence set forth above may be made without departing from the scope of the invention.
In this application, the use of the disjunctive is intended to include the conjunctive. The use of definite or indefinite articles is not intended to indicate cardinality. In particular, a reference to "the" object or to "a" or "an" object is intended to denote also one of a possible plurality of such objects. However, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated. Furthermore, the conjunction "or" may be used to convey features that are simultaneously present rather than mutually exclusive alternatives. In other words, the conjunction "or" should be understood to include "and/or". The term "including" is inclusive and has the same scope as "comprising".
The above-described embodiments, particularly any "preferred" embodiments, are possible examples of implementations, and are presented merely for a clear understanding of the principles of the invention. Many variations and modifications may be made to the above-described embodiments without departing substantially from the spirit and principles of the technology described herein. All such modifications are intended to be included within the scope of this disclosure.

Claims (9)

1. A method for increasing network high availability, the method comprising the steps of:
the server periodically sends corresponding ARP request messages to the two counter top switches through the two network cards having equivalent routes, respectively;
the two counter top switches respectively generate host routes of the server according to the ARP request messages and report the host routes to the spine switch;
the spine switch routes to the server according to the host routes and transmits uplink traffic and/or downlink traffic through the two counter top switches and the two network cards;
in response to a failure of one of the two network cards, the server transmits the uplink traffic only through the other network card and the corresponding counter top switch;
in response to one of the two counter top switches not receiving the ARP request message for longer than a threshold time, the spine switch is notified to transmit the downlink traffic only through the other counter top switch and the corresponding network card;
wherein the server periodically sending the corresponding ARP request messages to the two counter top switches through the two network cards having equivalent routes, respectively, further comprises:
configuring, for the two network cards, equivalent routes of IP addresses with the same mask;
and the server sending, according to the equivalent routes, a corresponding ARP request message through a first network card to a first counter top switch directly connected with the first network card, and a corresponding ARP request message through a second network card to a second counter top switch directly connected with the second network card.
2. The method of claim 1, wherein the two counter top switches generating the host routes of the server according to the ARP request messages and reporting the host routes to the spine switch further comprises:
the two counter top switches receive and parse the ARP request messages, and generate, according to the parsed information related to the equivalent routes, a host route to the server and corresponding ARP table entries for the two network cards;
and the two counter top switches report the host routes to the spine switch through an external border gateway protocol.
3. The method of claim 2, wherein the spine switch routing to the server according to the host route and transmitting upstream and/or downstream traffic through the two counter top switches and the two network cards further comprises:
the spine switch routes to the server according to the host route;
establishing two links between the spine switch and the server through a first counter top switch and a first network card and through a second counter top switch and a second network card, wherein the two links have the equivalent route;
and the spine switch and the server transmit uplink traffic and/or downlink traffic through the two links according to the host route.
4. The method of claim 1, wherein the server transmitting the upstream traffic only through the other network card and the corresponding counter top switch in response to a failure of one of the two network cards further comprises:
configuring and running an Ethernet link detection daemon in the server;
and judging whether the two network cards have faults or not according to the physical connection state of the two network cards monitored by the Ethernet link detection daemon.
5. The method of claim 1, wherein the server transmitting the upstream traffic only through the other network card and the corresponding counter top switch in response to a failure of one of the two network cards further comprises:
in response to a failure of one of the two network cards, deleting the routing information of the failed network card from the server;
and the server transmitting the uplink traffic to the spine switch through the other network card and a counter top switch directly connected with the other network card, according to the routing information of the other network card.
6. The method of claim 3, wherein the notifying the spine switch to transmit the downstream traffic only through the other counter top switch and corresponding network card in response to one of the two counter top switches not receiving the ARP request message for more than a threshold amount of time further comprises:
in response to one of the two counter top switches not receiving, for longer than the threshold time, the ARP request message of the network card directly connected with that counter top switch, judging that the corresponding link is abnormal;
deleting, from the counter top switch of the abnormal link, the host route and the ARP table entry of the network card directly connected with that counter top switch;
and the counter top switch of the abnormal link notifying the spine switch of the abnormality information through an external border gateway protocol.
7. The method of claim 6, wherein the notifying the spine switch to transmit the downstream traffic only through the other counter top switch and corresponding network card in response to one of the two counter top switches not receiving the ARP request message for more than a threshold amount of time further comprises:
and the spine switch deletes the abnormal link between the spine switch and the server according to the received abnormality information, and transmits the downlink traffic to the server through the other link.
8. The method of claim 1, further comprising:
in response to one of the two network cards recovering from the failure, periodically sending corresponding ARP request messages from the server to the two counter top switches through the two network cards, respectively.
9. An apparatus for improving network high availability, the apparatus comprising:
at least one processor; and
a memory storing processor-executable program instructions which, when executed by the processor, perform the steps of the method of any one of claims 1 to 8.
CN201911330774.XA 2019-12-20 2019-12-20 Method and device for improving high availability of network Active CN111030926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911330774.XA CN111030926B (en) 2019-12-20 2019-12-20 Method and device for improving high availability of network

Publications (2)

Publication Number Publication Date
CN111030926A CN111030926A (en) 2020-04-17
CN111030926B CN111030926B (en) 2021-07-27

Country Status (1)

Country Link
CN (1) CN111030926B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant