CN115955434B - ECMP group failure recovery method and device, electronic equipment and storage medium - Google Patents

Publication number
CN115955434B
Authority
CN
China
Prior art keywords
next hop
ecmp group
neighbor
invalid
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310239663.8A
Other languages
Chinese (zh)
Other versions
CN115955434A (en)
Inventor
郭巍松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310239663.8A
Publication of CN115955434A
Application granted
Publication of CN115955434B
Legal status: Active
Anticipated expiration

Abstract

The embodiment of the invention provides a recovery method and device for ECMP group failure, electronic equipment and a storage medium, relating to the technical field of the Internet. The method comprises the following steps: when a next hop in the ECMP group fails, sending a neighbor request to the failed next hop according to a preset period; when the failed next hop returns a neighbor response to the neighbor request, re-acquiring the next hop information corresponding to the failed next hop according to the neighbor response and obtaining neighbor information for the next hop information; and recovering the failed next hop according to the neighbor information so as to recover the ECMP group. By sending a neighbor request to the failed next hop and, on the basis of the returned neighbor response, re-acquiring the next hop information of the failed next hop and obtaining its neighbor information, the failed next hop can be restored more quickly. A next hop that becomes reachable again is detected promptly, so load balancing is re-established, and data flows on the non-failed next hops are affected little or not at all during recovery.

Description

ECMP group failure recovery method and device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of Internet, in particular to a recovery method for ECMP group failure, a recovery device for ECMP group failure, electronic equipment and a computer readable storage medium.
Background
ECMP (Equal-Cost Multi-Path routing) is widely used in both traditional networks and data center networks to achieve higher reliability and load sharing. Typically, such routes are either calculated by routing protocols such as BGP (Border Gateway Protocol) and OSPF (Open Shortest Path First) or derived from static route configuration. A data flow is processed by a HASH algorithm over key fields of the message and assigned, according to the HASH value, to one member of the ECMP group for transmission. When an associated link or device fails, the route changes to point to another ECMP group with fewer members, and every data flow passing through the route is reassigned (although a small amount of traffic may happen to be assigned to its original member). Stateful nodes on the path (e.g., firewalls) then have to rebuild almost all of their entries, and traffic on other paths that should not have been affected is severely disturbed. To alleviate this, some devices support an elastic HASH technique: when a single path becomes abnormal, only the data on that path is redistributed evenly across the other paths (rather than all of it being placed on one specific path). When the path is restored, however, flows on the other paths can only be moved onto the new path at random according to their HASH values, so flows that were originally stable still drift.
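The reshuffling described above can be illustrated with a minimal sketch. It assumes plain modulo hashing over the member list, uses CRC32 as a stand-in for whatever HASH algorithm a real switching chip applies, and invents the flow keys and member names (`nh1` to `nh4`) purely for illustration.

```python
import zlib


def pick_member(flow_key: str, members: list) -> str:
    """Select an ECMP member by hashing the flow key modulo the member count."""
    return members[zlib.crc32(flow_key.encode()) % len(members)]


# Hypothetical flows, identified by a string built from their key fields.
flows = [f"10.0.0.{i}:5000->192.0.2.1:80" for i in range(1, 201)]

full = ["nh1", "nh2", "nh3", "nh4"]
reduced = ["nh1", "nh2", "nh3"]  # nh4 has failed and the route now has 3 members

before = {f: pick_member(f, full) for f in flows}
after = {f: pick_member(f, reduced) for f in flows}

# Flows that had to move because their member failed.
on_failed = [f for f in flows if before[f] == "nh4"]
# Flows that were on a healthy member but were moved anyway.
collateral = [f for f in flows if before[f] != "nh4" and before[f] != after[f]]

print(f"flows on the failed member: {len(on_failed)}")
print(f"healthy flows that moved anyway: {len(collateral)}")
```

With modulo hashing, a flow on a healthy member keeps its member only when its hash gives the same index under both member counts, which is why a large share of unaffected flows is moved.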
In actual deployment, first, a link abnormality or similar cause forces every route that references the affected ECMP group to be changed and updated again. Second, in actual operation, a connectivity change of an ECMP member either directly causes a route change, triggering a severe route update, or goes undetected by the system, causing part of the forwarded messages to be lost. Third, when neighbor (NEIGH/NEIGHBOR) information, e.g. an ARP (Address Resolution Protocol) entry, ages out, even if the switching chip sends the messages distributed to the failed path up to the CPU (Central Processing Unit), the difference between the upper-layer and lower-layer HASH algorithms means there is no guarantee that the traffic will trigger a neighbor request on the relevant path, so the path cannot be repaired quickly. In addition, with the elastic HASH algorithm the traffic of a failed path can be distributed evenly to the other paths, but when the path is restored a large number of flows that were originally unaffected are redistributed to the new path.
Disclosure of Invention
The embodiment of the invention provides a recovery method and device for ECMP group failure, electronic equipment and a computer readable storage medium, which solve or partially solve the problem that route changes and updates caused by a path abnormality lead to a large number of originally unaffected flows being redistributed to new paths.
The embodiment of the invention discloses a recovery method for failure of an ECMP group, wherein the ECMP group consists of a plurality of next hops, the next hops contain next hop information, and the method comprises the following steps:
when a next hop in the ECMP group fails, sending a neighbor request to the failed next hop according to a preset period;
when the failed next hop returns a neighbor response to the neighbor request, re-acquiring the next hop information corresponding to the failed next hop according to the neighbor response and obtaining neighbor information for the next hop information; wherein the neighbor response contains the neighbor information corresponding to the failed next hop;
and recovering the failed next hop according to the neighbor information so as to recover the ECMP group.
Optionally, the method further comprises:
when a next hop in the ECMP group fails, recording the next hop information corresponding to the failed next hop in a state database.
Optionally, after the recovering of the failed next hop according to the neighbor information so as to recover the ECMP group, the method further includes:
deleting the next hop information corresponding to the failed next hop from the state database.
Optionally, the next hop information includes an IP address corresponding to the next hop and an outgoing interface of the next hop.
Optionally, the neighbor information is a correspondence between the next hop and the MAC address of the next hop.
Optionally, the recovering of the failed next hop according to the neighbor information so as to recover the ECMP group includes:
recovering the failed next hop according to the correspondence between the failed next hop and the MAC address of the failed next hop so as to recover the ECMP group.
Optionally, the method further comprises:
when a next hop in the ECMP group fails, marking the next hop as the failed next hop.
Optionally, the method further comprises:
deleting the failed next hop from the ECMP group when the next hop is marked as the failed next hop.
Optionally, the deleting of the failed next hop from the ECMP group when the next hop is marked as the failed next hop includes:
deleting the failed next hop from the ECMP group when the state of the outgoing interface of the failed next hop is DOWN.
Optionally, the deleting of the failed next hop from the ECMP group when the next hop is marked as the failed next hop includes:
deleting the failed next hop from the ECMP group when the neighbor information becomes invalid.
Optionally, the ECMP group is provided with a bidirectional forwarding detection protocol, and the deleting of the failed next hop from the ECMP group when the next hop is marked as the failed next hop includes:
deleting the failed next hop from the ECMP group when the bidirectional forwarding detection protocol detects that the connection to the next hop is interrupted.
Optionally, the next hop in the ECMP group includes a plurality of HASH buckets, where the HASH buckets correspond to HASH values, and the method further includes:
when the ECMP group is recovered, obtaining the affinity and the update generation value corresponding to the HASH bucket;
obtaining the value of the priority of the HASH bucket according to the affinity and the update generation value;
and recovering the failed next hop according to the value of the priority and the number of HASH buckets corresponding to each next hop in the ECMP group, so as to allocate HASH buckets to the recovered next hop.
Optionally, the value of the priority is used as the basis for allocating the HASH bucket to the recovered next hop.
Optionally, the method further comprises:
allocating the HASH bucket to a next hop in the ECMP group in response to an allocation instruction for the HASH bucket.
Optionally, the obtaining of the affinity and the update generation value corresponding to the HASH bucket includes:
when the HASH bucket is initially allocated to a next hop in the ECMP group, recording the affinity between the HASH bucket and its corresponding next hop.
Optionally, the update generation value is attribute information of the HASH bucket.
Optionally, the method further comprises:
when a next hop in the ECMP group fails and/or the failed next hop is recovered, updating the update generation value of the HASH buckets corresponding to the next hops of the current ECMP group.
Optionally, the obtaining of the value of the priority of the HASH bucket according to the affinity and the update generation value includes:
obtaining the value of the priority of the HASH bucket in response to a weighting instruction for the affinity and the update generation value.
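One way to picture the bucket-priority steps above is the following sketch. The text here does not give a formula, so everything specific is an assumption: affinity is modelled as 1 when the bucket was initially allocated to the recovering next hop and 0 otherwise, the update value attached to each bucket is modelled as a counter incremented whenever the bucket is moved, and the weights, the `Bucket` class and the function names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Bucket:
    index: int
    owner: str            # next hop currently serving this bucket
    original_owner: str   # affinity recorded at the initial allocation
    update_value: int = 0  # assumed: incremented each time the bucket is moved


def priority(bucket: Bucket, recovered: str,
             w_affinity: float = 10.0, w_update: float = 1.0) -> float:
    """Assumed weighted sum of affinity and the bucket's update value."""
    affinity = 1.0 if bucket.original_owner == recovered else 0.0
    return w_affinity * affinity + w_update * bucket.update_value


def restore_next_hop(buckets: list, recovered: str, member_count: int) -> list:
    """Hand the recovered next hop its fair share of buckets, highest priority first.

    A bucket is taken only from a next hop that currently holds more than its
    fair share, so the per-next-hop bucket counts are respected.
    """
    share = len(buckets) // member_count
    counts = Counter(b.owner for b in buckets)
    moved = []
    for b in sorted(buckets, key=lambda b: priority(b, recovered), reverse=True):
        if len(moved) == share:
            break
        if counts[b.owner] > share:
            counts[b.owner] -= 1
            b.owner = recovered
            b.update_value += 1
            moved.append(b.index)
    return moved
```

Under these assumptions the buckets that originally belonged to the recovered next hop are the first to return to it, so flows in the buckets that never moved stay where they are.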
The embodiment of the invention also discloses a recovery device for failure of the ECMP group, wherein the ECMP group consists of a plurality of next hops, the next hops contain next hop information, and the device comprises:
the neighbor request sending module is used for sending a neighbor request to the failed next hop according to a preset period when a next hop in the ECMP group fails;
the neighbor information acquisition module is used for, when the failed next hop returns a neighbor response to the neighbor request, re-acquiring the next hop information corresponding to the failed next hop according to the neighbor response and obtaining neighbor information for the next hop information; wherein the neighbor response contains the neighbor information corresponding to the failed next hop;
and the ECMP group recovery module is used for recovering the failed next hop according to the neighbor information so as to recover the ECMP group.
Optionally, the apparatus further comprises:
and the information input module is used for inputting next hop information corresponding to the failed next hop into the state database when the next hop in the ECMP group fails.
Optionally, the apparatus further comprises:
and the information deleting module is used for deleting the next hop information corresponding to the failed next hop from the state database.
Optionally, the ECMP group recovery module is specifically configured to:
recover the failed next hop according to the correspondence between the failed next hop and the MAC address of the failed next hop so as to recover the ECMP group.
Optionally, the apparatus further comprises:
and the next hop marking module is used for marking the next hop as the failed next hop when a next hop in the ECMP group fails.
Optionally, the apparatus further comprises:
and the next hop deleting module is used for deleting the failed next hop from the ECMP group when the next hop is marked as the failed next hop.
Optionally, the next hop deleting module is specifically configured to:
delete the failed next hop from the ECMP group when the state of the outgoing interface of the failed next hop is DOWN.
Optionally, the next hop deleting module is specifically configured to:
delete the failed next hop from the ECMP group when the neighbor information becomes invalid.
Optionally, the ECMP group is provided with a bidirectional forwarding detection protocol, and the next hop deleting module is specifically configured to:
delete the failed next hop from the ECMP group when the bidirectional forwarding detection protocol detects that the connection to the next hop is interrupted.
Optionally, the next hop in the ECMP group includes a plurality of HASH buckets, where the HASH buckets correspond to HASH values, and the apparatus further includes:
the data acquisition module is used for obtaining the affinity and the update generation value corresponding to the HASH bucket when the ECMP group is recovered;
the priority value obtaining module is used for obtaining the value of the priority of the HASH bucket according to the affinity and the update generation value;
and the next hop recovery module is used for recovering the failed next hop according to the value of the priority and the number of HASH buckets corresponding to each next hop in the ECMP group, so as to allocate HASH buckets to the recovered next hop.
Optionally, the apparatus further comprises:
and the HASH bucket allocation module is used for allocating the HASH bucket to a next hop in the ECMP group in response to an allocation instruction for the HASH bucket.
Optionally, the data acquisition module is specifically configured to:
record the affinity between the HASH bucket and its corresponding next hop when the HASH bucket is initially allocated to a next hop in the ECMP group.
Optionally, the apparatus further comprises:
and the update generation value updating module is used for updating the update generation value of the HASH buckets corresponding to the next hops of the current ECMP group when a next hop in the ECMP group fails and/or the failed next hop is recovered.
Optionally, the priority value obtaining module is specifically configured to:
obtain the value of the priority of the HASH bucket in response to a weighting instruction for the affinity and the update generation value.
The embodiment of the invention also discloses electronic equipment, which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the method according to the embodiment of the present invention when executing the program stored in the memory.
Embodiments of the present invention also disclose a computer-readable storage medium having instructions stored thereon, which when executed by one or more processors, cause the processors to perform the method according to the embodiments of the present invention.
The embodiment of the invention has the following advantages:
In the embodiment of the invention, the ECMP group consists of a plurality of next hops, each containing next hop information. When a next hop in the ECMP group fails, a neighbor request is sent to the failed next hop according to a preset period; when the failed next hop returns a neighbor response to the neighbor request, the next hop information corresponding to the failed next hop is re-acquired according to the neighbor response and neighbor information for the next hop information is obtained, the neighbor response containing the neighbor information corresponding to the failed next hop; and the failed next hop is recovered according to the neighbor information so as to recover the ECMP group. By sending a neighbor request to the failed next hop and, on the basis of the returned neighbor response, re-acquiring the next hop information of the failed next hop and obtaining its neighbor information, the failed next hop can be restored more quickly. A next hop that becomes reachable again is detected promptly, so load balancing is re-established, and data flows on the non-failed next hops are affected little or not at all during recovery.
Drawings
FIG. 1 is a flow chart of steps of a method for recovering from an ECMP group failure provided in an embodiment of the present invention;
FIG. 2 is a schematic diagram of an ECMP group configuration according to an embodiment of the present invention;
FIG. 3 is a flow chart of a priority algorithm provided in an embodiment of the present invention;
FIG. 4 is a block diagram of an ECMP group failure recovery apparatus provided in an embodiment of the present invention;
FIG. 5 is a schematic diagram of a computer-readable storage medium provided in an embodiment of the present invention;
FIG. 6 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
As an example, ECMP is widely used in both traditional networks and data center networks to achieve higher reliability and load sharing. Typically, such routes are either calculated by routing protocols such as BGP and OSPF or derived from static route configuration. A data flow is processed by a HASH algorithm over key fields of the message and assigned, according to the HASH value, to one member of the ECMP group for transmission. When an associated link or device fails, the route changes to point to another ECMP group with fewer members, and every data flow passing through the route is reassigned (although a small amount of traffic may happen to be assigned to its original member). Stateful nodes on the path (e.g., firewalls) then have to rebuild almost all of their entries, and traffic on other paths that should not have been affected is severely disturbed. To alleviate this, some devices support an elastic HASH technique: when a single path becomes abnormal, only the data on that path is redistributed evenly across the other paths (rather than all of it being placed on one specific path). When the path is restored, however, flows on the other paths can only be moved onto the new path at random, so flows that were originally stable still oscillate.
Specifically, in actual deployment, first, a link abnormality or similar cause forces every route that references the affected ECMP group to be changed and updated again. Second, in actual operation, a connectivity change of an ECMP member either directly causes a route change, triggering a severe route update, or goes undetected by the system, causing part of the forwarded messages to be lost. Third, when the neighbor information changes, even if the switching chip sends the messages distributed to the failed path up to the CPU, the difference between the upper-layer and lower-layer HASH algorithms means there is no guarantee that the traffic will trigger a neighbor request on the relevant path, so quick repair of the path cannot be guaranteed. In addition, with the elastic HASH algorithm the traffic of a failed path can be distributed evenly to the other paths, but when the path is restored a large number of flows that were originally unaffected are redistributed to the new path.
In this regard, one of the core ideas of the present invention is as follows. The ECMP group is composed of a plurality of next hops, each containing next hop information. When a next hop in the ECMP group fails, a neighbor request is sent to the failed next hop according to a preset period; when the failed next hop returns a neighbor response to the neighbor request, the next hop information corresponding to the failed next hop is re-acquired according to the neighbor response and neighbor information for the next hop information is obtained, the neighbor response containing the neighbor information corresponding to the failed next hop; and the failed next hop is recovered according to the neighbor information so as to recover the ECMP group. By sending a neighbor request to the failed next hop and, on the basis of the returned neighbor response, re-acquiring the next hop information of the failed next hop and obtaining its neighbor information, the failed next hop can be restored more quickly. A next hop that becomes reachable again is detected promptly, so load balancing is re-established, and data flows on the non-failed next hops are affected little or not at all during recovery.
Referring to FIG. 1, a flow chart of the steps of a method for recovering from an ECMP group failure provided in an embodiment of the present invention is shown. The ECMP group is composed of a plurality of next hops, each next hop contains next hop information, and the method may specifically include the following steps:
Step 101, when a next hop in the ECMP group fails, sending a neighbor request to the failed next hop according to a preset period;
ECMP refers to equal-cost multi-path routing, that is, a network environment in which multiple different links reach the same destination address. The ECMP group is composed of a plurality of next hops, and each next hop contains next hop information. Note that one next hop may be understood as one member, i.e. the ECMP group may be composed of a plurality of members.
Optionally, the next hop information may include the IP (Internet Protocol) address corresponding to the next hop and the outgoing interface of the next hop.
The preset period is a configured value that a person skilled in the art can adjust according to the actual situation; the embodiment of the present invention does not limit it.
The neighbor request is an information request sent to the failed next hop in order to re-acquire the next hop information and obtain neighbor information based on it, so that the failed next hop can be recovered based on the neighbor information.
In a specific implementation, the ECMP group is composed of a plurality of next hops, each containing next hop information, and when a next hop of the ECMP group fails, a neighbor request may be sent to the failed next hop according to the preset period.
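Step 101 can be sketched as a small scheduler that probes only failed next hops, once per preset period. The sketch sends no real ARP or neighbor-discovery packet: the `send_request` callback, the `NeighborProber` class and its method names are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class NextHop:
    ip: str             # IP address of the next hop
    out_interface: str  # outgoing interface of the next hop


@dataclass
class NeighborProber:
    period: float                             # preset period, in seconds
    send_request: Callable[[NextHop], None]   # hypothetical hook that emits the request
    last_sent: dict = field(default_factory=dict)  # failed next hop -> last probe time

    def mark_failed(self, nh: NextHop) -> None:
        """Start probing a next hop that has just failed."""
        self.last_sent.setdefault(nh, None)

    def mark_recovered(self, nh: NextHop) -> None:
        """Stop probing once the next hop has been restored."""
        self.last_sent.pop(nh, None)

    def tick(self, now: float) -> None:
        """Send a neighbor request to every failed next hop whose period has elapsed."""
        for nh, last in list(self.last_sent.items()):
            if last is None or now - last >= self.period:
                self.send_request(nh)
                self.last_sent[nh] = now
```

A caller would invoke `tick` from a timer loop; healthy next hops never enter `last_sent`, so they are never probed by this scheduler.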
Step 102, when the failed next hop returns a neighbor response to the neighbor request, re-acquiring the next hop information corresponding to the failed next hop according to the neighbor response and obtaining neighbor information for the next hop information; wherein the neighbor response contains the neighbor information corresponding to the failed next hop;
As noted above, the neighbor request is an information request sent to the failed next hop in order to acquire the next hop information and obtain neighbor information based on it.
The neighbor response is the reply made to the neighbor request, and may contain the neighbor information corresponding to the failed next hop. The neighbor information is the correspondence between a next hop and the MAC (Media Access Control) address of that next hop. The next hop information in the ECMP group generally refers to the next hop IP address and the next hop outgoing interface, and messages can be sent on Ethernet only if that IP address and outgoing interface have a corresponding MAC address. In other words, the neighbor information is the correspondence between the next hop IP address and outgoing interface on the one hand and the next hop MAC address on the other, and a next hop takes effect only once it has neighbor information.
In a specific implementation, the ECMP group is composed of a plurality of next hops, each containing next hop information. When a next hop of the ECMP group fails, a neighbor request can be sent to the failed next hop according to the preset period. When the failed next hop returns a neighbor response to the neighbor request, the next hop information corresponding to the failed next hop is re-acquired according to the neighbor response and the neighbor information corresponding to the next hop information is obtained. The neighbor response contains the neighbor information corresponding to the failed next hop, which provides the data needed to recover the failed next hop.
Step 103, recovering the failed next hop according to the neighbor information so as to recover the ECMP group.
Optionally, when a next hop in the ECMP group fails, the next hop information corresponding to the failed next hop is recorded in a state database; and after the failed next hop is recovered according to the neighbor information so as to recover the ECMP group, the next hop information corresponding to the failed next hop is deleted from the state database.
When a next hop in the ECMP group fails, the next hop information corresponding to the failed next hop is recorded in a state database, and an independent process monitors the state database. The independent process sends neighbor requests to the failed next hop according to the next hop information in the state database, so that the failed next hop is restored to a valid state as soon as possible. After the ECMP group is recovered, the next hop information corresponding to the failed next hop is deleted from the state database. The rationale is that neighbor requests are sent periodically only to failed next hops (members) in order to recover them as soon as possible, whereas a normal next hop does not need periodic or frequent neighbor requests; hence the entry is deleted once the next hop is restored. Note that this deletion is optional: after the failed next hop is restored, its next hop information may or may not be deleted from the state database, and recovery is not affected either way. A person skilled in the art can adjust this according to the actual situation, and the embodiment of the present invention does not limit it.
In a specific implementation, the ECMP group is composed of a plurality of next hops, each containing next hop information. When a next hop of the ECMP group fails, a neighbor request can be sent to the failed next hop according to the preset period; at the same time, the next hop information corresponding to the failed next hop is recorded in the state database and the state database is monitored. The next hop information corresponding to the failed next hop can then be re-acquired according to the neighbor response, and the neighbor information corresponding to the next hop information obtained, so that the failed next hop is recovered according to the correspondence between the failed next hop and its MAC address and the ECMP group is recovered. This compensates for the unreliability of relying on traffic to trigger recovery of the ECMP group. After the ECMP group is recovered, the next hop information corresponding to the failed next hop is deleted from the state database.
Optionally, the recovering of the failed next hop according to the neighbor information so as to recover the ECMP group includes:
recovering the failed next hop according to the correspondence between the failed next hop and the MAC address of the failed next hop so as to recover the ECMP group.
In a specific implementation, the neighbor information is the correspondence between the IP address and outgoing interface of the next hop and the MAC address of the next hop. When the failed next hop needs to be recovered so as to recover the ECMP group, it can be recovered according to the correspondence between the IP address and outgoing interface of the failed next hop and the MAC address of the failed next hop.
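The bookkeeping around steps 102 and 103 can be sketched as follows. This is a minimal in-memory model, not the patented implementation: the state database is a plain Python set, the `EcmpGroup` class and its method names are hypothetical, and the optional deletion from the state database is performed on recovery.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NextHop:
    ip: str             # next hop IP address
    out_interface: str  # next hop outgoing interface


@dataclass
class EcmpGroup:
    active: dict = field(default_factory=dict)  # NextHop -> MAC (neighbor information)
    state_db: set = field(default_factory=set)  # failed next hops awaiting recovery

    def on_failure(self, nh: NextHop) -> None:
        """Remove the failed next hop from forwarding and record it in the state database."""
        self.active.pop(nh, None)
        self.state_db.add(nh)

    def on_neighbor_response(self, ip: str, out_interface: str, mac: str) -> bool:
        """Restore a failed next hop from the neighbor information in a response.

        Returns True if a failed next hop was recovered, False if the response
        does not match any entry in the state database.
        """
        nh = NextHop(ip, out_interface)
        if nh not in self.state_db:
            return False
        self.active[nh] = mac         # (IP, outgoing interface) -> MAC correspondence
        self.state_db.discard(nh)     # optional clean-up after recovery
        return True
```

Keeping the failed entries in a separate store is what lets an independent prober work from the state database alone, without depending on data traffic to trigger neighbor resolution.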
In the embodiment of the invention, the ECMP group consists of a plurality of next hops, each containing next hop information. When a next hop of the ECMP group fails, a neighbor request is sent to the failed next hop according to a preset period; when the failed next hop returns a neighbor response to the neighbor request, the next hop information corresponding to the failed next hop is re-acquired according to the neighbor response and neighbor information for the next hop information is obtained, the neighbor response containing the neighbor information corresponding to the failed next hop; and the failed next hop is recovered according to the neighbor information so as to recover the ECMP group. By sending a neighbor request to the failed next hop and, on the basis of the returned neighbor response, re-acquiring the next hop information of the failed next hop and obtaining its neighbor information, the failed next hop can be restored more quickly. A next hop that becomes reachable again is detected promptly, so load balancing is re-established, and data flows on the non-failed next hops are affected little or not at all during recovery.
In an alternative embodiment, when a next hop is marked as a failed next hop, the failed next hop is deleted from the ECMP group.
Optionally, when a next hop in the ECMP group fails, it needs to be marked as the failed next hop. Specifically, the next hops in the ECMP group correspond to a plurality of HASH buckets. A HASH bucket can be understood through the modulo operation on the HASH value: the HASH value after the modulo operation falls into a HASH bucket, and the HASH buckets are allocated among all the next hops in the ECMP group. When a next hop in the ECMP group fails, it is first marked as an unreachable next hop, and the HASH buckets previously allocated to it are allocated to the other reachable next hops.
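The bucket reallocation on failure can be sketched as below. The bucket table is modelled as a list whose index is the bucket number and whose value is the owning next hop; spreading the freed buckets to the least-loaded reachable next hop is an assumed policy, and the function name is hypothetical.

```python
def reassign_failed_buckets(buckets: list, failed: str) -> list:
    """Return a new bucket table in which only the failed next hop's buckets move.

    Buckets owned by reachable next hops keep their owner, so the flows hashed
    into them are not disturbed.
    """
    survivors = sorted(set(buckets) - {failed})
    if not survivors:
        raise ValueError("no reachable next hop left in the ECMP group")
    counts = {nh: buckets.count(nh) for nh in survivors}
    result = list(buckets)
    for i, owner in enumerate(buckets):
        if owner == failed:
            # Assumed policy: give the bucket to the least-loaded reachable next hop.
            target = min(survivors, key=counts.get)
            result[i] = target
            counts[target] += 1
    return result


# Eight buckets spread over four hypothetical next hops.
table = ["nh1", "nh2", "nh3", "nh4"] * 2
new_table = reassign_failed_buckets(table, "nh4")
```

A flow is mapped to a bucket by taking its HASH value modulo the number of buckets, and because the bucket count never changes, only flows in the reassigned buckets see a different next hop.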
Optionally, when the state of the outgoing interface of the failed next hop is DOWN, the failed next hop is deleted from the ECMP group. That is, an outgoing interface in the DOWN state indicates that the link of the next hop is abnormal, so the failed next hop needs to be deleted from the ECMP group.
In addition, when the neighbor information becomes invalid, the failed next hop is deleted from the ECMP group.
Invalidation of the neighbor information is one kind of change in the neighbor information. Specifically, when the neighbor information becomes invalid (for example, when its state changes to INCOMPLETE or FAIL (failed)), the neighbor information no longer has a valid MAC address, and the failed next hop corresponding to that neighbor information can be deleted accordingly.
It should be noted that a change in the neighbor information may also call for updating the next hop rather than deleting it. Specifically, when the MAC address in the neighbor information changes to another MAC address, the next hop corresponding to that neighbor information does not need to be deleted from the ECMP group; the corresponding next hop in the ECMP group can be updated directly.
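The two outcomes of a neighbor information change described above (deletion when the entry becomes invalid, in-place update when only the MAC address changes) may be sketched as follows. The function name `on_neighbor_change`, the dictionary layout, and the returned strings are hypothetical and serve only this illustration.

```python
from typing import Dict, Optional

INVALID_STATES = {"INCOMPLETE", "FAIL"}  # states named in the description


def on_neighbor_change(group: Dict[str, str], ip: str,
                       state: str, mac: Optional[str]) -> str:
    """Apply one neighbor change to an ECMP group stored as
    {next-hop IP: MAC address}. Returns the action that was taken."""
    if ip not in group:
        return "ignored"
    if state in INVALID_STATES or mac is None:
        # No valid MAC address any more: delete the failed next hop.
        del group[ip]
        return "deleted"
    if group[ip] != mac:
        # Only the MAC address changed: update the member in place.
        group[ip] = mac
        return "updated"
    return "unchanged"


group = {"192.0.2.1": "02:00:00:00:00:01", "192.0.2.2": "02:00:00:00:00:02"}
first = on_neighbor_change(group, "192.0.2.1", "REACHABLE", "02:00:00:00:00:aa")
second = on_neighbor_change(group, "192.0.2.2", "INCOMPLETE", None)
```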
A change in the neighbor information may be caused, for example, by ARP (Address Resolution Protocol) aging. It can be understood that, in practical applications, neighbor information may become invalid in various ways; the examples given here are kept simple for ease of description, and the embodiment of the present invention does not describe the other cases in detail.
Optionally, the ECMP group is configured with a bidirectional forwarding detection protocol, and when the bidirectional forwarding detection protocol detects that the connection to a next hop is interrupted, the failed next hop is deleted from the ECMP group.
The Bidirectional Forwarding Detection (BFD) protocol is a network protocol used to detect faults between two forwarding points.
It will be appreciated that when the outgoing interface of a next hop in the ECMP group is DOWN, when the neighbor information becomes invalid, or when the connection to the next hop is detected to be interrupted, the failed next hop can be deleted from the ECMP group. That is, the next hops (members) of the current ECMP group are updated directly, so the routes referencing the ECMP group do not need to be re-issued; traffic is steered to the other reachable paths (next hops), and the routing table entries do not need to be fully updated.
In a specific implementation, a path failure may be found in any of the following ways: the outgoing interface of a next hop in the ECMP group is DOWN; the neighbor information has aged out or the neighbor is no longer reachable; or a protocol such as the bidirectional forwarding detection protocol finds that the connection to the next hop is interrupted. In each case the routes are not updated; instead, the failed next hop is deleted from the ECMP group. The routes referencing the ECMP group therefore do not need to be re-issued, traffic is steered to the other reachable paths (next hops), and the routing table entries do not need to be fully updated. As a result, the failed path can be restored more quickly and load balancing can be re-established promptly, the data flows on the paths that did not fail are affected little or not at all during recovery, and after the failed path is restored, data from the other links is steered onto the restored path.
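The point that only the group membership changes, while the routes referencing the group stay untouched, may be illustrated with the following minimal sketch. The event names, the function `on_failure_event`, the group identifier 7, and the table layouts are assumptions for illustration only.

```python
from typing import Dict, List

# Hypothetical names for the three failure triggers described above.
FAILURE_EVENTS = {"INTERFACE_DOWN", "NEIGHBOR_INVALID", "BFD_DOWN"}


def on_failure_event(groups: Dict[int, List[str]], group_id: int,
                     next_hop: str, event: str) -> bool:
    """Delete the failed next hop from its ECMP group.
    The route table is deliberately not a parameter: routes keep pointing
    at the same group identifier and need not be re-issued."""
    if event not in FAILURE_EVENTS:
        return False
    members = groups[group_id]
    if next_hop not in members:
        return False
    members.remove(next_hop)
    return True


routes = {"10.0.0.0/24": 7, "10.0.1.0/24": 7}  # prefix -> ECMP group id
groups = {7: ["192.0.2.1", "192.0.2.2", "192.0.2.3"]}
routes_before = dict(routes)
changed = on_failure_event(groups, 7, "192.0.2.2", "BFD_DOWN")
```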
It can be appreciated that the failed next hop can be restored, and the failed ECMP group thereby restored, in the following cases. First, when the state of the outgoing interface of the failed next hop becomes UP. Second, when the neighbor information of the failed next hop is re-acquired, that is, when the correspondence between the IP address of the failed next hop and its outgoing interface and MAC address is obtained again; for example, when an ARP request receives the ARP response replied by the owner of the IP address, the neighbor information (the correspondence between the IP address and the outgoing interface and MAC address) is re-acquired. In addition, a protocol such as the bidirectional forwarding detection protocol can be used to find whether the connection to the next hop has failed or been restored, so as to judge whether the next hop in the ECMP group has failed or become valid again. After the failed next hop is restored and the ECMP group is thereby restored, data from the other links may be steered onto the restored path (next hop).
Referring to fig. 2, a schematic configuration diagram of an ECMP group according to an embodiment of the invention is shown. As shown in fig. 2, the ECMP group can determine the working condition of a next hop according to changes in the neighbor information, changes in the link state, and changes in the bidirectional forwarding detection protocol state. When a path fails, the routes are not updated; instead, the members of the current ECMP group, that is, its next hops, are updated directly. The routes referencing the ECMP group therefore do not need to be re-issued, traffic is steered to the other reachable paths, and the routing table entries do not need to be fully updated. For a path that has currently failed, the information of the failed next hop can be recorded in a state database that is monitored by an independent process. For next hop information that is temporarily unreachable, neighbor requests are actively and periodically sent to re-acquire the neighbor information and restore the ECMP group, which overcomes the unreliability of relying on traffic to trigger recovery of the ECMP group. When the failed path is restored, data from the other links is steered onto the restored path, and the information of the failed next hop is deleted from the state database.
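One probing cycle of the independent monitoring process may be sketched as follows. The class `FailedNextHopDb` is a hypothetical in-memory stand-in for the state database, and the callbacks `send_neighbor_request` and `restore` are assumed interfaces (for example, an ARP request/response exchange and the ECMP group update); none of these names come from the embodiment.

```python
from typing import Callable, Dict, List, Optional


class FailedNextHopDb:
    """Hypothetical in-memory stand-in for the state database."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}  # next-hop IP -> outgoing interface

    def record(self, ip: str, out_if: str) -> None:
        self._entries[ip] = out_if

    def delete(self, ip: str) -> None:
        self._entries.pop(ip, None)

    def entries(self) -> Dict[str, str]:
        return dict(self._entries)  # copy, so callers may delete while iterating


def probe_cycle(db: FailedNextHopDb,
                send_neighbor_request: Callable[[str, str], Optional[str]],
                restore: Callable[[str, str, str], None]) -> List[str]:
    """Send one neighbor request per recorded failed next hop.
    A returned MAC address counts as the neighbor response."""
    recovered = []
    for ip, out_if in db.entries().items():
        mac = send_neighbor_request(ip, out_if)
        if mac is None:
            continue  # still unreachable: retry in the next period
        restore(ip, out_if, mac)  # put the next hop back into the ECMP group
        db.delete(ip)             # no longer failed, so stop probing it
        recovered.append(ip)
    return recovered
```

A monitoring process would call `probe_cycle` once per preset period, for example in a loop that sleeps for the period between calls.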
Meanwhile, when a path failure is found, for example because the outgoing interface of a next hop in the ECMP group is DOWN, because the neighbor information has aged out or the neighbor is no longer reachable, or because a protocol such as the bidirectional forwarding detection protocol finds that the connection to the next hop is interrupted, the routes are not updated. Instead, the members of the current ECMP group are updated directly, so that the routes referencing the ECMP group do not need to be re-issued, traffic is steered to the other reachable paths (next hops), and the routing table entries do not need to be fully updated.
In an alternative embodiment, the next hops in the ECMP group include a plurality of HASH buckets, the HASH buckets correspond to HASH values, and the method further comprises:
when the ECMP group is restored, obtaining the affinity and the update epoch value corresponding to the HASH buckets;
obtaining the priority value of each HASH bucket according to the affinity and the update epoch value;
and restoring the failed next hop according to the priority values and the number of HASH buckets corresponding to each next hop in the ECMP group, so as to allocate HASH buckets to the restored next hop.
Here, a HASH bucket holds the HASH values after the modulo operation: a HASH value, once reduced modulo the number of buckets, falls into a HASH bucket, so a HASH bucket can also be understood as a corresponding position. A HASH value is typically a 32-bit unsigned integer calculated by a hashing algorithm.
The affinity and the update epoch value are the positional affinity and the update epoch value relative to the state in which the ECMP group first reached its maximum number of group members.
Optionally, in response to a HASH bucket allocation instruction, the HASH buckets may be allocated to the next hops in the ECMP group, wherein the affinity between a HASH bucket and its corresponding next hop is recorded when the HASH bucket is initially allocated to a next hop in the ECMP group.
It should be noted that initial allocation can be understood as the process of issuing the HASH buckets to all the next hops of the ECMP group for the first time. Specifically, when the HASH buckets are issued for the first time, they are issued to every next hop regardless of whether that next hop is reachable; after the initial issuing is completed, any next hop that is unreachable is deleted.
In a specific implementation, when a HASH bucket is initially allocated to a next hop in the ECMP group, the affinity between the HASH bucket and its corresponding next hop may be recorded. In the embodiment of the present invention, the affinity between a failed next hop and the HASH buckets initially allocated to it is recorded as 1, and the affinity of the remaining positions is recorded as 0. It should be noted that, in practical applications, each HASH bucket records which next hop it has the greater affinity with; in the embodiment of the present invention, the initial allocation is used as the basis of affinity, with the initially allocated position taken as 1 and the rest as 0. It will be appreciated that those skilled in the art may choose the values and the basis of the affinity according to the actual situation, and the embodiment of the present invention is not limited in this respect.
Optionally, the update epoch value is attribute information of the HASH bucket. When a next hop in the ECMP group fails and/or a failed next hop is restored, the update epoch values of the HASH buckets corresponding to the next hops in the current ECMP group may be updated. In the embodiment of the invention, when an update is caused by a next hop failure, the update epoch value of every HASH bucket that is updated is increased by one, and when an update is caused by a next hop recovery, the update epoch value of every HASH bucket that is updated is decreased by one. It will be appreciated that those skilled in the art may adjust the way the update epoch value is calculated when it changes due to a next hop failure and/or the recovery of a failed next hop according to the actual situation, which is not limited by the embodiment of the present invention.
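The bookkeeping rule above (plus one on a failure-induced update, minus one on a recovery-induced update) may be sketched as follows. The function name `update_epochs` and the string values of `cause` are hypothetical.

```python
from typing import Iterable, List


def update_epochs(epochs: List[int], changed_buckets: Iterable[int], cause: str) -> None:
    """Adjust the update epoch value of every HASH bucket whose next hop
    was changed by the current update; untouched buckets keep their value."""
    if cause == "failure":
        delta = 1
    elif cause == "recovery":
        delta = -1
    else:
        raise ValueError(f"unknown cause: {cause}")
    for bucket in changed_buckets:
        epochs[bucket] += delta


epochs = [0] * 8
update_epochs(epochs, [2, 6], "failure")   # a next hop owning buckets 2 and 6 fails
update_epochs(epochs, [6], "failure")      # bucket 6 is moved again by a second failure
update_epochs(epochs, [2], "recovery")     # bucket 2 goes back to a restored next hop
```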
Optionally, the priority value is used as the basis for allocating HASH buckets to the restored next hop. The priority value of a HASH bucket is obtained in response to a weighting instruction for the affinity and the update epoch value; that is, the priority value can be obtained by a weighted calculation over the affinity and the update epoch value.
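A minimal sketch of the weighted calculation follows. The default weights (2 for the affinity, 1 for the update epoch value) are the ones used in the worked example later in this description; the function names and the tie-breaking by bucket index are assumptions.

```python
from typing import List


def bucket_priority(affinity: int, epoch: int,
                    w_affinity: int = 2, w_epoch: int = 1) -> int:
    """Priority value of one HASH bucket as a weighted sum."""
    return w_affinity * affinity + w_epoch * epoch


def rank_buckets(affinities: List[int], epochs: List[int]) -> List[int]:
    """Bucket indices ordered from highest to lowest priority
    (ties broken by the lower index, an assumption of this sketch)."""
    scores = [bucket_priority(a, e) for a, e in zip(affinities, epochs)]
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))


# Four buckets: buckets 1 and 2 were initially allocated to the restored next hop.
order = rank_buckets(affinities=[0, 1, 1, 0], epochs=[1, 1, 2, 0])
```

In this example the priority values are 1, 3, 4 and 0, so bucket 2 would be migrated first.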
In a specific implementation, the next hops in the ECMP group include a plurality of HASH buckets, and the HASH buckets correspond to HASH values. When the ECMP group is restored, the affinity and the update epoch value corresponding to the HASH buckets are obtained, the priority value of each HASH bucket is obtained from the affinity and the update epoch value, and the failed next hop is then restored according to the priority values and the number of HASH buckets corresponding to each next hop in the ECMP group, so that HASH buckets are allocated to the restored next hop. In this way the failed next hop can be restored more quickly and load balancing can be re-established promptly, the data flows on the next hops that did not fail are affected little or not at all during recovery, and when the failed path is restored, data from the other links is steered onto the restored path.
In order to enable those skilled in the art to better understand the technical solutions of the embodiments of the present invention, an example is given below:
Referring to FIG. 3, a flow diagram of a priority algorithm provided in an embodiment of the invention is shown.
In order to ensure, as far as possible, that data flows not on the faulty path are unaffected while paths are switched, the system uses the priority value of each position as the basis for migrating it to a new position when a path is restored; that is, the priority value serves as the basis for allocating HASH buckets to the restored next hop. The priority value is calculated by weighting the positional affinity (affinity), taken relative to the state in which the maximum number of group members was first reached, and the update epoch value (epoch); HASH buckets with a higher priority are migrated to the new path first. For simplicity of description, the embodiment of the present invention is illustrated with 16 HASH values (HASH values after the modulo operation) distributed over one ECMP group of 4 next hops.
As shown in fig. 3, when the data flows are first issued, the next hop at the position of each HASH value is as shown in the initial diagram (the first diagram in the first row of fig. 3). When next hop 3 fails, the HASH buckets whose next hop was 3 are allocated in turn to next hops 1, 2 and 4, and the update epoch value (epoch) of each changed position is increased by one (the second diagram in the second row of fig. 3), giving the second-step diagram (the second diagram in the first row of fig. 3). When next hop 2 then fails, because next hop 1 received more buckets in the previous step, the buckets are this time allocated to next hops 4 and 1 starting from next hop 4, so that next hops 4 and 1 end up with the same number of buckets; the update epoch value of each changed position is again increased by one, giving the third-step diagram (the third diagram in the first row of fig. 3).
When next hop 3 is restored, a priority is calculated for each position by weighting its affinity (whether the position was initially allocated to next hop 3) and its epoch value; in this example the affinity has a weight of 2 and the update epoch value has a weight of 1. The positions with the highest priority in each value range are allocated to next hop 3, and the update epoch value of each changed position is adjusted by one, giving the fourth-step diagram. Similarly, when next hop 2 is restored, the fifth-step diagram is obtained; the principle is the same as for the restoration of next hop 3 and is not repeated here.
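The sequence described above (next hop 3 fails, next hop 2 fails, next hop 3 is restored, next hop 2 is restored) may be simulated with the following sketch. Since fig. 3 is not reproduced here, several details are assumptions of the sketch rather than features of the embodiment: the class name `BucketTable`, a round-robin initial layout, reallocation to the least-loaded reachable next hop on failure, a restored next hop receiving the integer share of 16 divided by the number of active next hops, and ties being broken in favor of the most heavily loaded donor and then the lowest bucket index. The weights 2 and 1 and the epoch adjustments follow the description.

```python
from collections import Counter
from typing import List

W_AFFINITY, W_EPOCH = 2, 1  # weights stated in the example


class BucketTable:
    def __init__(self, next_hops: List[int], num_buckets: int) -> None:
        # Assumed initial layout: round-robin over the next hops.
        self.initial = [next_hops[i % len(next_hops)] for i in range(num_buckets)]
        self.owner = list(self.initial)
        self.epoch = [0] * num_buckets
        self.active = list(next_hops)

    def fail(self, nh: int) -> None:
        """Move only the buckets of the failed next hop; epoch += 1 for each."""
        self.active.remove(nh)
        counts = Counter(o for o in self.owner if o != nh)
        for i, o in enumerate(self.owner):
            if o == nh:
                target = min(self.active, key=lambda a: counts[a])
                self.owner[i] = target
                counts[target] += 1
                self.epoch[i] += 1

    def recover(self, nh: int) -> None:
        """Give the restored next hop its share, highest priority first."""
        self.active = sorted(self.active + [nh])
        share = len(self.owner) // len(self.active)
        counts = Counter(self.owner)
        for _ in range(share):
            def key(i: int):
                affinity = 1 if self.initial[i] == nh else 0
                priority = W_AFFINITY * affinity + W_EPOCH * self.epoch[i]
                return (priority, counts[self.owner[i]], -i)
            candidates = [i for i, o in enumerate(self.owner) if o != nh]
            best = max(candidates, key=key)
            counts[self.owner[best]] -= 1
            counts[nh] += 1
            self.owner[best] = nh
            self.epoch[best] -= 1


table = BucketTable([1, 2, 3, 4], 16)
table.fail(3)      # buckets of 3 go to 1, 2 and 4
table.fail(2)      # buckets of 2 go to 4 and 1
table.recover(3)   # 3 takes back the highest-priority buckets
table.recover(2)
```

Under these assumptions the final layout coincides with the initial one; with other layouts or tie-breaking rules the restored state may differ.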
It should be noted that, in this embodiment, the restored state happens to coincide with the initial state, but such a state is not necessarily achievable in general. It can be understood that, in practical applications, the restored state may vary, and the above example shows only one case in which the states coincide.
In addition, when the next hop of a path fails, its next hop information is recorded in a database, and an independent process periodically sends requests according to the content of the database so that the recovery of the failed next hop is discovered as early as possible. On the basis that the allocation of data flows on the paths that did not fail is unaffected when a path fails, the affinity and the update epoch value of each position (HASH bucket) are calculated when the path is restored, and the value obtained by weighting them is used as the priority value of that position. The priority value then serves as the basis for migration to the new position, that is, the basis for allocating HASH buckets to the restored next hop, which reduces as far as possible the impact on the existing data flows on the paths that did not fail.
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
Referring to fig. 4, a structural block diagram of an ECMP group failure recovery device provided in an embodiment of the present invention is shown, where the ECMP group is composed of a plurality of next hops and each next hop includes next hop information. The device may specifically include the following modules:
a neighbor request sending module 401, configured to send a neighbor request to a failed next hop according to a preset period when the next hop in the ECMP group fails;
a neighbor information obtaining module 402, configured to, when the failed next hop returns a neighbor response to the neighbor request, re-obtain next hop information corresponding to the failed next hop according to the neighbor response, and obtain neighbor information for the next hop information; wherein the neighbor response contains neighbor information corresponding to the failed next hop;
an ECMP group recovery module 403, configured to restore the failed next hop according to the neighbor information so as to restore the ECMP group.
In an alternative embodiment, the apparatus further comprises:
an information recording module, configured to record the next hop information corresponding to the failed next hop in the state database when a next hop in the ECMP group fails.
In an alternative embodiment, the apparatus further comprises:
an information deleting module, configured to delete the next hop information corresponding to the failed next hop from the state database.
In an alternative embodiment, the ECMP group recovery module is specifically configured to:
restore the failed next hop according to the correspondence between the failed next hop and the MAC address of the failed next hop, so as to restore the ECMP group.
In an alternative embodiment, the apparatus further comprises:
a next hop marking module, configured to mark a next hop as a failed next hop when that next hop in the ECMP group fails.
In an alternative embodiment, the apparatus further comprises:
a next hop deletion module, configured to delete the failed next hop from the ECMP group when a next hop is marked as a failed next hop.
In an alternative embodiment, the next hop deletion module is specifically configured to:
delete the failed next hop from the ECMP group when the state of the outgoing interface of the failed next hop is DOWN.
In an alternative embodiment, the next hop deletion module is specifically configured to:
delete the failed next hop from the ECMP group when the neighbor information becomes invalid.
In an alternative embodiment, the ECMP group is configured with a bidirectional forwarding detection protocol, and the next hop deletion module is specifically configured to:
delete the failed next hop from the ECMP group when the bidirectional forwarding detection protocol detects that the connection to the next hop is interrupted.
In an alternative embodiment, the next hop in the ECMP group includes a plurality of HASH buckets, the HASH buckets corresponding to HASH values, and the apparatus further includes:
a data acquisition module, configured to obtain the affinity and the update epoch value corresponding to the HASH buckets when the ECMP group is restored;
a priority value acquisition module, configured to obtain the priority value of each HASH bucket according to the affinity and the update epoch value;
a next hop recovery module, configured to restore the failed next hop according to the priority values and the number of HASH buckets corresponding to each next hop in the ECMP group, so as to allocate HASH buckets to the restored next hop.
In an alternative embodiment, the apparatus further comprises:
a HASH bucket allocation module, configured to allocate the HASH buckets to the next hops in the ECMP group in response to a HASH bucket allocation instruction.
In an alternative embodiment, the data acquisition module is specifically configured to:
record the affinity between a HASH bucket and its corresponding next hop when the HASH bucket is initially allocated to a next hop in the ECMP group.
In an alternative embodiment, the apparatus further comprises:
an update epoch value updating module, configured to update the update epoch values of the HASH buckets corresponding to the next hops of the current ECMP group when a next hop in the ECMP group fails and/or a failed next hop is restored.
In an alternative embodiment, the priority value acquisition module is specifically configured to:
obtain the priority value of the HASH bucket in response to a weighting instruction for the affinity and the update epoch value.
Since the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
In addition, the embodiment of the invention also provides an electronic device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor. When executed by the processor, the computer program implements each process of the above embodiment of the ECMP group failure recovery method and achieves the same technical effects; to avoid repetition, the details are not described again here.
FIG. 5 is a schematic diagram of a computer-readable storage medium provided in an embodiment of the present invention;
The embodiment of the present invention further provides a computer-readable storage medium 501, where the computer-readable storage medium 501 stores a computer program. When executed by a processor, the computer program implements each process of the above embodiment of the ECMP group failure recovery method and achieves the same technical effects; to avoid repetition, the details are not described again here. The computer-readable storage medium 501 is, for example, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Fig. 6 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present invention.
The electronic device 600 includes, but is not limited to: radio frequency unit 601, network module 602, audio output unit 603, input unit 604, sensor 605, display unit 606, user input unit 607, interface unit 608, memory 609, processor 610, and power supply 611. It will be appreciated by those skilled in the art that the electronic device structure shown in fig. 6 is not limiting of the electronic device and that the electronic device may include more or fewer components than shown, or may combine certain components, or a different arrangement of components. In the embodiment of the invention, the electronic equipment comprises, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer and the like.
It should be understood that, in the embodiment of the present invention, the radio frequency unit 601 may be used to receive and send signals while information is being sent or received or during a call. Specifically, it receives downlink data from a base station and passes the data to the processor 610 for processing, and it transmits uplink data to the base station. Typically, the radio frequency unit 601 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 601 may also communicate with networks and other devices through a wireless communication system.
The electronic device provides wireless broadband internet access to the user via the network module 602, such as helping the user to send and receive e-mail, browse web pages, and access streaming media, etc.
The audio output unit 603 may convert audio data received by the radio frequency unit 601 or the network module 602 or stored in the memory 609 into an audio signal and output as sound. Also, the audio output unit 603 may also provide audio output (e.g., a call signal reception sound, a message reception sound, etc.) related to a specific function performed by the electronic device 600. The audio output unit 603 includes a speaker, a buzzer, a receiver, and the like.
The input unit 604 is used for receiving audio or video signals. The input unit 604 may include a graphics processor (Graphics Processing Unit, GPU) 6041 and a microphone 6042. The graphics processor 6041 processes image data of still pictures or video obtained by an image capturing apparatus (such as a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 606. The image frames processed by the graphics processor 6041 may be stored in the memory 609 (or another storage medium) or transmitted via the radio frequency unit 601 or the network module 602. The microphone 6042 can receive sound and process it into audio data. In a telephone call mode, the processed audio data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 601 and then output.
The electronic device 600 also includes at least one sensor 605, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor and a proximity sensor, wherein the ambient light sensor can adjust the brightness of the display panel 6061 according to the brightness of ambient light, and the proximity sensor can turn off the display panel 6061 and/or the backlight when the electronic device 600 moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the acceleration in all directions (generally three axes), and can detect the gravity and direction when stationary, and can be used for recognizing the gesture of the electronic equipment (such as horizontal and vertical screen switching, related games, magnetometer gesture calibration), vibration recognition related functions (such as pedometer and knocking), and the like; the sensor 605 may also include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, etc., which are not described herein.
The display unit 606 is used to display information input by a user or information provided to the user. The display unit 606 may include a display panel 6061, and the display panel 6061 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an Organic Light-Emitting Diode (OLED), or the like.
The user input unit 607 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic device. Specifically, the user input unit 607 includes a touch panel 6071 and other input devices 6072. Touch panel 6071, also referred to as a touch screen, may collect touch operations thereon or thereabout by a user (e.g., operations of the user on touch panel 6071 or thereabout using any suitable object or accessory such as a finger, stylus, or the like). The touch panel 6071 may include two parts of a touch detection device and a touch controller. The touch detection device detects the touch azimuth of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device and converts it into touch point coordinates, which are then sent to the processor 610, and receives and executes commands sent from the processor 610. In addition, the touch panel 6071 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. The user input unit 607 may include other input devices 6072 in addition to the touch panel 6071. Specifically, other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a track ball, a mouse, and a joystick, which are not described herein.
Further, the touch panel 6071 may be overlaid on the display panel 6061, and when the touch panel 6071 detects a touch operation thereon or thereabout, the touch operation is transmitted to the processor 610 to determine a type of a touch event, and then the processor 610 provides a corresponding visual output on the display panel 6061 according to the type of the touch event. Although in fig. 6, the touch panel 6071 and the display panel 6061 are two independent components for implementing the input and output functions of the electronic device, in some embodiments, the touch panel 6071 and the display panel 6061 may be integrated to implement the input and output functions of the electronic device, which is not limited herein.
The interface unit 608 is an interface through which an external device is connected to the electronic device 600. For example, the interface unit may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 608 may be used to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements within the electronic device 600, or may be used to transmit data between the electronic device 600 and an external device.
The memory 609 may be used to store software programs as well as various data. The memory 609 may mainly include a storage program area and a storage data area. The storage program area may store an operating system and an application program required for at least one function (such as a sound playing function, an image playing function, etc.); the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the electronic device. In addition, the memory 609 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The processor 610 is a control center of the electronic device, connects various parts of the entire electronic device using various interfaces and lines, and performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 609, and calling data stored in the memory 609, thereby performing overall monitoring of the electronic device. The processor 610 may include one or more processing units; preferably, the processor 610 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 610.
The electronic device 600 may also include a power supply 611 (e.g., a battery) for powering the various components, and preferably the power supply 611 may be logically coupled to the processor 610 via a power management system that performs functions such as managing charging, discharging, and power consumption.
In addition, the electronic device 600 may include some functional modules that are not shown, which are not described herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present invention and the scope of the claims, which are to be protected by the present invention.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, an optical disk, or the like.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (20)

1. A method for recovering from failure of an ECMP group, the ECMP group being composed of a plurality of next hops, the next hops including next hop information, the method comprising:
when the next hop in the ECMP group fails, sending a neighbor request to the failed next hop according to a preset period;
when the failed next hop returns a neighbor response to the neighbor request, re-acquiring the next hop information corresponding to the failed next hop according to the neighbor response, and obtaining neighbor information for the next hop information; wherein the neighbor response contains the neighbor information corresponding to the failed next hop;
recovering the failed next hop according to the neighbor information, so as to recover the ECMP group;
wherein the next hop in the ECMP group comprises a plurality of HASH buckets, and the HASH buckets correspond to HASH values;
when the ECMP group is recovered, obtaining the affinity and update coefficient value corresponding to the HASH bucket;
obtaining the value of the priority of the HASH bucket according to the affinity and the update coefficient value;
and recovering the failed next hop according to the value of the priority and the number of HASH buckets corresponding to each next hop in the ECMP group, so as to allocate the HASH buckets to the recovered next hop.
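For illustration only, the periodic probing in claim 1 can be sketched as follows. This is a minimal simulation under stated assumptions, not the claimed implementation: `NextHop`, `probe_until_recovered`, and the injected `send_neighbor_request` callback (standing in for an ARP or neighbor solicitation exchange) are hypothetical names, and the bound on the number of rounds exists only so the sketch terminates.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class NextHop:
    """Next hop information: IP address and outgoing interface."""
    ip: str
    out_interface: str


def probe_until_recovered(
    failed: Iterable[NextHop],
    send_neighbor_request: Callable[[NextHop], Optional[str]],
    period_s: float = 1.0,
    max_rounds: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[NextHop, str]:
    """Probe each failed next hop once per period.

    `send_neighbor_request` is a hypothetical callback that returns the MAC
    address carried in the neighbor response, or None if no response arrived.
    Returns the neighbor information (next hop -> MAC) of every hop that answered.
    """
    pending = set(failed)
    recovered: Dict[NextHop, str] = {}
    for _ in range(max_rounds):
        for hop in sorted(pending, key=lambda h: h.ip):
            mac = send_neighbor_request(hop)
            if mac is not None:
                recovered[hop] = mac  # neighbor information learned again
        pending.difference_update(recovered)
        if not pending:
            break
        sleep(period_s)  # wait for the next preset period
    return recovered
```

A caller would pass the returned mapping to whatever routine re-adds the next hop to the ECMP group; injecting `sleep` keeps the sketch testable without real delays.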
2. The method according to claim 1, wherein the method further comprises:
when the next hop in the ECMP group fails, recording the next hop information corresponding to the failed next hop in a state database.
3. The method of claim 2, wherein after the recovering the failed next hop to recover the ECMP group based on the neighbor information, the method further comprises:
deleting the next hop information corresponding to the failed next hop from the state database.
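The bookkeeping in claims 2 and 3 can be pictured with a small in-memory stand-in for the state database. The class and method names below are assumptions made for this sketch; the claims do not specify the database or its interface.

```python
from typing import Dict, List


class FailedNextHopStore:
    """Hypothetical in-memory stand-in for the state database of claims 2-3."""

    def __init__(self) -> None:
        # Maps next hop IP address -> outgoing interface of the failed next hop.
        self._entries: Dict[str, str] = {}

    def record_failure(self, ip: str, out_interface: str) -> None:
        """Store the next hop information when the next hop fails."""
        self._entries[ip] = out_interface

    def clear_on_recovery(self, ip: str) -> None:
        """Delete the entry once the next hop has been recovered."""
        self._entries.pop(ip, None)

    def pending(self) -> List[str]:
        """IP addresses of next hops that still need to be probed."""
        return sorted(self._entries)
```

Keeping only failed next hops in the store means the periodic probing loop has a ready list of targets, and an empty store signals that the ECMP group is whole again.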
4. The method of claim 1, wherein the next hop information includes an IP address corresponding to the next hop and an outgoing interface of the next hop.
5. The method of claim 4, wherein the neighbor information is a correspondence between the next hop and a MAC address of the next hop.
6. The method of claim 5, wherein the recovering the failed next hop to recover the ECMP group based on the neighbor information comprises:
recovering the failed next hop according to the correspondence between the failed next hop and the MAC address of the failed next hop, so as to recover the ECMP group.
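As a rough sketch of claims 4 to 6, the next hop information can be modeled as an IP address plus an outgoing interface, and the neighbor information as a mapping from IP address to MAC address. The names `NextHopInfo` and `restore_next_hop` are hypothetical, and the rule "restore only when a MAC address is known" is this sketch's reading of the claims.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class NextHopInfo:
    ip: str             # IP address of the next hop
    out_interface: str  # outgoing interface toward the next hop


def restore_next_hop(
    group: Set[NextHopInfo],
    neighbors: Dict[str, str],
    hop: NextHopInfo,
) -> bool:
    """Re-add `hop` to the ECMP group only if its MAC address is known.

    `neighbors` maps next hop IP -> MAC address (the neighbor information).
    Returns True when the hop was restored.
    """
    if neighbors.get(hop.ip) is None:
        return False
    group.add(hop)
    return True
```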
7. The method according to claim 1, wherein the method further comprises:
when the next hop in the ECMP group fails, marking the next hop as the failed next hop.
8. The method of claim 7, wherein the method further comprises:
deleting the failed next hop from the ECMP group when the next hop is marked as the failed next hop.
9. The method of claim 8, wherein deleting the failed next hop in the ECMP group when the next hop is marked as a failed next hop comprises:
deleting the failed next hop from the ECMP group when the state of the outgoing interface of the failed next hop is DOWN.
10. The method of claim 8, wherein deleting the failed next hop in the ECMP group when the next hop is marked as a failed next hop comprises:
deleting the failed next hop from the ECMP group when the neighbor information is invalid.
11. The method of claim 8, wherein the ECMP group is provided with a bidirectional forwarding detection protocol, and wherein deleting the failed next hop in the ECMP group when the next hop is marked as a failed next hop comprises:
deleting the failed next hop from the ECMP group when the bidirectional forwarding detection protocol detects that the connection to the next hop is interrupted.
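The three deletion triggers in claims 9 to 11 can be summarized in a short sketch. The `NextHopStatus` fields and the order in which the conditions are checked are assumptions for illustration; the claims only list the conditions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class NextHopStatus:
    interface_up: bool             # state of the outgoing interface
    neighbor_valid: bool           # whether the neighbor information is valid
    bfd_up: Optional[bool] = None  # None means BFD is not configured


def failure_reason(status: NextHopStatus) -> Optional[str]:
    """Return why a next hop counts as failed, or None if it is healthy."""
    if not status.interface_up:
        return "out interface DOWN"
    if not status.neighbor_valid:
        return "neighbor information invalid"
    if status.bfd_up is False:
        return "BFD detected interrupted connection"
    return None


def prune_failed(
    group: List[str], status: Dict[str, NextHopStatus]
) -> Tuple[List[str], Dict[str, str]]:
    """Split an ECMP group (list of next hop IPs) into alive members and failures."""
    failed: Dict[str, str] = {}
    alive: List[str] = []
    for ip in group:
        reason = failure_reason(status[ip])
        if reason is None:
            alive.append(ip)
        else:
            failed[ip] = reason
    return alive, failed
```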
12. The method of claim 1, wherein the value of the priority is used as a basis for allocation of the HASH bucket to a recovered next hop.
13. The method according to claim 1, wherein the method further comprises:
allocating the HASH bucket to a next hop in the ECMP group in response to an allocation instruction for the HASH bucket.
14. The method of claim 13, wherein the obtaining the affinity and update coefficient value corresponding to the HASH bucket comprises:
when the HASH bucket is initially allocated to a next hop in the ECMP group, recording the affinity between the HASH bucket and its corresponding next hop.
15. The method of claim 12, wherein the update coefficient value is attribute information of the HASH bucket.
16. The method of claim 15, wherein the method further comprises:
when the next hop in the ECMP group fails and/or the failed next hop is recovered, updating the update coefficient value of the HASH bucket corresponding to the next hop of the current ECMP group.
17. The method of claim 12, wherein said obtaining the value of the priority of the HASH bucket according to the affinity and the update coefficient value comprises:
obtaining the value of the priority of the HASH bucket in response to a weighting instruction for the affinity and the update coefficient value.
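The claims do not give a formula for the priority, so the following is only one possible reading: a weighted sum of the affinity and the update coefficient value, followed by moving the highest-priority buckets from overloaded next hops to the recovered one until it holds an equal share. The weights, the equal-share target, and all function names are assumptions of this sketch.

```python
from collections import Counter
from typing import Dict, List


def bucket_priority(affinity: float, update_value: float,
                    w_affinity: float = 0.7, w_update: float = 0.3) -> float:
    """Hypothetical weighting of affinity and update coefficient value."""
    return w_affinity * affinity + w_update * update_value


def reassign_to_recovered(owner: Dict[int, str], affinity: Dict[int, float],
                          update_value: Dict[int, float], recovered: str,
                          hops: List[str]) -> Dict[int, str]:
    """Give the recovered next hop its share of HASH buckets.

    owner: bucket id -> next hop currently serving the bucket.
    affinity: bucket id -> affinity of the bucket to the recovered next hop.
    Buckets are taken in descending priority, and only from next hops that
    hold more than the equal share, so other buckets stay where they are.
    """
    target = len(owner) // len(hops)
    counts = Counter(owner.values())
    new_owner = dict(owner)
    ranked = sorted(
        owner,
        key=lambda b: (-bucket_priority(affinity[b], update_value[b]), b),
    )
    for bucket in ranked:
        if counts[recovered] >= target:
            break
        donor = new_owner[bucket]
        if donor == recovered or counts[donor] <= target:
            continue
        new_owner[bucket] = recovered
        counts[donor] -= 1
        counts[recovered] += 1
    return new_owner
```

In this reading, buckets that originally belonged to the recovered next hop carry a high affinity and therefore return to it first, which keeps flows on the other buckets undisturbed.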
18. An ECMP group failure recovery apparatus, the ECMP group being composed of a plurality of next hops, the next hops including next hop information, the apparatus comprising:
a neighbor request sending module, configured to send a neighbor request to the failed next hop according to a preset period when the next hop in the ECMP group fails;
a neighbor information acquisition module, configured to, when the failed next hop returns a neighbor response to the neighbor request, re-acquire the next hop information corresponding to the failed next hop according to the neighbor response and obtain neighbor information for the next hop information; wherein the neighbor response contains the neighbor information corresponding to the failed next hop;
an ECMP group recovery module, configured to recover the failed next hop according to the neighbor information, so as to recover the ECMP group;
wherein the next hop in the ECMP group comprises a plurality of HASH buckets, and the HASH buckets correspond to HASH values;
a data acquisition module, configured to obtain the affinity and update coefficient value corresponding to the HASH bucket when the ECMP group is recovered;
a priority value obtaining module, configured to obtain the value of the priority of the HASH bucket according to the affinity and the update coefficient value;
and a next hop recovery module, configured to recover the failed next hop according to the value of the priority and the number of HASH buckets corresponding to each next hop in the ECMP group, so as to allocate the HASH buckets to the recovered next hop.
19. An electronic device comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other via the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the method of any one of claims 1-17 when executing the computer program stored in the memory.
20. A computer-readable storage medium having instructions stored thereon, which when executed by one or more processors, cause the processors to perform the method of any of claims 1-17.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310239663.8A CN115955434B (en) 2023-03-14 2023-03-14 ECMP group failure recovery method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310239663.8A CN115955434B (en) 2023-03-14 2023-03-14 ECMP group failure recovery method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115955434A CN115955434A (en) 2023-04-11
CN115955434B CN115955434B (en) 2023-05-30

Family

ID=85907016

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310239663.8A Active CN115955434B (en) 2023-03-14 2023-03-14 ECMP group failure recovery method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115955434B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115514702A (en) * 2022-09-16 2022-12-23 苏州盛科科技有限公司 Method and device for quickly switching link, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105721321B (en) * 2014-12-02 2019-09-06 南京中兴新软件有限责任公司 A kind of the outgoing interface update method and device of equal cost multipath
CN115550247A (en) * 2021-06-29 2022-12-30 中兴通讯股份有限公司 Equivalent route management method, switch system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant