WO2020136660A1 - Method and system to extend segment routing based traceroute in a multiprotocol label switching network - Google Patents

Method and system to extend segment routing based traceroute in a multiprotocol label switching network

Info

Publication number
WO2020136660A1
Authority
WO
WIPO (PCT)
Prior art keywords
label
segment
ttl
traceroute
network device
Prior art date
Application number
PCT/IN2018/050883
Other languages
French (fr)
Inventor
Anush MOHAN
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/IN2018/050883 priority Critical patent/WO2020136660A1/en
Publication of WO2020136660A1 publication Critical patent/WO2020136660A1/en

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/10Active monitoring, e.g. heartbeat, ping or trace-route
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/08Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852Delays
    • H04L43/0864Round trip delays
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/20Arrangements for monitoring or testing data switching networks the monitoring system or the monitored elements being virtualised, abstracted or software-defined entities, e.g. SDN or NFV
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/02Topology update or discovery
    • H04L45/026Details of "hello" or keep-alive messages
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/20Hop count for routing purposes, e.g. TTL
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/26Route discovery packet
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08Configuration management of networks or network elements
    • H04L41/0894Policy-based network configuration management
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/12Shortest path evaluation
    • H04L45/123Evaluation of link metrics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/50Routing or path finding of packets in data switching networks using label swapping, e.g. multi-protocol label switch [MPLS]

Definitions

  • Embodiments of the present disclosure relate to the field of networking, and more specifically, relate to methods and systems to extend segment routing based traceroute in a Multiprotocol Label Switching (MPLS) network.
  • MPLS is a packet forwarding protocol. MPLS directs data from one node to the next based on short path labels rather than long network addresses, thus avoiding complex lookups in a routing table. MPLS can encapsulate packets of various network protocols and has been deployed widely in communications networks. In an MPLS network, segment routing may be implemented, where an ingress node prepends a header to packets that contain a list of segments, which are instructions that are executed on subsequent nodes in the MPLS network.
  • An MPLS network, like any other communications network, requires operations, administration, and management (OAM).
  • traceroute is a popular network diagnostic tool for displaying the route (path) and measuring transit delays of packets across an Internet Protocol (IP) network.
  • the history of the route is recorded as the round-trip times of the packets received from each successive host (remote node) in the route (path).
  • Embodiments of the invention offer efficient ways to extend segment routing based traceroute in a Multiprotocol Label Switching (MPLS) network.
  • a method is performed by a network device in an MPLS network, where the network device performs traceroute using a stack of labels and each label identifies a segment in segment routing in the MPLS network.
  • the method comprises: setting a first time-to-live (TTL) count for a first segment to be a first value, where the first segment is identified by a first label in the stack of labels; and sending a first sequence of traceroute packets including the first label and the first TTL count to verify the first segment.
  • the method further comprises: receiving a reply for the first sequence of traceroute packets from a second network device; setting a second TTL count for a second segment to be a second value, where the second segment is the immediate next segment of the first segment as indicated by a second label immediate next to the first label in the stack of labels, and where the second value is set based on the reply for the first sequence of traceroute packets; and sending a second sequence of traceroute packets including the second label and the second TTL count to verify the second segment.
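  • As a rough illustration only (a Python sketch that is not part of the original disclosure: the reply is modeled as a plain dictionary, the helper name is hypothetical, and the rule that a terminated segment yields a TTL of two for the next sequence is detailed later with respect to Figure 3), the second value may be derived from the reply as follows:

```python
def second_ttl_from_reply(reply):
    """Derive the TTL count for the next segment from the reply to the previous
    sequence. Per the rules detailed later in the disclosure, a terminated segment
    (modeled here by an assumed 'fec_pop' flag standing in for a FEC-pop subTLV)
    leads to a TTL of two for the next segment; otherwise the count advances by one."""
    return 2 if reply.get("fec_pop") else reply.get("prev_ttl", 1) + 1

first_ttl = 1                                      # first value: trace the first segment
reply = {"fec_pop": True, "prev_ttl": first_ttl}   # simulated reply from the second network device
second_ttl = second_ttl_from_reply(reply)          # second value set based on the reply
print(first_ttl, second_ttl)                       # prints: 1 2
```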
  • Embodiments of the invention include network devices to extend segment routing based traceroute in a Multiprotocol Label Switching (MPLS) network.
  • a network device comprises a processor and a computer-readable storage medium.
  • the computer-readable storage medium provides instructions that, when executed by the processor, cause the network device to perform: setting a first time-to-live (TTL) count for a first segment to be a first value, where the first segment is identified by a first label in the stack of labels; and sending a first sequence of traceroute packets including the first label and the first TTL count to verify the first segment.
  • the network device is further to perform: receiving a reply for the first sequence of traceroute packets from a second network device; setting a second TTL count for a second segment to be a second value, where the second segment is the immediate next segment of the first segment as indicated by a second label immediate next to the first label in the stack of labels, and where the second value is set based on the reply for the first sequence of traceroute packets; and sending a second sequence of traceroute packets including the second label and the second TTL count to verify the second segment.
  • Embodiments of the invention include computer-readable storage media that provide instructions (e.g., computer program) that, when executed by a processor of an electronic device, cause the electronic device to perform operations comprising one or more methods of the embodiments of the invention.
  • the time-to-live (TTL) count may be set dynamically based on feedback received from another network device, and properly setting TTL values allows a traceroute to operate properly for segment routing in an MPLS network.
  • Figure 1A shows a traceroute implementation in a Multi-Protocol Label Switching (MPLS) network.
  • Figure 1B shows a segment routing based traceroute implementation in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • Figure 2A shows a segment routing based traceroute implementation with PHP operations in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • Figure 2B shows a segment routing based traceroute implementation with an adjacency SID in a label stack in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • FIG. 3 shows an example flow to set proper TTL values for segment routing in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • Figure 4 shows a segment routing based traceroute implementation with a B-SID node in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • Figure 5 shows the operations at each node for the sequences of traceroute packets per one embodiment of the invention.
  • Figure 6A shows a data structure to announce a binding segment identifier (B-SID) per one embodiment of the invention.
  • Figure 6B shows a data structure to indicate a TTL setting for nodes within a binding segment identifier (B-SID) domain per one embodiment of the invention.
  • Figure 6C shows an example of a traceroute-reply packet including B-SID information per one embodiment of the invention.
  • Figure 7 shows a segment routing based traceroute implementation with two B-SID nodes in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • Figure 8 is a flow diagram showing the operations to set proper TTL values for segment routing based traceroute implementation in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • Figure 9 shows a network device implementing the packet forwarding per one embodiment of the invention.
  • partitioning/sharing/duplication implementations, types and interrelationships of system components, and logic partitioning/integration choices are set forth to provide a more thorough understanding of the present invention.
  • One skilled in the art will appreciate, however, that the invention may be practiced without such specific details.
  • control structures, gate level circuits, and full software instruction sequences have not been shown in detail in order not to obscure the invention.
  • Those of ordinary skill in the art, with the included descriptions, will be able to implement proper functionality without undue experimentation.
  • Bracketed text and blocks with dashed borders may be used to illustrate optional operations that add additional features to the embodiments of the invention. Such notation, however, should not be taken to mean that these are the only options or optional operations, and/or that blocks with solid borders are not optional in some embodiments of the invention.
  • references in the specification to "one embodiment," "an embodiment," "an example embodiment," and so forth, indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to affect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
  • Coupled is used to indicate that two or more elements, which may or may not be in direct physical or electrical contact with each other, co-operate or interact with each other.
  • Connected is used to indicate the establishment of wireless or wireline communication between two or more elements that are coupled with each other.
  • A "set," as used herein, refers to any positive whole number of items including one item.
  • An electronic device stores and transmits (internally and/or with other electronic devices over a network) code (which is composed of software instructions and which is sometimes referred to as a computer program code or a computer program) and/or data using machine-readable media (also called computer-readable media), such as machine-readable storage media (e.g., magnetic disks, optical disks, solid state drives, read only memory (ROM), flash memory devices, phase change memory) and machine-readable transmission media (also called a carrier) (e.g., electrical, optical, radio, acoustical or other form of propagated signals - such as carrier waves, infrared signals).
  • an electronic device (e.g., a computer) includes hardware and software, such as a set of one or more processors (e.g., of which a processor is a microprocessor, controller, microcontroller, central processing unit, digital signal processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), other electronic circuitry, or a combination of one or more of the preceding) coupled to one or more machine-readable storage media to store code for execution on the set of processors and/or to store data.
  • an electronic device may include non-volatile memory containing the code since the non-volatile memory can persist code/data even when the electronic device is turned off (when power is removed).
  • Typical electronic devices also include a set of one or more physical network interface(s) (NI(s)) to establish network connections (to transmit and/or receive code and/or data using propagating signals) with other electronic devices.
  • the set of physical NIs may perform any formatting, coding, or translating to allow the electronic device to send and receive data whether over a wired and/or a wireless connection.
  • a physical NI may comprise radio circuitry capable of (1) receiving data from other electronic devices over a wireless connection and/or (2) sending data out to other devices through a wireless connection.
  • This radio circuitry may include transmitter(s), receiver(s), and/or transceiver(s) suitable for radio frequency communication.
  • the radio circuitry may convert digital data into a radio signal having the proper parameters (e.g., frequency, timing, channel, bandwidth, and so forth).
  • the radio signal may then be transmitted through antennas to the appropriate recipient(s).
  • the set of physical NI(s) may comprise network interface controller(s) (NICs), also known as a network interface card, network adapter, or local area network (LAN) adapter.
  • the NIC(s) may facilitate connecting the electronic device to other electronic devices, allowing them to communicate over a wire by plugging a cable into a physical port connected to a NIC.
  • One or more parts of an embodiment of the invention may be implemented using different combinations of software, firmware, and/or hardware.
  • a network device (also referred to as a network node or node, and these terms are used interchangeably in this disclosure) is an electronic device in a communications network.
  • the network device (e.g., a router, switch, or bridge) is a piece of networking equipment, including hardware and software, that communicatively interconnects other equipment on the network (e.g., other network devices, end systems).
  • Some network devices are“multiple services network devices” that provide support for multiple networking functions (e.g., routing, bridging, VLAN (virtual LAN) switching, Layer 2 aggregation, session border control, Quality of Service, and/or subscriber management), and/or provide support for multiple application services (e.g., data, voice, and video).
  • Subscriber end systems (e.g., servers, workstations, laptops, netbooks, palm tops, mobile phones, smartphones, multimedia phones, Voice Over Internet Protocol (VOIP) phones, user equipment, terminals, portable media players, GPS units, gaming systems, set-top boxes) access content and/or services provided over the Internet and/or over virtual private networks (VPNs) overlaid on the Internet.
  • the content and/or services are typically provided by one or more end systems (e.g., server end systems) belonging to a service or content provider or end systems participating in a peer to peer service, and may include, for example, public webpages (e.g., free content, store fronts, search services), private webpages (e.g., username/password accessed webpages providing email services), and/or corporate networks over VPNs.
  • subscriber end systems are coupled (e.g., through customer premise equipment coupled to an access network (wired or wirelessly)) to edge network devices, which are coupled (e.g., through one or more core network devices) to other edge network devices, which are coupled to other end systems (e.g., server end systems).
  • a network device is generally identified by its media access control (MAC) address, Internet protocol (IP) address/subnet, network sockets/ports, and/or upper OSI layer identifiers.
  • Segment routing is a routing protocol that leverages the source routing paradigm.
  • a node steers a packet through an SR Policy instantiated as an ordered list of instructions called segments.
  • a segment can represent any instruction, topological or service- based.
  • a segment can have a semantic local to an SR node or global within an SR domain.
  • SR supports per-flow explicit routing while maintaining per-flow state only at the ingress nodes to the SR domain.
  • SR-MPLS is an instantiation of SR on a multiprotocol label switching (MPLS) data plane.
  • a segment is an instruction a node executes on the incoming packet (e.g., forward packet according to shortest path to destination, or, forward packet through a specific interface, or, deliver the packet to a given application/service instance).
  • a segment may be an interior gateway protocol (IGP) segment, which is a segment attached to a piece of information advertised by a link-state IGP, e.g., an IGP prefix or an IGP adjacency.
  • a segment is often referred to by its Segment Identifier (SID).
  • An SR-MPLS SID is an MPLS label or an index value into an MPLS label space explicitly associated with a segment identified by the SID.
  • a SID may be a node SID (or node-SID), which identifies a node segment (corresponding to a network device (node)), or an adjacency SID (or Adj-SID), which identifies an adjacency segment.
  • a segment routing domain is a set of nodes participating in the source-based routing model. These nodes may be connected to the same physical infrastructure (e.g., a Service Provider's network). They may also be remotely connected to each other (e.g., an enterprise VPN or an overlay). If multiple protocol instances are deployed, the SR domain most commonly includes all of the protocol instances in a network. However, some deployments may wish to sub-divide the network into multiple SR domains, each of which includes one or more protocol instances. It is expected that all nodes in an SR Domain are managed by the same administrative entity.
  • An active segment is a segment that is used by a receiving network device (e.g., a router) to process a packet containing the segment. In SR-MPLS, it is the top label of the label stack.
  • PUSH is an instruction comprising the insertion of a segment at the top of the segment list.
  • the top of the segment list is the topmost (outer) label of a label stack.
  • NEXT is an instruction that becomes active when the active segment is complete, and it includes inspection of the next segment. In SR-MPLS, NEXT is implemented as a POP of the top label.
  • CONTINUE is an instruction indicating that the active segment is not completed and hence remains active. In SR-MPLS, the CONTINUE instruction is implemented as a SWAP of the top label.
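  • As an illustration of how these instructions map onto MPLS label operations, the following is a minimal sketch (Python; the list-based stack and the label values are illustrative assumptions, not part of the original disclosure):

```python
# Minimal sketch: SR-MPLS instructions modeled as operations on a label stack
# (outermost label first). The list-based stack and label values are illustrative only.

def push(stack, label):
    """PUSH: insert a segment (label) at the top (outermost position) of the stack."""
    return [label] + stack

def cont(stack, new_label):
    """CONTINUE: the active segment remains active; implemented as a SWAP of the top label."""
    return [new_label] + stack[1:]

def next_instruction(stack):
    """NEXT: the active segment is complete; implemented as a POP of the top label."""
    return stack[1:]

stack = []
stack = push(stack, 999004)      # e.g., a node SID toward an egress node
stack = push(stack, 999003)      # e.g., a node SID toward a transit node (now the active segment)
stack = cont(stack, 999003)      # a transit node swaps the top label
stack = next_instruction(stack)  # the terminating node pops the top label
print(stack)                     # [999004]
```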
  • a segment routing (SR) policy is an ordered list of segments.
  • the headend of an SR Policy steers packets onto the SR policy.
  • the list of segments can be specified explicitly in SR-MPLS as a stack of labels. Alternatively, the list of segments is computed based on a destination and a set of optimization objectives and constraints (e.g., latency, affinity, SR local group).
  • Segment list depth is the number of segments of an SR policy.
  • the entity instantiating an SR Policy at a node N should be able to discover the depth insertion capability of the node N.
  • Binding segment identifier (B-SID) in MPLS is a label with which MPLS traffic that needs to traverse a segment routing traffic engineering label switched path (SR-TE LSP) in a domain ingresses that domain.
  • the B-SID maps traffic to a particular SR-TE LSP in its domain.
  • Forwarding equivalence class (FEC) describes a set of packets with similar or identical characteristics which may be forwarded the same way; thus, they may be bound to the same MPLS label. Characteristics determining the FEC of a higher-layer packet depend on the configuration of a network device, but typically include one or more of an IP address (e.g., the destination IP address) or a quality of service class. An FEC tends to correspond to a label switched path (LSP), but an LSP is often used for multiple FECs.
  • Penultimate hop popping (PHP) is an operation in which a network device (e.g., a label switch router (LSR)) pops the outermost label of an MPLS packet before forwarding the packet to the adjacent downstream node (e.g., a label edge router (LER)).
  • MPLS Differentiated Services (Diff-Serv, also referred to as DiffServ) tunneling modes allow service providers to manage the quality of service (QoS) that a network node provides to an MPLS packet in an MPLS network.
  • MPLS Diff-Serv tunneling modes include uniform, short pipe, and pipe mode.
  • the uniform mode, short pipe mode, and pipe mode are defined in RFC 2983, "Differentiated Services and Tunnels," and RFC 3270, "Multi-Protocol Label Switching (MPLS) Support of Differentiated Services."
  • MPLS tunnels are used to hide the intermediate MPLS nodes between LSP ingress and egress from the Diff-Serv perspective.
  • Short pipe model is a variation of the pipe model, where the Diff-Serv forwarding treatment at an LSP egress is applied based on the "tunneled Diff-Serv information" (i.e., Diff-Serv information conveyed in the encapsulated header) rather than on the "LSP Diff-Serv information" (i.e., Diff-Serv information conveyed in the encapsulating header). Because the LSP egress applies its forwarding treatment based on the "tunneled Diff-Serv information," the "LSP Diff-Serv information" does not need to be conveyed by the penultimate node to the LSP egress; thus, the short pipe model may operate with PHP.
  • Type-length-value (TLV) is an encoding scheme used for optional information elements in a certain protocol.
  • a packet may contain multiple TLVs, and each TLV may include one or more additional TLVs within, and any such included TLVs may be referred to as a sub-TLV.
  • FIG. 1A shows a traceroute implementation in a Multi-Protocol Label Switching (MPLS) network.
  • Network 100 is an MPLS network that includes four nodes, R1-R4, whose addresses (e.g., IP addresses) are represented as 1.1.1.1 to 4.4.4.4 for simplicity of discussion.
  • a label switched path (LSP) is established, and the LSP starts at R1, traverses R2 and R3, and terminates at R4.
  • MPLS traceroute validates the LSP
  • FEC (4.4.4.4) may be populated in a FEC sub-TLV, and the label stack may be carried in a downstream detailed mapping (DDM) TLV.
  • an MPLS ping server on R2 validates in the control plane if the traceroute-request packet is received on the expected interface, label 999004 corresponds to egress FEC (4.4.4.4), and other criteria (e.g., the ones specified in RFC 8029, entitled "Detecting Multiprotocol Label Switched (MPLS) Data-Plane Failures," and/or RFC 8287, entitled "LSP Ping/Traceroute for Segment Routing (SR) IGP-Prefix and IGP-Adjacency Segment Identifiers (SIDs) with MPLS Data Planes").
  • R2 replies to R1 with a traceroute-reply packet, including its validation result of the traceroute-request packet, the role of R2 (being a transit node in this case) in the LSP, as well as its downstream next hop and label stack towards R3, which is the next hop for the LSP.
  • the label operation is swap (a continuation as defined above) at reference 104.
  • R3's MPLS ping server validates in the control plane the traceroute-request packet (similar to what R2 did for the first traceroute-request), and replies to R1 with a traceroute-reply packet, including its validation result of the traceroute-request packet, the role of R3 (being a transit node in this case) in the LSP, as well as its downstream next hop and label stack towards R4, which is the next hop for the LSP.
  • the label operation is swap (a continuation as defined above) at reference 106.
  • R1 validates the traceroute-reply packet and saves the result (e.g., printing it out).
  • R4 replies to R1 with a traceroute-reply packet, indicating its validation result of the traceroute-request packet, and stating that R4 itself is the egress node for the LSP.
  • since the label 999004 is for FEC (4.4.4.4), the label is popped at reference 108.
  • R1 receives the traceroute-reply packet
  • FIG. 1B shows a segment routing based traceroute implementation in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • R1-R4 are included in the network, and a short pipe model is used for MPLS tunneling.
  • an LSP may stitch together three segments (i.e., {R1-R2, R2-R3, and R3-R4}) using a three-label stack to create a forwarding path to R4 (4.4.4.4) at reference 112.
  • Each label in the three-label stack indicates a segment in the LSP.
  • a path computing element may program the LSP at R1 with the three-label stack and corresponding FEC information via an LSP initiate request message (e.g., a PCInitiate message specified in RFC 8281, entitled "Path Computation Element Communication Protocol (PCEP) Extensions for PCE-Initiated LSP Setup in a Stateful PCE Model").
  • the correct node needs to validate the FEC corresponding to a received label, and the TTL needs to be set properly.
  • Each sequence may contain multiple traceroute-request packets that node R1 sends out for route validation; the multiple packets are sent in case some packets are delayed or dropped along the route. For simplicity of explanation, one packet per sequence of traceroute-request packets is examined in our discussion.
  • the inner-label TTLs are set to zero as we intend to trace the first segment first. All three FECs corresponding to the three labels are populated into the MPLS traceroute FEC subTLV for traceroute validation.
  • R2 does the validation as explained above relating to Figure 1A.
  • since R2 is the termination point for segment 999002 corresponding to FEC (2.2.2.2), R2 sends a FEC-pop subTLV in its traceroute-reply to R1. Additionally, since the traceroute-request packet includes FEC subTLVs for inner labels (i.e., 999003 and 999004), and R2 is a transit node for the next label in the stack (i.e., 999003), R2 validates the next FEC (3.3.3.3) against the next label and indicates its transit role in its traceroute-reply to R1.
  • the next traceroute-request packet (which is in the second sequence of traceroute packets) is sent with the TTL combination shown in Table 2. Only the FECs for R3 (3.3.3.3) and R4 (4.4.4.4) are included in the FEC subTLV for validation.
  • the traceroute-request packet includes an FEC subTLV for an inner label (i.e., 999004), and R3 knows itself as a transit node for the next label in the stack, so R3 validates the next FEC (4.4.4.4) against the next label and indicates its transit role in its traceroute-reply to R1. After receiving the traceroute-reply, R1 knows that R3 has terminated the second segment 999003, and subsequent traceroute packets need to validate only the remaining segment. Note that the TTL for the next label is set to be one, and the reason for the value will become clear later in the Disclosure (e.g., see discussion relating to Figures 2A-B below).
  • the next traceroute-request packet (which is in the third sequence of traceroute packets) is sent with the TTL combination shown in Table 2.
  • FEC for R4 (4.4.4.4) is included in the FEC subTLV for validation.
  • label 999002 is popped, and the packet is forwarded to R3.
  • label 999003 is popped, and it decreases the TTL for the inner label, label 999004, by one.
  • the packet reaches node R4, where the MPLS ping server validates the packet, and figures out that R4 itself is the egress node and there are no further FEC subTLVs to be validated.
  • the traceroute-reply from R4 indicates that R4 is the egress for the LSP, and R1 stops the current traceroute iteration as the LSP egress node has been reached.
  • TTL values vary for (1) different traceroute-request packet sequences and (2) different labels as illustrated in Table 2. It is challenging to set proper TTL values. Indeed, RFC 8287 acknowledges the difficulty with tracing a source-routed LSP (e.g., a segment routing based LSP), because "when a source-routed LSP has to be traced, there are as many TTLs as there are labels in the stack"; yet RFC 8287 proposes setting the starting TTL to 1 for the segment to be traced, the outer-labels (relating to the segment) to a maximum TTL (MAX-TTL), and the inner-labels (relating to the segment) to zero: "The LSR that initiates the traceroute SHOULD start by setting the TTL to 1 for the tunnel in the LSP's label stack it wants to start the tracing from, the TTL of all outer labels in the stack to the max value, and the TTL of all the inner labels in the stack to zero."
  • RFC 8287 Section 7.5.
  • the RFC 8287 proposal does not work for the scenario relating to Figure 1B - as Table 2 shows, some TTLs need to be set to the value two for the traceroute validation to be performed properly. Thus, an extension to the existing traceroute operations is needed to implement segment routing based traceroute.
  • a node may perform penultimate hop popping (PHP) instead of pop.
  • FIG 2A shows a segment routing based traceroute implementation with PHP operations in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • the MPLS network is similar to the ones in Figures 1A-B, but R2 is to perform PHP instead of pop.
  • the LSP is programmed with the label stack {999003, 999004} at reference 202 and the label-FEC mapping is shown in Table 3.
  • Upon receiving the traceroute-reply, R1 sends a second packet of the second traceroute sequence from R1 to R2, which will perform PHP on label 999003 as shown at reference 204.
  • the outer label 999003 is removed, and R2 sends the packet towards R3 based on the top label lookup (finding the interface toward R3). Note that since the operation is PHP, the TTL value for the inner label is not decremented (unlike pop, which decrements the inner label TTL value by one).
  • the packet arrives at R3, whose MPLS ping server knows that it is the termination node for label 999003, and its MPLS ping server validates the packet, and replies with a traceroute-reply packet, including an FEC-pop subTLV for FEC (3.3.3.3).
  • the traceroute-request packet includes an FEC subTLV for an inner label (i.e., 999004), and R3 is a transit node for the next label in the stack, so R3 validates the next FEC (4.4.4.4) against the next label and indicates its transit role in its traceroute-reply to R1. As shown at reference 206, R3 performs swap for the next label in the stack, label 999004.
  • Upon receiving the traceroute-reply, R1 knows that R3 has terminated segment 999003, and subsequent traceroute packets need to validate only the remaining segment 999004, and R1 sends a packet of the third traceroute sequence from R1.
  • label 999003 and the TTL associated with label 999003 are removed with the PHP operation again, and R2 sends the packet towards R3, which knows the outmost label 999004 is not for itself and thus forwards the packet towards R4, reducing the TTL for label 999004 by one (occurring with the swap operation).
  • Upon receiving the traceroute-reply, R1 knows that egress R4 for the LSP has been reached, and R1 stops the current traceroute iterations. Note that with the PHP operations being considered, the TTL for the active segment in traceroute is set to be two to make the traceroute validation perform properly.
  • the label stacks may include adjacency SIDs.
  • Figure 2B shows a segment routing based traceroute implementation with an adjacency SID in a label stack in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • the LSP is programmed with the label stack {999003, 999034, 999004} at reference 212, and the label-FEC mapping is shown in Table 5.
  • the correct node needs to validate the FEC corresponding to a received label, and the TTL needs to be set properly.
  • its MPLS ping server validates the packet and knows that it is a transit node for the LSP and replies to R1 with a traceroute-reply indicating its role as a transit node.
  • Upon receiving the traceroute-reply, R1 sends a second packet of the second traceroute sequence from R1 to R2.
  • the data plane instruction is swap as shown at reference 214, and it forwards the packet on towards R3, with the TTL for the outer label 999003 being reduced by one.
  • the packet is received at R3.
  • R3's MPLS ping server validates the packet and terminates segment 999003.
  • R3 examines the inner label 999034 and knows that it is a transit node for it, and R3 returns a traceroute-reply back to R1 with FEC-pop subTLV for FEC (3.3.3.3) and indicating its role as the transit node for the inner label 999034.
  • Upon receiving the traceroute-reply, R1 sends a third traceroute-request packet (from the third traceroute sequence) to R2, which performs swap and forwards the packet towards R3.
  • the outmost label 999003 is popped, and R3 additionally performs PHP for the adjacency segment 999034 at reference 216.
  • the PHP operation removes the label 999034, and label 999004 becomes the outmost label.
  • the remaining packet is sent towards R4.
  • R4's MPLS ping server validates the packet.
  • R4 knows that it is the egress of the LSP, and it replies to R1 with a traceroute-reply indicating so. The reply indicates that R4 terminates segment 999034 and segment 999004.
  • Upon receiving the traceroute-reply, R1 knows that egress R4 for the LSP has been reached, and R1 stops the current traceroute iterations. Note that with the adjacency SID being considered, the TTL for the active segment in traceroute is set to be two again to make the traceroute validation perform properly.
  • Figure 3 shows an example flow to set proper TTL values for segment routing in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • the traceroute operation starts, and it starts with the first traceroute sequence, and the active segment index is set to be one.
  • the traceroute return code indicates Ingress, and the current segment hop is set to be one.
  • the total label in the label stack is the number of labels pushed for the LSP at the network device (the headend).
  • a traceroute-request packet is sent (multiple copies of the traceroute packet in the same sequence may be sent).
  • the TTL value for any segment that is lower than the active segment index is set to be the maximum TTL value (e.g., 255).
  • the TTL value for the active segment (the segment for which the traceroute validation is performed, e.g., the segment identified by label 999003 in the second traceroute sequence) is set to the current segment hop.
  • if the active segment index is less than the number of the total labels in the stack, the TTL value for the segment at the immediate next index after the one for the active segment is set to one, and the TTL values for the ones afterward (if any) are set to zero.
  • the network device then waits for a traceroute reply packet from a downstream network device at reference 306. If no reply is received within a timeout period, the network device prints out traceroute timeout at reference 308, and increments the current segment hop and traceroute sequence number by one. The flow goes back to reference 304.
  • if a reply is received, the flow goes to reference 310, and the headend network device determines whether the reply has come from the LSP egress node (e.g., node R4 in the examples above). If it has, the traceroute stops at reference 350, and the traceroute validation results are printed out and saved.
  • when the headend network device determines that the reply has not come from the LSP egress node, the flow goes to reference 312, and the headend network device records the traceroute validation results from this reply (e.g., prints out and saves the results).
  • the headend network device sets the active segment index to be incremented by the number of FEC-pop subTLVs received from the traceroute reply and increments the sequence number by one.
  • the current segment hop is set to two if an FEC-pop subTLV is received in the reply; otherwise, the current segment hop is incremented by one.
  • the flow goes back to reference 304, where the next traceroute-request packet is transmitted.
  • the TTL value for the current active segment is set to two if the reply from the last active segment indicates that the last active segment has been terminated.
  • the termination of the last active segment is indicated by an FEC-pop subTLV in the reply in one embodiment.
  • the TTL value for the segment after the current active segment (i.e., the“next” segment) is set to one to accommodate any PHP operations on the current active segment label.
  • when the current active segment label operation is PHP on the node before the terminating node for the current active segment, this segment's label is stripped off and only the remaining packet is sent to the terminating node for the current active segment. So the TTL value of the label below the current active segment is set to 1, for the active segment terminating node to be able to validate the traceroute-request packet (e.g., through the MPLS ping server).
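  • The TTL-setting rules of the flow above can be summarized in the following sketch (Python; the function name, the 1-based indexing, and the maximum TTL of 255 are assumptions for illustration): labels before the active segment get the maximum TTL, the active segment gets the current segment hop, the label immediately below the active segment gets one (to accommodate PHP), and any remaining labels get zero.

```python
MAX_TTL = 255  # maximum TTL value assumed in this sketch

def ttls_for_sequence(num_labels, active_index, current_segment_hop):
    """Return a per-label TTL list for one traceroute-request sequence.

    active_index is 1-based (1 = outermost label), mirroring the flow of Figure 3.
    """
    ttls = []
    for i in range(1, num_labels + 1):
        if i < active_index:                 # segments already validated
            ttls.append(MAX_TTL)
        elif i == active_index:              # segment currently being traced
            ttls.append(current_segment_hop)
        elif i == active_index + 1:          # next segment: one, in case of PHP on the active label
            ttls.append(1)
        else:                                # remaining inner labels
            ttls.append(0)
    return ttls

# Example: a three-label stack {999002, 999003, 999004}, tracing the second segment
# with current segment hop 2, yields [255, 2, 1].
print(ttls_for_sequence(3, 2, 2))
```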
  • B-SID labels are used to steer packets through an IGP domain whose details are abstracted for a head-end node that is in a different domain, and they may also be used to reduce the label stack imposed on the head-end node.
  • Figure 4 shows a segment routing based traceroute implementation with a B-SID node in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • the network 400 includes nodes R1-R6, and their addresses are set to be 1.1.1.1 to 6.6.6.6 for simplicity of explanation.
  • R2 has allocated B-SID label 999026, which can be used by the head-end node R1 to steer traffic on LSP R2-R6.
  • the B-SID label remains unchanged even if its mapped R2-R6 LSP path changes due to some traffic engineering (TE)-metric change. In other words, any churn in the R2-R6 domain is hidden from the head-end node R1.
  • Figure 4 indicates the label stack and the label mapping operations at various nodes. These operations are performed when sequences of traceroute-request packets are sent from the headend node R1 through the LSP, reaching R2-R6.
  • Figure 5 shows the operations at each node for the sequences of traceroute packets per one embodiment of the invention.
  • the first sequence (reference 502) is sent with TTL values being set to be one and zero for labels 999002 and 999026, respectively.
  • R2 terminates segment 999002, and it also sends a FEC-push subTLV in a traceroute reply for FECs 4.4.4.4 and 5.5.5.5 corresponding to R4 and R5, the two other nodes corresponding to B-SID label 999026.
  • Upon receiving the reply from R2, R1 sends out the second sequence (reference 504) with TTL values being set to be 255 and two for labels 999002 and 999026, respectively.
  • R2 swaps the label to include the updated stack for the B-SID, the TTL setting becomes [(999004, 1), (999005, 0), (999006, 0)], and R2 sends the packet to R3.
  • Upon receiving the reply from R3, R1 sends out the third sequence (reference 506) with TTL values being set to be 255 and three for labels 999002 and 999026, respectively.
  • R2 swaps the label to include the updated stack for the B-SID, and the TTL setting becomes [(999004, 2), (999005, 1), (999006, 0)].
  • the TTL setting for the label stack is similar to that at a headend node (e.g., R1) according to Figure 3.
  • a swap operation is performed, the outer label TTL value is reduced by one, and the packet is sent towards R4.
  • Upon receiving the reply from R4, R1 sends out the fourth sequence (reference 508) with TTL values being set to be 255 and four for labels 999002 and 999026, respectively.
  • R2 swaps the label to include the updated stack for the B-SID, and the TTL setting becomes [(999004, 255), (999005, 2), (999006, 0)].
  • the outer label TTL value is reduced by one, and the packet is sent towards R4.
  • the outer label 999004 is popped, the TTL value of the inner label is reduced by one, the TTL setting becomes [(999005, 1), (999006, 0)], and the packet is sent towards R5.
  • Upon receiving the reply from R5, R1 sends out the fifth sequence (reference 510) with TTL values being set to be 255 and five for labels 999002 and 999026, respectively.
  • R2 swaps the label to include the updated stack for the B-SID, and the TTL setting becomes [(999004, 255), (999005, 255), (999006, 2)].
  • the outer label TTL value is reduced by one, and the packet is sent towards R4.
  • the outer label 999004 is popped, the TTL value of the inner label 999005 is reduced by one, the TTL setting becomes [(999005, 254), (999006, 2)], and the packet is sent towards R5.
  • the outer label 999005 is popped, the TTL value of the inner label 999006 is reduced by one, and the TTL setting becomes [(999006, 1)].
  • the packet is sent towards R6.
  • R1 stops the current traceroute iteration as the LSP egress node has been reached.
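  • The per-hop handling walked through above may be summarized with the following sketch (Python; the (label, TTL) list representation is an illustrative assumption). It models a swap (decrementing the outer TTL), a pop (decrementing the TTL of the newly exposed inner label, per the short pipe behavior described here), and PHP (removing the outer label without touching the inner TTL):

```python
# Minimal sketch of per-hop handling of an MPLS label stack represented as a list of
# (label, ttl) pairs, outermost first. Behavior follows the pop/swap/PHP descriptions
# in this disclosure (short pipe model); the representation itself is an assumption.

def swap(stack, new_label):
    """SWAP: replace the outer label and decrement its TTL by one."""
    label, ttl = stack[0]
    return [(new_label, ttl - 1)] + stack[1:]

def pop(stack):
    """POP: expose the inner label and decrement its TTL by one."""
    inner = stack[1:]
    if inner:
        label, ttl = inner[0]
        inner = [(label, ttl - 1)] + inner[1:]
    return inner

def php(stack):
    """PHP: remove the outer label without decrementing the inner-label TTL."""
    return stack[1:]

# Fifth sequence of Figure 5 after R2's B-SID expansion:
stack = [(999004, 255), (999005, 255), (999006, 2)]
stack = pop(stack)   # at R4: [(999005, 254), (999006, 2)]
stack = pop(stack)   # at R5: [(999006, 1)]
print(stack)
```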
  • FIG. 6A shows a data structure to announce a binding segment identifier (B-SID) per one embodiment of the invention.
  • the B-SID indicator TLV has a type of BSID-indicator-TLV, and its length is indicated using a number of bits, and the value is indicated using BSID-LABEL.
  • a node replying to a traceroute-request from a headend node may use the TLV to notify the headend that it has processed a binding SID and populates the corresponding B-SID label in this TLV. Additionally, the replying node (either the headend or the ingress node that directs TTL settings) may indicate the TTL setting for the B-SID segments, and it uses another TLV to indicate the TTL values.
  • FIG. 6B shows a data structure to indicate a TTL setting for nodes within a binding segment identifier (B-SID) domain per one embodiment of the invention.
  • the FEC TTL TLV may pass a set of TTL values for pushed FECs and the current active segment for a B-SID policy at its ingress node (e.g., node R2 in Figure 4).
  • the FEC TTL TLV has a type of FEC-TTL-TLV (or other suitable alternative), and its length is indicated using a number of bits, and the value is indicated using a list of TTL values (e.g., each value may be indicated using eight bits).
  • a headend node may use the FEC TTL TLV to indicate to the B-SID ingress node the TTL values to set for the swapped label stack.
  • a headend node label stack may comprise additional labels below the B-SID labels (e.g., for traversal after the B-SID domain); these inner-labels are not expected to be processed by any nodes in the LSP path when the segment corresponding to a B-SID label is active. These inner-labels and their corresponding TTL values would be set at a headend node and will be passed transparently through B-SID nodes.
  • the headend node may use B-SID label values provided in the B-SID indicator TLV to populate the FEC TTL TLV for the next sequence of traceroute-request packets.
  • a headend node is expected to parse the B-SID indicator TLV if it is present (e.g., in a UDP ping payload), and the TLV may be set to be the first TLV in an MPLS ping payload if it is present.
  • a node may use the values from the B-SID indicator TLV and FEC TTL TLV to set the TTL values for labels in an outgoing MPLS stack irrespective of the MPLS TTL values in the incoming packet.
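  • A minimal encoding sketch of the two TLVs follows (Python; the numeric type codes, the 16-bit type/length layout, and the one-octet TTL entries are assumptions for illustration, since the disclosure names the TLVs but does not fix a concrete wire format):

```python
import struct

# Hypothetical type codes; the disclosure names the TLVs but does not assign numbers.
TYPE_BSID_INDICATOR = 0x7001
TYPE_FEC_TTL = 0x7002

def encode_bsid_indicator_tlv(bsid_label):
    """B-SID indicator TLV: the value carries the B-SID label (encoded in 4 octets here)."""
    value = struct.pack("!I", bsid_label)
    return struct.pack("!HH", TYPE_BSID_INDICATOR, len(value)) + value

def encode_fec_ttl_tlv(bsid_label, ttl_list):
    """FEC TTL TLV: the value carries the B-SID label followed by one TTL octet per pushed
    FEC and the current active segment, as described above."""
    value = struct.pack("!I", bsid_label) + bytes(ttl_list)
    return struct.pack("!HH", TYPE_FEC_TTL, len(value)) + value

# Example: headend R1 tells B-SID ingress R2 the TTLs {1, 1, 0} for the swapped stack.
print(encode_bsid_indicator_tlv(999026).hex())
print(encode_fec_ttl_tlv(999026, [1, 1, 0]).hex())
```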
  • an operator may want to hide the details of its domain from a headend (e.g., the headend is managed by a different operator).
  • a node (e.g., R2) may avoid sending a FEC-push subTLV in a traceroute reply (which denotes additional segments used by the packet while traversing the R2 domain).
  • forwarding at the node (e.g., R2) may set a maximum value for TTL (e.g., 255) for each of the newly imposed segment labels (i.e., the ones corresponding to new FECs added) in the label stack.
  • the inner-label TTL (the TTL for the segment below the newly imposed segments) is populated as one by the head-end node (e.g., R1 in our case, where the underlying IP destination address is in the 127/8 range), irrespective of the TTL value in the incoming label, once the node detects a packet to be an MPLS ping or traceroute packet (e.g., when the packet transmits through UDP port 3503).
  • if an incoming label {A} is swapped with the outgoing label stack {C, D, E} at R2, the maximum TTL value is set for labels C and D (the outmost labels), and one is set for label E.
  • the short pipe model may be broken, and once a node (e.g., in the data plane) detects that a packet is an MPLS ping or traceroute request, on any label pop or PHP action, the outer-label TTL is copied to the inner label.
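  • The TTL handling for such a hiding node can be sketched as follows (Python; a simplified model under the assumptions stated above, e.g., that the node has already classified the packet as an MPLS ping or traceroute packet, and with illustrative label values):

```python
MAX_TTL = 255

def hide_domain_swap(incoming_stack, new_labels):
    """Swap the incoming top label with new_labels while hiding the domain's detail:
    every newly imposed label except the innermost gets MAX_TTL, the innermost gets one,
    and any remaining (pre-existing) inner labels are carried through unchanged."""
    imposed = [(label, MAX_TTL) for label in new_labels[:-1]] + [(new_labels[-1], 1)]
    return imposed + incoming_stack[1:]

# Example from the text: incoming label {A} is swapped with {C, D, E} at R2.
# C and D (the outer labels) get the maximum TTL, and E gets one.
A, C, D, E = 999100, 999101, 999102, 999103   # illustrative label values only
print(hide_domain_swap([(A, 2)], [C, D, E]))  # [(999101, 255), (999102, 255), (999103, 1)]
```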
  • Each of the B-SID indicator TLV and FEC TTL TLV may be implemented as a sub- TLV within a traceroute-reply packet.
  • Figure 6C shows an example of a traceroute-reply packet including B-SID information per one embodiment of the invention.
  • the traceroute-reply packet 600 includes a B-SID indicator TLV 652 and optionally FEC-pop subTLV 654.
  • the FEC TTL TLV (may also referred to as a TTL-array) passes TTL values to be populated in traceroute-request packet when the packet traverses another SR domains.
  • the B- SID indicator TLV indicates traversal to another SR domain to a LSP headend node.
  • TLVs may also be referred to as sub-TLVs
  • a single TLV may be implemented to include the information indicated in the B-SID indicator TLV and the FEC TTL TLV.
  • the B-SID policy announcement may be implemented in other data structures such as a map, a list, an array, a file, a table, a database, etc.
  • Figure 4 shows a network with six nodes, and R2 is the B-SID node. Embodiments of the invention also support more complicated networks.
  • Figure 7 shows a segment routing based traceroute implementation with two B-SID nodes in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • Network 700 contains 14 nodes, nodes R1-R14.
  • Nodes R3 and R8 are B-SID nodes.
  • R3 advertises binding SID LBS for an SR-TE LSP to traverse the R3-R8 domain via SR-TE LSP-R3-R8 with an out-label stack {L5, L7, L8} (similar to labels 999003-999034 discussed herein above).
  • R8 advertises binding SID LB14 to traverse the R8-R14 domain via SR-TE LSP-R8-R14 with an out-label stack {L10, L14}.
  • a PCE may program the headend node LSP with a label stack {L3, LBS, and LB14} corresponding to FECs {F3, F8, and F14}.
  • Each Li corresponds to a node-SID (e.g., FEC Fi) on a corresponding Ri.
  • the label stack steers the headend LSP traffic onto SR-TE LSP-R3-R8 at R3 and SR-TE LSP-R8-R14 at R8 while traversing the R3-R8 and R8-R14 domains.
  • a series of sequences of traceroute-request packets and their corresponding replies may be exchanged.
  • R1 sends a traceroute-request packet with LSP label stack [(L3, 1), ...].
  • R2 replies with a traceroute-reply indicating its role as 'transit' for L3.
  • R1 sends a traceroute-request packet with LSP label stack [(L3, 2), (LBS, 1), (LB14, 0)].
  • R2 reduces the TTL count of the outmost label and transmits the packet to R3.
  • R3 replies with a traceroute-reply indicating (1) R3 terminates the outmost segment F3 (sending a FEC-pop subTLV) and (2) its role as 'transit' for the inner label LBS. Additionally, it sends a FEC-push subTLV for {F5, F7}, and an indication that the reply is from a B-SID node and label LBS is a B-SID label.
  • Upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LBS, 2), (LB14, 1)]. Additionally, R1 will also populate the FEC-TTL-TLV with B-SID label LBS, which was received in the B-SID indicator TLV from the reply, as well as TTL {1, 1, 0} for the FECs {F5, F7, F8} (i.e., the newly pushed segments and the current active segment at the headend). R2 reduces the TTL count of the outmost label and transmits the packet to R3.
  • On receiving the packet, R3 determines that it is an MPLS ping/traceroute packet (e.g., received through dest-port 3503) and that the label being looked up is the B-SID label LBS. R3 then checks for the presence of the FEC-TTL-TLV, and whether the label present in this TLV matches the B-SID label LBS in the traceroute-request. If so, R3 uses the TTL values in this TLV to populate TTL values for the swapped label stack.
  • the out-label stack for the packet sent from R3 would be [(L5, 1), (L7, 1), (L8, 0)] while the bottom label in the label stack would be (LB14, 1), being retained unchanged from what was passed by R1 (the bottom label is omitted in this discussion until the B-SID egress node R8 is reached).
  • R4 replies to this traceroute-request and indicates its role as 'transit.'
  • Upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LBS, 3), (LB14, 1)]. Additionally, R1 will also populate the FEC-TTL-TLV with B-SID label LBS, which was received in the B-SID indicator TLV from the reply, as well as TTL {2, 1, 0} for the FECs {F5, F7, F8} (i.e., the newly pushed segments and the current active segment at the headend). R3 populates the swapped label stack with the TTL values in this FEC-TTL-TLV. The packet transmits along the LSP and reaches R5, which replies to the traceroute-request indicating its role as 'transit' for the inner label and also includes a FEC-pop subTLV for FEC {F5}.
  • Upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LBS, 4), (LB14, 1)]. Additionally, R1 will also populate the FEC-TTL-TLV with B-SID label LBS, which was received in the B-SID indicator TLV from the reply, as well as TTL {255, 2, 1} for the FECs {F5, F7, F8}. The FEC-TTL-TLV continues to be populated as long as the B-SID segment is active. R3 uses the TTL values in this TLV to populate the label stack. The packet transmits along the LSP and reaches R6, which replies to the traceroute-request indicating its role as 'transit.'
  • Upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LBS, 5), (LB14, 1)]. Additionally, R1 will also populate the FEC-TTL-TLV with B-SID label LBS, which was received in the B-SID indicator TLV from the reply, as well as TTL {255, 3, 1} for the FECs {F5, F7, F8}. The FEC-TTL-TLV continues to be populated as long as the B-SID segment is active. The packet transmits along the LSP and reaches R7, which replies to the traceroute-request indicating its role as 'transit' for the inner label and also includes a FEC-pop subTLV for FEC {F7}.
  • Upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LBS, 6), (LB14, 1)]. Additionally, R1 will also populate the FEC-TTL-TLV with B-SID label LBS, which was received in the B-SID indicator TLV from the reply, as well as TTL {255, 255, 2} for the FECs {F5, F7, F8}. The FEC-TTL-TLV continues to be populated as long as the B-SID segment is active.
  • the packet transmits along the LSP and reaches R8, which replies to the traceroute-request indicating its role as 'transit' for the inner label (LB14) and also includes a FEC-pop subTLV for FEC {F8}. Additionally, since R8 checks the next label (LB14) received in the traceroute-request packet, it knows that the next label is a B-SID, and its reply to R1 indicates a B-SID indicator TLV for {LB14}. R8 also sends a FEC-push subTLV for FEC {F10}, which is the segment R8 will push on top of the packet that ingresses with in-label LB14.
  • Upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LBS, 255), (LB14, 2)]. Additionally, R1 will also populate the FEC-TTL-TLV with B-SID label LB14, which was received in the B-SID indicator TLV from the reply, as well as TTL {1, 1} for the FECs {F10, F14} (i.e., the newly pushed segments and the current active segment at the headend). R2 reduces the TTL count of the outmost label and transmits the packet to R3.
  • On receiving the packet, R3 determines that it is an MPLS ping/traceroute packet (e.g., received through dest-port 3503), yet the B-SID label being looked up, LBS, is not a label in its FEC-TTL-TLV, which contains only {L5, L7, L8}. R3 then populates the TTL value of the outer-label in the packet to the maximum value, MAX-TTL. With the outer-label having the maximum TTL value, the packet may traverse the R3-R8 domain without being parsed.
  • the outer-label of the packet matches the local B-SID label in the FEC-TTL-TLV, and R8 uses the TTL values from the TLV to populate its out-label stack.
  • the packet is forwarded towards R9, which replies to this traceroute-request and indicates its role as 'transit.'
  • R1 populates the FEC-TTL-TLV TTL values with TTL {2, 1} for the FECs {F10, F14}.
  • R10 replies with a FEC-pop subTLV for {F10}.
  • R1 populates the FEC-TTL-TLV TTL values with TTL {255, 2} for the FECs {F10, F14}.
  • R11 replies, indicating its role as 'transit.'
  • R1 populates the FEC-TTL-TLV TTL values with TTL {255, 3} for the FECs {F10, F14}.
  • R12 replies, indicating its role as 'transit.'
  • R1 populates the FEC-TTL-TLV TTL values with TTL {255, 4} for the FECs {F10, F14}.
  • R13 replies, indicating its role as 'transit.'
  • R1 populates the FEC-TTL-TLV TTL values with TTL {255, 5} for the FECs {F10, F14}.
  • R14 replies, indicating its role as the 'egress,' and the traceroute iteration completes.
  • Embodiments of the invention may also be extended to the scenario where a B-SID domain stitches to an LSP in another domain with a second B-SID label.
  • the SR-TE LSP-R3-R14 at R3 may have the out-label stack {L5, L8, LBS-14}. LBS-14 is a B-SID label allocated by R8 to traverse the R8-R14 domain via SR-TE LSP-R8-R14 with an out-label stack {L10, L14}. R3 may allocate a B-SID for LSP-R3-R14.
  • the PCE may program the headend LSP with a label stack of L3 and the B-SID label allocated by R3 for LSP-R3-R14, corresponding to FECs {F3, F14}, to steer headend LSP traffic onto LSP-R3-R14 at R3.
  • the same approach discussed herein above may be used to determine the B-SID label transition and populate the FEC-TTL-TLV, which is applicable from the new B-SID node.
  • the operator may choose to override the TTL passed in the traceroute-request and populate MAX-TTL in the out-label stack populated at the downstream B-SID node.
  • the next traceroute sequence would start tracing the LSP from the egress of the B-SID domain if there are more labels in the LSP label stack at the headend node.
  • the packet may be sent to the control plane of the node (e.g., to its MPLS ping server) for appropriate processing when the ingress label being looked up is a B-SID. Additional overhead may be added for such processing.
  • FIG. 8 is a flow diagram showing the operations to set proper TTL values for segment routing based traceroute implementation in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
  • Method 800 may be performed by a network device (e.g., node R1) to verify an LSP as discussed herein above.
  • the network device performs traceroute using a stack of labels, and each label in the stack identifies a segment of the LSP in segment routing in the MPLS network.
  • the implementation is performed in an MPLS network operating in the short pipe model in one embodiment.
  • the network device sets a first time-to-live (TTL) count for a first segment to be a first value, wherein the first segment is identified by a first label in the stack of labels.
  • the first label is the outmost label (also referred to as the outer label) of the stack of labels in one embodiment.
  • the first value may be one when the first segment is the very first segment in the LSP for which the traceroute validation is to be performed.
  • the first value is another integer when the first segment is a later segment in the LSP where the earlier segments have been validated.
  • the network device sends a first sequence of traceroute packets including the first label and the first TTL count to verify the first segment.
  • the downstream network devices will perform label operations, and/or parse the traceroute packets (e.g., traceroute-request packets), and reply (e.g., with traceroute-reply packets).
  • the first sequence of traceroute packets may be the traceroute sequence one discussed herein above relating to table 2, table 4, or table 6 in some embodiments.
  • the network device receives a reply for the first sequence of traceroute packets from a second network device (e.g., a traceroute-reply packet from an LSP downstream node).
  • the network device sets a second TTL count for a second segment to be a second value, where the second segment is the immediate next segment of the first segment as indicated by a second label immediate next to the first label in the stack of labels, and where the second value is set based on the reply for the first sequence of traceroute packets.
  • the network device sends a second sequence of traceroute packets including the second label and the second TTL count to verify the second segment.
  • the second label is the inner label immediately next to and below the first label in the label stack (the very next inner label). The second label is for verification of the second segment as discussed herein above.
  • the second value is set to two when the reply for the first sequence of traceroute packets indicates that the second network device is a termination point for the first label; otherwise, the second value is set to be the first value incremented by one (a minimal sketch of this rule follows this list).
  • the second network device is indicated as the termination point when the second network device has performed a label pop in one embodiment.
  • the reply for the first sequence of traceroute packets includes a forwarding equivalence class (FEC) type-length-value (TLV) indicating that the second network device has performed the label pop.
  • the second sequence of traceroute packets further includes a set of TTL counts for labels in the stack of labels.
  • a third TTL count for a third label is set to be one, where the third label is immediate next to the second label in the stack of labels.
  • the third label is for verifying a third segment immediate next to the second segment on the LSP path.
  • the second sequence of traceroute packets may be the traceroute sequence two discussed herein above relating to table 2, table 4, or table 6 in some embodiments.
  • the set of TTL counts include one or more TTL counts for labels identifying segments that are verified earlier in the traceroute verification sequences, and these TTL values are set to a maximum value (e.g., MAX-TTL) in one embodiment.
  • the stack of labels may include one or more labels below the third label, and the TTL values for these labels are set to zero.
  • the reply for the first sequence of traceroute packets indicates a binding SID (B-SID) in some embodiments, and the network device populates values for a set of TTL counts, each count for a label included in the B-SID.
  • the reply indicating the B-SID uses a first type-length-value (TLV) (e.g., a B-SID indicator TLV 652).
  • the network device provides the values for the set of TTL counts, and the values are indicated using a second TLV (e.g., a FEC TTL TLV in Figure 6B).
  • the network device provides the values to a B-SID ingress node (e.g., node R2 in Figure 6) to set TTL values for swapped label stack at the B-SID ingress node.
  • the second network device (e.g., the B-SID ingress node such as node R2 in Figure 6) updates the values for the set of TTL counts for the labels included in the B-SID based on a TTL setting for the B-SID. For example, R3 and R8 update the TTL values as discussed herein above relating to Figure 7.
  • the network device will receive another reply (after the transit segments within the LSP are verified), which indicates that a label switched path (LSP) egress network device has been reached, and the network device will then print out traceroute results for the stack of labels and terminate the traceroute as discussed herein above.
  • Embodiments of the invention are straightforward to implement, and the data structure implementation for announcing the B-SID policy can be performed using known data structures.
  • the configuration of the data structures and the corresponding TTL setting may be fine-tuned per operator preference; thus, the extension to the traceroute implementation to support segment routing allows an MPLS network using segment routing to be more readily maintained.
  • FIG. 9 shows a network device implementing the packet forwarding per one embodiment of the invention.
  • the network device 902 may be implemented using custom application-specific integrated-circuits (ASICs) as processors and a special-purpose operating system (OS), or common off-the-shelf (COTS) processors and a standard OS.
  • the network device 902 includes hardware 940 comprising a set of one or more processors 942 (which are typically COTS processors or processor cores or ASICs) and physical NIs 946, as well as non-transitory machine-readable storage media 949 having stored therein software 950.
  • the one or more processors 942 may execute the software 950 to instantiate one or more sets of one or more applications 964A-R. While one embodiment does not implement virtualization, alternative embodiments may use different forms of virtualization.
  • the virtualization layer 954 represents the kernel of an operating system (or a shim executing on a base operating system) that allows for the creation of multiple instances 962A-R called software containers (also called virtualization engines, virtual private servers, or jails) that may each be used to execute one (or more) of the sets of applications 964A-R.
  • the set of applications running in a given user space cannot access the memory of the other processes.
  • the virtualization layer 954 represents a hypervisor (sometimes referred to as a virtual machine monitor (VMM)) or a hypervisor executing on top of a host operating system, and each of the sets of applications 964A-R runs on top of a guest operating system within an instance 962A-R called a virtual machine (which may in some cases be considered a tightly isolated form of software container) that runs on top of the hypervisor; the guest operating system and application may not know that they are running on a virtual machine as opposed to running on a "bare metal" host network device, or, through para-virtualization, the operating system and/or application may be aware of the presence of virtualization for optimization purposes.
  • one, some, or all of the applications are implemented as unikernel(s), which can be generated by compiling directly with an application only a limited set of libraries (e.g., from a library operating system (LibOS) including drivers/libraries of OS services) that provide the particular OS services needed by the application.
  • a unikernel can be implemented to run directly on hardware 940, directly on a hypervisor (in which case the unikernel is sometimes described as running within a LibOS virtual machine), or in a software container.
  • embodiments can be implemented fully with unikernels running directly on a hypervisor represented by virtualization layer 954, unikernels running within software containers represented by instances 962A-R, or as a combination of unikernels and the above-described techniques (e.g., unikernels and virtual machines both run directly on a hypervisor, unikernels and sets of applications that are run in different software containers).
  • the software 950 contains a traceroute controller 951 that performs operations described with reference to Figures 1-6.
  • the traceroute controller 951 may be instantiated within the applications 964 A-R.
  • the instantiation of the one or more sets of one or more applications 964A-R, as well as virtualization if implemented, are collectively referred to as software instance(s) 952.
  • a network interface may be physical or virtual.
  • an interface address is an IP address assigned to a NI, be it a physical NI or virtual NI.
  • a virtual NI may be associated with a physical NI, with another virtual interface, or stand on its own (e.g., a loopback interface, a point-to-point protocol interface).
  • a NI (physical or virtual) may be numbered (a NI with an IP address) or unnumbered (a NI without an IP address).
  • any appropriate steps, methods, features, functions, or benefits disclosed herein may be performed through one or more functional units or modules of one or more virtual apparatuses.
  • Each virtual apparatus may comprise a number of these functional units.
  • These functional units may be implemented via processing circuitry, which may include one or more microprocessor or microcontrollers, as well as other digital hardware, which may include digital signal processors (DSPs), special-purpose digital logic, and the like.
  • the processing circuitry may be configured to execute program code stored in memory, which may include one or several types of memory such as read-only memory (ROM), random-access memory (RAM), cache memory, flash memory devices, optical storage devices, etc.
  • Program code stored in memory includes program instructions for executing one or more telecommunications and/or data communications protocols as well as instructions for carrying out one or more of the techniques described herein.
  • the processing circuitry may be used to cause the respective functional unit to perform corresponding functions according to one or more embodiments of the present disclosure.
  • the term unit may have conventional meaning in the field of electronics, electrical devices, and/or electronic devices and may include, for example, electrical and/or electronic circuitry, devices, modules, processors, memories, logic, solid state and/or discrete devices, computer programs or instructions for carrying out respective tasks, procedures, computations, outputs, and/or displaying functions, and so on, such as those that are described herein.
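As referenced above, the TTL rule in these items can be summarized by a minimal Python sketch; the function name and the reply flag below are placeholders introduced only for this sketch and are not part of the embodiments.

    def second_ttl(first_value, reply_has_fec_pop):
        # Two when the reply indicates the replying node is a termination point for
        # the first label (e.g., it returned a FEC-pop subTLV); otherwise the first
        # value incremented by one.
        return 2 if reply_has_fec_pop else first_value + 1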

Abstract

Methods and systems to extend segment routing based traceroute in a Multiprotocol Label Switching (MPLS) network. In one embodiment, a method comprises: setting a first time-to-live (TTL) count for a first segment to be a first value, where the first segment is identified by a first label in the stack of labels; and sending a first sequence of traceroute packets including the first label and the first TTL count to verify the first segment. The method further comprises: receiving a reply from a second network device; setting a second TTL count for a second segment to be a second value, where the second segment is the immediate next segment of the first segment, and where the second value is set based on the reply; and sending a second sequence of traceroute packets including the second label and the second TTL count to verify the second segment.

Description

SPECIFICATION
METHOD AND SYSTEM TO EXTEND SEGMENT ROUTING BASED TRACEROUTE IN A MULTIPROTOCOL LABEL SWITCHING NETWORK
TECHNICAL FIELD
[0001] Embodiments of the present disclosure relate to the field of networking, and more specifically, relate to methods and systems to extend segment routing based traceroute in a Multiprotocol Label Switching (MPLS) network.
BACKGROUND ART
[0002] Multiprotocol Label Switching (MPLS) is a packet forwarding protocol. MPLS directs data from one node to the next based on short path labels rather than long network addresses, thus avoiding complex lookups in a routing table. MPLS can encapsulate packets of various network protocols and has been deployed widely in communications networks. In an MPLS network, segment routing may be implemented, where an ingress node prepends a header to packets that contain a list of segments, which are instructions that are executed on subsequent nodes in the MPLS network.
[0003] An MPLS network, like any other communications network, requires operations, administration, and management (OAM). In the OAM, traceroute is a popular network diagnostic tool for displaying the route (path) and measuring transit delays of packets across an Internet Protocol (IP) network. The history of the route is recorded as the round-trip times of the packets received from each successive host (remote node) in the route (path). Using traceroute to validate segments in segment routing requires that a time-to-live (TTL) count be set properly, and the TTL setting proposed so far in the Internet Engineering Task Force (IETF) requests for comments (RFCs) may not work in some scenarios, particularly where segment routing is performed.
SUMMARY
[0004] Embodiments of the invention offer efficient ways to extend segment routing based traceroute in a Multiprotocol Label Switching (MPLS) network. In one embodiment, a method is performed by a network device in a MPLS network, where the network device performs traceroute using a stack of labels and each label identifies a segment in segment routing in the MPLS network. The method comprises: setting a first time-to-live (TTL) count for a first segment to be a first value, where the first segment is identified by a first label in the stack of labels; and sending a first sequence of traceroute packets including the first label and the first TTL count to verify the first segment. The method further comprises: receiving a reply for the first sequence of traceroute packets from a second network device; setting a second TTL count for a second segment to be a second value, where the second segment is the immediate next segment of the first segment as indicated by a second label immediate next to the first label in the stack of labels, and where the second value is set based on the reply for the first sequence of traceroute packets; and sending a second sequence of traceroute packets including the second label and the second TTL count to verify the second segment.
[0005] Embodiments of the invention include network devices to extend segment routing based traceroute in a Multiprotocol Label Switching (MPLS) network. In one embodiment, a network device comprising a processor and computer-readable storage medium is disclosed. The computer-readable storage medium provides instructions that, when executed by the processor, cause the network device to perform: setting a first time-to-live (TTL) count for a first segment to be a first value, where the first segment is identified by a first label in the stack of labels; and sending a first sequence of traceroute packets including the first label and the first TTL count to verify the first segment. The network device is further to perform: receiving a reply for the first sequence of traceroute packets from a second network device; setting a second TTL count for a second segment to be a second value, where the second segment is the immediate next segment of the first segment as indicated by a second label immediate next to the first label in the stack of labels, and where the second value is set based on the reply for the first sequence of traceroute packets; and sending a second sequence of traceroute packets including the second label and the second TTL count to verify the second segment.
[0006] Embodiments of the invention include computer-readable storage media that provide instructions (e.g., computer program) that, when executed by a processor of an electronic device, cause the electronic device to perform operations comprising one or more methods of the embodiments of the invention.
[0007] Through embodiments of the invention, the time-to-live (TTL) count may be set dynamically based on feedback received from another network device, and properly setting TTL values allows a traceroute to operate properly for segment routing in an MPLS network.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] The invention may best be understood by referring to the following description and accompanying drawings that illustrate embodiments of the invention. [0009] Figure 1A shows a traceroute implementation in a Multi-Protocol Label Switching (MPLS) network.
[0010] Figure 1B shows a segment routing based traceroute implementation in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
[0011] Figure 2A shows a segment routing based traceroute implementation with PHP operations in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
[0012] Figure 2B shows a segment routing based traceroute implementation with an adjacency SID in a label stack in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
[0013] Figure 3 shows an example flow to set proper TTL values for segment routing in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
[0014] Figure 4 shows a segment routing based traceroute implementation with a B-SID node in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
[0015] Figure 5 shows the operations at each node for the sequences of traceroute packets per one embodiment of the invention.
[0016] Figure 6A shows a data structure to announce a binding segment identifier (B-SID) per one embodiment of the invention.
[0017] Figure 6B shows a data structure to indicate a TTL setting for nodes within a binding segment identifier (B-SID) domain per one embodiment of the invention.
[0018] Figure 6C shows an example of a traceroute-reply packet including B-SID information per one embodiment of the invention.
[0019] Figure 7 shows a segment routing based traceroute implementation with two B-SID nodes in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
[0020] Figure 8 is a flow diagram showing the operations to set proper TTL values for segment routing based traceroute implementation in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention.
[0021] Figure 9 shows a network device implementing the packet forwarding per one embodiment of the invention.
DETAILED DESCRIPTION
[0022] The following description describes methods, apparatus, and computer programs to dynamically set a time-to-live (TTL) count based on feedback received from another network device thus allow a traceroute to operate properly for segment routing in a Multiprotocol Label Switching (MPLS) network. In the following description, numerous specific details such as logic implementations, opcodes, means to specify operands, resource
partitioning/sharing/duplication implementations, types and interrelationships of system components, and logic partitioning/integration choices are set forth to provide a more thorough understanding of the present invention. One skilled in the art will appreciate, however, that the invention may be practiced without such specific details. In other instances, control structures, gate level circuits, and full software instruction sequences have not been shown in detail in order not to obscure the invention. Those of ordinary skill in the art, with the included descriptions, will be able to implement proper functionality without undue experimentation.
[0023] Bracketed text and blocks with dashed borders (such as large dashes, small dashes, dot- dash, and dots) may be used to illustrate optional operations that add additional features to the embodiments of the invention. Such notation, however, should not be taken to mean that these are the only options or optional operations, and/or that blocks with solid borders are not optional in some embodiments of the invention.
Terms
[0024] Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa. Other objectives, features, and advantages of the enclosed embodiments will be apparent from the following description.
[0025] References in the specification to“one embodiment,”“an embodiment,”“an example embodiment,” and so forth, indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to affect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
[0026] The following description and claims may use the terms“coupled” and“connected,” along with their derivatives. These terms are not intended as synonyms for each other.
“Coupled” is used to indicate that two or more elements, which may or may not be in direct physical or electrical contact with each other, co-operate or interact with each other.
“Connected” is used to indicate the establishment of wireless or wireline communication between two or more elements that are coupled with each other. A“set,” as used herein, refers to any positive whole number of items including one item.
[0027] An electronic device stores and transmits (internally and/or with other electronic devices over a network) code (which is composed of software instructions and which is sometimes referred to as a computer program code or a computer program) and/or data using machine-readable media (also called computer-readable media), such as machine-readable storage media (e.g., magnetic disks, optical disks, solid state drives, read only memory (ROM), flash memory devices, phase change memory) and machine-readable transmission media (also called a carrier) (e.g., electrical, optical, radio, acoustical or other form of propagated signals - such as carrier waves, infrared signals). Thus, an electronic device (e.g., a computer) includes hardware and software, such as a set of one or more processors (e.g., of which a processor is a microprocessor, controller, microcontroller, central processing unit, digital signal processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), other electronic circuitry, or a combination of one or more of the preceding) coupled to one or more machine-readable storage media to store code for execution on the set of processors and/or to store data. For instance, an electronic device may include non-volatile memory containing the code since the non-volatile memory can persist code/data even when the electronic device is turned off (when power is removed). When the electronic device is turned on, that part of the code that is to be executed by the processor(s) of the electronic device is typically copied from the slower non-volatile memory into volatile memory (e.g., dynamic random-access memory (DRAM), static random-access memory (SRAM)) of the electronic device. Typical electronic devices also include a set of one or more physical network interface(s) (NI(s)) to establish network connections (to transmit and/or receive code and/or data using propagating signals) with other electronic devices. For example, the set of physical NIs (or the set of physical NI(s) in combination with the set of processors executing code) may perform any formatting, coding, or translating to allow the electronic device to send and receive data whether over a wired and/or a wireless connection. In some embodiments, a physical NI may comprise radio circuitry capable of (1) receiving data from other electronic devices over a wireless connection and/or (2) sending data out to other devices through a wireless connection. This radio circuitry may include transmitter(s), receiver(s), and/or transceiver(s) suitable for radio frequency communication. The radio circuitry may convert digital data into a radio signal having the proper parameters (e.g., frequency, timing, channel, bandwidth, and so forth). The radio signal may then be transmitted through antennas to the appropriate recipient(s). In some embodiments, the set of physical NI(s) may comprise network interface controller(s) (NICs), also known as a network interface card, network adapter, or local area network (LAN) adapter. The NIC(s) may facilitate in connecting the electronic device to other electronic devices allowing them to communicate with wire through plugging in a cable to a physical port connected to a NIC. One or more parts of an embodiment of the invention may be implemented using different combinations of software, firmware, and/or hardware.
[0028] A network device (ND) (also referred to as a network node or node, and these terms are used interchangeably in this disclosure) is an electronic device in a communications network. The network device (e.g., a router, switch, and bridge) is a piece of networking equipment, including hardware and software that communicatively interconnects other equipment on the network (e.g., other network devices, end systems). Some network devices are“multiple services network devices” that provide support for multiple networking functions (e.g., routing, bridging, VLAN (virtual LAN) switching, Layer 2 aggregation, session border control, Quality of Service, and/or subscriber management), and/or provide support for multiple application services (e.g., data, voice, and video). Subscriber end systems (e.g., servers, workstations, laptops, netbooks, palm tops, mobile phones, smartphones, multimedia phones, Voice Over Internet Protocol (VOIP) phones, user equipment, terminals, portable media players, GPS units, gaming systems, set-top boxes) access content/services provided over the Internet and/or content/services provided on virtual private networks (VPNs) overlaid on (e.g., tunneled through) the Internet. The content and/or services are typically provided by one or more end systems (e.g., server end systems) belonging to a service or content provider or end systems participating in a peer to peer service, and may include, for example, public webpages (e.g., free content, store fronts, search services), private webpages (e.g., username/password accessed webpages providing email services), and/or corporate networks over VPNs. Typically, subscriber end systems are coupled (e.g., through customer premise equipment coupled to an access network (wired or wirelessly)) to edge network devices, which are coupled (e.g., through one or more core network devices) to other edge network devices, which are coupled to other end systems (e.g., server end systems). A network device is generally identified by its media access (MAC) address, Internet protocol (IP) address/subnet, network sockets/ports, and/or upper OSI layer identifiers.
[0029] Segment routing (SR) is a routing protocol that leverages the source routing paradigm. A node steers a packet through an SR Policy instantiated as an ordered list of instructions called segments. A segment can represent any instruction, topological or service-based. A segment can have a semantic local to an SR node or global within an SR domain. SR supports per-flow explicit routing while maintaining per-flow state only at the ingress nodes to the SR domain. SR-MPLS is an instantiation of SR on a multiprotocol label switching (MPLS) data plane.
[0030] A segment is an instruction a node executes on the incoming packet (e.g., forward packet according to shortest path to destination, or, forward packet through a specific interface, or, deliver the packet to a given application/service instance). A segment may be an interior gateway protocol (IGP) segment, which is a segment attached to a piece of information advertised by a link-state IGP, e.g., an IGP prefix or an IGP adjacency.
[0031] A segment is often referred to by its Segment Identifier (SID). An SR-MPLS SID is an MPLS label or an index value into an MPLS label space explicitly associated with a segment identified by the SID. A SID may be a node SID (or node-SID), which identifies a node segment (corresponding to a network device (node)), or the SID may be an adjacency SID (or Adj-SID), which identifies an adjacency segment.
[0032] A segment routing domain (SR domain) is a set of nodes participating in the source-based routing model. These nodes may be connected to the same physical infrastructure (e.g., a Service Provider's network). They may as well be remotely connected to each other (e.g., an enterprise VPN or an overlay). If multiple protocol instances are deployed, the SR domain most commonly includes all of the protocol instances in a network. However, some deployments may wish to sub-divide the network into multiple SR domains, each of which includes one or more protocol instances. It is expected that all nodes in an SR Domain are managed by the same administrative entity.
[0033] An active segment is a segment that is used by a receiving network device (e.g., a router) to process a packet containing the segment. In the MPLS data plane, it is the top label.
[0034] PUSH is an instruction that comprises the insertion of a segment at the top of the segment list. In SR-MPLS, the top of the segment list is the topmost (outer) label of a label stack. [0035] NEXT is an instruction that becomes active when the active segment is complete, and it includes inspection of the next segment. In SR-MPLS, NEXT is implemented as a POP of the top label.
[0036] CONTINUE is an instruction that indicates the active segment is not completed and hence remains active. In SR-MPLS, the CONTINUE instruction is implemented as a SWAP of the top label.
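As a non-limiting illustration, the PUSH, NEXT, and CONTINUE instructions may be modeled on a list of (label, TTL) pairs, as in the following Python sketch; the class and method names are examples chosen only for this sketch.

    class LabelStack:
        def __init__(self, entries):
            self.entries = list(entries)  # (label, ttl) pairs; index 0 is the top (outer) label

        def push(self, label, ttl):
            # PUSH: insert a segment at the top of the segment list
            self.entries.insert(0, (label, ttl))

        def next(self):
            # NEXT: the active segment is complete; implemented as a POP of the top label
            return self.entries.pop(0)

        def cont(self, new_label):
            # CONTINUE: the active segment remains active; implemented as a SWAP of the
            # top label (a transit node decrements the TTL of the swapped label)
            label, ttl = self.entries[0]
            self.entries[0] = (new_label, ttl - 1)
            return label

    stack = LabelStack([(999003, 255), (999004, 0)])
    stack.cont(999003)  # transit: swap keeps the segment active
    stack.next()        # termination: pop exposes the next segment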
[0037] A segment routing (SR) policy is an ordered list of segments. The headend of an SR Policy steers packets onto the SR policy. The list of segments can be specified explicitly in SR-MPLS as a stack of labels. Alternatively, the list of segments is computed based on a destination and a set of optimization objectives and constraints (e.g., latency, affinity, shared risk link group
(SRLG), etc.).
[0038] Segment list depth is the number of segments of an SR policy. The entity instantiating an SR Policy at a node N should be able to discover the depth insertion capability of the node N.
[0039] A binding segment identifier (B-SID) in MPLS is a label with which MPLS traffic ingresses a domain when that traffic needs to traverse a segment routing traffic engineering label switched path (SR-TE LSP) in the domain. The B-SID maps traffic to a particular SR-TE LSP in its domain.
[0040] Forwarding equivalence class (FEC) describes a set of packets with similar or identical characteristics which may be forwarded the same way; thus, they may be bound to the same MPLS label. Characteristics determining the FEC of a higher-layer packet depend on the configuration of a network device, but typically include one or more of an IP address (e.g., the destination IP address) or a quality of service class. An FEC tends to correspond to a label switched path (LSP), but an LSP is often used for multiple FECs.
[0041] Penultimate hop popping (PHP) is a function performed by certain network devices in an MPLS enabled network, where the outermost label of an MPLS tagged packet is removed by a network device (e.g., a label switch router (LSR)) before the packet is passed to an adjacent network device (e.g., a label edge router (LER)).
[0042] MPLS Diff-Serv (differentiated services, also referred to as DiffServ) tunneling modes allow service providers to manage the quality of service (QoS) that a network node provides to an MPLS packet in an MPLS network. MPLS Diff-Serv tunneling modes include uniform, short pipe, and pipe mode. The uniform mode, short pipe mode, and pipe mode are defined in RFC 2983, "Differentiated Services and Tunnels," and RFC 3270, "Multi-Protocol Label Switching (MPLS) Support of Differentiated Services." With the pipe model, the MPLS tunnels are used to hide the intermediate MPLS nodes between LSP ingress and egress from the Diff-Serv perspective. The short pipe model is a variation of the pipe model, where the Diff-Serv forwarding treatment at an LSP egress is applied based on the "tunneled Diff-Serv information" (i.e., Diff-Serv information conveyed in the encapsulated header) rather than on the "LSP Diff-Serv information" (i.e., Diff-Serv information conveyed in the encapsulating header). Because the LSP egress applies its forwarding treatment based on the "tunneled Diff-Serv information," the "LSP Diff-Serv information" does not need to be conveyed by the penultimate node to the LSP egress; thus, the short pipe model may operate with PHP.
[0043] Type-length-value (TLV) is an encoding scheme used for optional information element in a certain protocol. A packet may contain multiple TLVs, and each TLV may include one or more additional TLVs within, and any such included TLVs may be referred to as a sub-TLV.
Traceroute Time-to-live Setting: Complexity with Segment Routing
[0044] Figure 1A shows a traceroute implementation in a Multi-Protocol Label Switching (MPLS) network. Network 100 is an MPLS network that includes four nodes, R1-R4, and their addresses (e.g., IP addresses) are represented as 1.1.1.1 to 4.4.4.4 for simplicity of discussion. A label switched path (LSP) is established, and the LSP starts at R1, traverses R2 and R3, and terminates at R4. To trace the route of the LSP, one label is required, and the label corresponds to a forwarding equivalence class (FEC) of R4, and the FEC may be denoted as (4.4.4.4) (i.e., the FEC identifies the destination (IP) address). MPLS traceroute validates the LSP
corresponding to FEC (4.4.4.4).
[0045] At reference 102, R1 sends an MPLS traceroute-request packet with TTL=1, and the label stack has one label of 999004 identifying FEC (4.4.4.4) for R4. FEC (4.4.4.4) may be populated in a FEC sub-TLV, and the label stack may be carried in a downstream detailed mapping (DDM) TLV. The traceroute-request packet follows the LSP and reaches R2, where it is sent to the control plane because of TTL=1. An MPLS ping server on R2 validates in the control plane if the traceroute-request packet is received on the expected interface, label 999004 corresponds to egress FEC (4.4.4.4), and other criteria (e.g., the ones specified in RFC 8029, entitled "Detecting Multiprotocol Label Switched (MPLS) Data-Plane Failures," and/or RFC 8287, entitled "LSP Ping/Traceroute for Segment Routing (SR) IGP-Prefix and IGP-Adjacency Segments Identifiers (SIDs) with MPLS Data Planes."). R2 replies to R1 with a traceroute-reply packet, including its validation result of the traceroute-request packet, the role of R2 (being a transit node in this case) in the LSP, as well as its downstream next hop and label stack towards R3, which is the next hop for the LSP. The label operation is swap (a continuation as defined above) at reference 104.
[0046] R1 validates the traceroute-reply packet and saves the result (e.g., printing it out). R1 then sends out the next traceroute-request packet with TTL=2 to trace the next node in the LSP. The next packet will be forwarded by R2 along the LSP towards R3, and R2 decrements the TTL by one (thus TTL=1 after R2). At R3, because the next packet arrives with TTL=1, the R3 MPLS ping server validates the traceroute-request packet in the control plane (similar to what R2 did for the first traceroute-request), and replies to R1 with a traceroute-reply packet, including its validation result of the traceroute-request packet, the role of R3 (being a transit node in this case) in the LSP, as well as its downstream next hop and label stack towards R4, which is the next hop for the LSP. The label operation is swap (a continuation as defined above) at reference 106.
[0047] R1 validates the traceroute-reply packet and saves the result (e.g., printing it out). R1 then sends out the next traceroute-request packet with TTL=3 to trace the next node in the LSP, and the next packet will be forwarded by R2 and R3, each decrementing the TTL by one. Thus, at R4, the next packet arrives with TTL=1, and the R4 MPLS ping server validates the traceroute-request packet in the control plane. R4 replies to R1 with a traceroute-reply packet, indicating its validation result of the traceroute-request packet, and stating that it (R4) is the egress node for the LSP. Since the label 999004 is for FEC (4.4.4.4), the label is popped at reference 108. Once R1 receives the traceroute-reply packet, the traceroute iteration stops as the egress for FEC (4.4.4.4) has been reached. Note that the validation of the label on a node occurs only when TTL=1; when TTL is a larger value at nodes R2 and R3, each node reduces the TTL value by one and passes the corresponding traceroute-request packet on to the next node.
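The TTL progression of this walkthrough can be restated as a short Python sketch of the headend loop, assuming a single-label LSP as in Figure 1A; send_request is a placeholder standing in for the actual transmission of a traceroute-request packet and is not part of the embodiments.

    def traceroute_single_label(send_request, label=999004, max_hops=255):
        # send_request(label, ttl) is assumed to return a reply whose role is either
        # 'transit' or 'egress', sent by the node at which the TTL expired.
        for ttl in range(1, max_hops + 1):   # TTL=1 reaches R2, TTL=2 reaches R3, TTL=3 reaches R4
            reply = send_request(label, ttl)
            if reply.role == 'egress':       # the egress for FEC (4.4.4.4) has been reached
                return                       # the traceroute iteration stops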
[0048] The traceroute operations apply to segment routing, where multiple labels may be included in the label stack. Figure 1B shows a segment routing based traceroute implementation in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention. The same R1-R4 are included in the network, and a short pipe model is used for MPLS tunneling. An LSP may stitch together three segments (i.e., {R1-R2, R2-R3, and R3-R4}) using a three-label stack to create a forwarding path to R4 (4.4.4.4) at reference 112. Each label in the three-label stack indicates a segment in the LSP. A path computing element (PCE) may program the LSP at R1 with the three-label stack and corresponding FEC information via an LSP initiate request message (e.g., a PCInitiate message specified in RFC 8281, entitled "Path Computation Element Communication Protocol (PCEP) Extensions for PCE-Initiated LSP Setup in a Stateful PCE Model"). Node R1 then knows the label to FEC mapping for each label in the stack, and this mapping is expected to be used in MPLS traceroute requests to validate each segment; the mapping is shown as Table 1.
Table 1. A First Exemplary Label - FEC Mapping
Label     FEC
999002    (2.2.2.2)
999003    (3.3.3.3)
999004    (4.4.4.4)
[0049] To support traceroute for such an LSP, the correct node needs to validate the FEC corresponding to a received label, and the TTL needs to be set properly. One may set the TTLs for traceroute-request sequences as shown in Table 2.
Table 2. A First Exemplary TTL Setting
Traceroute Sequence    TTL (label 999002)    TTL (label 999003)    TTL (label 999004)
1                      1                     0                     0
2                      MAX-TTL (255)         2                     1
3                      MAX-TTL (255)         MAX-TTL (255)         2
[0050] Table 2 lists three sequences of traceroute-request packets corresponding to the LSP. Each sequence may contain multiple traceroute-request packets that node R1 sends out for route validation; the multiple packets are sent in case some packets are delayed or dropped in the route. For simplicity of explanation, one packet per sequence of traceroute-request packets is examined in our discussion.
[0051] At node R1, the first traceroute-request packet is sent with TTL=1 as shown in Table 2, and the TTL value is set only for label 999002. The inner-label TTLs are set to zero as we intend to trace the first segment first. All three FECs corresponding to the three labels are populated into the MPLS traceroute FEC subTLV for traceroute validation. When the traceroute-request packet reaches R2, R2 does the validation as explained above relating to Figure 1A. One difference is that now R2 is the termination point for segment 999002 corresponding to FEC
(2.2.2.2). Thus, R2 sends a FEC-pop subTLV in its traceroute-reply to R1. Additionally, since the traceroute-request packet includes FEC subTLVs for inner labels (i.e., 999003 and 999004), and R2 is a transit node for the next label in the stack (i.e., 999003), R2 validates the next FEC
(3.3.3.3) against the next label and indicates its transit role. When R1 receives the traceroute-reply including the FEC-pop subTLV, R1 knows that R2 has terminated the first segment, and subsequent traceroute packets need to validate only the remaining segments.
[0052] The next traceroute-request packet (which is in the second sequence of traceroute packets) is sent with the TTL combination shown in Table 2. Only FECs for R3 (3.3.3.3) and R4
(4.4.4.4) are included in the FEC subTLV for validation. At R2, the label 999002 is popped as instructed (shown at reference 114), and R2 looks up the inner label, label 999003 of the packet, for which the swap operation is to be performed. Since TTL=2 for label 999003, no MPLS ping server is involved due to TTL≠1, and the packet is sent to an interface towards R3. R2 decrements the TTL of label 999003 by one before sending the packet (since swap is involved). At R3, since TTL=1 for label 999003, R3 validates the traceroute request and includes a FEC-pop subTLV for FEC (3.3.3.3) in its traceroute-reply to R1. Additionally, since the traceroute-request packet includes an FEC subTLV for an inner label (i.e., 999004), and R3 knows itself as a transit node for the next label in the stack, R3 validates the next FEC (4.4.4.4) against the next label and indicates its transit role in its traceroute-reply to R1. After receiving the traceroute-reply, R1 knows that R3 has terminated the second segment 999003, and subsequent traceroute packets need to validate only the remaining segment. Note that the TTL for the next label is set to be one, and the reason for this value will become clear later in the disclosure (e.g., see the discussion relating to Figures 2A-B below).
[0053] Thus, the next traceroute-request packet (which is in the third sequence of traceroute packets) is sent with the TTL combination shown in Table 2. Only the FEC for R4 (4.4.4.4) is included in the FEC subTLV for validation. At R2, label 999002 is popped, and the packet is forwarded to R3. At R3, label 999003 is popped, and R3 decreases the TTL for the inner label, label 999004, by one. Then the packet reaches node R4, where the MPLS ping server validates the packet and figures out that R4 itself is the egress node and there are no further FEC subTLVs to be validated. The traceroute-reply from R4 indicates that R4 is the egress for the LSP, and R1 stops the current traceroute iteration as the LSP egress node has been reached.
[0054] Note that for the traceroute validation to be performed properly in the short pipe model, TTL values vary for (1) different traceroute-request packet sequences and (2) different labels, as illustrated in Table 2. It is challenging to set proper TTL values. Indeed, RFC 8287 acknowledges the difficulty with tracing a source-routed LSP (e.g., a segment routing based LSP), because "when a source-routed LSP has to be traced, there are as many TTLs as there are labels in the stack"; yet RFC 8287 proposes setting the starting TTL to 1 for the segment to be traced, the outer labels (relative to the segment) to a maximum TTL (MAX-TTL), and the inner labels (relative to the segment) to zero: "The LSR that initiates the traceroute SHOULD start by setting the TTL to 1 for the tunnel in the LSP's label stack it wants to start the tracing from, the TTL of all outer labels in the stack to the max value, and the TTL of all the inner labels in the stack to zero."
RFC 8287, Section 7.5. The RFC 8287 proposal does not work for the scenario relating to Figure 1B; as Table 2 shows, some TTLs need to be set to the value two for the traceroute validation to be performed properly. Thus, an extension to the existing traceroute operations is needed to implement segment routing based traceroute.
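For comparison, the RFC 8287 rule quoted above can be written as a short sketch; the helper below is introduced only to make the limitation concrete and is not taken from RFC 8287 or from the embodiments.

    MAX_TTL = 255

    def rfc8287_ttls(num_labels, traced_index):
        # RFC 8287, Section 7.5: TTL of 1 for the tunnel being traced, the maximum
        # value for all outer labels, and zero for all inner labels.
        return [MAX_TTL] * traced_index + [1] + [0] * (num_labels - traced_index - 1)

    print(rfc8287_ttls(3, 1))   # [255, 1, 0] -- this rule can only ever yield 255, 1, or 0,
                                # whereas Table 2 also requires a TTL value of two.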
TTL Setting Considering PHP
[0055] In a short pipe mode, a node may perform penultimate hop popping (PHP) instead of pop. Figure 2A shows a segment routing based traceroute implementation with PHP operations in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention. The MPLS network is similar to the ones in Figures 1A-B, but R2 is to perform PHP instead of pop. The LSP is programmed with the label stack {999003, 999004} at reference 202, and the label-FEC mapping is shown in Table 3.
Table 3. A Second Exemplary Label - FEC Mapping
Label     FEC
999003    (3.3.3.3)
999004    (4.4.4.4)
[0056] To perform the traceroute validation properly, the TTL values are set as shown in Table
4.
Table 4. A Second Exemplary TTL Setting
Traceroute Sequence    TTL (label 999003)    TTL (label 999004)
1                      1                     0
2                      2                     1
3                      MAX-TTL (255)         2
[0057] When the first packet of traceroute sequence 1 from R1 arrives at R2, R2 sees that TTL=1 for the outer label 999003, its MPLS ping server validates the packet and knows that it is a transit node for the LSP and replies to R1 with a traceroute-reply indicating its role as a transit node.
[0058] Upon receiving the traceroute-reply, R1 sends a second packet of the second traceroute sequence from R1 to R2, which will perform PHP on label 999003 as shown at reference 204. With the PHP operation, the outer label 999003 is removed, and R2 sends the packet towards R3 based on the top label lookup (finding the interface toward R3). Note that since the operation is PHP, the TTL value for the inner label is not decremented (unlike pop, which decrements the inner label TTL value by one). The packet arrives at R3, whose MPLS ping server knows that it is the termination node for label 999003, and its MPLS ping server validates the packet, and replies with a traceroute-reply packet, including an FEC-pop subTLV for FEC (3.3.3.3).
Additionally, since the traceroute-request packet includes an FEC subTLV for an inner label (i.e., 999004), and R3 is a transit node for the next label in the stack, R3 validates the next FEC (4.4.4.4) against the next label and indicates its transit role in its traceroute-reply to R1. As shown at reference 206, R3 performs swap for the next label in the stack, label 999004.
[0059] Upon receiving the traceroute-reply, R1 knows that R3 has terminated segment 999003, and subsequent traceroute packets need to validate only the remaining segment 999004, and R1 sends a packet of the third traceroute sequence from R1. At R2, label 999003 and the TTL associated with label 999003 are removed with the PHP operation again, and R2 sends the packet towards R3, which knows the outmost label 999004 is not for itself, thus forwarding the packet towards R4 and reducing the TTL for label 999004 by one (occurring with the swap operation). The packet with label 999004 TTL=1 arrives at R4, whose MPLS ping server validates the packet and replies with a traceroute-reply packet, and also indicates its role as an egress. Note that while the label mapping for label 999004 is pop at reference 208, no pop is performed when TTL=1, and the packet is validated at the control plane instead.
[0060] Upon receiving the traceroute-reply, R1 knows that the egress R4 for the LSP has been reached, and R1 stops the current traceroute iterations. Note that with the PHP operations being considered, the TTL for the active segment in traceroute is set to be two to make the traceroute validation perform properly.
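The TTL behavior that distinguishes pop from PHP in these examples can be summarized with the following sketch; it models only the TTL handling described above (not an actual forwarding plane), and the function name is a placeholder.

    def remove_top_label(stack, operation):
        # stack: list of [label, ttl] entries, index 0 being the outer label
        stack.pop(0)                  # the outer label is removed for both pop and PHP
        if operation == 'pop' and stack:
            stack[0][1] -= 1          # pop: the newly exposed inner label TTL is decremented
        # PHP: the inner label TTL is not decremented, so a TTL of one set by the
        # headend still expires at the node that terminates the segment.
        return stack

    print(remove_top_label([[999003, 2], [999004, 1]], 'php'))    # [[999004, 1]]
    print(remove_top_label([[999002, 255], [999003, 2]], 'pop'))  # [[999003, 1]]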
TTL Setting Considering Adjacency SID
[0061] In segment routing, the label stacks may include adjacency SIDs. Figure 2B shows a segment routing based traceroute implementation with an adjacency SID in a label stack in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention. In Figure 2B, the LSP is programmed with the label stack {999003, 999034, 999004} at reference 212, and the label-FEC mapping is shown in Table 5.
Table 5. A Third Exemplary Label - FEC Mapping
Label     FEC
999003    (3.3.3.3)
999034    adjacency segment R3-R4
999004    (4.4.4.4)
[0062] To support traceroute for such an LSP, the correct node needs to validate the FEC corresponding to a received label, and the TTL needs to be set properly. One may set the TTLs for traceroute-request sequences as the following table shows:
Table 6. A Third Exemplary TTL Setting
Traceroute Sequence    TTL (label 999003)    TTL (label 999034)    TTL (label 999004)
1                      1                     0                     0
2                      2                     1                     0
3                      MAX-TTL (255)         2                     1
[0063] At node R1, the first traceroute-request packet is sent with TTL=1 as shown in Table 6, and the TTL value is set only for label 999003, with the rest being zero by default. When the first packet of traceroute sequence 1 from R1 arrives at R2, R2 sees that TTL=1 for the outer label
[0064] Upon receiving the traceroute-reply, R1 sends a second packet of the second traceroute sequence from R1 to R2. The data plane instruction is swap as shown at reference 214, and it forwards the packet on towards R3, with the TTL for the outer label 999003 being reduced by one. The packet is received at R3. With TTL=1, R3 MPLS ping server validates the packet and terminates segment 999003. Additionally, R3 examines the inner label 999034 and knows that it is a transit node for it, and R3 returns a traceroute-reply back to R1 with FEC-pop subTLV for FEC (3.3.3.3) and indicating its role as the transit node for the inner label 999034.
[0065] Upon receiving the traceroute-reply, R1 sends a third traceroute-request packet (from the third traceroute sequence) to R2, which performs swap and forwards the packet towards R3. At R3, the outmost label 999003 is popped, and R3 additionally performs PHP for the adjacency segment 999034 at reference 216. The PHP operation removes the label 999034, and label 999004 becomes the outmost label. The remaining packet is sent towards R4. At R4, as TTL=1 for label 999004, MPLS ping server validates the packet. R4 knows that it is the egress of the LSP, and it replies to R1 with a traceroute-reply indicating so. The reply indicates that R4 terminates segment 999034 and segment 999004. Upon receiving the traceroute-reply, R1 knows that egress R4 for the LSP has reached, and R1 stops the current traceroute iterations. Note with the adjacency SID being considered, the TTL for the active segment in traceroute is set to be two again to make the traceroute validation perform properly.
Operations to Set Proper TTL values per some embodiments
[0066] Based on the examples discussed above, it may be observed that a headend network device (e.g., node R1) of an LSP path determines the TTL values for an active segment of a traceroute validation for segment routing based on the last reply it receives from a downstream network device. Figure 3 shows an example flow to set proper TTL values for segment routing in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention. Method 300 may be performed by node R1 discussed herein above.
[0067] At reference 302, the traceroute operation starts with the first traceroute sequence, and the active segment index is set to be one. The traceroute return code indicates Ingress, and the current segment hop is set to be one. The total label count in the label stack is the number of labels pushed for the LSP at the network device (the headend).
[0068] At reference 304, a traceroute-request packet is sent (multiple copies of the traceroute packet in the same sequence may be sent). In the traceroute-request packet of this sequence, the TTL value for any segment that is lower than the active segment index is set to be the maximum TTL value (e.g., 255). The TTL value for the active segment (the segment for which the traceroute validation is performed, e.g., the segment identified by label 999003 in the second traceroute sequence) is set to the current segment hop. If there are more segments (identified by labels inner to the label for the active segment) in the LSP, i.e., the active segment index is less than the number of the total labels in the stack, the immediate next index after the one for the active segment is set to one, and the ones afterward (if any) are set to zero.
[0069] The network device then waits for a traceroute reply packet from a downstream network device at reference 306. If no reply is received within a timeout period, the network device prints out traceroute timeout at reference 308, and increments the current segment hop and traceroute sequence number by one. The flow goes back to reference 304.
[0070] If the traceroute-reply packet is received at reference 306, the flow goes to reference 310, and the headend network device determines if the reply has come from the LSP egress node (e.g., node R4 in the examples above). If it is, the traceroute stops at reference 350, and the traceroute validation results are printed out and saved.
[0071] If the headend network device determines that the reply has not come from the LSP egress node, the flow goes to reference 312, and the headend network device records the traceroute validation results from this reply (e.g., prints out and saves the results). The headend network device increments the active segment index by the number of FEC-pop subTLVs received from the traceroute reply and increments the sequence number by one.
Additionally, the current segment hop is set to two if an FEC-pop subTLV is received in the reply; otherwise the current segment hop is incremented by one. The flow goes back to reference 304, where the next traceroute-request packet is transmitted.
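A sketch of the headend bookkeeping in the two preceding paragraphs (the reply is assumed to have already been parsed into a count of FEC-pop subTLVs; the names are illustrative only):

```python
def update_after_reply(active_index, current_segment_hop, sequence_number,
                       fec_pop_count):
    """Advance the headend traceroute state after a non-egress traceroute-reply.

    The active segment index advances by the number of FEC-pop subTLVs in the
    reply, and the current segment hop restarts at two when a segment was
    terminated (the pop consumes one TTL decrement); otherwise it simply moves
    one hop further within the same segment.
    """
    active_index += fec_pop_count
    sequence_number += 1
    current_segment_hop = 2 if fec_pop_count > 0 else current_segment_hop + 1
    return active_index, current_segment_hop, sequence_number
```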
[0072] As discussed before, the TTL value for the current active segment is set to two if the reply from the last active segment indicates that the last active segment has been terminated. The termination of the last active segment is indicated by an FEC-pop subTLV in the reply in one embodiment. The TTL value is set to two because, when the current traceroute-request packet reaches the terminating node of the last active segment, a pop is performed on the last active label and the TTL value for the current segment is reduced by one. By setting the TTL value to two, tracing of the nodes in the current active segment starts with TTL=1, and the traceroute validation may be performed properly for the current segment.
[0073] The TTL value for the segment after the current active segment (i.e., the "next" segment) is set to one to accommodate any PHP operations on the current active segment label. When the current active segment label operation is 'PHP' on the node before the terminating node for the current active segment, this segment's label is stripped off and only the remaining packet is sent to the terminating node for the current active segment. The TTL value of the label below the current active segment is therefore set to 1, so that the terminating node of the active segment is able to validate the traceroute-request packet (e.g., through the MPLS ping server).
TTL Setting Considering Binding Segment Identifier (B-SID)
[0074] B-SID labels are used to steer packets through an IGP domain whose details are abstracted from a head-end node that is in a different domain, and they may also be used to reduce the label stack imposed on the head-end node. Figure 4 shows a segment routing based traceroute implementation with a B-SID node in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention. The network 400 includes nodes R1-R6, and their addresses are set to be 1.1.1.1 to 6.6.6.6 for simplicity of explanation. R2 has allocated B-SID label 999026, which can be used by the head-end node R1 to steer traffic on LSP R2-R6. The B-SID label remains unchanged even if its mapped R2-R6 LSP path changes due to some traffic engineering (TE)-metric change. In other words, any churn in the R2-R6 domain is hidden from the head-end node R1.
[0075] Current standards such as RFC 8287 provide no guideline as to how the headend node R1 would instruct R2 to populate TTLs for new labels that are pushed from R2. Simply copying the TTL of an outer label to an inner label does not work. New TTL settings are required to make traceroute validation work.
[0076] For Network 400, the label-FEC mapping is shown in Table 7.
Table 7. A Fourth Exemplary Label - FEC Mapping
[0077] Figure 4 indicates the label stack and the label mapping operations at various nodes. These operations are performed when sequences of traceroute-request packets are sent from the headend node R1 through the LSP, reaching R2-R6.
[0078] Figure 5 shows the operations at each node for the sequences of traceroute packets per one embodiment of the invention. The first sequence (reference 502) is sent with TTL values set to one and zero for labels 999002 and 999026, respectively. R2 terminates segment 999002, and it also sends a FEC-push subTLV in a traceroute reply for FECs 4.4.4.4 and 5.5.5.5, corresponding to R4 and R5, the two other nodes associated with B-SID label 999026.
[0079] Upon receiving the reply from R2, R1 sends out the second sequence (reference 504) with TTL values set to 255 and two for labels 999002 and 999026, respectively. R2 swaps the label to impose the updated stack for the B-SID, with TTL settings [(999004, 1), (999005, 0), (999006, 0)], and sends the packet to R3. The R3 MPLS ping server parses the packet as the outer label 999004 has TTL=1, and sends a reply to R1 indicating R3's role as a transit node.
[0080] Upon receiving the reply from R3, R1 sends out the third sequence (reference 506) with TTL values set to 255 and three for labels 999002 and 999026, respectively. R2 swaps the label to impose the updated stack for the B-SID, with TTL settings [(999004, 2), (999005, 1), (999006, 0)]. Note that the TTL setting for the label stack is similar to that of a headend node R1 according to Figure 3. At R3, a swap operation is performed, the outer label TTL value is reduced by one, and the packet is sent towards R4. At R4, the R4 MPLS ping server parses the packet as the outer label 999004 has TTL=1; it pops the label and indicates in its reply to R1 that R4 is the transit node for the next label 999005.
[0081] Upon receiving the reply from R4, R1 sends out the fourth sequence (reference 508) with TTL values set to 255 and four for labels 999002 and 999026, respectively. R2 swaps the label to impose the updated stack for the B-SID, with TTL settings [(999004, 255), (999005, 2), (999006, 0)]. At R3, the outer label TTL value is reduced by one, and the packet is sent towards R4. At R4, the outer label 999004 is popped, the TTL value of the inner label is reduced by one so that the TTL settings become [(999005, 1), (999006, 0)], and the packet is sent towards R5. The R5 MPLS ping server parses the packet as the outer label 999005 has TTL=1, and sends a reply to R1 indicating, via a FEC-pop subTLV, that R5 has terminated segment 999005 and is a transit node for the inner label 999006.
[0082] Upon receiving the reply from R5, R1 sends out the fifth sequence (reference 510) with TTL values set to 255 and five for labels 999002 and 999026, respectively. R2 swaps the label to impose the updated stack for the B-SID, with TTL settings [(999004, 255), (999005, 255), (999006, 2)]. At R3, the outer label TTL value is reduced by one, and the packet is sent towards R4. At R4, the outer label 999004 is popped, the TTL value of the inner label 999005 is reduced by one so that the TTL settings become [(999005, 254), (999006, 2)], and the packet is sent towards R5. At R5, the outer label 999005 is popped, the TTL value of the inner label 999006 is reduced by one so that the TTL settings become [(999006, 1)], and the packet is sent towards R6. The R6 MPLS ping server parses the packet as the outer label 999006 has TTL=1, and sends a reply to R1 indicating that R6 has terminated segment 999006 and is the egress of the LSP. Upon receiving the reply from R6, R1 stops the current traceroute iteration as the LSP egress node has been reached.
Data Structures to Announce B-SID Policy
[0083] Note that the headend node is unaware of the B-SID policy in the downstream. Yet the headend node needs to know the B-SID policy. Figure 6A shows a data structure to announce a binding segment identifier (B-SID) per one embodiment of the invention. The B-SID indicator TLV has a type of BSID-indicator-TLV, its length is indicated using a number of bits, and the value is indicated using BSID-LABEL.
[0084] A node replying to a traceroute-request from a headend node may use the TLV to notify the headend that it has processed a binding SID, and it populates the corresponding B-SID label in this TLV. Additionally, the node that directs the TTL settings (either the headend or an ingress node) may indicate the TTL setting for the B-SID segments, using another TLV to indicate the TTL values.
[0085] Figure 6B shows a data structure to indicate a TTL setting for nodes within a binding segment identifier (B-SID) domain per one embodiment of the invention. The FEC TTL TLV may pass a set of TTL values for the pushed FECs and the current active segment for a B-SID policy at its ingress node (e.g., node R2 in Figure 4). The FEC TTL TLV has a type of FEC-TTL-TLV (or another suitable alternative), its length is indicated using a number of bits, and the value is indicated using a list of TTL values (e.g., each value may be indicated using eight bits). A headend node may use the FEC TTL TLV to indicate to the B-SID ingress node the TTL values to set for the swapped label stack.
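A minimal encoding sketch of the two TLVs, assuming 16-bit type and length fields and a 4-byte value field carrying the 20-bit B-SID label; the numeric type codes below are placeholders, as the text does not assign code points:

```python
import struct

BSID_INDICATOR_TLV_TYPE = 0xFFF1  # placeholder type code (assumption)
FEC_TTL_TLV_TYPE = 0xFFF2         # placeholder type code (assumption)

def encode_bsid_indicator_tlv(bsid_label):
    """B-SID indicator TLV: type, length, then the B-SID label as the value."""
    value = struct.pack("!I", bsid_label & 0xFFFFF)
    return struct.pack("!HH", BSID_INDICATOR_TLV_TYPE, len(value)) + value

def encode_fec_ttl_tlv(bsid_label, ttl_values):
    """FEC TTL TLV: the B-SID label the list applies to, followed by one
    eight-bit TTL per pushed FEC/current active segment."""
    value = struct.pack("!I", bsid_label & 0xFFFFF)
    value += bytes(ttl & 0xFF for ttl in ttl_values)
    return struct.pack("!HH", FEC_TTL_TLV_TYPE, len(value)) + value

# Example: the headend asks the B-SID ingress (label 999026) to use TTLs {2, 1, 0}
# encode_fec_ttl_tlv(999026, [2, 1, 0])
```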
[0086] A headend node label stack may comprise additional labels below the B-SID labels (e.g., for traversal after the B-SID domain). These inner labels are not expected to be processed by any nodes in the LSP path while the segment corresponding to a B-SID label is active. These inner labels and their corresponding TTL values are set at the headend node and are passed transparently through B-SID nodes.
[0087] The headend node may use the B-SID label values provided in the B-SID indicator TLV to populate the FEC TTL TLV for the next sequence of traceroute-request packets. A headend node is expected to parse the B-SID indicator TLV if it is present (e.g., in a UDP ping payload), and the TLV may be set to be the first TLV in an MPLS ping payload if it is present. Note that a node may use the values from the B-SID indicator TLV and the FEC TTL TLV to set the TTL values for labels in an outgoing MPLS stack irrespective of the MPLS TTL values in the incoming packet.
[0088] In addition, an operator may want to hide the details of its domain from a headend (e.g., when the headend is managed by a different operator). Based on this concern, a node (e.g., R2) may avoid sending a FEC-push subTLV in a traceroute reply (which would denote the additional segments used by the packet while traversing the R2 domain). Also, forwarding at the node (e.g., R2) may set a maximum TTL value (e.g., 255) for each of the newly imposed segment labels (i.e., the ones corresponding to the new FECs added) in the label stack. Note that the inner-label TTL (the TTL for the segment below the newly imposed segments) is populated as one by the head-end node (e.g., R1 in our case, with the underlying IP destination address in the 127/8 range), irrespective of the TTL value in the incoming label, once the node detects a packet to be an MPLS ping or traceroute packet (e.g., when the packet is transmitted with UDP port 3503). This ensures that the MPLS traceroute-request packet traverses the R2 domain without issue and only the B-SID policy egress node (R6 in Figure 4) gets to validate and reply to the traceroute packet. For example, when an incoming label {A} is swapped with outgoing label stack {C, D, E} at R2, the maximum TTL value is set for labels C and D (the outmost labels), and one is set for label E.
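A sketch of the domain-hiding TTL assignment described above for a node such as R2 (assuming the node has already classified the packet as MPLS ping/traceroute, e.g., via the UDP port 3503 mentioned in the text; the data layout is illustrative only):

```python
MAX_TTL = 255

def hidden_bsid_ttls(pushed_labels, swapped_label):
    """TTL assignment at a B-SID ingress that hides its domain.

    Each newly imposed label gets MAX_TTL so that no transit node inside the
    hidden domain parses the traceroute-request, while the label below the
    newly imposed segments is set to one so that only the B-SID policy egress
    node validates and replies.
    """
    stack = [(label, MAX_TTL) for label in pushed_labels]
    stack.append((swapped_label, 1))
    return stack

# Example from the text: incoming {A} is swapped with {C, D, E} at R2
# hidden_bsid_ttls(["C", "D"], "E") -> [("C", 255), ("D", 255), ("E", 1)]
```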
[0089] Alternatively, when hiding the domain is not an issue, the short pipe model may be broken: once a node (e.g., in the data plane) detects that a packet is an MPLS ping or traceroute request, on any label pop or PHP action the outer-label TTL is copied to the inner label.
[0090] Each of the B-SID indicator TLV and the FEC TTL TLV may be implemented as a sub-TLV within a traceroute-reply packet. Figure 6C shows an example of a traceroute-reply packet including B-SID information per one embodiment of the invention. The traceroute-reply packet 600 includes a B-SID indicator TLV 652 and, optionally, a FEC-pop subTLV 654.
[0091] The FEC TTL TLV (which may also be referred to as a TTL-array) passes the TTL values to be populated in a traceroute-request packet when the packet traverses another SR domain. The B-SID indicator TLV indicates traversal into another SR domain to an LSP headend node. These data structures allow the TTL values to be set based on the B-SID implementation in segment routing.
[0092] Note that while two TLVs (which may also be referred to as sub-TLVs) are used to implement the B-SID policy announcement, a single TLV may be implemented to include the information indicated in the B-SID indicator TLV and the FEC TTL TLV. Additionally, the B-SID policy announcement may be implemented in other data structures such as a map, a list, an array, a file, a table, a database, etc.
More Examples of TTL Setting Considering Binding Segment Identifiers (B-SID)
[0093] Figure 4 shows a network with six nodes, where R2 is the B-SID node. Embodiments of the invention also support more complicated networks. Figure 7 shows a segment routing based traceroute implementation with two B-SID nodes in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention. Network 700 contains 14 nodes, R1-R14. Nodes R3 and R8 are B-SID nodes. R3 advertises binding SID LB8 for an SR-TE LSP to traverse the R3-R8 domain via SR-TE LSP-R3-R8 with an out-label stack {L5, L7, L8} (similar to labels 999003-999034 discussed herein above). R8 advertises binding SID LB14 to traverse the R8-R14 domain via SR-TE LSP-R8-R14 with an out-label stack {L10, L14}.
[0094] A PCE may program the headend node LSP with a label stack {L3, LB8, LB14} corresponding to FEC {F3, F8, F14}. Each Li corresponds to a node-SID (e.g., FEC Fi) on a corresponding Ri. The label stack steers the headend LSP traffic onto SR-TE LSP-R3-R8 at R3 and SR-TE LSP-R8-R14 at R8 while traversing the R3-R8 and R8-R14 domains. A series of sequences of traceroute-request packets and their corresponding replies may be exchanged.
[0095] At the first sequence, R1 sends a traceroute-request packet with LSP label stack [(L3, 1), (LB8, 0), (LB14, 0)]. R2 replies with a traceroute-reply indicating its role as 'transit' for L3.
[0096] At the second sequence, upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 2), (LB8, 1), (LB14, 0)]. R2 reduces the TTL count of the outmost label and transmits the packet to R3. R3 replies with a traceroute-reply indicating (1) that R3 terminates the outmost segment F3 (sending a FEC-pop subTLV) and (2) its role as 'transit' for the inner label LB8. Additionally, it sends a FEC-push subTLV for {F5, F7}, and an indication that the reply is from a B-SID node and that label LB8 is a B-SID label.
[0097] At the third sequence, upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LB8, 2), (LB14, 1)]. Additionally, R1 also populates the FEC-TTL-TLV with B-SID label LB8, which was received in the B-SID indicator TLV from the reply, as well as TTL {1, 1, 0} for the FECs {F5, F7, F8} (i.e., the newly pushed segments and the current active segment at the headend). R2 reduces the TTL count of the outmost label and transmits the packet to R3. On receiving the packet, R3 determines that it is an MPLS ping/traceroute packet (e.g., received on destination UDP port 3503) and that the label being looked up is B-SID label LB8. R3 then checks for the presence of the FEC-TTL-TLV and whether the label present in this TLV matches B-SID label LB8 in the traceroute-request. If so, R3 uses the TTL values in this TLV to populate the TTL values for the swapped label stack. The out-label stack for the packet sent from R3 would be [(L5, 1), (L7, 1), (L8, 0)], while the bottom label in the label stack would be (LB14, 1), retained unchanged from what was passed by R1 (the bottom label is omitted in this discussion until the B-SID egress node R8 is reached). R4 replies to this traceroute-request and indicates its role as 'transit.'
[0098] At the fourth sequence, upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LB8, 3), (LB14, 1)]. Additionally, R1 also populates the FEC-TTL-TLV with B-SID label LB8, which was received in the B-SID indicator TLV from the reply, as well as TTL {2, 1, 0} for the FECs {F5, F7, F8} (i.e., the newly pushed segments and the current active segment at the headend). R3 populates the swapped label stack with the TTL values in this FEC-TTL-TLV. The packet transmits along the LSP and reaches R5, which replies to the traceroute-request indicating its role as 'transit' for the inner label and also includes a FEC-pop subTLV for FEC {F5}.
[0099] At the fifth sequence, upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LB8, 4), (LB14, 1)]. Additionally, R1 also populates the FEC-TTL-TLV with B-SID label LB8, which was received in the B-SID indicator TLV from the reply, as well as TTL {255, 2, 1} for the FECs {F5, F7, F8}. The FEC-TTL-TLV continues to be populated as long as the B-SID segment is active. R3 uses the TTL values in this TLV to populate the label stack. The packet transmits along the LSP and reaches R6, which replies to the traceroute-request indicating its role as 'transit.'
[00100] At the sixth sequence, upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LB8, 5), (LB14, 1)]. Additionally, R1 also populates the FEC-TTL-TLV with B-SID label LB8, which was received in the B-SID indicator TLV from the reply, as well as TTL {255, 3, 1} for the FECs {F5, F7, F8}. The FEC-TTL-TLV continues to be populated as long as the B-SID segment is active. The packet transmits along the LSP and reaches R7, which replies to the traceroute-request indicating its role as 'transit' for the inner label and also includes a FEC-pop subTLV for FEC {F7}.
[00101] At the seventh sequence, upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LB8, 6), (LB14, 1)]. Additionally, R1 also populates the FEC-TTL-TLV with B-SID label LB8, which was received in the B-SID indicator TLV from the reply, as well as TTL {255, 255, 2} for the FECs {F5, F7, F8}. The FEC-TTL-TLV continues to be populated as long as the B-SID segment is active. The packet transmits along the LSP and reaches R8, which replies to the traceroute-request indicating its role as 'transit' for the inner label (LB14) and also includes a FEC-pop subTLV for FEC {F8}. Additionally, since R8 checks the next label (LB14) received in the traceroute-request packet, it knows that the next label is a B-SID, and its reply to R1 includes a B-SID indicator TLV for {LB14}. R8 also sends a FEC-push subTLV for FEC {F10}, which is the segment R8 will push on top of a packet that ingresses with in-label LB14.
[00102] At the eighth sequence, upon receiving the reply, R1 sends a traceroute-request packet with LSP label stack [(L3, 255), (LB8, 255), (LB14, 2)]. Additionally, R1 also populates the FEC-TTL-TLV with B-SID label LB14, which was received in the B-SID indicator TLV from the reply, as well as TTL {1, 1} for the FECs {F10, F14} (i.e., the newly pushed segments and the current active segment at the headend). R2 reduces the TTL count of the outmost label and transmits the packet to R3. On receiving the packet, R3 determines that it is an MPLS ping/traceroute packet (e.g., received on destination UDP port 3503), yet the B-SID label being looked up, LB8, does not match the B-SID label carried in the packet's FEC-TTL-TLV, so R3 does not apply the TLV to its out-label stack {L5, L7, L8}. R3 then populates the TTL value of the outer label in the packet with the maximum value, MAX-TTL. With the outer label having the maximum TTL value, the packet may traverse the R3-R8 domain without being parsed. When the packet arrives at R8, which is the ingress of the next B-SID domain, the outer label of the packet matches the local B-SID label in the FEC-TTL-TLV, and R8 uses the TTL values from the TLV to populate its out-label stack. The packet is forwarded towards R9, which replies to this traceroute-request and indicates its role as 'transit.'
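The B-SID ingress behavior walked through for R3 and R8 above could be sketched as follows (the FEC-TTL-TLV is represented as a simple dictionary; in the non-matching case the sketch sets every newly imposed label to MAX-TTL, generalizing the outer-label treatment described in the text):

```python
MAX_TTL = 255

def bsid_ingress_ttls(local_bsid_label, out_labels, fec_ttl_tlv):
    """Choose TTLs for the swapped out-label stack at a B-SID ingress node.

    If the traceroute-request carries a FEC-TTL-TLV keyed to this node's
    B-SID label, its TTL list is applied to the out-label stack; otherwise
    the newly imposed labels get MAX_TTL so the packet crosses this domain
    without being parsed, leaving validation to a later B-SID domain.
    """
    if fec_ttl_tlv is not None and fec_ttl_tlv["bsid_label"] == local_bsid_label:
        ttls = fec_ttl_tlv["ttls"]
    else:
        ttls = [MAX_TTL] * len(out_labels)
    return list(zip(out_labels, ttls))

# Third sequence: R3 (B-SID LB8) applies the headend-provided TTLs {1, 1, 0}
# bsid_ingress_ttls("LB8", ["L5", "L7", "L8"],
#                   {"bsid_label": "LB8", "ttls": [1, 1, 0]})
# -> [("L5", 1), ("L7", 1), ("L8", 0)]
```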
[00103] The pattern continues similarly to the earlier sequences. Briefly, at the ninth sequence, R1 populates the FEC-TTL-TLV TTL values with TTL {2, 1} for the FECs {F10, F14}. R10 replies with a FEC-pop subTLV for {F10}.
[00104] At the tenth sequence, R1 populates the FEC-TTL-TLV TTL values with TTL {255, 2} for the FECs {F10, F14}. R11 replies, indicating its role as 'transit.'
[00105] At the 11th sequence, R1 populates the FEC-TTL-TLV TTL values with TTL {255, 3} for the FECs {F10, F14}. R12 replies, indicating its role as 'transit.'
[00106] At the 12th sequence, R1 populates the FEC-TTL-TLV TTL values with TTL {255, 4} for the FECs {F10, F14}. R13 replies, indicating its role as 'transit.'
[00107] At the 13th sequence, R1 populates the FEC-TTL-TLV TTL values with TTL {255, 5} for the FECs {F10, F14}. R14 replies, indicating its role as the 'egress,' and the traceroute iteration completes.
[00108] Embodiments of the invention may also be extended to the scenario where a B-SID domain stitches to an LSP in another domain with a second B-SID label. For example, the SR-TE LSP-R3-R14 at R3 may have the out-label stack {L5, L8, LB14}, where LB14 is the B-SID label allocated by R8 to traverse the R8-R14 domain via SR-TE LSP-R8-R14 with an out-label stack {L10, L14}. R3 may allocate a B-SID for LSP-R3-R14. The PCE may program the headend LSP with a label stack comprising L3 and the B-SID allocated by R3, corresponding to FEC {F3, F14}, to steer headend LSP traffic onto LSP-R3-R14 at R3. The same approach discussed herein above may be used to determine the B-SID label transition and populate the FEC-TTL-TLV, which is applicable from the new B-SID node.
[00109] If an operator wants to hide B-SID domain details from the head-end node, the operator may choose to override the TTL passed in the traceroute-request and populate MAX-TTL in the out-label stack populated at the downstream B-SID node. The next traceroute sequence would then start tracing the LSP from the egress of the B-SID domain, if there are more labels in the LSP label stack at the headend node.
[00110] When a node has limited computing resources to forward traceroute packets, the packet may be sent to the control plane of the node (e.g., to its MPLS ping server) for appropriate processing when the ingress label being looked up is a B-SID. Additional overhead may be added for such processing.
[00111] By using the two data structures to exchange B-SID information between an LSP headend node and downstream B-SID nodes, the proper TTL setting may be populated to perform traceroute validation. Thus, embodiments of the invention support complex B-SID implementations in segment routing.
Operation Flow per embodiments of the invention
[00112] Figure 8 is a flow diagram showing the operations to set proper TTL values for a segment routing based traceroute implementation in a Multi-Protocol Label Switching (MPLS) network per one embodiment of the invention. Method 800 may be performed by a network device (e.g., node R1) to verify an LSP as discussed herein above. The network device performs traceroute using a stack of labels, and each label in the stack identifies a segment of the LSP in segment routing in the MPLS network. The implementation is performed in an MPLS network operating in the short pipe model in one embodiment.
[00113] At reference 802, the network device sets a first time-to-live (TTL) count for a first segment to be a first value, wherein the first segment is identified by a first label in the stack of labels. The first label is the outmost label (also referred to as the outer label) of the stack of labels in one embodiment. The first value may be one when the first segment is the very first segment in the LSP for which the traceroute validation is to be performed. The first value is another integer when the first segment is a later segment in the LSP, where the earlier segments have already been validated.
[00114] At reference 804, the network device sends a first sequence of traceroute packets including the first label and the first TTL count to verify the first segment. As explained herein above, the downstream network devices will perform label operations and/or parse the traceroute packets (e.g., traceroute-request packets), and reply (e.g., with traceroute-reply packets). The first sequence of traceroute packets may be the traceroute sequence one discussed herein above relating to Table 2, Table 4, or Table 6 in some embodiments.
[00115] At reference 806, the network device receives a reply for the first sequence of traceroute packets from a second network device (e.g., a traceroute-reply packet from an LSP downstream node).
[00116] At reference 808, the network device sets a second TTL count for a second segment to be a second value, where the second segment is the immediate next segment of the first segment as indicated by a second label immediate next to the first label in the stack of labels, and where the second value is set based on the reply for the first sequence of traceroute packets. At reference 810, the network device sends a second sequence of traceroute packets including the second label and the second TTL count to verify the second segment. The second label is the inner label immediately next to and below the first label in the label stack (the very next inner label). The second label is for verification of the second segment as discussed herein above.
[00117] In one embodiment, the second value is set to two when the reply for the first sequence of traceroute packets indicates that the second network device is a termination point for the first label; otherwise the second value is set to be one over the first value (i.e., the first value incremented by one). The second network device is indicated as the termination point when the second network device has performed a label pop in one embodiment. In one embodiment, the reply for the first sequence of traceroute packets includes a forwarding equivalence class (FEC) type-length-value (TLV) indicating that the second network device has performed the label pop. An example of these operations is discussed in relation to reference 312 in Figure 3.
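The rule in this paragraph reduces to a one-line decision, sketched here under the assumption that the reply has already been parsed for the presence of a FEC-pop subTLV:

```python
def second_ttl_value(first_value, reply_has_fec_pop):
    """Second TTL count: two when the previous active segment was terminated
    (its label was popped), otherwise one more than the first value."""
    return 2 if reply_has_fec_pop else first_value + 1
```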
[00118] In one embodiment, the second sequence of traceroute packets further includes a set of TTL counts for labels in the stack of labels. In the second sequence, a third TTL count for a third label is set to be one, where the third label is immediately next to the second label in the stack of labels. The third label is for verifying a third segment immediately next to the second segment on the LSP path. The second sequence of traceroute packets may be the traceroute sequence two discussed herein above relating to Table 2, Table 4, or Table 6 in some embodiments.
[00119] Additionally, in one embodiment, the set of TTL counts includes one or more TTL counts for labels identifying segments that were verified earlier in the traceroute verification sequences, and those TTL values are set to the maximum value (e.g., MAX-TTL) in one embodiment. In one embodiment, the stack of labels may include one or more labels below the third label, and the TTL values for these labels are set to zero.
[00120] In one embodiment, the reply for the first sequence of traceroute packets indicates a binding SID (B-SID), and the network device populates values for a set of TTL counts, each count for a label included in the B-SID. In one embodiment, the reply indicating the B-SID uses a first type-length-value (TLV) (e.g., a B-SID indicator TLV 652). In one embodiment, the network device provides the values for the set of TTL counts, and the values are indicated using a second TLV (e.g., a FEC TTL TLV in Figure 6B). The network device provides the values to a B-SID ingress node (e.g., node R2 in Figure 4) to set TTL values for the swapped label stack at the B-SID ingress node. In one embodiment, the second network device (e.g., the B-SID ingress node such as node R2 in Figure 4) updates the values for the set of TTL counts for the labels included in the B-SID based on a TTL setting for the B-SID. For example, R3 and R8 update the TTL values as discussed herein above relating to Figure 7.
[00121] In one embodiment, the network device will receive another reply (after the transit segments within the LSP are verified) which indicates that a label switch path (LSP) egress network device has been reached, and the network device will then print out traceroute results for the stack of labels and terminate the traceroute as discussed herein above.
[00122] Embodiments of the invention are straightforward to implement, and the data structure implementation for announcing the B-SID policy can be performed using known data structures. The configuration of the data structures and the corresponding TTL setting may be fine-tuned per operator preference; thus, the extension to the traceroute implementation to support segment routing allows an MPLS network using segment routing to be more readily maintained.
A Network Device Implementing Embodiments of the Invention
[00123] Figure 9 shows a network device implementing the packet forwarding per one embodiment of the invention. The network device 902 may be implemented using custom application-specific integrated-circuits (ASICs) as processors and a special-purpose operating system (OS), or common off-the-shelf (COTS) processors and a standard OS.
[00124] The network device 902 includes hardware 940 comprising a set of one or more processors 942 (which are typically COTS processors or processor cores or ASICs) and physical NIs 946, as well as non-transitory machine-readable storage media 949 having stored therein software 950. During operation, the one or more processors 942 may execute the software 950 to instantiate one or more sets of one or more applications 964A-R. While one embodiment does not implement virtualization, alternative embodiments may use different forms of virtualization. For example, in one such alternative embodiment the virtualization layer 954 represents the kernel of an operating system (or a shim executing on a base operating system) that allows for the creation of multiple instances 962A-R called software containers that may each be used to execute one (or more) of the sets of applications 964A-R. The multiple software containers (also called virtualization engines, virtual private servers, or jails) are user spaces (typically a virtual memory space) that are separate from each other and separate from the kernel space in which the operating system is run. The set of applications running in a given user space, unless explicitly allowed, cannot access the memory of the other processes. In another such alternative embodiment, the virtualization layer 954 represents a hypervisor (sometimes referred to as a virtual machine monitor (VMM)) or a hypervisor executing on top of a host operating system, and each of the sets of applications 964A-R run on top of a guest operating system within an instance 962A-R called a virtual machine (which may in some cases be considered a tightly isolated form of software container) that runs on top of the hypervisor - the guest operating system and application may not know that they are running on a virtual machine as opposed to running on a "bare metal" host network device, or through para-virtualization the operating system and/or application may be aware of the presence of virtualization for optimization purposes. In yet other alternative embodiments, one, some, or all of the applications are implemented as unikernel(s), which can be generated by compiling directly with an application only a limited set of libraries (e.g., from a library operating system (LibOS) including drivers/libraries of OS services) that provide the particular OS services needed by the application. As a unikernel can be implemented to run directly on hardware 940, directly on a hypervisor (in which case the unikernel is sometimes described as running within a LibOS virtual machine), or in a software container, embodiments can be implemented fully with unikernels running directly on a hypervisor represented by virtualization layer 954, unikernels running within software containers represented by instances 962A-R, or as a combination of unikernels and the above-described techniques (e.g., unikernels and virtual machines both run directly on a hypervisor, unikernels and sets of applications that are run in different software containers).
[00125] The software 950 contains a traceroute controller 951 that performs operations described with reference to Figures 1-6. The traceroute controller 951 may be instantiated within the applications 964A-R. The instantiation of the one or more sets of one or more applications 964A-R, as well as virtualization if implemented, are collectively referred to as software instance(s) 952. Each set of applications 964A-R, the corresponding virtualization construct (e.g., instance 962A-R) if implemented, and that part of the hardware 940 that executes them (be it hardware dedicated to that execution and/or time slices of hardware temporally shared), forms a separate virtual network device 960A-R.
[00126] A network interface (NI) may be physical or virtual. In the context of IP, an interface address is an IP address assigned to a NI, be it a physical NI or virtual NI. A virtual NI may be associated with a physical NI, with another virtual interface, or stand on its own (e.g., a loopback interface, a point-to-point protocol interface). A NI (physical or virtual) may be numbered (a NI with an IP address) or unnumbered (a NI without an IP address).
[00127] Some of the embodiments contemplated herein above are described more fully with reference to the accompanying drawings. Other embodiments, however, are contained within the scope of the subject matter disclosed herein; the disclosed subject matter should not be construed as limited to only the embodiments set forth herein. Rather, these embodiments are provided by way of example to convey the scope of the subject matter to those skilled in the art.
[00128] Any appropriate steps, methods, features, functions, or benefits disclosed herein may be performed through one or more functional units or modules of one or more virtual apparatuses. Each virtual apparatus may comprise a number of these functional units. These functional units may be implemented via processing circuitry, which may include one or more microprocessors or microcontrollers, as well as other digital hardware, which may include digital signal processors (DSPs), special-purpose digital logic, and the like. The processing circuitry may be configured to execute program code stored in memory, which may include one or several types of memory such as read-only memory (ROM), random-access memory (RAM), cache memory, flash memory devices, optical storage devices, etc. Program code stored in memory includes program instructions for executing one or more telecommunications and/or data communications protocols as well as instructions for carrying out one or more of the techniques described herein. In some implementations, the processing circuitry may be used to cause the respective functional unit to perform corresponding functions according to one or more embodiments of the present disclosure.
[00129] The term unit may have its conventional meaning in the field of electronics, electrical devices, and/or electronic devices and may include, for example, electrical and/or electronic circuitry, devices, modules, processors, memories, logic solid state and/or discrete devices, computer programs or instructions for carrying out respective tasks, procedures, computations, outputs, and/or displaying functions, and so on, such as those that are described herein.

CLAIMS

What is claimed is:
1. A method performed by a network device in a multiprotocol label switching (MPLS) network, wherein the network device performs traceroute using a stack of labels, wherein each label in the stack identifies a segment in segment routing in the MPLS network, the method comprising: setting (802) a first time-to-live (TTL) count for a first segment to be a first value, wherein the first segment is identified by a first label in the stack of labels;
sending (804) a first sequence of traceroute packets including the first label and the first TTL count to verify the first segment;
receiving (806) a reply for the first sequence of traceroute packets from a second network
device;
setting (808) a second TTL count for a second segment to be a second value, wherein the second segment is the immediate next segment of the first segment as indicated by a second label immediate next to the first label in the stack of labels, and wherein the second value is set based on the reply for the first sequence of traceroute packets; and
sending (810) a second sequence of traceroute packets including the second label and the second TTL count to verify the second segment.
2. The method of claim 1, wherein the second value is set to two when the reply for the first sequence of traceroute packets indicates that the second network device is a termination point for the first label; and otherwise the second value is set to be one over the first value.
3. The method of claim 1 or 2, wherein the second sequence of traceroute packets further includes a third TTL count for a third label and a value of the third TTL count is set to be one, wherein the third label is immediate next to the second label in the stack of labels.
4. The method of claim 1 or 2, wherein the reply for the first sequence of traceroute packets indicating a binding SID (B-SID), and wherein the network device populates values for a set of TTL counts, each count for a label included in the B-SID.
5. The method of claim 4, wherein the reply indicating the B-SID uses a first type-length-value (TLV).
6. The method of claim 4, wherein the network device provides the values for the set of TTL counts indicated using a second TLV.
7. The method of claim 4, wherein the second network device updates the values for the set of TTL counts for the labels included in the B-SID based on a TTL setting for the B-SID.
8. A network device, comprising:
a processor (942) and computer-readable storage medium (949) that provides instructions that, when executed by the processor, cause the network device to perform:
computing a shortest path from the network device to a destination network device; setting (802) a first time-to-live (TTL) count for a first segment to be a first value,
wherein the first segment is identified by a first label in a stack of labels, wherein the network device performs traceroute using the stack of labels, wherein each label in the stack identifies a segment in segment routing in a multiprotocol label switching (MPLS) network;
sending (804) a first sequence of traceroute packets including the first label and the first TTL count to verify the first segment;
receiving (806) a reply for the first sequence of traceroute packets from a second
network device;
setting (808) a second TTL count for a second segment to be a second value, wherein the second segment is the immediate next segment of the first segment as indicated by a second label immediate next to the first label in the stack of labels, and wherein the second value is set based on the reply for the first sequence of traceroute packets; and
sending (810) a second sequence of traceroute packets including the second label and the second TTL count to verify the second segment.
9. The network device of claim 8, wherein the second value is set to two when the reply for the first sequence of traceroute packets indicates that the second network device is a termination point for the first label; and otherwise the second value is set to be one over the first value.
10. The network device of claim 8 or 9, wherein the second sequence of traceroute packets further includes a third TTL count for a third label and a value of the third TTL count is set to be one, wherein the third label is immediate next to the second label in the stack of labels.
11. The network device of claim 8 or 9, wherein the reply for the first sequence of traceroute packets indicating a binding SID (B-SID), and wherein the network device populates values for a set of TTL counts, each count for a label included in the B-SID.
12. The network device of claim 11, wherein the reply indicating the B-SID uses a first type- length-value (TLV).
13. The network device of claim 11, wherein the network device provides the values for the set of TTL counts indicated using a second TLV to the second network device.
14. A non-transitory computer-readable storage medium (949) that provides instructions that, when executed by a processor of a network device, cause the network device to perform:
setting (802) a first time-to-live (TTL) count for a first segment to be a first value, wherein the first segment is identified by a first label in a stack of labels, wherein the network device performs traceroute using the stack of labels, wherein each label in the stack identifies a segment in segment routing in a multiprotocol label switching (MPLS) network;
sending (804) a first sequence of traceroute packets including the first label and the first TTL count to verify the first segment;
receiving (806) a reply for the first sequence of traceroute packets from a second network
device;
setting (808) a second TTL count for a second segment to be a second value, wherein the second segment is the immediate next segment of the first segment as indicated by a second label immediate next to the first label in the stack of labels, and wherein the second value is set based on the reply for the first sequence of traceroute packets; and
sending (810) a second sequence of traceroute packets including the second label and the second TTL count to verify the second segment.
15. The non-transitory computer-readable storage medium of claim 14, wherein the second value is set to two when the reply for the first sequence of traceroute packets indicates that the second network device is a termination point for the first label; and otherwise the second value is set to be one over the first value.
16. The non-transitory computer-readable storage medium of claim 14 or 15, wherein the second sequence of traceroute packets further includes a third TTL count for a third label and a value of the third TTL count is set to be one, wherein the third label is immediate next to the second label in the stack of labels.
17. The non-transitory computer-readable storage medium of claim 14 or 15, wherein the reply for the first sequence of traceroute packets indicating a binding SID (B-SID), and wherein the network device populates values for a set of TTL counts, each count for a label included in the B-SID.
18. The non-transitory computer-readable storage medium of claim 17, wherein the reply indicating the B-SID uses a first type-length-value (TLV).
19. The non-transitory computer-readable storage medium of claim 17, wherein the network device provides the values for the set of TTL counts indicated using a second TLV to the second network device.
20. The non-transitory computer-readable storage medium of claim 17, wherein the second network device updates the values for the set of TTL counts for the labels included in the B-SID based on a TTL setting for the B-SID.
PCT/IN2018/050883 2018-12-26 2018-12-26 Method and system to extend segment routing based traceroute in a multiprotocol label switching network WO2020136660A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/IN2018/050883 WO2020136660A1 (en) 2018-12-26 2018-12-26 Method and system to extend segment routing based traceroute in a multiprotocol label switching network


Publications (1)

Publication Number Publication Date
WO2020136660A1 true WO2020136660A1 (en) 2020-07-02

Family

ID=71128540

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2018/050883 WO2020136660A1 (en) 2018-12-26 2018-12-26 Method and system to extend segment routing based traceroute in a multiprotocol label switching network

Country Status (1)

Country Link
WO (1) WO2020136660A1 (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180026880A1 (en) * 2016-07-21 2018-01-25 Cisco Technology, Inc. Target fec (forwarding equivalence class) stack based fec query in segment routing environments
CN107248941A (en) * 2017-06-30 2017-10-13 华为技术有限公司 A kind of method and apparatus in detection path

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114205279A (en) * 2020-09-01 2022-03-18 中国移动通信有限公司研究院 Path selection strategy configuration method, path selection method, device and storage medium
CN114205279B (en) * 2020-09-01 2023-07-21 中国移动通信有限公司研究院 Path selection policy configuration method, path selection device and storage medium
EP4246897A4 (en) * 2020-12-04 2024-04-17 Huawei Tech Co Ltd Method, apparatus, and system for managing tunnel
CN114221891A (en) * 2021-12-13 2022-03-22 中国电信股份有限公司 Binding segment identifier splicing method, route reflector, autonomous domain and cross-domain network
CN114221891B (en) * 2021-12-13 2023-08-29 中国电信股份有限公司 Binding segment identification splicing method, routing reflector, autonomous domain and cross-domain network

Similar Documents

Publication Publication Date Title
US10484203B2 (en) Method for implementing communication between NVO3 network and MPLS network, and apparatus
EP3836490A1 (en) Vpn cross-domain implementation method, device, and border node
US11129061B1 (en) Local identifier locator network protocol (ILNP) breakout
US9629037B2 (en) Handover of a mobile device in an information centric network
CN110832904B (en) Local Identifier Locator Network Protocol (ILNP) breakout
US20130266013A1 (en) SYSTEM AND METHOD FOR USING LABEL DISTRIBUTION PROTOCOL (LDP) IN IPv6 NETWORKS
US11895006B2 (en) Communication method, device, and system
JP2019505140A (en) Techniques for revealing maximum node and / or link segment identifier depth using OSPF
US11671352B2 (en) Message sending method, binding relationship advertising method, apparatus, and storage medium
WO2017141076A1 (en) Stateless multicast protocol for low-power and lossy networks
US20220141761A1 (en) Dynamic access network selection based on application orchestration information in an edge cloud system
WO2020212998A1 (en) Network address allocation in a virtual layer 2 domain spanning across multiple container clusters
WO2020136660A1 (en) Method and system to extend segment routing based traceroute in a multiprotocol label switching network
EP3806401A1 (en) Method for sending and receiving message, apparatus, and system
US11343332B2 (en) Method for seamless migration of session authentication to a different stateful diameter authenticating peer
US11375405B2 (en) Identifier-locator network protocol (ILNP) coordinated multipoint (CoMP) and multiple connectivity
WO2020152691A1 (en) Multi-network internet protocol version 6 (ipv6) duplicate address detection using ethernet virtual private network (evpn)
US11876881B2 (en) Mechanism to enable third party services and applications discovery in distributed edge computing environment
US20240007388A1 (en) Smart local mesh networks
US11451637B2 (en) Method for migration of session accounting to a different stateful accounting peer
WO2018158615A1 (en) Method and apparatus for enabling the creation of a point-to-multipoint label switched path multicast distribution tree for a given ip multicast stream
WO2023012502A1 (en) Securing multi-path tcp (mptcp) with wireguard protocol

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18944198

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018944198

Country of ref document: EP

Effective date: 20210726

122 Ep: pct application non-entry in european phase

Ref document number: 18944198

Country of ref document: EP

Kind code of ref document: A1