WO2016149212A1 - Ethernet auto-negotiation techniques for determining link width - Google Patents

Ethernet auto-negotiation techniques for determining link width

Info

Publication number
WO2016149212A1
WO2016149212A1 PCT/US2016/022365 US2016022365W WO2016149212A1 WO 2016149212 A1 WO2016149212 A1 WO 2016149212A1 US 2016022365 W US2016022365 W US 2016022365W WO 2016149212 A1 WO2016149212 A1 WO 2016149212A1
Authority
WO
WIPO (PCT)
Prior art keywords
lane
base page
link partner
link
circuitry
Prior art date
Application number
PCT/US2016/022365
Other languages
French (fr)
Inventor
Adee O. RAN
Ilango S. Ganga
Kent C. LUSTED
Original Assignee
Intel Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corporation
Publication of WO2016149212A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/40 Bus networks
    • H04L12/407 Bus networks with decentralised control
    • H04L12/413 Bus networks with decentralised control with random access, e.g. carrier-sense multiple-access with collision detection [CSMA-CD]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/42 Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4265 Bus transfer protocol, e.g. handshake; Synchronisation on a point to point bus
    • G06F13/4278 Bus transfer protocol, e.g. handshake; Synchronisation on a point to point bus using an embedded synchronisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/40 Bus networks
    • H04L12/40169 Flexible bus arrangements
    • H04L12/40176 Flexible bus arrangements involving redundancy
    • H04L12/40182 Flexible bus arrangements involving redundancy by using a plurality of communication lines

Definitions

  • the present disclosure relates to enhancements for Ethernet network systems.
  • Some conventional Ethernet network systems support multi-lane links as well as single lane links.
  • An emerging application is the ability to run several single lane links from a multi-lane link connection.
  • the conventional Ethernet protocols require user configuration of ports to support multi-lane to single-lane links, and this configuration must be manually enabled, via user intervention, each time the link is brought up.
  • FIG. 1 illustrates a network system consistent with various embodiments of the present disclosure
  • FIG. 2 illustrates a breakout connection example according to one embodiment of the present disclosure
  • FIG. 3 illustrates an example base page according to one embodiment of the present disclosure.
  • FIG. 4 illustrates a flowchart of operations of one example embodiment consistent with the present disclosure
  • FIG. 1 illustrates a network system 100 consistent with various embodiments of the present disclosure.
  • Network system 100 generally includes at least one network node element 102 (also referred to herein as “source node 102” or “sender node”) and at least one end node element 126 (also referred to herein as “receiving node” and “link partner”), each configured to communicate with one another via communication link 124, as shown.
  • the network fabric may include a plurality of intermediate node elements and/or end node elements, each connected in series and/or parallel with each other and/or with the source node 102, to form, for example, a torus network topology, ring topology, Clos topology, fat tree topology, etc.
  • the source node 102 and/or end node 126 (and/or any intermediate node or nodes) may each comprise a computer node element (e.g., host server system, laptop, tablet, workstation, etc.), switch, router, bridge, hub, fabric interconnect, network storage device, network attached device, nonvolatile memory (NVM) storage device, etc.
  • the terms “source node” and “end node” are used to simplify the description and are not meant to imply a unidirectional transmission flow. Although one side of a full duplex connection may often be referred to herein, the operations are also applicable to the reverse direction (e.g., from end node 126 to source node 102, via one or more intermediate nodes, etc.).
  • the source node 102 includes a network controller 104 (e.g., network interface card, etc.), a system processor 106 (e.g., multi-core general purpose processor, such as those provided by Intel Corp., etc.) and system memory 108.
  • the system memory 108 may host a plurality of operating system driver stacks 128 to enable, for example, one or more multi-lane communication links, one or more single-lane communication links, etc., and as will be described in greater detail below.
  • the end node 126 (and/or any intermediate node or nodes) each may be configured and operate in a similar manner as the node 102, as described in greater detail below.
  • the source node 102 and the end node 126 may communicate with each other, via link 124, using, for example, an Ethernet communications protocol.
  • the Ethernet communications protocol may be capable of providing communication using a Transmission Control Protocol/Internet Protocol (TCP/IP).
  • the Ethernet protocol may comply or be compatible with the Ethernet standard published by the Institute of Electrical and Electronics Engineers (IEEE) titled “IEEE 802.3 Standard,” published in March 2002, and/or later versions of this standard, for example, the IEEE 802.3 Standard for Ethernet, published 2012, and “IEEE Std 802.3bj™,” published 2014, titled “IEEE Standard for Ethernet Amendment 2: Physical Layer Specifications and Management Parameters for 100 Gb/s Operation Over Backplanes and Copper Cables.”
  • the source node 102 and the end node 126 may communicate with each other, via link 124 using, for example, a custom and/or proprietary communications protocol.
  • the custom and/or proprietary communications protocol may be at least partially compliant with the aforementioned 802.3 Ethernet communications protocols.
  • the network controller 104 is configured to communicate with the link partner 126 using the aforementioned Ethernet communications protocol.
  • Network controller 104 is also generally configured to perform various operations in a defined order when a link is first established with the link partner 126 (e.g., upon system initialization, establishing a new link with the link partner, etc.). Such operations may include, for example, an auto-negotiation period during which various capabilities of the node 102 and the link partner 126 are exchanged, followed by a link training period during which the quality of the communications link 124 may be determined, followed by a link up state, or data mode, when data frames/packets are exchanged between the node 102 and link partner 126.
  • the auto-negotiation period, link training period and data mode period may be defined under the aforementioned Ethernet communications protocol.
  • the network controller 104 includes PHY circuitry 110 generally configured to interface the node 102 with the end node 126, via communications link 124.
  • PHY circuitry 110 may comply or be compatible with the aforementioned IEEE 802.3 Ethernet communications protocol, which may include, for example, single-lane PHY protocols such as 10GBASE-KX, 10GBASE-KR, etc., and/or multi-lane PHY protocols such as 10GBASE-KX4, 40GBASE-KR4, 40GBASE-CR4, 100GBASE-CR10, 100GBASE-CR4, 100GBASE-KR4, and/or 100GBASE-KP4, etc.
  • PHY circuitry 110 includes transmit circuitry (Tx) 112 configured to transmit data packets and/or frames to the end node 126, via link 124, and receive circuitry (Rx) 114 configured to receive data packets and/or frames from the end node 126, via link 124.
  • PHY circuitry 110 may also include encoding/decoding circuitry (not shown) configured to perform analog-to-digital and digital-to-analog conversion, encoding and decoding of data, analog parasitic cancellation (for example, cross talk cancellation), and recovery of received data.
  • Rx circuitry 114 may include phase lock loop circuitry (PLL, not shown) configured to coordinate timing of data reception from the end node 126.
  • Source node 102 and end node 126 may each include respective ports 130, 132 which define the number of lanes of the source node 102 and end node 126, respectively.
  • Each lane of the ports 130, 132 may include a plurality of logical and/or physical channels (e.g., differential pair channels) that provide separate connections between, for example, the Tx and Rx 112/114 of the node 102 and an Rx and Tx, respectively, of the end node 126.
  • a “single-lane link”, as used herein, is defined as a single Tx/Rx transmission pair.
  • a “multi-lane link”, as used herein, is defined as two or more Tx/Rx transmission pairs.
  • Link width refers to the number of lanes in the communication link.
  • the PHY circuitry 110 of the network controller 104 may be duplicated, depending on the number of lanes associated with the port 130.
  • port 130 may include four lanes and the PHY circuitry may be compliant with 10GBASE-KX4, 40GBASE-KR4, 40GBASE-CR4, 100GBASE-CR4, 100GBASE-KR4, and/or 100GBASE-KP4.
  • the communications link 124 may comprise, for example, a media dependent interface that may include, for example, copper twin-axial cable, backplane traces on a printed circuit board, copper twisted pair cable, etc.
  • the communications link 124 may include a physical cable such as the SFP+ Direct attach cable (defined in appendix E of the SFF-8431 protocol, Rev. 4.1 (published July 6, 2009)), which may be used as a "breakout" cable to connect a multi-lane link to one or more single-lane links.
  • FIG. 2 illustrates a breakout connection example 200 that includes a four lane multi-lane port 202 coupled to four single-lane ports 204A, 204B, 204C and 204D.
  • the multi-lane port 202 may be associated with the source node 102, and the single-lane ports 204A, 204B, 204C and 204D may each be associated with an end node 126 and/or one or more intermediate nodes, etc.
  • the multi-lane port 202 includes a plurality of Tx/Rx pairs, e.g., Tx0/Rx0 coupled to Rx/Tx of port 204A, Tx1/Rx1 coupled to Rx/Tx of port 204B, etc.
  • the link 124' in this example may include a 4-port "breakout" cable such as the aforementioned SFP+ Direct attach cable.
  • port 202 may be associated with a 40GBASE-CR4 PHY and each port 204A, 204B, 204C and 204D may be associated with a 10GBASE-KR PHY, and the breakout cable connection 124' provides a 4-lane port to four single-lane ports connection.
  • Network controller 104 also includes a media access control (MAC) circuitry 118 configured to provide addressing and channel access control protocols for communication with the end node 126, as may be defined by the aforementioned Ethernet communications protocol (e.g., MAC circuitry 118 may be a Layer 2 device).
  • Network controller 104 also includes an auto-negotiation circuitry 116 configured to perform auto-negotiation operations between the node 102 and link partner 126. The auto-negotiation operations may be defined by the aforementioned IEEE 802.3 Ethernet communications protocol.
  • circuitry 116 is configured to communicate to the link partner 126 a defined set of capabilities of the node 102.
  • the defined set of capabilities may include, for example, PHY technology abilities, maximum link speed, forward error correction (FEC) and/or FEC mode capabilities, Pause ability, etc., as may be defined by the aforementioned IEEE 802.3 Ethernet communications protocol.
  • the link partner 126 is configured to communicate to the node 102 the defined set of capabilities of the link partner 126. The exchange of capabilities between the node 102 and link partner 126 occurs within a defined auto-negotiation time period.
  • the circuitry 116 may be configured to format a link codeword base page (base page) 120 to define the capabilities of the node 102.
  • the base page 120 defined by the
  • the auto-negotiation circuitry 116 is configured to utilize the base page 120 to determine the lane width of the
  • the formatted base page 120 is transmitted to the link partner 126, and a similar base page (not shown) is transmitted from the link partner 126 to the node 102. Determining the lane width may enable, in some embodiments, a multi-lane PHY link to resolve to a plurality of single-lane PHY links.
  • FIG. 3 is an example base page 120' according to one embodiment of the present disclosure. It should be understood that, during the auto-negotiations period, base pages may be exchanged between the source node 102 and the end node 126.
  • the base page 120' of this embodiment is used to define the PHY circuitry utilized by the node 102, which may include, for example, 10GBASE-KR, 40GBASE-KR4, 40GBASE-CR4, 100GBASE-CR10, etc.
  • the example base page 120' depicts bits 0-47, as may be defined by the aforementioned Ethernet communications protocol. Bits D20:16 typically represent a transmit NONCE field, where circuitry 116 populates a bit sequence in the transmit NONCE field.
  • Bits D9:5 represent an echoed NONCE field that is utilized by the link partner to transmit the received transmit NONCE field sequence, and thus, this field is populated by the link partner and transmitted back to the source node 102 as an
  • Bits D45:21 represent the technical ability of the PHY circuitry, e.g., each bit represents a flag to signify the PHY capabilities such as 10GBASE-KR, 40GBASE-KR4, 40GBASE-CR4, 100GBASE-CR10, etc.
  • the order of the capabilities represented by bits D45:21 may be defined by the aforementioned Ethernet communications protocol.
  • the base page 120' is only a specific example, and the base page 120' may be modified according to the teachings set forth herein.
  • the formatted base page 120 is transmitted to the link partner 126, and a similar base page (not shown) is transmitted from the link partner 126 to the node 102.
  • the auto-negotiation circuitry 116 may be configured to utilize the transmit NONCE and echoed NONCE fields of the base page to automatically determine the lane width capabilities of the link partner.
  • the auto-negotiation circuitry 116 may be configured to format the base page 120 with a desired and/or required lane status of the port associated with the PHY circuitry 110.
  • the source node 102 is configured to format the base page 120 by, among other things, populating the transmit NONCE field bits D20:16 with a unique identifier and populating bits D45:21 with the technical abilities (PHY types) of the PHY circuitry 110.
  • the port 130 associated with the PHY circuitry 110 is a multi-lane port (e.g., 4-lane port, 10-lane port, etc.)
  • the auto-negotiations circuitry 116 may begin to transmit the formatted base page 120 on a start lane (e.g., lane 0) of the link.
  • circuitry 116 is configured to "listen", on the Rx lines of one or more lanes, for a similarly-formatted base page from a link partner connected to at least one lane.
  • circuitry 116 is configured to enable the PHY circuitry to "listen" for a transmitted base page from a link partner on all of the lanes of the port. If the base page received from the link partner indicates multi-lane capabilities, then the node 102 and the link partner may establish the highest-bandwidth common multi-lane capability, thus establishing a multi-lane connection between the source node and the end node (link partner).
  • the circuitry 116 may establish a single lane connection with the link partner on that lane, and may further discontinue transmitting multi-lane capabilities by reformatting the base page to include only single lane PHY capabilities.
  • the link partner includes one or more single-lane connections
  • a base page transmitted from the link partner to the source node includes an echoed NONCE field that matches the transmit NONCE field of the base page that was transmitted by the source node, this indicates that the source node and link partner are communicating on the same lane, and a single lane connection may be resolved between the source node and the link partner on that given lane.
  • the source node is transmitting and listening on Lane 0, while the link partner is transmitting on Lane 1.
  • the circuitry 116 may be configured to wait a defined period of time and shift the transmit function of the base page to the next (or some other) lane. Since the link partner may also shift transmitting its base page to another lane, and to avoid a symmetrical jump between lanes between the source node and end node (which may create an endless loop), the source node and end node may have different defined time periods before a lane shift occurs. For example, the defined time period for the source node may be based on the value of the transmit NONCE field, adjusted by the value of echoed NONCE field. Once the echoed nonce field matches the transmit NONCE field, a single lane link may be resolved for that lane. Of course, if the link partner is a single-lane port device, then the above description to avoid symmetrical lane jumping may not be necessary.
  • Circuitry 116 may be configured to shift to another lane, as described above, using, for example, round-robin ordering (e.g., lane shifting is done in sequence, starting at lane 0) and/or a preprogrammed lane shifting order, etc.
  • although the source node 102 may include a multi-lane port, it may be desirable in some applications to create one or more single lane connections with one or more link partners, rather than use a multi-lane connection, such as by using a "breakout" cable described above.
  • the circuitry 116 may format the base page 120 to only include single lane PHY capabilities, thus forcing the link partner to establish a single lane connection with the source node on one of the lanes.
  • the source node and/or the end node may be configured to discontinue transmitting multi-lane capabilities by reformatting the base page to include only single lane PHY capabilities, thus enabling at least one single lane connection between the source node and the end node.
  • Circuitry 116 may also be configured to provide lane re-ordering, for example, when a multi-lane port is coupled to a plurality of single lanes (such as a breakout connection, described above).
  • circuitry 116 may also be configured to map an Rx lane to a corresponding Tx lane to logically create a single lane i. Reordering may be due to accidental swapping of the pairs in the cable, or intentional, due to routing restrictions or other reasons.
  • Reordering may be supported using, for example, upper stack layers (e.g., PMA, PCS, etc.) so that if the lanes are physically re-ordered, the circuitry 116 can identify that and logically correct the order to recover the original data stream.
  • FIG. 4 illustrates a flowchart of operations 400 of one example embodiment consistent with the present disclosure.
  • the operations of this embodiment depict the determination and resolution of a multi-lane link or one or more single-lane links during an auto-negotiations period.
  • a source node having a multi-lane port and multi-lane PHY capabilities sets a lane, Lane_i, to Lane 0.
  • the source node listens for an auto-negotiation base page from one or more link partners on all lanes, and transmits a base page to a link partner on Lane 0, at operation 406.
  • the transmitted base page may indicate multi-lane PHY capabilities and/or single-lane PHY capabilities, and may include a unique lane identifier in a transmit NONCE field.
  • a base page received from (transmitted from) the at least one link partner may indicate multi-lane PHY capabilities and/or single-lane PHY capabilities, and may include a unique lane identifier in a transmit NONCE field, and may also include an echoed NONCE field.
  • the source node determines if a base page that includes multi-lane capabilities is received on any lane, and if so, at operation 410, resolves the link as a multi-lane link between the source node and the link partner. Operations may also include ending the auto-negotiations mode and continuing on to other operational modes 412. If no multi-lane base page is received on any lane, operations may include determining if a single-lane base page is received on any lane 414.
  • operations may also include determining if the received base page from the link partner includes a correct echoed NONCE field (e.g. the echoed NONCE field matches the transmit NONCE field of the base page transmitted by the source node) 416. If the correct echoed NONCE field is determined, operations may also include creating a single lane connection by pairing the lane with the correct received echoed NONCE field with Lane_i 418. Operations may further include shifting to the next lane (e.g., Lane_i+1) and repeating the operations.
  • operations may also include waiting a defined time period and shifting the transmit lane to the next lane (e.g., Lane_i+1) 420, and repeating the operations until the transmit NONCE and the echoed NONCE match.
  • the host processor 106 may include one or more processor cores and may be configured to execute system software.
  • System software may include, for example, operating system code (e.g., OS kernel code) and local area network (LAN) driver code.
  • LAN driver code may be configured to control, at least in part, the operation of the network controller 104.
  • System memory may include I/O memory buffers configured to store one or more data packets that are to be transmitted by, or received by, network controller 104.
  • Chipset circuitry may generally include "North Bridge" circuitry (not shown) to control communication between the processor, network controller 104 and system memory 108.
  • Node 102 and/or link partner 126 may further include an operating system (OS, not shown) to manage system resources and control tasks that are run on, e.g., node 102.
  • the OS may be implemented using Microsoft Windows, HP-UX, Linux, or UNIX, although other operating systems may be used.
  • the OS may be replaced by a virtual machine monitor (or hypervisor) which may provide a layer of abstraction for underlying hardware to various operating systems (virtual machines) running on one or more processing units.
  • the operating system and/or virtual machine may implement one or more protocol stacks.
  • a protocol stack may execute one or more programs to process packets.
  • An example of a protocol stack is a TCP/IP (Transmission Control Protocol/Internet Protocol) protocol stack comprising one or more programs for handling (e.g., processing or generating) packets to transmit and/or receive over a network.
  • a protocol stack may alternatively be comprised of a dedicated sub-system such as, for example, a TCP offload engine and/or network controller 104.
  • the TCP offload engine circuitry may be configured to provide, for example, packet transport, packet segmentation, packet reassembly, error checking, transmission acknowledgements, transmission retries, etc., without the need for host CPU and/or software involvement.
  • the system memory 108 may comprise one or more of the following types of memory: semiconductor firmware memory, programmable memory, non-volatile memory, read only memory, electrically programmable memory, random access memory, flash memory, magnetic disk memory, and/or optical disk memory. Either additionally or alternatively system memory may comprise other and/or later-developed types of computer-readable storage devices.
  • Embodiments of the operations described herein may be implemented in a system that includes at least one tangible computer-readable storage device having stored thereon, individually or in combination, instructions that when executed by one or more processors perform the operations.
  • the one or more processors may include, for example, a processing unit and/or programmable circuitry in the network controller 104, system processor 106 and/or other processing unit or programmable circuitry.
  • operations according to the methods described herein may be distributed across a plurality of physical devices, such as processing structures at several different physical locations.
  • the storage device may include any type of tangible, non-transitory storage device, for example, any type of disk including floppy disks, optical disks, compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic and static RAMs, erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), flash memories, magnetic or optical cards, or any type of storage device suitable for storing electronic instructions.
  • a hardware description language may be used to specify circuit and/or logic implementation(s) for the various circuitry described herein.
  • the hardware description language may comply or be compatible with a very high speed integrated circuits (VHSIC) hardware description language (VHDL) that may enable semiconductor fabrication of one or more circuits and/or logic described herein.
  • VHDL may comply or be compatible with IEEE Standard 1076-1987, IEEE Standard 1076.2, IEEE1076.1, IEEE Draft 3.0 of VHDL-2006, IEEE Draft 4.0 of VHDL-2008 and/or other versions of the IEEE VHDL standards and/or other hardware description standards.
  • Circuitry may comprise, for example, singly or in any combination, hardwired circuitry, programmable circuitry, state machine circuitry, logic and/or firmware that stores instructions executed by programmable circuitry.
  • the circuitry may be embodied as an integrated circuit, such as an integrated circuit chip.
  • the circuitry may be formed, at least in part, by the system CPU 106 executing code and/or instruction sets (e.g., software, firmware, etc.) corresponding to the functionality described herein, thus transforming a general-purpose processor into a specific-purpose processing environment to perform one or more of the operations described herein.
  • the network controller 104 may be embodied as a stand-alone integrated circuit or may be incorporated as one of several components on an integrated circuit. In some embodiments, the various components and circuitry of the network controller 104 or other systems may be combined in a system-on-a-chip (SoC) architecture.
  • SoC system-on-a-chip
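
The decision flow described above for FIG. 4 can be sketched in Python as a single resolution step. This is only an illustrative sketch, not the claimed implementation: the function name `resolve_step`, the dictionary keys `multi_lane` and `echoed_nonce`, and the modulo wrap-around in the round-robin shift are assumptions introduced here. The defined wait period before a lane shift is not modeled, because the text does not give a formula for it.

```python
from typing import Dict, Optional, Tuple


def resolve_step(
    tx_lane: int,
    transmit_nonce: int,
    received: Dict[int, dict],
    num_lanes: int,
) -> Tuple[str, Optional[int]]:
    """One pass of a FIG. 4 style decision for a multi-lane source port.

    `received` maps an Rx lane number to a base page summary with the
    hypothetical keys 'multi_lane' (bool) and 'echoed_nonce' (int or None).
    """
    # A multi-lane base page on any lane resolves a multi-lane link.
    for page in received.values():
        if page["multi_lane"]:
            return ("multi-lane", None)

    # Otherwise look for a single-lane page that echoes our transmit NONCE;
    # that Rx lane is paired with the current Tx lane.
    for rx_lane, page in sorted(received.items()):
        if page["echoed_nonce"] == transmit_nonce:
            return ("single-lane", rx_lane)

    # No match: shift the transmit lane in round-robin order (assumed to wrap).
    return ("shift", (tx_lane + 1) % num_lanes)
```

A caller would repeat this step, retransmitting the base page on the returned lane after each "shift" result, until a multi-lane or single-lane link is resolved.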

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Systems (AREA)

Abstract

This disclosure describes, in one embodiment, a network node element that includes a multi-lane port that includes a plurality of lanes for communication with at least one link partner; PHY circuitry including transmit circuitry and receive circuitry for each lane of the multi-lane port; and an auto-negotiations circuitry to transmit, during an auto-negotiation period of transmission between the network controller and the at least one link partner, a first base page on a first lane of the multi-lane port to the link partner, the first base page including a field for specifying a transmit NONCE sequence number and a field for identifying at least one PHY capability and at least one link width associated with the PHY circuitry; the auto-negotiations circuitry also to receive, during the auto-negotiation period, a second base page on at least one lane from the at least one link partner, the second base page including a field for specifying an echoed NONCE sequence number and a field for identifying at least one PHY capability and at least one link width associated with the link partner; and wherein the auto-negotiations circuitry is to determine if the link partner includes a multi-lane port or at least one single-lane port based on, at least in part, the at least one PHY capability identified in the second base page and to resolve a multi-lane connection or a single-lane connection with the link partner based on, at least in part, the at least one PHY capability identified in the second base page.
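
As a rough illustration of the base page fields named in the abstract, the following Python sketch packs and unpacks a 48-bit base page. The bit positions follow the example base page described in this document (echoed NONCE in D9:5, transmit NONCE in D20:16, PHY ability flags in D45:21). The function names and the two ability flag constants are hypothetical and chosen only for this example; other base page fields are left as zero.

```python
# Field positions taken from the example base page in this document.
ECHOED_NONCE_SHIFT = 5      # D9:5
TRANSMIT_NONCE_SHIFT = 16   # D20:16
ABILITY_SHIFT = 21          # D45:21
NONCE_MASK = (1 << 5) - 1       # 5-bit NONCE fields
ABILITY_MASK = (1 << 25) - 1    # 25-bit ability field

# Hypothetical flag positions, for illustration only.
ABILITY_EXAMPLE_SINGLE_LANE = 1 << 0
ABILITY_EXAMPLE_MULTI_LANE = 1 << 1


def pack_base_page(transmit_nonce: int, echoed_nonce: int, abilities: int) -> int:
    """Build a 48-bit base page value from the three fields."""
    for name, value, mask in (
        ("transmit_nonce", transmit_nonce, NONCE_MASK),
        ("echoed_nonce", echoed_nonce, NONCE_MASK),
        ("abilities", abilities, ABILITY_MASK),
    ):
        if not 0 <= value <= mask:
            raise ValueError(f"{name} does not fit in its field")
    return (
        (echoed_nonce << ECHOED_NONCE_SHIFT)
        | (transmit_nonce << TRANSMIT_NONCE_SHIFT)
        | (abilities << ABILITY_SHIFT)
    )


def unpack_base_page(page: int) -> dict:
    """Extract the three fields from a base page value."""
    return {
        "echoed_nonce": (page >> ECHOED_NONCE_SHIFT) & NONCE_MASK,
        "transmit_nonce": (page >> TRANSMIT_NONCE_SHIFT) & NONCE_MASK,
        "abilities": (page >> ABILITY_SHIFT) & ABILITY_MASK,
    }
```

A receiver could compare the unpacked `echoed_nonce` against the NONCE it transmitted to decide whether both ends are communicating on the same lane.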

Description

ETHERNET AUTO-NEGOTIATION TECHNIQUES FOR DETERMINING LINK WIDTH
Inventors:
Adee Ran, Ilango Ganga, and Kent Lusted
CROSS-REFERENCES TO RELATED APPLICATION
This application claims the benefit of US provisional application serial number 62/133,320, filed 14 Mar 2015, which is hereby incorporated by reference in its entirety.
FIELD
The present disclosure relates to enhancements for Ethernet network systems.
BACKGROUND
Some conventional Ethernet network systems support multi-lane links as well as single lane links. An emerging application is the ability to run several single lane links from a multi-lane link connection. However, the conventional Ethernet protocols require the need for user configuration of ports to support multi-lane to single-lane links, which must be manually enabled, via user intervention, each time the link is brought up.
BRIEF DESCRIPTION OF DRAWINGS
Features and advantages of the claimed subject matter will be apparent from the following detailed description of embodiments consistent therewith, which description should be considered with reference to the accompanying drawings, wherein:
FIG. 1 illustrates a network system consistent with various embodiments of the present disclosure;
FIG. 2 illustrates a breakout connection example according to one embodiment of the present disclosure;
FIG. 3 illustrates an example base page according to one embodiment of the present disclosure; and
FIG. 4 illustrates a flowchart of operations of one example embodiment consistent with the present disclosure.
Although the following Detailed Description will proceed with reference being made to illustrative embodiments, many alternatives, modifications, and variations thereof will be apparent to those skilled in the art.
DETAILED DESCRIPTION
FIG. 1 illustrates a network system 100 consistent with various embodiments of the present disclosure. Network system 100 generally includes at least one network node element 102 (also referred to herein as "source node 102" or "sender node") and at least one end node element 126 (also referred to herein as "receiving node" and "link partner"), each configured to communicate with one another via communication link 124, as shown. Of course, in some embodiments one or more intermediate nodes (not shown) may be disposed between the source node 102 and the end node 126. The source node 102 and the end node 126 may be included as link partners in a network fabric. It is to be understood that the illustration of FIG. 1 is only for ease of description and that the network fabric may include a plurality of intermediate node elements and/or end node elements, each connected in series and/or parallel with each other and/or with the source node 102, to form, for example, a torus network topology, ring topology, Clos topology, fat tree topology, etc. The source node 102 and/or end node 126 (and/or any intermediate node or nodes) may each comprise a computer node element (e.g., host server system, laptop, tablet, workstation, etc.), switch, router, bridge, hub, fabric interconnect, network storage device, network attached device, nonvolatile memory (NVM) storage device, etc. It will be appreciated that the terms "source node" and "end node" are used to simplify the description and are not meant to imply a unidirectional transmission flow. Although one side of a full duplex connection may often be referred to herein, the operations are also applicable to the reverse direction (e.g., from end node 126 to source node 102, via one or more intermediate nodes, etc.).
The source node 102 includes a network controller 104 (e.g., network interface card, etc.), a system processor 106 (e.g., multi-core general purpose processor, such as those provided by Intel Corp., etc.) and system memory 108. The system memory 108 may host a plurality of operating system driver stacks 128 to enable, for example, one or more multi-lane communication links, one or more single-lane communication links, etc., and as will be described in greater detail below. The end node 126 (and/or any intermediate node or nodes) each may be configured and operate in a similar manner as the node 102, as described in greater detail below. The source node 102 and the end node 126 may communicate with each other, via link 124, using, for example, an Ethernet communications protocol. The Ethernet communications protocol may be capable of providing communication using a Transmission Control Protocol/Internet Protocol (TCP/IP). The Ethernet protocol may comply or be compatible with the Ethernet standard published by the Institute of Electrical and Electronics Engineers (IEEE) titled "IEEE 802.3 Standard," published in March, 2002 and/or later versions of this standard, for example, the IEEE 802.3 Standard for Ethernet, published 2012; "IEEE Std 802.3bj™", published 2014, titled: IEEE Standard for Ethernet Amendment 2: Physical Layer Specifications and Management Parameters for 100 Gb/s Operation Over Backplanes and Copper Cables; "IEEE P802.3by™"/D0.1, titled: Draft Standard for Ethernet Amendment: Media Access Control Parameters, Physical Layers and Management Parameters for 25 Gb/s Operation; etc. In other embodiments, the source node 102 and the end node 126 may communicate with each other, via link 124 using, for example, a custom and/or proprietary communications protocol. The custom and/or proprietary communications protocol may be at least partially compliant with the aforementioned 802.3 Ethernet communications protocols.
The network controller 104 is configured to communicate with the link partner 126 using the aforementioned Ethernet communications protocol. Network controller 104 is also generally configured to perform various operations in a defined order when a link is first established with the link partner 126 (e.g., upon system initialization, establishing a new link with the link partner, etc.). Such operations may include, for example, an auto-negotiation period during which various capabilities of the node 102 and the link partner 126 are exchanged, followed by a link training period during which the quality of the communications link 124 may be determined, followed by a link up state, or data mode, when data frames/packets are exchanged between the node 102 and link partner 126. The auto-negotiation period, link training period and data mode period may be defined under the aforementioned Ethernet communications protocol.
The network controller 104 includes PHY circuitry 110 generally configured to interface the node 102 with the end node 126, via communications link 124. PHY circuitry 110 may comply or be compatible with the aforementioned IEEE 802.3 Ethernet communications protocol, which may include, for example, single-lane PHY protocols such as 10GBASE-KX, 10GBASE-KR, etc., and/or multi-lane PHY protocols such as 10GBASE-KX4, 40GBASE-KR4, 40GBASE-CR4, 100GBASE-CR10, 100GBASE-CR4, 100GBASE-KR4, and/or 100GBASE-KP4, etc. and/or other PHY circuitry that is compliant with the aforementioned IEEE 802.3 Ethernet communications protocol and/or compliant with an after-developed communications protocol and/or emerging PHY technology specifications such as 25GBASE-CR and/or 25GBASE-KR, etc. PHY circuitry 110 includes transmit circuitry (Tx) 112 configured to transmit data packets and/or frames to the end node 126, via link 124, and receive circuitry (Rx) 114 configured to receive data packets and/or frames from the end node 126, via link 124. Of course, PHY circuitry 110 may also include encoding/decoding circuitry (not shown) configured to perform analog-to-digital and digital-to-analog conversion, encoding and decoding of data, analog parasitic cancellation (for example, cross talk cancellation), and recovery of received data. Rx circuitry 114 may include phase lock loop circuitry (PLL, not shown) configured to coordinate timing of data reception from the end node 126.
Source node 102 and end node 126 may each include respective ports 130, 132 which define the number of lanes of the source node 102 and end node 126, respectively. Each lane of the ports 130, 132 may include a plurality of logical and/or physical channels (e.g., differential pair channels) that provide separate connections between, for example, the Tx and Rx 112/114 of the node 102 and an Rx and Tx, respectively, of the end node 126. A "single-lane link", as used herein, is defined as a single Tx/Rx transmission pair. A "multi-lane link", as used herein, is defined as two or more Tx/Rx transmission pairs. "Link width", as used herein, refers to the number of lanes in the communication link. The PHY circuitry 110 of the network controller 104 may be duplicated, depending on the number of lanes associated with the port 130. Thus, for example, port 130 may include 4 lanes and the PHY circuitry may be compliant with 10GBASE-KX4, 40GBASE-KR4, 40GBASE-CR4, 100GBASE-CR4, 100GBASE-KR4, and/or 100GBASE-KP4.
The communications link 124 may comprise, for example, a media dependent interface that may include, for example, copper twin-axial cable, backplane traces on a printed circuit board, copper twisted pair cable, etc. For example, if port 130 is a multi-lane link and port 132 is a single lane link, the communications link 124 may include a physical cable such as the SFP+ Direct attach cable (defined in appendix E of the SFF-8431 protocol, Rev. 4.1 (published July 6, 2009)), which may be used as a "breakout" cable to connect a multi-lane link to one or more single-lane links. FIG. 2 illustrates a breakout connection example 200 that includes a four lane multi-lane port 202 coupled to four single-lane ports 204A, 204B, 204C and 204D. With continued reference to FIG. 1, the multi-lane port 202 may be associated with the source node 102, and the single-lane ports 204A, 204B, 204C and 204D may each be associated with an end node 126 and/or one or more intermediate nodes, etc. The multi-lane port 202 includes a plurality of Tx/Rx pairs, e.g., Tx0/Rx0 coupled to Rx/Tx of port 204A, Tx1/Rx1 coupled to Rx/Tx of port 204B, etc. The link 124' in this example may include a 4-port "breakout" cable such as the aforementioned SFP+ Direct attach cable. For example, port 202 may be associated with a 40GBASE-CR4 PHY and each port 204A, 204B, 204C and 204D may be associated with a 10GBASE-KR PHY, and the breakout cable connection 124' provides a 4-lane port to four single-lane ports connection.
Network controller 104 also includes a media access control (MAC) circuitry 118 configured to provide addressing and channel access control protocols for communication with the end node 126, as may be defined by the aforementioned Ethernet communications protocol (e.g., MAC circuitry 118 may be a Layer 2 device). Network controller 104 also includes an auto-negotiation circuitry 116 configured to perform auto-negotiation operations between the node 102 and link partner 126. The auto-negotiation operations may be defined by the aforementioned IEEE 802.3 Ethernet communications protocol. In general, circuitry 116 is configured to communicate to the link partner 126 a defined set of capabilities of the node 102. The defined set of capabilities may include, for example, PHY technology abilities, maximum link speed, forward error correction (FEC) and/or FEC mode capabilities, Pause ability, etc., as may be defined by the aforementioned IEEE 802.3 Ethernet communications protocol. Likewise, the link partner 126 is configured to communicate to the node 102 the defined set of capabilities of the link partner 126. The exchange of capabilities between the node 102 and link partner 126 occurs within a defined auto-negotiation time period. The circuitry 116 may be configured to format a link codeword base page (base page) 120 to define the capabilities of the node 102. The base page 120, defined by the aforementioned IEEE 802.3 Ethernet communications protocol, is a data structure (e.g., 48-bit signal sequence) having certain bits that are utilized to convey defined capabilities of the node 102. According to the teaching of the present disclosure, the auto-negotiation circuitry 116 is configured to utilize the base page 120 to determine the lane width of the communication link 124 during the auto-negotiation period. The formatted base page 120 is transmitted to the link partner 126, and a similar base page (not shown) is transmitted from the link partner 126 to the node 102. Determining the lane width may enable, in some embodiments, a multi-lane PHY link to resolve to a plurality of single-lane PHY links.
FIG. 3 is an example base page 120' according to one embodiment of the present disclosure. It should be understood that, during the auto-negotiations period, base pages may be exchanged between the source node 102 and the end node 126. The base page 120' of this embodiment is used to define the PHY circuitry utilized by the node 102, which may include, for example, 10GBASE-KR, 40GBASE-KR4, 40GBASE-CR4, 100GBASE-CR10, etc. The example base page 120' depicts bits 0-47, as may be defined by the aforementioned Ethernet communications protocol. Bits D20:16 typically represent a transmit NONCE field, where circuitry 116 populates a bit sequence in the transmit NONCE field. This bit sequence is typically a random or pseudo-random number that acts as a unique identifier of the transmit lane of the source node 102. Bits D9:5 represent an echoed NONCE field that is utilized by the link partner to transmit the received transmit NONCE field sequence, and thus, this field is populated by the link partner and transmitted back to the source node 102 as an acknowledgement that the link between the source node and the link partner is properly established on a given lane. Bits D45:21 represent the technical ability of the PHY circuitry, e.g., each bit represents a flag to signify the PHY capabilities such as 10GBASE-KR, 40GBASE-KR4, 40GBASE-CR4, 100GBASE-CR10, etc. The order of the capabilities represented by bits D45:21 may be defined by the aforementioned Ethernet communications protocol. Of course, the base page 120' is only a specific example, and the base page 120' may be modified according to the teachings set forth herein.
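By way of illustration only, the field positions described for the example base page 120' may be sketched in software as follows. This is a minimal sketch that assumes the base page is handled as a 48-bit integer with bit D0 as the least significant bit; the function names are hypothetical and are not part of the disclosure or of any standard API, and all bits outside the three described fields are simply left at zero.

```python
# Field positions taken from the description of FIG. 3; names are illustrative.
ECHO_NONCE_SHIFT, ECHO_NONCE_BITS = 5, 5    # bits D9:5, echoed NONCE
TX_NONCE_SHIFT, TX_NONCE_BITS = 16, 5       # bits D20:16, transmit NONCE
ABILITY_SHIFT, ABILITY_BITS = 21, 25        # bits D45:21, PHY ability flags


def _mask(bits: int) -> int:
    """Return a mask with the given number of low-order bits set."""
    return (1 << bits) - 1


def pack_base_page(tx_nonce: int, echoed_nonce: int, abilities: int) -> int:
    """Build a 48-bit base page value (assumption: D0 is the least significant bit)."""
    if not 0 <= tx_nonce <= _mask(TX_NONCE_BITS):
        raise ValueError("transmit NONCE must fit in 5 bits")
    if not 0 <= echoed_nonce <= _mask(ECHO_NONCE_BITS):
        raise ValueError("echoed NONCE must fit in 5 bits")
    if not 0 <= abilities <= _mask(ABILITY_BITS):
        raise ValueError("ability flags must fit in 25 bits")
    return (
        (abilities << ABILITY_SHIFT)
        | (tx_nonce << TX_NONCE_SHIFT)
        | (echoed_nonce << ECHO_NONCE_SHIFT)
    )


def unpack_base_page(page: int) -> dict:
    """Extract the three fields described in the text from a base page value."""
    return {
        "tx_nonce": (page >> TX_NONCE_SHIFT) & _mask(TX_NONCE_BITS),
        "echoed_nonce": (page >> ECHO_NONCE_SHIFT) & _mask(ECHO_NONCE_BITS),
        "abilities": (page >> ABILITY_SHIFT) & _mask(ABILITY_BITS),
    }
```

Packing and then unpacking a page returns the original field values, which is the property an implementation would rely on when comparing an echoed NONCE with the transmit NONCE it sent.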
With continued reference to FIGS. 1, 2 and 3, during the auto-negotiation period, the formatted base page 120 is transmitted to the link partner 126, and a similar base page (not shown) is transmitted from the link partner 126 to the node 102. The auto-negotiation circuitry 116 may be configured to utilize the transmit NONCE and echoed NONCE fields of the base page to automatically determine the lane width capabilities of the link partner. The auto-negotiation circuitry 116 may be configured to format the base page 120 with a desired and/or required lane status of the port associated with the PHY circuitry 110. During the auto-negotiation period, the source node 102 is configured to format the base page 120 by, among other things, populating the transmit NONCE field bits D20:16 with a unique identifier and populating bits D45:21 with the technical abilities (PHY types) of the PHY circuitry 110. Assuming that the port 130 associated with the PHY circuitry 110 is a multi-lane port (e.g., 4-lane port, 10-lane port, etc.), the auto-negotiations circuitry 116 may begin to transmit the formatted base page 120 on a start lane (e.g., lane 0) of the link. In addition, the circuitry 116 is configured to "listen", on the Rx lines of one or more lanes, for a similarly-formatted base page from a link partner connected to at least one lane. In one example, circuitry 116 is configured to enable the PHY circuitry to "listen" for a transmitted base page from a link partner on all of the lanes of the port. If the base page received from the link partner indicates multi-lane capabilities, then the node 102 and the link partner may establish the highest-bandwidth common multi-lane capability, thus establishing a multi-lane connection between the source node and the end node (link partner).
If the received base page on any lane indicates single-lane capabilities only, the circuitry 116 may establish a single lane connection with the link partner on that lane, and may further discontinue transmitting multi-lane capabilities by reformatting the base page to include only single lane PHY capabilities.
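The resolution described above may be sketched, purely as an illustration, in the following manner. The capability table, the function name and the return format are assumptions made for this sketch; the lane counts and speeds listed are only those needed for the example and do not reproduce the ordering of the ability bits in any standard.

```python
from typing import Dict, NamedTuple, Optional, Set, Tuple


class Phy(NamedTuple):
    lanes: int   # link width of the PHY type
    gbps: int    # nominal speed, used to pick the highest-bandwidth match


# Hypothetical capability table for the sketch.
PHYS: Dict[str, Phy] = {
    "10GBASE-KR": Phy(1, 10),
    "40GBASE-KR4": Phy(4, 40),
    "40GBASE-CR4": Phy(4, 40),
    "100GBASE-CR4": Phy(4, 100),
}


def resolve(
    local: Set[str], received: Dict[int, Set[str]]
) -> Optional[Tuple[str, str, Optional[int]]]:
    """Resolve a link from base pages received per lane.

    `received` maps an Rx lane number to the PHY types advertised by the
    link partner heard on that lane. Returns ("multi", phy, None),
    ("single", phy, lane), or None if nothing in common was advertised.
    """
    # A multi-lane capability in common on any lane resolves a multi-lane link.
    for lane in sorted(received):
        multi = [p for p in local & received[lane] if PHYS[p].lanes > 1]
        if multi:
            best = max(multi, key=lambda p: (PHYS[p].gbps, p))
            return ("multi", best, None)
    # Otherwise, a single-lane capability resolves a single-lane link on that lane.
    for lane in sorted(received):
        single = [p for p in local & received[lane] if PHYS[p].lanes == 1]
        if single:
            best = max(single, key=lambda p: (PHYS[p].gbps, p))
            return ("single", best, lane)
    return None
```

For example, a node advertising 10GBASE-KR, 40GBASE-CR4 and 100GBASE-CR4 that hears only 10GBASE-KR on lane 2 would, in this sketch, resolve a single-lane 10GBASE-KR link on lane 2.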
Assuming that the link partner includes one or more single-lane connections, if a base page transmitted from the link partner to the source node includes an echoed NONCE field that matches the transmit NONCE field of the base page that was transmitted by the source node, this indicates that the source node and link partner are communicating on the same lane, and a single lane connection may be resolved between the source node and the link partner on that given lane. In some embodiments, however, the starting transmit lane (e.g., lane 0) may not coincide with a lane that the link partner is using to transmit its base page; for example, the source node may be transmitting and listening on Lane 0 while the link partner is transmitting on Lane 1. In this case, the circuitry 116 may be configured to wait a defined period of time and shift the transmit function of the base page to the next (or some other) lane. Since the link partner may also shift transmitting its base page to another lane, and to avoid a symmetrical jump between lanes between the source node and end node (which may create an endless loop), the source node and end node may have different defined time periods before a lane shift occurs. For example, the defined time period for the source node may be based on the value of the transmit NONCE field, adjusted by the value of the echoed NONCE field. Once the echoed NONCE field matches the transmit NONCE field, a single lane link may be resolved for that lane. Of course, if the link partner is a single-lane port device, then the above description to avoid symmetrical lane jumping may not be necessary.
Circuitry 116 may be configured to shift to another lane, as described above, using, for example, round-robin ordering (e.g., lane shifting is done in sequence, starting at lane 0) and/or a preprogrammed lane shifting order, etc.
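The reason for using different wait periods on the two sides may be illustrated with a small simulation. This is a sketch under stated assumptions: time is counted in abstract ticks, both sides shift in round-robin order, and the formula in `dwell_ticks` is a hypothetical way of deriving a wait period from the two 5-bit NONCE values; the disclosure does not specify a particular formula.

```python
from typing import Optional


def next_lane(lane: int, num_lanes: int) -> int:
    """Round-robin lane shifting: advance by one lane and wrap around."""
    return (lane + 1) % num_lanes


def dwell_ticks(tx_nonce: int, echoed_nonce: int, base: int = 1) -> int:
    """Hypothetical wait period derived from the transmit and echoed NONCE values."""
    return base + ((tx_nonce - echoed_nonce) % 32)


def first_common_lane_tick(
    num_lanes: int,
    start_a: int,
    start_b: int,
    dwell_a: int,
    dwell_b: int,
    max_ticks: int = 1000,
) -> Optional[int]:
    """Return the first tick at which both sides are on the same lane, or None."""
    lane_a, lane_b = start_a, start_b
    for tick in range(max_ticks):
        if lane_a == lane_b:
            return tick
        # Each side shifts to its next lane once its own wait period has elapsed.
        if (tick + 1) % dwell_a == 0:
            lane_a = next_lane(lane_a, num_lanes)
        if (tick + 1) % dwell_b == 0:
            lane_b = next_lane(lane_b, num_lanes)
    return None
```

With equal wait periods, two sides that start on different lanes shift in lockstep and never coincide in this model, which corresponds to the endless loop mentioned above; with unequal wait periods they reach a common lane after a few shifts.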
In some embodiments, although the source node 102 may include a multi-lane port, it may be desirable in some applications to create one or more single lane connections with one or more link partners, rather than use a multi-lane connection, such as by using a "breakout" cable described above. In such embodiments, the circuitry 116 may format the base page 120 to only include single lane PHY capabilities, thus forcing the link partner to establish a single lane connection with the source node on one of the lanes. In other embodiments, if both the source node and the link partner support multi-lane capabilities, but it is determined that one of the lanes is not operational or has substantially degraded, the source node and/or the end node may be configured to discontinue transmitting multi-lane capabilities by reformatting the base page to include only single lane PHY capabilities, thus enabling at least one single lane connection between the source node and the end node.
Circuitry 116 may also be configured to provide lane re-ordering, for example, when a multi-lane port is coupled to a plurality of single lanes (such as a breakout connection, described above). In the conventional Ethernet protocol, there does not exist a mechanism to pair and reorder Tx from one lane with Rx from another lane. Accordingly, circuitry 116 may also be configured to map an Rx lane to a corresponding Tx lane to logically create a single lane i. Reordering may be due to accidental swapping of the pairs in the cable, or intentional, due to routing restrictions or other reasons. Reordering may be supported using, for example, upper stack layers (e.g., PMA, PCS, etc.) so that if the lanes are physically re-ordered, the circuitry 116 can identify that and logically correct the order to recover the original data stream.
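One way such a Tx-to-Rx lane mapping could be derived is sketched below. The sketch assumes that the transmit NONCE value sent on each Tx lane is distinct and that the echoed NONCE observed on each Rx lane is used as the matching key; that use of the NONCE fields for re-ordering, and the function name, are assumptions of this illustration rather than a statement of how circuitry 116 is implemented.

```python
from typing import Dict


def pair_lanes(
    tx_nonce_by_lane: Dict[int, int], echoed_by_rx_lane: Dict[int, int]
) -> Dict[int, int]:
    """Return {tx_lane: rx_lane} for every echoed NONCE that matches a sent one.

    Assumes each Tx lane was given a distinct NONCE value; with 5-bit
    values a real design would need to handle collisions.
    """
    lane_of_nonce = {nonce: lane for lane, nonce in tx_nonce_by_lane.items()}
    pairs: Dict[int, int] = {}
    for rx_lane, echoed in echoed_by_rx_lane.items():
        tx_lane = lane_of_nonce.get(echoed)
        if tx_lane is not None:
            pairs[tx_lane] = rx_lane   # logically re-order: Tx lane maps to this Rx lane
    return pairs
```

In this sketch, a cable that swaps lanes 0 and 1 shows up as Tx lane 0 paired with Rx lane 1 and Tx lane 1 paired with Rx lane 0, and an Rx lane whose echoed value matches nothing is left unpaired.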
FIG. 4 illustrates a flowchart of operations 400 of one example embodiment consistent with the present disclosure. The operations of this embodiment depict the determination and resolution of a multi-lane link or one or more single-lane links during an auto-negotiations period. At operation 402, a source node having a multi-lane port and multi-lane PHY capabilities sets a lane, Lane_i, to Lane 0. At operation 404, the source node listens for an auto-negotiation base page from one or more link partners on all lanes, and transmits a base page to a link partner on Lane 0, at operation 406. The transmitted base page may indicate multi-lane PHY capabilities and/or single-lane PHY capabilities, and may include a unique lane identifier in a transmit NONCE field. A base page received from (transmitted from) the at least one link partner may indicate multi-lane PHY capabilities and/or single-lane PHY capabilities, and may include a unique lane identifier in a transmit NONCE field, and may also include an echoed NONCE field. At operation 408, the source node determines if a base page that includes multi-lane capabilities is received on any lane, and if so, at operation 410, resolves the link as a multi-lane link between the source node and the link partner. Operations may also include ending the auto-negotiations mode and continuing on to other operational modes 412. If no multi-lane base page is received on any lane, operations may include determining if a single-lane base page is received on any lane 414. If a single-lane base page is received, operations may also include determining if the received base page from the link partner includes a correct echoed NONCE field (e.g. the echoed NONCE field matches the transmit NONCE field of the base page transmitted by the source node) 416. If the correct echoed NONCE field is determined, operations may also include creating a single lane connection by pairing the lane with the correct received echoed NONCE field with Lane_i 418. 
Operations may further include shifting to the next lane (e.g., Lane_i+1) and repeating the operations. If there is a mismatch between the transmit NONCE field and the echoed NONCE field, operations may also include waiting a defined time period and shifting the transmit lane to the next lane (e.g., Lane_i+1) 420, and repeating the operations until the transmit NONCE and the echoed NONCE match.
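The operations of FIG. 4 may be sketched as a simple loop, with the operation numbers noted in comments. This is a simplified illustration: the `receive` callback, the `(is_multi_lane, echoed_nonce)` page representation and the return values are hypothetical, the wait before a lane shift is not modelled, and the sketch stops at the first paired lane instead of continuing on to the remaining lanes.

```python
from typing import Callable, Dict, Optional, Tuple

# A received base page is modelled as (is_multi_lane, echoed_nonce).
Page = Tuple[bool, int]
Result = Tuple[str, Optional[Tuple[int, int]]]


def autoneg(
    num_lanes: int,
    tx_nonce: int,
    receive: Callable[[int], Dict[int, Page]],
    max_steps: int = 16,
) -> Result:
    """Resolve a multi-lane link or pair one single-lane link.

    `receive(tx_lane)` returns the pages heard on all Rx lanes, keyed by
    Rx lane, while the local base page is being sent on `tx_lane`.
    """
    lane_i = 0                                            # operation 402
    for _ in range(max_steps):
        pages = receive(lane_i)                           # operations 404 and 406
        if any(multi for multi, _ in pages.values()):     # operation 408
            return ("multi-lane", None)                   # operations 410 and 412
        matches = [                                       # operations 414 and 416
            rx for rx, (_, echo) in sorted(pages.items()) if echo == tx_nonce
        ]
        if matches:
            return ("single-lane", (lane_i, matches[0]))  # operation 418
        lane_i = (lane_i + 1) % num_lanes                 # operation 420 (wait omitted)
    return ("unresolved", None)
```

A caller would supply a `receive` function backed by the PHY circuitry; in a test, a stub that echoes the expected NONCE only when the transmit lane is lane 2 causes the loop to shift twice and then pair Tx lane 2 with Rx lane 2.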
The foregoing includes example system architectures and methodologies. Modifications to the present disclosure are possible. The host processor 106 may include one or more processor cores and may be configured to execute system software. System software may include, for example, operating system code (e.g., OS kernel code) and local area network (LAN) driver code. LAN driver code may be configured to control, at least in part, the operation of the network controller 104. System memory may include I/O memory buffers configured to store one or more data packets that are to be transmitted by, or received by, network controller 104. Chipset circuitry may generally include "North Bridge" circuitry (not shown) to control communication between the processor, network controller 104 and system memory 108.
Node 102 and/or link partner 126 may further include an operating system (OS, not shown) to manage system resources and control tasks that are run on, e.g., node 102. For example, the OS may be implemented using Microsoft Windows, HP-UX, Linux, or UNIX, although other operating systems may be used. In some embodiments, the OS may be replaced by a virtual machine monitor (or hypervisor) which may provide a layer of abstraction for underlying hardware to various operating systems (virtual machines) running on one or more processing units. The operating system and/or virtual machine may implement one or more protocol stacks. A protocol stack may execute one or more programs to process packets. An example of a protocol stack is a TCP/IP (Transmission Control Protocol/Internet Protocol) protocol stack comprising one or more programs for handling (e.g., processing or generating) packets to transmit and/or receive over a network. A protocol stack may alternatively be comprised of a dedicated sub-system such as, for example, a TCP offload engine and/or network controller 104. The TCP offload engine circuitry may be configured to provide, for example, packet transport, packet segmentation, packet reassembly, error checking, transmission acknowledgements, transmission retries, etc., without the need for host CPU and/or software involvement.
The system memory 108 may comprise one or more of the following types of memory: semiconductor firmware memory, programmable memory, non-volatile memory, read only memory, electrically programmable memory, random access memory, flash memory, magnetic disk memory, and/or optical disk memory. Either additionally or alternatively system memory may comprise other and/or later-developed types of computer-readable storage devices.
Embodiments of the operations described herein may be implemented in a system that includes at least one tangible computer-readable storage device having stored thereon, individually or in combination, instructions that when executed by one or more processors perform the operations. The one or more processors may include, for example, a processing unit and/or programmable circuitry in the network controller 104, system processor 106 and/or other processing unit or programmable circuitry. Thus, it is intended that operations according to the methods described herein may be distributed across a plurality of physical devices, such as processing structures at several different physical locations. The storage device may include any type of tangible, non-transitory storage device, for example, any type of disk including floppy disks, optical disks, compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic and static RAMs, erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), flash memories, magnetic or optical cards, or any type of storage device suitable for storing electronic instructions.
In some embodiments, a hardware description language (HDL) may be used to specify circuit and/or logic implementation(s) for the various circuitry described herein. For example, in one embodiment the hardware description language may comply or be compatible with a very high speed integrated circuits (VHSIC) hardware description language (VHDL) that may enable semiconductor fabrication of one or more circuits and/or logic described herein. The VHDL may comply or be compatible with IEEE Standard 1076-1987, IEEE Standard 1076.2, IEEE1076.1, IEEE Draft 3.0 of VHDL-2006, IEEE Draft 4.0 of VHDL-2008 and/or other versions of the IEEE VHDL standards and/or other hardware description standards.
"Circuitry," as used in any embodiment herein, may comprise, for example, singly or in any combination, hardwired circuitry, programmable circuitry, state machine circuitry, logic and/or firmware that stores instructions executed by programmable circuitry. The circuitry may be embodied as an integrated circuit, such as an integrated circuit chip. In some embodiments, the circuitry may be formed, at least in part, by the system CPU 106 executing code and/or instructions sets (e.g., software, firmware, etc.) corresponding to the functionality described herein, thus transforming a general-purpose processor into a specific- purpose processing environment to perform one or more of the operations described herein. In some embodiments, the network controller 104 may be embodied as a stand-alone integrated circuit or may be incorporated as one of several components on an integrated circuit. In some embodiments, the various components and circuitry of the network controller 104 or other systems may be combined in a system-on-a-chip (SoC) architecture.
The terms and expressions which have been employed herein are used as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding any equivalents of the features shown and described (or portions thereof), and it is recognized that various modifications are possible within the scope of the claims. Various features, aspects, and embodiments have been described herein. The features, aspects, and embodiments are susceptible to combination with one another as well as to variation and modification, as will be understood by those having skill in the art. The present disclosure should, therefore, be considered to encompass such combinations, variations, and modifications.

Claims

CLAIMS What is claimed is:
1. A network node element, comprising:
a multi-lane port that includes a plurality of lanes for communication with at least one link partner;
PHY circuitry including transmit circuitry and receive circuitry for each lane of the multi-lane port; and
an auto-negotiations circuitry to transmit, during an auto-negotiation period of transmission between the network node element and the at least one link partner, a first base page on a first lane of the multi-lane port to the link partner, the first base page including a field for specifying a transmit NONCE sequence number and a field for identifying at least one PHY capability and at least one link width associated with the PHY circuitry; the auto-negotiations circuitry also to receive, during the auto-negotiation period, a second base page on at least one lane from the at least one link partner, the second base page including a field for specifying an echoed NONCE sequence number and a field for identifying at least one PHY capability and at least one link width associated with the link partner; and wherein the auto-negotiations circuitry is to determine if the link partner includes a multi-lane port or at least one single-lane port based on, at least in part, the at least one PHY capability identified in the second base page and to resolve a multi-lane connection or a single-lane connection with the link partner based on, at least in part, the at least one PHY capability identified in the second base page.
2. The network node element of claim 1, wherein the first and second base pages comprise a link codeword base page.
3. The network node element of claim 1, wherein the at least one PHY capability includes a maximum speed of the PHY circuitry and the at least one link width includes the number of lanes associated with the multi-lane port.
4. The network node element of claim 1, wherein if the at least one PHY capability and the at least one link width of the second base page indicate that the link partner is specifying a single-lane connection, the auto-negotiations circuitry is further to determine if a match exists between the echoed NONCE field of the second base page and the transmit NONCE field of the first base page.
5. The network node element of claim 4, wherein if a match exists between the echoed NONCE field of the second base page and the transmit NONCE field of the first base page, the auto-negotiations circuitry further to pair the transmit circuitry that transmits the first base page and the receive circuitry that receives the second base page to form a single lane connection between the network node element and the link partner.
6. The network node element of claim 4, wherein if a match does not exist between the echoed NONCE field of the second base page and the transmit NONCE field of the first base page, the auto-negotiations circuitry further to wait a defined period of time and transmit the first base page on a second lane of the multi-lane port to the link partner.
7. The network node element of claim 1, wherein if the at least one PHY capability and the at least one link width of the second base page indicate that the link partner is capable of a multi-lane connection, the auto-negotiations circuitry is further to resolve a multi-lane connection between the node element and the link partner by selecting the fastest common PHY capability between the node element and the link partner.
8. A method comprising:
transmitting, by a network controller during an auto-negotiation period of transmission between the network controller and at least one link partner, a first base page on a first lane of a multi-lane port to the link partner, the first base page including a field for specifying a transmit NONCE sequence number and a field for identifying at least one PHY capability and at least one link width associated with the network controller;
receiving, by the network controller during the auto-negotiation period, a second base page on at least one lane from the at least one link partner, the second base page including a field for specifying an echoed NONCE sequence number and a field for identifying at least one PHY capability and at least one link width associated with the link partner;
determining, by the network controller during the auto-negotiation period, if the link partner includes a multi-lane port or at least one single-lane port based on, at least in part, the at least one PHY capability identified in the second base page; and resolving, by the network controller during the auto-negotiation period, a multi-lane connection or a single-lane connection with the link partner based on, at least in part, the at least one PHY capability identified in the second base page.
9. The method of claim 8, wherein the first and second base pages comprise a link codeword base page.
10. The method of claim 8, wherein the at least one PHY capability includes a maximum speed of PHY circuitry and the at least one link width includes the number of lanes associated with the multi-lane port.
11. The method of claim 8, further comprising: determining if the at least one PHY capability and the at least one link width of the second base page indicate that the link partner is specifying a single-lane connection; and determining if a match exists between the echoed NONCE field of the second base page and the transmit NONCE field of the first base page.
12. The method of claim 11, further comprising: determining if a match exists between the echoed NONCE field of the second base page and the transmit NONCE field of the first base page; and pairing transmit circuitry of the network controller with receive circuitry of the link partner to form a single lane connection between the network controller and the link partner.
13. The method of claim 11, further comprising: determining if a match does not exist between the echoed NONCE field of the second base page and the transmit NONCE field of the first base page; and waiting a defined period of time and transmitting the first base page on a second lane of the multi-lane port to the link partner.
14. The method of claim 8, further comprising: determining if the at least one PHY capability and the at least one link width of the second base page indicate that the link partner is capable of a multi-lane connection; and resolving a multi-lane connection between the network controller and the link partner by selecting the fastest common PHY capability between the network controller and the link partner.
15. A computer-readable storage device having stored thereon instructions that when executed by one or more processors result in the following operations comprising:
transmit, by a network controller during an auto-negotiation period of transmission between the network controller and at least one link partner, a first base page on a first lane of a multi-lane port to the link partner, the first base page including a field for specifying a transmit NONCE sequence number and a field for identifying at least one PHY capability and at least one link width associated with the network controller;
receive, by the network controller during the auto-negotiation period, a second base page on at least one lane from the at least one link partner, the second base page including a field for specifying an echoed NONCE sequence number and a field for identifying at least one PHY capability and at least one link width associated with the link partner;
determine, by the network controller during the auto-negotiation period, if the link partner includes a multi-lane port or at least one single-lane port based on, at least in part, the at least one PHY capability identified in the second base page; and
resolve, by the network controller during the auto-negotiation period, a multi-lane connection or a single-lane connection with the link partner based on, at least in part, the at least one PHY capability identified in the second base page.
16. The computer-readable storage device of claim 15, wherein the first and second base pages comprise a link codeword base page.
17. The computer-readable storage device of claim 15, wherein the at least one PHY capability includes a maximum speed of PHY circuitry and the at least one link width includes the number of lanes associated with the multi-lane port.
18. The computer-readable storage device of claim 15, wherein the instructions, when executed by one or more processors, result in the following additional operations comprising: determine if the at least one PHY capability and the at least one link width of the second base page indicate that the link partner is specifying a single-lane connection; and determine if a match exists between the echoed NONCE field of the second base page and the transmit NONCE field of the first base page.
19. The computer-readable storage device of claim 18, wherein the instructions, when executed by one or more processors, result in the following additional operations comprising: determine if a match exists between the echoed NONCE field of the second base page and the transmit NONCE field of the first base page; and pair transmit circuitry of the network controller with receive circuitry of the link partner to form a single lane connection between the network controller and the link partner.
20. The computer-readable storage device of claim 18, wherein the instructions, when executed by one or more processors, result in the following additional operations comprising: determine if a match does not exist between the echoed NONCE field of the second base page and the transmit NONCE field of the first base page; and wait a defined period of time and transmit the first base page on a second lane of the multi-lane port to the link partner.
21. The computer-readable storage device of claim 15, wherein the instructions, when executed by one or more processors, result in the following additional operations comprising: determine if the at least one PHY capability and the at least one link width of the second base page indicate that the link partner is capable of a multi-lane connection; and resolve a multi-lane connection between the network controller and the link partner by selecting the fastest common PHY capability between the network controller and the link partner.
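The decision flow recited in the claims above (check the advertised link width, compare the echoed NONCE against the transmitted NONCE, then either pair the lane, retry on another lane, or pick the fastest common PHY capability) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the claimed implementation: the `BasePage` layout, the encoding of PHY capabilities as integer speeds in Gb/s, the rule that a link width of 1 means a single-lane partner, and all function names and return labels are hypothetical and do not come from the claims or from any standard.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BasePage:
    """Hypothetical, simplified view of a base page (not the real bit layout)."""
    transmit_nonce: int          # NONCE sequence number chosen by the sender
    echoed_nonce: int            # NONCE the sender last received from the other side
    phy_speeds_gbps: frozenset   # advertised PHY capabilities (assumed: speeds in Gb/s)
    link_width: int              # advertised number of lanes


def nonce_matches(sent: BasePage, received: BasePage) -> bool:
    """True when the partner echoed back the NONCE we transmitted."""
    return received.echoed_nonce == sent.transmit_nonce


def fastest_common_speed(local: BasePage, partner: BasePage) -> Optional[int]:
    """Highest speed advertised by both sides, or None if there is no overlap."""
    common = local.phy_speeds_gbps & partner.phy_speeds_gbps
    return max(common) if common else None


def resolve(sent: BasePage, received: BasePage) -> Tuple[str, Optional[int]]:
    """Return a (decision, speed) pair; the decision labels are illustrative."""
    if received.link_width == 1:  # assumption: width 1 signals a single-lane partner
        if nonce_matches(sent, received):
            # Transmit and receive sides belong to the same partner port: pair them.
            return ("pair-single-lane", None)
        # Mismatch: the caller would wait a defined period and try another lane.
        return ("retry-next-lane", None)
    # Partner is capable of a multi-lane connection: pick the fastest common capability.
    return ("multi-lane", fastest_common_speed(sent, received))


local = BasePage(transmit_nonce=7, echoed_nonce=0,
                 phy_speeds_gbps=frozenset({25, 50, 100}), link_width=4)
partner = BasePage(transmit_nonce=3, echoed_nonce=7,
                   phy_speeds_gbps=frozenset({25, 50}), link_width=4)
print(resolve(local, partner))
```

In this sketch the waiting period and the actual retransmission on a second lane are left to the caller, since the claims only specify "a defined period of time" without giving a value.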
PCT/US2016/022365 2015-03-14 2016-03-14 Ethernet auto-negotiation techniques for determining link width WO2016149212A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201562133320P 2015-03-14 2015-03-14
US62/133,320 2015-03-14

Publications (1)

Publication Number Publication Date
WO2016149212A1 (en) 2016-09-22

Family

ID=56919305

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/022365 WO2016149212A1 (en) 2015-03-14 2016-03-14 Ethernet auto-negotiation techniques for determining link width

Country Status (1)

Country Link
WO (1) WO2016149212A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106789870A (en) * 2016-11-10 2017-05-31 华为技术有限公司 The method and ethernet device of transmitting signaling

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040208180A1 (en) * 2003-04-15 2004-10-21 Light Allen Miles System and method for supporting auto-negotiation among standards having different rates
US20070008898A1 (en) * 2005-06-29 2007-01-11 Intel Corporation Point-to-point link negotiation method and apparatus
US7720135B2 (en) * 2002-11-07 2010-05-18 Intel Corporation System, method and device for autonegotiation
US7836199B2 (en) * 2008-09-24 2010-11-16 Applied Micro Circuits Corporation System and method for multilane link rate negotiation
US8661313B2 (en) * 2009-03-09 2014-02-25 Intel Corporation Device communication techniques

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 16765563; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 16765563; Country of ref document: EP; Kind code of ref document: A1)